SoC Logo HowTo
Perform Bit Twiddling in C Prof. Bartenstein
 

Contents:

What is Bit Twiddling?

In the C language, the basic unit of memory is an 8-bit byte. In fact, in UNIX, the smallest piece of memory that can be read or written at a single time is a single byte. The concept of "bit twiddling" is code used in the C language to manipulate individual bits within a larger piece of memory. We can use bit twiddling to implement single bit flags, or arbitrary bit fields which are multi-bit subset fields of byte or multi-byte values.

Bit Twiddling is performed on integer variables. Integer variables can be declared as one of four built-in types, with typical memory usage as follows:

Type Bytes Bits
Char 1 8
Short 2 16
Int 4 32
Long 8 64

Because we are dealing with individual bits, when bit twiddling, programmers typically use the unsigned versions of these types to avoid corruption of field values when C propagates a sign bit.

Bit twiddling is accomplished using the C bit-wise operators with masks, fields of the same length as our target value, with these bit-wise operators to specify which bits within the target to operate on. See below for more details on how each bitwise operator works, and how to generate masks to use with the bitwise operators.

Bit twiddling also depends on the use of the C shift operators. More details on the shift operators appear below as well.

arrow_circle_up

Bitwise Operators used in Bit Twiddling

Bit twiddling is accomplished using four important bit-wise operators:

~
bitwise INV, invert to flip ones to zeroes and zeroes to ones.
|
bitwise OR to turn bits on
&
bitwise AND to turn bits off.
^
bitwise XOR to reverse the value of individual bits.

This section describes each of those operators in more detail. We use masks, fields of the same length as our target value, with these bit-wise operators to specify which bits within the target to operate on. See the masks section below for more on how to generate masks to use with the bitwise operators.

arrow_circle_up

The Bitwise INV (~) operator

The Bitwise INV (~) operator returns the inverse of each column of its single operand, basically flips all the bits from 1 to 0 and 0 to 1. The width (number of columns) depend on the number of bits in the operand.

For example, the C code:


  unsigned short a = 0b0001000111101110;
  unsigned short c = ~a; // Flip all the bits in a, and put the result in c
... would put the inverted values of each bit in a into c, as follows.

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
c 1 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1

Even column indexes are shaded to make the table easier to read.

arrow_circle_up

The Bitwise OR (|) operator

The Bitwise Or (|) operator performs an OR operation on each column of its two operands, and returns the results. The width (number of columns) depends on the number of bits in the operands. (If there are different sized operands, the compiler converts to the larger of the two sizes before performing the OR.)

For example, for the C code:


    unsigned short a = 0b0001000111101110;
    unsigned short b = 0b0010000011001100;
    unsigned short c = a | b; // Bitwise OR a and b, and put the results in c.
...then the result in c would be based on the bitwise OR of the columns in a and b, as follows:

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
b 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 0
c 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 0

This table shows that bit 0 from variable a has a 0, which is ORed with bit 0 from variable b, which is also a 0 to produce a 0 in the bit 0 position of c. Bit 1 of a is 1, bit 1 of b is 0, which when ORed together produces a 1 in bit 1 of c, and so on.

We can make a mask with 0's in columns we don't want to change, and 1's in columns we want to set to 1. This mask can then be bitwise OR'ed with a value. This works because:

For example, if we want to turn on all four bits in positions 7-4, we can bitwise OR with the mask that contains the values 0b0000000011110000, as in:


  unsigned short result = a | 0b0000000011110000;
That would result in (for example) the following:

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
mask 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
result0 0 0 1 0 0 0 1 1 1 1 1 1 1 1 0

In this example, bits 15-8 and 3-0 (in cells with a green or blue background) are unmodified in the result because the mask is 0 in those positions, but bits 7-4 (shown in cells with a pink, white, or yellow background) are all turned on in the result because of the 1's in the mask. The values in positions 7-4 in a (pink background) are disregarded in the result.

See Making Masks below for more detail on how to make a useful mask.

arrow_circle_up

The Bitwise AND (&) operator

Similar to the bitwise OR operator, the bitwise AND operator performs a bitwise AND operation between each column of its two operands, and returns the results. The width (number of columns) depends on the number of bits in the operands. (If there are different sized operands, the compiler converts to the larger of the two sizes before performing the AND.)

For example, for the C code:


    unsigned short a = 0b0001000111101110;
    unsigned short b = 0b0010000011001100;
    unsigned short c = a & b; // Bitwise AND a and b, and put the results in c.
...then the result in c would be based on the bitwise AND of the columns in a and b, as follows:

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
b 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 0
c 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0 0

This table shows that bit 0 from variable a has a 0, which is ANDed with bit 0 from variable b, which is also a 0 to produce a 0 in the bit 0 position of c. Bit 1 of a is 1, bit 1 of b is 0, which when ANDed together produces a 0 in bit 1 of c, and so on.

We can use a bitwise AND with a mask that contains a 1 in every position we want to preserve, but a 0 in every bit position we want to turn off, because:

For example, if we want to turn off all four bits in positions 7-4, we can bitwise AND with the mask that contains the values 0b1111111100001111, as in:


  unsigned short result = a & 0b1111111100001111;

That would result in (for example) the following:

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
mask 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1
result0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 0

This time, bits 15-8 and 3-0 (in cells with a background green) are unmodified from a to the result because the mask is 1 in those positions (cells with a blue background). Bits 7-4 are ignored in a (cells with a pink background), and replaced with 0 in the result (cells with a yellow background) because the mask has 0's in those positions (cells with a white background).

See Making Masks below for more detail on how to make a useful mask.

arrow_circle_up

The Bitwise Exclusive XOR (^) operator

Similar to the bitwise OR and bitwise AND operators, the bitwise XOR operator performs a bitwise Exclusive Or operation between each column of its two operands, and returns the results. The width (number of columns) depends on the number of bits in the operands. (If there are different sized operands, the compiler converts to the larger of the two sizes before performing the XOR.)

For example, for the C code:


    unsigned short a = 0b0001000111101110;
    unsigned short b = 0b0010000011001100;
    unsigned short c = a ^ b; // Bitwise AND a and b, and put the results in c.
...then the result in c would be based on the bitwise AND of the columns in a and b, as follows:

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
b 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 0
c 0 0 1 1 0 0 0 1 0 0 1 0 0 0 1 0

This table shows that bit 0 from variable a has a 0, which is XORed with bit 0 from variable b, which is also a 0 to produce a 0 in the bit 0 position of c. Bit 1 of a is 1, bit 1 of b is 0, which when XORed together produces a 1 in bit 1 of c, and so on.

We can use a mask and a bitwise XOR to with 1's to invert certain bits and 0's to leave bits unchanged. This is because:

For example, if we want to invert all four bits in positions 7-4, we can bitwise XOR with the mask that contains the values 0b0000000011110000, as in:


  unsigned short result = a ^ 0b0000000011110000;

That would result in (for example) the following:

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 0 0 0 1 0 0 0 1 1 1 1 0 1 1 1 0
mask 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
result0 0 0 1 0 0 0 1 0 0 0 1 1 1 1 0

In this case, where the mask is 0, bits 15-8 and 3-0 (blue background cells), the a value is transferred directly to the result (green background cells), but where the mask value was 1, positions 7-4 (white background cells), the a value (pink cells) was inverted before being copied into the result (yellow cells).

See Making Masks below for more detail on how to make a useful mask.

The bitwise XOR has one other useful property that is very important for secret codes. If you bitwise XOR with a random mask, it flips random bits in the result, producing a randomized result. But if you XOR again with the same mask, you can recover the original bits. For example:


  int a=getPassNum(); // Get the users password number
  int key=rand(); // Get a random key.
  int secretValue=a ^ key; // Make a secret value that no longer contains the user passnum
  ...
  int passNum=secretValue ^ key; // Recover the passNum from the secret value

arrow_circle_up

The Shift Operators in C

The shift operators in C move bits within a given value, either to the left or to the right.

<<
The shift left operator moves bits to the left.
>>
The shift right operator moves bits to the right. A shift right with signed operand uses the shift right arithmetic semantics. If the operand is unsigned, uses the shift right logical semantics.

Details for each shift operator are described below.

arrow_circle_up

The Shift Left (<<) operator

The Shift Left (<<) operator moves bits within the first operand to the left by the number of bit positions specified in the second operand. Shift left discards extra bit values on the left of the input, and pads on the right with zeroes.

The second argument to the shift operator, the number of bits to shift, must be less than the number of bits in the first operand, the value to shift. If this is not true, the results are undefined! Different compilers or machine architectures may produce very different results.

Here's an example of a valid shift left:


  short a = 0b1101001101110001;
  short c = a << 7;

... The bits in a are shifted to the left 7 positions and the result is stored in c

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 1 1 0 1 0 0 1 1 0 1 1 1 0 0 0 1
c 1 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0

Notice that the extra 7 bits on the left side of a with pink background have disappeared altogether in c, the green background bits in positions 8-0 in a have moved to positions 15-7 in c, and the right side of c has been padded with 7 zeros with yellow background.

arrow_circle_up

The Shift Right (>>) operator on Signed Values

The Shift Right (>>) operator moves bits from the first operand to the right by the number of positions in the second operand. Shift right discards extra bit values on the right of the input, and pads on the left with the sign bit from the first argument. This ensures that the sign of the result is preserved. (This padding is different if the first operand is a signed values than for unsigned values.)

Similar to a shift left, The second argument to the shift right operator, the number of bits to shift, must be less than the number of bits in the first operand, the value to shift. If this is not true, the results are undefined!

Here is an example of a shift right on a signed operand.


  short a = 0b1101001101110001;
  short c = a >> 7;

... The bits in a are shifted to the right 7 positions and the result is stored in c

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 1 1 0 1 0 0 1 1 0 1 1 1 0 0 0 1
c 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 0

The extra 7 bits on the right side of a with pink background have disappeared altogether in c, the green background bits in positions 15-7 in a have moved to positions 8-0 in c, and the left side of c has been padded with the sign bit (position 15) from a.

arrow_circle_up

The Shift Right (>>) operator on Unsigned Values

The Shift Right (>>) operator moves bits from the first operand to the right by the number of positions in the second operand. Shift right discards extra bit values on the right of the input, and pads on the left with zeroes. (This padding is different if the first operand is an unsigned values than for signed values.)

Similar to a shift left, The second argument to the shift right operator, the number of bits to shift, must be less than the number of bits in the first operand, the value to shift. If this is not true, the results are undefined!

Here is an example of a shift right on an unsigned operand.


  unsigned short a = 0b1101001101110001;
  short c = a >> 7;

... The bits in a are shifted to the right 7 positions and the result is stored in c

Var 15 14 13 12 11 10 9 8 7 6 5 4321 0
a 1 1 0 1 0 0 1 1 0 1 1 1 0 0 0 1
c 0 0 0 0 0 0 0 1 1 0 1 0 0 1 1 0

The extra 7 bits on the right side of a with pink background have disappeared altogether in c, the green background bits in positions 15-7 in a have moved to positions 8-0 in c, and the left side of c has been padded with 7 zeros (the unsigned "pad" bit) with yellow background.

Notice that the shift operator doesn't care whether the target (in this case, c) is signed or unsigned. The left side padding is based only on whether the first argument to the right shift operator (in this case, a) is signed or unsigned.

arrow_circle_up

Making Masks for Bit Twiddling

Making single 1 bit masks is quite simple in C. Making masks with multiple contiguous one bits is a little harder, but not too hard. General masks are a little bit harder. This section describes all three techniques.

arrow_circle_up

Single 1 Bit Masks

The simplest form of 1 bit masks is to create a literal value for that mask using hexadecimal or binary literals. Often this is done using pre-processor #define statements so that each bit can be "named". For example:

  #define ERRORFLAG 0x00000001
  #define WARNINGFLAG 0x00000002
  ...
  status |= ERRORFLAG; // Turn on error flags
  ...
  status &= ~WARNINGFLAG;// Turn off the warning flag
  ...
  if (status & ERRORFLAG) {
    // If the errof flag is on...
  } else if (status & WARNINGFLAG) {
    // If the warning flag is on
  }

In this example, the error flag is defined as the rightmost bit in the status word. The warning flag is the bit one position to the left of the error flag. Only two flags were shown, but it's easy to pack up to 32 flags into a single status, each of which are kept independently. A flag can be turned on by bitwise ORing the flag literal value with the status word. A flag can be turned off by bitwise ANDing the inverse of the flag literal (e.g. ~WARNINGFLAG) with the status word. A flag can be checked by bitwise ANDing the literal value with the status word and checking to see if the result is true (non-zero) or false (zero).

Using this technique works quite well for static flags, where the programmer knows exactly which bit position to work on. In more dynamic cases, the programmer may want to access a flag at some arbitrary position. Assuming the programmer can evaluate the bit position of the flag in question, it's easy to make a mask using the idiom:


  unsigned int mask = 1<<bitPos;

This results in a 32 bit mask, and as long as bitPos is a variable whose value is a valid bit position between 0 and 31, the code will create a mask with a single 1 at the requested bit position.

To make a 64 bit mask, you must force C to use a 64 bit literal before shifting, as in:


  unsigned long mask = 1L<<bitPos;

In this case, bitPos can have values between 0 and 63.

If you don't add the "L" suffix to the literal "1", then C assumes 1 is a 32 bit value. For bitPos less than 31, this still works. It makes a 32 bit value with a single 1 at the bitPos position, and then extends that value to mask which is 64 bits by padding on the left with zeroes. However, for cases where bitPos is greater than 31, the 1<<bitPos expression shifts a 32 bit value by more than 32 bits, so the result is undefined. Unfortunately, neither the compiler nor the C runtime checks for overshifting, so if you forget the L suffix when it is needed, it is often hard to debug and figure out what you did wrong.

arrow_circle_up

Contiguous 1 Bit Masks

Contiguous 1 bit masks are masks which have several 1 bits on, but all the one bits are right next to each other (contiguous). Again, the easiest way to make contiguous 1 bit masks is to use #defines with binary or hexadecimal literal values such as:


  #define byte2Mask 0b00000000000000001111111110000000000

Or, using hexadecimal:


  #define byte2Mask 0x0000FF00

Then, we can extract the value of byte2 using code such as:


  byte2Val=(val & byte2Mask) >>8; // Extract byte2 bits and shift them to the correct position

It is also quite simple, using a leaky abstraction, to get a mask in which all bits are at a 1 value. The shortcut for two's complement numbers tells us that we can get -1 by flipping on the bits in the two's complement representation of 1, which is 0b000...0001, by flipping the bits to 0b111...1110, and adding 1, getting 0b1111....1111.

For example, to get a 32 bit mask of all 1's use the code:


	usigned int mask=-1;

Under the covers, C first represents the -1 as a signed two's complement integer, 0b1111...1111, and then converts that to unsigned, which does not change the bit value - just the integer interpretation of that value.

In a more generic application, to get a mask that has n contiguous 1 bits starting at position pos, with all zeroes elsewhere, use an idiom such as:


  unsigned int mask = 1<<n -1; // Make a string of right justified n 1 bits
  mask = mask << (pos); // Shift so that the rightmost 1 is at bit position pos

This assumes n is between 1 and 31, and pos is between 1 and 31-n.

For longer (64 bit) masks, use:


  unsigned long mask = ((1L << n) - 1) << (pos);

arrow_circle_up

Masks that have Multiple Non-contiguous 1 Bits

If you need a mask with multiple, non-contiguous 1 bits, then start your mask by assigning it to zero (all bits off), create single bit or multi-bit sub-masks using the techniques described in Single 1 Bit Masks or Contiguous 1 Bit Masks, and bitwise OR them together to create a more complicated mask.

For example, to make a mask that is 32 bits long, and has all bits in the second byte and the last bit on, you might code the following:


  int mask=0; // Start off with a mask with no one bits on 0x00000000
  mask = mask | ((1<<8) - 1) << 16; // Or in all 8 bits in the byte to get 0x0000FF00
  mask = mask | 1; // Or in the last bit to get 0x0000FF01

arrow_circle_up

Some Examples of Bit Twiddling C Code

The following are a few examples of bit twiddling in C, mostly taken from the literature (see the references below.)

arrow_circle_up

Turning a Single Bit Flag On

To turn on a single bit flag in status with the mask in FLAG, simply:


  status |= FLAG; // Turn on Flag

Note that if the flag is already on, it will stay on.

arrow_circle_up

Turning a Single Bit Flag Off

To turn off a single bit flag in status with the mask in FLAG:


  status &= ~FLAG;

arrow_circle_up

Checking if a Single Bit Flag is On

To check to see whether the single bit flag with the mask in FLAG in on in value status:


  if (status & FLAG) {
    // Flag is on in status
  } else {
    // Flag is off in status
  }

arrow_circle_up

Extracting Nibbles from an Integer

The following code extracts the 8 four-bit nibbles from a 32 bit integer value, and prints each nibble's value.

  unsigned int mask=0xF0000000;
  for(int nibble=1;nibble<=8;nibble++) {
    int nval=(value & mask)>>(8-nibble)*4;
    printf("Nibble %2d has value %d\n",nibble,nval);
    mask=mask>>4;
  }

The mask variable contains 1 bits for the nibble we are currently working on, and zero bits elsewhere. The value & mask bitwise AND turns off all bits other than the nibble we are interested in. The (8-nibble)*4 evaluates to the number of bits to the right of the nibble we are interested in, so when we do >>(8-nibble)*4, that shifts the value so that the nibble we are interested in is right justified.

arrow_circle_up

Is a Value a Power of Two

The following line of code determines if an int val variable contains a number that is a power of two.


  int isPowerOfTwo = (val != 0) && (val & (val-1) ) == 0;

This works because 2n in binary is a binary value with a single 1 bit turned on. If val is zero it is not a power of two. If val is a power of two, then it has a single 1 bit, and val-1 is a string of 1 bits, starting one to the right of val 1 bit, going to the end of the value. Clearly in this case, bitwise ANDing these two values results in zero. If val is non-zero and not a power of two, then it must have more than one 1 bit on. That forces at least one position in val and val-1 to have 1 bits on, so the bitwise AND of val and val-1 must be non-zero, which makes isPowerOfTwo false.

arrow_circle_up

Computing the Absolute Value of an Integer

To compute the absolute value of signed integer val and put it in absVal:


  const int bit31 = val >> 31;
  absVal = (val ^ bit31) - bit31;

This method does not require any branching, which tends to slow down CPUs quite a bit. This relies on the algorithm to "switch the bits and add 1" to negate a two's complement number. The right shift of bit 31 propagates the sign bit through the entire bit31 variable. If the sign bit is zero (i.e. it's a positive number) the second line is just (val ^ 0) - 0, and since XOR with 0 doesn't change anything, val is left unchanged. If the sign bit is 1 (i.e. it's a negative number) then bit31 contains 32 one bits. XORing with one bits inverts a bit, so all bits in val are flipped, and then we subtract bit31, and since bit31 evaluates to -1, that's the same as adding 1.

arrow_circle_up

Bit Twiddling References

arrow_circle_up