Bitwise Operations
CS 301 Lecture, Dr. Lawlor
So most of the usual work you do in C/C++/Java/C# manipulates integers
or strings. For example, you'll write a simple line like:
x = y + 4;
which adds 4 to the value of y.
But sometimes you have to understand how this works internally. For example, on a 32-bit machine, this code returns... 0.
long x=1024;
long y=x*x*x*4;
return y;
(Try this in NetRun now!)
Why? The real answer is 4 billion (and change), which requires 33
bits: a 1 followed by 32 zero bits. But on a 32-bit machine, all
you get is the zeros; the higher bits "overflow" and (at least in
C/C++) are lost! Understanding the bits underneath your familiar
integers can help you understand errors like this one. (Plus, by
writing assembly code, you can actually recover the high-order bits
after a multiplication if you need them.)
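To see the lost bit without dropping to assembly, here is a minimal sketch that does the same multiplication twice: once in 32 bits, once in 64. The function names are made up for illustration, and it uses unsigned fixed-width types from stdint.h (rather than the signed long above) because unsigned wraparound is well defined in C while signed overflow is not.

```c
#include <stdint.h>

/* Multiply using only 32 bits: bits above bit 31 of the product are lost. */
uint32_t cube_times_4_narrow(uint32_t x) {
    return x * x * x * 4u;          /* wraps around modulo 2^32 */
}

/* Same arithmetic carried out in 64 bits, so the 33rd bit survives. */
uint64_t cube_times_4_wide(uint32_t x) {
    uint64_t wide = x;              /* widen BEFORE multiplying */
    return wide * wide * wide * 4u;
}
```

With x == 1024 the narrow version returns 0, and the wide version returns 4294967296, which is 1 followed by 32 zero bits.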
So because bits are important, C/C++/Java/C# include "bitwise" operators that manipulate the underlying bits
of the integer. It's like the computer counts on its (32 or 64)
fingers, does the operation on those bits, and then converts back to an
integer. Except, of course, deep down the hardware only knows about
bits, not integers!
Bitwise Left Shift: <<
Makes values bigger, by shifting the value's bits
into higher places, tacking on zeros in the vacated lower places.
As Ints | As Bits
3<<0 == 3 | 0011<<0 == 0011
3<<1 == 6 | 0011<<1 == 0110
3<<2 == 12 | 0011<<2 == 1100
Interesting facts about left shift:
- 1<<n pushes a 1 up into bit number n, creating the bit pattern 100...00 (a 1 followed by n zeros).
- The value of (k<<n) is actually k*2^n. This means bit shifting can be used as a faster multiply.
- k<<0 == k, for any k.
- (k<<n) >= k, for any n and k (unless you have "overflow"!).
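- On a 32-bit machine, you'd expect (k<<32) == 0, plus a compiler warning, because all the bits of k have overflowed away. But C/C++ leave a shift by the full width of the type (or more) undefined, and x86 hardware only looks at the low 5 bits of a 32-bit shift count, so you may well get k back instead.
- Left shift always shifts in fresh new zero bits.
- You can safely left shift by any number of bits up to one less than the width of the type.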
- You can't left shift by a negative number of bits.
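A few of these facts can be checked directly. This is a small sketch with hypothetical helper names (make_bit and times_pow2 are not from the lecture), using unsigned 32-bit values so that bits shifted off the top are simply dropped.

```c
#include <stdint.h>

/* Bit pattern with only bit number n set; valid for n in 0..31. */
uint32_t make_bit(unsigned n) {
    return (uint32_t)1 << n;
}

/* Multiply k by 2^n with a shift; the result wraps if high bits fall off. */
uint32_t times_pow2(uint32_t k, unsigned n) {
    return k << n;
}
```

For example, make_bit(4) is 16 (binary 10000), times_pow2(3,2) is 12, and times_pow2(0x80000000,1) is 0 because the only set bit overflowed away.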
In C++, the << operator is also overloaded for iostream
output. I think this was a poor choice, in particular because
"cout<<3<<0;" just prints 3, then 0! To actually
print the value of "3<<0", you need parentheses, like this:
"cout<<(3<<0);". Operator precedence is screwy for
bitwise operators, so you really want to use excess parentheses!
Bitwise Right Shift: >>
Makes values smaller, by shifting
them into lower-valued places. Note
the bits in the lowest places just "fall off the end" and vanish.
As Ints | As Bits
3>>0 == 3 | 0011>>0 == 0011
3>>1 == 1 | 0011>>1 == 0001
3>>2 == 0 | 0011>>2 == 0000
6>>1 == 3 | 0110>>1 == 0011
Interesting facts about right shift:
- The value of (k>>n) is actually k/2^n, rounded down. This can be used as a faster divide.
- (k<<n)>>n == k, unless overflow has happened.
- Right shift can do strange things to negative values (we'll talk about negative values later).
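- On a 32-bit machine, you'd expect (k>>32) == 0, plus a compiler warning, because all the bits of k have fallen off the end. But C/C++ leave a shift by the full width of the type (or more) undefined, so don't count on getting zero.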
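The right shift facts can be sketched the same way. The helper names here are hypothetical, and the values are unsigned so the strange behavior of negative numbers doesn't come into it.

```c
#include <stdint.h>

/* Divide k by 2^n with a shift; low bits fall off, so this rounds down. */
uint32_t div_pow2(uint32_t k, unsigned n) {
    return k >> n;
}

/* Returns 1 if shifting k left by n and back right by n recovers k,
   which fails exactly when the left shift overflowed. */
int shift_round_trips(uint32_t k, unsigned n) {
    return ((k << n) >> n) == k;
}
```

So div_pow2(6,1) is 3 and div_pow2(3,1) is 1, while shift_round_trips(0xC0000000,1) is 0 because the top bit was lost on the way up.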
If you're dyslexic, like me, the left shift << and right shift
>> can be really tricky to tell apart. I always remember it
like this:
- k<<n pumps up the value of k (the point of the << is injecting bigness into k)
- k>>n drains away the value of k (the point of the >> is draining bigness from k)
Bitwise AND: &
Output bits are 1 only if both corresponding input bits are 1. This is useful to "mask out" bits you don't want, by ANDing them with zero.
As Ints | As Bits
3&5 == 1 | 0011&0101 == 0001
3&6 == 2 | 0011&0110 == 0010
3&4 == 0 | 0011&0100 == 0000
- 0=A&0 (AND by 0's creates 0's--used for masking)
- A=A&~0 (AND by 1's has no effect)
- A=A&A (AND by yourself has no effect)
Bitwise AND is a really really useful tool for extracting bits from a
number--you often create a "mask" value with 1's marking the bits you
want, and AND by the mask. For example, this code figures out if
bit 2 of an integer is set:
int mask=(1<<2); // in binary: 100
int value=...; // in binary: xyz
if (0!=(mask&value)) // in binary: x00
...
In C/C++, bitwise AND has the wrong precedence: & binds more loosely than == and !=, so without the inner parentheses the comparison above parses as (0!=mask)&value, which gives the wrong answer! Be sure to use extra parentheses!
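Here is the masking idea as a small sketch, with hypothetical function names. The last function spells out what the compiler actually does with the unparenthesized comparison, so you can see it go wrong.

```c
#include <stdint.h>

/* Return 1 if bit number n of value is set, 0 otherwise (n in 0..31). */
int bit_is_set(uint32_t value, unsigned n) {
    uint32_t mask = (uint32_t)1 << n;   /* 1 in bit n, 0 everywhere else */
    return 0 != (mask & value);
}

/* Keep only the lowest 8 bits of value; all other bits are masked to 0. */
uint32_t low_byte(uint32_t value) {
    return value & 0xFFu;
}

/* What "0 != mask & value" means WITHOUT the inner parentheses:
   != binds tighter than &, so the comparison happens first. */
int unparenthesized_test(uint32_t mask, uint32_t value) {
    return (0 != mask) & value;
}
```

With mask == 4 and value == 4, bit 2 really is set and bit_is_set(4,2) says so, but unparenthesized_test(4,4) computes 1&4, which is 0.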
Bitwise OR: |
Output bits are 1 if either input bit is 1. E.g., 3|5 == 7; or 011 | 101 == 111.
As Ints | As Bits
3|0 == 3 | 0011|0000 == 0011
3|3 == 3 | 0011|0011 == 0011
1|4 == 5 | 0001|0100 == 0101
- A=A|0 (OR by 0's has no effect)
- ~0=A|~0 (OR by 1's creates 1's)
- A=A|A (OR by yourself has no effect)
Bitwise OR is useful for sticking together bit fields you've prepared
separately. Overall, you use AND to pick apart an integer's
values, XOR and NOT to manipulate them, and finally OR to assemble them
back together.
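As one illustration of the pick-apart and reassemble pattern, here is a sketch that packs three 8-bit color channels into one integer and pulls them back out. The layout (red in bits 16..23, green in bits 8..15, blue in bits 0..7) and the function names are assumptions chosen for the example, not something from the lecture.

```c
#include <stdint.h>

/* Assemble three 8-bit fields into one integer: shift each into place, then OR. */
uint32_t pack_rgb(uint32_t r, uint32_t g, uint32_t b) {
    return ((r & 0xFFu) << 16) | ((g & 0xFFu) << 8) | (b & 0xFFu);
}

/* Pick one field back out: shift it down to the bottom, then AND with a mask. */
uint32_t red_of(uint32_t rgb)   { return (rgb >> 16) & 0xFFu; }
uint32_t green_of(uint32_t rgb) { return (rgb >> 8) & 0xFFu; }
uint32_t blue_of(uint32_t rgb)  { return rgb & 0xFFu; }
```

For example, pack_rgb(0x12,0x34,0x56) gives 0x123456, and green_of(0x123456) gives 0x34.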
Bitwise XOR: ^
Output bits are 1 if either input bit is 1, but not both. E.g., 3^5 == 6; or 011 ^ 101 == 110. Note how the low bit is 0, because both input bits are 1.
As Ints | As Bits
3^5 == 6 | 0011^0101 == 0110
3^6 == 5 | 0011^0110 == 0101
3^4 == 7 | 0011^0100 == 0111
- A=A^0 (XOR by zeros has no effect)
- ~A = A ^ ~0 (XOR by 1's inverts all the bits)
- 0=A^A (XOR by yourself creates 0's--used in cryptography)
The second property, that XOR by 1's inverts bits, is useful for flipping a set of bits: XOR with a mask that has 1's in exactly the positions you want flipped, and 0's everywhere else.
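A minimal sketch of bit flipping with XOR follows; the function name is hypothetical. Because A^A is all zeros, flipping twice with the same mask gets you back where you started, which is the property simple XOR-based scrambling relies on.

```c
#include <stdint.h>

/* Flip exactly the bits of value that are 1 in mask; leave the rest alone. */
uint32_t flip_bits(uint32_t value, uint32_t mask) {
    return value ^ mask;
}
```

So flip_bits(3,5) is 6, flip_bits(6,5) is 3 again, and flip_bits(5,5) is 0.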
Bitwise NOT: ~
Output bits are 1 if the
corresponding input bit is zero. E.g., ~011 ==
111....111100. (The number of leading ones depends on the size of
the machine's "int".)
As Ints | As Bits
~0 == big value (or -1, if signed) | ~...0000 == ...1111
I don't use bitwise NOT very often, but it's handy for making an integer whose bits are all 1: ~0 is all-ones.
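One other common use of NOT, sketched here with hypothetical function names, is building an "everything except these bits" mask, so that AND can clear bits instead of keeping them.

```c
#include <stdint.h>

/* An unsigned 32-bit integer whose bits are all 1. */
uint32_t all_ones(void) {
    return ~(uint32_t)0;
}

/* Clear the bits of value that are 1 in mask: ~mask has 0's exactly there. */
uint32_t clear_bits(uint32_t value, uint32_t mask) {
    return value & ~mask;
}
```

For example, clear_bits(0xFF,0x0F) is 0xF0, and all_ones() is 0xFFFFFFFF.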
Non-bitwise Logical Operators
Note that the logical operators &&, ||, and ! work exactly the
same as the bitwise operators &, |, and ~, but for exactly one bit. Internally,
these operators map multi-bit values to a single bit by treating zero
as a zero bit, and nonzero values as a one bit. So (2&&4)
== 1 (because both 2 and 4 are nonzero), but (2&4) == 0 (because
2==0010 and 4 == 0100 don't have any overlapping one bits).
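The difference is easy to check side by side. This sketch just wraps the two operators in functions (the names are made up) so the results can be compared.

```c
/* Logical AND: each input is squashed to one bit (zero or nonzero) first. */
int logical_and(int a, int b) {
    return a && b;
}

/* Bitwise AND: every bit position is ANDed independently. */
int bitwise_and(int a, int b) {
    return a & b;
}
```

With inputs 2 and 4, logical_and gives 1 but bitwise_and gives 0. With inputs 3 and 5 both give 1, which is why mixing the two up can go unnoticed for a while.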
Use of Bitwise Operators
Say you're Google. You've got to search all the HTML pages on the
net for any possible word. One way to do this is for each
possible word, store a giant table
of every HTML document on the net (maybe 10 billion documents)
containing one bit per document: 1 if the word appears in that
document, 0 if the word doesn't appear. This table is 10 billion
bits, about 1.25GB uncompressed, or only a few dozen megabytes
compressed. Given two search words, you can find all the
pages that contain both words by ANDing both tables. The output
of the bitwise AND, where both bits are set to 1, is a new table
listing the HTML pages that contain both search terms; now sort by
pagerank, and you're done! Note that storing the big table by
bits saves a lot of space, and doing a bitwise AND instead of a regular
logical AND saves a lot of time (over 10x speedup in my testing!):
enum {n=1}; // Number of integers in our tables (== size of internet / 32)
// 1u keeps the shifts unsigned, so setting bit 31 is well defined
unsigned int funky_table[n]={(1u<<24)|(1u<<17)|(1u<<12)|(1u<<4)};
unsigned int aardvark_table[n]={(1u<<31)|(1u<<24)|(1u<<15)|(1u<<6)|(1u<<4)};

/* Match up the bits of these two tables using bitwise operations */
void both_tables(const unsigned int *a,const unsigned int *b,unsigned int *o) {
    for (int i=0;i<n;i++) o[i]=a[i]&b[i]; /* bitwise AND */
}

/* Match up the bits of these two tables using logical (one-bit) operations */
void both_tables_logical(const unsigned int *a,const unsigned int *b,unsigned int *o)
{
    for (int i=0;i<n;i++) {
        o[i]=0;
        for (int bit=0;bit<32;bit++)
        {
            unsigned int a_bit=a[i]&(1u<<bit);
            unsigned int b_bit=b[i]&(1u<<bit);
            if (a_bit && b_bit) /* logical AND */
                o[i]=o[i]|(1u<<bit);
        }
    }
}

int foo(void) {
    unsigned int output_table[n];
    both_tables(funky_table,aardvark_table,output_table);
    return output_table[0]; /* bits 24 and 4 are set in both tables */
}
(Try this in NetRun now!)
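Once you have the ANDed table, you still need to walk its bits to find out which documents matched. Here is a minimal sketch that just counts them; the function name and the fixed-width uint32_t table type are assumptions for the example, not part of the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits (matching documents) in a table of n_words 32-bit words. */
size_t count_matches(const uint32_t *table, size_t n_words) {
    size_t count = 0;
    for (size_t i = 0; i < n_words; i++) {
        for (unsigned bit = 0; bit < 32; bit++) {
            if (table[i] & ((uint32_t)1 << bit))
                count++;            /* this bit's document matched */
        }
    }
    return count;
}
```

For the two tables above, the AND leaves only bits 24 and 4 set, so the count is 2.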
The same bitwise testing idea shows up in the "region codes" of Cohen-Sutherland clipping, used in computer graphics.
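Here is a rough sketch of how those region codes can work, assuming an axis-aligned clipping rectangle and one bit per side. The particular bit assignments and function names are my own choices for illustration; real implementations vary.

```c
/* One bit per side of the clipping rectangle (bit assignment is arbitrary). */
enum { LEFT = 1 << 0, RIGHT = 1 << 1, BOTTOM = 1 << 2, TOP = 1 << 3 };

/* Build the region code for point (x,y): set a bit for each side it lies beyond. */
unsigned region_code(double x, double y,
                     double xmin, double ymin, double xmax, double ymax) {
    unsigned code = 0;
    if (x < xmin) code |= LEFT;
    else if (x > xmax) code |= RIGHT;
    if (y < ymin) code |= BOTTOM;
    else if (y > ymax) code |= TOP;
    return code;
}

/* Both endpoints beyond the same side: the whole segment is invisible. */
int trivially_outside(unsigned code_a, unsigned code_b) {
    return (code_a & code_b) != 0;
}

/* Both endpoints have no bits set: the whole segment is visible. */
int trivially_inside(unsigned code_a, unsigned code_b) {
    return (code_a | code_b) == 0;
}
```

A single bitwise AND of the two endpoint codes tests all four sides at once, which is the same trick as ANDing the two search tables.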