Bits and Bitwise Operators

Recall that deep down everything on the machine is just bits. There is a whole group of "bitwise" operators that operate on those bits.

AND operator & is used to mask out bits.
OR operator | is used to reassemble bit fields.
XOR operator ^ is used to controllably invert bits.
NOT operator ~ is used to invert all the bits in a number.
Left shift operator << makes numbers bigger by shifting their bits to higher places.
Right shift operator >> makes numbers smaller by shifting their bits to lower places.

If you'd like to see the bits inside a number, you can loop over the bits and use AND to extract each bit:

int i=9; // 9 == 8 + 1 == 1001

for (long bit=31;bit>=0;bit--) { // print each bit
	long mask=(1L<<bit); // only this bit is set
	long biti=mask&i; // extract this bit from i
	if (biti!=0) std::cout<<"1";
	else         std::cout<<"0";
	if (bit==0)  std::cout<<" integer\n";
}

(Try this in NetRun now!)

Because binary is almost perfectly unreadable (was that 1000000000000000 or 10000000000000000?), we normally use hexadecimal, base 16.

Decimal	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
Hex	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F	10
Binary	1	10	11	100	101	110	111	1000	1001	1010	1011	1100	1101	1110	1111	10000

Remember that every hex digit represents four bits. So if you shift a hex constant by four bits, it shifts by one entire hex digit:

0xf0d<<4 == 0xf0d0
0xf0d>>4 == 0xf0

If you shift a hex constant by a non-multiple of four bits, you end up interleaving the hex digits of the constant, which is confusing:

0xf0>>2 == 0x3C (?)
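To see where the 0x3C comes from, write the constant in binary, shift, and then regroup the bits into fours. Here is a small sketch using compile-time checks; the names `before` and `after` are made up for illustration.

```cpp
// 0xf0 in binary is 1111 0000.
constexpr unsigned int before = 0xf0;

// Shifting right by 2 gives 0011 1100; regrouped into fours, that is 3 and C.
constexpr unsigned int after = before >> 2;
static_assert(after == 0x3C, "0xf0>>2 regroups to 0x3C");

// Shifting by a multiple of four moves whole hex digits instead.
static_assert((0xf0d << 4) == 0xf0d0, "left by one hex digit");
static_assert((0xf0d >> 4) == 0xf0, "right by one hex digit");
```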

Bitwise operators make perfect sense working with hex digits, because they operate on the underlying bits of those digits:
    0xff0 & 0x0ff == 0x0f0
    0xff0 | 0x0ff == 0xfff
    0xff0 ^ 0x0ff == 0xf0f

You can use these bitwise operators to peel off the hex digits of a number, to print out stuff in hex.

int v=1024+15;
for (int digit=7;digit>=0;digit--) {
	const char *digitTable="0123456789abcdef"; // string literals are const in C++
	int d=(v>>(digit*4))&0xF;
	std::cout<<digitTable[d];
}
std::cout<<std::endl;
return v;

(Try this in NetRun now!)

You could also use printf("%X",v);
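The peeling-off step can also be wrapped in a small function. This is only a sketch: the name `hex_digit` is made up here, and it assumes the value is unsigned so the right shift never drags in sign bits.

```cpp
// Return hex digit number `digit` of v as a character (digit 0 is the lowest).
constexpr char hex_digit(unsigned int v, int digit) {
    return "0123456789abcdef"[(v >> (digit * 4)) & 0xF];
}

// 1024+15 is 0x40f, so the digits from low to high are f, 0, 4.
static_assert(hex_digit(1024 + 15, 0) == 'f', "low digit of 0x40f");
static_assert(hex_digit(1024 + 15, 1) == '0', "middle digit of 0x40f");
static_assert(hex_digit(1024 + 15, 2) == '4', "high digit of 0x40f");
```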

Bitwise Left Shift: <<

Makes values bigger, by shifting the value's bits into higher places, tacking on zeros in the vacated lower places.

As Ints	As Bits
3<<0 == 3	0011<<0 == 0011
3<<1 == 6	0011<<1 == 0110
3<<2 == 12	0011<<2 == 1100

Interesting facts about left shift:

1<<n pushes a 1 up into bit number n, creating the bit pattern 1 followed by n zeros.
The value of (k<<n) is actually k*2ⁿ. This means bit shifting can be used as a faster multiply by a power of two.
k<<0 == k, for any k.
(k<<n) >= k, for any n and k (unless you have "overflow"!).
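For a 32-bit int, you might expect (k<<32) == 0, because all the bits of k have overflowed away, but shifting by the full width of the type (or more) is undefined behavior in C and C++. Compilers usually warn about a constant count like this, and x86 hardware only looks at the low 5 bits of the count, so a run-time shift by 32 can leave k unchanged.
Left shift always shifts in fresh new zero bits.
You can left shift by any number of bits from 0 up to one less than the width of the type (0 to 31 for a 32-bit int).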
You can't left shift by a negative number of bits.
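A few of these facts can be checked at compile time. The sketch below assumes 32-bit unsigned values (std::uint32_t) so that overflow is well defined; the function names `bit` and `times_pow2` are made up for illustration.

```cpp
#include <cstdint>

// 1<<n as a function: a single 1 in bit number n (n must be 0..31).
constexpr std::uint32_t bit(int n) { return std::uint32_t(1) << n; }

// k<<n computes k times 2 to the n, as long as nothing overflows.
constexpr std::uint32_t times_pow2(std::uint32_t k, int n) { return k << n; }

static_assert(bit(0) == 1u, "1<<0 is just 1");
static_assert(bit(4) == 16u, "1 followed by four zeros");
static_assert(times_pow2(3, 2) == 12u, "3*4");
static_assert(times_pow2(7, 0) == 7u, "k<<0 == k");

// Overflow: the top bit falls off the end, so the result is smaller than k.
static_assert(times_pow2(0x80000001u, 1) == 2u, "high bit lost");
```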

In C++, the << operator is also overloaded for iostream output. I think this was a poor choice, in particular because "cout<<3<<0;" just prints 3, then 0! To actually print the value of "3<<0", you need parentheses, like this: "cout<<(3<<0);". Operator precedence is screwy for bitwise operators, so you really want to use excess parentheses!

In assembly:

shl is "shift left". Use it like "shl eax,4" (Try this in NetRun now!). Note that the '4' can be a constant, or register cl (low bits of ecx), but not any other register (Try this in NetRun now!).
sal is the same instruction (same machine code).
There's also a "rol" that does a circular left shift: the bits leaving the left side come back in the right side.
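C++ has no rotate operator (C++20 does add std::rotl and std::rotr in the <bit> header), but a circular shift can be built from two ordinary shifts and an OR. Here is a minimal sketch for 32-bit values; the name `rotate_left` is made up for illustration, and the extra `% 32u` keeps both shift counts in the legal 0..31 range.

```cpp
#include <cstdint>

// Circular left shift: bits leaving the top come back in at the bottom.
constexpr std::uint32_t rotate_left(std::uint32_t x, unsigned int n) {
    return (x << (n % 32u)) | (x >> ((32u - n % 32u) % 32u));
}

// The high hex digit f wraps around to become the low hex digit.
static_assert(rotate_left(0xf0000001u, 4) == 0x0000001fu, "rotate by one hex digit");
static_assert(rotate_left(0x12345678u, 0) == 0x12345678u, "rotate by 0 changes nothing");
```

A compiler may or may not turn this pattern into a single rol instruction; that depends on the compiler and optimization settings.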

Bitwise Right Shift: >>

Makes values smaller, by shifting them into lower-valued places. Note the bits in the lowest places just "fall off the end" and vanish.

As Ints	As Bits
3>>0 == 3	0011>>0 == 0011
3>>1 == 1	0011>>1 == 0001
3>>2 == 0	0011>>2 == 0000
6>>1 == 3	0110>>1 == 0011

Interesting facts about right shift:

The value of (k>>n) is actually k/2ⁿ, rounded down. This can be used as a faster divide.
(k<<n)>>n == k, unless overflow has happened.
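For a 32-bit int, you might expect (k>>32) == 0, because all the bits of k have fallen off the end, but a shift count of 32 or more is undefined behavior in C and C++. Expect a compiler warning for a constant count, and on x86 a run-time shift by 32 can leave k unchanged, because the hardware only looks at the low 5 bits of the count.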
There are two flavors of right shift: signed, and unsigned. Unsigned shift fills in the new bits with zeros. Signed shift fills the new bits with copies of the sign bit, so negative numbers stay negative after the shift.
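In C++ the flavor is picked by the type of the value being shifted. The sketch below assumes 32-bit types from <cstdint>, and the function names are made up for illustration. Note that before C++20, right-shifting a negative signed value is implementation-defined; the common compilers use the sign-filling shift.

```cpp
#include <cstdint>

// Unsigned shift: zeros come in at the top.
constexpr std::uint32_t shift_unsigned(std::uint32_t k, int n) { return k >> n; }

// Signed shift: on the usual compilers, copies of the sign bit come in.
constexpr std::int32_t shift_signed(std::int32_t k, int n) { return k >> n; }

static_assert(shift_unsigned(0xFFFFFFF8u, 1) == 0x7FFFFFFCu, "zero fill");
static_assert(shift_signed(-8, 1) == -4, "sign fill keeps it negative");

// Shifting rounds down, while integer division rounds toward zero.
static_assert(shift_signed(-7, 1) == -4, "shift rounds down");
static_assert(-7 / 2 == -3, "division rounds toward zero");
```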

If you're dyslexic, like me, the left shift << and right shift >> can be really tricky to tell apart. I always remember it like this:

k<<n pumps up the value of k (the point of the << is injecting bigness into k)
k>>n drains away the value of k (the point of the >> is draining bigness from k)

In assembly:

shr is the unsigned shift.
sar is the signed (or "arithmetic") shift.
Again, there's a circular right shift "ror".

Bitwise AND: &

Output bits are 1 only if both corresponding input bits are 1. This is useful to "mask out" bits you don't want, by ANDing them with zero.

As Ints	As Bits
3&5 == 1	0011&0101 == 0001
3&6 == 2	0011&0110 == 0010
3&4 == 0	0011&0100 == 0000

Properties:

0=A&0 (AND by 0's creates 0's--used for masking)
A=A&~0 (AND by 1's has no effect)
A=A&A (AND by yourself has no effect)

Bitwise AND is a really really useful tool for extracting bits from a number--you often create a "mask" value with 1's marking the bits you want, and AND by the mask. For example, this code figures out if bit 2 of an integer is set:
    int mask=(1<<2); // in binary: 100
    int value=...;           // in binary: xyz
    if (0!=(mask&value)) // in binary: x00
       ...

In C/C++, bitwise AND has the wrong precedence--leaving out the parentheses in the comparison above gives the wrong answer! Be sure to use extra parentheses!
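Here is a small sketch of what goes wrong. Comparison operators bind tighter than &, so "0!=mask&value" is parsed as "(0!=mask)&value". The second function below spells out that parse explicitly, so the mistake is visible; both function names are made up for illustration.

```cpp
// Correct: the parentheses force the AND to happen before the comparison.
constexpr bool bit2_set(int value) {
    return 0 != ((1 << 2) & value);
}

// What "0!=mask&value" really means: compare first (giving 1), then AND.
// This ends up testing bit 0 of value instead of bit 2.
constexpr bool bit2_set_wrong(int value) {
    return (static_cast<int>(0 != (1 << 2)) & value) != 0;
}

static_assert(bit2_set(4), "binary 100 has bit 2 set");
static_assert(!bit2_set(3), "binary 011 does not");
static_assert(!bit2_set_wrong(4), "wrong precedence misses the set bit");
static_assert(bit2_set_wrong(1), "and reports bit 0 instead");
```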

In assembly, it's the "and" instruction. Very simple!

Bitwise OR: |

Output bits are 1 if either input bit is 1. E.g., 3|5 == 7; or 011 | 101 == 111.

As Ints	As Bits
3|0 == 3	0011|0000 == 0011
3|3 == 3	0011|0011 == 0011
1|4 == 5	0001|0100 == 0101

A=A|0 (OR by 0's has no effect)
~0=A|~0 (OR by 1's creates 1's)
A=A|A (OR by yourself has no effect)

Bitwise OR is useful for sticking together bit fields you've prepared separately. Overall, you use AND to pick apart an integer's values, XOR and NOT to manipulate them, and finally OR to assemble them back together.
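As a sketch of that pick-apart-and-reassemble pattern, here are two 4-bit fields (one hex digit each) stored in one integer. The layout and the function names are assumptions made up for this example.

```cpp
#include <cstdint>

// OR assembles: put `hi` in bits 4..7 and `lo` in bits 0..3.
constexpr std::uint32_t pack(std::uint32_t hi, std::uint32_t lo) {
    return ((hi & 0xFu) << 4) | (lo & 0xFu);
}

// AND picks apart: shift the field down, then mask off everything else.
constexpr std::uint32_t high_field(std::uint32_t b) { return (b >> 4) & 0xFu; }
constexpr std::uint32_t low_field(std::uint32_t b) { return b & 0xFu; }

// Replace the low field: mask the old value out with AND, then OR the new one in.
constexpr std::uint32_t with_low_field(std::uint32_t b, std::uint32_t lo) {
    return (b & ~std::uint32_t(0xF)) | (lo & 0xFu);
}

static_assert(pack(0xA, 0x5) == 0xA5u, "two hex digits side by side");
static_assert(high_field(0xA5) == 0xAu, "upper digit");
static_assert(low_field(0xA5) == 0x5u, "lower digit");
static_assert(with_low_field(0xA5, 0xC) == 0xACu, "only the low digit changes");
```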

Bitwise XOR: ^

Output bits are 1 if either input bit is 1, but not both. E.g., 3^5 == 6; or 011 ^ 101 == 110. Note how the low bit is 0, because both input bits are 1.

As Ints	As Bits
3^5 == 6	0011^0101 == 0110
3^6 == 5	0011^0110 == 0101
3^4 == 7	0011^0100 == 0111

A=A^0 (XOR by zeros has no effect)
~A = A ^ ~0 (XOR by 1's inverts all the bits)
0=A^A (XOR by yourself creates 0's--used in cryptography)

The second property, that XOR by 1 inverts the value, is useful for flipping a set of bits. Generally, XOR is used for equality testing (a^b!=0 means a!=b), controlled bitwise inversion, and crypto.
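A short sketch of the first two uses, with made-up function names. Toggling with the same mask twice gets the original value back, which is the property the crypto uses lean on.

```cpp
#include <cstdint>

// Flip exactly those bits of value that are set in mask.
constexpr std::uint32_t toggle(std::uint32_t value, std::uint32_t mask) {
    return value ^ mask;
}

// a^b is zero exactly when a and b agree in every bit.
constexpr bool same_bits(std::uint32_t a, std::uint32_t b) {
    return (a ^ b) == 0;
}

static_assert(toggle(0xff0, 0x0ff) == 0xf0f, "only the masked bits flip");
static_assert(toggle(toggle(0x1234, 0x0ff), 0x0ff) == 0x1234, "toggling twice undoes it");
static_assert(same_bits(42, 42), "equal values XOR to zero");
static_assert(!same_bits(42, 43), "different values do not");
```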

Bitwise NOT: ~

Output bits are 1 if the corresponding input bit is zero. E.g., ~011 == 111....111100. (The number of leading ones depends on the size of the machine's "int".)

As Ints	As Bits
~0 == -1 (or a big value, if unsigned)	~...0000 == ...1111

I don't use bitwise NOT very often, but it's handy for making an integer whose bits are all 1: ~0 is all-ones.
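Another common use is turning a mask of bits to keep into a mask of bits to clear, by combining NOT with AND. A minimal sketch, assuming 32-bit unsigned values; the names `all_ones` and `clear_bits` are made up for illustration.

```cpp
#include <cstdint>

// Every bit set, for a 32-bit unsigned value.
constexpr std::uint32_t all_ones = ~std::uint32_t(0);

// Clear the bits of value that are set in mask; leave the rest alone.
constexpr std::uint32_t clear_bits(std::uint32_t value, std::uint32_t mask) {
    return value & ~mask;
}

static_assert(all_ones == 0xFFFFFFFFu, "32 one bits");
static_assert(clear_bits(0xff, 0x0f) == 0xf0u, "low hex digit cleared");
static_assert(~0 == -1, "as a signed int, all-ones is -1");
```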

Non-bitwise Logical Operators

Note that the logical operators &&, ||, and ! work just like the bitwise operators &, |, and ~, but on exactly one bit. Internally, these operators map multi-bit values to a single bit by treating zero as a zero bit, and nonzero values as a one bit. So
(2&&4) == 1 (because both 2 and 4 are nonzero)
(2&4) == 0 (because 2==0010 and 4 == 0100 don't have any overlapping one bits).
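The same comparison as a compile-time sketch, with NOT thrown in as well. The wrapper function names are made up for illustration.

```cpp
// Logical operators squash each input to one bit: zero or nonzero.
constexpr bool logical_and(int a, int b) { return a && b; }
constexpr bool logical_not(int a) { return !a; }

// Bitwise operators work on every bit position separately.
constexpr int bitwise_and(int a, int b) { return a & b; }
constexpr int bitwise_not(int a) { return ~a; }

static_assert(logical_and(2, 4) == true, "both are nonzero");
static_assert(bitwise_and(2, 4) == 0, "0010 and 0100 share no one bits");
static_assert(logical_not(2) == false, "2 is nonzero");
static_assert(bitwise_not(2) == -3, "every bit of 2 flipped");
```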

Use of Bitwise Operators

Say you're Google. You've got to search all the HTML pages on the net for any possible word. One way to do this is for each possible word, store a giant table of every HTML document on the net (maybe 10 billion documents) containing one bit per document: 1 if the word appears in that document, 0 if the word doesn't appear. This table is 10 billion bits, about 1GB uncompressed, or only a few dozen megabytes compressed. Given two search words, you can find all the pages that contain both words by ANDing both tables. The output of the bitwise AND, where both bits are set to 1, is a new table listing the HTML pages that contain both search terms; now sort by pagerank, and you're done! Note that storing the big table by bits saves a lot of space, and doing a bitwise AND instead of a regular logical AND saves a lot of time (over 10x speedup in my testing!):

enum {n=1}; // Number of integers in our tables (== size of internet / 32)
unsigned int funky_table[n]={(1u<<24)|(1u<<17)|(1u<<12)|(1u<<4)};
unsigned int aardvark_table[n]={(1u<<31)|(1u<<24)|(1u<<15)|(1u<<6)|(1u<<4)}; // 1u: shifting a signed 1 into bit 31 overflows

/* Match up the bits of these two tables using bitwise operations */
void both_tables(const unsigned int *a,const unsigned int *b,unsigned int *o) {
	for (int i=0;i<n;i++) o[i]=a[i]&b[i]; /* bitwise AND */
}

/* Match up the bits of these two tables using logical (one-bit) operations */
void both_tables_logical(const unsigned int *a,const unsigned int *b,unsigned int *o) 
{
	for (int i=0;i<n;i++) {
		o[i]=0;
		for (int bit=0;bit<32;bit++)
		{
			unsigned int a_bit=a[i]&(1u<<bit);
			unsigned int b_bit=b[i]&(1u<<bit);
			if (a_bit && b_bit) /* logical AND */
				o[i]=o[i]|(1u<<bit);
		}
	}
}

int foo(void) {
	unsigned int output_table[n];
	both_tables(funky_table,aardvark_table,output_table);
	return output_table[0];
}

(Try this in NetRun now!)

The same bitwise testing idea shows up in the "region codes" of Cohen-Sutherland clipping, used in computer graphics.
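As a rough sketch of that idea (not taken from any particular graphics library; the bit assignments and names below are assumptions for illustration): each point gets one bit per side of the clipping rectangle it falls outside of. If the codes of a segment's two endpoints share a set bit, a single bitwise AND shows the whole segment lies off that side.

```cpp
// One bit per side of the clipping rectangle (assumed assignment).
constexpr unsigned int LEFT   = 1u << 0;
constexpr unsigned int RIGHT  = 1u << 1;
constexpr unsigned int BOTTOM = 1u << 2;
constexpr unsigned int TOP    = 1u << 3;

// Region code of point (x,y) against the rectangle [xmin,xmax] x [ymin,ymax].
constexpr unsigned int region_code(double x, double y,
                                   double xmin, double ymin,
                                   double xmax, double ymax) {
    return (x < xmin ? LEFT : 0u) | (x > xmax ? RIGHT : 0u)
         | (y < ymin ? BOTTOM : 0u) | (y > ymax ? TOP : 0u);
}

// Both endpoints are off the same side if their codes share a one bit.
constexpr bool trivially_outside(unsigned int a, unsigned int b) { return (a & b) != 0; }

// Both endpoints are inside if neither code has any bit set.
constexpr bool trivially_inside(unsigned int a, unsigned int b) { return (a | b) == 0; }

static_assert(region_code(-1, 0.5, 0, 0, 1, 1) == LEFT, "left of the box");
static_assert(region_code(2, 2, 0, 0, 1, 1) == (RIGHT | TOP), "above and to the right");
static_assert(trivially_outside(LEFT, LEFT | TOP), "both off the left side");
static_assert(trivially_inside(0u, 0u), "both inside");
```

Segments that are neither trivially inside nor trivially outside still need real intersection work; the bit tests only handle the cheap cases.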

CS 301 Lecture Note, 2014, Dr. Orion Lawlor, UAF Computer Science Department.
