Cryptography Terminology

CS 463/480 Lecture, Dr. Lawlor

Terminology:

Two parties A and B are trying to exchange data safely. By convention, they're named Alice and Bob, but ATM and Bank is just as good.
The adversary E is working against Alice and Bob. By convention, the adversary is named Eve, short for Evil.
Eve is assumed to be able to read anything Alice and Bob send to each other across the network. This assumption is because communication networks are physically big and hard to secure.
Eve might be able to read the exchanged data, like a bank account number, that Alice and Bob are trying to keep secret. This is a violation of data secrecy.
Eve can also modify information on the network interactively, known as a man-in-the-middle attack even though Eve is a woman.
Eve might be able to modify the exchanged data, like a credit or debit amount, that Alice and Bob are trying to keep unmodified. This is a violation of data integrity.
One simple modification of communication is to just repeat messages Alice and Bob have already sent, known as a replay attack. Preventing the ATM from talking is bad, but making thousands of fake ATM deposits arrive is worse.
Eve might be able to read not just the data but also Alice and Bob's timing information, or heat usage, or another side channel that is difficult to obscure. On the cloud, Eve might be physically on the same server as Bob, which opens many new side channels including cache misses, page merging, and TLB misses.
Normally we assume Alice and Bob have secure local storage and can be trusted to execute complicated algorithms correctly. In an era of stealth rootkits, sleeper malware, and virtual machine introspection, this assumption is increasingly hard to justify: Bob might not even realize he's a Manchurian Candidate.
We also assume Eve knows exactly how Alice and Bob operate. Keeping their algorithm secret is very hard, especially since Eve can just buy an ATM and disassemble it in her warehouse.

The cryptographic tools we use for this include:

A symmetric cipher converts the message from the original human-readable plaintext into ciphertext (encryption) before sending, and back again (decryption) after it is received. It requires Alice and Bob to share a secret key beforehand. Eve can intercept the ciphertext, but can't reconstruct the plaintext unless she has the key.

Example symmetric cipher: AES.
Key distribution is a big problem with symmetric ciphers. A key exchange protocol (e.g., Diffie-Hellman) can be used to set up a shared secret key even while Eve is listening to everything.
Eve can usually corrupt the ciphertext, and change the plaintext in some unpredictable way--symmetric ciphers provide only secrecy, not integrity.
Eve can always just try all possible keys, a brute force attack.
Eve can usually guess at least some of the plaintext (known plaintext), listen for slightly different plaintexts (differential cryptanalysis), or even force the parties to encrypt or decrypt custom messages.

A hash algorithm converts a message into a block of gibberish known as a hash. Given the message, it's easy to compute the hash, but given the hash, it's hard to compute the message.

Example hash algorithm: SHA-256.
Any symmetric cipher can be converted into a hash by encrypting zeros using the message as the key.
Hashing can be used to protect data integrity, using a Hash-based Message Authentication Code (HMAC). Alice sends the message plus HMAC=hash(message + secret key). Bob recomputes the hash from the message and compares it against the HMAC Alice sent. If they match, Eve hasn't tampered with the message in transit, even though she could read everything.

An asymmetric cipher uses two different keys, a public key and a private key. Alice can encrypt messages with Bob's public key, and only Bob can decrypt them using his private key, guaranteeing secrecy (unless Eve guesses or steals Bob's private key). Alice can also encrypt messages using her private key, "signing" the message, and anybody including Bob can decrypt them using her public key, which guarantees message integrity.

Example asymmetric ciphers: RSA, elliptic curve (e.g., ECDSA).
Asymmetric ciphers are normally much slower than symmetric ciphers: for example, with RSA encrypting 128 bytes of data might take tens of milliseconds; with AES it takes under a microsecond.
There are often tricky security restrictions on the use of asymmetric ciphers: for example, if you RSA-encrypt a message of m and m+1, the difference between the two ciphertexts can be used to recover your private key!
This means the only typical uses of asymmetric ciphers are (1) to encrypt a secret key, for key exchange in setting up a symmetric cipher, or (2) to sign a hash.

Even though they're designed to keep secrets, generally cryptographic algorithms and protocols themselves are not secret: it is somewhat harder to crack an unknown cryptosystem, but it's definitely much easier for a secret cryptosystem to contain backdoors and design or implementation flaws. That is, the security lost from revealing the algorithm is more than made up for by the security gained by having lots of very smart people looking for problems.

One huge challenge today is implementing crypto on insecure devices--the best end-to-end encryption ensures the message cannot be captured in transit across the network, but don't protect the message after it's been decrypted on either end.

"Encryption works. Properly implemented strong crypto systems are one of the few things that you can rely on. Unfortunately, endpoint security is so terrifically weak that NSA can frequently find ways around it."
- Edward Snowden, former NSA contractor, now fugitive

CS 463 Lecture

Strong Crypto

	Good Cipher	Bad Cipher
Example ciphers	RC5, AES	Ceaser, Vignere, most student-built ciphers
Key space	Key is hundreds of bits long, so it is impractical to search the key space.	Key is one letter (5 bits) or one word (15 bits), so it is trivial to search the key space.
Key schedule	Key is mixed well before applying to cipher, so even weak keys with lots of zeros still produce secure output.	Key is input into the operations directly, so weak keys (e.g., "AAAA") produce barely-encrypted output or even unmodified plaintext.
Operations	Operations are carefully chosen to preserve bits of the key. Good operations include bitwise XOR, addition, bit rotation, or table lookup in a carefully designed table.	Operations are chosen haphazardly, and may throw away bits of the key or produce predictable output. For example, multiplication by an even number will result in an even output. Bit shift loses bits.
Mixing	Every bit of the key interacts with every bit of the cleartext, so changing any bit results in a totally different ciphertext.	Parts of the key interact with limited or repeating parts of the cleartext, so changes in the cleartext or key produce predictable changes in the ciphertext, allowing differential cryptanalysis.
Rounds	Can repeat the rounds to increase mixing, because the rounds use operations that do not combine algebraically.	Repeated rounds can be algebraically rearranged, so they are equivalent to just using a slightly different key.

For example, one very simple but apparently good cipher is RC5, a 1990's cipher by Ronald Rivest (the "R" in RSA). The central function in RC5 uses a data dependent bit rotation plus a bitwise XOR, and mixes in the key via an ordinary addition operation.

enum {n_rounds=12, n_keys=2*n_rounds+2};

// Circular bit rotate left: rotate v left by count bits
unsigned int ROTL(unsigned int v,unsigned int count)
{
	return (v<<count) | (v>>(32-count));
}

// Mix two integers worth of data, A and B, with these keys.
void RC5_rounds(unsigned int &A,unsigned int &B,const unsigned int *keys) {
	A += keys[0];
	B += keys[1];
	for (int round=1;round<=n_rounds;round++) {
		A = A^B; // A gets modified by B and key
		A = ROTL(A,B) + keys[2*round];

		B = B^A; // B gets modified by A and key
		B = ROTL(B,A) + keys[2*round+1];
	}
}

// Run some simple tests, encrypting ascending data
void foo(void) {
	for (int test=0;test<20;test++) {
		unsigned int keys[n_keys]={1,2,3,4}; // for real RC5, need key schedule to add nonzeros here
		unsigned int A=test, B=0; // plaintext
		RC5_rounds(A,B,keys);
		printf("Test %4d: 0x%08x  0x%08x\n",test,A,B);
	}
}

(Try this in NetRun now!)

Even with this terribly predictable set of keys, which are nearly impossible to get from the hash-like key scheduler used by the real RC5, we still get well scrambled ciphertext out:

Test    0: 0x741d3ad1  0x853b4646
Test    1: 0xcae2ccc0  0xb419a866
Test    2: 0x384a59fe  0x72f4776e
Test    3: 0x1f81b553  0xcede0649
Test    4: 0x4c0e2ebf  0xd37fd15a
Test    5: 0x62dc3be4  0x3b2cd17d
Test    6: 0xc6079b35  0x1199861f
Test    7: 0x12ddde65  0x5ba39c19
Test    8: 0xcd126ea1  0x563f69a5
Test    9: 0xd02b9096  0x35249443
Test   10: 0x495bf4bc  0x87cdb550
Test   11: 0xaf6ff53d  0xbfbf6d57
Test   12: 0xe4005b7d  0x4b62d4d4
Test   13: 0x8d47a25e  0xd932ce1c
Test   14: 0xf81fe948  0xd8956af0
Test   15: 0xa89c2cde  0xc78bdd58
Test   16: 0xb30d25bd  0x6f2dddcc
Test   17: 0xadeb718d  0xed18b88d
Test   18: 0x29f86b80  0x89e0c1ad
Test   19: 0x3a09684b  0x8b7f2a5f

We can see what's happening inside the cipher by printing A and B before each round.

	printf("Round %4d: 0x%08x  0x%08x\n",round,A,B);

(Try this in NetRun now!)

For test 0, A=B=0, but once we add in the non-zero keys, the values rapidly become well-mixed.

Round    1: 0x00000001  0x00000002
Round    2: 0x0000000f  0x00068004
Round    3: 0x006800b0  0x80b4006e
Round    4: 0x0037a037  0x2cc041d0
Round    5: 0xe1e72cf7  0x93e693b6
Round    6: 0xd05c806f  0x09eca1dd
Round    7: 0x5b360436  0x7ad4b6a9
Round    8: 0xc5653e43  0xfd8c4755
Round    9: 0x22c71d2f  0xad3d6fa5
Round   10: 0xff4e5151  0x7de8a4e6
Round   11: 0xa9bd6de0  0xd455c906
Round   12: 0x7a29399f  0xd73e784c

This cascading growth of unpredictable binary numbers is known as the avalanche effect.

One weakness of RC5 (addressed in RC6) is the bit shift only depends on the low 5 bits of A and B. If both data and key have zeros in these bits, no avalanche occurs, and the series of XOR operations falls into a predictable cycle. This is bad, because now running more rounds will not help.

unsigned int keys[n_keys]={0x100,0x2000,0x3000,0x40000};

(Try this in NetRun now!)

Round    1: 0x00000100  0x00002000
Round    2: 0x00005100  0x00047100
Round    3: 0x00042000  0x00005100
Round    4: 0x00047100  0x00042000
Round    5: 0x00005100  0x00047100
Round    6: 0x00042000  0x00005100
Round    7: 0x00047100  0x00042000
Round    8: 0x00005100  0x00047100
Round    9: 0x00042000  0x00005100
Round   10: 0x00047100  0x00042000
Round   11: 0x00005100  0x00047100
Round   12: 0x00042000  0x00005100

Read the whole implementation in the RC5 RFC, including the key schedule step, which is designed to produce good randomness in the expanded keys to avoid this sort of failure.

The RC5 patent expired November 1, 2015.