Cryptography Terminology
CS
463/480 Lecture, Dr.
Lawlor
Terminology:
- Two parties A and B are trying to exchange data safely.
By convention, they're named Alice and Bob, but ATM and Bank is
just as good.
- The adversary E is working against Alice and Bob. By
convention, the adversary is named Eve, short for Evil.
- Eve is assumed to be able to read anything Alice and
Bob send to each other across the network. This assumption
is because communication networks are physically big and hard to
secure.
- Eve might be able to read the exchanged data, like a
bank account number, that Alice and Bob are trying to keep
secret. This is a violation of data secrecy.
- Eve can also modify information on the network
interactively, known as a man-in-the-middle attack even
though Eve is a woman.
- Eve might be able to modify the exchanged data, like a
credit or debit amount, that Alice and Bob are trying to keep
unmodified. This is a violation of data integrity.
- One simple modification of communication is to just repeat
messages Alice and Bob have already sent, known as a replay
attack. Preventing the ATM from talking is bad, but
making thousands of fake ATM deposits arrive is worse.
- Eve might be able to read not just the data but also Alice and
Bob's timing information, or heat usage, or another side
channel that is difficult to obscure. On the cloud,
Eve might be physically on the same server as Bob, which opens
many new side channels including cache misses, page merging, and
TLB misses.
- Normally we assume Alice and Bob have secure local storage and
can be trusted to execute complicated algorithms
correctly. In an era of stealth rootkits, sleeper malware,
and virtual machine introspection, this assumption is
increasingly hard to justify: Bob might not even realize he's a
Manchurian
Candidate.
- We also assume Eve knows exactly how Alice and Bob
operate. Keeping their algorithm secret is very hard,
especially since Eve can just buy an ATM and disassemble it in
her warehouse.
The cryptographic tools we use for this include:
- A symmetric cipher converts the message from the
original human-readable plaintext into ciphertext
(encryption) before sending, and back again (decryption) after
it is received. It requires Alice and Bob to share a secret
key beforehand. Eve can intercept the ciphertext,
but can't reconstruct the plaintext unless she has the
key.
- Example symmetric cipher: AES.
- Key distribution is a big problem with symmetric
ciphers. A key exchange protocol (e.g.,
Diffie-Hellman) can be used to set up a shared secret key even
while Eve is listening to everything.
- Eve can usually corrupt the ciphertext, and change the
plaintext in some unpredictable way--symmetric ciphers provide
only secrecy, not integrity.
- Eve can always just try all possible keys, a brute force
attack.
- Eve can usually guess at least some of the plaintext (known
plaintext), listen for slightly different plaintexts
(differential cryptanalysis), or even force the parties to
encrypt or decrypt custom messages.
- A hash algorithm converts a message into a block of
gibberish known as a hash. Given the message, it's easy to
compute the hash, but given the hash, it's hard to compute the
message.
- Example hash algorithm: SHA-256.
- Any symmetric cipher can be converted into a hash by
encrypting zeros using the message as the key.
- Hashing can be used to protect data integrity, using a
Hash-based Message Authentication Code (HMAC). Alice
sends the message plus HMAC=hash(message + secret key).
Bob recomputes the hash from the message and compares it
against the HMAC Alice sent. If they match, Eve hasn't
tampered with the message in transit, even though she could
read everything.
- An asymmetric cipher uses two different keys, a public
key and a private key. Alice can encrypt messages with
Bob's public key, and only Bob can decrypt them using his
private key, guaranteeing secrecy (unless Eve guesses or steals
Bob's private key). Alice can also encrypt messages using
her private key, "signing" the message, and anybody
including Bob can decrypt them using her public key, which
guarantees message integrity.
- Example asymmetric ciphers: RSA, elliptic curve (e.g., ECDSA).
- Asymmetric ciphers are normally much slower than symmetric
ciphers: for example, with RSA encrypting 128 bytes of data
might take tens of milliseconds; with AES it takes under a
microsecond.
- There are often tricky security restrictions on the use of
asymmetric ciphers: for example, if you RSA-encrypt a message
of m and m+1, the difference between the two ciphertexts can
be used to recover your private key!
- This means the only typical uses of asymmetric ciphers are
(1) to encrypt a secret key, for key exchange in setting up a
symmetric cipher, or (2) to sign a hash.
Even though they're designed to keep secrets, generally
cryptographic algorithms and protocols themselves are not secret: it
is somewhat harder to crack an unknown cryptosystem, but it's
definitely much easier for a secret cryptosystem to contain
backdoors and design or implementation flaws. That is, the
security lost from revealing the algorithm is more than made up for
by the security gained by having lots of very smart people looking
for problems.
One huge challenge today is implementing crypto on insecure
devices--the best end-to-end encryption ensures the message cannot
be captured in transit across the network, but don't protect the
message after it's been decrypted on either end.
"Encryption works. Properly implemented strong crypto
systems are one of the few things that you can rely on.
Unfortunately, endpoint security is so terrifically weak that NSA
can frequently find ways around it."
- Edward
Snowden, former NSA contractor, now fugitive
CS 463 Lecture
Strong Crypto
|
Good Cipher
|
Bad Cipher
|
Example ciphers
|
RC5,
AES
|
Ceaser, Vignere, most student-built ciphers
|
Key space
|
Key is hundreds of bits long, so it is
impractical to search the key space.
|
Key is one letter (5 bits) or one word (15
bits), so it is trivial to search the key space.
|
Key schedule
|
Key is mixed well before applying to cipher,
so even weak keys with lots of zeros still produce secure
output.
|
Key is input into the operations directly, so
weak keys (e.g., "AAAA") produce barely-encrypted output or
even unmodified plaintext.
|
Operations
|
Operations are carefully chosen to preserve
bits of the key. Good operations include bitwise XOR,
addition, bit rotation, or table lookup in a carefully
designed table.
|
Operations are chosen haphazardly, and may
throw away bits of the key or produce predictable
output. For example, multiplication by an even number
will result in an even output. Bit shift loses bits.
|
Mixing
|
Every bit of the key interacts with every bit
of the cleartext, so changing any bit results in a totally
different ciphertext.
|
Parts of the key interact with limited or
repeating parts of the cleartext, so changes in the
cleartext or key produce predictable changes in the
ciphertext, allowing differential cryptanalysis.
|
Rounds
|
Can repeat the rounds to increase mixing,
because the rounds use operations that do not combine
algebraically.
|
Repeated rounds can be algebraically
rearranged, so they are equivalent to just using a slightly
different key.
|
For example, one very simple but apparently good cipher is RC5, a
1990's cipher by Ronald Rivest (the "R" in RSA). The central
function in RC5 uses a data dependent bit rotation plus a bitwise
XOR, and mixes in the key via an ordinary addition operation.
enum {n_rounds=12, n_keys=2*n_rounds+2};
// Circular bit rotate left: rotate v left by count bits
unsigned int ROTL(unsigned int v,unsigned int count)
{
return (v<<count) | (v>>(32-count));
}
// Mix two integers worth of data, A and B, with these keys.
void RC5_rounds(unsigned int &A,unsigned int &B,const unsigned int *keys) {
A += keys[0];
B += keys[1];
for (int round=1;round<=n_rounds;round++) {
A = A^B; // A gets modified by B and key
A = ROTL(A,B) + keys[2*round];
B = B^A; // B gets modified by A and key
B = ROTL(B,A) + keys[2*round+1];
}
}
// Run some simple tests, encrypting ascending data
void foo(void) {
for (int test=0;test<20;test++) {
unsigned int keys[n_keys]={1,2,3,4}; // for real RC5, need key schedule to add nonzeros here
unsigned int A=test, B=0; // plaintext
RC5_rounds(A,B,keys);
printf("Test %4d: 0x%08x 0x%08x\n",test,A,B);
}
}
(Try
this in NetRun now!)
Even with this terribly predictable set of keys, which are nearly
impossible to get from the hash-like key scheduler used by the real
RC5, we still get well scrambled ciphertext out:
Test 0: 0x741d3ad1 0x853b4646
Test 1: 0xcae2ccc0 0xb419a866
Test 2: 0x384a59fe 0x72f4776e
Test 3: 0x1f81b553 0xcede0649
Test 4: 0x4c0e2ebf 0xd37fd15a
Test 5: 0x62dc3be4 0x3b2cd17d
Test 6: 0xc6079b35 0x1199861f
Test 7: 0x12ddde65 0x5ba39c19
Test 8: 0xcd126ea1 0x563f69a5
Test 9: 0xd02b9096 0x35249443
Test 10: 0x495bf4bc 0x87cdb550
Test 11: 0xaf6ff53d 0xbfbf6d57
Test 12: 0xe4005b7d 0x4b62d4d4
Test 13: 0x8d47a25e 0xd932ce1c
Test 14: 0xf81fe948 0xd8956af0
Test 15: 0xa89c2cde 0xc78bdd58
Test 16: 0xb30d25bd 0x6f2dddcc
Test 17: 0xadeb718d 0xed18b88d
Test 18: 0x29f86b80 0x89e0c1ad
Test 19: 0x3a09684b 0x8b7f2a5f
We can see what's happening inside the cipher by printing A and B
before each round.
printf("Round %4d: 0x%08x 0x%08x\n",round,A,B);
(Try
this in NetRun now!)
For test 0, A=B=0, but once we add in the non-zero keys, the values
rapidly become well-mixed.
Round 1: 0x00000001 0x00000002
Round 2: 0x0000000f 0x00068004
Round 3: 0x006800b0 0x80b4006e
Round 4: 0x0037a037 0x2cc041d0
Round 5: 0xe1e72cf7 0x93e693b6
Round 6: 0xd05c806f 0x09eca1dd
Round 7: 0x5b360436 0x7ad4b6a9
Round 8: 0xc5653e43 0xfd8c4755
Round 9: 0x22c71d2f 0xad3d6fa5
Round 10: 0xff4e5151 0x7de8a4e6
Round 11: 0xa9bd6de0 0xd455c906
Round 12: 0x7a29399f 0xd73e784c
This cascading growth of unpredictable binary numbers is known as
the avalanche
effect.
One weakness of RC5 (addressed in RC6) is the bit shift only depends
on the low 5 bits of A and B. If both data and key have zeros
in these bits, no avalanche occurs, and the series of XOR operations
falls into a predictable cycle. This is bad, because now
running more rounds will not help.
unsigned int keys[n_keys]={0x100,0x2000,0x3000,0x40000};
(Try
this in NetRun now!)
Round 1: 0x00000100 0x00002000
Round 2: 0x00005100 0x00047100
Round 3: 0x00042000 0x00005100
Round 4: 0x00047100 0x00042000
Round 5: 0x00005100 0x00047100
Round 6: 0x00042000 0x00005100
Round 7: 0x00047100 0x00042000
Round 8: 0x00005100 0x00047100
Round 9: 0x00042000 0x00005100
Round 10: 0x00047100 0x00042000
Round 11: 0x00005100 0x00047100
Round 12: 0x00042000 0x00005100
Read the whole implementation in the RC5 RFC, including
the key schedule step, which is designed to produce good randomness
in the expanded keys to avoid this sort of failure.
The RC5 patent
expired November 1, 2015.