Cryptography Terminology
CS 463/480 Lecture, Dr. Lawlor

"Three people may keep a secret, if two of them are dead."
-- Ben Franklin
I personally have a really hard time
with secrets. My job is explaining things to people.
Basically all the software I write is open source.
Yet I still have secrets. I can't
give out my passwords, my bank account numbers, my SSH private
keys, or my social security number without losing control over my
identity.
The folks breaking into the Linux kernel
distribution site kernel.org aren't trying to
"steal the source code", which would be absurd since the whole
point of kernel.org is to give out the source code. They're
trying to modify the official copy of the source code, thus
placing a back door in the millions of computers, cellphones, and
industrial control systems that use this software. Back
doors are a key difference between the control required for
open-source code, and the looser access to open-access text such
as Wikipedia, where the damage due to vandalism is narrower.
Table of secrets: on revealing the secret, + means benefit or enjoyment, - means harm, dislike, or disapproval.

        |  +You                  |  -You
  ------+------------------------+-----------------------------
   +Me  |  Surprise party        |  Too Much Information (TMI)
        |  Anonymous benefactor  |  Exhibitionism
  ------+------------------------+-----------------------------
   -Me  |  Voyeurism             |  Most bodily functions
        |  Identity theft        |  Most medical problems
        |  Login credentials     |
        |  Troop movements       |
Secrets versus Accountability
In most organizations, the payroll
processing system is fairly tricky to set up correctly, since you
normally want to keep salaries secret (though not at UAF!), and certainly must keep blank check stock and account
credentials secret. Yet you also need sufficient
transparency and auditing to make it difficult for a corrupt
employee to embezzle payroll money: by paying fictional employees,
paying real employees too much and splitting the difference with
them, diverting money withheld for retirement or taxes, etc.
Some organizations have even harder
problems. Say you're working in Human Resources at the CIA. Seemingly simple tasks like
paying employees become difficult when the employee is not only
behind enemy lines, but working at enemy headquarters--thus,
simply mailing your informant a check or a tax form could get them
killed! If you're running the CIA payroll, consider the
difficulty of distinguishing these situations:
   Activity        |  Authorized Uses            |  Unauthorized Uses
  -----------------+-----------------------------+------------------------------
   Read            |  Direct deposit paychecks.  |  Steal bank account numbers.
                   |  Audit informant list.      |  Sell informant list.
  -----------------+-----------------------------+------------------------------
   Write (modify)  |  Hire new employees.        |  Pay fake employees.
                   |  Fire employees.            |  Burn current employees.
There isn't a general technical solution
to these challenges, but cryptography provides some options--for
example, you could hide employee identities behind a one-way hash
function, or encode authorization using public key cryptography.
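The "hide employee identities behind a one-way hash" idea can be sketched with a keyed hash (HMAC) over the employee's name, so the payroll ledger never stores real identities. The key name and ledger here are made-up illustrations, not any real payroll system:

```python
import hashlib
import hmac

# Hypothetical agency-wide secret key (an assumption for this sketch).
AGENCY_KEY = b"keep-this-key-in-the-vault"

def alias(employee_id: str) -> str:
    """Replace a real identity with a keyed one-way hash (HMAC-SHA256).
    The ledger stores only the alias; without the key, recovering the
    name means brute-forcing guesses through the hash."""
    return hmac.new(AGENCY_KEY, employee_id.encode(), hashlib.sha256).hexdigest()

# Pay by alias, never by name:
ledger = {alias("Jane Q. Informant"): 5000}
```

A plain unkeyed hash would be weaker here: names come from a small, guessable set, so Eve could hash every likely name and match aliases; keying the hash forces her to steal the key first.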
How paranoid do you need to be?
Some organizations, like intelligence
services, know they're targeted, and act accordingly. This
means the backup CIA payroll computer, like the primary, probably
sits in a vault with two armed guards, inside a bunker, inside a
military base. Backup tapes are hand-carried from the
primary to the secondary by a carefully scripted, well defended
transport system, then incinerated onsite.
Other organizations don't think of
themselves as targets. For example, stores that sell hammers
can still lose millions of customer credit card numbers due to malware on
their point-of-sale computers. A number of breaches
involve unencrypted USB drives, such as a series of disclosures
involving millions of health care patients
in British Columbia; but 94% of US hospitals reported a
data breach in the last two
years. After a series of stolen laptops, NASA now has an agency-wide
whole-disk encryption policy.
Basically everybody, and every
organization, has at least some secrets that need to be
protected. This course will explore how to do that.
Adversarial Algorithms
One big difference between normal code
and cryptographic code is that the latter is "adversarial": smart
people control the input, and are actively trying to structure it
to break your code. For example, if your sort has a quadratic worst
case, but it's rare and happens only on highly non-random
pathological datasets, you can probably ignore it, because it won't
happen often enough to make a difference. The same situation
in crypto is a recipe for disaster, because an attacker can hand
you pathological cases all day long.
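As a concrete (non-crypto) illustration of an adversarial input, here is a first-element-pivot quicksort: on random data it behaves well, but an adversary who simply hands it already-sorted data forces the quadratic worst case every time:

```python
import random

def quicksort(a):
    """First-element-pivot quicksort; returns (sorted list, comparison count)."""
    comparisons = 0
    def qs(lst):
        nonlocal comparisons
        if len(lst) <= 1:
            return lst
        pivot, rest = lst[0], lst[1:]
        comparisons += len(rest)   # one pass comparing everything to the pivot
        lo = [x for x in rest if x < pivot]
        hi = [x for x in rest if x >= pivot]
        return qs(lo) + [pivot] + qs(hi)
    return qs(list(a)), comparisons

n = 100
shuffled = list(range(n))
random.shuffle(shuffled)
_, c_random = quicksort(shuffled)             # typical case: ~n log n comparisons
_, c_adversarial = quicksort(list(range(n)))  # sorted input: n(n-1)/2 comparisons
```

For n = 100, the adversarial input costs exactly 4950 comparisons, the worst possible; the random input typically costs well under a fifth of that. An attacker who controls the input gets the worst case on demand.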
Terminology:
- Two parties A and B are trying to exchange data safely.
By convention, they're named Alice and Bob, but ATM and Bank is
just as good.
- The adversary E is working against Alice and Bob. By
convention, the adversary is named Eve, short for Evil.
- Eve is assumed to be able to read anything Alice and
Bob send to each other across the network. This assumption
is because communication networks are physically big and hard to
secure.
- Eve might be able to read the exchanged data, like a
bank account number, that Alice and Bob are trying to keep
secret. This is a violation of data secrecy.
- Eve can also modify information on the network
interactively, known as a man-in-the-middle attack even
though Eve is a woman.
- Eve might be able to modify the exchanged data, like a
credit or debit amount, that Alice and Bob are trying to keep
unmodified. This is a violation of data integrity.
- One simple modification of communication is to just repeat
messages Alice and Bob have already sent, known as a replay
attack. Preventing the ATM from talking is bad, but
making thousands of fake ATM deposits arrive is worse.
- Eve might be able to read not just the data but also Alice and
Bob's timing information, or heat usage, or another side
channel that is difficult to obscure. On the cloud,
Eve might be physically on the same server as Bob, which opens
many new side channels including cache misses, page merging, and
TLB misses.
- Normally we assume Alice and Bob have secure local storage and
can be trusted to execute complicated algorithms
correctly. In an era of stealth rootkits, sleeper malware,
and virtual machine introspection, this assumption is
increasingly hard to justify: Bob might not even realize he's a
Manchurian
Candidate.
- We also assume Eve knows exactly how Alice and Bob
operate. Keeping their algorithm secret is very hard,
especially since Eve can just buy an ATM and disassemble it in
her warehouse.
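A tiny sketch of defeating the replay attack above: the receiver insists every message carry a fresh sequence number, and rejects any number it has already seen. The names and ATM framing are illustrative, not a real protocol:

```python
# Minimal replay-protection sketch: the bank accepts a deposit message
# only if it carries a sequence number it has never seen before.
last_seen = -1   # highest sequence number accepted so far

def accept(seq, message):
    """Reject any message whose sequence number is not strictly fresh."""
    global last_seen
    if seq <= last_seen:
        return False          # replayed (or badly reordered) message: reject
    last_seen = seq
    return True

assert accept(1, "deposit $100")       # original message: accepted
assert not accept(1, "deposit $100")   # Eve replays it verbatim: rejected
assert accept(2, "deposit $50")
```

For this to actually help, the sequence number must be covered by the message's integrity check (e.g., a MAC over the sequence number plus the message); otherwise Eve just increments the counter herself.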
The cryptographic tools we use for this include:
- A symmetric cipher converts the message from the
original human-readable plaintext into ciphertext
(encryption) before sending, and back again (decryption) after
it is received. It requires Alice and Bob to share a secret
key beforehand. Eve can intercept the ciphertext,
but can't reconstruct the plaintext because she doesn't have the
key.
- Example symmetric cipher: AES.
- Key distribution is a big problem with symmetric
ciphers. A key exchange protocol (e.g.,
Diffie-Hellman) can be used to set up a shared secret key even
while Eve is listening to everything.
- Eve can usually corrupt the ciphertext, and change the
plaintext in some unpredictable way--symmetric ciphers provide
only secrecy, not integrity.
- Eve can always just try all possible keys, a brute force
attack.
- Eve can usually guess at least some of the plaintext (known
plaintext), listen for slightly different plaintexts
(differential cryptanalysis), or even force the parties to
encrypt or decrypt custom messages.
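The key-exchange idea can be sketched with toy Diffie-Hellman in a few lines; the prime here is a small demo value, while real deployments use standardized groups of 2048 bits or more:

```python
import secrets

# Toy Diffie-Hellman. p is a Mersenne prime, far too small for real use.
p = 2**127 - 1
g = 5

a = secrets.randbelow(p - 2) + 1   # Alice's private exponent (kept secret)
b = secrets.randbelow(p - 2) + 1   # Bob's private exponent (kept secret)

A = pow(g, a, p)   # Alice -> Bob, sent in the clear: Eve sees this
B = pow(g, b, p)   # Bob -> Alice, sent in the clear: Eve sees this

alice_secret = pow(B, a, p)   # Alice computes (g^b)^a mod p
bob_secret = pow(A, b, p)     # Bob computes (g^a)^b mod p
assert alice_secret == bob_secret   # same shared key, never transmitted
```

Eve sees p, g, A, and B, but recovering a or b from them is the discrete logarithm problem, believed intractable at real key sizes. Note that plain Diffie-Hellman only protects against a listening Eve; a man-in-the-middle Eve can run separate exchanges with each side unless the messages are also authenticated.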
- A hash algorithm converts a message into a block of
gibberish known as a hash. Given the message, it's easy to
compute the hash, but given the hash, it's hard to compute the
message.
- Example hash algorithm: SHA-256.
- Any symmetric cipher can be converted into a hash by
encrypting zeros using the message as the key.
- Hashing can be used to protect data integrity, using a
Hash-based Message Authentication Code (HMAC). Alice
sends the message plus HMAC=hash(message + secret key).
Bob recomputes the hash from the message and compares it
against the HMAC Alice sent. If they match, Eve hasn't
tampered with the message in transit, even though she could
read everything.
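The HMAC check described above looks like this with Python's standard library; the key and messages are made up for the sketch:

```python
import hashlib
import hmac

# A shared key Alice and Bob agreed on earlier (made up for this sketch).
KEY = b"shared-secret-from-earlier-key-exchange"

def send(message):
    """Alice: compute the HMAC of the message under the shared key,
    and send both across the network (Eve can read both)."""
    tag = hmac.new(KEY, message, hashlib.sha256).digest()
    return message, tag

def receive(message, tag):
    """Bob: recompute the HMAC and compare it against the tag Alice sent."""
    expected = hmac.new(KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)   # constant-time comparison

msg, tag = send(b"deposit $100 to account 7")
assert receive(msg, tag)                                   # untampered: accepted
assert not receive(b"deposit $9999 to account 666", tag)   # tampered: rejected
```

Without the key, Eve cannot compute a valid tag for her modified message, so tampering is detected even though she can read every byte in transit.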
- An asymmetric cipher uses two different keys, a public
key and a private key. Alice can encrypt messages with
Bob's public key, and only Bob can decrypt them using his
private key, guaranteeing secrecy (unless Eve guesses or steals
Bob's private key). Alice can also encrypt messages using
her private key, "signing" the message, and anybody
including Bob can decrypt them using her public key, which
guarantees message integrity.
- Example asymmetric cipher: RSA.
- Asymmetric ciphers are normally much slower than symmetric
ciphers: for example, with RSA encrypting 128 bytes of data
might take tens of milliseconds; with AES it takes under a
microsecond.
- There are often tricky security restrictions on the use of
asymmetric ciphers: for example, if you use raw RSA (no padding)
to encrypt two related messages like m and m+1, an eavesdropper
who sees both ciphertexts can often recover the plaintext m itself
(a related-message attack), which is why real systems always add
randomized padding.
- This means the only typical uses of asymmetric ciphers are
(1) to encrypt a secret key, for key exchange in setting up a
symmetric cipher, or (2) to sign a hash.
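A textbook-RSA sketch with deliberately tiny primes shows both uses; the numbers are classroom-sized, and real RSA needs roughly 2048-bit keys plus randomized padding (OAEP for encryption, PSS for signatures):

```python
# Textbook RSA with classroom-sized primes -- illustration only.
p, q = 61, 53
n = p * q                  # public modulus (3233)
phi = (p - 1) * (q - 1)    # used to derive the private key
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse (Python 3.8+)

m = 65                     # "message" encoded as a number, must be < n
c = pow(m, e, n)           # anyone can encrypt with the public key (e, n)
assert pow(c, d, n) == m   # only the holder of d can decrypt

sig = pow(m, d, n)          # "signing": exponentiate with the private key
assert pow(sig, e, n) == m  # anyone can verify with the public key
```

In practice, following use (1), m would be a random symmetric key for AES; following use (2), m would be a SHA-256 hash of the real message, since RSA can only handle messages smaller than the modulus.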