Cryptography Terminology
CS 463/480 Lecture, Dr. Lawlor

"Three people may keep a secret, if two of them are dead."
-- Ben Franklin
I personally have a really hard time
with secrets. My job is explaining things to people.
Basically all the software I write is open source.
Yet I still have secrets. I can't
give out my passwords, my bank account numbers, my SSH private
keys, or my social security number without losing control over my
identity.
The folks breaking into the Linux kernel
distribution site kernel.org aren't trying to
"steal the source code", which would be absurd since the whole
point of kernel.org is to give out the source code. They're
trying to modify the official copy of the source code, thus
placing a back door in the millions of computers, cellphones, and
industrial control systems that use this software. Back
doors are a key difference between the control required for
open-source code, and the looser access to open-access text such
as Wikipedia, where the damage due to vandalism is narrower.
Table of secrets: on revealing the secret, + means benefit or enjoyment, - means harm, dislike, or disapproval.

        |  +You                  |  -You
  ------+------------------------+-----------------------------
   +Me  |  Surprise party        |  Too Much Information (TMI)
        |  Anonymous benefactor  |  Exhibitionism
  ------+------------------------+-----------------------------
   -Me  |  Voyeurism             |  Most bodily functions
        |  Identity theft        |  Most medical problems
        |  Login credentials     |
        |  Troop movements       |
Secrets versus Accountability
In most organizations, the payroll
processing system is fairly tricky to set up correctly, since you
normally want to keep salaries secret (though not at UAF!), and certainly must keep blank check stock and account
credentials secret. Yet you also need sufficient
transparency and auditing to make it difficult for a corrupt
employee to embezzle payroll money: by paying fictional employees,
paying real employees too much and splitting the difference with
them, diverting money withheld for retirement or taxes, etc.
Some organizations have even harder
problems. Say you're working in Human Resources at the CIA. Seemingly simple tasks like
paying employees become difficult when the employee is not only
behind enemy lines, but working at enemy headquarters--thus,
simply mailing your informant a check or a tax form could get them
killed! If you're running the CIA payroll, consider the
difficulty of distinguishing these situations:
   Activity        |  Authorized Uses            |  Unauthorized Uses
  -----------------+-----------------------------+------------------------------
   Read            |  Direct deposit paychecks.  |  Steal bank account numbers.
                   |  Audit informant list.      |  Sell informant list.
  -----------------+-----------------------------+------------------------------
   Write (modify)  |  Hire new employees.        |  Pay fake employees.
                   |  Fire employees.            |  Burn current employees.
There isn't a general technical solution
to these challenges, but cryptography provides some options--for
example, you could hide employee identities behind a one-way hash
function, or encode authorization using public key cryptography.
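The "hide employee identities behind a one-way hash" idea can be sketched with a keyed hash (HMAC) over the employee's name, so the payroll ledger never stores real identities. The key name and ledger here are made-up illustrations, not any real payroll system:

```python
import hashlib
import hmac

# Hypothetical agency-wide secret key (an assumption for this sketch).
AGENCY_KEY = b"keep-this-key-in-the-vault"

def alias(employee_id: str) -> str:
    """Replace a real identity with a keyed one-way hash (HMAC-SHA256).
    The ledger stores only the alias; without the key, recovering the
    name means brute-forcing guesses through the hash."""
    return hmac.new(AGENCY_KEY, employee_id.encode(), hashlib.sha256).hexdigest()

# Pay by alias, never by name:
ledger = {alias("Jane Q. Informant"): 5000}
```

A plain unkeyed hash would be weaker here: names come from a small, guessable set, so Eve could hash every likely name and match aliases; keying the hash forces her to steal the key first.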
How paranoid do you need to be?
Some organizations, like intelligence
services, know they're targeted, and act accordingly. This
means the backup CIA payroll computer, like the primary, probably
sits in a vault with two armed guards, inside a bunker, inside a
military base. Backup tapes are hand-carried from the
primary to the secondary by a carefully scripted, well defended
transport system, then incinerated onsite.
Other organizations don't think of
themselves as targets. For example, stores that sell hammers
can still lose millions of customer credit card numbers due to malware on
their point-of-sale computers. A number of breaches
involve unencrypted USB drives, such as a series of disclosures
involving millions of health care patients
in British Columbia; but 94% of US hospitals reported a
data breach in the last two
years. After a series of stolen laptops, NASA now has an agency-wide
whole-disk encryption policy.
Basically everybody, and every
organization, has at least some secrets that need to be
protected. This course will explore how to do that.
Adversarial Algorithms
One big difference between normal code
and cryptographic code is that the latter is "adversarial": smart
people control the input, and are actively trying to structure it
to break your code. For example, if your sort has a quadratic worst
case, but it's rare and happens only on highly non-random
pathological datasets, you can probably ignore it, because it won't
happen often enough to make a difference. The same situation
in crypto is a recipe for disaster, because an attacker can hand
you pathological cases all day long.
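As a concrete (non-crypto) illustration of an adversarial input, here is a first-element-pivot quicksort: on random data it behaves well, but an adversary who simply hands it already-sorted data forces the quadratic worst case every time:

```python
import random

def quicksort(a):
    """First-element-pivot quicksort; returns (sorted list, comparison count)."""
    comparisons = 0
    def qs(lst):
        nonlocal comparisons
        if len(lst) <= 1:
            return lst
        pivot, rest = lst[0], lst[1:]
        comparisons += len(rest)   # one pass comparing everything to the pivot
        lo = [x for x in rest if x < pivot]
        hi = [x for x in rest if x >= pivot]
        return qs(lo) + [pivot] + qs(hi)
    return qs(list(a)), comparisons

n = 100
shuffled = list(range(n))
random.shuffle(shuffled)
_, c_random = quicksort(shuffled)             # typical case: ~n log n comparisons
_, c_adversarial = quicksort(list(range(n)))  # sorted input: n(n-1)/2 comparisons
```

For n = 100, the adversarial input costs exactly 4950 comparisons, the worst possible; the random input typically costs well under a fifth of that. An attacker who controls the input gets the worst case on demand.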
Terminology:
- Two parties A and B are trying to exchange data safely.
By convention, they're named Alice and Bob, but ATM and Bank is
just as good.
- The adversary E is working against Alice and Bob. By
convention, the adversary is named Eve, short for Evil.
- Eve is assumed to be able to read anything Alice and
Bob send to each other across the network. This assumption
is because communication networks are physically big and hard to
secure.
- Eve might be able to read the exchanged data, like a
bank account number, that Alice and Bob are trying to keep
secret. This is a violation of data secrecy.
- Eve can also modify information on the network
interactively, known as a man-in-the-middle attack even
though Eve is a woman.
- Eve might be able to modify the exchanged data, like a
credit or debit amount, that Alice and Bob are trying to keep
unmodified. This is a violation of data integrity.
- One simple modification of communication is to just repeat
messages Alice and Bob have already sent, known as a replay
attack. Preventing the ATM from talking is bad, but
making thousands of fake ATM deposits arrive is worse.
- Eve might be able to read not just the data but also Alice and
Bob's timing information, or heat usage, or another side
channel that is difficult to obscure. On the cloud,
Eve might be physically on the same server as Bob, which opens
many new side channels including cache misses, page merging, and
TLB misses.
- Normally we assume Alice and Bob have secure local storage and
can be trusted to execute complicated algorithms
correctly. In an era of stealth rootkits, sleeper malware,
and virtual machine introspection, this assumption is
increasingly hard to justify: Bob might not even realize he's a
Manchurian
Candidate.
- We also assume Eve knows exactly how Alice and Bob
operate. Keeping their algorithm secret is very hard,
especially since Eve can just buy an ATM and disassemble it in
her warehouse.
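A tiny sketch of defeating the replay attack above: the receiver insists every message carry a fresh sequence number, and rejects any number it has already seen. The names and ATM framing are illustrative, not a real protocol:

```python
# Minimal replay-protection sketch: the bank accepts a deposit message
# only if it carries a sequence number it has never seen before.
last_seen = -1   # highest sequence number accepted so far

def accept(seq, message):
    """Reject any message whose sequence number is not strictly fresh."""
    global last_seen
    if seq <= last_seen:
        return False          # replayed (or badly reordered) message: reject
    last_seen = seq
    return True

assert accept(1, "deposit $100")       # original message: accepted
assert not accept(1, "deposit $100")   # Eve replays it verbatim: rejected
assert accept(2, "deposit $50")
```

For this to actually help, the sequence number must be covered by the message's integrity check (e.g., a MAC over the sequence number plus the message); otherwise Eve just increments the counter herself.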
The cryptographic tools we use for this include:
- A symmetric cipher converts the message from the
original human-readable plaintext into ciphertext
(encryption) before sending, and back again (decryption) after
it is received. It requires Alice and Bob to share a secret
key beforehand. Eve can intercept the ciphertext,
but can't reconstruct the plaintext because she doesn't have the
key.
- Example symmetric cipher: AES.
- Key distribution is a big problem with symmetric
ciphers. A key exchange protocol (e.g.,
Diffie-Hellman) can be used to set up a shared secret key even
while Eve is listening to everything.
- Eve can usually corrupt the ciphertext, and change the
plaintext in some unpredictable way--symmetric ciphers provide
only secrecy, not integrity.
- Eve can always just try all possible keys, a brute force
attack.
- Eve can usually guess at least some of the plaintext (known
plaintext), listen for slightly different plaintexts
(differential cryptanalysis), or even force the parties to
encrypt or decrypt custom messages.
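The key-exchange idea can be sketched with toy Diffie-Hellman in a few lines; the prime here is a small demo value, while real deployments use standardized groups of 2048 bits or more:

```python
import secrets

# Toy Diffie-Hellman. p is a Mersenne prime, far too small for real use.
p = 2**127 - 1
g = 5

a = secrets.randbelow(p - 2) + 1   # Alice's private exponent (kept secret)
b = secrets.randbelow(p - 2) + 1   # Bob's private exponent (kept secret)

A = pow(g, a, p)   # Alice -> Bob, sent in the clear: Eve sees this
B = pow(g, b, p)   # Bob -> Alice, sent in the clear: Eve sees this

alice_secret = pow(B, a, p)   # Alice computes (g^b)^a mod p
bob_secret = pow(A, b, p)     # Bob computes (g^a)^b mod p
assert alice_secret == bob_secret   # same shared key, never transmitted
```

Eve sees p, g, A, and B, but recovering a or b from them is the discrete logarithm problem, believed intractable at real key sizes. Note that plain Diffie-Hellman only protects against a listening Eve; a man-in-the-middle Eve can run separate exchanges with each side unless the messages are also authenticated.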
- A hash algorithm converts a message into a block of
gibberish known as a hash. Given the message, it's easy to
compute the hash, but given the hash, it's hard to compute the
message.
- Example hash algorithm: SHA-256.
- Any symmetric cipher can be converted into a hash by
encrypting zeros using the message as the key.
- Hashing can be used to protect data integrity, using a
Hash-based Message Authentication Code (HMAC). Alice
sends the message plus HMAC=hash(message + secret key).
Bob recomputes the hash from the message and compares it
against the HMAC Alice sent. If they match, Eve hasn't
tampered with the message in transit, even though she could
read everything.
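The HMAC check described above looks like this with Python's standard library; the key and messages are made up for the sketch:

```python
import hashlib
import hmac

# A shared key Alice and Bob agreed on earlier (made up for this sketch).
KEY = b"shared-secret-from-earlier-key-exchange"

def send(message):
    """Alice: compute the HMAC of the message under the shared key,
    and send both across the network (Eve can read both)."""
    tag = hmac.new(KEY, message, hashlib.sha256).digest()
    return message, tag

def receive(message, tag):
    """Bob: recompute the HMAC and compare it against the tag Alice sent."""
    expected = hmac.new(KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)   # constant-time comparison

msg, tag = send(b"deposit $100 to account 7")
assert receive(msg, tag)                                   # untampered: accepted
assert not receive(b"deposit $9999 to account 666", tag)   # tampered: rejected
```

Without the key, Eve cannot compute a valid tag for her modified message, so tampering is detected even though she can read every byte in transit.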
- An asymmetric cipher uses two different keys, a public
key and a private key. Alice can encrypt messages with
Bob's public key, and only Bob can decrypt them using his
private key, guaranteeing secrecy (unless Eve guesses or steals
Bob's private key). Alice can also encrypt messages using
her private key, "signing" the message, and anybody
including Bob can decrypt them using her public key, which
guarantees message integrity.
- Example asymmetric cipher: RSA.
- Asymmetric ciphers are normally much slower than symmetric
ciphers: for example, with RSA encrypting 128 bytes of data
might take tens of milliseconds; with AES it takes under a
microsecond.
- There are often tricky security restrictions on the use of
asymmetric ciphers: for example, if you use raw RSA (no padding)
to encrypt two related messages like m and m+1, an eavesdropper
who sees both ciphertexts can often recover the plaintext m itself
(a related-message attack), which is why real systems always add
randomized padding.
- This means the only typical uses of asymmetric ciphers are
(1) to encrypt a secret key, for key exchange in setting up a
symmetric cipher, or (2) to sign a hash.
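A textbook-RSA sketch with deliberately tiny primes shows both uses; the numbers are classroom-sized, and real RSA needs roughly 2048-bit keys plus randomized padding (OAEP for encryption, PSS for signatures):

```python
# Textbook RSA with classroom-sized primes -- illustration only.
p, q = 61, 53
n = p * q                  # public modulus (3233)
phi = (p - 1) * (q - 1)    # used to derive the private key
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse (Python 3.8+)

m = 65                     # "message" encoded as a number, must be < n
c = pow(m, e, n)           # anyone can encrypt with the public key (e, n)
assert pow(c, d, n) == m   # only the holder of d can decrypt

sig = pow(m, d, n)          # "signing": exponentiate with the private key
assert pow(sig, e, n) == m  # anyone can verify with the public key
```

In practice, following use (1), m would be a random symmetric key for AES; following use (2), m would be a SHA-256 hash of the real message, since RSA can only handle messages smaller than the modulus.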