Cryptography Terminology
    
    CS
      463/480 Lecture, Dr.
      Lawlor
    
    
    "Three people
      may keep a secret, if two of them are dead."
        -- Ben Franklin
    
    I personally have a really hard time
      with secrets.  My job is explaining things to people. 
      Basically all the software I write is open source.
    
    Yet I still have secrets.  I can't
      give out my passwords, my bank account numbers, my SSH private
      keys, or my social security number without losing control over my
      identity.
    
    The folks breaking into the Linux kernel
      distribution site kernel.org aren't trying to
      "steal the source code", which would be absurd since the whole
      point of kernel.org is to give out the source code.  They're
      trying to modify the official copy of the source code, thus
      placing a back door in the millions of computers, cellphones, and
      industrial control systems that use this software.  Back
      doors are a key difference between the control required for
      open-source code, and the looser access to open-access text such
      as Wikipedia, where the damage due to vandalism is narrower.
    
    Table of secrets: On
      revealing the
      secret, + means benefit or enjoyment, - means harm, dislike, or
      disapproval.
    
    
         
        
          | 
 | +You | -You | 
        
          | +Me | Surprise party Anonymous benefactor | Too Much Information (TMI) Exhibitionism | 
        
          | -Me | Voyeurism Identity theft Login credentials Troop movements | Most bodily functions Most medical problems | 
      
    
    
    
    
    
    
    
    Secrets
      versus Accountability
    
    In most organizations, the payroll
      processing system is fairly tricky to set up correctly, since you
      normally want to keep salaries secret (though not at UAF!), and certainly must keep blank check stock and account
      credentials secret.  Yet you also need sufficient
      transparency and auditing to make it difficult for a corrupt
      employee to embezzle payroll money: by paying fictional employees,
      paying real employees too much and splitting the difference with
      them, diverting money withheld for retirement or taxes, etc.
    
    Some organizations have even harder
      problems.  Say you're working in Human Resources at the CIA.  Seemingly simple tasks like
      paying employees become difficult when the employee is not only
      behind enemy lines, but working at enemy headquarters--thus,
      simply mailing your informant a check or a tax form could get them
      killed!  If you're running the CIA payroll, consider the
      difficulty of distinguishing these situations:
    
      
        
          | Activity 
 | Authorized Uses 
 | Unauthorized Uses 
 | 
        
          | Read 
 | Direct deposit paychecks. Audit informant list.
 
 | Steal bank account numbers. Sell informant list.
 
 | 
        
          | Write (modify) 
 | Hire new employees. Fire employees.
 
 | Pay fake employees. Burn current employees.
 
 | 
      
    
    
    There isn't a general technical solution
      to these challenges, but cryptography provides some options--for
      example, you could hide employee identities behind a one-way hash
      function, or encode authorization using public key cryptography.
    How
      paranoid do you need to be?
    
    Some organizations, like intelligence
      services, know they're targeted, and act accordingly.  This
      means the backup CIA payroll computer, like the primary, probably
      sits in a vault with two armed guards, inside a bunker, inside a
      military base.  Backup tapes are hand-carried from the
      primary to the secondary by a carefully scripted, well defended
      transport system, then incinerated onsite. 
    
    Other organizations don't think of
      themselves as targets.  For example, stores that sell hammers
      can still lose millions of customer credit card numbers due to malware on
        their point-of-sale computers.  A number of breaches
      involve unencrypted USB drives, such as a series of disclosures
      involving millions of health care patients
      in British Columbia; but 94% of US hospitals reported a
      data breach in the last two
      years.  After a series of stolen laptops,NASA now has an agency-wide
      whole-disk encryption policy.
    
    Basically everybody, and every
      organization, has at least some secrets that need to be
      protected.  This course will explore how to do that.
    
    Adversarial Algorithms
    One big difference between normal code
      and cryptographic code is the latter is "adversarial": smart
      people giving input are allowed to try to structure it to screw up
      your code.  For example, if your sort has a quadratic worst
      case, but it's rare and happens only on highly non-random
      pathological datasets, you can probably ignore it because it won't
      happen often enough to make a difference.  The same situation
      for crypto is a recipe for disaster, because an attacker can hand
      you pathlogical cases all day long.
      
      Terminology:
    
    
      - Two parties A and B are trying to exchange data safely. 
        By convention, they're named Alice and Bob, but ATM and Bank is
        just as good.  
 
- The adversary E is working against Alice and Bob.  By
        convention, the adversary is named Eve, short for Evil.
- Eve is assumed to be able to read anything Alice and
        Bob send to each other across the network.  This assumption
        is because communication networks are physically big and hard to
        secure.
- Eve might be able to read the exchanged data, like a
        bank account number, that Alice and Bob are trying to keep
        secret.  This is a violation of data secrecy.
 
- Eve can also modify information on the network
        interactively, known as a man-in-the-middle attack even
        though Eve is a woman.
- Eve might be able to modify the exchanged data, like a
        credit or debit amount, that Alice and Bob are trying to keep
        unmodified.  This is a violation of data integrity. 
        
 
- One simple modification of communication is to just repeat
        messages Alice and Bob have already sent, known as a replay
          attack.  Preventing the ATM from talking is bad, but
        making thousands of fake ATM deposits arrive is worse.
- Eve might be able to read not just the data but also Alice and
        Bob's timing information, or heat usage, or another side
          channel that is difficult to obscure.  On the cloud,
        Eve might be physically on the same server as Bob, which opens
        many new side channels including cache misses, page merging, and
        TLB misses.
 
- Normally we assume Alice and Bob have secure local storage and
        can be trusted to execute complicated algorithms
        correctly.  In an era of stealth rootkits, sleeper malware,
        and virtual machine introspection, this assumption is
        increasingly hard to justify: Bob might not even realize he's a
        Manchurian
          Candidate.
- We also assume Eve knows exactly how Alice and Bob
        operate.  Keeping their algorithm secret is very hard,
        especially since Eve can just buy an ATM and disassemble it in
        her warehouse.
 
The cryptographic tools we use for this include:
    
      - A symmetric cipher converts the message from the
        original human-readable plaintext into ciphertext
        (encryption) before sending, and back again (decryption) after
        it is received.  It requires Alice and Bob to share a secret
          key beforehand.  Eve can intercept the ciphertext,
        but can't reconstruct the plaintext because she doesn't have the
        key.  
        - Example symmetric cipher: AES.
 
- Key distribution is a big problem with symmetric
          ciphers.  A key exchange protocol (e.g.,
          Diffie-Hellman) can be used to set up a shared secret key even
          while Eve is listening to everything. 
 
- Eve can usually corrupt the ciphertext, and change the
          plaintext in some unpredictable way--symmetric ciphers provide
          only secrecy, not integrity.
 
- Eve can always just try all possible keys, a brute force
            attack.  
 
- Eve can usually guess at least some of the plaintext (known
          plaintext), listen for slightly different plaintexts
          (differential cryptanalysis), or even force the parties to
          encrypt or decrypt custom messages.
 
- A hash algorithm converts a message into a block of
        gibberish known as a hash.  Given the message, it's easy to
        compute the hash, but given the hash, it's hard to compute the
        message.  
 
        - Example hash algorithm: SHA-256.
- Any symmetric cipher can be converted into a hash by
          encrypting zeros using the message as the key.
 
- Hashing can be used to protect data integrity, using a
          Hash-based Message Authentication Code (HMAC).  Alice
          sends the message plus HMAC=hash(message + secret key). 
          Bob recomputes the hash from the message and compares it
          against the HMAC Alice sent.  If they match, Eve hasn't
          tampered with the message in transit, even though she could
          read everything.
 
- An asymmetric cipher uses two different keys, a public
        key and a private key.  Alice can encrypt messages with
        Bob's public key, and only Bob can decrypt them using his
        private key, guaranteeing secrecy (unless Eve guesses or steals
        Bob's private key).  Alice can also encrypt messages using
        her private key, "signing" the message, and anybody
        including Bob can decrypt them using her public key, which
        guarantees message integrity.  
 
        - Example asymmetric cipher: RSA.
- Asymmetric ciphers are normally much slower than symmetric
          ciphers: for example, with RSA encrypting 128 bytes of data
          might take tens of milliseconds; with AES it takes under a
          microsecond.
- There are often tricky security restrictions on the use of
          asymmetric ciphers: for example, if you RSA-encrypt a message
          of m and m+1, the difference between the two ciphertexts
          equals your private key!
- This means the only typical uses of asymmetric ciphers are
          (1) to encrypt a secret key, for key exchange in setting up a
          symmetric cipher, or (2) to sign a hash.