Certificate Formats

CS 463 Lecture, Dr. Lawlor

There are a ton of specialized formats and standard names used in the HTTPS certificate checking process, most of them standardized in X.509. Starting from the outside and working inward, we have:

PEM

PEM stands for "Privacy Enhanced Mail", since certificates were first used for authenticating email.  A PEM file is just base64-encoded binary (DER) data wrapped in BEGIN/END lines.  You often see the same data in a ".crt" certificate file.

Google's .pem file looks like this:
-----BEGIN CERTIFICATE-----
MIIDgDCCAumgAwIBAgIKFIUNngAAAAB9QDANBgkqhkiG9w0BAQUFADBGMQswCQYD
VQQGEwJVUzETMBEGA1UEChMKR29vZ2xlIEluYzEiMCAGA1UEAxMZR29vZ2xlIElu
dGVybmV0IEF1dGhvcml0eTAeFw0xMzAyMjAxMzM0NTZaFw0xMzA2MDcxOTQzMjda
MGgxCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9ybmlhMRYwFAYDVQQHEw1N
b3VudGFpbiBWaWV3MRMwEQYDVQQKEwpHb29nbGUgSW5jMRcwFQYDVQQDEw53d3cu
Z29vZ2xlLmNvbTCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEA4PUVszIbQhPw
k6LYSXpFVyIEmngQ19OSkna+f8dSr6COmuZQ3EtK9wr4Py8GmSrw3jVC/7zY/JO5
kgHSmDYIl+zTsLn5kBCfCbTUOJCMz+PaMpvkZ6A4FFieBtGQA9IYK5/MnL5AFZt3
WG2px4hEQQj8kfulQaCD3RdRCAF10FsCAwEAAaOCAVEwggFNMB0GA1UdJQQWMBQG
CCsGAQUFBwMBBggrBgEFBQcDAjAdBgNVHQ4EFgQUCBmaAp7Irw9cgY4BT7/mv/E3
LmEwHwYDVR0jBBgwFoAUv8Aw6/VDET5nup6R+/xq2uNrEiQwWwYDVR0fBFQwUjBQ
oE6gTIZKaHR0cDovL3d3dy5nc3RhdGljLmNvbS9Hb29nbGVJbnRlcm5ldEF1dGhv
cml0eS9Hb29nbGVJbnRlcm5ldEF1dGhvcml0eS5jcmwwZgYIKwYBBQUHAQEEWjBY
MFYGCCsGAQUFBzAChkpodHRwOi8vd3d3LmdzdGF0aWMuY29tL0dvb2dsZUludGVy
bmV0QXV0aG9yaXR5L0dvb2dsZUludGVybmV0QXV0aG9yaXR5LmNydDAMBgNVHRMB
Af8EAjAAMBkGA1UdEQQSMBCCDnd3dy5nb29nbGUuY29tMA0GCSqGSIb3DQEBBQUA
A4GBAJvolyDMFonlbMzlMEnldcFmTRrCdoLl38pA2gASQL5FY4CwMIzdw8odva9y
PPNiL7Gwdl2U/XdxeWPjc/7x19gyfZavVng4KGGXfqKZaxw7scFqSu0p//l4Emr6
Q0eccUWKGlcizUsWFdLVzLnhT4ZvFLTbLjlOHNKduxezw4mI
-----END CERTIFICATE-----
With openssl, you can dump a PEM or .crt file's contents using:
   openssl x509 -in google.pem -noout -text 
This produces the following text version of the certificate:
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
14:85:0d:9e:00:00:00:00:7d:40
Signature Algorithm: sha1WithRSAEncryption
Issuer: C=US, O=Google Inc, CN=Google Internet Authority
Validity
Not Before: Feb 20 13:34:56 2013 GMT
Not After : Jun 7 19:43:27 2013 GMT
Subject: C=US, ST=California, L=Mountain View, O=Google Inc, CN=www.google.com
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (1024 bit)
Modulus:
00:e0:f5:15:b3:32:1b:42:13:f0:93:a2:d8:49:7a:
45:57:22:04:9a:78:10:d7:d3:92:92:76:be:7f:c7:
52:af:a0:8e:9a:e6:50:dc:4b:4a:f7:0a:f8:3f:2f:
06:99:2a:f0:de:35:42:ff:bc:d8:fc:93:b9:92:01:
d2:98:36:08:97:ec:d3:b0:b9:f9:90:10:9f:09:b4:
d4:38:90:8c:cf:e3:da:32:9b:e4:67:a0:38:14:58:
9e:06:d1:90:03:d2:18:2b:9f:cc:9c:be:40:15:9b:
77:58:6d:a9:c7:88:44:41:08:fc:91:fb:a5:41:a0:
83:dd:17:51:08:01:75:d0:5b
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Subject Key Identifier:
08:19:9A:02:9E:C8:AF:0F:5C:81:8E:01:4F:BF:E6:BF:F1:37:2E:61
X509v3 Authority Key Identifier:
keyid:BF:C0:30:EB:F5:43:11:3E:67:BA:9E:91:FB:FC:6A:DA:E3:6B:12:24

X509v3 CRL Distribution Points:

Full Name:
URI:http://www.gstatic.com/GoogleInternetAuthority/GoogleInternetAuthority.crl

Authority Information Access:
CA Issuers - URI:http://www.gstatic.com/GoogleInternetAuthority/GoogleInternetAuthority.crt

X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Alternative Name:
DNS:www.google.com
Signature Algorithm: sha1WithRSAEncryption
9b:e8:97:20:cc:16:89:e5:6c:cc:e5:30:49:e5:75:c1:66:4d:
1a:c2:76:82:e5:df:ca:40:da:00:12:40:be:45:63:80:b0:30:
8c:dd:c3:ca:1d:bd:af:72:3c:f3:62:2f:b1:b0:76:5d:94:fd:
77:71:79:63:e3:73:fe:f1:d7:d8:32:7d:96:af:56:78:38:28:
61:97:7e:a2:99:6b:1c:3b:b1:c1:6a:4a:ed:29:ff:f9:78:12:
6a:fa:43:47:9c:71:45:8a:1a:57:22:cd:4b:16:15:d2:d5:cc:
b9:e1:4f:86:6f:14:b4:db:2e:39:4e:1c:d2:9d:bb:17:b3:c3:
89:88

DER

You can extract the underlying binary DER glob from a PEM with
   grep -v "[-]----" google.pem | base64 -d > google.der
Here's the raw binary DER data in the above PEM file.  This is actually a pretty compact format--2500 bytes of certificate text is 1200 bytes of base64, and only 900 bytes of DER.
0<82>^C<80>0<82>^B<E9><A0>^C^B^A^B^B
^T<85>^M<9E>^@^@^@^@}@0^M^F *<86>H<86><F7>^M^A^A^E^E^@0F1^K0 ^F^CU^D
^F^S^BUS1^S0^Q^F^CU^D
^S
Google Inc1"0 ^F^CU^D^C^S^YGoogle Internet Authority0^^^W^M130220133456Z^W^M130607194327Z0h1^K0 ^F^CU^D^F^S^BUS1^S0^Q^F^CU^D^H^S
California1^V0^T^F^CU^D^G^S^MMountain View1^S0^Q^F^CU^D
^S
Google Inc1^W0^U^F^CU^D^C^S^Nwww.google.com0<81><9F>0^M^F *<86>H<86><F7>^M^M^A^A^A^E^@^C<81>
<8D>^@0<81><89>^B<81><81>^@<E0><F5>^U<B3>2ESCB^S<F0><93><A2>
<D8>IzEW"^D<9A>x^P<D7><D3><92><92>v<BE>^?<C7>R<AF><A0><8E><9A><E6>P<DC>KJ<F7>
<F8>?/^F<99>*<F0><DE>5B<FF><BC><D8><FC><93><B9><92>^A<D2><98><97><EC><D3><B0>
<B9><F9><90>^P<9F> <B4><D4>8<90><8C><CF><E3><DA>2<9B><E4>g<A0>8^TX<9E>^F<D1><90>^C<D2>^X+<9F><CC>
<9C><BE>@^U<9B>wXm<A9><C7><88>D<FC><91><FB><A5>A<A0><83><DD>^W
^Au<D0>[^B^C^A^@^A<A3><82>^AQ0<82>^AM0^]^F^CU^]%^D^V0^T^F^H+^F^A^E^E^G^C^A^F^H+
^F^A^E^E^G^C^B0^]^F^CU^]^N^D^V^D^T^H^Y<9A>^B<9E><C8><AF>^O\<81><8E>^AO<BF><E6>
<BF><F1>7.a0^_^F^CU^]#^D^X0^V<80>^T<BF><C0>0<EB><F5>C^Q>g<BA><9E><91><FB><FC>j
<DA><E3>k^R$0[^F^CU^]^_^DT0R0P<A0>N<A0>L
<86>Jhttp://www.gstatic.com/GoogleInternetAuthority/GoogleInternetAuthority.crl0f^F^H+^F^A^E^E^G^A^A^DZ0X0V^F^H+^F^A^E^E^G0^B
<86>Jhttp://www.gstatic.com/GoogleInternetAuthority/GoogleInternetAuthority.crt0^L^F^CU^]^S^A^A<FF>^D^B0^@0^Y^F^CU^]^Q^D^R0^P
<82>^Nwww.google.com0^M^F
*<86>H<86><F7>^M^A^A^E^E^@^C<81><81>^@<9B><E8><97> <CC>^V<89><E5>l<CC><E5>0I<E5>u<C1>fM^Z<C2>v<82><E5><DF><CA>@<DA>^@^R@
<BE>Ec<80><B0>0<8C><DD><C3><CA>^]
<BD><AF>r<<F3>b/<B1><B0>v]<94><FD>wqyc<E3>s<FE><F1><D7><D8>2}<96><AF>Vx8(a<97>~
<A2><99>k^\;<B1><C1>jJ<ED>)<FF><F9>x^Rj<FA>CG<9C>qE<8A>^ZW"<CD>K^V^U<D2><D5><CC><B9><E1>O<86>o^T<B4><DB>.9N^\<D2><9D><BB>^W<B3><C3><89><88>
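The dump above looks a little weird because it comes from "less", which shows control bytes as ^X and other unprintable bytes as <XX> hex.
Note that the strings are length-counted: a tag byte and a length byte come first, then the characters appear directly.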

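See the Wikipedia DER encoding example for how this works.  Tags you can spot in the certificate dump above include 0x30 (SEQUENCE), 0x31 (SET), 0x02 (INTEGER), 0x06 (OBJECT IDENTIFIER), 0x13 (PrintableString), 0x17 (UTCTime), 0x03 (BIT STRING), 0x04 (OCTET STRING), and 0x05 (NULL).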
There's some fairly self-contained DER certificate decoding code as part of PolarSSL, but it's still pretty long and confusing.  I/O code is often long and confusing!

RSA Signature ("the good part")

Here's the Google web server's signature:
    Signature Algorithm: sha1WithRSAEncryption
9b:e8:97:20:cc:16:89:e5:6c:cc:e5:30:49:e5:75:c1:66:4d:
1a:c2:76:82:e5:df:ca:40:da:00:12:40:be:45:63:80:b0:30:
8c:dd:c3:ca:1d:bd:af:72:3c:f3:62:2f:b1:b0:76:5d:94:fd:
77:71:79:63:e3:73:fe:f1:d7:d8:32:7d:96:af:56:78:38:28:
61:97:7e:a2:99:6b:1c:3b:b1:c1:6a:4a:ed:29:ff:f9:78:12:
6a:fa:43:47:9c:71:45:8a:1a:57:22:cd:4b:16:15:d2:d5:cc:
b9:e1:4f:86:6f:14:b4:db:2e:39:4e:1c:d2:9d:bb:17:b3:c3:
89:88
Here's the public key (modulus and exponent) for the certificate one level up, which is "Google Internet Authority":
        Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (1024 bit)
Modulus:
00:c9:ed:b7:a4:8b:9c:57:e7:84:3e:40:7d:84:f4:
8f:d1:71:63:53:99:e7:79:74:14:af:44:99:33:20:
92:8d:7b:e5:28:0c:ba:ad:6c:49:7e:83:5f:34:59:
4e:0a:7a:30:cd:d0:d7:c4:57:45:ed:d5:aa:d6:73:
26:ce:ad:32:13:b8:d7:0f:1d:3b:df:dd:dc:08:36:
a8:6f:51:44:9b:ca:d6:20:52:73:b7:26:87:35:6a:
db:a9:e5:d4:59:a5:2b:fc:67:19:39:fa:93:18:18:
6c:de:dd:25:8a:0e:33:14:47:c2:ef:01:50:79:e4:
fd:69:d1:a7:c0:ac:e2:57:6f
Exponent: 65537 (0x10001)
If I raise the signature to the signing authority's public exponent, modulo the authority's public modulus, I get this big integer:

s=1ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffff003021300906052b0e03021a0500041419a7376a093fab1374222db17b6e421accc00b1b

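This is a standard decoded PKCS#1-v1.5 signature: it starts with 0x0001 (the leading zeros are dropped when printing), then lots of 0xff padding bytes, then a single 0x00, and finally a short DER-encoded structure that names the hash algorithm (here SHA-1) and ends with the 20-byte hash itself, 19a7376a...c00b1b.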
This hash covers everything *up to* the signature.  You could easily decode and check a hash that included the signature, but generating a message with the message's own hash at the end (finding the fixed-point hash H=hash(M+H)) would be extremely computationally difficult: for the 160-bit SHA-1 hash used here, about as hard as brute-forcing a 160-bit key.

Here's the code to raise a signature to the public exponent, modulo the public key modulus:
// Pick public key: Google Internet Authority
BigInteger n; // public key modulus
n.readHex(QUOTE(
00:c9:ed:b7:a4:8b:9c:57:e7:84:3e:40:7d:84:f4:
8f:d1:71:63:53:99:e7:79:74:14:af:44:99:33:20:
92:8d:7b:e5:28:0c:ba:ad:6c:49:7e:83:5f:34:59:
4e:0a:7a:30:cd:d0:d7:c4:57:45:ed:d5:aa:d6:73:
26:ce:ad:32:13:b8:d7:0f:1d:3b:df:dd:dc:08:36:
a8:6f:51:44:9b:ca:d6:20:52:73:b7:26:87:35:6a:
db:a9:e5:d4:59:a5:2b:fc:67:19:39:fa:93:18:18:
6c:de:dd:25:8a:0e:33:14:47:c2:ef:01:50:79:e4:
fd:69:d1:a7:c0:ac:e2:57:6f
));
BigInteger e=65537; // public key exponent

// Pick message: signature of Google web server
BigInteger c;
c.readHex(QUOTE(
9b:e8:97:20:cc:16:89:e5:6c:cc:e5:30:49:e5:75:c1:66:4d:
1a:c2:76:82:e5:df:ca:40:da:00:12:40:be:45:63:80:b0:30:
8c:dd:c3:ca:1d:bd:af:72:3c:f3:62:2f:b1:b0:76:5d:94:fd:
77:71:79:63:e3:73:fe:f1:d7:d8:32:7d:96:af:56:78:38:28:
61:97:7e:a2:99:6b:1c:3b:b1:c1:6a:4a:ed:29:ff:f9:78:12:
6a:fa:43:47:9c:71:45:8a:1a:57:22:cd:4b:16:15:d2:d5:cc:
b9:e1:4f:86:6f:14:b4:db:2e:39:4e:1c:d2:9d:bb:17:b3:c3:
89:88
));

// Check signature (decode message)
BigInteger s=c.modPow(e,n);

// Print signed value, which should be:
// 0x0001 at the start
// lots of 0xff padding
// one 0x00
// DER-encoded hash algorithm and hash
std::cout<<"s="<<s.hex()<<"\n";

Try the code on your machine: Zip, Tar-gzip.  (License is BSD.)