Deploying SSL – Beyond the certificate and cipher suites

If you were to go do a search on the internet for “configuring SSL” you would find a ton of references on configuring your favorite web server to do SSL some of it good and some of it not so good. But what you don’t see a lot of content on is how to deploy it successfully.

What do I mean by successfully? These articles ignore the larger picture, for example:

  1. Are there changes to your content you will need to make?
  2. What about external content and script references?
  3. Are there any SEO considerations?
  4. Are there other related considerations?

To some these things may be common-sense but even for those a refresher never hurt so lets go over them again briefly.

 

Are there changes to your content you will need to make?

Probably, lots of content I encounter explicitly references a protocol serving it (aka href=”http://…” and src=”http://…”) and if that’s the way your content looks then yes you will want to update your content to use relative references, for example

href=”//{hostname}/{uri}”

src=”/{uri}”

This way your content is independent of what protocols are used to transport it, it will also help prevent your users from encountering “mixed content” warnings.

 

What about external content and script references?

Another scenario that causes mixed content warnings is when sites use of scripts and content hosted on other servers that is explicitly referenced over HTTP. The two most common I encounter are YouTube Embeds and Google Analytics but there are lots of different third-party content and scripts out there and each one you embed will also need to support SSL.

Thankfully I have never encountered one that does not support SSL and in most cases you will just need to make the reference relative (“//”) and let the browser decide what protocol to use to get the reference. In the very rare cases where this does not work a quick email to support at the content/script provider will get you the URL to the SSL version of the content/script.

Though this has always been the case one thing to keep in mind is that the perceived performance and actual security of your site is dependent on the performance and security of the providers you include in it. I strongly recommend you check their performance and SSL configuration and ask them to make any changes necessary to address issues this might identify.

 

Are there any SEO considerations?

Aren’t there always? So to achieve all of security benefits of SSL you have to deploy SSL across your entire site (this is commonly referred to as Always On SSL). This means that as far as a search engine is concerned there could be two copies of the same content. This is treated as a negative condition in most page ranking schemes, we address this in a few ways:

1. Tell the search engine which content is authoritative (aka which one we want them to index), we do this using:

    • Updating <link rel=”canonical”> to point to the HTTPS version.
    • Updating the XML Sitemap to refer to the HTTPS version of the content.

Making these two changes ensures the search engine will index the SSL version of the site so the first link the user visits will be your HTTPS version.

These things not only improve the users experience by making them get at the content quicker (instead of relying on a rewrite rule to get them to the HTTPS content) but also help to mitigate MITM attacks that would be possible for organic traffic based on your HTTP urls.

2. Ensure the robots.txt is available over SSL.

3. Redirect all HTTP requests to your site to the HTTPs version using a permanent redirect (a HTTP 301), this will transfer your PageRank to the SSL url.

4. Update the search engine webmaster tools to refer to the HTTPS url instead of the HTTP URL.

 

Are there other considerations?

There are a few, for one there is performance. There is a myth that SSL is computationally expensive, it’s simply not true (at least today) but that doesn’t mean you don’t need to be concerned with performance.

There are several settings you care about, for example it’s common for websites to use domain sharding means when you’re using SSL is each one of those requests represents a new SSL negotiation and the negotiation is the most costly part of the SSL session. While we can’t eliminate this cost we can ensure that the servers terminating our SSL sessions implement session caching and reuse to reduce the impact of the SSL overhead. We can also try to limit the number of domains we use when sharding so reduce the number of SSL sessions needed to finish rendering a site.

You may also want to look at deploying a forward proxy in front of your web servers where all SSL would be terminated; this can give you performance benefits beyond SSL and can simplify key and SSL management in your environment at the same time.

Then there is the question of cookies, while all sensitive cookies should already be marked “secure” so they won’t get sent over non-secure sessions you should consider marking all cookies as “secure” since the whole site is now supposed to be served over SSL.

Depending on how you have authored your rewrite rules there may be static references to HTTP buried in there, you will want to review your rewites to ensure they are protocol independent (where appropriate) so that you don’t end up forcing users through an unnecessary redirect.

And finally setting the HTTP Strict Transport Security header means browsers will visit you over HTTPS the every time, even if not from search results; this will improve relative perceived performance and help protect from MITM attacks.

 

Ryan

 

Resources

1. Choose the Right Certificate, CA Security

2. Deploying SSL – How to get your server configuration right, Ryan Hurst

3. SSL Configuration Checker, X509 Labs

4. SSL Pulse, Trustworthy Internet Movement

5. Bulletproof SSL/TLS and PKI, Ivan Ristic

6. High Performance Browser Networking, Ilya Grigorik

7. How to get the latest stable OpenSSL, Apache and Nginx, Ryan Hurst

8. Always On SSL, OTA

9. Revocation Report, X509 Labs

10. SSL/TLS Deployment Best Practices, Qualys Labs

11. Transport Layer Security, WikiPedia

12. How to botch TLS forward secrecy, AGL

Deploying SSL – How to get your server configuration right

They say the most complicated skill is to be simple; despite SSL and HTTPS having been around for a long time, they still are not as simple as they could be.

One of the reasons for this is that the security industry is constantly learning more about how to design and build secure systems; as a result, the protocols and software used to secure online services need to continuously evolve to keep up with the latest risks.

This situation creates a moving target for server administrators, creating a situation where this year’s “best practice” may not meet next year’s. So how is a web server administrator to keep up with the ever-changing SSL deployment best practices?

There is, of course, a ton of great resources on the web that you can use to follow industry trends and recent security research, but it’s often difficult to distill this information into actionable and interoperable SSL configuration choices.

To help manage this problem there are tools like the X509labs SSL Configuration Checker which look your server’s configuration and makes recommendations on what you should change to address current industry best practices. This tool makes recommendations that are based on current and past security research, trends, and both client and server behavior and capability.

The tool performs over 33 different tests on your server configuration and, based on the results, recommends specific changes you should make to address its findings.

In general, the guidance the tool provides can be categorized as follows:

 

Support latest versions of TLS protocol

Often organizations are slow to pick up newer versions of their web server and SSL implementations.  This is normally a conscious decision attributed to the old adages of “if it’s not broken don’t fix it.”

The problem is that these older versions are plagued with security issues. In many cases, these organizations pick up security patches, but these patches do not include the more recent (and more secure) versions of the protocols.

It is important that all sites add support for TLS 1.2 as this new version of the protocol offers security improvements over its predecessors and lays the groundwork for addressing future security concerns.

Disable older known insecure versions of the SSL protocol

SSL was defined in 1995 and has evolved significantly since then, SSL 2.0 in particular has been found to have a number of vulnerabilities. Thankfully these issues have been resolved in later version of the protocol.

Unfortunately at least 28% of sites today still support it (based on SSL-pulse data); when I speak to server administrators about why they enable this older version they commonly mention concerns over client interoperability. Thankfully browser statistics show us that TLS 1.0 support is ubiquitous and it is no longer necessary to support the older insecure version of the protocol.

 

Choose secure and modern cipher suites

This is one of the more confusing parts of configuring SSL; it’s also one of the most important. No matter how strong the cryptographic key material that goes into your certificate, the strength of your SSL is only as secure as the cryptography used to encrypt the session.

You don’t need to be a cryptographer or security researcher to make the right choices though, the X509Labs SSL configuration checker will help you keep on top of current recommendations. Based on current research, the following would be solid choices for you to go with:

Apache

SSLProtocol -ALL +SSLv3 +TLSv1 +TLSv1.1 +TLSv1.2;
SSLCipherSuite "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS"
SSLHonorCipherOrder on;

Nginx

ssl_protocols SSLv3 TLSv1 TLSv1.1 TLSv1.2;
ssl_ciphers "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS";
ssl_prefer_server_ciphers on;

These settings were chosen based on several factors including strength of the cryptography, interoperability and support for forward secrecy whenever it is supported by both the client and the server.

What is forward secrecy? You have forward secrecy when an attacker needs more than the encrypted traffic from your server and its private key to decrypt the traffic.

For you to be able to use the cipher suites that support forward secrecy here you will need to be using a version of OpenSSL and your web server that was built with ECDHE support. If you’re not you can still use these settings you just won’t offer forward secrecy to your users.

 

Disable insecure options in SSL and HTTP

As a general rule, protocols have options; these options can have unforeseen side-effects.

A great example of this is the option of SSL compression. Compression was added to SSL to improve performance of the protocol but it had a side effect – it enabled attackers to perform cryptanalysis on the cryptographic keys used in SSL. This attack was called CRIME (Compression Ratio Info-leak Made Easy) and, as such, this option is disabled today in secure SSL configurations.

Ensuring your configuration does not enable any such options is key to having a secure SSL configuration.

 

Enable performance optimizing options in SSL

To truly benefit from deploying SSL you need to apply it to your whole site—not doing so exposes sessions to attacks. The most common reason I hear from organizations as to why they are not deploying SSL across their whole site concerns performance.

This is a legitimate concern, according to Forester Research “The average online shopper expects your pages to load in two seconds or less, down from four seconds in 2006; after three seconds, up to 40% will abandon your site”.

And while it is true that an improperly configured web server will perform notably different than a properly configured one, it’s not difficult to configure your servers so that performance is not a major concern.

 

Ryan

 

Resources

  1. Getting the Most Out of SSL Part 1: Choose the Right Certificate, CA Security
  2. SSL Configuration Checker, CA Security
  3. SSL Pulse, Trustworthy Internet Movement
  4. Bulletproof SSL/TLS and PKI, Ivan Ristic
  5. High Performance Browser Networking, Ilya Grigorik
  6. How to get the latest stable OpenSSL, Apache and Nginx, Ryan Hurst
  7. Always On SSL, OTA
  8. Revocation Report, X509 Labs
  9. Transport Layer Security, WikiPedia
  10. Perfect forward secrecy , Wikipedia
  11. SSL Labs: Deploying Forward Secrecy, Qualys
  12. Intercepted today, decrypted tomorrow, Netcraft
  13. How to Build Your Own OpenSSL, Ryan Hurst
  14. Deploying forward secrecy on RedHat, Centos or Fedora based systems, Ryan Hurst

How to do a quick and dirty benchmark on a smart card

So you have to make a decision on which smart card or crypto token you’re going to use on a given project, there are lots of things to consider including price, platform support, certifications, build and software quality, number of certificates and keys it can hold, what algorithms and key sizes it implements amongst numerous others.

But even with those things well understood you need to understand the performance characteristics of the device, the two most common operations your users will likely be doing in an authentication deployment is sign and certificate requests.

Thankfully in Windows it’s fairly easy for us to get an idea of how these operations will perform for users with just a little bit of PowerShell.

In both cases we will use the measure_runs script that can be found here.

Measuring Signing Performance

To do this we will use the support PowerShell has for Code Signing, first thing is you need a code signing certificate whose key is on your smart card; I won’t include how to create one in this post but if you need help just ping me on twitter (@rmhrisk).

1. Make sure you have at least one code signing certificate by running this command:

gci cert:\CurrentUser\My –CodeSigningCert

 

Assuming you do you should see something like this:

 

Thumbprint Subject

———- ——-

EEB2729E922E72E9DCC03000129795939F194358 CN=PowerShell User

 

2. Make sure you can sign with that certificate:

$cert = @(gci cert:\currentuser\my -CodeSigningCert)[0]; Set-AuthenticodeSignature measure_runs.ps1 $cert

 

If things work you will see something like this:

SignerCertificate Status Path

—————– —— —-

EEB2729E922E72E9DCC03000129795939F194358 Valid measure_run..

 

3. Sign 100 times averaging the results:

.\measure_runs.ps1 -numberRuns 100 -run { Set-AuthenticodeSignature measure_runs.ps1 $cert} -before {$cert = @(gci cert:\currentuser\my -CodeSigningCert)[0]}

 

If things work you will see something like this:

[ 1 / 100] (Preparing… ) Running… 0.1874 seconds

[ 2 / 100] (Preparing… ) Running… 0.1700 seconds

[ 3 / 100] (Preparing… ) Running… 0.1316 seconds

[ 4 / 100] (Preparing… ) Running… 0.1426 seconds

[ 5 / 100] (Preparing… ) Running… 0.1468 seconds

Average

——-

0.262226129

 

A few things to keep in mind, you will have been prompted for a pin once in this exercise, why once? Because Windows is caching the handle to the card for the PowerShell process this means anything in the context of that PowerShell session will be able to sign with the key unless you pull the token.

Measuring Certificate Request Generation

To do this we will use a script that generates a self-signed certificate I stumbled on while searching the internet you can find it here (http://poshcode.org/1793), you will need to modify this to refer to the appropriate CSP/KSP for your smart card so we are testing the right device/software.

While a self-signed certificate isn’t exactly a Certificate Request to create both generate a key pair and sign a message so it’s a going to be representative of the CSR generation time also.

To do this test run this command:

.\measure_runs.ps1 -numberRuns 10 -run { .\createselfsigned.ps1}

 

If things work you will see something like this:

[ 1 / 100] Running… 11.9023 seconds

[ 2 / 100] Running… 9.8994 seconds

[ 3 / 100] Running… 9.9996 seconds

[ 4 / 100] Running… 10.7412 seconds

[ 5 / 100] Running… 8.9172 seconds

Average

——-

9.0738614

 

Hope this helps you, if you have any questions don’t hesitate to ask.

 

Ryan

How to get the latest stable OpenSSL, Apache and Nginx

Unfortunately many distributions are slow to pick up the most recent distributions of these core software packages. I see many arguments why this not a problem, the most common being the enterprise distributions backport the most important security fixes so it’s not necessary to get more recent versions.

The problem with this argument is that sometimes security fixes are not patches but are in-fact new features. TLS 1.2 is a great example of this, it has numerous security fixes in it that don’t exist in earlier incarnations of the protocol and the older versions of OpenSSL simply do not support it.

Another argument I hear is “make install” is so easy it doesn’t really matter that distributions do not carry the latest packages because you can just build it yourself. This argument has several issues, one of which is production systems should hardened with only the minimal binaries on it to be supportable, bringing in a development environment on is about as opposite of that as you can be.

So what is a server administrator to do? Thankfully there are several additional repositories available for Enterprise Linux Distributions that offer stable and recent builds of the most commonly used packages, two such repositories are:

Between these two (depending on your OS version) you can probably get the most recent OpenSSL, Nginx, and Apache distributions and all the goodness they carry.

In my case I use Centos 6.3, if you do also you can follow the steps here to add these repositories to you own systems.

Once you have added the repositories you can simply use the IUS replace plug-in to replace your current distribution of OpenSSL with the latest, for example:

yum replace openssl --replace-with=openssl10 --enablerepo=ius-testing

That’s it, now you can enable TLS 1.2 and any other modern TLS features carried by this build of OpenSSL. It is worth noting that at this time this build does not include support for ECC and ECDH which are required for forward secrecy with modern browsers, to get a version that supports these algorithms you will have to build your own.

Example Nginx SSL / TLS configuration

Configuring your server for SSL can be a little overwhelming. To help with this I am writing three posts (one for Nginx, Apache and IIS) with example configurations that (to the extent possible) result in the same configuration regardless of what server you are using.

Let’s start with Nginx, for this site :

  1. Running nginx/1.4.1 and openssl 1.0.1e
  2. All static content is handled by Nginx.
  3. All dynamic content is handled by Node.js and Express.
  4. We use the X-Frame-Options header to help protect from Click-Jacking.
  5. We use the X-Content-Security-Policy
    header to help protect from Cross-Site-Scripting.
  6. All requests for content received over http are redirected to https.
  7. Once the user visits the https version of the site the Strict-Transport-Security header instructs the browser to always start with the https site.
  8. We have chosen SSL cipher suites to offer a blend of performance and security.
  9. We have disabled SSL v2 and v3 and enabled all versions of TLS.
  10. We have enabled OCSP stapling.
  11. We have enabled SSL session caching.
  12. We have put all certificates and keys into their own folder (certs.d/).
  13. Set the owner of the of the certs.d folder to the process that the server runs as.
  14. We have restricted the certs.d folder and key files so only the owner can read and write (chmod 600).

Here is the configuration file:

server {
listen  80;
server_name  example.com;

# tell users to go to SSL version this time
if ($ssl_protocol = "") {
rewrite     ^   https://$server_name$request_uri? permanent;
}

}

server {
listen  443 ssl;
server_name  example.com;

# tell users to go to SSL version next time
add_header Strict-Transport-Security "max-age=15768000; includeSubdomains;";

# tell the browser dont allow hosting in a frame
add_header X-Frame-Options DENY;

# tell the browser we can only talk to self and google analytics.
add_header X-Content-Security-Policy "default-src 'self'; \
script-src 'self' https://ssl.google-analytics.com; \
img-src 'self' https://ssl.google-analytics.com";

ssl_protocols               TLSv1 TLSv1.1 TLSv1.2;

# ciphers chosen and ordered for mix of performance, interoperability and security
#ssl_ciphers                 AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:RC4:HIGH:!MD5:!aNULL:!EDH;

# ciphers chosen for security (drop RC4:HIGH if you are not worried about BEAST).
#ssl_ciphers                  RC4:HIGH:HIGH:!aNULL:!MD5;

# ciphers chosen for FIPS compliance.
#ssl_ciphers !aNULL:!eNULL:FIPS@STRENGTH;

# ciphers chosen for forward secrecy an compatibility
ssl_ciphers "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS";


ssl_prefer_server_ciphers   on;
ssl_certificate_key         certs.d/example.key;
ssl_certificate             certs.d/example.cer;

ssl_session_cache    shared:SSL:10m;
ssl_session_timeout  10m;

# enable ocsp stapling
resolver 8.8.8.8;
ssl_stapling on;
ssl_trusted_certificate certs.d/example.cer;

# let nginx handle the static resources
location ~ ^/(htm/|html/|images/|img/|javascript/|js/|css/|stylesheets/|flash/|media/|static/|robots.txt|humans.txt|favicon.ico) {

root /usr/share/nginx/example/public;
access_log off;
expires @30m;
}

# redirect to node for the dynamic stuff
location / {
proxy_pass http://localhost:8003;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;

proxy_hide_header X-Powered-By;

#proxy_redirect off;
#proxy_set_header   X-Real-IP            $remote_addr;
#proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
#proxy_set_header   X-Forwarded-Proto $scheme;
#proxy_set_header   X-NginX-Proxy    true;
}

error_page  404              /404.html;

# redirect server error pages to the static page /50x.html
#
error_page   500 502 503 504  /50x.html;

location = /50x.html {
root   /usr/share/nginx/html;
}
}

Average CRL size and download time

The other day I had a great conversation with Robert Duncan over at Netcraft, he showed me some reports they have made public about CRL and OCSP performance and uptime.

One thing that I have been meaning to do is to look at average CRL size across the various CAs in a more formal way I just never got around to doing it; conveniently one of the Netcraft reports though included a column for CRL size. So while I was waiting for a meeting to start I decided to figure out what the average sizes were; I focused my efforts on the same CAs I include in the revocation report, this is what I came up with:

 

CA Average CRL Size(K) CRL Download Time @ 56k (s)
Entrust 512.33 74.95
Verisign 200.04 29.26
GoDaddy 173.79 25.42
Comodo 120.75 17.66
Cybertrust/Verizon 75.00 10.97
DigiCert 21.66 3.17
GlobalSign 21.25 3.11
Certum 20.00 2.93
StartSSL 9.40 1.38
TrendMicro 1.00 0.15

 

From this we can derive two charts one for size and another for download time at 56k (about 6% of internet users as of 2010):

clip_image002 clip_image004

 

I overlaid the red line at 10s because that is the timeout that most clients use to indicate when they will give up trying to download, some clients will continue trying in the background so that the next request would have the CRL already cached for the next call.

This threshold is very generous, after all what user is going to hang around for 10 seconds while a CRL is downloaded? This gets worse though the average chain is greater than 3 certificates per chain, two that need to have their status checked :/.

This is one of the reasons we have soft-fail revocation checking, until the Baseline Requirements were published inclusion of OCSP references was not mandatory and not every CA was managing their CRLs to be downloadable within that 10 second threshold.

There are a few ways CAs can manage their CRL sizes, one of the most common is simply roll new intermediate CAs when the CRL size gets unmanageable.

There is something you should understand about the data in the above charts; just because a CRL is published doesn’t mean it represents active certificates – this is one of the reasons I had put of doing this exercise because I wanted to exclude that case by cross-referencing the signing CA with crawler data to see if active certificates were associated with each CRL.

This would exclude the cases where a CA was taken out of operation and all of the associated certificates were revoked as a precautionary exercise – this can happen.

So why did I bother posting this then? It’s just a nice illustration as to why we cannot generally rely on CRLs as a form of revocation checking. In-fact this is very likely why some browsers do not bother trying to download CRLs.

All posts like this should end with a call to action (I need to do better about doing that), in this case I would say it is for CAs to review their revocation practices and how they make certificate status available to ensure it’s available in a fast and reliable manner.

How not to collect sensitive information

So I was chatting with a friend today about the recent Register article about TeliaSonera’s application to add a new root into into the Mozilla root program.

I could not recall ever visiting a site that used a certificate from TeliaSonera so I started looking around for one in my web crawler data, a few moments later my friend pointed me at Telia’s own home page where he noticed they were collecting user id’s and passwords from a CSS layer not served over SSL:

image

More interestingly they also accept your Social Security Number as an alternative to your user id:

image

Now it does submit these over SSL but if your reading this you know that the information is still susceptible to a Man In The Middle attack.

A quick look at its SSL configuration also shows the server has a number of SSL configuration problems:

1. Susceptible to DDOS because it supports client initiated renegotiation

2. Vulnerable to MITM because it supports insecure renegotiation

3. Supports weak and extra weak (export strength) ciphers

 

The certificate itself also doesn’t include OCSP pointers (which is required now under the baseline requirements, though it was issued before they become mandated) but more importantly it only includes a ldap reference to a CRL which most clients wont chase and if they did most firewalls wont allow out – in other words it can’t be meaningfully revoked by anyone other than the browsers.

More concerning is it is issued off of a root certificate authority which means its likely an online CA, vs something in a offline vault which means all sort of badness if you consider not all patches of browsers get adopted and the only way to revoke certificates like this is to patch the browser.

A look at revocation repository uptime

It is no secret that in the last two months GlobalSign was affected by outages at relating to our use of CloudFlare. I won’t go into the specifics behind those outages because the CloudFlare team does a great job of documenting their outages as well as working to make sure the mistakes of the past do not reoccur. With that said we have been working closely with CloudFlare to ensure that our services are better isolated from their other customers and to optimize their network for the traffic our services generate.

I should add that I have a ton of faith in the CloudFlare team, these guys are knowledgeable, incredibly hard working and very self critical — I consider them great partners.

When looking at these events it is important to look at them holistically; for example one of the outages was a result of mitigating what has been called the largest publically announced DDOS in the history of the Internet.

While no downtime is acceptable and I am embarrassed we have had any downtime it’s also important to look at the positives that come from these events, for one we have had an opportunity to test our mitigations for such events and improve them so that in the future we can withstand even larger such attacks.

Additionally it’s also useful to look the actual uptime these services have had and to give those numbers some context look at them next to one of our peers. Thankfully I have this data as a result of the revocation report which tracks performance and uptime from 21 different network worldwide perspectives every minute.

For 05/2012-12/2012 we see:

Service Uptime(%) Avg(ms)
GlobalSign/AlphaSSL OCSP 100.00 101.29
VeriSign/Symantec/Thawte/GeoTrust/Trustcenter OCSP 99.92 319.40
GlobalSign/AlphaSSL CRL 100 96.86
VeriSign/Symantec/Thawte/GeoTrust/Trustcenter CRL 99.97 311.42

 

For 01/2013 to 04/2013 we see:

Service Uptime(%) Avg(ms)
GlobalSign/AlphaSSL OCSP 99.98 76.44
VeriSign/Symantec/Thawte/GeoTrust/Trustcenter OCSP 99.85 302.88
GlobalSign/AlphaSSL CRL 99.98 76.44
VeriSign/Symantec/Thawte/GeoTrust/Trustcenter CRL 99.22 296.97

NOTE:  Symantec operates several different infrastructures – which one you hit is dependent on which brand you buy from and some cases which product you buy. We operate only two brands which share the same infrastructure. I averaged the results for each of their brands together to create these two tables. If you want to see the independent numbers see the Excel document linked to this post.

 

As you can see no one is perfect; I don’t share this to say our downtime is acceptable because it is not, but instead I want to make it clear this is data we track and use to improve our services and to make it clear what the impact really was.

By the way if you want to see the data I used in the above computation you can download these spreadsheets.

Why we built the Revocation Report

For over a year I have been monitoring the industry’s largest OCSP and CRL repositories for performance and uptime. I started this project for a few reasons but to understand them I think it’s appropriate to start with why I joined GlobalSign.

If you’re reading this post you are likely aware of the last few years of attacks against public Certificate Authorities (CA). Though I am no stranger to this space, like you I was watching it all unfold from the outside as I was working at Microsoft in the Advertising division where I was responsible for Security Engineering for their platform.

I recall looking at the industry and feeling frustrated about how little has changed in the last decade, feeling like the Internet was evolving around the CA ecosystem – at least technologically. The focus seemed almost exclusively on policies, procedures and auditing which are of course extremely important when you’re in this business but by themselves they are not a solution.

When I looked at the CA ecosystem there were a few players who I thought understood this; the one I felt got it the most was GlobalSign. Instead of joining the race to the bottom they were developing solutions to help with key management, certificate lifecycle management, and publishing guides to help customers deploy certificates cost effectively.

As a result when they approached me with the opportunity to join them as their CTO and set the technology direction for the company I was intrigued. Those of you who know me know I love data, I believe above all things successful businesses (if they recognize it or not) leverage the Define, Measure, Analyze, Improve and Control cycle to ensure they are solving the right problems and doing so effectively.

To that end when I joined GlobalSign as their CTO and I wanted market intelligence on what the status quo was for technology, operating practices and standards compliance so that I could use to adjust my own priorities as I planned where GlobalSign was going to focus its investments.

It was more than that though, as many of you know I am not new to PKI and especially not to revocation technologies having developed several products / features in this area as well as contributing to the associated standards over the years. I was always frustrated by many public certificate authorities’ inability or unwillingness to acknowledge the inadequacy of their revocation infrastructure and its contribution to slow TLS adoption and bad user agent behavior when it comes to revocation checking.

More directly the reliability and performance of major CA operational infrastructure was why browsers had to implement what is now called “soft-fail” revocation behaviors; the treating of failures to check the status of a certificate the same as a successful check. Yet it is these same people who point fingers at the browsers when the security implications of this behavior are discussed.

I wanted to see that change.

This is why from the very beginning of this project I shared all the data I had with other CAs, my hope was they would use it to improve their infrastructure but unfortunately short of one or two smaller players no one seemed concerned – I was shouting at the wind.

With the limited feedback I had received for the data I had been collecting I decided to put together what is now the revocation report. As part of this project I switched to a different monitoring provider (Monitis) because it gave me more control of what was being monitored and had a more complete API I could use to get at the data.

In parallel I began to work with CloudFlare to address what I felt was one barrier to optimally using a CDN to distribute OCSP responses (inability to cache POSTs). The whole time chronicling my experiences, thoughts and decisions on my blog so that others could learn from my experience and the industry as a whole could benefit.

When I set up the Monitis account I licensed the ability to monitor the top responders from 21 locations worldwide every minute. At first I just published the graphical reports that Monitis had but they had a few problems:

  1. They did not perform very well (at the time).
  2. It was not laid out in such a way you could see all the data at once (also now fixed).
  3. It did not exclude issues associated with their monitoring sensors.
  4. It gave no context to the data that was being presented.

This is what took me to working with Eli to build the revocation report we have today, the internet now has a public view into approximately eleven months (and growing) of performance data for revocation repositories. Eli and I are also working on mining and quantizing the data so we can do something similar for responder uptime but this has taken longer than expected due to other priorities — we will finish it though.

So the question at this point is “was the effort worth it?” — I think so, both of us put a lot of time into this project but I believe it’s been a success for a few reasons:

  1. It allowed me to figure out how to improve our own revocation infrastructure; we now perform at about the same speed as gstatic.google.com for a similarly sized object which is what the bar should be.
  2. Both StartSSL and Entrust have now followed suit and made similar changes to their infrastructure improving their performance by about 3x (besting our performance by a few ms!).
  3. Symantec has improved their primary revocation repository performance by almost 40% and I understand more improvements are on the way.
  4. We are a closer to having data based argument we can present to browsers about why they can and should re-enable hardfail revocation checking by default.
  5. It gives customers visibility into the invisible performance hit associated with the decision of who you choose as your certificate provider.

What do you think? Do you find this valuable? Are there any other elements you think we should be tracking?

Certificate-based Mozilla Persona IdP

My name is David Margrave, I am a guest author on unmitigatedrisk.com.  I have worked in the security sphere for 20 years at various U.S. federal agencies, financial institutions, and retailers.  My interests include improving the state of client authentication on the Internet, which is an area that saw robust developments in the 1990s, then languished for a number of years as the Internet at large seemed content with reusable passwords and form-based authentication over SSL/TLS, but has received renewed scrutiny because of recent large scale data breaches and the veiled promise from the Federal government to ‘fix this mess or we will fix it for you’.

 

The Mozilla Persona project is a recent initiative to improve and standardize browser-based authentication.  For a long time (over 10 years) the most widely-used form of browser-based authentication has been based on HTML forms.  At its most basic level, a user will enter an identifier and reusable password into an HTML form, and submit the form in an HTTPS request to access a protected resource.  The server will receive these values, validate them, and typically return state information in an encrypted and encoded HTTP cookie.  Subsequent visits to the protected resource will send the cookie in the HTTP request, and the server will decrypt and validate the cookie before returning the protected content.   This entire exchange usually takes place over HTTPS, but in many instances the authentication cookie is used over an HTTP connection after initial authentication has completed successfully.  There are other forms of HTTP authentication and other previous attempts at standardization, but a quick survey of the largest retailers and financial institutions will show that HTML form-based authentication is still the most common by far.

 

Assuming that the implementers of these cookie schemes are competent amateur cryptographers and avoided the most glaring mistakes (see this paper by MIT researchers), all of these authentication schemes which rely on HTTP cookies suffer from the same critical flaw:  An attacker who obtains the cookie value can impersonate the user.  The crucial problem is that HTML form-based authentication schemes have not been capable of managing cryptographic keying material on the client side.  More secure schemes such as Kerberos V5 use a ticket in conjunction with an accompanying session key, both of which are stored in a credentials cache.  In contrast to flawed cookie-based schemes, in the Kerberos V5 protocol, a service ticket is useless to an adversary without the accompanying service ticket session key.  An authentication exchange in Kerberos V5 includes the service ticket, and a value encrypted with the service ticket session key, to prove possession.There are some proprietary or enterprise-level solutions to this situation.  For example, Microsoft Internet Explorer and IIS have long had (for over 10 years) the capability to use HTTP Negotiate authentication and to use GSS-API with Kerberos V5 as the underlying mechanism.  The Apache web server has had the capability to accept HTTP Negotiate authentication for several years as well, but the adoption of these solutions on the Internet at large has not been widespread.  At a high level, the Mozilla Persona project improves this situation by bringing the credentials cache and cryptographic capabilities into the browser, and doing so in a standardized manner.  Although the underlying cryptographic algorithms may differ from the Kerberos V5 example, the importance of this project can’t be understated.

 

Persona introduces the concept of the Identity Data Provider (IdP).  The basic idea is that a domain owner is responsible for vouching for the identity of email addresses in that domain.  This could involve whatever scheme the domain owner wishes to implement.  If a domain does not implement an IdP, the Persona system will use its own default IdP which uses the email verification scheme that all Internet users are familiar with:  you prove your ability to receive email at a particular address.  When signing-in to a website which uses Persona authentication, the user will be presented with a dialog window asking for the email address to use.

Screenshot from 2013-04-10 13:14:26

Behind the scenes, the Persona system determines which IdP to use to verify the address.  A domain implementing an IdP must publish some metadata (the public key, and provisioning and verification URLs), in JSON format, at the URL https://domain/.well-known/browserid.  The server at the URL must have a certificate from a trusted certificate authority, and the returned value must be properly-formatted JSON with certain required metadata information (described here).

 

The author implemented an IdP at the domain margrave.com as a research exercise, borrowing from the NodeJS browserid-certifier project.  This particular IdP was written to accept X.509 client certificates issued by a commercial certificate authority, to extract the email address from the X.509 certificate, and issue a persona certificate with that email address. The .well-known/browserid file for node.margrave.com is shown here:

{
    "public-key": {"algorithm":"DS",
        "y":"aab45377fa024964a6b3339d107b91887adf85b96649b5b447a7ac7390866c92d88ed101f6525e717c0d703d5fd8727e0d1d8adb60bb80c7123730616c197326f1eed326fdfc136d7594ffce39a05005a433add8d3344813ea89f6e426d8f5b0bc0d3fdb59c8ec7c19583ba7f14d3636713b84c1ebe62a6866e9c2091def5c25aba967670eabc4591ee3f536006ce5c550265d4b2264c5a989abf908763b41014f35eb2949a0b027a1a1054203a3e13eeb1f16ffb171d6942405546a8407c3fb7e73227e432d150834054edc379de8f8988a8e3b102b70fe5b1164a28a4a453310313e00de1aa177f5ac2b73ef31670e16914607ba4196c06e57f7e5209bc7e4",
        "p":"d6c4e5045697756c7a312d02c2289c25d40f9954261f7b5876214b6df109c738b76226b199bb7e33f8fc7ac1dcc316e1e7c78973951bfc6ff2e00cc987cd76fcfb0b8c0096b0b460fffac960ca4136c28f4bfb580de47cf7e7934c3985e3b3d943b77f06ef2af3ac3494fc3c6fc49810a63853862a02bb1c824a01b7fc688e4028527a58ad58c9d512922660db5d505bc263af293bc93bcd6d885a157579d7f52952236dd9d06a4fc3bc2247d21f1a70f5848eb0176513537c983f5a36737f01f82b44546e8e7f0fabc457e3de1d9c5dba96965b10a2a0580b0ad0f88179e10066107fb74314a07e6745863bc797b7002ebec0b000a98eb697414709ac17b401",
        "q":"b1e370f6472c8754ccd75e99666ec8ef1fd748b748bbbc08503d82ce8055ab3b",
        "g":"9a8269ab2e3b733a5242179d8f8ddb17ff93297d9eab00376db211a22b19c854dfa80166df2132cbc51fb224b0904abb22da2c7b7850f782124cb575b116f41ea7c4fc75b1d77525204cd7c23a15999004c23cdeb72359ee74e886a1dde7855ae05fe847447d0a68059002c3819a75dc7dcbb30e39efac36e07e2c404b7ca98b263b25fa314ba93c0625718bd489cea6d04ba4b0b7f156eeb4c56c44b50e4fb5bce9d7ae0d55b379225feb0214a04bed72f33e0664d290e7c840df3e2abb5e48189fa4e90646f1867db289c6560476799f7be8420a6dc01d078de437f280fff2d7ddf1248d56e1a54b933a41629d6c252983c58795105802d30d7bcd819cf6ef"
    },
    "authentication": "/persona/sign_in.html",
    "provisioning": "/persona/provision.html"
}

 

The public key from the browserid file is the public portion of the key pair used by the IdP to certify users in the domain.  The fact that it must be served over a URL protected with a certificate issued from a trusted CA, is how the Persona system builds on the existing trust infrastructure of the Internet, instead of attempting to re-implement their own from scratch, or requiring operators of websites relying on Persona authentication to establish shared secrets out-of-band.  The authentication and provisioning URLs are how browsers interact with the IdP.

 

In the Certificate-based IdP implemented at margrave.com, the page located at /persona/provision.html includes some javascript which does the following things:  calls an AJAX API to get the email address from the certificate, receives the email address that the user entered in the Persona login dialog via a javascript callback, validates that they match, and calls another AJAX API to issue the certificate.  Note that the email address comparison performed in client-side javascript is purely for UI and troubleshooting purposes, the actual issuance of the Persona certificate uses the email address from the X.509 certificate (if the provisioning process progresses to that point), irrespective of what username was entered in the Persona login dialog.  The client-side validation of the email address is to provide the ability to troubleshoot scenarios where a user may choose the wrong certificate from the browser certificate dialog box, etc.  The client-side provisioning source code is shown below (ancillary AJAX error handling code is omitted).

 

function provision() {

  // Get the email from the cert by visiting a URL that requires client cert auth and returns our cert's email in a json response.
  // This is not strictly necessary, since the server will only issue persona certificates for the email address from the X.509 certificate,
  // but it is useful for troubleshooting, helping the user avoid choosing the wrong certificate from the browser dialog, etc.
  getEmailFromCert(function(emailFromCert) {
      if (emailFromCert) {
          navigator.id.beginProvisioning(function(emailFromPersona, certDuration) {
              if (emailFromPersona===emailFromCert) {
                  navigator.id.genKeyPair(function(publicKey) {
                      // generateServerSide makes an AJAX call to a URL that also requires client cert auth
                      generateServerSide(publicKey, certDuration, function (certificate) {
                          if (navigator.id && navigator.id.registerCertificate) {
                              //alert('registering certificate: ' + certificate);
                              navigator.id.registerCertificate(certificate);
                          }
                      });
                  });
              } else {
                  navigator.id.raiseProvisioningFailure('user is not authenticated as target user');
              }
          });
      } else {
          navigator.id.raiseProvisioningFailure('user does not have a valid X.509 certificate');
      }
  });
}

function generateServerSide(pubkey, duration, cb) {
    $.ajax({
        // Note that this URL requires SSL client certificate authentication,
        // and performs its own authorization on the email address from the certificate,
        // (for example, based on issuing CA or email address domain),
        // and so does not need the email address as an explicit input parameter
        url: "https://node.margrave.com/cert_key",
        type: "POST",
        global: false,
        data: {pubkey: pubkey,
               duration: duration},
               dataType: "json",
        success: function(response) {
                if (response.success && response.certificate) {
                    cb(response.certificate);
                }
            }
    });
    return false;
}

function getEmailFromCert(cb) {
        $.ajax({
            // Note that this URL requires SSL client certificate authentication,
            // and performs its own authorization on the email address from the certificate.
            url: "https://node.margrave.com/email",
            type: "POST",
            global: false,
            dataType: "json",
            success: function(response) {
                cb(response.email);
            }
        });}

 

The other portion of a Persona IdP, the authentication URL, turned out not to be necessary in this case, because the authentication is implicit in the use of X.509 client certificate-authenticated AJAX calls.  Once the Persona certificate has been provisioned, the user is able to access the protected resource.  If things don’t work as expected, the error messages do not seem to bubble up to the UI dialog, and I had to resort to tracing XHR calls with Firebug to determine what went wrong.  In one case, it was a clock skew error that was corrected by installing ntpd on my IdP server.   In another case, one of my IdP AJAX calls may return an error but this error gets masked by a vague UI message.  It may be helpful to standardize HTTP return code and JSON field names to return descriptive error text to the Persona UI.

 

Screenshot from 2013-04-10 13:15:32

 

 

In its current form, this concept could be useful for enterprises, but not really for the Internet at large, since it requires a) that you have a client cert and b) that the IDP for your email domain is certificate-aware.  However, If the persona-default IDP were certificate-aware, or CAs were persona-aware, then there are some interesting possibilities.

  1. The persona default IDP could skip the verification email if a trusted X.509 client certificate is provided.   Possession of a certificate from a trusted CA implies the email address has already been verified, at a minimum.  The Persona system already accepts CA’s trust when retrieving .well-known/browserid, this idea extends CA trust to clients.
  2. Going the other direction: If a CA were to accept a persona certificate from either a domain’s IDP or from the persona-default IDP, and using that to issue X.509 client certificates, or as one part of the client certificate enrollment process (higher assurance certificates may verify more information than email, such as state-issued identification).  This idea seems promising because the email verification scheme is the wheel that everyone on the Internet has reimplemented, in many cases with security flaws.