Why bother with short-lived certificates and keys in TLS?

There seems to be a lot of confusion and misinformation about the idea of short-lived certificates and keys, so I thought I would pen some thoughts on the topic in the hope of providing some clarification.

I have seen some argue that the rationale behind short-lived certificates is to address the shortcomings of the CA and browser revocation infrastructure. I would argue this is not the case at all.

In my mind, the main reason for them is to address the issue of key compromise. Long-lived keys have a long period in which they are exposed to theft (see Heartbleed) and are therefore of higher value to an attacker, since a stolen key enables the attacker to impersonate the associated website for that period.

The most important thing to keep in mind is that the nature of key compromise is such that you almost never know it has happened until it is too late (consider DigiNotar as an extreme example).

The importance of protecting the SSL private key is why, in the 90s when SSL deployment started in earnest, large companies tried to deploy SSL in their environments using SSL “accelerators” and “security modules” that protected the keys. This was the “right technical thing” to do but not the “right practical thing”, because it significantly reduced the scalability of SSL-protected services and at the same time massively increased the cost of deploying SSL.

By the mid 00’s we recognized this was not workable, and the use of accelerators basically stopped outside a few edge cases; software keys were used instead. Some implementations, like the Microsoft SCHANNEL implementation, tried to protect the keys by moving them to a separate process, mitigating the risk of theft to some degree. Others simply loaded the keys into the web server’s process and as a result exposed them to Heartbleed-like attacks.

What’s important here is that when this shift happened no one pushed meaningful changes to the maximum validity period of certificates and, by association, their private keys. This meant that we had software keys exposed to theft for five years (the maximum validity period at the time) with no reliable way to detect that they had been stolen. In the unlikely event we did find out about the theft, we would rely on the unreliable CA revocation infrastructure to communicate that issue to User Agents. Clearly this was not an ideal plan.

In short, by significantly reducing the validity of the certificate and keys, we also significantly reduce the value to the attacker.
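
To put rough numbers on this, here is a minimal sketch in Python. It assumes the theft is never detected and that revocation is not relied upon, so expiry is the only thing that stops the attacker. The function name and the ten-day scenario are purely illustrative.

```python
from datetime import datetime, timedelta, timezone


def stolen_key_window(issued_at: datetime, validity: timedelta,
                      stolen_at: datetime) -> timedelta:
    """Time a stolen key stays usable for impersonation.

    Assumes the theft is never detected and revocation is not relied on,
    so certificate expiry is the only thing that ends the exposure.
    """
    not_after = issued_at + validity
    # A key stolen after expiry is worthless for impersonation.
    return max(not_after - stolen_at, timedelta(0))


# Hypothetical scenario: the key is stolen ten days after issuance.
issued = datetime(2020, 1, 1, tzinfo=timezone.utc)
stolen = issued + timedelta(days=10)

five_year = stolen_key_window(issued, timedelta(days=5 * 365), stolen)
ninety_day = stolen_key_window(issued, timedelta(days=90), stolen)

print(five_year.days, ninety_day.days)  # prints: 1815 80
```

Under these assumptions the same theft is worth almost five years of impersonation with the long-lived certificate and under three months with the short-lived one.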

Another issue that short-lived certificates help mitigate is the evolution of the WebPKI. With a long-lived certificate you get virtually none of the security benefits of policy and technology improvements until the old certificates and keys are expunged from the ecosystem. Today the rollout of new policy is largely accomplished via “natural expiration”, which means you have to wait until the last certificate issued under the old policies expires before absolute enforcement is possible.
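
The “natural expiration” wait is simple date arithmetic. The sketch below assumes a certificate could have been issued under the old policy right up to the day the new policy took effect, and that it was issued with the maximum allowed validity. The function name and the dates are hypothetical.

```python
from datetime import date, timedelta


def full_enforcement_date(policy_effective: date, max_validity_days: int) -> date:
    """Upper bound on when the last old-policy certificate expires.

    Assumes an old-policy certificate may have been issued as late as the
    day the new policy took effect, with the maximum permitted validity.
    """
    return policy_effective + timedelta(days=max_validity_days)


# Hypothetical policy change taking effect on 1 January 2020.
policy_day = date(2020, 1, 1)

with_five_year_certs = full_enforcement_date(policy_day, 5 * 365)
with_ninety_day_certs = full_enforcement_date(policy_day, 90)

print(with_five_year_certs, with_ninety_day_certs)
# prints: 2024-12-30 2020-03-31
```

With five-year certificates the ecosystem waits almost five years before the old policy can be fully retired; with 90-day certificates it waits about three months.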

So what is the ideal certificate validity period then? I don’t think there is a one-size-fits-all answer to that question. The best I can offer is:

As short as possible but no shorter.

With some systems it is not possible to deploy automation, and until it is, one needs to pick a validity period that is short enough to mitigate the key compromise and policy risks I discussed while long enough to make management practical.

It is probably easier to answer the question of what the shortest validity period we can reliably use is. The answer to that question is buried in the clock skew of the relying parties. Chrome recently released a new clock synchronization feature that significantly reduces errors related to certificate validity periods. But until that is fully deployed and other UAs adopt similar solutions, you are probably best off keeping certificate validity periods at 30 days to accommodate skew and potential renewal failures.
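
To illustrate how skew and renewal failures eat into a short validity period, here is a rough sketch. The two-thirds renewal point, the skew values, and the function names are my own assumptions for illustration, not something any CA or browser mandates.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before: datetime, not_after: datetime) -> datetime:
    """When to start renewing: two thirds of the way through the lifetime.

    The two-thirds point is an assumed rule of thumb for this sketch.
    """
    lifetime = not_after - not_before
    return not_before + (lifetime * 2) // 3


def is_valid_for_client(not_before: datetime, not_after: datetime,
                        true_now: datetime, client_skew: timedelta) -> bool:
    """Would a client whose clock is off by client_skew accept the certificate?"""
    client_now = true_now + client_skew
    return not_before <= client_now <= not_after


# Hypothetical 30-day certificate.
not_before = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = not_before + timedelta(days=30)

# Renewal starts on day 20, leaving ten days to absorb failures and skew.
renew_at = renewal_time(not_before, not_after)

# A client two days behind rejects a certificate issued an hour ago.
behind = is_valid_for_client(not_before, not_after,
                             not_before + timedelta(hours=1), timedelta(days=-2))

# Renewal has been failing for five days and the client is two days ahead;
# the old certificate is still accepted because of the remaining margin.
ahead = is_valid_for_client(not_before, not_after,
                            renew_at + timedelta(days=5), timedelta(days=2))
```

The shorter the validity period, the smaller both margins become, which is why relying-party clock skew sets a practical floor.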

In short, long-lived keys for SSL exist because we never revisited the threat model of key compromise when we stopped using hardware-protected keys in our SSL deployments, and short-lived keys help deal with the modern reality of SSL deployments.

Ryan

– I have added a few pictures for fun and made some minor text changes for clarity.

5 thoughts on “Why bother with short-lived certificates and keys in TLS?”

  1. Igor

    I disagree with you. Well, I am not sure, but I think so.

    Statements such as “As short as possible but no shorter.” don’t help. Nobody will set up a service with TLS for a short period. I.e. a web page or service which will be gone after a short period of time.

    Short living certificates are actually reducing security more than anything else. Let me explain:

    TLS is doing 3 things:

    1) The connection is private (or secure) because symmetric cryptography is used to encrypt the data transmitted. I.e. you can expect that nobody else (as long as the key material is secured) can see what you are transmitting or receiving.
    2) The connection ensures integrity because each message transmitted includes a message integrity check using a message authentication code to prevent undetected loss or alteration of the data during transmission.
    3) The identity of the communicating parties can be authenticated using public-key cryptography.

    While 1 and 2 are still given with short living certificates, 3 is gone: With a short living certificate you cannot be sure anymore that you are actually communicating with the service (server) you wanted to communicate with or if, for example, a MITM is in between you and your desired target.

    Yeah, I am sure that not many people have recorded fingerprints or any other details from certificates in the past to be able to recognize the service they were using. But some did, and at least these people were able to check that they were actually communicating with the right service (i.e. no MITM, no DNS attack, no state-sponsored attack). If you were unsure you could contact your bank, for example, using *another* channel to get the required details to identify their servers. Same goes for private sites now using Let’s Encrypt and other short living free certificates.

    Now, this is gone. Google was the first when they bought a leaf CA for their own usage (i.e. when connecting to youtube.com, gmail.com or any other Google service you were seeing different certificates on each request due to their load balancing/CDN) but now the majority of the internet is turning to short living certificates. How should one know today if he/she is actually connecting to https://unmitigatedrisk.com for example or is the victim of a MITM or DNS attack (remember all the people using public networks in a café or restaurant)? If you were using HTTP Strict Transport Security and HTTP Public Key Pinning at least (which you don’t do, see https://schd.io/1Mig) I could be sure, provided I had visited your site before from a trusted network without any interference, because my browser would have cached this information and would throw a warning if it did not match when I connect through an untrusted network and someone is trying to attack me. But this doesn’t help users visiting you for the first time or revisiting after a period of time (i.e. after the cached information has expired).

    That’s why I am saying that Let’s Encrypt and all the other short living certificates are actually more damaging the security on the internet than anything else before because they actually removed the “identity” part from the TLS model.

    But secure communication can only happen if you can identify the person you are communicating with. If you don’t know for sure that this is the person he/she says he/she is, it is nice that nobody else can read what you communicate with that person, but if you are actually already talking with the bad guys then you are already lost and don’t know it. And if you feel secure due to the lock icon and because your browser says “secure” you are actually more at risk than ever before because nothing is more dangerous than a wrong assumption…

  2. rmhrisk Post author

    Igor,

    I appreciate you taking the time to post.

    If I understand your disagreement correctly, you are suggesting that the use of short-lived certificates reduces your ability to know who you are dealing with.

    There is some truth to that in that it takes more planning to deploy HPKP due to the smaller window, but HPKP is largely undeployable in my opinion anyway so I don’t know that holding that against shorter lived certificates is an appropriate thing to do. As an aside I do hope that in the future expect/require-ct will be a safer and more deployable alternative to HPKP.

    That said, a short-lived certificate undergoes the exact same authentication process as a long-lived one, regardless of whether it is an EV, OV or DV certificate. As such, our ability to rely on the “WebPKI” to handle the initial vetting process is not materially diminished.

    Another part of your argument seems to be not about short-lived certificates but about the availability of automation. Specifically, your examples depend on it being easy to get a certificate, not on the certificate and keys being “short-lived”. I do not attribute this secondary effect to short-lived certificates and keys, since the same automation works for long-lived certificates and keys, but I do agree that it makes it easier for an attacker to acquire a certificate during a transient attack on infrastructure like BGP and DNS.

    In short, I agree with your points, just not with your attribution.

    To your larger points I do worry about secondary effects as we move to a totally encrypted web but I also do believe that the greater good is being served with this transition.

    Ryan

  3. rmhrisk Post author

    For posterity’s sake I thought I would reply with some of the various other aspects I have discussed with folks on Twitter about this topic.

    With an automated lifecycle solution the credentials used to acquire the certificates are as good as the long-lived certificate.
    See this post where we briefly discuss that. The short form response to that can be summarized as:

    In a properly designed system the account key is operated in another privileged context; for example, on Linux you may have a `slsladmin` user which can write to the folder where the keys and certs are kept but does not have permission to run or modify anything else. The same would be true of `wwwroot`: it should not have access to where the account key is kept. I should note this is how all ACME clients I have seen document their use.

    If you use an automated lifecycle solution and an attacker compromises the web server, they can get as many certificates as they want. The short form response to that can be summarized as:

    If compromising the server is a binary thing, you are doing it wrong. When building a modern system you design it for least privilege. In other words, you want each component to have the minimal set of permissions necessary to perform its narrow task. You also want to design it so that there are many permission boundaries. If you design your entire system to run at the one common privilege level necessary for all tasks, then a failure in any one component takes down the whole thing. Basically this model is another anti-pattern from the 90s.

    Shortening the validity period of a certificate helps mitigate key compromise but does not eliminate it. This is basically restating the first example but I figured I would address it directly. See this post. The short form response to that can be summarized as:

    That is true, but the specific variant discussed in the thread only holds if:
    – You design your solution to run at one common permission level,
    – The attacker can find a flaw allowing them to go from reading your server configuration and keys to ring 0 where they can get the credentials to enroll,
    – You misconfigure the permissions on the enrollment credentials.

    Basically this argument is “once I get root, I can get root” and that’s not a solvable problem.

    Another person points out that short lived certificates do not require a new key to be used. See this post.

    This is true, but there is no good reason to re-use the same key in an end-entity authentication cert, and doing so is an example of badly designed enrollment software and not a flaw in short-lived certificates and keys.

    You will notice these are not arguments against short lived certificates and keys, they are arguments against poorly designed or administered software and/or automation in general.

    I would also argue that short-lived certificates and keys are not strictly dependent on there being automation; a 90- or 180-day certificate can be manually enrolled (I know one very large organization that does this) and is still far shorter than the maximum certificate validity allowed by the CAB Forum.

    Ryan

  4. Pingback: Short lived certificates cannot solve the “un-aware victim” problem

  5. Pingback: It is my turn or short-lived certificates part 2 | UNMITIGATED RISK
