Tag Archives: SSL

WebPKI, TLS, cross-signs, device compatibility, and TLS record size.

Both the Chrome and Mozilla root programs have signaled their intent to substantially shorten the time we rely on roots in the WebPKI. I believe this to be a good objective, but I am struggling to get my head around the viability and implications of the change.

In this post, I wanted to capture my inner dialog as I try to do just that 🙂

I figure a good place to start when exploring this direction is why this change is a good idea. The simplest answer to this question is probably that root certificates are really just “key bags” — by that I mean they are just a convenient way to distribute asymmetric keys to clients. 

Asymmetric keys have an effective lifetime; the security properties they offer start to decay from the moment the key is created. This is because the longer the key exists, the more time an attacker has had to derive the corresponding private key from its public key or, worse, to compromise the private key in some way. For this reason, there are numerous sources of guidance on usage periods for asymmetric keys; for example, NIST guidance from 2020 says that a 2048-bit key is usable between 2019 and 2030.

There is also the question of ecosystem agility to consider. Alexander Pope once said, “To err is human, to forgive divine.” This may actually be the most important quote when looking at computer security. After all, security practitioners know that all systems will have security breaches; this is why mature organizations focus so much on detection and response. I believe the same thing is needed for ecosystem management; the corollary, in this case, is how quickly you can respond to unexpected changes in the ecosystem.

Basically having short-lived root keys is good for security and helps ensure the ecosystem does not become calcified and overly dependent on very old keys continuing to exist by forcing them to change regularly.

With all that said, if it were easy we would already be doing it! So what are the challenges that make it hard to get to this ideal? One of the hairiest is the topic of the Internet of Things (IoT). Though I would be the first to admit there is no one-size-fits-all answer here, broadly these devices should not use the WebPKI.

In 2022, the market for the Internet of Things is expected to grow 18% to 14.4 billion active connections. It is expected that by 2025, as supply constraints ease and growth further accelerates, there will be approximately 27 billion connected IoT devices. 

IOT-Analytics.com

A big reason for this is that these devices last a very long time compared to their mobile phone and desktop counterparts. They also, unfortunately, tend not to have managed root stores, and if they do, the roots are updated only as part of firmware updates, which have limited attach rates. To make things worse, these devices make crude simplifying assumptions about what cryptographic algorithms are supported and what root CAs will be used by servers in the future.

But for most IoT applications, like those enabling the smart city, the device life cycle is 10 to 20 years or more. For instance, it doesn’t make sense to replace the wireless module in smart streetlights every few years.

Ingenu

A good topical example of the consequences here might be the payment industry making a decision in the 90s to adopt a single WebPKI CA (a VeriSign-owned and operated root certificate) in payment terminals without having a strategy for updating the devices’ root stores. These terminals communicated with web servers that were processing payments. In short, the servers had to use certificates from this one WebPKI root CA or no payment terminals could reach them.

This design decision was made during a time when SHA-1 was considered “secure”. The thing is cryptographic attacks are always evolving, and since 2005 SHA-1 has not been considered secure. As a result, browsers moved to prevent the use of this algorithm. 

This meant that one of two things would happen, either browsers would break payment terminals by preventing WebPKI CAs from using this algorithm, or payment terminals’ lack of an update strategy would put the internet at risk by slowing the migration from SHA-1. Unfortunately, the answer was the latter.

In short, when you have no update story for roots and you rely on the WebPKI you are essentially putting both your product and the internet at risk. In this case, while a root and firmware update story should have existed regardless, the payment industry should have had a dedicated private PKI for these use cases.

OK, so what does this have to do with moving to a shorter root validity period in browser root programs? Well, when a TLS server chooses what certificate to present to a client, it doesn’t know much more than the IP address of the client. This means it can’t selectively choose a TLS certificate based on whether the client is an IoT device, a phone running an old version of a browser, or a desktop browser.

This is where the concept of “root ubiquity”, and what those in the industry sometimes call “root baking”, comes in. Some root programs process requests for new root inclusion very quickly; maybe within a year your root can become a member of their program, but that isn’t enough. You need the very large majority of the devices that rely on that root program to pick up the new root before you can depend on it being trusted.

In the case of Microsoft Windows, the distribution of a root certificate happens via a feature called AutoRoot Update. The uptake of a new root on consumer devices is very high, as this feature is seldom turned off and the URL that is used to serve the update is seldom blocked. Enterprises and data centers are another matter altogether, though. While I do not have any data on what the split is, anecdotally I can say it’s still hit or miss whether a new root has been distributed to these systems.

Chrome, on the other hand, uses a configuration management system for all Chrome installations that also happens to control what roots are trusted. Again, while I have no data, I can point out that anecdotally this appears to work very well, and since Chrome updates all browsers automatically they tend to have the latest binaries and settings.

Safari on the other hand manages the roots in the firmware/operating system update. This means if a user does not update their devices’ software the roots will never be present even if you are a member of the root program.

These are just a few examples of some of the root programs CAs have to worry about, and the problems they have to consider.  

What this means is the slowest root program in the WebPKI to accept and distribute roots holds back the adoption of new CAs.  

As an aside, it is worth noting that around 59% of all browser traffic (not including IoT devices) is mobile. This includes a lot of developing country traffic where phones are kept longer and replacement models are often still old models with older software and root stores that are not updated. This is before you get to TVs, printers, medical devices, etc. that seldom do forced firmware updates and never manage roots independently from firmware.

As a result of the above, these days I generally tell people that it takes 5-7 years post inclusion to get sufficient ubiquity in a root to be able to rely on it for a popular service.

A CA will typically try to address this problem by acquiring what is known as a cross-sign (see Unobtainium). There are two recent examples of this I can point to: one is the cross-sign for Let’s Encrypt and the other is the cross-sign for Google Trust Services. The problem is most CAs don’t want to help other CAs with cross-signs for obvious reasons. Beyond that, there are fewer and fewer of the “old keys” baked into the oldest devices still in use on the web, meaning there are not a lot of choices for those cross-signs even if you can get one.

Let’s assume for the purpose of this (now long) post that you are able to get a cross sign either from a competitor or because you happen to have custody of older key material that was grandfathered into the root program that has the needed ubiquity.

In the simplest case, you get to a situation where you are asking those that use your certificates to include the two CA certificates in their TLS bundle.  At a minimum, we start to pay a performance penalty for doing this.
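To make the two-certificate bundle concrete, here is a toy sketch of how path building plays out for an old and a new device. It is not a real X.509 validator: certificates are reduced to a subject and an issuer name, and every name in it (`Cert`, `build_path`, "New Root", "Old Root") is a hypothetical label chosen for illustration.

```python
from typing import List, NamedTuple, Optional, Set


class Cert(NamedTuple):
    """Toy certificate: just a subject name and the name of its issuer."""
    subject: str
    issuer: str


def build_path(leaf: Cert, served: List[Cert], trust_store: Set[str]) -> Optional[List[str]]:
    """Walk from the leaf toward a trusted root using only the served certificates.

    Returns the list of names on the path (leaf first, trust anchor last),
    or None if no path to a root in `trust_store` can be built.
    """
    by_subject = {cert.subject: cert for cert in served}
    path = [leaf]
    current = leaf
    # Bound the walk so a malformed bundle cannot loop forever.
    for _ in range(len(served) + 1):
        if current.issuer in trust_store:
            return [cert.subject for cert in path] + [current.issuer]
        next_cert = by_subject.get(current.issuer)
        if next_cert is None or next_cert in path:
            return None
        path.append(next_cert)
        current = next_cert
    return None


# Hypothetical hierarchy: a new root, plus a cross-sign of that root by an old one.
leaf = Cert(subject="example.com", issuer="Intermediate CA")
intermediate = Cert(subject="Intermediate CA", issuer="New Root")
cross_sign = Cert(subject="New Root", issuer="Old Root")

new_device = {"New Root", "Old Root"}   # up-to-date root store
old_device = {"Old Root"}               # root store frozen years ago

with_cross = [intermediate, cross_sign]
without_cross = [intermediate]

print(build_path(leaf, with_cross, new_device))
print(build_path(leaf, with_cross, old_device))
print(build_path(leaf, without_cross, old_device))
```

In this model the new device stops at the new root and never needs the cross-sign, while the old device can only succeed when the extra certificate is in the bundle. Every client pays for the extra bytes either way, which is the tax discussed next.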

As an example, consider that each certificate is about 1.5KB in size; by adding this new certificate to the bundle, every fresh TLS negotiation will carry this tax. If we assume a typical chain is normally 3 in length, that makes our new total 4.5KB in overhead. It’s not a large figure, but if you consider a high-traffic site like Amazon it adds up quickly.

How many purchases are made on Amazon daily? On a daily basis, Amazon ships more than 66,000 orders per hour in the US. About 1.6 million packages are shipped on a daily basis.

capitalcounselor.com

If we assume each of Amazon’s orders equates to a single fresh TLS negotiation (it should be many more than that since many servers power the site), the extra 1.5KB certificate adds roughly 2.4GB of traffic per day (1.6 million × 1.5KB). Now Amazon can deal with that with no problem, but it would slow down the experience for users and potentially cost those users without unlimited internet plans some of their allotted bandwidth. There is also the potential for fragmentation as the TLS handshake gets larger, which increases latency for the user.
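The back-of-the-envelope numbers are easy to reproduce. This is a minimal sketch using only the rough figures from this post (about 1.5KB per certificate, about 1.6 million fresh handshakes per day); the function names are made up for illustration, and decimal units (1KB = 1,000 bytes) are an assumption.

```python
CERT_SIZE_BYTES = 1_500          # rough per-certificate size used in this post
HANDSHAKES_PER_DAY = 1_600_000   # one fresh handshake per shipped package (an assumption)


def chain_size_bytes(cert_count: int, cert_size: int = CERT_SIZE_BYTES) -> int:
    """Bytes of certificate data sent in one handshake for a chain of `cert_count`."""
    return cert_count * cert_size


def daily_extra_bytes(extra_certs: int,
                      handshakes_per_day: int = HANDSHAKES_PER_DAY,
                      cert_size: int = CERT_SIZE_BYTES) -> int:
    """Extra bytes per day caused by adding `extra_certs` certificates to every handshake."""
    return chain_size_bytes(extra_certs, cert_size) * handshakes_per_day


three_cert_chain = chain_size_bytes(3)   # 4,500 bytes, i.e. 4.5KB
one_cross_sign = daily_extra_bytes(1)    # 2,400,000,000 bytes, i.e. about 2.4GB per day
two_cross_signs = daily_extra_bytes(2)   # doubles if a second cross-sign is needed

print(three_cert_chain, one_cross_sign, two_cross_signs)
```

The useful part is less the absolute number than the way it scales: the cost is linear in both the number of extra certificates and the number of fresh handshakes.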

Don’t get me wrong, these are not deal breakers but they are taxes that this cross sign approach represents. 

My fear, and to be clear I’ve not completed the thought process yet, is that the above story creates a situation where CAs need to provide multiple cross signs to support all the various device combinations that are out there in this new world.

If so this scares me for a few reasons, off the top of my head these include:

  1. It is already hard enough to get a single cross sign; doing so multiple times will be that much harder,
  2. It will put new entrants at a disadvantage because they will have neither legacy key material to rely on nor the capital needed to secure cross signs, if commercial terms can even be reached allowing them to get one,
  3. We have spent the last decade making TLS the default and we are close to being able to declare victory; if TLS becomes less reliable we may lose ground on this journey,
  4. It is a sort of regressive taxation on people with older devices that are likely already slow and who are paying for data on a usage basis,
  5. The additional data required will have a negative impact on Time to First Paint (TTFP) in that this extra data has to be exchanged regardless.

Mom always said don’t complain if you don’t have a proposal to make things better, so I figure I should try to propose some alternatives. Before I do though I want to reiterate that this post isn’t a complaint, instead it’s just a representation of my inner dialog on this topic. 

OK, so what might be a better path for us getting to this world of shorter-lived roots? I guess I see a few things that would need to change to address the above constraints:

  1. Even in the browser ecosystem, we need to see root lists updated dynamically,
  2. Inclusion into a root store should take months for your initial inclusion and later updates should happen in weeks,
  3. Browsers should publish data to help CAs understand how ubiquitous their root distributions are so they can assess if cross signs are even necessary,
  4. There needs to be some kind of public guidance to IoT devices on what they should be doing for the usage of certificates in their devices and this needs to include root management strategies.
  5. IoT devices need to stop using the WebPKI; it is increasingly the Desktop and Browser PKI and anything else will get squashed like a bug in the future if they are not careful 🙂
  6. There should be a Capability and Maturity Model for root programs and associated update mechanisms that can be used to drive change in the existing programs 
  7. CAs should stop using one issuing hierarchy for all cases and divide up the hierarchies into as narrow of slices as possible to reduce the need for one root to be trusted in all scenarios.
  8. Root programs should allow and go so far as to encourage CAs to have more than the 3 (typical) roots they allow now to support CAs in doing that segmentation 
  9. WebPKI root programs should operate two root programs, the legacy one with all of its challenges and the new one focused on agility, automation, and very narrow use cases. Then they should use the existence of that program to drive others to the adoption of those narrower use cases.
  10. And I guess finally we have a chance with the Post-Quantum TLS discussions to look at creating an entirely new WebPKI (possibly without X.509!) that is more agile and narrow from the get-go.

Why crawling is not an adequate measurement methodology for the WebPKI

The answer is simple — It’s an incomplete view of the use of the WebPKI.

There are a number of different methodologies a web crawler-based approach might take in measuring the size of the WebPKI. The most naive approach would be to simply scan all IPv4 address space and log all of the certificates you see during this scan.

The problem is that this only shows a small fraction of the certificates that are out there. When you connect to an IP address and the associated web server doesn’t know what host you are trying to connect to it will return its “default” website and use the associated certificate.

That same IP address may literally be responsible for serving millions of sites based on the client’s indicated hostname. With this IP-based enumeration approach, at best you would get one certificate from that host; at worst you wouldn’t even get that, because some servers are not configured with a default site. This is just one problem with this approach; there are many more.
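A toy simulation makes the undercount easy to see. This is a sketch, not a real scanner: `VirtualHostServer` and `scan_by_ip` are hypothetical names, certificates are plain strings, and the server model only captures the one behavior described above (pick a certificate by the hostname the client indicates, fall back to a default if there is one).

```python
from typing import Dict, Iterable, Optional, Set


class VirtualHostServer:
    """Toy model of one IP address that serves many sites."""

    def __init__(self, certs_by_hostname: Dict[str, str],
                 default_hostname: Optional[str] = None):
        self.certs_by_hostname = certs_by_hostname
        self.default_hostname = default_hostname

    def handshake(self, sni: Optional[str] = None) -> Optional[str]:
        """Return the certificate presented for this connection, or None."""
        if sni is not None and sni in self.certs_by_hostname:
            return self.certs_by_hostname[sni]
        if self.default_hostname is not None:
            return self.certs_by_hostname[self.default_hostname]
        return None  # no default site configured: the scanner sees nothing


def scan_by_ip(servers: Iterable[VirtualHostServer]) -> Set[str]:
    """Connect to each server by IP only (no hostname) and collect certificates."""
    seen = set()
    for server in servers:
        cert = server.handshake(sni=None)
        if cert is not None:
            seen.add(cert)
    return seen


# One shared host with 1,000 sites and a default, one host with no default site.
shared_certs = {f"site{i}.example": f"cert-for-site{i}.example" for i in range(1000)}
shared_host = VirtualHostServer(shared_certs, default_hostname="site0.example")
no_default_host = VirtualHostServer({"a.example": "cert-for-a.example"})

seen = scan_by_ip([shared_host, no_default_host])
total = len(shared_certs) + 1
print(f"IP scan saw {len(seen)} of {total} certificates")
```

In this made-up setup the scan observes a single certificate out of 1,001, and the host without a default site contributes nothing at all.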

Though most WebPKI market share reports do not document their methodology, anecdotally it appears most rely on this crawler approach, and at least historically some have taken periodic drops from CAs to make their view “more complete”.

Today though the only way to measure CA market share that should be used is by relying on the pre-certificate counts in Certificate Transparency logs.
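As a rough illustration of what counting pre-certificates involves, here is a minimal sketch that assumes log entries in the RFC 6962 `get-entries` shape, where each entry carries a base64-encoded `leaf_input` (a `MerkleTreeLeaf`). It only tallies pre-certificate entries by the 32-byte issuer key hash; fetching entries, mapping a key hash back to a CA, and de-duplicating across logs are all left out. The function names and the synthetic-leaf helper are made up for this example.

```python
import base64
import struct
from collections import Counter
from typing import Iterable, Optional

X509_ENTRY = 0      # LogEntryType values from RFC 6962
PRECERT_ENTRY = 1
HEADER = ">BBQH"    # version, leaf_type, timestamp, entry_type (12 bytes, big-endian)


def precert_issuer_key_hash(leaf_input_b64: str) -> Optional[bytes]:
    """Return the issuer key hash if the leaf is a pre-certificate entry, else None."""
    leaf = base64.b64decode(leaf_input_b64)
    if len(leaf) < 12:
        raise ValueError("leaf too short")
    version, leaf_type, _timestamp, entry_type = struct.unpack(HEADER, leaf[:12])
    if version != 0 or leaf_type != 0:
        raise ValueError("unsupported MerkleTreeLeaf version or type")
    if entry_type != PRECERT_ENTRY:
        return None
    if len(leaf) < 44:
        raise ValueError("precert entry too short")
    return leaf[12:44]  # PreCert starts with opaque issuer_key_hash[32]


def count_precerts_by_issuer(leaf_inputs: Iterable[str]) -> Counter:
    """Tally pre-certificate entries per issuer key hash (hex-encoded)."""
    counts: Counter = Counter()
    for leaf_b64 in leaf_inputs:
        key_hash = precert_issuer_key_hash(leaf_b64)
        if key_hash is not None:
            counts[key_hash.hex()] += 1
    return counts


def make_synthetic_leaf(entry_type: int, issuer_key_hash: bytes = b"") -> str:
    """Build a fake leaf for testing only: empty certificate body, no extensions."""
    header = struct.pack(HEADER, 0, 0, 1_700_000_000_000, entry_type)
    body = issuer_key_hash + b"\x00\x00\x00" + b"\x00\x00"
    return base64.b64encode(header + body).decode("ascii")


ca_a = bytes([0xAA]) * 32   # stand-ins for two CAs' issuer key hashes
ca_b = bytes([0xBB]) * 32
leaves = [
    make_synthetic_leaf(PRECERT_ENTRY, ca_a),
    make_synthetic_leaf(PRECERT_ENTRY, ca_a),
    make_synthetic_leaf(PRECERT_ENTRY, ca_b),
    make_synthetic_leaf(X509_ENTRY),          # final certificate, not counted
]
counts = count_precerts_by_issuer(leaves)
print(counts.most_common())
```

Counting only pre-certificate entries avoids double counting a certificate that was logged both before and after issuance.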

What value can a third-party provide users when browsing the web?

While at the CA/Browser Forum I was asked by a friend: if we wanted to replace EV with a new class of certificate, what would that certificate look like?

My response was that I would frame the question differently. The “real” question is what problems does a typical user have that a third-party with the strengths of a CA could help with?

With this in mind, you need to first understand who this stereotypical user is. A software engineer may have different needs than a grocery store clerk. They may also have common needs; you won’t know that until you do research.

The only way to do reliable research on this topic is to actually work with those users to understand what their needs are. While this is much harder than it sounds due to biases introduced in such processes, a real needs analysis requires that you start here.

With that said, I suspect this exercise would show a broad swath of the target users is concerned with these questions:

  • Will I have a good experience working with the people behind the website?
  • Do the people behind this website have a good reputation?
  • Are the people behind this website experts in their craft?
  • How do I figure out how to reach a real human when and if I need to?

I would put those concerns into the context of the interaction they will have with the website (buying a product, downloading software, etc).

With that understanding I would then try to understand what the strengths of the CA are. Having been a CA for a long time, I would say:

CAs are good at verifying claims relating to the subject of a certificate.

I would then try to map the identified problems and strengths together to see what potential value the CA could provide that user.

Again, the right thing to do is to formally do the above explorations, but for the purpose of this post I suspect these exercises would find that:

  • When a user visits a website they may struggle to find out how to contact the sales/support for that business,
  • When a user visits a site for the first time it may be hard for them to determine what the company’s true line of business is,
  • After a user previously visited a website and completed a transaction with it they sometimes need to contact that business after the fact and could be assisted in finding the right contact information,
  • Before deciding to do a high-value transaction with a business, customers may want to find out the experience others have had with that business.

Now, just because a user may have these problems and a CA may be able to help solve them, it does not mean the SSL indicator is the right place to help answer these questions. It just means that there is a problem and skills intersection.

When, and how to solve this problem is another exercise altogether. Let’s explore EV for a second to give that some context.

Today if we assume the information in an EV certificate is correct (and not confusing see: this and this for context) we can say it provides the answer to “if I need to sue these people where do I tell my lawyer they are at?”.

The problem with that is that you may not have that information when you need it. I say this because you typically need to sue someone after you completed a transaction with them, not before. After the fact, you have no assurance that this information in the certificate will be available at the site you did the transaction with. The website may have gone away, they could have changed their certificate, or some other change may have taken place that makes that information not readily available to you when you need it.

In any event, the point of this post is to say CAs should not be asking what they can put into certificates but what problems users have that CAs are well suited to solve. Unless they start there they will not be solving a real problem; they will just be bolting more things onto a certificate and asking why browsers and users don’t see value in it.

Reality vs Fantasy – The DV vs EV argument

This morning I woke up to a blog post from Melih, the founder of Comodo titled “Problem vs Solution Value mapping”.

This is a follow-up to an ongoing discussion Melih and I have been having about the value of EV, and positive trust indicators. On my blog, the conversation started in July 2017 if you’re interested.

Melih focuses his most recent post on the assessment of “value”, correctly attempting to define it as the basis of the rest of the post. He chooses to define it as “the direct result of a resolution to a problem.” I think this definition is the first part of his argument I have an issue with. Namely, the Oxford Dictionary defines “value” as “the regard that something is held to deserve; the importance, worth, or usefulness of something.”

When considering “value” with this definition, I believe an analysis of “value” would start by building a case on what is “deserved”. To do that, we have to also define a context in which that value is assessed. I think this is probably the hardest part, and probably where most of the disagreement on “value” of EV stems from.

If we say the context of this assessment is “the security and privacy guarantees that can be provided to users by user agents”, EV’s value is no better than that of DV. It is not a hard case to make either.

The security model of the browser is based on the concept of “origin”, where that origin is essentially the “hostname” that the content was retrieved from. Any external website or resource embedded in the site (with rare exception) has the same permission as the original website as a result of this model. This is how web analytics, advertising, and many other products and services that make up the web work.
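For readers who have not looked at how an origin is derived, here is a small sketch. It follows the scheme/host/port tuple from RFC 6454 rather than any particular browser’s implementation, and the URLs are hypothetical examples. The thing to notice is what is absent: nothing in the tuple refers to a legal entity.

```python
from typing import Iterable, Optional, Set, Tuple
from urllib.parse import urlsplit

Origin = Tuple[str, Optional[str], Optional[int]]
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Origin:
    """Return the (scheme, host, port) tuple that identifies where content came from."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def origins_behind_page(page_url: str, resource_urls: Iterable[str]) -> Set[Origin]:
    """Collect every distinct origin that contributes content to a page."""
    return {origin(page_url)} | {origin(url) for url in resource_urls}


# Hypothetical shop page that embeds analytics and advertising scripts.
page = "https://shop.example.com/cart"
resources = [
    "https://shop.example.com/static/app.js",
    "https://analytics.example.net/tracker.js",
    "https://ads.example.org/banner.js",
]

print(origin(page))
print(len(origins_behind_page(page, resources)))
```

In this made-up page three different origins contribute content, and each could be operated by a different legal entity, while an EV badge would name only one of them.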

Until user agents require all of these entities that make up a given site to use EV and to have the legal entity in all of the associated certificates match, EV is a false flag. It says “you are talking to this legal entity” when in fact you’re talking to many legal entities and any one of them could equally harm you.

The reality is that if this change were to be made, you would almost never see EV badges. This is because virtually every site is made up of content and services from across the web and this condition would almost never be met. This is why we do not see CAs making the argument that this rule should be enforced by UAs.

If we say the context of this assessment is “the average user’s practical ability to protect themselves from phishing”, again EV does not fare well. There have been lots of user studies showing that users do not understand positive trust indicators and, in general, do not even notice them in most cases.

Furthermore, even if we disregard these well-run studies (and the associated common sense), as Ian Carroll showed with his Stripe, Inc business in Kentucky, the values displayed in these indicators can trivially be made, at a very low cost and with no traceability, to say whatever an attacker wants. This again frames EV as a false flag because it can so easily be used to lend credence to a phisher’s site by giving them the EV badge that says the same thing as their target site.

If this was not enough, again if we disregard these well-run studies and say that people need to take the responsibility for looking at the EV badge to get confidence they are dealing with a trustworthy entity, we need to look no further than the work James Burton did when he got a certificate for his business “Identity Verified”. In this case, if a user has been taught to look at the EV indicator for an abstract concept of “trustworthiness”, we are back to the user being misled.

All of this ignores another very real problem, that being most phishing sites are not bespoke sites; instead, they are sites that are hacked and re-purposed. A good example of this is this one from a few weeks ago. What we appear to have here is a company called Northern Computer Services, LLC hosting a website for a business with the domain name “stampsbyjudith.com”, which in turn is hosting a Bank of America phishing site.

Now EV proponents surely see this as an example of EV working, but if you look at it critically you will see it is exactly the opposite. First, could a customer believe that this “Northern Computer Services” is somehow a service provider to Bank of America? It seems reasonable to assume that the average user does not know anything about the way Bank of America operates its services. In fact, even if you do have some level of understanding, it’s incredibly common for banks to use service providers for different capabilities; maybe this Northern Computer Services hosts the BofA website or provides bill pay or mortgage services. How is the average user to know?

But what about the URL? There is no plausible way Bank of America is hosting their site on the domain stampsbyjudith.com! You’re absolutely right! It’s a fair expectation that if a user happens to look at the address bar they should be able to figure that out. This is of course something you get when you use DV though, no EV necessary. Then there is the issue that studies also show that users do not look at the address bar either.

This is why Microsoft has created SmartScreen and Google has created Safe Browsing. These solutions utilize the massive scale and technology depth of these organizations along with machine learning and other advanced techniques to find phishing sites. As a result, when a user navigates to a site similar to this one they get an interstitial warning them about proceeding.

In summary, in this context, I would argue that as EV exists today it actually makes things harder on the user and easier on the attacker.

With that context in mind let’s explore each of the arguments that Melih makes.

Users want protection from Transit Providers. Sure they do, but I would say that if a user framed the topic this way it would demonstrate how little they actually understand of the problem in question. It is not just “transit providers” they need protection from, it is every entity other than those that are necessary to serve the application hosted at a domain.

Networking is so complex it is not possible to expect even some of the most technical users to understand all of the nuances involved here.

I would like to point out that Melih again attempts to redefine terms, this time in a disingenuous way. Specifically, in this part of his post he suggests there is some common understanding that there is a difference between “encipherment” and “encryption”.

Let’s again take a look at what the Oxford Dictionary says:

Encryption – The process of converting information or data into a code, especially to prevent unauthorized access.

Encipherment – Convert (a message or piece of text) into a coded form.

As you can see, these words mean the same thing. The only difference is the example use case in one of the definitions. But maybe this inconsistency in use is because the Oxford Dictionary does not address a cryptographer’s view on these words? Unfortunately, that is not the case either; if you were to look at books like Serious Cryptography, Cryptography and Network Security, or even the very dated Applied Cryptography, you will find no usage of these terms in this way.

What Melih has suggested in the past, and continues to do so in this section is that somehow if you authenticate only the domain and use that authentication as the basis for the session protection that this is not “encryption”.

He goes so far as to suggest that it is only encryption if you authenticate the legal entity. This is frankly ludicrous and I cannot even respond to this more than I just have here.

I can say that redefining a term, especially in such a specious way, devalues any other valid points he may have.

But what about the users! The users want to know who they are dealing with! I actually agree with this but I also think it is far more complicated than users actually understand. So much so I would argue it is not possible to do in most cases. As a father when I run into situations where my kids want things that are not possible I sometimes joke with them and say “Well I want a pony!”.

It feels to me this is probably a case where that response is appropriate. The reality is there is no such thing as a globally unique business name, and the same is true of logos. Probably the best mainstream examples of this are the fake Starbucks stores and the notorious “Apple Stores” of Asia.

Fake Apple Store Highlights Counterfeit China


This is the nature of brand names, in-fact there is an entire discipline of law (Trademark Law) dedicated to this topic and multilateral international agreements on how such disputes are to be handled.

So in the context of the URL, does EV as it stands today add or remove value? From my perspective, at a minimum it provides no value in this context, but I could also make a reasonable argument that it makes things worse here as well due to the introduction of more surface area for confusion.

Users want to know if it’s “safe” to interact with the website! Again I can agree with this; the problem is names do not harm, and we even teach our kids rhymes to remind them of this fact:

Sticks and stones may break my bones, but names can never hurt me.

To keep users safe we have to look at far more than the name a website is hosted under; there are literally thousands of features that a solution intending to protect users’ safety needs to consider, and I would not be surprised to find out that the name is one of the least important.

This is, again, why we have solutions like SmartScreen and Safe Browsing; these solutions are constantly watching feeds of data to determine if a website is safe or not. It is not possible to solve the “safety” problem in any meaningful way without similar techniques.

But users want to be able to trust the content they see! Again, I also think this is something that users want, I just don’t think they can have everything they want.

But before I talk about this I want to talk about how Melih is redefining a term again. He suggests that “trust” means “having the ability to validate VISA, Paypal logo etc”. The Oxford Dictionary defines trust as “Firm belief in the reliability, truth, or ability of someone or something.”

With that, I would think that it would be more correct to say that they want to believe what they see. This is of course a very natural thing, something scammers have taken advantage of since the dawn of time.

When considering this desire I think we have to ask ourselves what the best way is to service it. We also have to acknowledge that malicious content is everywhere in the world (don’t forget our fake Starbucks and Apple Stores from above) and that the best we can do is provide a speedbump.

This is, again, why we have solutions like SmartScreen and Safe Browsing as they were designed, engineered and continually evolve to address these risks.

In closing, I believe EV as it stands today is a round peg in a square hole. This does not mean there is no value in knowing the legal identity of the organization that operates a website, nor does it mean these third parties can’t do more to help users manage the risks they are exposed to.

It is because EV is being sold as something it is not: an anti-phishing tool. Simply put, it is not well suited to help with that problem, and I would go so far as to say that when we teach users to see it as such it even helps phishers.

Positive Trust Indicators and SSL

[Full disclosure I work at Google, I do not speak for the Chrome team, and more generically am not speaking for my employer in this or any of my posts here]

Recently Melih did a blog post on the topic of browser trust indicators. In this post he makes the argument that DV certificates should not receive any positive indicator in the browser user interface.

I agree with him, just not for the same reasons. Positive trust indicators largely do not work and usability studies prove that is true. Browsers introduced the “lock” user interface indicator as part of a set of incentives and initiatives intending to encourage site operators to adopt SSL.

What is important is that these efforts to encrypt the web are actually working: over half the web is now encrypted and, more importantly, the adoption rate is demonstrating hockey stick growth.

As a result, in 2014 Chrome started down the path of deprecating positive trust indicators altogether. In fact, today Chrome already marks HTTP pages as “Not secure” if they have password or credit card fields.

The eventual goal is to be able to mark all HTTP pages as insecure, but for this to happen SSL adoption needs to be much higher. I suspect browsers will want to see adoption in the 90% to 95% range before they are willing to make this change.

This is relevant in this case because if all pages are encrypted what value does a positive trust indicator have? None. This means that when all HTTP pages get marked “Not secure” we will probably see the lock icon disappear.

So, as I said, I agree with Melih: the “Secure” indicator should go away, but so should the lock. The question is not if, but when.

But what does that mean for EV trust indicators? I am a member of a small group, a group that thinks that EV certificates can provide value. With that said, today EV certificates have some major shortcomings which significantly limit their value, some of which include:

  • It is not possible to get an EV wildcard certificate,
  • CAs, largely, have ignored automation for EV certificates,
  • Due to the lack of automation EV certificates are long lived and their keys more susceptible to theft as a result,
  • The vetting processes used in the issuance of EV certificates are largely manual making them expensive and impractical to use in many cases,
  • CAs market them as an antiphishing tool when there are no credible studies that support that,
  • The business name in the certificate is based on the legal name of the entity, not the name they do business under (DBA),
  • The business address details in the certificate are based on where business is registered (e.g. Delaware),
  • There is no contact information in the certificate, short of the taxation address, that a user can use to reach someone in case of an issue.

Efforts to address these issues have either been actively resisted by CAs (for example, DigiCert has tried a number of times to get EV wildcard certificates to be a thing in the CA/Browser Forum, but CAs have voted against it every time) or simply ignored.

There are some people who are working towards addressing these gaps, for example, the folks over at CertSimple, but without CAs taking a leading role in redefining the EV certificate the whole body of issues cannot be resolved. Importantly, until that happens you won’t see browsers even considering updates to the EV UI.

Given this reality, browsers have slowly been minimizing the details shown in EV certificates since they can give users the wrong impression and have limited value given the contents of the certificate.

It is my belief that unless the CAs work together to address the above systemic issues in EV certificates, that minimization will continue, and when the web is “encrypted” it won’t just be DV that loses its positive trust indicator; it will be EV also.

 

My thoughts on Let’s Encrypt

Today about 80% of all SSL certificates in use on the Internet are what are commonly referred to as Domain Validated (DV) certificates. The name is a bit of a misnomer in that not all DV certificates authenticate control of a domain; in fact, most actually authenticate the control of a specific server in the domain.

The large majority of these certificates can be issued with little to no human interaction. In a typical manual enrollment, a server administrator generates and submits a certificate request and in return is provided a random value that they are instructed to place into an HTML meta-tag in /index.html. The CA then checks for it periodically to see if the administrator was able to place it there, the idea being that modifying the meta-tag is sufficient to prove control over the website. Once the CA notices the administrator was able to complete this task, it performs a handful of other checks and the certificate is issued.
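To make the meta-tag check concrete, here is a minimal Python sketch of both halves: generating the random value and checking a fetched page for it. The meta-tag name "ca-domain-validation" and the function names are made up for illustration; each CA uses its own tag name and token format, and a real CA would fetch /index.html over the network and run its other checks as well.

```python
import secrets
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects name -> content for every <meta> tag in a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            fields = dict(attrs)
            if fields.get("name") and fields.get("content") is not None:
                self.meta[fields["name"]] = fields["content"]


def new_challenge():
    """Random value the CA hands to the administrator (32 hex characters)."""
    return secrets.token_hex(16)


def page_proves_control(html, challenge, meta_name="ca-domain-validation"):
    """True if the page carries the expected value in the (hypothetical) meta-tag."""
    collector = MetaCollector()
    collector.feed(html)
    return collector.meta.get(meta_name) == challenge
```

The CA would call page_proves_control on each periodic fetch and move on to its remaining checks once it returns True.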

Most certificates used for SSL end up coming from hosting providers, service providers and certificate resellers that sell these certificates for as little as a few dollars and in many cases they simply give them away for free.

These folks will also commonly automate the issuance, installation and maintenance of these certificates. Hosting providers typically do this using a plugin that comes from the issuing CA that hooks into their management console (WHM, etc) and the larger more advanced ones write their own based on the web services exposed by the certificate authorities.

So today, contrary to common perception, certificates are in fact cheap to free and in many cases fully automated. With that said, there are a number of pretty important cases where that automation is missing, such as cloud service providers (AWS, Azure, Google Cloud, Rackspace, etc), corporate servers and Internet connected devices.

At some point all of the cloud service providers will provide SSL for free. After all, Mozilla has recently stated that they are working to deprecate HTTP altogether, and I am sure all other browsers will follow them when there is sufficient SSL ubiquity.

The Let’s Encrypt project aims to make this transition happen faster by being yet another place to get free certificates and making the acquisition of these certificates even easier by closely integrating the certificate lifecycle management into the most commonly used servers.

It is this last part that I think is the most important contribution that Let’s Encrypt will make to the Internet. There are a few reasons for this. For various reasons I could go on about for hours, each of the Certificate Authorities has gone and created its own protocol for certificate enrollment instead of working together to define a common one. These protocols (like their cousins from device and operating system vendors) are designed around their specific back-ends and are not generic enough to be used when they are not the entity behind them.

To address this, the Let’s Encrypt people have proposed a new modern REST based protocol that does not have this baggage. In fairness it also doesn’t solve all of the CAs’ needs either, but I can easily envision how one would extend it to do so (in fact it looks a lot like a protocol I designed for GlobalSign’s use).

The other problem not many actually understand is how many issues exist inside the various SSL implementations that prevent a third party from properly automating the lifecycle of a certificate without downtime. The simplest example is that for an external program to change certificates on a running web server, it often has to rely on HUPing the server to force it to pick up the new certificate.
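For those who have not had to do this, the “swap the file, then HUP the server” dance looks roughly like the Python sketch below. It assumes a POSIX system and a server that writes its PID to a file and re-reads its certificate on SIGHUP; the paths and function names are placeholders, not any particular server’s layout.

```python
import os
import signal
from pathlib import Path


def install_certificate(new_pem, cert_path):
    """Atomically replace the certificate file the server reads."""
    target = Path(cert_path)
    staged = target.with_name(target.name + ".new")
    staged.write_bytes(new_pem)
    # Rename is atomic on the same filesystem, so the server never
    # sees a half-written certificate file.
    os.replace(staged, target)


def hup_server(pid_file):
    """Send SIGHUP to the PID recorded in pid_file and return that PID."""
    pid = int(Path(pid_file).read_text().strip())
    os.kill(pid, signal.SIGHUP)
    return pid
```

Even in this best case the external program has no way of knowing whether the server actually picked up the new certificate, which is part of why doing this seamlessly really needs help from the server itself.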

Unfortunately Certificate Authorities are not exactly the most loved people on the Internet, and I know from my experience trying to get the maintainers of web servers and SSL stacks to support things like OCSP Stapling that the scale of changes necessary to make automated certificate lifecycle management totally seamless (and low risk) for everyone was unlikely to happen when driven by CAs.

NOTE: In my opinion a big reason for the resistance is that CAs have basically treated these projects as core infrastructure without supporting them financially or by hiring developers to contribute to them. That said this has been slowly changing and despite that the “love” still continues.

The Let’s Encrypt project is a project for developers by developers with the skill, credibility and motivation to fix these issues.

When they are successful (and I am confident they will be), those solutions that use the clients based on their code and protocol will rarely if ever experience an outage due to an expired certificate. Notice I didn’t say the clients that use Let’s Encrypt? That’s because what they are doing is solving the plumbing problem that CAs have failed to solve, and the CAs will be able to benefit from this work also.

It will also enable a class of products and services that would otherwise not have the technical experience, financial means or motivation to integrate SSL into their product.

Imagine your next refrigerator having a web portal, protected with SSL, that you could log into at https://myhome.refrigerators.com to check if you needed to bring home milk. These and other projects are unlikely to happen without something like Let’s Encrypt.

So when people tell me “Certificates are already practically free why do we need Let’s Encrypt?” I tell them they need to look at the long game.

What are some upsides of Google’s SHA1 deprecation plan?

NOTE: Google has since adopted a more gradual plan for migration which addresses the potential false sense of urgency the prior plan represented. Personally I think the new plan is a good one. The upsides in this post are still accurate and it is my hope people switch to SHA256 based certificates as quickly as possible.

The Internet is about to embark on another Heartbleed-esque certificate migration. This time there is no immediate danger (which was certainly not the case with Heartbleed) and there is a proposed twelve weeks to plan and respond.

During this time (unless that plan changes) a large majority of the SSL secured Internet will need to swap out their SSL certificates or the users of these sites will see a little scarier user experience. To be fair some of these certificates will be expiring regardless and need to be replaced anyway but this still represents a large number of additional sites that will need to replace certificates sooner than they had planned.

That said, there are upsides. For example, given how many of the top sites now use SSL, the users of these sites will need to move to modern browsers not dependent on platform crypto or update to a newer version of Windows, in the process gaining access to modern web technologies and security fixes.

Another benefit is that CAs that are not active participants in the CABFORUM and who do not follow the root program requirements closely will be sure to stop their use of SHA1 based signatures as soon as they see the user experience impacted.

The same thing will be true of device companies and enterprises who do not as of today have the option to participate in the CABFORUM and even if they did are frankly unlikely to. That is when they see their support calls go up they will change their products and/or processes so that such certificates are not used.

The net of which is by the end of 2017 we will most likely see the complete end-of-life of SHA1 as part of signature suites and we may see an above average increase in modern browser adoption.

Ryan

 

 

Why might you have a certificate with a SHA1 based signature in its chain that is valid beyond 2016/1/1?

NOTE: Google has updated the plan they will be using to deprecate SHA1 based certificates. The content in this post is still mostly accurate but for dates please see the thread. Personally I think the new plan is a good one. The upsides in this post are still accurate and it is my hope people switch to SHA256 based certificates as quickly as possible.

So there is a plan under discussion to “degrade” the user experience for SSL sessions protected with certificates (or chains) that contain a SHA1 based signature that are valid beyond 2016/1/1.

This 2016/1/1 date was apparently discussed at a CAB Forum meeting six months ago; prior to that the “sunset date” for SHA1 was considered to be 2017/1/1.

Given Chrome represents such a large percentage of the browser ecosystem and they appear to be unwaveringly marching towards this new date, I think it’s fair to refer to this date as the “new sunset date”.

There have been lots of conversations about this topic from the perspective of a CA and that of a browser but not so much from a perspective of a certificate holder.

There are a few reasons why you might have such a certificate:

  1. Your certificate was issued before the new sunset date was specified.
  2. When the new sunset date was specified, your certificate authority did not update their systems to ensure that certificates using that algorithm expire by the new date.
  3. Your certificate authority gave you the option of choosing which signature suite (and hash algorithm) and expiration dates to use and you chose SHA1.

Some might ask why CAs did not simply stop issuing certificates that utilize SHA1 based signatures altogether when Microsoft issued their goal to deprecate by 2017. The answer to this is simple: there is a large number of XP machines out there (15% of the Internet and over 35% of browsers in China) and it’s unclear how many of them have Service Pack 3, which is necessary to support certificates with SHA2. There are also concerns about the number of mobile and embedded devices that also do not support SHA2.

So how big of a risk is the interoperability impact? It’s hard to say. Some numbers I have seen suggest it is less than 1% of traffic, but honestly it doesn’t appear possible to measure the number of XP machines without SP3, and even if it were, it still wouldn’t take into consideration the devices that do not support SHA2, and we know such devices were shipping as recently as two years ago.

So that takes me to the main reason for this post: it’s my guess that the primary reason you have a certificate that will be affected by this change is that the CAs honestly did not realize Google was moving the sunset date forward and were adopting migration plans that they felt balanced interoperability, usability and security.

With that said, I believe Google sincerely feels this change is in the best interest of the Internet and that the user interface changes they are proposing are subtle enough that they won’t be noticed by most (see: A Large-Scale Field Study of Browser Security Warning Effectiveness [pdf]).

Unfortunately this leaves you, the server administrator, stuck somewhat in the middle. You will have to choose between giving up views and revenue from the clients that do not support SHA2 and having all of your users who use Chrome see a degraded user experience.

What will Chrome’s SHA1 early warning look like?

NOTE: Google has since revised its plan to enable a more gradual migration to SHA256, this post is no longer accurate.

For the last few weeks there has been an ongoing discussion on the Chromium security-dev mailing list on how Google intends to implement a user interface change to warn users that a SHA1 certificate is in use.

I won’t speak to the reasoning behind this change or to the current and future security properties of SHA1 in this post, but I thought some folks might be interested in what this might ultimately look like. I say might because right now there is only a mail thread and who knows how things will evolve and what the copy would be in such user interfaces.

With that said, the thread does describe what affordances they intend to use: when a site has a certificate that expires after 2016/1/1 and it, or the corresponding certificate chain (excluding the root), has a SHA1 based signature in it, the user interface may be “degraded” for these sessions.

At this time it seems the “red x” that is used for mixed content will be used; if so this will look something like this:

[Figure 1: the “red x” indicator]

For SHA1 certificates that expire after 2017/1/1, if the page contains active content such as JavaScript and CSS that is served over an SSL session with such a certificate, that content will not be loaded unless the user explicitly chooses to approve its execution. This would look something like this:

[Figure 2: prompt to approve blocked active content]

Again, for SHA1 certificates that expire after 2017/1/1, if the page contains passive content (such as images) that is served over an SSL session with such a certificate, it will not be loaded unless the user chooses to do so, and the lock will get a yellow arrow, which will look something like this:

[Figure 3: lock with a yellow arrow and blocked passive content]

Which combinations of these things one will see would be dependent on the specific combination of conditions but this will give you some idea on what these changes may look like.
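For those who prefer code to screenshots, the conditions described above can be sketched roughly as follows in Python. This is only my reading of the mail thread as summarized in this post, not Chrome’s actual logic: the Cert type, the classify function and the three result labels are made up for illustration, and I am assuming the expiry date that matters is the one on the site’s own (leaf) certificate.

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class Cert:
    subject: str
    signature_hash: str  # e.g. "sha1" or "sha256"
    not_after: date
    is_root: bool = False


DEGRADE_AFTER = date(2016, 1, 1)  # "red x" treatment
BLOCK_AFTER = date(2017, 1, 1)    # active/passive content held back


def classify(chain: Sequence[Cert]) -> str:
    """chain[0] is the site's certificate; the root's signature is ignored."""
    leaf = chain[0]
    uses_sha1 = any(
        c.signature_hash.lower() == "sha1" for c in chain if not c.is_root
    )
    if not uses_sha1:
        return "ok"
    if leaf.not_after > BLOCK_AFTER:
        return "content-blocked"
    if leaf.not_after > DEGRADE_AFTER:
        return "degraded"
    return "ok"
```

Note that a SHA256 site certificate issued under a SHA1 intermediate still gets caught here, since the chain (excluding the root) is what is being evaluated.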

Ryan

What’s in a certificate chain and why?

Have you ever wondered why your web server certificate has a “chain” of other certificates associated with it?

The main reason is so that browsers can tell if your certificate was issued by a CA that has been verified to meet the security, policy and operational practices that all CAs are mandated to meet. That certificate at the top of the chain is commonly called the “root”. Its signature on a certificate below it indicates that the CA operating the root believes that practices of the CA below it meets that same high bar.

But why not issue directly off of the “root” certificates? There are a few reasons; the main one is to prevent key compromise. To get a better understanding, it’s useful to know that the private keys associated with the “root” are kept in an offline cryptographic appliance located in a safe, which is located in a vault in a physically secured facility.
These keys are only periodically brought out to ensure the associated cryptographic appliance is still functioning, to issue any associated operational certificates (for example an OCSP responder certificate) that may be needed, and to sign fresh Certificate Revocation Lists (CRLs). This means that for an attacker to gain access to these keys, they would need to gain physical access to this cryptographic appliance as well as the cryptographic tokens and corresponding secrets that are used to authenticate the device.

CAs do this because keeping keys offline is a great way to reduce the risk of a compromised key, but it’s a poor way to offer a highly available and performant service, so the concept of an Issuing CA (ICA) was introduced. This concept also enabled the “root” to respond to CA key compromise events by revoking a CA certificate that should no longer be trusted. This also enables delegation of control, limiting those who can influence a given ICA to sign something.

Another way CAs solve the “online CA” problem is to use what is commonly referred to as a Policy Certificate Authority (PCA). This model allows a CA to segment operational practices more granularly. For example, maybe the CA is audited to be in compliance with a specific set of government standards so the ICAs associated with those practices would be signed by the corresponding PCA. This not only allows segmentation of policy and procedures, but it also enables separation of usage scenarios. For example, one PCA may only allow issuance of certificates for secure mail while the other PCA may allow issuance of SSL certificates. These PCAs are also very commonly operated as offline entities and have ICAs right underneath them.
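To picture how a client gets from your server certificate back up to the “root” through these layers, here is a simplified Python sketch. It only follows issuer names; a real client also verifies each signature, validity period, revocation status and constraints. The Cert type, build_chain and all of the names used are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def build_chain(
    leaf: Cert,
    intermediates: Dict[str, Cert],
    trusted_roots: Set[str],
    max_depth: int = 8,
) -> List[str]:
    """Return subject names from the leaf up to a trusted root."""
    chain = [leaf.subject]
    current = leaf
    for _ in range(max_depth):
        if current.issuer in trusted_roots:
            chain.append(current.issuer)
            return chain
        parent = intermediates.get(current.issuer)
        if parent is None:
            raise ValueError(f"no issuer certificate for {current.subject!r}")
        chain.append(parent.subject)
        current = parent
    raise ValueError("chain is longer than max_depth")
```

With a leaf issued by “Example ICA”, which is issued by “Example PCA”, which is issued by the trusted “Example Root”, this returns the four names in order from the server certificate up to the root.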

While the above two models represent the most common ways a PKI might be segmented, they are not the only two. For example, the operational practices required to be a publicly trusted CA are far stricter than what a typical data center might employ. For this reason, it’s very common for CAs to manage PKIs for other organizations within their facilities.

CAs may also “roll” ICAs as a means to manage CRL size. For example, if a given CA has had to revoke many certificates during its lifespan, it may decide to create a new ICA and take the previous one out of service so that future CRLs can still be downloaded quickly by clients. When this happens both CA certificates may be valid for an overlapping time, but only the more recent one is actively in use.

Long story short, some counts on the number of Certificate Authorities that exist on the internet can be deceiving. One of the easiest ways to see this is to look at a CA called DFN-Verein. They are an educational PKI that manages all of the CAs in their PKI in the same facilities, using the same practices, but for security reasons they create separate ICAs for each organization in their network.

Simply put, the count of CAs in a PKI is not a good way to assess the number of entities issuing certificates in the PKI ecosystem. What you really want to count is how many facilities manage publicly trusted certificates. The problem is that it is too difficult to count – what you can do, however, is count the number of organizations associated with ownership of each “root”. Thankfully Microsoft makes this fairly easy. In March, I did a post on my blog showing a breakdown of the ownership. Unfortunately, this approach does not give you a count of operational facilities that are used for the subordinate CAs, but it’s quite likely that given the operational requirements and costs associated with maintaining them that these two numbers are relatively close.

So what would I like for you to take away from this post? I suppose there are two key points:

  • A public CA using several Certificate Authorities under their direct control is actually a good thing as it indicates they are managing the risk of operating their services and planning for migrations to new algorithms and keys as appropriate.
  • Counting the number of “roots” and “subordinate CAs” found by crawling the web does not actually represent the number of organizations that can act as publicly trusted certificate authorities.
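The difference between the two counts is easy to see with a toy example. The Python sketch below groups CA certificates by the organization that operates them; the CA and operator names are entirely made up, and in practice the operator for each CA would have to come from something like root program ownership data rather than from the crawl itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_operator(ca_certs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (ca_name, operator) pairs into operator -> list of CA names."""
    by_operator = defaultdict(list)
    for ca_name, operator in ca_certs:
        by_operator[operator].append(ca_name)
    return dict(by_operator)


# Hypothetical crawl results: four CA certificates, two operators.
observed = [
    ("Example University A CA", "Example Research Network"),
    ("Example University B CA", "Example Research Network"),
    ("Example Institute C CA", "Example Research Network"),
    ("Example Commercial SSL CA", "Example Commercial CA Inc."),
]

grouped = group_by_operator(observed)
print(len(observed), "CA certificates,", len(grouped), "operators")
```

A crawl would report four CAs here, while only two organizations are actually in a position to issue.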

That is not to say the efforts to crawl the web to understand how PKI is deployed and used are not valuable, they are – quite valuable. These projects are an important way to keep an eye on the practices that are actually used in the management of Public PKI.

Additionally, efforts to support Least Privilege designs in PKI and adopt means to actively monitor certificate issuance, such as Certificate Transparency, all represent positive moves to help us better understand what is actually out there.