Monthly Archives: August 2022

2 Replies

The first thing you will need to do is to find a Windows machine, that is because it is only possible to enroll for an EV Code Signing certificate on Windows. The only browser that provides a way to enroll for a certificate with a smart card is Internet Explorer.

Internet explorer was deprecated on June 15, 2022, but the reality is that its market share had dropped to almost nothing long before that. Today .28% of web browser traffic is from IE.

Edge does have an “IE Mode” that allows you to reload a page with this older version of the browser which still supports this smart card enrollment capability. The thing is that usually IE is only needed for one small part of a flow, for example when ordering an EV certificate you may complete the entire flow in your primary browser and have to start over again in this IE mode to do the actual enrollment.

While IE mode will continue to be supported through 2029. Edge only has about 4.11% of the browser market share today which means at a minimum most users must change browsers before they can get to this deprecated functionality. This pain is all rooted in an older technology known as COM. IE used to let you access COM components on the web, this was done via something known as ActiveX.

ActiveX was a bad idea that was poorly executed but it did enable ways to break out of the browser sandbox to do interesting things. Unfortunately, those interesting things were also often interesting to attackers. This is why it was deprecated by IE long before IE itself was deprecated.

A limited set of ActiveX components got a stay of execution when that happened, one such component is CertEnroll. This control was allowed listed and has been used by CAs to facilitate certificate lifecycle management of user certificates on the web. Despite the proliferation of individual user/organization certificates (they are used more now than ever) there was no replacement made.

The thing is, even when it worked, it didn’t work well. For example, the CertEnroll component will commonly becomes non-responsive — as a result, the web browser ends up freezing. Recently I did an enrollment and IE hung for 4 minutes. I had two options, kill the browser with task manager or wait and pray it would return.

In summary, 96% of people on the web would need to change their browser and jump through a user experience that was designed in the early 2000s, an experience that was never invested in sufficiently, to get an EV code signing certificate today.

Another problem that exists here is that this is all smoke and mirrors from a security perspective. These EV certificates are required to have their keys stored on a smart card, this is effectively accomplished with some client-side javascript filtering of what middleware is used for the enrollment. This can be trivially bypassed with a few lines of Javascript in the browser console window. Doing this gives you an EV code signing certificate that has keys stored in the software. I have, for example, encountered CA websites that use third-party analytics javascript in their enrollment flows, any of these third parties could exfiltrate the code signing keys and the user wouldn’t notice. And if they did it would be because they would have had a better experience because of the issues I call out above.

YubiKey is the only smart card I know that supports an attestation for X.509 certificate keys that tells the CA that it is managing the key. This would mitigate this issue but there is no way to access this attestation information from the browser. If it were me I would mandate that only the YubiKey token be used for EV code signing certificates and either use a third-party solution like Fortify to interact with these tokens or require the YubiKey console tools be used for the enrollment. This would allow CAs to verify one of these tokens was actually being used to protect these keys.

Code signing is an important tool in protecting the software supply chain. It helps us understand an artifact’s origin, its integrity, and in the best case what quality bars the code meets. It would be fantastic if the scaffolding that enabled it was made so it was both easy and secure to do the right thing.

How to keep bad actors out in open ecosystems

2 Replies

There is a class of problems in information security that are intractable. This is often because they have conflicting non-negotiable requirements.

A good example of this is what I like to call the “re-entry” problem. For instance, let’s say you operated a source control service, let’s call it “SourceHub”. To increase adoption you need “SourceHub” to be quick and easy for anyone to join and get up and running.

But some users are malicious and will do nefarious things with “SourceHub” which means that you may need to kick them off and keep them off of your service in a durable way.

These two requirements are in direct conflict, the first essentially requires anonymous self-service registration and the second requires strong, unique identification.

The requirement of strong unique identification might seem straightforward on the surface but that is far from the case. For example, in my small social circle, there are four “Natallia’s” who are often are at the same gatherings, everyone must qualify which one we are talking to at these events. I also used to own ryanhurst.com but gave it up because of the volume of spam I would get from fans of Ryan Hurst the actor because spam filters would fail to categorize this unique mail as spam due to the nature of its origin.

Some might say this problem gets easier when dealing with businesses but unfortunately, that is not the case. Take for example the concept of legal identity in HTTPS certificates — we know that business names are not globally unique, they are not even unique within a country which often makes the use of these business names as an identifier useless.

We also know that the financial burden to establish a “legal business” is very low. For example in Kentucky, it costs $40 to open a business. The other argument I often hear is that despite the low cost of establishing the business the time involved is just too much for an attacker to consider. The problem with this argument is the registration form takes minutes to fill out and if you toss in an extra $40 the turnaround time goes from under 3 weeks to under 2 days — not exactly a big delay when looking at financially incentified attackers.

To put this barrier in the context of a real-world problem let’s look at Authenticode signing certificates. A basic organizationally validated Authenticode code signing certificate costs around $59, With this certificate and that business registration, you can get whatever business name and application name you want to show in the install prompt in Windows.

This sets the bar to re-enter this ecosystem once evicted to around $140 dollars and a few days of waiting.

But what about Authenticode EV code signing? By using an EV code signing certificate you get to start with some Microsoft Smart Screen reputation from the get-go – this certainly helps grease the skids for getting your application installed so users don’t need to see a warning.

But does this reduce the re-entry problem further? Well, the cost for an EV code signing certificate is around $219 which does take the financial burden for the attacker to about $300. That is true at least for the first time – about $50 of that first tome price goes to a smart card like the SafeNet 5110cc or Yubikey 5. Since the same smart card can be used for multiple certificates that cost goes down to $250 per re-entry. It is fair to say the complication of using a smart card for key management also slows the time it takes to get the first certificate, this gets the timeline from incorporation to having an EV code signing certificate in hand to about 1.5 weeks.

These things do represent a re-entry hurdle, but when you consider that effective Zero Day vulnerabilities can net millions of dollars I would argue not a meaningful one. Especially when you consider the attacker is not going to use their own money anyway.

You can also argue that it offers some rate-limiting value to the acquisition pipeline but since there are so many CAs capable of issuing these certificates you could register many companies, somewhat like Special Purpose Acquisition Companies (SPAC) in the stock market, so that when the right opportunity exists to use these certificates it’s ready and waiting.

This hurdle also comes at the expense of adoption of code signing. This of course begs the question of was all that hassle was worth it?

Usually the argument made here is that since the company registration took place we can at least find the attacker at a later date right? Actually no, very few (if any) of these company registrations verify the address of the applicant.

As I stated, in the beginning, this problem isn’t specific to code signing but in systems like code signing where the use of the credential has been separated from the associated usage of the credential, it becomes much harder to manage this risk.

To keep this code signing example going let’s look at the Apple Store. They do code-signing and use certificates that are quite similar to EV code-signing certificates. What is different is that Apple handles the entire flow of entity verification, binary analysis, manual review of the submission, key management, and signing. They do all of these things while taking into consideration the relationship of each entity and by considering the entire history available to them. This approach gives them a lot more information than you would have if you did each of these things in isolation from each other.

While there are no silver bullets when it comes to problems like this getting the abstractions at the right level does give you a lot more to work with when trying to defend from these attacks.

What would it look like to go back to first principles when it comes to root store management in 2022?

1 Reply

In the early 2000s, Microsoft mandated that all CAs in its root program would need to be audited for conformity to WebTrust For CAs (WebTrust), It was the first root program to do so and I was the root program manager that rolled out that change.

For those who are interested Web Trust for CAs is an audit that looks back at a period of past behavior to assess conformity to a set of criteria specific to certificate authorities. Only approved auditors are allowed to officially perform these assessments.

The decision to require the WebTrust audit was pretty simple, there was literally no other independent standard that attempted to assess the operational readiness and security of certificate authorities, and as far as such standards went it was more good than it was bad.

The other root programs didn’t have objective criteria, some operated on a pay-to-play model with virtually no requirements, while others simply wanted a management assertion of conformity to a small list of requirements and a contract signed.

The idea of working with an independent group that had a vested interest in creating a baseline standard for the operation of CAs, that would be able to evolve over time, seemed like a no-brainer given how poor and inconsistent CA practices were at the time.

Nearly two decades later the CA/Browser Forum is where the AICPA and others go to discuss how the audit criteria should evolve over time. It’s objectively a success in that today there is quite a bit of uniformity in the operational practices that CAs have and its existence has led to many improvements to the WebPKI.

With that said, if we go back to first principles today I think I would take a different approach. I would design a root program that was heavily based on technical security controls, and behaviors that can be verified externally continuously.

If we’re going first principles then let’s take a step back at what responsibility a CA has on the internet. A TLS certificate does one thing fundamentally, it binds a domain name to a public key. It does this so that user agents (browsers) can verify that the server on the other end can verify that they are talking to the right server.

This user agent believes this certificate is telling the truth because the browser operators will remove CAs that demonstrate they are not trustworthy.

Today I would guess 60-70% of the certificates on the web are issued via the ACME (RFC 8555) protocol. this protocol has a handful of methods it allows a requestor to use to prove they control the domain name (for example DNS-01, HTTP-01, and TLS-ALPN-02).

What this means is if ACME is the only way you can get a certificate from a CA, it also means that it is possible to externally test how the CA performs domain control verification.

There has been another innovation in this space, we now have Certificate Transparency, this is a community of append-only ledgers that in union contain all of the WebPKI certificates. If a certificate does not get logged to the CT ledgers then users with major browsers (Safari, Chrome, Edge, etc) will see an error when the user visits the associated site.

This now gets us a list of all the things, with this list we can programmatically review 100% of the certificate issuance to ensure that they meet the externally visible technical requirements. This already happens today in the WebPKI thanks to the excellent Zlint.

What about how the CAs key material is protected? I love a good HSM too! Well, some HSMs already support attestations about the keys they are protecting, for example, Google Cloud, and Amazon Web Services use Marvel HSMs in their cloud offering and those devices offer hardware attestations about how a given key is protected.

What about the software that makes up the service? Well, some languages make naturally reproducible builds, for example, Go. This means that you could conceptually run CA software on hardware capable of making attestations about what it is running and from that attestation and if that software was open source you could audit the behavior of that software for security and conformity criteria.

You could even go so far as incorporating these attestations into certificate transparency pre-certificates along with how the domain control verification was performed for the “final” certificate so these things could be audited on a per certificate basis.

You could take the additional step of schematizing all of the audit logs that each of these CAs is required to produce and require them to make them public and verifiable via something like an append-only ledger.

We could even go so far as to eliminate the need for revocation checking by adopting very short-lived certificates as a mandate.

Now don’t get me wrong, this isn’t a complete list, and I doubt there is any set of technical controls that would totally eliminate the need for onsite independent visits but I would argue that they could be more frequent or technically deeper if as an industry we could asses much of the conformity of CAs in real-time and track the changes to their infrastructure.

This sounds like a big lift, after all, there is a top of CAs that exist in the WebPKI, the reality is that the large majority of these don’t focus on TLS as demonstrated by the fact that 5 CAs issue 98.59% of all TLS certificates. This means that you wouldn’t need to get many entities to embrace this new model to cover most of the internet.

If you had enough participants this new root program could be very agile for this new root program. After a short period of initial automated testing and a signed letter of conformity from an auditor, you could automate the inclusion processes. Should the continental monitoring find an issue you could automate the reporting to the CA and even automate the distrust of the CA in some cases.

This new WebPKI would need to exist in parallel to the old one for a long time, in particular, the existence of all of the embedded devices without reasonable firmware update would hold back its total adoption as websites need certificates that work everywhere but you can see a path to an agile, continually verified web PKI that leveraged more of these technical controls rather than all of the manual and procedural verification that is involved today.

I want to be clear, the above is a thought exercise, I am not aware of anyone planning on doing anything like the above, but I do think after 20 years it is entirely reasonable to rethink the topic of what the WebPKI looks like and we don’t need to rest on old patterns “just because that’s how it is done” — we have come a long distance in the last couple of decades and it’s worth looking at back these problems now and again.

WebPKI, TLS, cross-signs, device compatibility, and TLS record size.

1 Reply

Both the Chrome and Mozilla root program have signaled the intent to substantially shorten the time we rely on roots in the WebPKI. I believe this to be a good objective but I am struggling to get my head around the viability and implications of the change.

In this post, I wanted to capture my inner dialog as I try to do just that 🙂

I figure a good place to start when exploring this direction is why this change is a good idea. The simplest answer to this question is probably that root certificates are really just “key bags” — by that I mean they are just a convenient way to distribute asymmetric keys to clients.

Asymmetric keys have an effective lifetime, this means the security properties they offer start to decay from the moment the key is created. This is because the longer the key exists the more time an attacker has had to guess the corresponding private key from its public key, or worse compromise the private key in some way. For this reason, there are numerous sources of guidance on usage periods for asymmetric keys, for example, NIST guidance from 2020 says that a 2048 key is usable between 2019 and 2030.

There is also the question of ecosystem agility to consider. Alexander Pope once said To Err is Human, to forgive divine. This may actually be the most important quote when looking at computer security. After all, security practitioners know that all systems will have security breaches, this is why mature organizations focus so much on detection and response. I believe the same thing is needed for ecosystem management, the corollary, in this case, is how quickly you can respond to unexpected changes in the ecosystem.

Basically having short-lived root keys is good for security and helps ensure the ecosystem does not become calcified and overly dependent on very old keys continuing to exist by forcing them to change regularly.

With that, all said, if it were easy we would already be doing it! So what are the challenges that make it hard to get to this ideal? One of the hairiest is the topic of the Internet of Things (IOT). Though I would be the first to admit there is no one size fits all answer here, broadly these devices should not use the WebPKI.

In 2022, the market for the Internet of Things is expected to grow 18% to 14.4 billion active connections. It is expected that by 2025, as supply constraints ease and growth further accelerates, there will be approximately 27 billion connected IoT devices.
IOT-Analytics.com

A big reason for this is that these devices last a very long time compared to their mobile phone and desktop counterparts. They also, unfortunately, tend not to have managed root stores and if they do, they do it as part of firmware updates and have limited attach rates of these updates. To make things worse these devices take crude simplifying assumptions around what cryptographic algorithms are supported and what root CAs will be used by servers in the future.

But for most IoT applications, like those enabling the smart city, the device life cycle is 10 to 20 years or more. For instance, it doesn’t make sense to replace the wireless module in smart streetlights every few years.
Ingenu

A good topical example of the consequences here might be the payment industry making a decision in the 90s to adopt a single WebPKI CA (a VeriSign-owned and operated root certificate) in payment terminals without having a strategy for updating the device’s root stores. These terminals communicated to web servers that were processing payments. In short, the servers had to use certificates from this one WebPKI Root CA or no payment terminals could reach them.

This design decision was made during a time when SHA-1 was considered “secure”. The thing is cryptographic attacks are always evolving, and since 2005 SHA-1 has not been considered secure. As a result, browsers moved to prevent the use of this algorithm.

This meant that one of two things would happen, either browsers would break payment terminals by preventing WebPKI CAs from using this algorithm, or payment terminals’ lack of an update strategy would put the internet at risk by slowing the migration from SHA-1. Unfortunately, the answer was the latter.

In short, when you have no update story for roots and you rely on the WebPKI you are essentially putting both your product and the internet at risk. In this case, while a root and firmware update story should have existed regardless, the payment industry should have had a dedicated private PKI for these use cases.

OK, so what does this have to do with moving to a shorter root validity period in Browser root programs? Well, when a TLS server chooses what certificate to present to a client it doesn’t know much more than the IP address of the client. This means it can’t selectively choose a TLS certificate based on it being an IOT device, a phone running an old version of a browser, or a desktop browser.

This is where the concept of “root ubiquity” and what those in the industry sometimes called “root baking”. Some root programs process requests for new root inclusion very quickly, maybe within a year your root can become a member of their program but that isn’t enough. You need the very large majority of the devices that rely on that root program to pick up the new root before the devices will trust it.

In the case of Microsoft Windows, the distribution of a root certificate happens via a feature called AutoRoot Update. The uptake for consumer devices of a new root is very high as this is seldom turned off and the URL that is used to serve this update is seldom blocked. Enterprises and data centers are another matter altogether though. While I do not have any data on what the split is, anecdotally I can say it’s still hit or miss if a root has been distributed in this system as a result of this.

Chrome on the other hand uses a configuration management system for all Chrome that also happens to control what roots are trusted. Again while I have no data I can point out anecdotally it appears this works very well and since chrome updates all browsers automatically they tend to have the latest binaries and settings.

Safari on the other hand manages the roots in the firmware/operating system update. This means if a user does not update their devices’ software the roots will never be present even if you are a member of the root program.

These are just a few examples of some of the root programs CAs have to worry about, and the problems they have to consider.

What this means is the slowest root program in the WebPKI to accept and distribute roots holds back the adoption of new CAs.

As an aside, It is worth noting that around 59% of all browser traffic (not including IoT devices) is mobile. This includes a lot of developing country traffic where phones are kept longer and replacement models are often still old models with older software and root stores that are not updated. This is before you get to TVs, printers, medical devices, etc that seldom do forced firmware updates and never manage roots independently from firmware.

As a result of the above, these days I generally tell people that it takes 5-7 years post inclusion to get sufficient ubiquity in a root to be able to rely on it for a popular service as a result.

A CA will typically try to address this problem by acquiring what is known as a cross-sign (see Unobtainium), there are two recent examples of this I can point to. One is the cross-sign for Let’s Encrypt and the other is the cross-sign for Google Trust Services. The problem is most CAs don’t want to help other CAs with cross-signs for obvious reasons. Beyond that increasingly there are fewer and fewer of the “old keys” still in use on the web that is baked into the oldest devices meaning there are not a lot of choices for those cross signs even if you can get one.

Let’s assume for the purpose of this (now long) post that you are able to get a cross sign either from a competitor or because you happen to have custody of older key material that was grandfathered into the root program that has the needed ubiquity.

In the simplest case, you get to a situation where you are asking those that use your certificates to include the two CA certificates in their TLS bundle. At a minimum, we start to pay a performance penalty for doing this.

As an example consider that each certificate is about 1.5KB in size, by adding this new certificate to the bundle every fresh TLS negotiation will carry this tax. If we assume a typical chain is normally 3 in length that makes our new total 4.5kb in overhead. It’s not a large figure but if you consider a high-traffic site like Amazon it adds up quickly.

How many purchases are made on Amazon daily? On a daily basis, Amazon ships more than 66,000 orders per hour in the US. About 1.6 million packages are shipped on a daily basis.
capitalcounselor.com

If we assume each of Amazon’s orders equates to a single fresh TLS negotiation (it should be many more than that since many servers power the site) that is a daily increase of 3.6GB of traffic. Now Amazon can deal with that with no problem but it would slow down the experience for users, and potentially cost those users without unlimited internet plans some of their allotted bandwidth. There is also the potential of fragmentation as the TLS packets get larger which increases the latency for the user.

Don’t get me wrong, these are not deal breakers but they are taxes that this cross sign approach represents.

My fear, and to be clear I’ve not completed the thought process yet, the above story creates a situation where CAs need to provide multiple cross signs to support all the various device combinations that are out there in this new world.

If so this scares me for a few reasons, off the top of my head these include:

It is already hard enough to get a single cross sign doing so multiple times will be that much harder,
It will put new entrants at a disadvantage because they will not have legacy key material to rely on nor the high capital requirements to secure cross signs if commercial terms can even be reached allowing them to get one,
We have spent the last decade making TLS the default and we are close to being able to declare victory on this journey if TLS becomes less reliable we may lose ground on this journey,
It is a sort of regressive taxation on people with older and slower devices that are likely already slow and paying for data on a usage basis.
The additional data required will have a negative impact on Time to First Paint (TTFP) in that this extra data has to be exchanged regardless.

Mom always said don’t complain if you don’t have a proposal to make things better, so I figure I should try to propose some alternatives. Before I do though I want to reiterate that this post isn’t a complaint, instead it’s just a representation of my inner dialog on this topic.

OK, so what might be a better path for us getting to this world of shorter-lived roots? I guess I see a few problems that lead to the above constraints

Even in the browser ecosystem, we need to see root lists updated dynamically,
Inclusion into a root store should take months for your initial inclusion and later updates should happen in weeks,
Browsers should publish data to help CAs understand how ubiquitous their root distributions are so they can assess if cross signs are even necessary,
There needs to be some kind of public guidance to IoT devices on what they should be doing for the usage of certificates in their devices and this needs to include root management strategies.
IOT devices need to stop using the WebPKI, it is increasingly the Desktop and Browser PKI and anything else will get squashed like a bug in the future if they are not careful 🙂
There should be a Capability and Maturity Model for root programs and associated update mechanisms that can be used to drive change in the existing programs
CAs should stop using one issuing hierarchy for all cases and divide up the hierarchies into as narrow of slices as possible to reduce the need for one root to be trusted in all scenarios.
Root programs should allow and go so far as to encourage CAs to have more than the 3 (typical) roots they allow now to support CAs in doing that segmentation
WebPKI root programs should operate two root programs, the legacy one with all of its challenges and the new one focused on agility, automation, and very narrow use cases. Then they should use the existence of that program to drive others to the adoption of those narrower use cases.
And I guess finally we have a chance with the Post Quantium TLS discussions to look at creating an entirely new WebPKI (possibly without X.509!) that is more agile and narrow from the get-go.

Why crawling is not an adequate measurement methodology for the WebPKI

UNMITIGATED RISK

un.mit.i.gat.ed: Adj. Not diminished or moderated in intensity or severity; unrelieved. risk: N. The possibiity of suffering harm or loss; danger.

Monthly Archives: August 2022

Why hasn’t SHAKEN/STIR made a big dent in the volume of robocalls?

User-managed smart cards are an oxymoron

Getting an EV Code Signing Certificate in 2022

How to keep bad actors out in open ecosystems

What would it look like to go back to first principles when it comes to root store management in 2022?

WebPKI, TLS, cross-signs, device compatibility, and TLS record size.

Why crawling is not an adequate measurement methodology for the WebPKI