Raising the Bar: The Urgent Need for Enhanced Firmware Security and Transparency

Firmware forms the foundation of all our security investments. Unfortunately, firmware source code is rarely available to the public and as a result is one of the least understood (and least secure) classes of software we depend on today.

Despite this, the hardware industry is known for its lack of transparency and inadequate planning for the entire lifecycle of their products. This lack of planning and transparency makes it hard to defend against and respond to both known and unknown vulnerabilities, especially when the industry often knows about issues for ages, but customers do not.

In today’s world, automation allows builders, defenders and attackers to automatically identify zero-day vulnerabilities with just a binary it has become increasingly important that embargo times for vulnerabilities are as short as possible, allowing for quick defense and, when possible, remediation.

Despite this, organizations like the UEFI Forum are proposing extended disclosure periods, suggesting a 300-day wait from initial reporting to the vendor before notifying customers. During this year-long waiting period, customers are exposed to risks without defense options. The longer the period, the more likely it is that automation enables the attacker to identify the issue in parallel, giving them a safe period to exploit the zero-day without detection.

Simply put, this duration seems way too long, considering the ease of proactively catching issues now — especially given the industry’s overall underinvestment in product security. It would be a different case if these organizations had a history of handling issues effectively, but the reality is far from this. Their apparent neglect, demonstrated by unreliable update mechanisms, continuously shipping models with the same issues that have been resolved in other models, and the frequency of industry-wide issues highlight this reality. More often than any other industry, we see hardware manufacturers often reintroducing previously resolved security issues due to poor security practices and poor management of their complex supply chains. This reality makes this position highly irresponsible. We must do better. Concealing vulnerabilities like this is no longer viable — if it ever was.

It is possible we will see changes as a result of shifts in software liability and regulatory changes, like those in White House Executive Order 1428. This order demands that organizations responsible for “critical software” comply with long-standing best practices. Although “critical software” lacks a clear definition, firmware’s role in underpinning all security investments suggests it likely falls into this category. This executive order starts with basics like publishing known dependencies, which is useful but insufficient, especially in this segment given the prevalence of shared reference code and static dependencies that are not expressed as a library dependencies. This language includes adoption of formal vulnerability management practices, bug bounties, and more. This and the EU Cyber Resilience Act are all efforts to get these and other vendors to align with long-time security best practices, like those captured by efforts like the NIST’s vulnerability management recommendations.

This landscape will likely shift once we see enforcement cases emerge, but customers must insist on higher standards from hardware manufacturers and their suppliers, or nothing will change in the near term.

Words matter in cryptography or at least they used to

I was listening to Security Cryptography Whatever today, and they were discussing a topic that has been bothering me for a while.

A common theme in post-quantum cryptography is its pairing with classical cryptography. This “belts and suspenders” approach seems sensible as we transition to relatively new ways to authenticate and protect data. We have already seen some of these new post-quantum methods fail, which underscores the importance of agility in these systems.

However, merging two approaches like this introduces complexity, which is important since as a general rule, complexity is the root of all security issues. Another concern is the labeling of various strategies for doing this as “Hybrid.” This wording makes it challenging to understand what the different approaches are doing and why.

With this background in mind, let’s explore three different “Hybrid” approaches to PQC and classical cryptography. By giving each a unique name and using simple examples, to see if we we can show how they differ: Nested Hybrid Signatures, Side-by-Side Hybrid Protocols, and the proposed Merged Hybrid Signatures.

Nested Hybrid Signatures: A box within a box

In this approach, imagine verifying the authenticity of a letter. The nested hybrid signature method is like putting this letter within a secure box, protected by a classical signature scheme like ECDSA. But we don’t stop there. This box is then placed within another, even stronger box, strengthened with a post-quantum signature scheme like Dilithium. This nested structure creates a situation where even if one layer is broken, the inner core remains trustable..

Side-by-Side Hybrid Protocols: Simultaneous and Nested

In this method, imagine two separate safes, each protecting a part of your secret message. One safe has a classical lock, while the other has a modern, quantum-resistant lock. To see the entire message, one must unlock both safes, as the full message remains trustable unless both safes are broken into. 

Merged Hybrid Signatures: Holding onto the past

This method tries to mix the elements of classical and post-quantum signature schemes into a single, unified signature format. The goal of this approach is to enable minimal changes to existing systems by maintaining a single field that combines a classical signature with a post-quantum signature. This has several issues and seems misguided to me. Firstly, this mixing of PQC and classical cryptography is a temporary problem; eventually, we should have enough confidence that post-quantum cryptography alone is enough at which point this complexity wouldn’t be needed. It also messes with the current assumptions associated with existing signatures, and while it’s not clear what the issues may be, keeping each of the signatures isolated seems less risky. To stick with the lock analogy, it’s somewhat like designing a safe with two different locks on the same door, which must be unlocked at the same time with the same key.

Conclusion

While it’s tough to find the right words to describe new developments as they happen we can do better to avoid using the same terms for different approaches. This will make it easier for everyone to understand what’s being discussed without having to study each protocol in detail. 

Document Authenticity in the Age of Generative AI

In our rapidly evolving lives, the credibility of documents, images, and videos online has emerged as a concern. The pandemic and recent elections have helped highlight this issue. In the case of elections, one area that stands out to me is concerns over voter roll integrity, a pillar of our democratic process in the US.  

As we grapple with these issues, it is important to explore what a solution might look like that balances the associated privacy concerns. Is it possible to provide assurance of integrity and transparency while also providing accommodations for privacy and accountability?

Misinformation in the Digital Age

Despite its challenges, the pandemic did have a silver lining — it brought attention to the internet’s role as a breeding ground for misinformation campaigns. These campaigns featured manipulated images and documents, creating confusion and distrust globally. They also underscored a glaring gap in our current systems — no broad deployment of reliable mechanisms to verify the authenticity and origin of digital content.

The recent advancements in generative AI over the last two years have further complicated this issue. Now with a few words, anyone on the web can create images that at first blush look real. This technology will only continue to get better which means we will need to begin to more formally look at how we build solutions to tackle this new reality.

Existing Solutions and Their Shortcomings

Several technologies have recently been discussed as the way forward to address at least portions of these problems. One such example is the Content Authenticity Initiative which proposes that devices like cameras cryptographically sign all pictures and videos with a device credential, a feature aimed at enabling the detection of any alterations made post-capture. 

This method raises significant privacy concerns. Essentially, it could create a surveillance infrastructure where each content piece could be unexpectedly traced back to an individual or a group of devices, potentially becoming a surveillance tool.

Google DeepMind also recently brought forth the idea of opt-in watermarking for images created through AI technologies. While this initiative seems promising at a glance, it fails to address the nuances of the threat model. For instance, a nation-state with intentions to manipulate an election using generative AI assets wouldn’t likely volunteer to watermark these materials as AI-generated. This significant loophole sets a precarious stage where misinformation can still flourish.

These approaches, though developed with noble intentions, reveal critical gaps in addressing the complex landscape of content authenticity. They either infringe upon individual privacy rights or are vulnerable to exploitation when faced with a real threat model. 

Middle Ground: Publisher Signatures and Blinding as a Potential Solution

A more nuanced approach could utilize optional cryptographic signatures linked to a publisher, instead of devices, when signed, the publisher, not their devices, opts into staking their reputation on the authenticity of the artifact. Coupled with a feature to enable cryptographically blinding the publisher’s identity, this strategy could offer a safe avenue for them to reveal their identity at a later time, if necessary. Such a situation might arise in cases of whistleblower claims, where shielding the publisher’s identity becomes crucial for their safety. This blinding could strike a balance, granting publishers temporary anonymity while preserving the potential to enable them to later opt-in to publicly stand behind the artifact or to take accountability for any misinformation.

In contrast to devices subtly leaking metadata that would put subjects in the position to have to prove a negative, for example, needing to explain a picture does not tell the whole story. Or even worse putting the subject of a picture in a situation where they need to prove that the device that captured it was compromised, This is similar to what happens today with red-light cameras and automated radar guns where poorly calibrated devices result in innocent people being charged.

The proposed model shifts the identification to publishers in the hope of fostering a responsible publishing culture where publishers have the discretion to unveil their identity, promoting accountability without completely compromising privacy.

It is also worth noting that a transition from ink signatures to cryptographic signatures for documents appears more pertinent than ever. Generative AI and graphic design technologies have enhanced the ability to replicate handwriting styles, making traditional signatures highly susceptible to forgery. The use of cryptographic techniques offers a more secure alternative, integrating seamlessly into modern workflows and standing resilient against unauthorized alterations.

Publisher Signatures Are Not Enough

In information security, it’s now accepted that insider threats are a significant risk. This realization has steered us away from merely verifying the identity of a publisher, especially in cryptographic signing systems such as code signing. 

There are a few reasons, but one of the largest is that key management proves to be challenging, often due to the fact that publishers frequently represent groups rather than individuals, leading to key management practices being more permissive than ideal from a security standpoint. 

Additionally, if a solution is to incorporate the possibility of anonymity through cryptographic blinding we can not simply bet on the presence and blind trust in that identity.

This is the same reason that led modern code-signing solutions to adopt ledgers that record an artifact’s provenance and history. For instance, in a Binary Transparency system, a ledger might house not only a list of software packages and their contents but also offer qualitative attestations about the software, for example indicating whether it has been screened for malware or verified to be reproducible from its source. This mechanism allows devices to understand not just the origin of the code but also to grasp the intended release of the software and potentially qualitative aspects to consider before reliance on it.

Applying this pattern to our document provenance and integrity problem, this system can offer value even when the identity remains undisclosed. It can provide a basic idea of the time of existence and allow third parties to vouch for the authenticity, possibly linking to other corroborative artifacts.

Building a continuously verifiable record coupled with supportive evidence for artifacts like documents seems to be a step in the right direction. This approach has demonstrated its value in other problem spaces.

With that said it’s essential to acknowledge that, as with any opt-in system, documents, images, and videos will not all contain this additional provenance and like with all technology this too would not be perfect. As a result, this means that rather than outright dismissal, all content will need to be evaluated based on merit, and the evidence collected about it. At a minimum, we must recognize that it can take years for any new system to gain substantial traction.

Learning from the rest of the world.

This issue is not confined to the US, so we should not restrict ourselves to looking at approaches used by US Big Tech. For instance, the strategies suggested here significantly draw upon the principles of electronic signatures, particularly e-Seals, which are prevalent in the EU and other regions. This European model offers vital insights and presents a robust strategy for resolving disputes, albeit reliant on specific technologies.

Interestingly, US online notarization rules have also borrowed elements from the EU, mandating the use of cryptographic signatures, akin to the EU’s emphasis on Advanced Signatures.

By combining this approach with the lessons learned from Certificate and Binary Transparency, where Merkle trees of published materials, continuous monitoring, and third-party evaluation help ensure a more complete picture — we might find a path forward. 

The addition of blinding the publisher’s identity in a way where they could selectively disclose their identity in the future also seems to provide a plausible way to balance the privacy concerns that could enable this path to become the default in the future.

Motivating Participation through Innovation, Policy and Leadership

Adoption of this approach cannot solely rely on goodwill or regulation. It would require a combination of standardization, regulatory changes, creating incentives for publishers, engagement with civil society and other stakeholders, and some tangible leadership by example by a large player or two. Some ideas in this direction include:

  • Initiating revisions to the existing digital signature legislation seems to be a sensible first step. Given our substantial background with the current laws, it’s clear that without regulatory changes, technological investments are unlikely to shift.
  • The government can lead by example by enhancing initiatives like the current Digital Autopen project. This project allows groups of individuals to access shared signing credentials without tying them to individual users, addressing a notable challenge prevalent in code signing and other organizational signing efforts.
  • I also believe that investing in a singular, immensely impactful use case as a case study could vividly illustrate how these approaches can make a significant difference. The voter registration rolls mentioned earlier seem like an excellent use case for this.
  • Further research and standardization could explore integrating cryptographic blinding of signer identities within the current document signing infrastructure, allowing for future disclosure. Investigating the potential structure and security of ledgers, and considering the extension of signing protocols to other artifact formats, seems to be a natural progression in supporting a system of this kind.
  • Simultaneously, collaboration with civil society, tech companies, and other stakeholders, including publishers, appears vital. This will guarantee that their concerns are integrated into the developed solutions and that appropriate policies are instituted to effectively incorporate this metadata into their operations.
  • I also believe investing in a singular and hugely impactful use case as a case study of how these approaches can make a big difference. The voter registration rolls discussed earlier seem like a great use case for this.

While these efforts would not necessarily lead to adoption it does seem that adoption would minimally be a predicate on efforts like these.

A Pathway to Trust and Accountability

Balancing privacy and accountability in the digital age is a nuanced but achievable goal, especially if we build on top of existing successes. By adopting a well-rounded approach that integrates cryptographic signatures with mechanisms for temporary anonymity, we can carve a pathway toward a society where digital content maintains its integrity and trustworthiness. 

Moreover, by fostering an environment where content, even without clear provenance, is evaluated critically rather than dismissed, we encourage a richer discourse and a healthier digital ecosystem.

Through the union of technology and policy, we can create a more secure, transparent, and accountable future for content authenticity.

The Scale of Consequence: Storm-0558 vs DigiNotar

When we look at the Storm-0558 and DigiNotar incidents side by side, we find striking similarities in their repercussions and severity. Both cases involve significant breaches orchestrated by nation-states – China and Iran respectively, targeting critical digital infrastructure and security protocols that are designed to safeguard user data and communications.

In the case of Storm-0558, the skilled dismantling of Microsoft’s authentication infrastructure not only compromised the integrity of exchange inboxes but potentially rendered confidential information accessible to unauthorized entities.

Similarly, the DigiNotar breach constituted a severe undermining of internet security, as the attackers were able to issue trusted certificates that facilitated man-in-the-middle attacks. This compromised user interactions with sensitive services, including email communications.

Given their similar impact on user privacy and internet security, it begs the question are we treating both incidents with equal gravitas and severity?

If not we must ask the question as to why and what are the consequences of that reality?

To answer these questions it might be useful to think about a different kind of breach of trust that happened in the late 2010s where a fake vaccination campaign was used as a cover to collect DNA samples in the hunt for Osama bin Laden. That move ended up causing a lot of people in the area to give a side-eye to vaccination drives, fearing there’s more than meets the eye.

It almost feels like sometimes, big tech in the US gets to bend the rules a little, while smaller players or those from other parts of the world have to toe the line. It’s this uneven ground that can breed mistrust and skepticism, making folks doubt the systems meant to protect them.

In short, these decisions to compromise core infrastructure and come with long-term consequences that are surely not being fully considered.

The Evolution and Limitations of Hardware Security Modules

Summary

Hardware Security Modules (HSMs) have not substantially evolved in the last two decades. The advent of enclave technology, such as Intel SGX, TDX and AMD SEV, despite their weaknesses [1,2,3,4,5,6,7,8,9], has enabled some to build mildly re-envisioned versions of these devices [10]. However, the solutions currently available on the market today are far from meeting the needs of the market. In this paper, we will explore what some of the shortcomings of these devices look like and discuss how the market for them has evolved.

A Brief History of HSMs

The first Hardware Security Module was introduced in the late 1970s by IBM. It was designed to be attached to a specific host with the goal of never exposing the PINs to the host. By the early 90s, there were a handful of more capable solutions in this market, they were primarily sold to governments, often for military use cases where the cost of key compromise warranted the increased operational burden and associated costs of this particular approach.

In the late 90s, we started to see practical innovation in this space. For example, the existing HSMs moved into hosts that enabled them to be accessed over the network. We also began to recognize the need to use code containing business logic that gated the use of specific keys. nCipher was one of the first to market with a generic offering in this space which was ahead of its time.

During this time period, the US government had established itself in the area of security and cryptographic standards. It was in this decade that we saw the concept of “Military Grade” security being extensively used in marketing. At this time, the idea of a commercial security industry was in its nascent stage, and the traction that was achieved was usually with government agencies and those who provided services to governments.

As these companies expanded their sales to the rest of corporate America, they mostly pushed for adopting the same patterns that had been found to be successful in the government. In the context of key management, these physical HSMs were considered the gold standard. As cryptography was seldom used in commercial software in any extensive way, the adoption of these devices was usually limited to very high-security systems that were often largely offline and had low-volume usage, at least by today’s standards.

During the 2000s, enterprise software began to move to third-party data centers and later to the cloud. This trend continued into the 2010s and gave us SEV/SXG-based appliances offering HSM-like capabilities, as well as the first HSMs designed for some level of multi-tenancy. However, from a product standpoint, these devices were designed much like their predecessors, so they inherited many of their shortcomings while also introducing their own issues along the way.

Evolution of Development in the Era of HSMs

In the 1970s, security was not really considered a profession but rather an academic exercise at best. In the following decades, software and hardware were shipped with huge leaps of assumption, such as assuming that anything on the network that can talk to me is trusted because we share the physical network. Networks were largely closed and not interconnected at the time. It was not until the 1980s that TCP/IP was standardized, and the interconnected ecosystem of networks and devices we all know today became a reality.

During the period of technological evolution, software was often written in Assembler and C. In the late 1990s and 2000s, the concept of memory-safe languages became increasingly important. Originally, these languages required large tradeoffs in performance, but as they evolved and Moore’s Law caught up, those tradeoffs became inconsequential for most applications.

It was during this same period that security, as a whole, was evolving into both a profession and an academic specialty. One of the largest milestones in this transition was the Microsoft Trustworthy Security Memo [11], which stressed that security had to become core to the way Microsoft built software. It took the next two decades, but the industry evolved quickly at this point, and the approaches to produce secure-by-default software, as well as the technology that made it easier to create secure-by-default software, became more common.

During this time the US Government stopped being seen as a leader in security, and its role in developing cryptographic standards began to wain due to several suspected examples of the government pushing insecurities and viabilities into both standards and commercial systems.

As a result, we saw a Cambrian explosion of new cryptographic approaches and algorithms coming out of academia. The government’s role in standardizing cryptography was still strong, and the commercial and regulatory power of the government meant that products using cryptography were effectively limited to using only the algorithms approved by the government.

Around the same time, Bitcoin, cryptocurrencies, and blockchain gained momentum. As they had no interest in governments, this created a once-in-a-lifetime opportunity for the world to explore new design patterns and approaches to solving problems related to cryptography and lit a fire under researchers to further long-standing ideas like Homomorphic Encryption (HE), Multi-Party Computation (MPC), Threshold Signatures, new ECC algorithms and more.

At the same time, we saw quantum computers become far more real, which introduced the question of how the cryptography we rely on will need to change to survive this impending change. Despite the reputational damage experienced by the US government due to its shepherding of cryptographic standards in preceding years, it still had a role to play in establishing what cryptographic algorithms it will rely upon in this post-quantum world. In 2016, the government started a standardization process for the selection of post-quantum cryptographic algorithms. In 2022, they announced the first four approved PQ algorithms based on that process [12].

During this same time, we have seen a substantial increase in security in the applications and operating systems we use, as a result of improvements in the processes and techniques used to build software. Despite this, there is still a long way to go. In particular, although this is changing, a lot of development happens in non-memory safe languages, and legacy applications still see broad deployment, which makes broad assumptions on the software and operating system dependencies they rely on.

Cloud adoption has played a significant role in driving this change, with as much as half of all enterprise computing now happening on cloud service providers. With the move to cloud, the physical computing environment is no longer the primary risk when it comes to protecting keys. The focus has shifted to online access, and as recent cryptocurrency compromises have shown, it is key management, not physical key protection, that is the weak link in modern cryptographic systems.

One notable exception here is Signal. It has been designed to minimize assumptions on underlying dependencies and is seen as the most secure option for messaging. So much so that most large technology companies have adopted their design, and the IETF has standardized its own derivatives of their protocols. This, combined with the earlier trend, signals how we are not only changing the processes, languages, and tooling we use to build everyday software but also the actual designs in order to mitigate the most common vulnerabilities.

This journey is not done, the Whitehouse has recently signaled [13,14,15] it will be using its regulatory power to accelerate the move to more secure services and will be looking to “rebalance the responsibility to defend cyberspace” by shifting liability for incidents to vendors who fail to meet basic security best practices.

What does this mean for HSMs?

The evolving landscape of cybersecurity and key management has significant implications for Hardware Security Modules (HSMs). With businesses moving their computing to cloud service providers, protecting keys has shifted from a physical computing environment to online access, making key management the weak link in modern cryptographic systems.

To stay relevant and effective, HSMs need to adapt and innovate. They will become computing platforms for smart contract-like controls that gate access to keys rather than cryptographic implementations that protect from physical key isolation.

This means that the developer experience for these devices must evolve to match customers’ needs for development, deployment, and operations, including languages, toolchains, management tools, and consideration of the entire lifecycle of services that manage sensitive key material.

Additionally, the HSM industry must address the issue of true multi-tenancy, securing keys for multiple users while maintaining strong isolation and protection against potential breaches.

Moreover, the HSM industry must address the growing demand for quantum-safe cryptography as current cryptographic algorithms may become obsolete with the development of quantum computers. HSMs must support new algorithms and provide a smooth transition to the post-quantum world.

In conclusion, the HSM industry needs to continue evolving to meet the changing needs of the cybersecurity landscape. Failure to do so may lead to new entrants into the market that can better provide solutions that meet the needs of today’s complex computing environments.

What are the opportunities for HSMs?

The changing landscape of cybersecurity and key management presents both challenges and opportunities for Hardware Security Modules (HSMs). One opportunity is the increasing need for secure key management as more and more businesses move their computing to cloud service providers. This presents an opportunity for HSMs to provide secure, cloud-based key management solutions that are adaptable to the evolving needs of modern cryptography.

Furthermore, the shift towards smart contract-like controls gating access to keys presents an opportunity for HSMs to become computing platforms that can be customized to meet specific business needs. This could lead to the development of new HSM-based services and applications that can provide added value to customers.

Another opportunity for HSMs is the growing demand for quantum-safe cryptography. With the development of quantum computers, the cryptographic algorithms used today may become obsolete, requiring the adoption of new quantum-safe cryptographic algorithms. HSMs can play a critical role in providing the necessary support for these new algorithms and ensuring a smooth transition to the post-quantum world.

In addition, the increasing adoption of blockchain and cryptocurrencies presents a significant opportunity for HSMs. These technologies rely heavily on cryptographic keys for security, and HSMs can provide a secure and scalable key management solution for these applications.

Overall, the changing landscape of cybersecurity and key management presents several great opportunities for HSMs to provide innovative solutions that can meet the evolving needs of businesses and the broader cryptographic community.

Towards Greater Accountability: A Proposal for CA Issuance Decision Logs

It took us a long time, but objectively, Certificate Transparency is a success. We had to make numerous technology tradeoffs to make it something that the CAs would adopt, some of which introduced problems that took even longer to tackle. However, we managed to create an ecosystem that makes the issuance practices of the web transparent and verifiable.

We need to go further, though. One of the next bits of transparency I would like to see is CAs producing logs of what went into their issuance decisions and making this information public. This would be useful in several scenarios. For example, imagine a domain owner who discovers a certificate was issued for their domain that they didn’t directly request. They could look at this log and see an audit history of all the inputs that went into the decision to issue, such as:

  • When was the request for the certificate received?
  • What is the hash of the associated ACME key that made the request?
  • When CAA was checked, what did it say? Was it checked via multiple perspectives? Did all perspectives agree with the contents?
  • When Domain Control was checked, did it pass? Which methods were used? Was multiple perspectives used? Did all perspectives agree with the contents?
  • What time was the pre-certificate published? What CT logs was it published to?
  • What time was the certificate issued?
  • What time was the certificate picked up?

This is just an example list, but hopefully, it is enough to give the idea enough shape for the purpose of this post. The idea here is that the CA could publish this information into cheap block storage files, possibly. I imagine a directory structure something like: ” /<CA CERTHASH>/<SUBJECT CERTHASH>/log”

The log itself could be a merkle tree of these values, and at the root of the directory structure, there could be a merkle tree of all the associated logs. Though the verifiability would not necessarily be relied upon initially, doing the logging in this fashion would make it possible for these logs to be tamper-evident over time with the addition of monitors.

The idea is that these logs could be issued asynchronously, signed with software-backed keys, and produced in batches, which would make them very inexpensive to produce. Not only would these logs help the domain owner, but they would also help researchers who try to understand the WebPKI, and ultimately, it could help the root programs better manage the CA ecosystem.

This would go a long way to improving the transparency into CA operations and I hope we see this pattern or something similar to it adopted sooner rather than later.

Exploring the Potential of Domain Control Notaries for MPDV in WebPKI

In an earlier post on the Role of Multiple Perspective Domain Control Validation (MPDV) in the WebPKI, I discussed how there was an opportunity for CAs to work together to reduce the cost of meeting the upcoming requirements while also improving the security of the ultimate solution.

In this post, I want to talk a little about an idea I have been discussing for a while: specifically, Domain Control Notaries.

Before I explain the idea, let’s first look at how domain control verification (MPDV) happens. In short, the CA generates a random number and asks the certificate requestor to place that number in a location that only an administrator can access, for example, in a DNS record, in part of the TLS exchange, or in a well-known location.

If the CA is able to fetch the number, and the underlying network is living up to its promises, it can have confidence that the requestor is likely authorized for the given domain.

To understand Domain Control Notaries you will also need to have a basic understanding of what MPDV means. Thankfully, that’s easy: do that from multiple network locations and require a quorum policy to be met. This basically means that an attacker would have to tick enough of the quorum participants to bypass the quorum policy.

So that takes us back to the idea of Domain Control Notaries. As you can see, Domain Control Verification is super simple. That means you could put it on a very small computer, and it could be able to perform this simple task. For example, imagine a USB Armory that was fused at manufacturing time with firmware that only did these domain control checks. That this hardware also had a cryptographically unique key derived from the hardware fused to the device at manufacturing time.

Now imagine an aggregator handling admissions control for a network of these Domain Control Notaries. This aggregator would only allow devices that were manufactured to meet these basic requirements. It would enforce this because the manufacturer would publish a list of the public keys of each of the devices to the aggregator, which would use for admission control.

This aggregator would expose a simple REST API that would let the requestor specify some basic policy on how many of these Domain Control Notaries to broadcast their request to, what domain control methods to use, and a callback URL to be used by the aggregator when the verification is complete.

These aggregators would only accept responses from the Domain Control Notaries that signed their responses and whose keys were on this authorized list and were not added to their deny lists.

This basic framework sets you up with a network of very cheap network endpoints that can be used to perform domain control verification. You could even have a few of these aggregators each with its own Domain Control Notaries. CAs could use multiple of these aggregator networks to reduce centralization risk.

You might be asking yourself how these tiny computers could deal with the scale and performance of this task! The reality is that in the grand scheme of things, the WebPKI is relatively small! It is responsible for only 257,035 certificates every hour. The real number is actually smaller than this too because that includes some pre-certificates and in the context of domain control verification. CAs are also allowed to do some re-use of past validations if recent enough. This means we should be able to use this as a worst-case number safely. Simply put, that is only 1.18 certificates every second. That is tiny. If you spread that out over a few hundred Domain Control Notaries, the number of transactions gets that much smaller.

Then there is the question of who would run these Domain Control Notaries? A lesson learned from Certificate Transparency is that if you make it easy and cheap to participate and make it easy to both come and go, most organizations are willing to help. You’re basically asking an operator to provide a few lightbulbs of electricity and a tiny amount of network connectivity. This is easy for most organizations to sign up for since there is no tax in turning it down, and no impact if there is an outage.

Remember how I said there could be multiple aggregators? These aggregators could also expose more expensive heavy-weight nodes that were not reliant on Domain Control Notaries to provide a more reliable substrate to power such a network.

That’s Domain Control Notaries. Maybe they can be a tool to help us get to this world of MPDV everywhere.

Strengthening Domain Control Verification: The Role of Multiple Perspectives and Collaboration

The security and stability of encryption on the web rely on robust domain control verification. Certificate Authorities in the WebPKI are increasingly facing attacks that exploit weaknesses in the Border Gateway Protocol (BGP) ecosystem to acquire certificates for domains they do not legitimately control.

While Resource Public Key Infrastructure (RPKI) has the potential to mitigate these threats, its adoption is hindered by several structural barriers that have slowed its adoption.

In response, larger more security-minded CAs have started embracing the concept of Multiple Perspective Domain Control Verification (MPDV) to enhance their defenses. The fundamental idea of MPDV is that before issuing a certificate, the CA will require numerous network perspectives to agree that the domain control verification criteria have been met.

Researchers at Princeton University have played a significant role in this journey in various ways, including raising awareness about the issue, evaluating the effectiveness of different MPDV implementations, and helping determine efficient quorum policies.

This combination has led to Google Chrome signaling an intention to require MPDV from all CAs. This indicates that there is enough data to demonstrate this is both valuable and doable and I agree with this conclusion.

This new requirement will have several consequences. This is because implementing a competent MPDV solution is more difficult than it appears on the surface. For instance, these network perspectives need to be located in different networks for this to be an effective tool to mitigate these risks. One of the most expensive aspects of operating a transactional service is managing the environment in which the service runs. This means that if CAs distribute the entire MPDV checking process to alternative network perspectives, they will need to manage multiple such environments. The cost and complexity of this go up as the number of perspectives is added.

This should not be a problem for the largest CAs, and since the top 10 CAs by issuance volume account for 99.58% of all WebPKI certificates, achieving broad coverage of the web only requires a few companies to make these investments and they should be able to assume those costs. But what about the smaller CAs?

These smaller, regional CAs are often focused on language-specific support in the markets they operate in, assisting with local certificate-related product offerings such as document signing or identity certificates, and adhering to regional regulations. These are much smaller markets and leave them with far fewer resources and skills to tackle problems like this. The larger CAs on the other hand will also end up duplicating much of the same infrastructure as they worked toward meeting these requirements. 

This suggests there is an opportunity for CAs to collaborate in building a shared network of perspectives. By working together, CAs can pool resources to create a more diverse network of perspectives. This can help them meet the new requirements more efficiently and effectively, while also strengthening the internet’s overall security.

Key Management and preparing for the Crypto Apocalypse

Today, keeping sensitive information secure is more critical than ever. Although I’m not overly concerned about the looming threat of quantum computers breaking cryptography, I do worry about our approach to key management, which significantly impacts how we will ultimately migrate to new algorithms if necessary.

Traditional key management systems are often just simple encrypted key-value stores with access controls that release keys to applications. Although this approach has served us well in the past, moving away from bearer tokens to asymmetric key-based authentication and, ultimately, the era of post-quantum cryptography (PQC) demands a different approach.

Why am I concerned about bearer tokens? Well, the idea of a long-lived value that is passed around, allowing anyone who sees the token to impersonate its subject, is inherently risky. While there are ways to mitigate this risk to some extent, most of the time, these tokens are poorly managed or, worse, never changed at all. It’s much easier to protect a key that no one sees than one that many entities see.

The old key-value approach was designed around this paradigm, and although some systems have crude capabilities that work with asymmetric keys, they leave much to be desired. If we want seamless, downtime-free rollover of cryptographic keys across complex systems, we need a model that keeps keys isolated even from the entities that use them. This change will significantly improve key rollover.

This shift will require protocol-level integration into the key management layer, but once it’s done in a reusable way, keys can be changed regularly. A nice side effect of this transition is that these components will provide the integration points allowing us to move to new algorithms in the future.

What does this have to do with PQC? Unlike the shift from DES to AES or RSA to ECC, the post-quantum algorithms available now are substantially larger and slower than their predecessors, meaning the gradual migration from the once state-of-the-art to the new state-of-the-art won’t start until it absolutely has to. Instead, the migration to PQC starts by changing the way we build systems, specifically in how we architect key rollover and the lifecycle of keys. I personally believe the near-term impetus for this change will be the deprecation of bearer tokens.

The importance of seamless and automated rollover of keys is crucial for making systems secure, even if the post-quantum apocalypse never happens.

I also think we will see PQC readiness in credentialing systems. For example, we may see ACME clients support enrolling for PQC certificates simultaneously as they enroll for their ECC certificates, or perhaps support the (more bloated) hybrid certificates.

In conclusion, rethinking our key management approach is increasingly important. So far, I have not seen anyone come to market with what I would call a different approach to key management, and we need one.

The Growing Security Concerns of Modern Firmware and the Need for Change.

Today’s firmware is larger and more complex than ever before. In 1981, the IBM PC BIOS was a mere 8 KB, but now UEFI, even without considering machines with BMCs, can be 32 MB or even larger! To illustrate the magnitude of the problem, Intel will soon release its next-generation SoCs with support for 128 MB of firmware!

Essentially, UEFI has become a real-time OS with over 6 million lines of code and is still growing. This is larger than most modern operating systems. Furthermore, the various boot phases and hardware layers create significant complexity for defenders. Increased surface area leads to more vulnerabilities.

The most impactful and difficult-to-patch vulnerabilities reside in the underbelly of technology. Firmware, file systems, BGP, and other foundational aspects of technology that we often take for granted are becoming more vulnerable to attacks. It’s time to prioritize security for the very foundation of our tech. Benjamin Franklin once said, “A failure to plan is a plan to fail.” This adage often applies to long-term vulnerabilities in technology. Insufficient planning can lead to an inability to detect issues, inadequate data to assess their true severity, and a lack of responsiveness.

Firmware serves as a prime example. Many firmware-level issues remain unpatched because firmware often lacks the measurement and patching middleware we expect from software. Moreover, hardware vendors frequently behave as if their job is complete once they release a patch. Imagine if, in 2023, software vendors merely dropped a patched piece of software into a barely discoverable HTTP-accessible folder and proclaimed, “Thank goodness we’ve done our part.” This scenario largely reflects the current state of firmware.

One reason for this situation is that the problem on the surface appears intractable. A typical PC may house dozens of firmware components, with no inventory of what exists. This firmware often originates from multiple vendors and may include outdated chips that have not been updated.

Another fitting saying is, “You can’t manage what you can’t measure.” Combine this with the exponential growth of firmware surface area and the increasing number of internet-connected devices containing firmware, and you have a massive security issue arising from decades of neglect.

There is no silver bullet here. One aspect to address is the way firmware is built. USB Armory aims to solve this by making firmware memory safe, easy to read, and with minimal dependencies. While this is a positive step, it is not sufficient on its own. Binarly.io has created the best automation available for detecting firmware issues automatically, which is invaluable considering that old approaches will persist for decades.

To drive change, we need better measurement and widespread adoption of automatic update mechanisms for firmware of all sizes. These mechanisms must be safe, reliable, and robust. Misaligned incentives contribute to the problem, often resulting from a lack of accountability and transparency. This is why I dedicated as much time as I could to binary.transparency.dev while at Google.

The momentum around software supply chain security is essential, as it sheds some light on the problem, but alone it is not enough to bring about the necessary change. If you create a chip with firmware that has a vulnerability, your responsibility does not end with shipping a patch. If you ship devices without providing a way to seamlessly patch all firmware, you are failing.

Relying on the next hardware refresh cycle to update firmware on your devices in the field is insufficient. With cloud adoption, refresh cycles might even lengthen. A long-term strategy to support your devices is necessary; not doing so externalizes the consequences of your inaction on society.

If you have devices in the field that are in use, and you don’t have a confident inventory of the dependencies that exist in them, and you’re not monitoring those dependencies and the firmware itself for issues, you are part of the problem, externalizing consequences on society.

We can do better.

To improve firmware security, the industry must collaborate and adopt best practices. This includes embracing transparency, robust patch management systems, and long-term support strategies for devices. By working together to address these challenges, we can build a more secure foundation for the technology that underpins our modern world.

In conclusion, it’s crucial that we prioritize firmware security, as it plays a fundamental role in the safety and reliability of our devices and systems. By implementing more effective measurement, automatic update mechanisms, and long-term support strategies, we can reduce the risks associated with outdated and vulnerable firmware. This will help create a safer digital environment for everyone.

P.S. Thanks to @matrosov and @zaolin for their insights on the problem on Twitter.