Category Archives: Technology

Evolving Challenges in Software Security

In 2023, we observed an average month-to-month increase in CVEs of approximately 1.64%, with this rate accelerating as the year progressed. At the same time, several trends emerged that are associated with this increase. These include a heightened focus on supply chain security by governments and commercial entities, intensified regulatory discussions around how to roll out concepts of software liability, and the expanded application of machine learning technologies in software security analysis.
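
For context, an average month-over-month increase of roughly 1.64% compounds to a bit more than 21% growth over a full year. A quick back-of-the-envelope check in Python:

# Compounding a 1.64% month-over-month increase across a year.
monthly_growth = 0.0164
annual_factor = (1 + monthly_growth) ** 12
print(f"{(annual_factor - 1) * 100:.1f}% year-over-year")  # prints roughly 21.6%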

Despite the broad use of open source, the large majority of software is still delivered and consumed in binary form. There are a few reasons for this, but the most obvious is that the sheer size and complexity of code bases, combined with the limited availability of expertise and time within consuming organizations, make using the source to manage risk impractical.

At the same time, it’s clear this issue is not new. In his 1984 Turing Award lecture, Ken Thompson warned, “No amount of source-level verification or scrutiny will protect you from untrusted code”. That warning has been partially vindicated recently: intelligent code analysis and generation agents, while faster ways to produce code, have been found to exert downward pressure on code quality while also reducing developers’ understanding of the code they produce, a bad combination.

To the extent these problems are resolved, we can expect attackers to use the same tools to more rapidly identify new and more complex attack chains. In essence, it has become an arms race to build and apply these technologies to both offensive and defensive use cases.

It is this reality that led DARPA to create the Artificial Intelligence Cyber Challenge and its various projects on using AI to both identify and fix security defects at scale.

The saying “In the middle of difficulty lies opportunity” aptly describes the current situation, where numerous security focused startups claim to offer solutions to our problems. However, the truth is often quite different.

Some of those racing to take advantage of this opportunity are focusing on software supply chain security, particularly software composition analysis. This is largely driven by regulatory pressure to adopt the Software Bill of Materials concept. Yet most tools that generate these documents only examine interpreted code and declared dependencies. As previously mentioned, the majority of code is delivered and consumed in compiled form, which leaves customers without the data they would need to assess the correctness and completeness of these documents. As a result, although these tools may help with compliance, they can inadvertently cause harm by giving a false sense of security.

There are other vendors still that are essentially scaling up traditional source code reviews using large language models (LLMs). But as we’ve discussed, these tools are currently showing signs of reducing code quality and developers’ understanding of their own code. At the same time, because this kind of analysis has so little context available to it, the tools produce such a high volume of false positives that triaging the outputs can turn into a full-time job. This suggests that negative outcomes could ensue over time if we don’t adjust how we apply this technology or see significant improvements in the underlying technology itself.

These efforts are all concentrated on the creators of software, but if we expand the problem domain to include the consumers of software, we see that outside of cloud environments, where companies like Wiz and Aqua Security provide vulnerability assessments at scale, there are hardly any resources helping software consumers make informed decisions about the risks posed by the software they use. A big part of this is the sheer amount of noise even these products produce, combined with the lack of actionability of that data for the consumer of the software. With that said, these are tractable problems if we choose to invest in new solutions rather than apply the same old approaches we have in the past.

As we look toward the next decade, it is clear that software security is at a pivotal point, and navigating it goes beyond just technology; it requires a change in mindset towards more holistic security strategies that consider both technical and human factors. The next few years will be critical as we see whether the industry can adapt to these challenges.

Challenges in Digital Content Authentication and the Persistent Battle Against Fakes

Efforts have been made for years to detect modified content by enabling content-creation devices, such as cameras, to digitally sign or watermark the content they produce. Significant efforts in this area include the Content Authenticity Initiative and the Coalition for Content Provenance and Authenticity. However, these initiatives face numerous issues, including privacy concerns and fundamental flaws in their operation, as discussed here.

It is important to understand that detecting fakes differs from authenticating originals. This distinction may not be immediately apparent, but it is essential to realize that without 100% adoption of content authentication technology, an unachievable goal, the absence of a signature or a watermark does not mean that a piece of content is fake. To give that some color, consider that many photographers to this day still reach for antique Leica cameras despite modern alternatives, and those cameras will never sign or watermark anything.

Moreover, even the presence of a legitimate signature on content does not guarantee its authenticity. If the stakes are high enough, it is certainly possible to extract signing key material from an authentic device and use it to sign AI-generated content. For instance, a foreign actor attempting to influence an election may find the investment of time and money to extract the key from a legitimate device worthwhile. The history of DVD CSS demonstrates how easy it is for these keys to be extracted from devices and how even just being able to watch a movie on your favorite device can provide enough motivation for an attacker to extract keys. Once extracted, you cannot unring this bell.

This has not stopped researchers from developing alternative authenticity schemes. For example, Google recently published a new scheme called SynthID. That said, this approach faces the same fundamental problem: watermarking content produced by a cooperating generative AI system isn’t the same thing as detecting fakes.

It may also be interesting to note that the problem of authenticating digital content isn’t limited to generative AI. For example, the Costco virtual member card uses a server-generated QR code that rotates periodically to limit the value of sharing that QR code via screenshots.

This does not mean that the approach of signing or watermarking content to make it authenticatable lacks value; rather, it underscores the need to recognize that detecting fakes is not the same as authenticating genuine content — and even then we must temper the faith we put in those claims.

Another use case for digital signatures and watermarking techniques involves their utility in combating the use of generative AI to create realistic-looking fake driver’s licenses and generative AI videos capable of bypassing liveness tests. There have also been instances of generative AI being used in real-time to impersonate executives in video conferences, leading to significant financial losses.

Mobile phones, such as iPhones and Android devices, offer features that help remote servers authenticate the applications they communicate with. While not foolproof, assuming a hardened and unmodified mobile device, these features provide a reasonable level of protection against specific attacks. However, if a device is rooted at the kernel level, or physically altered, these protective measures become ineffective. For instance, attaching an external, virtual camera could allow an attacker to input their AI-generated content without the application detecting the anomaly.

There have also been efforts to extend similar capabilities to browsers, enabling modern web applications to benefit from them. Putting aside the risks of abuse of these capabilities to make a more closed web, the challenge here, at least in these use cases, is that browsers are used on a wider range of devices than just mobile phones, including desktops, which vary greatly in configuration. A single driver update by an attacker could enable AI-generated content sources to be transmitted to the application undetected.

This does not bode well for the future of remote identification on the web, as these problems are largely intractable. In the near term, the best option available is to push users from the web to mobile applications, where the server can authenticate the application doing the capture, but even this should be limited to lower-value use cases because it too is bypassable by a motivated attacker.

In the longer term, it seems this will fuel the fire for governments to become de facto authentication service providers, a role they have repeatedly demonstrated they are not effective in. Beyond that, if these solutions do become common, we can certainly expect their use to be mandated in cases that create long-lasting privacy problems for our children and grandchildren.

UPDATE: A SecurityWeek article came out today that has some interesting figures on this topic.

UPDATE: Another SecurityWeek article on this came out today.

Effortless Certificate Lifecycle Management for S/MIME

In September 2023, the S/MIME Baseline Requirements (BRs) officially became a requirement for Certificate Authorities (CAs) issuing S/MIME certificates (for more details, visit the CA/Browser Forum S/MIME BRs).

The definition of these BRs served two main purposes. Firstly, they established a standard profile for CAs to follow when issuing S/MIME certificates. Secondly, they detailed the necessary steps for validating each certificate, ensuring a consistent level of validation was performed by each CA.

One of the new validation methods introduced permits mail server operators to verify a user’s control over their mailbox. Considering that these services have ownership and control over the email addresses, it seems only logical for them to be able to do this verification on behalf of their users, since they could bypass any individual mailbox control challenge anyway. This approach resembles the HTTP-01 validation used in ACME (RFC 8555), where the server effectively ‘stands in’ for the user, just as a website does for its domain.

Another new validation method involves delegating the verification of email addresses through domain control, using any of the approved TLS domain control methods. Though all of the domain control methods supported for TLS certificates are allowed, it’s easiest to think of the DNS-01 method in ACME here. Again, the idea is straightforward: if someone can modify a domain’s TXT record, they can also change MX records or other DNS settings. So, giving them this authority suggests they should reasonably be able to handle certificate issuance.

Note: If you have concerns about these two realities, it’s a good idea to take a couple of steps. First, ensure that you trust everyone who administers your DNS and that it is securely locked down. Second, use the CAA mechanism described below to restrict which CAs can issue certificates for your domain.

To control the issuance of S/MIME certificates and prevent unauthorized issuance, the Certification Authority Authorization (CAA) record can be used. Originally developed for TLS, it has recently been enhanced to cover S/MIME (read more about CAA and S/MIME).

Here’s how you can create a CAA record for S/MIME: Suppose an organization, let’s call it ‘ExampleCo’, decides to permit only a specific CA, ‘ExampleCA’, to issue S/MIME certificates for its domain ‘example.com’. The CAA record in their DNS would look like this:

example.com. IN CAA 0 issuemail "ExampleCA.com"

This configuration ensures that only ‘ExampleCA.com’ can issue S/MIME certificates for ‘example.com’, significantly bolstering the organization’s digital security.

If you wanted to stop any CA from issuing an S/MIME certificate, you would create a record that looks like this:

example.com. IN CAA 0 issuemail ";"

Another addition in this round of changes is the concept of an account identifier in the latest CAA specification. This feature allows a CA to link the authorization to issue certificates to a specific account within their system. For instance:

example.com. IN CAA 0 issue "ca-example.com; account=12345"

This means that ‘ca-example.com’ can issue certificates for ‘example.com’, but only under the account number 12345.

This opens up interesting possibilities, such as delegating certificate management for S/MIME or CDNs to third parties. Imagine a scenario where a browser plugin is produced and managed by a SaaS provider on behalf of the organization deploying S/MIME. This plugin takes care of the initial enrollment, certificate lifecycle management, and the S/MIME implementation itself, acting as a sort of S/MIME CDN.

This new capability, merging third-party delegation with specific account control, was not feasible until now. It represents a new way for organizations to outsource the acquisition and management of S/MIME certificates, simplifying processes for both end-users and the organizations themselves.

To the best of my knowledge, no one is using this approach yet, and although there is no requirement yet to enforce CAA for S/MIME, one is in the works. The RFC has been standardized for a few months now, but I bet that CAs that were issuing S/MIME certificates before this new CAA RFC was released are not respecting the CAA record yet, even though they should be. If you are a security researcher with spare time, that’s probably a worthwhile area to poke around 😉
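
If you want to poke at this yourself, here is a minimal sketch of how one might check whether a domain publishes S/MIME issuance restrictions via CAA. It assumes the dnspython package is installed; the domain queried is just a placeholder, and real tooling would also need to walk up the DNS tree the way CAA processing requires.

# Minimal sketch: list the issuemail CAA values a domain publishes, if any.
# Assumes dnspython is installed; does not implement full CAA tree-climbing.
import dns.resolver

def smime_caa_policy(domain: str) -> list[str]:
    try:
        answer = dns.resolver.resolve(domain, "CAA")
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        return []
    # Each CAA record carries flags, a tag (issue, issuewild, issuemail, ...) and a value.
    return [r.value.decode() for r in answer if r.tag.decode() == "issuemail"]

print(smime_caa_policy("example.com"))  # an empty list means no issuemail restriction published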

The Rise of Key Transparency and Its Potential Future in Email Security

Key Transparency has slowly become a crucial part of providing truly secure end-to-end encrypted messaging. Don’t believe me? The two largest providers of messaging services, Apple and Facebook (with its WhatsApp service), have openly adopted it, and I am hopeful that Google, one of its early advocates, will follow suit.

At the same time, we are on the precipice of interoperable group messaging, as Messaging Layer Security (MLS) was recently standardized. Its contributors included employees of the services mentioned above, among others, which suggests they may adopt it eventually. What does this have to do with Key Transparency? MLS acknowledges the need for secure, privacy-preserving key discovery by including Key Transparency in its architecture.

It’s also noteworthy that Apple has agreed to support RCS, the messaging standard used on Android. While there is no public hint of this yet, it’s possible that, having positioned themselves as privacy champions over the last decade and frequently touting their end-to-end encryption, they will push for MLS to be adopted within RCS. That could net the world its first interoperable cross-network messaging with end-to-end encryption, and it would need a key discovery mechanism.

In that spirit, recently the Internet Engineering Task Force (IETF) has established a Working Group on Key Transparency, and based on the participation in that group, it seems likely we will see some standardization around how to do Key Transparency in the future.

What’s Next for Key Transparency Adoption Then?

I suspect the focus now shifts to S/MIME, a standard for public key encryption and signing of emails. Why? Well, over the last several years, the CA/Browser Forum adopted Baseline Requirements (BRs) for S/MIME to help facilitate uniform and interoperable S/MIME, and those became effective on September 1, 2023 – this means CAs that issue these certificates will need to conform to those new standards.

At the same time, both Google and Microsoft have made strides in their implementations of S/MIME for their webmail offerings.

In short, despite my reservations about S/MIME due to its inability to address certain security challenges (such as metadata confidentiality, etc), it looks like it’s witnessing a resurgence, particularly fueled by government contracts. But does it deliver on the promise today? In some narrow use cases like mail signing or closed ecosystem deployments of encrypted mail where all participants are part of the same deployment, it is probably fair to say yes.

With that said, mail is largely about interoperable communications, and for that to work with encrypted S/MIME, we will need to establish a standard way for organizations and end-users to discover the right keys to use with a recipient outside of their organization. This is where Key Transparency would fit in.

Key Transparency and S/MIME

Today, it is technically possible for two users to exchange certificates via S/MIME, enabling them to communicate through encrypted emails. However, the process is quite awkward and non-intuitive. How does it work? Either you provide the certificate out of band to those in the mail exchange and they add it to their contacts, or some user agents automatically use the keys associated with S/MIME signatures from your contacts to populate the recipient’s keys.

This approach is not ideal for several reasons beyond usability. For instance, I regularly read email across three devices, and the private keys used for signing may not be the same on each device. Since consistent signing across devices isn’t required, if I send you an email from my phone and you reply with a message encrypted to the certificate my phone used, I won’t be able to open it on my desktop, which holds a different key.

Similarly, if I roll over my key to a new one because it was compromised or lost, we would need to go through this certificate distribution workflow again. While Key Transparency doesn’t solve all the S/MIME-related problems, it does provide a way to discover keys without the cumbersome process, and at the same time, it allows recipients to know all of my active and published certificates, not just the last one they saw.

One of the common naive solutions to this problem is a public directory of keys like the one used for PGP. However, such an approach often turns the directory into a resource for spammers. Beyond that, you have the problem of discovering which directory to use with which certificate. The Key Transparency implementations above are all inspired by the CONIKS work, which answers this through the use of a Verifiable Random Function (VRF). The use of the VRF in CONIKS keeps users’ email addresses (or other identifiers) private. When a user queries the directory for a key, the VRF is used to generate a unique, deterministic output for each input (i.e., the user’s email). This output is known only to the directory and the querying user, preserving privacy.
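
To make the idea concrete, here is a toy sketch of the deterministic, privacy-preserving index derivation. It uses a keyed HMAC purely as a stand-in for a real VRF; an actual VRF, as used in CONIKS, also produces a proof that anyone can verify against the directory's public key, which HMAC does not provide.

# Toy illustration only: HMAC stands in for a VRF. A real VRF additionally
# yields a publicly verifiable proof tied to the directory's key pair.
import hmac, hashlib

DIRECTORY_SECRET = b"directory-private-key"  # hypothetical; a real VRF uses an asymmetric key

def private_index(identifier: str) -> bytes:
    # Map an email address to a deterministic index that does not reveal the address.
    return hmac.new(DIRECTORY_SECRET, identifier.encode(), hashlib.sha256).digest()

# The directory's Merkle tree is keyed by this index rather than by the raw
# email address, so observers of the tree learn nothing about who is in it.
print(private_index("alice@example.com").hex())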

The generic identifier-based approach in Key Transparency means it can neatly address the issue of S/MIME certificate discovery. The question then becomes, how does the sender discover the Key Transparency server?

Key Transparency Service Discovery

The answer to that question probably involves DNS resource records (RRs). We use DNS every day to connect domain names with IP addresses. It also helps us find services linked to a domain. For instance, this is how your email server is located. DNS has a general tool, known as an SRV record, which is designed to find other services. This tool would work well for discovering the services we’re discussing.

_sm._keytransparency._https.example.com. 3600 IN SRV 10 5 443 sm-kt.example.com.

In this example, the _sm identifier placed before _keytransparency and _https shows that this SRV record is specifically for a Key Transparency service for secure messaging. This allows someone to ask DNS for an S/MIME-specific Key Transparency service. It also means we can add other types of identifiers later, allowing for the discovery of various KT services for the same domain.
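
As a small sketch of what discovery could look like in practice, the lookup below resolves the hypothetical record above using the dnspython package; the naming convention and record are the illustrative ones from this post, not an adopted standard.

# Minimal sketch: discover a domain's Key Transparency service for secure
# messaging via the hypothetical SRV naming convention shown above.
import dns.resolver

def discover_kt_service(domain: str):
    name = f"_sm._keytransparency._https.{domain}"
    try:
        answer = dns.resolver.resolve(name, "SRV")
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        return None
    record = min(answer, key=lambda r: r.priority)  # lowest priority wins; weight ignored for brevity
    return str(record.target).rstrip("."), record.port

print(discover_kt_service("example.com"))  # e.g. ("sm-kt.example.com", 443)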

Conclusion

While S/MIME faces many challenges, such as key roaming, message re-encrypting on key rollover, and cipher suite discoverability, before it becomes easy to use and widely adopted—not to mention whether major mail services will invest enough in this technology to make it work—there’s potential for a directory based on Key Transparency if they do.

Hopefully, the adoption of Key Transparency will happen if this investment in S/MIME continues, as it’s the only large-scale discovery service for user keys we’ve seen in practice. Unlike other alternatives, it’s both privacy-respecting and transparently verifiable, which are important qualities in today’s world. Only time will tell, though.

Document Authenticity in the Age of Generative AI

In our rapidly evolving lives, the credibility of documents, images, and videos online has emerged as a concern. The pandemic and recent elections have helped highlight this issue. In the case of elections, one area that stands out to me is concerns over voter roll integrity, a pillar of our democratic process in the US.  

As we grapple with these issues, it is important to explore what a solution might look like that balances the associated privacy concerns. Is it possible to provide assurance of integrity and transparency while also providing accommodations for privacy and accountability?

Misinformation in the Digital Age

Despite its challenges, the pandemic did have a silver lining — it brought attention to the internet’s role as a breeding ground for misinformation campaigns. These campaigns featured manipulated images and documents, creating confusion and distrust globally. They also underscored a glaring gap in our current systems — no broad deployment of reliable mechanisms to verify the authenticity and origin of digital content.

The recent advancements in generative AI over the last two years have further complicated this issue. Now with a few words, anyone on the web can create images that at first blush look real. This technology will only continue to get better which means we will need to begin to more formally look at how we build solutions to tackle this new reality.

Existing Solutions and Their Shortcomings

Several technologies have recently been discussed as the way forward to address at least portions of these problems. One such example is the Content Authenticity Initiative which proposes that devices like cameras cryptographically sign all pictures and videos with a device credential, a feature aimed at enabling the detection of any alterations made post-capture. 

This method raises significant privacy concerns. Essentially, it could create a surveillance infrastructure where each content piece could be unexpectedly traced back to an individual or a group of devices, potentially becoming a surveillance tool.

Google DeepMind also recently brought forth the idea of opt-in watermarking for images created through AI technologies. While this initiative seems promising at a glance, it fails to address the nuances of the threat model. For instance, a nation-state with intentions to manipulate an election using generative AI assets wouldn’t likely volunteer to watermark these materials as AI-generated. This significant loophole sets a precarious stage where misinformation can still flourish.

These approaches, though developed with noble intentions, reveal critical gaps in addressing the complex landscape of content authenticity. They either infringe upon individual privacy rights or are vulnerable to exploitation when faced with a real threat model. 

Middle Ground: Publisher Signatures and Blinding as a Potential Solution

A more nuanced approach could utilize optional cryptographic signatures linked to a publisher rather than a device. When content is signed, the publisher, not their device, opts into staking their reputation on the authenticity of the artifact. Coupled with a feature to cryptographically blind the publisher’s identity, this strategy could offer a safe avenue for them to reveal their identity at a later time, if necessary. Such a situation might arise in cases of whistleblower claims, where shielding the publisher’s identity becomes crucial for their safety. This blinding could strike a balance, granting publishers temporary anonymity while preserving their ability to later opt in to publicly stand behind the artifact or to take accountability for any misinformation.
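
As a rough illustration of the blinding idea, a simple cryptographic commitment lets a publisher bind their identity to an artifact now and open it later. The sketch below uses a salted hash commitment and is a deliberate simplification; a real scheme would also sign the artifact and likely anchor the commitment in a transparency log.

# Simplified sketch: a salted hash commitment to a publisher identity that
# can be opened later. Names and values are illustrative.
import hashlib, os

def commit(identity: str) -> tuple[bytes, bytes]:
    # Publish the commitment alongside the artifact; keep the salt private.
    salt = os.urandom(32)
    return hashlib.sha256(salt + identity.encode()).digest(), salt

def reveal(commitment: bytes, identity: str, salt: bytes) -> bool:
    # Later, the publisher discloses (identity, salt) to prove the commitment was theirs.
    return hashlib.sha256(salt + identity.encode()).digest() == commitment

c, salt = commit("Example Newsroom")
assert reveal(c, "Example Newsroom", salt)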

Contrast this with devices subtly leaking metadata, which puts subjects in the position of having to prove a negative: for example, needing to explain that a picture does not tell the whole story, or worse, needing to prove that the device that captured it was compromised. This is similar to what happens today with red-light cameras and automated radar guns, where poorly calibrated devices result in innocent people being charged.

The proposed model shifts the identification to publishers in the hope of fostering a responsible publishing culture where publishers have the discretion to unveil their identity, promoting accountability without completely compromising privacy.

It is also worth noting that a transition from ink signatures to cryptographic signatures for documents appears more pertinent than ever. Generative AI and graphic design technologies have enhanced the ability to replicate handwriting styles, making traditional signatures highly susceptible to forgery. The use of cryptographic techniques offers a more secure alternative, integrating seamlessly into modern workflows and standing resilient against unauthorized alterations.

Publisher Signatures Are Not Enough

In information security, it’s now accepted that insider threats are a significant risk. This realization has steered us away from merely verifying the identity of a publisher, especially in cryptographic signing systems such as code signing. 

There are a few reasons, but one of the largest is that key management proves to be challenging, often because publishers frequently represent groups rather than individuals, which leads to key management practices that are more permissive than ideal from a security standpoint.

Additionally, if a solution is to incorporate the possibility of anonymity through cryptographic blinding, we cannot simply bet on the presence of, and blind trust in, that identity.

This is the same reason that led modern code-signing solutions to adopt ledgers that record an artifact’s provenance and history. For instance, in a Binary Transparency system, a ledger might house not only a list of software packages and their contents but also offer qualitative attestations about the software, for example indicating whether it has been screened for malware or verified to be reproducible from its source. This mechanism allows devices to understand not just the origin of the code but also to grasp the intended release of the software and potentially qualitative aspects to consider before reliance on it.
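
To make this concrete, here is a rough sketch of the shape a single entry in such a ledger might take, pairing an artifact's identity with qualitative attestations about it; the field names are illustrative and not drawn from any specific Binary Transparency implementation.

# Illustrative only: the rough shape of a transparency-log entry for an artifact.
from dataclasses import dataclass, field

@dataclass
class Attestation:
    claim: str         # e.g. "malware-scanned" or "reproducible-build"
    attestor: str      # who is vouching for the claim
    evidence_url: str  # pointer to supporting evidence

@dataclass
class LedgerEntry:
    artifact_sha256: str       # content hash of the document or binary
    publisher_commitment: str  # possibly a blinded publisher identity
    appended_at: str           # when the entry was added to the log
    attestations: list[Attestation] = field(default_factory=list)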

Applying this pattern to our document provenance and integrity problem, this system can offer value even when the identity remains undisclosed. It can provide a basic idea of the time of existence and allow third parties to vouch for the authenticity, possibly linking to other corroborative artifacts.

Building a continuously verifiable record coupled with supportive evidence for artifacts like documents seems to be a step in the right direction. This approach has demonstrated its value in other problem spaces.

With that said, it’s essential to acknowledge that, as with any opt-in system, not all documents, images, and videos will carry this additional provenance, and like all technology, this too would not be perfect. As a result, rather than being dismissed outright, all content will need to be evaluated on its merits and the evidence collected about it. At a minimum, we must recognize that it can take years for any new system to gain substantial traction.

Learning from the rest of the world

This issue is not confined to the US, so we should not restrict ourselves to looking at approaches used by US Big Tech. For instance, the strategies suggested here significantly draw upon the principles of electronic signatures, particularly e-Seals, which are prevalent in the EU and other regions. This European model offers vital insights and presents a robust strategy for resolving disputes, albeit reliant on specific technologies.

Interestingly, US online notarization rules have also borrowed elements from the EU, mandating the use of cryptographic signatures, akin to the EU’s emphasis on Advanced Signatures.

By combining this approach with the lessons learned from Certificate and Binary Transparency, where Merkle trees of published materials, continuous monitoring, and third-party evaluation help ensure a more complete picture — we might find a path forward. 
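
For readers unfamiliar with the mechanism borrowed from Certificate and Binary Transparency, here is a minimal sketch of how a Merkle tree root commits to a set of published artifacts so that third parties can monitor the log; it omits the inclusion and consistency proofs a real log would also provide.

# Minimal Merkle root over a list of artifact hashes. Real transparency logs
# also produce inclusion and consistency proofs; this only shows how a single
# root value commits to every published artifact.
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        return sha256(b"")
    level = [sha256(b"\x00" + leaf) for leaf in leaves]  # domain-separate leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node at odd-sized levels
        level = [sha256(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

documents = [b"voter-roll-2024-01.csv", b"voter-roll-2024-02.csv"]
print(merkle_root([sha256(d) for d in documents]).hex())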

The addition of blinding the publisher’s identity in a way where they could selectively disclose their identity in the future also seems to provide a plausible way to balance the privacy concerns that could enable this path to become the default in the future.

Motivating Participation through Innovation, Policy and Leadership

Adoption of this approach cannot solely rely on goodwill or regulation. It would require a combination of standardization, regulatory changes, creating incentives for publishers, engagement with civil society and other stakeholders, and some tangible leadership by example by a large player or two. Some ideas in this direction include:

  • Initiating revisions to the existing digital signature legislation seems to be a sensible first step. Given our substantial background with the current laws, it’s clear that without regulatory changes, technological investments are unlikely to shift.
  • The government can lead by example by enhancing initiatives like the current Digital Autopen project. This project allows groups of individuals to access shared signing credentials without tying them to individual users, addressing a notable challenge prevalent in code signing and other organizational signing efforts.
  • I also believe that investing in a singular, immensely impactful use case as a case study could vividly illustrate how these approaches can make a significant difference. The voter registration rolls mentioned earlier seem like an excellent use case for this.
  • Further research and standardization could explore integrating cryptographic blinding of signer identities within the current document signing infrastructure, allowing for future disclosure. Investigating the potential structure and security of ledgers, and considering the extension of signing protocols to other artifact formats, seems to be a natural progression in supporting a system of this kind.
  • Simultaneously, collaboration with civil society, tech companies, and other stakeholders, including publishers, appears vital. This will guarantee that their concerns are integrated into the developed solutions and that appropriate policies are instituted to effectively incorporate this metadata into their operations.

While these efforts would not necessarily lead to adoption, it does seem that adoption would at a minimum be predicated on efforts like these.

A Pathway to Trust and Accountability

Balancing privacy and accountability in the digital age is a nuanced but achievable goal, especially if we build on top of existing successes. By adopting a well-rounded approach that integrates cryptographic signatures with mechanisms for temporary anonymity, we can carve a pathway toward a society where digital content maintains its integrity and trustworthiness. 

Moreover, by fostering an environment where content, even without clear provenance, is evaluated critically rather than dismissed, we encourage a richer discourse and a healthier digital ecosystem.

Through the union of technology and policy, we can create a more secure, transparent, and accountable future for content authenticity.

How to keep bad actors out in open ecosystems

There is a class of problems in information security that are intractable. This is often because they have conflicting non-negotiable requirements. 

A good example of this is what I like to call the “re-entry” problem. For instance, let’s say you operated a source control service, let’s call it “SourceHub”. To increase adoption you need “SourceHub” to be quick and easy for anyone to join and get up and running. 

But some users are malicious and will do nefarious things with “SourceHub” which means that you may need to kick them off and keep them off of your service in a durable way.

These two requirements are in direct conflict, the first essentially requires anonymous self-service registration and the second requires strong, unique identification.

The requirement of strong, unique identification might seem straightforward on the surface, but that is far from the case. For example, in my small social circle there are four Natallias who are often at the same gatherings, and everyone must qualify which one they are talking about. I also used to own ryanhurst.com but gave it up because of the volume of misdirected mail I would get from fans of Ryan Hurst the actor; spam filters would fail to categorize that mail as spam due to the nature of its origin.

Some might say this problem gets easier when dealing with businesses but unfortunately, that is not the case. Take for example the concept of legal identity in HTTPS certificates — we know that business names are not globally unique, they are not even unique within a country which often makes the use of these business names as an identifier useless. 

We also know that the financial burden to establish a “legal business” is very low. For example, in Kentucky it costs $40 to open a business. The other argument I often hear is that despite the low cost of establishing the business, the time involved is just too much for an attacker to consider. The problem with this argument is that the registration form takes minutes to fill out, and if you toss in an extra $40 the turnaround time goes from under 3 weeks to under 2 days, not exactly a big delay when looking at financially incentivized attackers.

To put this barrier in the context of a real-world problem, let’s look at Authenticode signing certificates. A basic organizationally validated Authenticode code signing certificate costs around $59. With this certificate and that business registration, you can get whatever business name and application name you want to show in the install prompt in Windows.

This sets the bar to re-enter this ecosystem once evicted at around $140 and a few days of waiting.

But what about Authenticode EV code signing? By using an EV code signing certificate you get to start with some Microsoft Smart Screen reputation from the get-go – this certainly helps grease the skids for getting your application installed so users don’t need to see a warning.


But does this reduce the re-entry problem further? Well, the cost for an EV code signing certificate is around $219, which takes the financial burden for the attacker to about $300. That is true at least for the first time; about $50 of that first-time price goes to a smart card like the SafeNet 5110cc or a YubiKey 5. Since the same smart card can be used for multiple certificates, the cost drops to about $250 per re-entry. It is fair to say the complication of using a smart card for key management also slows down getting the first certificate, stretching the timeline from incorporation to having an EV code signing certificate in hand to about 1.5 weeks.
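
The rough arithmetic behind these re-entry figures, using the prices quoted above (which of course vary by jurisdiction and vendor), is easy to lay out:

# Back-of-the-envelope re-entry costs using the figures quoted above.
business_registration = 40   # Kentucky filing fee
expedite_fee = 40            # optional fee that cuts turnaround to under 2 days
ov_cert = 59                 # basic OV Authenticode code signing certificate
ev_cert = 219                # EV code signing certificate (first purchase, hardware included)
smart_card = 50              # approximate hardware portion, reusable across certificates

ov_reentry = business_registration + expedite_fee + ov_cert   # ~$139, "around $140"
ev_first = business_registration + expedite_fee + ev_cert     # ~$299, "about $300"
ev_reentry = ev_first - smart_card                            # ~$249, "about $250"
print(ov_reentry, ev_first, ev_reentry)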

These things do represent a re-entry hurdle, but when you consider that effective Zero Day vulnerabilities can net millions of dollars I would argue not a meaningful one. Especially when you consider the attacker is not going to use their own money anyway.

You can also argue that it offers some rate-limiting value to the acquisition pipeline but since there are so many CAs capable of issuing these certificates you could register many companies, somewhat like Special Purpose Acquisition Companies (SPAC) in the stock market, so that when the right opportunity exists to use these certificates it’s ready and waiting.

This hurdle also comes at the expense of the adoption of code signing, which of course raises the question of whether all that hassle was worth it.

Usually the argument made here is that since the company registration took place, we can at least find the attacker at a later date, right? Actually, no: very few (if any) of these company registrations verify the address of the applicant.

As I stated in the beginning, this problem isn’t specific to code signing, but in systems like code signing, where the issuance of the credential has been separated from its subsequent use, it becomes much harder to manage this risk.

To keep this code signing example going let’s look at the Apple Store. They do code-signing and use certificates that are quite similar to EV code-signing certificates. What is different is that Apple handles the entire flow of entity verification, binary analysis, manual review of the submission, key management, and signing. They do all of these things while taking into consideration the relationship of each entity and by considering the entire history available to them.  This approach gives them a lot more information than you would have if you did each of these things in isolation from each other. 

While there are no silver bullets when it comes to problems like this getting the abstractions at the right level does give you a lot more to work with when trying to defend from these attacks.

Evolution of privacy governance and how cryptographic controls can help

Unfortunately, most organizations do not have a formal strategy for privacy, and those that do usually design and manage them similarly to their existing compliance programs.

This is not ideal for several reasons which I will try to explore in this post but more importantly, I want to explore how the adoption of End-To-End encryption and Client-Side encryption is changing this compliance-focused model of privacy to one backed by proper technical controls based on modern cryptographic patterns. 

If we look at compliance programs, they usually evolve from something ad hoc and largely aspirational (usually greatly overselling what they deliver) and over time move to a more structured, process-oriented approach backed by some minimum level of technical controls to ensure each process is followed.

What is usually missing in this compliance journey is an assessment of sufficiency beyond “I completed the audit gauntlet without a finding”. This is because compliance is largely focused on one objective: proving conformance to third-party expectations rather than achieving a set of contextually relevant security or privacy objectives. This is the root of why you hear security professionals say “Compliance is not Security”.

If you watch tech news as I do, you have surely seen articles calling the cybersecurity market “the new lemon market”, and while I do agree this is largely true, I also believe there is a larger underlying issue: an overreliance on compliance programs to achieve security and privacy objectives.

To be clear, the point here is not that compliance programs for privacy are unnecessary; instead, what I am saying is that they need to be structured differently from general-purpose compliance programs, as the objectives are different. These differences lend themselves to misuse-resistant design and the use of cryptographic controls as an excellent way to achieve those objectives.

Misuse resistance is a concept from cryptography where we design algorithms to make implementation failures harder. This is in recognition of the fact that almost all cryptographic attacks are caused by implementation flaws and not fundamental breaks in the cryptographic algorithms themselves. 

Similarly, in the case of privacy, most companies will not say “we intend to share your data with anyone who asks”. Instead, they talk of their intent to keep your information confidential — the problem is that in the long run everyone experiences some sort of failure and those failures can make it impossible to live up to that intent.

So back to this compliance-focused approach to privacy — it is problematic in several cases, including:

  • Where insiders [1,2,3] are abusing their position within an organization,
  • When configuration mistakes result in leakage of sensitive data [4,5,6], 
  • When service providers fail to live up to customers’ expectations on data handling [7,8], 
  • When technical controls around data segmentation fail [9],
  • and of course when service providers fail to live up to their marketing promises for government access requests for your sensitive information [10,11].

If we move to a model where we approach these problems using engineering principles rather than process and manual controls, we can end up in a world where the data, and access to it, are inherently misuse resistant and hopefully verifiable.

This is exactly what we see in the design patterns that have been adopted by Signal, namely they gather only the minimal level of data to deliver the service, and the data they do gather they encrypt in such a way to limit what they can do with it.

They are not the only ones, though; you can see similar approaches from Android in how it handles encrypted backups, as well as from Apple in how it handles device PIN recovery.

Another great example is how payment providers like Stripe leverage client-side encryption to reduce the exposure of payment details to intermediaries, or how Square uses client-side encryption in the Cash App to limit adversaries’ access to the data.
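
As a rough sketch of the general pattern, and not how any of these providers actually implement it, client-side encryption simply means sealing sensitive fields before they leave the client, so intermediaries and the service itself only ever handle ciphertext:

# Minimal sketch of client-side encryption with an AEAD cipher, using the
# widely available "cryptography" package. Key management, the hard part
# discussed below, is deliberately omitted; the key is just generated locally.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # in practice held in a KMS or device keystore
aead = AESGCM(key)

def seal(plaintext: bytes, context: bytes) -> bytes:
    nonce = os.urandom(12)  # must be unique per message
    return nonce + aead.encrypt(nonce, plaintext, context)

def open_sealed(blob: bytes, context: bytes) -> bytes:
    nonce, ciphertext = blob[:12], blob[12:]
    return aead.decrypt(nonce, ciphertext, context)

record = seal(b"4111 1111 1111 1111", b"payment-card")  # only ciphertext leaves the client
assert open_sealed(record, b"payment-card") == b"4111 1111 1111 1111"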

While today these patterns are only being applied by the largest and most technically advanced organizations, the reality is, as Alexander Pope once said, “to err is human”. If we are to truly solve for privacy and security, we have to move to models that rely less on humans doing the right thing, and for privacy this means extensive use of cryptographic patterns like those outlined above.

It could be argued that one of the reasons we do not see these patterns applied more broadly is the security nihilist’s argument that client-side encryption and cryptographic transparency are exercises in rearranging deck chairs on a sinking ship.

While there is some truth to that argument, you can both limit the amount of trust you have to place in these providers (limiting the amount of trust delegated is the essence of security, after all!) and make elements of what the provider does verifiable, which again furthers this misuse-resistance objective.

If that is the case then why is it that only the largest providers do this today? I would argue it’s just too darn hard right now. For example, when you start applying client-side encryption and cryptographic controls to systems there is still a substantial lift, especially when compared to the blind trust paradigm most systems operate within.

There are a few reasons, but one of the largest in the case of client-side encryption is that you end up having to build the associated key management frameworks yourself, and that takes time and expertise that many projects simply do not have access to.

With that said, just as modern systems have moved from self-hosted monoliths to globally scaled microservices thanks to solutions like Kubernetes, which in turn give them node-to-node compartmentalization and other by-default security controls, I believe we will see a similar move for sensitive data handling, where these cryptographic controls and usage policies become a key tool in the developer’s toolbox.

More work is needed to make it possible for smaller organizations to adopt these patterns but, unquestionably, this is where we end up long term. When we do get there, to borrow an overused marketing term, we end up with what might be called Zero Trust Data Protection in today’s market.

The next decade of Public Key Infrastructure…

Background

Before we talk about the future, we need to make sure we have a decent understanding of the past. X.509-based Public Key Infrastructure was originally created in the late 80s with a focus on enterprise and government use cases. These use cases were largely for private systems; it was not until a decade later that this technology was applied to the internet at large.

Since the standards for enrollment and lifecycle management at the time were building blocks rather than solutions and were designed for government and enterprise use cases rather than the internet, the Web PKI, as it became known, relied largely on manual certificate lifecycle management and a mix of proprietary automation solutions.

While the use of PKI in the enterprise continued, primarily thanks to Microsoft AD/CS and its automatic certificate lifecycle management (I worked on this project), the Web PKI grew in a far more visible way. This was primarily a result of the fact that these certificates had to be acquired manually which led to the creation of an industry focused on sales and marketing of individual certificates.

The actors in this system had no incentive to push automation as it would accelerate the commoditization of their products. The reality was that these organizations had also lost much of their technical chops as they became sales and marketing organizations and could no longer deliver the technology needed to build this automation anyway.

This changed in 2016 when the Internet Security Research Group, an organization I am involved in, launched Let’s Encrypt. This was an organization of technologists looking to accelerate the adoption of TLS on the web, and as such it started with a focus on automation, as it was clear that without automation the growth of HTTPS adoption would continue to be anemic. What many don’t know is that when Let’s Encrypt launched, HTTPS adoption was at about 40% and year-over-year growth was hovering around 2-3%, about the rate of growth of the internet itself, and it was not accelerating.

Beyond that, TLS-related outages were becoming more frequent in the press, even for large organizations. Post-mortems would continuously identify the same root cause: a manual process did not get executed or was executed incorrectly.
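
To give a flavor of the kind of check those manual processes routinely missed, here is a trivial expiry monitor using only the Python standard library; automated issuance of short-lived certificates largely makes this class of failure disappear.

# Minimal expiry check for a live TLS endpoint, standard library only.
import socket, ssl, datetime

def days_until_expiry(host: str, port: int = 443) -> float:
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            not_after = tls.getpeercert()["notAfter"]  # e.g. 'Jun  1 12:00:00 2025 GMT'
    expires = datetime.datetime.utcfromtimestamp(ssl.cert_time_to_seconds(not_after))
    return (expires - datetime.datetime.utcnow()).total_seconds() / 86400

print(f"example.com expires in {days_until_expiry('example.com'):.0f} days")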

The launch of Let’s Encrypt gave the internet its first CA with a standards-based certificate enrollment protocol (ACME). This, combined with the short-lived nature of the certificates it issued, meant those who adopted it had to use automation for their services to reliably offer TLS. That enabled products to make TLS work reliably and by default; a great example of this is the Caddy web server. This quickly took the year-over-year growth in TLS adoption to around 10%, and we are now hovering around 90%+ HTTPS on the internet.

While this was going on, the concept of microservices merged with containers, which led to container orchestration, which later adopted the concept of mesh networking. This mesh networking is often based on mutual TLS (mTLS), with SPIFFE, the identity framework commonly used with Kubernetes, being its most visible manifestation.

At the same time, we saw networks becoming more composable, pushing authentication and authorization decisions out to the edge of the network. While this pattern has had several names over the years, we now call it Zero Trust, and a visible example of it today is BeyondCorp from Google. These solutions are again commonly implemented on top of mutual TLS (mTLS).

We now also see the concept of Secure Access Service Edge (SASE), or Zero-Trust Edge, gaining speed, which extends this same pattern to lower-level network definition. Again, it is commonly implemented on top of mTLS.

The reality is that the Web PKI CAs were so focused on sales and marketing that they missed almost all of these trends. You can see them now paying lip service to this by talking about DevOps in their sales and marketing, but the reality is that the solutions they offer in this area are both too late and too little. This is why cloud technology providers like HashiCorp and cloud providers like Amazon and Google (I am involved in this also) had to step in and provide their own offerings.

We now see that Web PKI CAs are starting to more seriously embrace automation for the public PKI use cases, for example, most of the major CAs now offer ACME support to some degree and generally have begun to more seriously invest in the certificate lifecycle management for other use cases.

That being said many of these CAs are making the same mistakes they have made in the past. Instead of working together and ensuring standards and software exist to make lifecycle management work seamlessly across vendors, most are investing in proprietary solutions that only solve portions of the problems at hand.

What’s next?

The usage of certificates and TLS has expanded massively in the last decade, and there is no clear alternative to replace them, so I do not expect the adoption of TLS to wane anytime soon.

What I do think is going to happen is a unification of certificate lifecycle management for private PKI use cases and public PKI use cases. Mesh networking, Zero Trust, and Zero-Trust Edge are going to drive this unification.

This will manifest in the use of ACME for these private PKI use cases; in fact, this has already started, just take a look at cert-manager and Smallstep Certificates as small examples of this trend.

This, combined with the ease of deploying and managing private CAs via the new generation of cloud CA offerings, will result in more private PKIs being deployed, and availability problems stemming from issues like certificate expiration and scaling will fade away.

We will also see extensions to the ACME protocol that make it easier to leverage existing trust relationships which will simplify the issuance process for private use cases as well as ways to leverage hardware-backed device identity and key protection to make the use of these certificate-based credentials even more secure.

As is always the case the unification of common protocols will enable interoperability across solutions, improve reliability and as a result accelerate the adoption of these patterns across many products and problems.

It will also mean that over time the legacy certificate enrollment protocols such as SCEP, WSTEP/XCEP, CMC, EST, and others will become less common.

Once this transition happens this will lead us to a world where we can apply policy based on subjects, resources, claims, and context across L3 to L7 which will transform the way we think about access control and security segmentation. It will give both more control and visibility into who has access to what.

What does this mean for the Web PKI?

First I should say that Web PKI is not going anywhere – with that said it is evolving.

Beyond the increase in automation and shorter certificate validities over the next decade we will see several changes, one of the more visible will be the move to using dedicated PKI hierarchies for different use cases. For example, we will ultimately see server authentication, client authentication, and document signing move to their own hierarchies. This move will better reflect the intent of the Web PKI and prevent these use cases from holding the Web PKI’s evolution back.

This change will also minimize the browsers’ influence on those other scenarios. It will do this at the expense of greater ecosystem complexity around root distribution, but the net positive will be felt regardless. I do think this shift will give the European CAs an advantage in that they can rely on the EUTL for distribution, and since many non-web user agents simply do not want to manage a root program of their own, the EUTL has the potential to see broader adoption. I will add that it is my hope these user agents instead adopt solution-specific root programs rather than relying on a generic one not built for the purpose.

The Web PKI CAs that have not re-built their engineering chops are going to fall further behind the innovation curve. Their shift from engineering companies to sales and marketing companies resulted in them missing the move to the cloud and those companies that are going through digital transformation via the adoption of SaaS, PaaS, and modern cloud infrastructures are unlikely to start that journey by engaging with a traditional Web PKI CA.

To address this reality the Web PKI CAs will need to re-invent themselves into product companies focusing on solving business problems rather than selling certificates that can be used to solve business problems. This will mean, for example, directly offering identity verification services (not selling certificates that contain assertions of identity), providing complete solutions for document signing rather than certificates one can use to sign a document or turnkey solutions for certificate and key lifecycle management for enterprise wireless and other related use cases.

This will all lead to workloads that were on the Web PKI by happenstance being moved to dedicated, workload- or ecosystem-specific private PKIs. The upside is that the certificates used by these infrastructures will have the opportunity to aggressively profile X.509 rather than being forced to carry the two decades of cruft surrounding it as they are today.

The Web PKI CAs will have an opportunity to outsource the root certificate and key management for these use cases and possibly subcontract out CA management for the issuing CAs but many of these “issuing CA” use cases are likely to go to the cloud providers since that is where the workloads will be anyway.

Due to the ongoing balkanization of the internet that is happening through increased regional regulation, we will see smaller CAs get acquired, mainly for their market presence to let the larger providers play more effectively in those markets.

At the same time, new PKI ecosystems like those used for STIR/SHAKEN and various PKIs to support IoT deployments will pop up and as the patterns used by them are found to be inexpensive, effective, and easily deployable they will become more common.

We will also see the lifecycle management for both public and private PKI unify on top of the ACME enrollment protocol, and through that, a new generation of device management platforms will be built around certificate-based device identity anchored in keys bound to hardware, where the corresponding certificates contain metadata about the device they are bound to.

This will lay the groundwork for improved network authentication within the enterprise using protocols like EAP-TLS and EAP-TTLS, and it will enable Zero-Trust and Zero-Trust Edge deployments to be rolled out more easily, which will in turn blur the lines further between what is on-premises and what is in the cloud.

This normalization of the device identity concepts we use across solutions and the use of common protocols for credential lifecycle will result in better key hygiene for all use cases, and simplify deployment for those use cases.

Accountability and Transparency in Modern Systems

Over the last several decades we have seen the rate of technological innovation greatly accelerate. A key enabler of that acceleration has been the move to cloud computing which has made it possible for hardware, software, and services to be shared. This significantly reduced both the capital and time necessary to adopt and operate the infrastructure and services built on these platforms.

This migration started by enabling existing software to run using dedicated computers and networks owned and operated by someone else. As these computers got faster and the tools to share the physical hardware and networks were built, the cost of technological innovation reduced significantly. This is what democratized modern startup entrepreneurship as it made it cost-effective for individuals and small businesses to gain access to the resources once only available to the largest companies.

This flipped technological innovation on its head. It used to be that government and big businesses were the exclusive sources of technological innovation because they were the ones who could afford to buy technology. The lowering of the cost of innovation is what gave us the consumer startups we have today. This drew the attention of large companies to this emerging market and led to the creation of the modern smartphone which was fundamental to creating the market opportunity we see in consumer startups. This was a scale opportunity that was fundamentally different than the prior government and enterprise models of innovation.

As enterprises saw the rate of innovation and agility this new model provided, it became clear that they too needed to embrace this model in their businesses. It is this reality that led to the creation of Salesforce, the first Software as a Service offering, and AWS, the first modern Cloud Service Provider to reach market. It was these offerings that gave us Software as a Service (SaaS), Platform as a Service (PaaS), and what we think of as modern cloud infrastructure.

At first, these enterprises only moved greenfield or very isolated projects to the cloud, but as the benefits of the new model became irrefutable and the capabilities of these offerings were enriched in ways that were impractical to replicate in their own environments, they started moving more business-critical workloads. This trend continues today; a recent survey found that 55% of IT organizations are now looking at ways to reduce their on-premises spending. This will lead to many legacy systems being replaced with more modern, scalable, agile, and secure solutions.

That same survey found that digital transformation and security are the two biggest reasons for this shift. This is no surprise when we look at how capital efficient modern businesses are relative to those based on legacy IT and manual processes, or how vulnerable legacy IT systems are to modern attacks. 

This does beg the question: what comes next? I believe two trends are emerging. The first is the democratization of compliance for modern systems, and the second is a shift in expectations of what it means to be “secure”.

If we look at the first trend, the democratization of compliance, we see the internet becoming balkanized through regulation and governments seeking to get more control over what people do on the internet. Increased regulation makes it significantly harder for new entrants to compete, which in turn helps entrench the incumbents who can often eat the engineering and compliance costs associated with the regulations. When you think about this in the context of the global economy in which the internet exists, an economy made up of 195 independent sovereign countries, the compliance burden becomes untenable.

Modern Cloud Service Providers can make a significant dent in this by making it possible for those who build on them to meet many of these compliance obligations as a byproduct of adopting their platforms.

In the near term, this will likely be focused on the production of the artifacts and audit reports needed to meet an organization’s current compliance requirements, but if we project out, it will surely evolve to include services for legal identity verification, content moderation, and other areas of regulatory oversight. A decade from now, I believe we will see systems built on these platforms in such a way that they are continually compliant, producing the artifacts necessary to pass audit as a natural byproduct of the way they work.

This will in turn make it easier to demonstrate compliance and create new opportunities such as auditors continually monitoring an organization for its compliance with guidelines rather than just doing annual point-in-time assessments as is done today.

This has also led to companies like Coalition building offerings that let customers augment existing systems with the artifacts needed to demonstrate that security best practices are being followed, so that insurers can offer more affordable risk-based policies.

As we look at the second trend, the redefinition of what it means to be secure, we can see consumers becoming more aware of security risks and as a result, their expectations around the sovereignty of their data and the confidentiality of their information evolving. 

One response to this realization is the idea of decentralization. The thesis here is arguably that there can be no sovereignty as long as there is centralization. In practice, most of these decentralized systems are in fact quite centralized. There are many examples of this; one of the more visible was the DAO hard fork, which was done to recover stolen funds, and another is the simple fact that 65% of Bitcoin mining happens in China. Additionally, for the most part, the properties that enable sovereignty typically come from the use of verifiable data structures and cryptography, not from decentralization. That is not to say these systems do not have a place; I would argue that their success and durability so far suggest there is “a there, there”, but I would also say that, at least currently, they do not yet live up to their full promise.

Another response is the consumer adoption of end-to-end encryption in messaging applications (even iMessage is end-to-end encrypted!) and, by extension, the verifiability of the systems that implement these schemes.

The best example here is probably Signal. They designed security and privacy into their messaging protocol and its implementations from the beginning, modeling the design on modern threats and decades of learning about what does, and does not, work. This approach led to the protocol they defined being adopted by many of their competitors, including WhatsApp, Facebook Messenger, Skype, and Google Allo.

Signal is also a great example of the verifiability property; in particular, the work they have done with Contact Discovery is exciting. With this feature they first minimize what information they need to deliver the capability, in the hope of limiting future abuse. Second, they leverage technologies like SGX, an example of Confidential Computing, which enables them to demonstrate what they are doing with the information they do collect. This introduces transparency and accountability, both of which are important ingredients in earning trust.

The use of hardware security as a key component of the security boundary has already found its way from consumer phones, laptops, and tablets to the cloud. For example, Google Cloud‘s Shielded VMs and Azure Trusted Launch use hardware to provide verifiable integrity for VM instances, making it possible to detect VMs compromised by boot- or kernel-level malware or rootkits, similar to what Apple does with the iPhone. We also now see AMD SEV and SGX getting broader deployment in the larger Cloud Service Providers (I will be the first to admit these technologies have room to grow if they are to live up to their promises, but they are promising nonetheless).

With this foundation, the industry is starting to look at how it can bring similar levels of transparency and accountability into applications and ecosystems too. One of the projects that has demonstrated the impact of doing this is Certificate Transparency. As a result of the investments in deploying Certificate Transparency, the internet is now materially more secure than it was before, and this is a direct result of introducing accountability into an opaque ecosystem based on blind trust.
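
As a rough illustration of why this introduces accountability: any monitor can audit a Certificate Transparency log over the RFC 6962 HTTP API. The Go sketch below fetches a log’s signed tree head; the log URL is a placeholder, and a real monitor would also verify the signature over the tree head and request consistency proofs between successive tree heads.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// SignedTreeHead mirrors the response of the RFC 6962 get-sth endpoint,
// which any client can poll to audit a Certificate Transparency log.
type SignedTreeHead struct {
	TreeSize          uint64 `json:"tree_size"`
	Timestamp         uint64 `json:"timestamp"`
	SHA256RootHash    string `json:"sha256_root_hash"`
	TreeHeadSignature string `json:"tree_head_signature"`
}

func main() {
	// Hypothetical log URL; a real monitor would use one of the publicly
	// operated logs recognized by the browser vendors.
	resp, err := http.Get("https://ct.example.com/ct/v1/get-sth")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var sth SignedTreeHead
	if err := json.NewDecoder(resp.Body).Decode(&sth); err != nil {
		log.Fatal(err)
	}

	// A monitor would verify the signature over this tree head and then
	// check consistency proofs to confirm the log is append-only.
	fmt.Printf("log contains %d entries, root hash %s\n", sth.TreeSize, sth.SHA256RootHash)
}
```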

Another example in this space is the Go Checksum Database, where verifiable data structures like Merkle trees are used to introduce accountability into the software supply chain as a means to mitigate risk for those who rely on the Go ecosystem.
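
Under the hood, both Certificate Transparency and the checksum database rest on the same Merkle tree primitive. What follows is a minimal sketch, not the actual sumdb implementation, of RFC 6962-style leaf/node hashing and inclusion-proof verification; the sample leaf contents are made up.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// leafHash and nodeHash follow the RFC 6962 domain-separated hashing used
// by transparency logs.
func leafHash(entry []byte) []byte {
	h := sha256.Sum256(append([]byte{0x00}, entry...))
	return h[:]
}

func nodeHash(left, right []byte) []byte {
	h := sha256.New()
	h.Write([]byte{0x01})
	h.Write(left)
	h.Write(right)
	return h.Sum(nil)
}

// verifyInclusion checks that entry sits at leafIndex in a Merkle tree of
// treeSize leaves whose root is rootHash, using the supplied audit path.
func verifyInclusion(entry []byte, leafIndex, treeSize uint64, path [][]byte, rootHash []byte) bool {
	if leafIndex >= treeSize {
		return false
	}
	fn, sn := leafIndex, treeSize-1
	r := leafHash(entry)
	for _, p := range path {
		if sn == 0 {
			return false
		}
		if fn%2 == 1 || fn == sn {
			r = nodeHash(p, r)
			if fn%2 == 0 {
				for fn%2 == 0 && fn != 0 {
					fn >>= 1
					sn >>= 1
				}
			}
		} else {
			r = nodeHash(r, p)
		}
		fn >>= 1
		sn >>= 1
	}
	return sn == 0 && bytes.Equal(r, rootHash)
}

func main() {
	// Tiny two-leaf tree built by hand, so the proof for leaf 0 is just
	// the hash of leaf 1.
	leaves := [][]byte{[]byte("module@v1.0.0 h1:abc"), []byte("module@v1.0.1 h1:def")}
	root := nodeHash(leafHash(leaves[0]), leafHash(leaves[1]))
	proof := [][]byte{leafHash(leaves[1])}
	fmt.Println(verifyInclusion(leaves[0], 0, 2, proof, root)) // true
}
```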

For many problems in the security space, you can solve from one of two philosophical bases. You can either create privileged systems only visible to a few that you hope aren’t corruptible or you can build democratizing transparency into the system as a check on corruption.

Dino A. Dai Zovi

While the earlier examples use combinations of hardware, cryptography, and verifiable data structures to deliver these properties, other examples take a more humble approach. For example, Google Cloud’s Access Transparency uses privilege separation, audit logs, and workflows to provide the fundamental ability to track business justifications for access to systems and data. The existence of these systems is further validation that the trend toward verifiability is emerging.

So what should you take away from this post? I suppose there are four key messages:

  1. The definition of security in modern Cloud services is continuing to be influenced by the consumer space which is leading to the concepts of verifiability, accountability, data sovereignty, and confidentiality becoming table stakes.
  2. Globalization and regulation are going to accelerate the adoption of these technologies and patterns as they will ultimately become necessary to meet regulatory expectations.
  3. Increasingly, verifiable data structures, cryptography, and hardware security capabilities are being used to make all of this possible.
  4. These trends will lead to the democratization of compliance with the many regulatory schemes that exist in the world.

I believe when we look back, these trends will have significantly changed the way we build systems and a new generation of businesses will emerge enabling these shifts to take place.

Software Supply Chain Risk Mitigation

Increasingly we are seeing attacks against what is now commonly referred to as the software supply chain.

One of the more notable examples in the last few months came from the Node.js package management ecosystem [1]. In this case, an attacker convinced the owner of a popular but unmaintained Node package to transfer ownership to them. The attacker then crafted a version of the package that unsuccessfully attacked Copay, a bitcoin wallet platform.

This is just one example of this class of attack; insider attacks on the software supply chain are also becoming more prevalent. When looking at this risk holistically, it is also important to realize that as deployments move to the Cloud, the lines between software and services blur.

Though not specifically an example of a Cloud deployment issue, in 2015 there was a public story about how some Facebook employees had the ability to log into users’ accounts without the target user’s knowledge [2]. This insider-risk variant of the supply chain problem exists in the Cloud in a number of different areas.

Probably the most notable is in the container images provided by the Cloud provider. It is conceivable that a Cloud provider could be compelled by a government to build images that would attack a specific customer or set of customers as part of an investigation, or that an employee could do so under compulsion or in service of personal interests.

This is not a new risk; in fact, management of internal and external dependencies has always been core to building secure systems. What has changed is that, in the rush to the Cloud and Open Source, users have adopted the tools and resources these cloud providers have built to make the migration easier without fully understanding and managing the risk they have assumed in doing so.

In response to this reality, Cloud providers are starting to provide tools to help mitigate this risk, some such examples include:

  • Providing audit records of employee access to customer data and services,
  • Building hardware-based trusted execution environments that offer some level of protection from the cloud provider itself,
  • Offering hardware key management solutions, provided by third parties, to protect sensitive key material,
  • Cryptographically signing the binaries and images that are published so that their distribution is controlled and post-production tampering can be detected.

Despite these advancements, there is still a long way to go to mitigate these risks in a holistic fashion.

One effort in this area I am actively involved in is the adoption of the concept of Binary Transparency. This can be thought of as an evolution of legacy code-signing models. In those models, a publisher produces a cryptographic signature using a private key associated with a public key or certificate that is either directly trusted based on package origin and signature (as with GPG signatures) or authenticated based on the legal identity of the publisher of the package (as is the case with Authenticode).
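
To show the baseline these models provide, here is a hedged sketch of a detached signature over a release artifact using Go’s crypto/ed25519 rather than GPG or Authenticode; the artifact bytes and key handling are purely illustrative. Verification proves who signed the package, but says nothing about what other versions exist or what changed between them.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

func main() {
	// Publisher side: generate a signing key and sign the digest of a
	// release artifact. In a real system the private key would live in an
	// HSM and the public key would be wrapped in a certificate.
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}

	artifact := []byte("contents of release-1.2.3.tar.gz") // stand-in for a real binary
	digest := sha256.Sum256(artifact)
	sig := ed25519.Sign(priv, digest[:])

	// Consumer side: verification proves the artifact came from the key
	// holder, but nothing about the history of the package.
	fmt.Println("signature valid:", ed25519.Verify(pub, digest[:], sig))
}
```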

These solutions, while valuable, help you authenticate a package, but they do not give you the tools to understand the history of that package. As a result, a publisher can, accidentally or on purpose, produce a malicious package signed with their “trusted keys”, and it is not detectable until it is too late.

As an example of this risk, you only need to look at Realtek: over the years their code signing key has been compromised numerous times and used to produce malware, some of it targeted, as in the case of Stuxnet [3].

Binary Transparency addresses this risk in a few ways. At its core, Binary Transparency can be thought of as an append-only ledger listing all versions of a given binary, with each version carrying a pointer to a content-addressable store where that binary is available.
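
A minimal way to picture such a ledger is sketched below; the record fields, package names, and URLs are hypothetical, and a production log would additionally anchor every entry in a Merkle tree so that third parties could verify its append-only behavior.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// LogEntry is a hypothetical record in an append-only binary transparency
// log: one entry per published version of a binary, keyed by the hash that
// also addresses the binary in content-addressable storage.
type LogEntry struct {
	Package    string    // e.g. "example/agent"
	Version    string    // e.g. "1.4.2"
	SHA256     string    // content address of the binary
	StorageURL string    // where the bytes can be fetched for inspection
	LoggedAt   time.Time // when the entry was appended
}

// Log is append-only in spirit: entries are only ever added, never rewritten.
type Log struct {
	entries []LogEntry
}

func (l *Log) Append(pkg, version string, binary []byte, storageURL string) LogEntry {
	sum := sha256.Sum256(binary)
	e := LogEntry{
		Package:    pkg,
		Version:    version,
		SHA256:     hex.EncodeToString(sum[:]),
		StorageURL: storageURL,
		LoggedAt:   time.Now().UTC(),
	}
	l.entries = append(l.entries, e)
	return e
}

// Latest lets a runtime confirm it is holding the most recent published
// version of a package before executing it.
func (l *Log) Latest(pkg string) (LogEntry, bool) {
	for i := len(l.entries) - 1; i >= 0; i-- {
		if l.entries[i].Package == pkg {
			return l.entries[i], true
		}
	}
	return LogEntry{}, false
}

func main() {
	var l Log
	l.Append("example/agent", "1.4.2", []byte("binary bytes"), "https://cas.example.com/sha256/...")
	latest, _ := l.Latest("example/agent")
	fmt.Printf("latest %s: %s -> %s\n", latest.Package, latest.Version, latest.SHA256)
}
```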

This design enables the runtime that will execute the binary to do a few things that were not possible before. It can, for example, ensure it is running the most recent version of a binary, and run the binary only when it, and some number of previous revisions, are publicly discoverable. It also enables the relying parties of the published binaries and images to inspect all versions and potentially diff them to understand what changed.

When this technique is combined with reproducible builds, as provided by Go [4], and with a community of these append-only logs and auditors of those logs, you can get strong assurances (see the sketch after this list) that:

  • You are running the same version as everyone else,
  • That the binary you are running is reproducible from the source you can review,
  • That the binary you are running has not been modified since it was published,
  • That you, and others, will not run binaries or images that have not been made publicly available for inspection.
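
Here is the verification sketch referenced above: the checks a runtime might perform before executing a binary, assuming it already has the hash recorded in a public log and the hash of a locally reproduced build. All inputs are illustrative, and a real system would also verify Merkle inclusion and consistency proofs for the log entry itself.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verifyBeforeRun sketches the checks a runtime could perform before
// executing a binary:
//   - loggedHash comes from a publicly auditable transparency log
//   - rebuiltHash comes from reproducing the build locally from source
func verifyBeforeRun(binary []byte, loggedHash, rebuiltHash string) error {
	sum := sha256.Sum256(binary)
	got := hex.EncodeToString(sum[:])

	if got != loggedHash {
		return fmt.Errorf("binary hash %s is not publicly logged (%s): refusing to run", got, loggedHash)
	}
	if got != rebuiltHash {
		return fmt.Errorf("binary does not match the reproducible build output: possible tampering")
	}
	return nil
}

func main() {
	binary := []byte("bytes of the binary to execute")
	sum := sha256.Sum256(binary)
	h := hex.EncodeToString(sum[:])

	// With matching values the binary is both publicly discoverable and
	// reproducible from the source that was reviewed.
	if err := verifyBeforeRun(binary, h, h); err != nil {
		panic(err)
	}
	fmt.Println("binary matches the log and the reproduced build; OK to run")
}
```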

A system with these properties disincentivizes the attacker from executing these attacks as it significantly increases the probability of being caught and helps bound the impact of any compromise.

Importantly, doing these things makes it possible to increase trust in the Cloud offering because it minimizes the amount of trust the user must place in the Cloud provider remaining honest.

A recent project that implements these concepts is the Go Module Transparency project [5] [6].
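
For example, the checksum database exposes a simple lookup endpoint that returns the go.sum lines for a module version along with a signed tree head; the go command consults and verifies this automatically when it records new entries in go.sum. The sketch below just fetches the raw record over HTTP, and the module version shown is only an example.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Fetch the raw checksum database record for one module version.
	// The go toolchain additionally verifies Merkle inclusion proofs
	// against the signed tree head; this sketch omits that step.
	const url = "https://sum.golang.org/lookup/golang.org/x/text@v0.3.8"

	resp, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```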

Over time we will see these same techniques applied to other areas [7] [8] of the software supply chain, and with that trend, users of open source packages, automatic update systems, and the Cloud will be able to have increased peace of mind that their external dependencies are truly delivering on their promises.


  • [1] Node.js Event-Stream Hack Exposes Supply Chain Security Risks
  • [2] Facebook Engineers Can Access Your Account Without A Password
  • [3] STUXNET Malware Targets SCADA Systems
  • [4] Reproducing Go Binaries Byte-by-Byte
  • [5] Proposal: Secure the Public Go Module Ecosystem
  • [6] Transparent Logs for Skeptical Clients
  • [7] Firefox Security/Binary Transparency
  • [8] Contour: A Practical System for Binary Transparency