A CA That Produces Evidence, Not Promises

In my last post I argued that high-assurance systems should stop asking to be trusted on the basis of institutional promises and start producing verifiable runtime evidence about what actually happened. This post is the worked example: a certificate authority built that way, the choices it forced, and what is and is not done yet.

When I was at Google I got to work a bit with the BeyondCorp folks. What most didn’t understand is that the BeyondCorp Google used internally was substantially different from the BeyondCorp they launched to customers. Internally, TPMs on Windows and Linux machines were used to create device credentials associated with each machine that were hardware bound, turning possession of the laptop and an authenticated credential into another factor.

You will notice I didn’t mention Macs. That’s because Apple, although it had a similar secure processor on its devices, did not give customers attestations over keys stored inside it. We could put keys in the Secure Enclave. We could not prove to a third party that we had. For years we tried to get Apple to change that. Eventually they did, and what they shipped was not what we asked for.

We didn’t get the ability to put arbitrary keys in the enclave under our control with attestation. We got a device-bound credential signed by Apple that let us verify, at enrollment time, that a request was coming from one of our devices. We used that as a bootstrap to enroll a short-lived credential that the OS stored and used for day-to-day authentication. Apple’s attestation answered the “which device” question. Our short-lived credential answered everything that came after.

That worked. The standardization piece is draft-ietf-acme-device-attest, an IETF working group document I co-author, which lets the same ACME flow carry an Apple Managed Device Attestation statement, a TPM key attestation, or a YubiKey assertion without the CA needing to special-case each platform. Apple’s adoption was what unlocked normalizing on a single way across the fleet to authenticate these devices.

That normalization is the relying-party side of the credential story. The hard-edge property, this key lives on this specific device, signed by a chain you can verify back to the manufacturer, is now what we expect from a workload, a laptop, a phone, a passkey, a SPIFFE workload identity. MFA was the first compensating control for the weaknesses in passwords and API keys and bearer tokens. Passkeys, SPIFFE, and certificate-based zero-trust segmentation are the structural answer. We replaced the secret-you-know with a key-you-hold and bound it to hardware where we could.

Google’s internal production systems run the same shape, with Titan chips as the foundational substrate for the devices that get on the network for internal and cloud workloads.

That shift is well underway on the side of the wire that uses credentials.

The side that issues them is still mostly software on a server with an API key and a SOC 2 report.

If issuance hasn’t kept up with what we now expect from credential holders, what would catching up actually look like? The previous post made the general argument. The CA should be asked to produce evidence, not to be trusted. This post is what that actually looks like when you build it. A certificate authority where the issuance path itself is the evidence, where the policy that fired is part of what was measured, where the key release is gated on the measurement of the binary asking for it, and where every issued certificate is accompanied by a portable bundle a relying party can verify against trust anchors they already hold.

The architecture is designed to run on both AWS and Google Cloud. On AWS, enclave-based deployments use Nitro Enclave attestations, while VM-level deployments can use NitroTPM-backed evidence for measured boot, instance identity, and workload state. On Google Cloud, Confidential VM deployments use AMD SEV-SNP or Intel TDX attestation for the protected execution environment, with Cloud HSM as the key custodian. Where the system needs VM identity, boot-state evidence, or platform posture outside the confidential-computing attestation itself, it can also use vTPM-based evidence.

That breadth is intentional, and I’ll come back to why. But the real load-bearing architectural choice is not which cloud, HSM, enclave, VM, or attestation primitive we use. It is the shape of the issuance system inside each trust boundary.

Two components, one evidence chain

The split is structural, not procedural.

In a single-binary CA, policy enforcement and signing authority are separated by code paths inside one process. A vulnerability in the policy evaluator is a vulnerability in the signing authority, because they share an address space. The compartments exist in the source code. They do not exist at runtime.

Here, the compartments are real. The architecture splits issuance into two attested components.

The registration authority receives the certificate request, resolves identity from authoritative sources, evaluates the issuance policy against the requester’s posture and attestation, and produces a signed authorization context. The signing oracle holds the path to the CA’s signing key and produces the signature. Each runs in a separate attested environment. Each is measured separately. Each has its own keys. Both are written in Go: memory-safe in normal code, reproducible without ceremony, and backed by a standard library that already covers most of what a CA needs.

The RA and the oracle are different binaries, with different measurements, with different keys, on different network endpoints, joined by mutually-attested TLS where each side has independently verified the other’s measurement before either will talk. A compromise of the RA does not give the attacker the signing key, because the signing key is not in the RA’s address space and is not reachable from the RA’s network position. A compromise of the oracle does not give the attacker the policy evaluator’s identity providers, because the oracle has no identity providers wired into it. The interface between them is narrow, signed, and replay-protected.

This is the cross-machine version of the kernel-userland boundary. The kernel does not trust userland’s claim that a syscall is authorized. It does the check itself, every time. The oracle does the same thing for the parts of issuance it can verify independently.

What this buys is a bounded blast radius on a compromised RA.

If an attacker takes over the RA, including the RA’s signing key, the damage is bounded only to the extent that the oracle requires independently verifiable evidence for the facts that matter. For profile authorization, key binding, replay protection, RA authorization, and certificate structure, the oracle can check those facts locally.

For domain control validation, the same is true only when the evidence is cryptographic or independently corroborated, such as DNSSEC-validatable DNS evidence or signed observations from independent multi-perspective validators. The current implementation does not yet do either of these things, but adding support for them is a straightforward extension of the model. Without that, however, an RA compromise can still become a validation compromise.

That distinction matters. The oracle does not make an RA trustworthy. It makes the RA’s assertions conditional. Where the RA presents verifiable evidence, the oracle can check it. Where the RA presents only its own statement about what it observed, the oracle can enforce structure, freshness, profile policy, replay protection, and RA authorization, but it cannot turn that statement into ground truth.

Concretely, an attacker who controls the RA cannot get the oracle to sign a CA certificate, because the oracle’s profile registry does not contain one for them. They cannot get a leaf outside the measured profile registry, because the oracle will not recognize the profile. They cannot get the oracle to issue under a profile this RA was not authorized for, because the oracle scopes RAs to profiles in its own configuration and does not take the RA’s word for what it is allowed to request. For enterprise identity and hardware-bound requester claims, they cannot get a certificate for an identity the oracle’s verifier would not accept, because the oracle re-evaluates those claims independently. For WebPKI domain names, that guarantee requires independently verifiable DCV evidence, such as DNSSEC-validatable DNS records or signed multi-perspective validation results. They cannot replay an old authorization, because the oracle tracks the freshness of every authorization context it accepts.
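A minimal sketch of the locally checkable part of that list, under stated assumptions: the type names, the pipe-delimited signing encoding, the profile names, and the in-memory nonce set are all hypothetical and are not the implementation described here. It shows only that an attacker holding the RA's signing key still runs into the oracle's own configuration.

```go
package main

import (
	"crypto/ed25519"
	"errors"
	"fmt"
	"time"
)

// AuthzContext is a hypothetical, simplified authorization context from an RA.
type AuthzContext struct {
	RAID, Profile, Nonce string
	IssuedAt             time.Time
	Sig                  []byte
}

// signedBytes is a toy encoding; a real system would sign a canonical structure.
func (c AuthzContext) signedBytes() []byte {
	return []byte(c.RAID + "|" + c.Profile + "|" + c.Nonce + "|" +
		c.IssuedAt.UTC().Format(time.RFC3339Nano))
}

// Oracle holds only its own configuration; nothing here is supplied by the RA.
type Oracle struct {
	Profiles   map[string]bool              // the oracle's profile registry
	RAKeys     map[string]ed25519.PublicKey // RA identity -> verification key
	RAProfiles map[string]map[string]bool   // RA identity -> profiles it may request
	MaxAge     time.Duration                // freshness window
	seen       map[string]bool              // nonces already accepted (unbounded in this sketch)
}

// Admit runs the checks the oracle can make locally before it signs anything.
func (o *Oracle) Admit(c AuthzContext, now time.Time) error {
	key, ok := o.RAKeys[c.RAID]
	if !ok || !ed25519.Verify(key, c.signedBytes(), c.Sig) {
		return errors.New("unknown RA or bad signature")
	}
	if !o.Profiles[c.Profile] {
		return errors.New("profile not in registry")
	}
	if !o.RAProfiles[c.RAID][c.Profile] {
		return errors.New("RA not authorized for this profile")
	}
	if age := now.Sub(c.IssuedAt); age < 0 || age > o.MaxAge {
		return errors.New("authorization context not fresh")
	}
	if o.seen == nil {
		o.seen = map[string]bool{}
	}
	if o.seen[c.Nonce] {
		return errors.New("replayed authorization context")
	}
	o.seen[c.Nonce] = true
	return nil
}

// demo plays an attacker who holds the RA's signing key and returns the
// oracle's verdicts for: a valid request, a replay of it, a registered profile
// this RA is not scoped to, and a profile the registry does not contain.
func demo() []error {
	pub, priv, err := ed25519.GenerateKey(nil)
	if err != nil {
		panic(err)
	}
	o := &Oracle{
		Profiles:   map[string]bool{"piv-auth": true, "tls-server": true},
		RAKeys:     map[string]ed25519.PublicKey{"ra-1": pub},
		RAProfiles: map[string]map[string]bool{"ra-1": {"piv-auth": true}},
		MaxAge:     time.Minute,
	}
	now := time.Now()
	sign := func(profile, nonce string) AuthzContext {
		c := AuthzContext{RAID: "ra-1", Profile: profile, Nonce: nonce, IssuedAt: now}
		c.Sig = ed25519.Sign(priv, c.signedBytes())
		return c
	}
	good := sign("piv-auth", "n1")
	return []error{
		o.Admit(good, now),
		o.Admit(good, now),
		o.Admit(sign("tls-server", "n2"), now),
		o.Admit(sign("intermediate-ca", "n3"), now),
	}
}

func main() {
	for i, err := range demo() {
		fmt.Println(i, err)
	}
}
```

Only the first request is admitted. The sketch deliberately leaves out the harder half of the argument: re-verifying requester evidence such as a device attestation, which is what separates a checked fact from an RA assertion.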

The architectural property is not that the oracle magically knows every fact the RA observed. It is structural separation plus independent verification wherever the fact is independently verifiable. Every other property the rest of this post will describe, measured policy, gated key release, per-operation attestation, portable evidence, depends on that boundary.

What I mean by evidence

Before walking through the attestations, it is worth being precise about the word evidence.

A raw measurement is a narrow statement: this binary had this digest, this key was generated by this HSM, this quote was signed by this platform, this public key matches the attested key. Those statements matter because they are the hard edge of the system.

But they are not enough to explain issuance. A certificate is issued because policy evaluated a set of facts and authorized a specific profile for a specific requester at a specific time.

So evidence here means the decision record, the raw attestations where they can be disclosed, the verifier results, the policy digest, the profile binding, the facts the policy evaluated, the signed authorization context, the oracle’s per-operation attestation, the custodian’s key-release evidence, and the transparency proof.

draft-ietf-acme-device-attest helps with the requester-side device and key attestation. The CA-side runtime evidence chain is still built from platform-specific TEE, TPM, HSM, KMS, and transparency-log evidence. The point is not that one standard covers all of it. The point is that the CA should preserve the evidence chain instead of collapsing it into a one-bit pass/fail result or a promise.

What each attestation lets you verify

Cross-checking only matters if the things being checked are concrete. Four attestations cross the issuance flow, each one produced by a different party, each one letting the next party check something specific. It is worth walking through what each one actually lets you verify before going further.

The client’s attestation

What the RA verifies: that the requestor controls the private key whose public half is in the certificate request, that the key lives on a specific piece of hardware, that the hardware has a manufacturer-vouched identity, and that the key was generated under conditions the policy can reason about.

The shape of the attestation changes between platforms. A TPM-bound key on a Windows or Linux machine, a Secure Enclave key on an Apple device, and a key in a YubiKey’s PIV slot each arrive in a different format with a different signing chain and different platform-specific fields. The function is the same. Each one carries a binding between the key in the CSR and a piece of hardware, an identifier for that hardware, the conditions under which the key can be used, and a certificate chain back to a manufacturer the policy can decide whether to trust.

draft-ietf-acme-device-attest is the wire format that carries any of these in the same ACME flow, so the CA doesn’t need to special-case each platform. What the policy sees in the end is the same five things regardless of which platform produced them. Does the requestor control the private key. Is the key on a piece of hardware. Whose hardware. Under what conditions can it be used. Does the manufacturer chain trace back to a root the policy will accept.
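One way to picture "the same five things" is a normalized claims record that each platform-specific parser fills in before policy runs. The sketch below is an assumption about shape only: the field names, the `Requirements` type, and the root names are hypothetical, and the platform parsers themselves are out of scope.

```go
package main

import "fmt"

// DeviceClaims is a hypothetical normalized view of a client attestation,
// filled in by a platform-specific parser (TPM, Secure Enclave, YubiKey PIV).
type DeviceClaims struct {
	KeyPossessionProven bool   // does the requestor control the private key
	HardwareBound       bool   // is the key on a piece of hardware
	DeviceID            string // whose hardware
	UserPresence        bool   // under what conditions can it be used
	ChainRoot           string // root the manufacturer chain terminated at
}

// Requirements is what a profile demands of those claims.
type Requirements struct {
	TrustedRoots        map[string]bool
	RequireUserPresence bool
}

// Evaluate returns the reasons to deny; an empty result means the claims pass.
func Evaluate(c DeviceClaims, r Requirements) []string {
	var deny []string
	if !c.KeyPossessionProven {
		deny = append(deny, "no proof of possession")
	}
	if !c.HardwareBound {
		deny = append(deny, "key is not hardware-bound")
	}
	if c.DeviceID == "" {
		deny = append(deny, "no device identifier")
	}
	if r.RequireUserPresence && !c.UserPresence {
		deny = append(deny, "key usable without user presence")
	}
	if !r.TrustedRoots[c.ChainRoot] {
		deny = append(deny, "untrusted manufacturer root")
	}
	return deny
}

// demo evaluates a hardware token and a software key against one profile.
func demo() (token, software []string) {
	req := Requirements{
		TrustedRoots:        map[string]bool{"example-token-vendor-root": true},
		RequireUserPresence: true,
	}
	token = Evaluate(DeviceClaims{
		KeyPossessionProven: true,
		HardwareBound:       true,
		DeviceID:            "serial-0001",
		UserPresence:        true,
		ChainRoot:           "example-token-vendor-root",
	}, req)
	software = Evaluate(DeviceClaims{KeyPossessionProven: true}, req)
	return token, software
}

func main() {
	token, software := demo()
	fmt.Println("token denials:", token)
	fmt.Println("software key denials:", software)
}
```

The policy never sees the wire format, only the five answers, which is the property the paragraph above describes.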

The RA’s environment attestation

What anyone holding the document can verify: that the RA ran a specific measured image inside an attested execution environment, that the cloud provider signed off on the measurement, that the workload identity or role is the one the deployment expects, and that the public key bound to that measured environment for the rest of its lifetime is the one in the document.

The shape changes between TEE platforms. AWS Nitro, AMD SEV-SNP, and Intel TDX each produce a document with different fields signed by a different chain. The function is the same. Each one carries a measurement of what was loaded into the measured environment, an identifier for the platform that produced the measurement, the operating context the measured environment runs in, and a certificate chain back to a hardware vendor the relying party can decide whether to trust.

Worth being explicit about what the document does not let you verify. It does not tell you who deployed the measured environment, who runs the cloud account, or what the operator intended to run. It tells you what was measured. The operator’s intent is a separate question, answered by the published image list and the policy that names which measurements are acceptable. A relying party who trusts the hardware vendor’s chain can verify the measurement. They still have to verify, separately, that the measurement is one they should accept.

The oracle’s environment attestation

What it lets the RA verify, during handshake: that the oracle the RA is about to send an authorization context to is running the published image, and that the signing key the oracle will use for its half of the mutual TLS is the one bound to that image at boot.

What it lets a relying party verify, after issuance: the same thing, plus one additional binding. The oracle produces a fresh attestation for every signing operation, and the attestation is bound to the certificate that operation just produced and to the RA on the other end of the conversation that authorized it. Where the RA’s boot-time attestation carries the long-lived public key the measured environment will sign with, the oracle’s per-operation attestation pins this specific certificate to this specific oracle measurement with this specific RA.

The difference between boot-time and per-operation attestation is the difference between “the box looked right when it started” and “the box looked right when it did the thing you actually care about.” Boot-time attestation is what most confidential-computing deployments do today. It tells a relying party the deployment was valid at startup. It tells them nothing about whether the deployment was still valid five hours later when an actual issuance happened. Per-operation attestation closes that gap.
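A sketch of how such a binding can be constructed, assuming the platform lets the workload place a caller-supplied value inside the signed attestation (a report-data or user-data style field). The digest layout, the domain-separation label, and the type names are hypothetical. Verifying the platform's signature over the attestation is a separate step not shown.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/binary"
	"fmt"
)

// BindingDigest hashes a label, the issued certificate, and the RA's public
// key. Each part is length-prefixed so ("ab","c") and ("a","bc") differ.
func BindingDigest(certDER, raPublicKey []byte) [32]byte {
	h := sha256.New()
	for _, part := range [][]byte{[]byte("per-op-binding-v1"), certDER, raPublicKey} {
		var n [8]byte
		binary.BigEndian.PutUint64(n[:], uint64(len(part)))
		h.Write(n[:])
		h.Write(part)
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// OpAttestation is a hypothetical parsed per-operation attestation whose
// platform signature has already been verified elsewhere.
type OpAttestation struct {
	Measurement string   // what the oracle was running
	ReportData  [32]byte // caller-supplied value covered by the signature
}

// VerifyBinding checks that the attestation was produced for this exact
// certificate and this exact RA, not merely by the right binary.
func VerifyBinding(a OpAttestation, certDER, raPublicKey []byte) bool {
	want := BindingDigest(certDER, raPublicKey)
	return subtle.ConstantTimeCompare(a.ReportData[:], want[:]) == 1
}

// demo checks one attestation against the right inputs and two wrong ones.
func demo() (match, wrongCert, wrongRA bool) {
	cert := []byte("DER bytes of the issued certificate")
	ra := []byte("RA public key bytes")
	att := OpAttestation{Measurement: "oracle-image-digest", ReportData: BindingDigest(cert, ra)}
	return VerifyBinding(att, cert, ra),
		VerifyBinding(att, []byte("some other certificate"), ra),
		VerifyBinding(att, cert, []byte("some other RA"))
}

func main() {
	fmt.Println(demo())
}
```

A boot-time attestation has no such field to check against a certificate, which is why it cannot answer the per-issuance question.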

The custodian’s attestation

The CA private key is not loaded by the operator. It is held by a custodian, an HSM with discrete-silicon attestation, a cloud KMS that gates on attestation, or another measured execution environment. The key is wrapped so the custodian will release it only to a signing oracle whose attestation matches a published image. The list of acceptable images is small and public.

What the custodian’s evidence lets a relying party verify: that the wrapped key was released only because the oracle’s measurement matched a measurement on the published list, signed by the custodian’s hardware root. If an operator runs a different binary, however benign the reason, the key does not get released. The dragon’s teeth around the data center are still there. They no longer have to do the whole job.
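The gate and its evidence can be sketched as follows. This is an illustration under assumptions, not any HSM's or KMS's API: the custodian is modeled as an Ed25519 signer, the measurement arrives as an already-verified string, and the key ID, measurement labels, and message format are hypothetical.

```go
package main

import (
	"crypto/ed25519"
	"errors"
	"fmt"
)

// ReleaseEvidence is a hypothetical signed record that a key was released.
type ReleaseEvidence struct {
	KeyID, Measurement string
	Sig                []byte
}

// Custodian releases a key only to measurements on the published list.
type Custodian struct {
	signer    ed25519.PrivateKey
	Published map[string]bool
}

func releaseMessage(keyID, measurement string) []byte {
	return []byte("key-release-v1|" + keyID + "|" + measurement)
}

// Release gates on the measurement. In a real custodian the measurement would
// come from a verified attestation, not from a caller-supplied string.
func (c *Custodian) Release(keyID, measurement string) (*ReleaseEvidence, error) {
	if !c.Published[measurement] {
		return nil, errors.New("measurement not on the published list; key stays wrapped")
	}
	sig := ed25519.Sign(c.signer, releaseMessage(keyID, measurement))
	return &ReleaseEvidence{KeyID: keyID, Measurement: measurement, Sig: sig}, nil
}

// VerifyRelease is the relying party's side: the evidence must be signed by
// the custodian's root and name a measurement the relying party also accepts.
func VerifyRelease(root ed25519.PublicKey, published map[string]bool, e *ReleaseEvidence) bool {
	return e != nil && published[e.Measurement] &&
		ed25519.Verify(root, releaseMessage(e.KeyID, e.Measurement), e.Sig)
}

// demo releases to a published image and refuses an unpublished one.
func demo() (released, verified, rogueReleased bool) {
	pub, priv, err := ed25519.GenerateKey(nil)
	if err != nil {
		panic(err)
	}
	published := map[string]bool{"oracle-image-A": true}
	c := &Custodian{signer: priv, Published: published}

	ev, err := c.Release("issuing-ca-key", "oracle-image-A")
	released = err == nil
	verified = VerifyRelease(pub, published, ev)

	_, err = c.Release("issuing-ca-key", "operator-hotfix-build")
	rogueReleased = err == nil
	return released, verified, rogueReleased
}

func main() {
	fmt.Println(demo())
}
```

The unpublished build is refused regardless of who asks or why, which is the "however benign the reason" property in the paragraph above.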

For a concrete look at what one of these documents actually contains, rather than just what it proves, the PeculiarVentures attestation library parses examples from each of these platforms, including the Marvell HSM attestation produced by Google Cloud HSM. An attestation without a verifier is a claim. With a verifier, it is something a relying party can act on.

Policy as mechanism, not promise

Reading all of those attestations means the policy that evaluated them has to be something concrete enough to read.

Today’s CAs publish a CP/CPS. The Certificate Policy and Certification Practice Statement is a document describing what the CA will and will not do. An auditor samples evidence once a year against the document. The document and the system that produces certificates are not cryptographically linked. The relying party trusts that the document describes the system. The auditor’s annual report is the closing of the loop.

Cedar policies are different. They are a domain-specific language with declarative semantics, written under version control, statically analyzable, and small enough to read in a sitting. The policy that fires inside the signing oracle is compiled into the measured binary. The digest of the policy travels in the evidence bundle that accompanies the certificate. A relying party can re-fetch the source at the named digest, read the rules, and decide for themselves whether the policy that authorized their certificate is a policy they accept.
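The relying party's side of that check is small. The sketch below assumes the digest is SHA-256 over the exact policy source bytes, hex-encoded. The post does not specify the digest construction, so treat that and the sample policy text as illustrative assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// PolicyDigest hashes the policy source exactly as published.
// Assumption: SHA-256 over raw bytes; the real construction may differ.
func PolicyDigest(src []byte) string {
	sum := sha256.Sum256(src)
	return hex.EncodeToString(sum[:])
}

// MatchesBundle reports whether the fetched source is the policy the
// evidence bundle says was evaluated.
func MatchesBundle(fetched []byte, bundleDigest string) bool {
	return PolicyDigest(fetched) == bundleDigest
}

// published is a hypothetical policy in Cedar syntax, used only as input bytes.
const published = `permit(principal, action == Action::"issue", resource)
when { context has hardwareBound && context.hardwareBound };`

func main() {
	digest := PolicyDigest([]byte(published)) // value carried in the bundle
	fmt.Println(MatchesBundle([]byte(published), digest))
	fmt.Println(MatchesBundle([]byte(published+" "), digest)) // any edit changes the digest
}
```

Matching the digest only proves which text was named. Reading the rules and deciding whether to accept them is still the relying party's job.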

The contrast that matters operationally is the one between policy-as-promise and policy-as-mechanism. A CP/CPS is a promise. The auditor verifies, by sample, that the practice resembles the promise. A Cedar policy compiled into a measured binary, with its digest in the bundle, is a mechanism. The relying party verifies, per certificate, that the rules that fired are the rules the CA published.

There is a sharp footgun specific to Cedar that is worth naming because the answer to it is part of the architecture. Cedar’s evaluator skips a policy that throws while accessing an attribute that was not present. The convenient result is that policies stay readable in the absence of optional context. The inconvenient result is that a forbid policy with an unguarded attribute access can silently drop, which is a fail-open. The lint at build time requires a has guard before any optional attribute access. A policy that would have failed open is now a build error. The defense-in-depth move is structural. The policy author cannot ship a fail-open by accident.
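To make the lint concrete, here is a deliberately naive, text-level approximation. It is not the lint described above: a real one would presumably walk the Cedar AST and know which attributes are optional. This sketch treats every top-level `context.<name>` access as optional and requires a `context has <name>` guard earlier in the same policy text. The function name and sample policies are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// access matches a top-level attribute read such as context.jailbroken.
var access = regexp.MustCompile(`\bcontext\.([A-Za-z_][A-Za-z0-9_]*)`)

// Unguarded returns attribute names read from context without an earlier
// "context has <name>" guard in the same policy text.
func Unguarded(policy string) []string {
	var out []string
	for _, m := range access.FindAllStringSubmatchIndex(policy, -1) {
		name := policy[m[2]:m[3]]
		guard := regexp.MustCompile(`\bcontext\s+has\s+` + regexp.QuoteMeta(name) + `\b`)
		loc := guard.FindStringIndex(policy)
		if loc == nil || loc[0] > m[0] {
			out = append(out, name)
		}
	}
	return out
}

// failOpen errors when the attribute is absent, so the forbid is skipped.
const failOpen = `forbid(principal, action, resource)
when { context.jailbroken };`

// guarded short-circuits on the has check and never errors on absence.
const guarded = `forbid(principal, action, resource)
when { context has jailbroken && context.jailbroken };`

func main() {
	fmt.Println("failOpen:", Unguarded(failOpen))
	fmt.Println("guarded:", Unguarded(guarded))
}
```

A build step that fails whenever `Unguarded` returns anything turns the silent skip into a visible error, which is the structural point of the lint.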

What this looks like in practice

An employee gets a new smart card from corporate IT in a sealed blister pack. They plug it in.

The enrollment client on the laptop sees a card it doesn’t know. It reads the card’s GlobalPlatform Card Production Lifecycle data and the card recognition data, learns this is a factory-fresh retail token from a manufacturer it has a trust root for, and verifies the card identity attestation back to that root. The card is genuine. It has never been provisioned. The enrollment client knows what kind of token it is looking at and what the policy says to do with one.

The client provisions the token. It generates a new keypair on the card under a policy that requires user PIN for use and marks the key non-exportable. The card produces a key attestation: a statement signed by the card’s manufacturer-installed attestation key asserting that this specific public key was generated on this specific token, has these specific usage constraints, and will never leave the hardware. The enrollment client builds a CSR for the new key and has the card sign it, which is the standard proof that the requestor controls the corresponding private key. It then packages the CSR, the user’s identity claim, and the card’s attestation into an ACME request and sends it to the RA.

That is the first link. The chain is going to have several.

The RA receives the request inside its measured execution environment. It verifies the CSR signature against the public key in the CSR, which proves the requestor controls the corresponding private key. It verifies the card’s attestation against the manufacturer chain. Yes, this is a genuine retail token from a manufacturer we trust. Yes, the key in the CSR is on the card. Yes, the usage constraints match policy. It resolves the identity claim against the corporate identity provider. Yes, this user exists. Yes, they are entitled to this credential type. Yes, their device posture matches. It evaluates the Cedar policy against the cross-product of the device, the identity, and the requested profile. Yes, all of it permits issuance. It builds an authorization context, signs it with the key bound to its measured environment, and sends it to the signing oracle.

The signing oracle receives the context inside its own measured execution environment, over mutually-attested TLS where both sides have verified the other’s measurement before they would talk. It does not simply believe what the RA told it. For facts backed by independently verifiable evidence, the oracle re-verifies that evidence against its own configured verifier. In this example, it re-verifies the card’s attestation; cross-checks the claims the RA made about the card, the requested profile, and the requester; validates the profile binding, the certificate type, the validity window, and the structural invariants the profile requires; and confirms that this RA is authorized to ask for this kind of issuance. It rejects replays. Only then does it ask the custodian for the signing key. The custodian releases it because the oracle’s attestation matches the published image. The oracle signs. It produces a fresh per-operation attestation binding this certificate to this measured execution environment and to the RA that authorized it. The issuance is written to a transparency log that independent witnesses cosign.

The bundle that comes back to the enrollment client contains the card’s attestation that the key is on the token; the RA’s signed authorization context naming the identity, the profile, the policy digest, and the verifiers it ran; the oracle’s per-operation attestation binding the signature to a measured binary on a measured platform; the custodian’s evidence that the key was released only because the oracle’s measurement matched; the transparency log inclusion proof and witness cosignatures showing the issuance was published before the certificate was returned; and the certificate itself.
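As a data structure, the bundle might look like the sketch below. The field names and the opaque byte-slice representation are assumptions for illustration; the actual bundle format is not specified here. The `Missing` helper only checks presence, which is the step before any cryptographic verification.

```go
package main

import "fmt"

// EvidenceBundle is a hypothetical container for the items listed above.
// Each field holds an opaque encoded document in this sketch.
type EvidenceBundle struct {
	Certificate          []byte
	ClientAttestation    []byte   // the card's key attestation
	AuthorizationContext []byte   // signed by the RA; names identity, profile, policy digest
	OracleAttestation    []byte   // per-operation attestation
	CustodianEvidence    []byte   // key-release evidence
	InclusionProof       []byte   // transparency log inclusion proof
	WitnessCosignatures  [][]byte // cosignatures over the log checkpoint
}

// Missing lists the parts a verifier would need but the bundle does not carry.
func (b EvidenceBundle) Missing() []string {
	var out []string
	check := func(name string, present bool) {
		if !present {
			out = append(out, name)
		}
	}
	check("certificate", len(b.Certificate) > 0)
	check("client attestation", len(b.ClientAttestation) > 0)
	check("authorization context", len(b.AuthorizationContext) > 0)
	check("oracle attestation", len(b.OracleAttestation) > 0)
	check("custodian evidence", len(b.CustodianEvidence) > 0)
	check("inclusion proof", len(b.InclusionProof) > 0)
	check("witness cosignatures", len(b.WitnessCosignatures) > 0)
	return out
}

func main() {
	certOnly := EvidenceBundle{Certificate: []byte("cert")}
	fmt.Println(certOnly.Missing())
}
```

A certificate delivered on its own is the degenerate case: every link except the last is missing, and the relying party is back to taking the CA's word for it.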

Every party in the flow did the logical equivalent of what every other party did. It verified upstream evidence against a manufacturer, platform, custodian, or witness root it already trusted, did its work, and produced its own evidence for the downstream party to verify. The bundle packages those attestations and proofs so the subscriber, or anyone the subscriber shares it with, can re-walk the chain end to end against the same trust roots, without having to take any single party’s word for it.

The card manufacturer’s root says the key is on the hardware. The chip vendor’s TEE root says the RA ran the measured image. The chip vendor’s TEE root says the oracle ran the measured image. The HSM vendor’s root says the key was released to a measurement that matched. The witness network’s cosignatures say the log is what the operator published, not a fork served to one relying party. A relying party who trusts each of those roots can verify the certificate’s basis of issuance from the evidence bundle, instead of relying only on the CA’s institutional promise.
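The transparency-log link in that chain is the most mechanical one to re-walk. The sketch below verifies a Merkle inclusion proof using the hashing and verification procedure from RFC 6962 and RFC 9162. That the log described here uses this exact tree construction is an assumption. Checking the witness cosignatures over the root is a separate step not shown.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// leafHash and nodeHash use the RFC 6962 domain-separation prefixes.
func leafHash(d []byte) [32]byte {
	return sha256.Sum256(append([]byte{0x00}, d...))
}

func nodeHash(l, r [32]byte) [32]byte {
	b := make([]byte, 0, 65)
	b = append(b, 0x01)
	b = append(b, l[:]...)
	b = append(b, r[:]...)
	return sha256.Sum256(b)
}

// VerifyInclusion follows the RFC 9162 inclusion-proof verification steps.
func VerifyInclusion(leaf [32]byte, index, size uint64, path [][32]byte, root [32]byte) bool {
	if index >= size {
		return false
	}
	fn, sn := index, size-1
	r := leaf
	for _, p := range path {
		if sn == 0 {
			return false
		}
		if fn&1 == 1 || fn == sn {
			r = nodeHash(p, r)
			for fn&1 == 0 && fn != 0 {
				fn >>= 1
				sn >>= 1
			}
		} else {
			r = nodeHash(r, p)
		}
		fn >>= 1
		sn >>= 1
	}
	return sn == 0 && r == root
}

// demo builds a three-entry log by hand and checks proofs against its root.
func demo() (leaf0, leaf2, wrongIndex bool) {
	l0, l1, l2 := leafHash([]byte("cert-0")), leafHash([]byte("cert-1")), leafHash([]byte("cert-2"))
	h01 := nodeHash(l0, l1)
	root := nodeHash(h01, l2)

	leaf0 = VerifyInclusion(l0, 0, 3, [][32]byte{l1, l2}, root)
	leaf2 = VerifyInclusion(l2, 2, 3, [][32]byte{h01}, root)
	wrongIndex = VerifyInclusion(l2, 1, 3, [][32]byte{h01}, root)
	return leaf0, leaf2, wrongIndex
}

func main() {
	fmt.Println(demo())
}
```

The proof convinces a verifier that the entry is under a particular root. The cosignatures are what tie that root to the log everyone else sees.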

The CA is not asked only to be trusted. The CA produced evidence.

What is built

The architecture above runs in preproduction today on AWS and Google Cloud.

On AWS, that means Nitro Enclaves for enclave-based issuance components and NitroTPM-backed evidence for VM-level identity, measured boot, and workload posture. On Google Cloud, it means AMD SEV-SNP and Intel TDX Confidential VMs for protected execution, Cloud HSM as custodian, and vTPM-based evidence where VM boot state or workload identity needs to be represented.

Classical ECDSA and post-quantum ML-DSA-65 (FIPS 204) hierarchies operate in parallel. ML-KEM-768 (FIPS 203) is the subject key for TLS key-exchange certificates. Cedar policy with the fail-open lint is enforced in the oracle. Per-operation attestation, evidence bundles, custodian-gated key release on measurement, and mutually attested TLS between RA and oracle all run end-to-end on both clouds.

The breadth is there because no single TEE family fits every customer environment, and the architecture should not be hostage to one chip vendor or one cloud.

The 2029 problem

The math snapshot has a date.

In March 2026, Google’s Heather Adkins and Sophie Schmieg set 2029 as the target for completing Google’s migration to post-quantum cryptography. Google’s timeline matters beyond Google. They run Chrome and Android, and when they move, the WebPKI moves with them. CNSA 2.0 puts 2027 on software and firmware signing in National Security Systems and 2030 on general use. The CABF is working through its own timeline. Federal procurement requirements are already moving.

The CA infrastructure that exists today was designed for the snapshot of math problems that the 2029 transition invalidates. Every CA in production is going to be re-architected before it lands. The algorithms expire. The migration is not optional.

The transition itself is going to be heterogeneous. The classical-PQC X.509 path is going to run for a long time alongside what eventually replaces it. Merkle Tree Certificates, a batched, transparency-native form of issuance with much smaller per-certificate overhead, are a likely part of the answer to ML-DSA’s signature size on the wire. The architecture above does not care which container format the certificate is in. The attested issuance pipeline, the custodian-gated key release, the evidence bundle, and the transparency log all operate on the issuance side. MTC issuance benefits from runtime evidence the same way X.509 issuance does, and the patterns in this post carry over.

The architectural question is what the replacement looks like.

A CA built on the runtime-evidence pattern does not cost more to deploy at the moment you are already rebuilding. It costs more only if you skip the rebuild, and skipping is not on the table. The hardware-anchored credential side of the wire has been arriving in production for a decade. The issuance side is the part still running on the old shape. The PQ deadline is the forcing function that makes the issuance side move. The choice is between rebuilding the old shape with new algorithms, and rebuilding it with the same discipline that the relying-party side has spent the last decade adopting.

Short lifetimes with ARI make the operational side tractable. Seven-day certificates with ARI-driven renewal turn the PQ migration from a flag day into a moving window. The fleet rotates without an emergency, without anyone touching a machine, because the CA can shorten the renewal window for specific machines or profiles whenever it wants to.
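On the client side, ARI (RFC 9773) reduces to fetching a suggested window and picking a uniformly random time inside it, renewing immediately if that time has already passed. The sketch below shows that selection. The dates and helper names are illustrative, and fetching the renewalInfo resource over ACME is omitted.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// RenewalInfo mirrors the suggestedWindow part of an ARI response body.
type RenewalInfo struct {
	SuggestedWindow struct {
		Start time.Time `json:"start"`
		End   time.Time `json:"end"`
	} `json:"suggestedWindow"`
}

// PickRenewalTime chooses a uniformly random instant inside the window.
func PickRenewalTime(body []byte) (time.Time, error) {
	var ri RenewalInfo
	if err := json.Unmarshal(body, &ri); err != nil {
		return time.Time{}, err
	}
	w := ri.SuggestedWindow
	span := w.End.Sub(w.Start)
	if span <= 0 {
		return time.Time{}, errors.New("empty or inverted renewal window")
	}
	return w.Start.Add(time.Duration(rand.Int63n(int64(span)))), nil
}

// ShouldRenewNow is true once the picked time is no longer in the future.
func ShouldRenewNow(pick, now time.Time) bool { return !pick.After(now) }

// demo models a seven-day certificate issued on Jan 1. The CA first suggests
// renewing on day five, then pulls the window into the past to force rotation.
func demo() (inWindow, waits, forced bool) {
	now := time.Date(2026, 1, 3, 0, 0, 0, 0, time.UTC)
	normal := []byte(`{"suggestedWindow":{"start":"2026-01-05T00:00:00Z","end":"2026-01-06T00:00:00Z"}}`)
	pulledIn := []byte(`{"suggestedWindow":{"start":"2026-01-01T00:00:00Z","end":"2026-01-02T00:00:00Z"}}`)

	pick, err := PickRenewalTime(normal)
	if err != nil {
		panic(err)
	}
	start := time.Date(2026, 1, 5, 0, 0, 0, 0, time.UTC)
	end := time.Date(2026, 1, 6, 0, 0, 0, 0, time.UTC)
	inWindow = !pick.Before(start) && pick.Before(end)
	waits = !ShouldRenewNow(pick, now)

	early, err := PickRenewalTime(pulledIn)
	if err != nil {
		panic(err)
	}
	forced = ShouldRenewNow(early, now)
	return inWindow, waits, forced
}

func main() {
	fmt.Println(demo())
}
```

The random pick spreads a fleet's renewals across the window, and moving the window is the only lever the CA needs to pull to rotate a profile early.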

That is what the next CA looks like. It is not a different CA than the one I have been describing. It is the same one.

A CA Built for the Threat Model We Actually Have

This builds on earlier posts on what attestation actually proves, what confidential computing is and isn’t, and an honest accounting of the problems with the current generation of TEEs. None of those problems go away here. The argument is that despite those limitations, attestation is an important tool. Certificate issuance is overdue to use it.

Back in the 1990s, I was doing some consulting for DigiNotar, yes, that DigiNotar. They had CA facilities in a data center whose perimeter still had WWII-era anti-tank obstacles, large concrete barriers sometimes called “dragon’s teeth.” Of course, this was an artifact of the facility’s history, but data centers are designed from a security perspective with layers of physical protection, including barriers, mantraps, biometrics, individual vaults with cages, individual racks with their own locks and biometrics, cameras, and more. The threat of physical theft, destruction, or manipulation is exactly what these facilities are designed to mitigate.

When building a CA inside one of these facilities, we design yet another layer of protection. Administration networks are segmented from transaction networks, interconnects from supporting infrastructure, the issuance environment from the systems holding root keys. We add our own physical segmentation on top of that so we can build controls around multiple parties being necessary for the more sensitive operations, while still letting routine hardware maintenance happen on the schedule the SLA needs.

These are all useful and important things, but the reality is that CA key material is not likely to be physically stolen. It is more likely to be compromised from the outside. We solve these problems through design, not by writing more code.

Meaningfully measured code forces design upstream of the code itself.

A measurement of a monolithic blob proves almost nothing useful. A measurement that names a specific role, a specific security domain, and a specific assertion the verifier is supposed to act on proves something. The roles, the domain boundaries, and the questions the verification has to answer have to exist before any code is written, or the attestation is a signature on nothing in particular.

In operating system design, we have similar problems. In naive systems, we load cryptographic keys into memory on a running, network-connected system, accidentally exposing ourselves to memory-disclosure bugs where a network attacker can steal keys. Heartbleed is the canonical example, but the class is what matters. We do this because it is simpler and faster, but it is also less secure. As systems designers, we address this by moving those keys out of the process of the network-connected application and into a different user context. That way, an attacker cannot simply get the network-connected service to dump memory. They have to get persistence and cross a kernel-enforced user boundary.

This is old wisdom. Least privilege and privilege separation exist because network-facing code should not also be the thing that controls the keys.

A parallel showed up in early cryptocurrency exchanges. Hot wallets were used as signing oracles because the design and deployment work needed to prevent that had not been done, and many of the high-profile compromises of that era trace back to that gap. The exchanges that survived learned to put boundaries between the wallet and the network. That boundary did most of the work. The dragon’s teeth around the building did the rest.

When third parties need to rely on external services operating in these environments, they often rely on auditors to attest that management assertions about operational practices are actually being followed. These assessments are usually performed by CPAs, not security specialists, which can limit their value. They also often rely on sampling a small portion of transactions to confirm that the controls being evaluated are being followed. That sample is drawn from evidence provided by the entity being audited, which is also the party paying for the audit.

All of the things discussed above help bring some minimal level of transparency and verifiability, but it is turtles all the way down, layer on layer, none of them reaching the runtime where the actual compromise happens. This is where confidential computing, and solutions like Private Cloud Compute, start to matter.

Policy changes meaning in this model. In the traditional assurance world, policy is a written promise. The CA publishes a CP or CPS, the operator commits to following it, and the auditor samples evidence to decide whether that promise was kept. In a runtime-evidence model, policy becomes part of the mechanism. A measured binary evaluates a specific policy, produces a decision, and the digest of that policy travels with the evidence. The shift is from policy as promise to policy as enforcement, from “trust us, this is what we do” to “this is the policy the measured system actually applied.”

Apple’s Private Cloud Compute is the worked example. PCC nodes attest to the binary they are running, refuse to do work for clients that cannot verify that attestation, and publish every production build for public inspection. The user’s device, not Apple, decides whether a given node is acceptable. That inversion, the relying party verifying the service rather than the service asserting to the relying party, is the part of the pattern that matters. The pieces are not new individually. The combination, at the scale Apple shipped it, proves the pattern is real. The third-party security reviews prove the architecture is serious. Attacks on confidential computing do not refute that point. They prove there is now a boundary worth attacking, measuring, and improving.

Apple is not the only proof point. Signal used SGX remote attestation for private contact discovery in 2017, with clients verifying that the enclave was running the expected open-source code. WhatsApp’s end-to-end encrypted backups use an HSM-based Backup Key Vault to keep recovery keys out of the ordinary service path, and that design was publicly reviewed by NCC Group. Microsoft’s Confidential Consortium Framework powers Azure Confidential Ledger. Different systems, different threat models, same direction of travel. High-assurance services are moving from institutional assurances toward runtime evidence.

What it looks like

Concretely, a CA built on the Private Cloud Compute pattern looks like this.

Issuance is split into two attested components. The first, the registration authority, takes the certificate request, resolves identity from authoritative sources, evaluates the issuance policy, and produces a signed authorization context. The second, the signing oracle, holds the CA private key and produces the signature. Each runs in a separate attested enclave, and each is measured separately. This means policy can evolve without re-measuring the key-custody component, and keys can rotate without re-measuring the policy component.

The policy layer matters here too. Each component is not just running code, it is making a verifiable policy decision before it acts. The RA decides whether the request is authorized and the identity evidence is sufficient. The signing oracle decides whether the RA, the request, and the authorization context are acceptable before it signs. The evidence does not just say which binary ran. It also says which policy that binary evaluated.

The two components do not trust each other because they are on the same network. They trust each other through attestation, mutually verified at every connection, and the signing oracle does not merely accept the RA’s conclusion. Before it signs, it independently verifies the RA’s attestation, checks that the authorization context is fresh, confirms that the request is bound to an allowed profile, and verifies that the policy facts asserted by the RA match the evidence presented to the oracle. A compromised RA, even one with its own signing key, does not get to mint an out-of-profile certificate, bypass attestation, or turn the CA key into a general-purpose signing oracle.
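The oracle-side checks described above can be sketched roughly as follows. This is an illustration under stated assumptions: attestation verification is reduced to a lookup of an already-verified measurement, and every name and constant here (`AuthorizationContext`, `ACCEPTED_RA_MEASUREMENTS`, the 60-second window) is hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorizationContext:
    ra_measurement: str   # measurement taken from the RA's verified attestation
    profile: str          # certificate profile the request is bound to
    policy_digest: str    # digest of the policy the RA says it evaluated
    issued_at: float      # seconds since the epoch


# Hypothetical published values; a real deployment would load these from
# the CA's public commitments rather than hardcode them.
ACCEPTED_RA_MEASUREMENTS = {"ra-image-v1"}
ALLOWED_PROFILES = {"device-auth"}
ACCEPTED_POLICY_DIGESTS = {"policy-digest-v1"}
MAX_AGE_SECONDS = 60


def oracle_checks(ctx: AuthorizationContext, now=None) -> list:
    """Return the names of failed checks; an empty list means signing may proceed."""
    now = time.time() if now is None else now
    failures = []
    if ctx.ra_measurement not in ACCEPTED_RA_MEASUREMENTS:
        failures.append("ra_attestation")
    if not (0 <= now - ctx.issued_at <= MAX_AGE_SECONDS):
        failures.append("freshness")
    if ctx.profile not in ALLOWED_PROFILES:
        failures.append("profile")
    if ctx.policy_digest not in ACCEPTED_POLICY_DIGESTS:
        failures.append("policy")
    return failures
```

The design point is that the oracle evaluates every condition itself and signs only when the list is empty, so a compromised RA cannot talk it into an out-of-profile signature.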

The CA private key is not loaded by the operator. It is held by a custodian, a hardware security module, a cloud KMS, or another enclave, and it is wrapped so that the custodian will release it only to a signing oracle whose attestation matches a published image. The list of acceptable images is small, public, and updated through a documented process. An operator who runs a different binary, however benign the reason, does not get the key. The dragon’s teeth around the data center are still there. They no longer have to do the whole job.
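As a rough illustration of attestation-gated key release, the sketch below models the custodian as an object that hands over key material only when the presented measurement is on the published list. The class and its fields are hypothetical, and it assumes the measurement has already been extracted from a verified attestation document.

```python
class KeyCustodian:
    """Toy custodian: releases key material only to a published image."""

    def __init__(self, key_material: bytes, published_measurements):
        self._key = key_material
        # The published list is small and public, so a frozen set is enough here.
        self._published = frozenset(published_measurements)

    def release(self, attested_measurement: str) -> bytes:
        # Assumption: the caller has already verified the attestation signature
        # and is passing the measurement taken from that verified document.
        if attested_measurement not in self._published:
            raise PermissionError("measurement is not on the published image list")
        return self._key


# Hypothetical values for illustration.
custodian = KeyCustodian(b"wrapped-ca-key", {"oracle-image-v1"})
```

An operator running `oracle-image-v1-debug`, however well intentioned, gets a `PermissionError` rather than the key.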

Both the RA binary and the oracle binary are built from public source and are reproducibly buildable. Anyone can rebuild from the published sources, compare their measurement to the one in the attestation, and confirm that the two match. This is the part of the model that makes trust mean something specific. Not the operator’s word, not the auditor’s snapshot, not the CA’s policy statement, but the build process and the published source. To verify which code actually performed a particular issuance, you would not need to be admitted to the data center. You would need a compiler.
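The rebuild-and-compare step can be sketched in a few lines. This assumes the measurement is a plain SHA-384 of the image file, which is a simplification; real platforms define their own measurement schemes, and the function names are hypothetical.

```python
import hashlib


def measure_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash an image file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha384()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rebuild_matches(rebuilt_image_path: str, attested_measurement: str) -> bool:
    """True when a locally rebuilt image hashes to the attested measurement."""
    return measure_file(rebuilt_image_path) == attested_measurement.lower()
```

The comparison only means something if the build is reproducible; otherwise an honest rebuild produces a different hash and the check says nothing about the operator.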

Each issued certificate is accompanied by a portable evidence bundle, signed by the attested issuance system. The bundle names the binary that produced the signature, the attestation root that vouched for the binary, the RA policy decision, the oracle policy decision, the identity assertion the RA accepted, and the inputs the oracle independently verified before signing. A relying party who trusts the chip vendor’s attestation root can determine for themselves whether the issuance was performed by code on the published list, against the policy on the published list, by an RA that accepted the identity claim it claimed to accept. The CA is not asked to be trusted. The CA is asked to produce evidence.
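A sketch of how a relying party might check such a bundle. The field names are hypothetical, and an HMAC with a shared key stands in for the real signature, which would be asymmetric and chained to the attestation root; the point is the order of checks, not the cryptography.

```python
import hashlib
import hmac
import json


def canonical(obj: dict) -> bytes:
    """Stable encoding so signer and verifier hash the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_bundle(bundle: dict, key: bytes) -> str:
    # Stand-in only: a real bundle would carry an asymmetric signature.
    return hmac.new(key, canonical(bundle), hashlib.sha256).hexdigest()


def verify_bundle(bundle: dict, signature: str, key: bytes,
                  published_binaries: set, published_policies: set) -> bool:
    """Check the signature first, then compare contents to the public lists."""
    if not hmac.compare_digest(sign_bundle(bundle, key), signature):
        return False
    if bundle["ra_measurement"] not in published_binaries:
        return False
    if bundle["oracle_measurement"] not in published_binaries:
        return False
    digests = {bundle["ra_policy_digest"], bundle["oracle_policy_digest"]}
    return digests <= published_policies
```

Note that a validly signed bundle can still be rejected: the signature proves who produced the evidence, and the published lists decide whether that evidence is acceptable.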

None of this removes the HSM, the auditor, or the operator. The HSM is still excellent at the threat it was built for, and a custodian holding a key wrapped to an attestation policy is still doing HSM work under the hood. The auditor is still needed to attest that the published policy is sensible, that the source matches the binary, that the threat model is honest, and that the runbook is followed in the moments where attestation cannot help. The operator is still needed to run the infrastructure and respond when things break.

What changes is what they are asked to prove.

Today, a relying party mostly gets institutional assurances. The CA says it followed its policy. The auditor samples evidence and says the controls were operating. The operator says the production system was the one described. Those are useful assurances, but they are indirect. They do not let the relying party inspect the actual path between a request, a policy decision, and a signature.

A Private Cloud Compute style CA changes that. It turns the issuance path itself into evidence. The question is no longer only whether the CA says it followed the rules. The question becomes which measured binary evaluated this request, which measured binary signed it, which policy digest was used, which identity evidence was accepted, what validation methods were used during issuance, and whether all of that matches the public commitment the CA made.

When the source is open and reproducibly buildable, that evidence includes a hash of the code that made the decision and signed attestations about the runtime elements that went into that decision. When the code is not open source, third parties can come in and validate the source, the build process, and the correctness of the claims, as happened with Apple’s Private Cloud Compute. The public hashes then let others verify that the code claiming to provide these guarantees is, in fact, the code that ran.

Open source is not magic, and the point is not faith in “many eyes.” The point is that this shifts the emphasis from betting on physical security and operational practice audits to secure system design and cryptographic evidence about what code actually ran and what it actually did.

That is the threat model mismatch, and it is not only a CA problem. We built the WebPKI around buildings, cages, ceremonies, HSMs, and audits because those were the tools we had. We did the same thing cryptographically. We built systems around the assumption that factoring large composites and solving discrete logs on elliptic curves were out of reach. Q-day changes that assumption. Runtime compromise changes the operational assumption just as fundamentally.

We apply the same instincts in any environment we want to call high-assurance. They still matter, but most of the failures we care about are not physical failures. They are logical, remote, operational failures in the runtime path. The rate of change makes that gap wider every year. Annual audits are retrospective, and between them systems change thousands of times, so what the auditor described is rarely what is actually running when a relying party sees a certificate.

Cryptography turns security problems into key-management problems. AI turns assurance problems into runtime-evidence problems. Once agents are making decisions, calling tools, and changing state, the question is no longer what policy you wrote or what control an auditor sampled. The question is what actually ran, what it saw, what boundary contained it, what policy constrained it, and what evidence survived execution.

A Private Cloud Compute style CA gives us a way to make that path visible, attestable, and independently verifiable. The same pattern applies wherever the gap between what we say a system does and what it actually does at runtime matters.

The First AI-Built Zero-Day Is Not the Interesting Part

In the mid-90s I worked at a company called Cybersafe. Today it would get labeled an IAM/SSO vendor. What we actually built was a first-generation security platform: Kerberos, password management, PKI-based MFA, key management, host intrusion detection, and what would now be called zero trust access. The company failed for the usual startup reasons. People. Corporate politics. Timing. The technology was a decade ahead of its market.

One debate from that period has stayed with me. As we expanded into host intrusion detection, the question of automated response kept surfacing. Could a system safely act on its own to contain an intrusion in progress? Drop a connection. Kill a process. Isolate a host. Nobody on the team could imagine a credible answer. The false positive risk was unbounded. The response itself could be weaponized. The rule sets were not trustworthy enough to delegate authority. We shipped detection and let humans make the call.

That debate has an answer now, and it is not the one we expected. Automation on the offensive side is not new. Worms, exploit kits, credential stuffing, and phishing infrastructure have been automated for decades. What is new is broad delegated judgment at machine speed, in the hands of people who do not have to worry about false positives because the blast radius is somebody else’s network.

What the report actually shows

The interesting question is not whether AI helped produce a zero-day. That was inevitable. The interesting questions are operational. What kinds of systems make bad machine judgment cheap enough to deploy at scale. What kinds of defensive systems are still pretending human review is the control boundary.

Google Threat Intelligence Group’s latest AI Threat Tracker report documents the first zero-day exploit that GTIG says it has high confidence was developed with AI assistance. The headline framing is technically correct. The specifics tell a more interesting story.

The exploit was a Python script that bypassed 2FA on an open-source web-based system administration tool. It required valid user credentials in the first place. The criminal group planned a mass exploitation campaign, and Google disrupted it through responsible disclosure to the vendor. GTIG identified the artifact as AI-developed because the code carried obvious tells. A hallucinated CVSS score. Textbook Python formatting. Detailed help menus. Educational docstrings characteristic of training data. The artifact still carried the seams of its production.

This is not the LLM failing at the hard part. The vulnerability itself is a real find. GTIG specifically notes that the 2FA flaw stems from a hardcoded trust assumption, a high-level semantic logic flaw of the kind that fuzzers and static analyzers tend to miss but that frontier LLMs can reason about by reading developer intent. The model did discovery work that previously required a competent human auditor. Where the operation broke down was in weaponization. The attacker shipped an artifact that still looked like a tutorial.

This is a familiar failure pattern showing up on the offensive side for the first time. Fluency reads as competence. The attacker trusted an artifact with hallucinated metadata and educational comments still attached because it looked like a real exploit, in the same way over-eager engineering teams hand agents production credentials because the agent sounded like it knew what it was doing. The criminals here got bitten by the same dynamic that has been producing outages and data loss in vibe-coded production systems for the last eighteen months. The substrate is doing some of the work of inviting the misconfiguration.

Hultquist’s thread on the report is hedged correctly. The importance is the trajectory, not this specific specimen. Pull the camera back and the rest of the report is more interesting than the lede.

Three things worth surfacing

APT45 sending thousands of repetitive prompts. The North Korean group has been observed using recursive prompting to analyze CVEs and validate proof-of-concept exploits at scale. That is the industrial-scale answer to LLM variance. Solve the quality problem by amortizing across volume, then have humans cherry-pick the outputs that survived validation. The same statistical strategy that makes modern fuzzing work, applied one layer up the stack. The model does not have to be reliable. The pipeline has to be cheap enough that unreliability does not matter.

CANFAIL and LONGSTREAM using LLM-generated decoy code. A Russia-nexus intrusion cluster has been deploying malware that uses LLM-generated code to conceal malicious functionality. GTIG documented LONGSTREAM containing 32 instances of code querying the system’s daylight saving status, repetitive benign-looking activity used to camouflage the malicious core. CANFAIL carries similar filler logic with LLM-generated comments self-describing the decoy blocks. The stylistic noise of LLM output is becoming the obfuscation layer. The verbose docstrings. The textbook structure. The over-explained variable names. These used to be tells. They are now camouflage. Any heuristic built on the AI-tell will start producing false negatives.

The wooyun-legacy skill plugin. A specialized GitHub repository is being distributed as a Claude Code skill plugin that integrates a distilled knowledge base of over 85,000 real-world vulnerability cases from the Chinese bug bounty platform WooYun (2010 to 2016). This is the supply side of the same market. Skill packs are tooling. Tooling gets distributed. The economic logic for adversarial skill packs is identical to the economic logic for legitimate ones. Any platform hosting them inherits a familiar problem, one that app stores and package registries have been working through for two decades: making trust decisions at distribution scale about code from parties you cannot directly inspect.
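The arithmetic behind the APT45 observation, that volume can substitute for reliability, is worth making concrete. The sketch below assumes each prompt succeeds independently with the same probability, which is a simplification, and the numbers used with it are illustrative rather than taken from the report.

```python
import math


def p_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1.0 - (1.0 - p) ** n


def attempts_needed(p: float, target: float) -> int:
    """Smallest n for which p_at_least_one(p, n) reaches the target."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

Under these assumptions, a model that produces a usable result only one time in a hundred still becomes near certain to produce one once the pipeline runs a few hundred attempts, which is why the cost per attempt matters more than the per-attempt quality.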

Both sides are running on the same substrate

On the defensive side, Google is using Big Sleep to find vulnerabilities and CodeMender (Gemini-driven) to fix them automatically. The criminals are pulling from a model class indistinguishable from the one Google is running its defensive tooling on. Both sides have access to the same substrate. The differential collapses to data quality, harness sophistication, and discipline around permissions.

That last one is the part the 90s HIDS conversation did not anticipate. It is also the part that should be the least surprising. The controls discipline did not get easier because the platform got more capable. If anything the gradient got worse. A confused regex IDS in 1999 had a bounded action space. The rule set was enumerable. You could write down what it would do wrong. A confused agent in 2026 has whatever action space its credentials grant it, which in most deployments is more than it should. The fluency that made it easy to give the agent broad permissions in the first place is exactly the property that makes its failures look reasonable in the moment.

The race Hultquist refers to is real, and it has started. The race is not about model capability. Both sides are running models from the same vendors, often the same model. The race is about who has better-curated data feeding their harnesses. Who has stricter discipline around what their automation can touch. Who has the institutional memory of what happens when you delegate authority to a system whose judgment you cannot audit in advance.

The HIDS debate from the mid-90s got an answer. It came from the other side of the wire. Not because defenders learned how to trust autonomous judgment, but because attackers learned they did not need to. They could delegate broadly, externalize the blast radius, and let volume compensate for judgment. The defensive answer cannot be more vibes, broader credentials, and better prompts. It has to be the inverse. Narrower authority. Better harnesses. Replayable decisions. And institutional memory about what happens when fluent systems get mistaken for trustworthy ones.
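One way to picture "narrower authority" and "replayable decisions" together is a harness that only dispatches actions from an explicit allowlist and records every decision, allowed or refused, in a hash-chained log. This is a sketch under assumptions: the `Harness` class, the action names, and the log format are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64


def _entry_hash(entry: dict) -> str:
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode("utf-8")).hexdigest()


class Harness:
    """Dispatches only allowlisted actions and logs every decision."""

    def __init__(self, allowed_actions: dict):
        self._allowed = dict(allowed_actions)
        self.log = []
        self.head = GENESIS

    def act(self, action: str, args: dict):
        allowed = action in self._allowed
        # Record the decision before acting, chained to the previous entry.
        entry = {"action": action, "args": args, "allowed": allowed, "prev": self.head}
        self.log.append(entry)
        self.head = _entry_hash(entry)
        if not allowed:
            raise PermissionError(f"action {action!r} is outside this harness's authority")
        return self._allowed[action](**args)


def verify_log(log: list, expected_head: str) -> bool:
    """Recompute the chain; any edited, dropped, or reordered entry breaks it."""
    head = GENESIS
    for entry in log:
        if entry["prev"] != head:
            return False
        head = _entry_hash(entry)
    return head == expected_head
```

The action space is enumerable again, which is the property the 1999 regex IDS had and most agent deployments gave up. The chain only helps if the head is stored somewhere the agent cannot rewrite.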

AI Is Not Why They Are Cutting (Yet)

Back in 2000, the rule of thumb at Microsoft was that each employee needed to average roughly $600K in top-line revenue. Inflation adjusted, that is about $1.1M to $1.2M today. Microsoft was a high-margin software monopoly at peak, so it is not a universal benchmark, but it gives a sense of what disciplined operating leverage looked like even at a company printing money.

Over the last decade, and especially during the COVID-era zero-rate and QE environment, many companies responded to dysfunction by hiring around it instead of fixing it. Cheap capital reduced the pressure to make hard operating decisions. Necessity is the mother of invention, but cheap money suppressed that necessity for a long time.

Then two things changed at roughly the same time. Rates went from near zero to roughly five percent, and Section 174 of the tax code stopped letting companies expense software developer salaries in the year incurred. The R&D amortization rule from TCJA kicked in for the 2022 tax year, forcing five-year amortization domestically and fifteen years for work done offshore. At the exact moment capital got expensive, a major software-company cost center became less friendly from a cash-tax and after-tax economics perspective.

Now AI has added a new pressure. Companies are adopting AI quickly, but we are still early. Much of what is happening inside enterprises is still R&D, experimentation, platform buildout, workflow redesign, and internal tooling. That work is not free. It comes with token costs, infrastructure commitments, GPU capacity, vendor contracts, and a lot of expensive trial and error.

Jensen Huang has made the point, in characteristically aggressive form, that if he pays someone $500K, he expects them to use a meaningful amount of compute to become more productive. Whether or not you take the specific numbers literally, and you probably should not since Nvidia sells the machines that consume those tokens, the economic point matters. AI spend has to come from somewhere.

That is the part many layoff narratives miss. Companies are not simply replacing workers with AI. They are also reallocating budget toward AI. Token budgets, model access, inference costs, internal AI platforms, data infrastructure, and R&D commitments are becoming real line items. To fund them, companies are looking at the headcount they accumulated under different interest-rate assumptions, different tax assumptions, and a different view of software demand.

There is also a demand-side story. COVID pulled years of enterprise software adoption into eighteen months, and a lot of what gets reported as growth now is ARR rotating through M&A rather than new logos landing. In parts of the market, revenue is moving around as much as it is expanding.

That is the real backdrop for the wave of layoffs. AI is the story being told on earnings calls. The reality is accumulated management debt finally meeting a cost of capital that punishes it. Layers of process. Unclear ownership. Duplicated work. Headcount that grew faster than execution improved. And now, on top of that, companies need to make room for a new class of AI-related spend.

The pressure also lands hard on old farts like me. We are expensive. And to be honest, some of us (not all) do not want to change how we work or keep up with how the technology is evolving. That makes us easy targets when finance needs to hit a cost number. AI gives the story a forward-looking sheen, but the underlying move is simpler: reduce expensive headcount, flatten layers, correct years of operational laziness, and redirect budget toward the new thing everyone believes they must fund.

AI is real. The layoff narrative around it usually is not. When you read a layoff announcement blaming AI, you are mostly reading a press release about cost of capital, tax policy, demand pull-forward, AI infrastructure spend, and an org chart that finally got too expensive to defend.

Read the 10-Qs, not the blog posts.

Smaller, Provable, and on Hardware You Own and Operate

Dino Dai Zovi made an argument recently that I want to build on.

“If you agree that AI will help attackers discover and exploit vulnerabilities 10-100x more easily, then your excess attack surface has also just become 10-100x more of a liability. The right defensive strategy is to prioritize reducing attack surface and trusted computing bases.”

The argument is right. It is also not new.

We have been working on this problem for fifty years

Operating system designers gave this set of principles a name in 1975. Saltzer and Schroeder published The Protection of Information in Computer Systems and laid out economy of mechanism, least privilege, separation of privilege, complete mediation, fail-safe defaults, and open design. The Orange Book formalized “trusted computing base” a few years later, with the central observation that the security of a system depends on what is inside the TCB, and that smaller TCBs are easier to make trustworthy than larger ones. The microkernel debate that ran from Mach through L4 was an argument about how aggressively to apply these principles to commodity systems. seL4 went further and produced a formally verified microkernel in 2009, demonstrating that the principles could be pushed all the way to mathematical proof.

The same ideas show up everywhere once you look. Chrome’s site isolation is privilege separation applied to the browser. OpenBSD pledge and unveil are least privilege applied to userland. Linux namespaces, capabilities, and seccomp are mediation primitives. CHERI takes the same intuitions down into the instruction set. GlobalPlatform Security Domains are the smart-card-world version of compartmentalized trust, with separate keysets, separate trust roots, and isolation between issuers, verifiers, and applications on the same chip.

None of this is new vocabulary. Security domains. Privilege separation. Attack surface reduction. Trusted computing bases. We have known the names of these things for decades, and we have known what to do about them.

What AI changes is the math, not the principles. Excess privilege has always been a liability. The probability of it mattering on any given day was low enough, and the timescale on which it mattered was long enough, that organizations could carry oversized TCBs and broad blast radii in the backlog as “things we should clean up someday.” AI compresses the timescale and raises the probability. The slack that was tolerable on a five-year cleanup horizon is not tolerable on a six-month one. Dai Zovi’s 10-100x is a multiplier on the cost of carrying slack, not a discovery about whether slack should be carried.

The OS tradition assumed you owned the layer below the boundary

There is one place where the classical OS framework needs an extension before it covers the world we are actually deploying into.

The kernel could enforce process isolation because the kernel was below the processes. The hypervisor could enforce VM isolation because the hypervisor was below the VMs. The trust property was “I control the layer below the boundary, so the boundary is meaningful to me.” Every classical OS-level guarantee depends on that.

Cloud broke that assumption. AI workloads, which run on cloud GPUs and orchestration infrastructure that almost nobody owns, intensify the break. The layer below your workload is operated by someone else. Their hypervisor, their firmware, their physical facility, their scheduling. The classical principles still apply, but their enforcement mechanism is gone.

Reduction is necessary. Reduction is not sufficient. Once you have shrunk the attack surface and the TCB to something defensible, you still have to prove that the small thing you reduced to is the small thing actually running, and that what it just did is what you said it would do. Without that proof, the small thing is functionally indistinguishable from the large thing. An attacker who replaces your tiny attested signing service with a tiny lookalike has bought themselves all the same access at a lower cost.

The defensive posture in an AI-leverage world is not just smaller. It is smaller and provable.

Law #3 did not go away

There is also one law older than the OS-design principles that the cloud security pitch of the last decade has spent a lot of energy pretending to repeal.

Microsoft’s Ten Immutable Laws of Security were published by Scott Culp in 2000. Law #3 is the relevant one here. If a bad actor has unrestricted physical access to your computer, it’s not your computer anymore. The marketing for confidential computing has, in effect, been an extended argument that hardware-encrypted memory and remote attestation make Law #3 obsolete on cloud infrastructure. They do not, and the research record is clear that they will not.

Cloud TEEs share microarchitectural resources with the hypervisor and with co-tenants. That is what produces the side-channel catalog. Cloud providers have physical access to every server they operate. That is what produced TEE.Fail. Hardware roots of trust have a shelf life because they live on the same silicon as everything else, and that silicon is in the operator’s possession. None of these properties are bugs. They are what “running on hardware somebody else owns” means.

Server-side cloud TEEs are useful for narrow, bounded properties. They are not useful for repealing Law #3 against a determined operator, and they will not meaningfully defeat multi-tenant side channels at the scale at which they are deployed. Selling them as if they would is what produces the gap between marketing and engineering that I have been writing about for the last year in Confidential Computing’s Inconvenient Truth, What Is Confidential Computing, What It Isn’t, and How to Think About It, and TPMs, TEEs, and Everything In Between.

The criticism in those pieces is specific. It is about the gap between what cloud TEEs are sold as doing (defeating the operator) and what they actually do (making narrow verifiable claims to relying parties about specific operations). The criticism is not that the underlying assurance technology is useless. The technology delivers exactly what it was originally designed to deliver, in the contexts where the original threat model holds. The marketing has run well past those contexts.

Where the assurance property actually delivers

The assurance property does deliver, where the model fits. The model fits when the hardware is in the user’s possession, when the device is discrete and tamper-resistant, and when attestation is used to prove “the key in this request lives on this specific device and has never left it” rather than to prove “the operator of the rack cannot read your memory.” That is the threat model the technology was designed for, and it has been working in production for a long time.

A few examples of the pattern done honestly.

YubiKey PIV attestation. The YubiKey can produce an attestation certificate, signed by Yubico’s manufacturer key, asserting that a private key was generated on this YubiKey, has the slot and policy attributes you expect, and is non-exportable. Yubico documents the protocol clearly. The trust property is sharp because the device is sharp. Discrete silicon, tamper-resistant package, manufacturer chain you can pin against. Law #3 still applies, and it cuts the right way: the user has unrestricted physical access to the YubiKey, and the YubiKey is the user’s computer.

Apple Secure Enclave for SSH agents. Paprika and Secretive are SSH agents that store the private key in the Mac’s Secure Enclave Processor. The application processor never sees the key, and even root on the Mac cannot extract the key material. Root can still cause the key to be used through the legitimate signing API, modulo whatever consent prompts apply, but extraction itself is what the SEP boundary is built to defeat. The user owns the laptop, the key is on a physically separated processor on the same SoC, and the threat model (other applications on the same device, or malware that compromises the application processor) matches what the SEP was built for.

Smart cards and HSMs. GlobalPlatform Security Domains, the Yubico PIV applet, hardware-backed PKCS#11 tokens, FIPS 140-3 Level 3 modules. Discrete silicon, tamper-resistant packaging, attestation chains rooted in manufacturer keys. The model that worked in the late 1990s and that still works today, because the threat model has not drifted.

PeculiarVentures/attestation is the verification side of all of this. Parsing, validating, and reasoning about attestation evidence from these various sources. Attestation without a verifier is a claim. Attestation with a verifier is something the relying party can act on.

The common shape across all of these is that the user owns the hardware, the boundary is physical, and the attestation chain anchors in a manufacturer key whose threat model the user can actually evaluate. Law #3 is honored rather than denied.

Transparency is the other cross-machine extension

There is a second extension of the classical OS-design tradition that matters for the AI-leverage world, and that composes with attestation in important ways.

Saltzer and Schroeder’s open design principle says the security of a system should not depend on the secrecy of its mechanism. The cryptography community has applied this rule to algorithms for decades. The systems community has been slower to apply it to operations. “What is the rack actually doing right now?” and “What has it done in the past?” are operational questions, and historically the answer was “trust the operator’s audit logs.”

Transparency logs are the operational extension of open design. The idea is to publish what a system is doing to an append-only public log, with cryptographic proofs that the log cannot be retroactively modified, and to design the relying party to require evidence from the log before trusting any operation. Multiple independent witnesses cosign the log so that no single party can serve different views of reality to different relying parties.
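The "cryptographic proofs" here are Merkle tree proofs. The sketch below follows the tree hashing and inclusion-proof definitions from RFC 6962, the Certificate Transparency specification, using a straightforward recursive form rather than the optimized iterative one; the function names are mine, and a production verifier would also check the signed checkpoint that commits to the root.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split(n: int) -> int:
    """Largest power of two strictly less than n (n >= 2)."""
    return 1 << ((n - 1).bit_length() - 1)


def merkle_root(leaves: list) -> bytes:
    if len(leaves) == 0:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def inclusion_proof(index: int, leaves: list) -> list:
    """Sibling hashes from the leaf level up to the root."""
    if len(leaves) == 1:
        return []
    k = split(len(leaves))
    if index < k:
        return inclusion_proof(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return inclusion_proof(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def root_from_proof(leaf: bytes, index: int, size: int, proof: list) -> bytes:
    """Recompute the root implied by a leaf hash and its proof."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    if size == 1:
        if proof:
            raise ValueError("proof too long")
        return leaf
    if not proof:
        raise ValueError("proof too short")
    k = split(size)
    *rest, sibling = proof
    if index < k:
        return node_hash(root_from_proof(leaf, index, k, rest), sibling)
    return node_hash(sibling, root_from_proof(leaf, index - k, size - k, rest))


def verify_inclusion(data: bytes, index: int, size: int, proof: list, root: bytes) -> bool:
    try:
        return root_from_proof(leaf_hash(data), index, size, proof) == root
    except ValueError:
        return False
```

The relying party needs only the entry, its index, the tree size, a handful of sibling hashes, and a root it already trusts. It never has to download the log.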

The pattern is in production at scale. Certificate Transparency requires every WebPKI certificate to be logged publicly before browsers will trust it, which converts CA misissuance from “discovered by accident, sometimes” into “discovered by anyone watching the log.” Sigstore applies the same model to software signing, with every signature published to Rekor and consumers able to require log inclusion before accepting a binary. Google DeepMind’s Verifiable Data Audit was an early attempt to apply the same model to data access in healthcare. The infrastructure is consolidating at transparency.dev, and C2SP standardizes the interoperability primitives: tlog-tiles, the witness and cosignature protocols, signed-note, and static-ct-api.

Attestation tells a relying party “this code is running right now.” Transparency tells a relying party “this code has been published, reproduced, and witnessed by parties whose collusion would be visible.” The two compose. Apple’s Private Cloud Compute is the most prominent recent example. Every production build is published to a transparency log, user devices will only communicate with nodes whose attested measurement matches the log, and Apple released a virtual research environment so anyone can verify the build claims independently. Google’s Project Oak was an earlier expression of the same combination, building remote attestation against publicly-published binaries as the foundation of trust. The Merkle Tree Certificates draft, now a working group document in the IETF’s new PLANTS working group, extends the same logic to TLS at scale, replacing traditional X.509 issuance with batched, transparency-native cert formats designed for the shorter lifetimes the WebPKI is moving toward.

The relevant property for the AI conversation is that transparency lowers the bar from trusting every party to trusting any one of them. With attestation alone, you trust the manufacturer of the silicon and take the operator’s account of what was published on faith. With transparency, you need only one of the witnesses to be honest, plus the manufacturer of the silicon. That asymmetry is what makes transparency the right tool for environments where the operator might be the adversary.
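A rough sketch of the witness side of that argument. A relying party accepts a log checkpoint only when enough known witnesses have cosigned it, so showing two relying parties different checkpoints requires every witness in the quorum to sign both. HMAC with per-witness keys is a stand-in used only to keep the example self-contained; real witnesses produce asymmetric cosignatures, and all names here are hypothetical.

```python
import hashlib
import hmac


def cosign(witness_key: bytes, checkpoint: bytes) -> str:
    # Stand-in for a real witness cosignature over the checkpoint.
    return hmac.new(witness_key, checkpoint, hashlib.sha256).hexdigest()


def count_valid_cosignatures(checkpoint: bytes, cosignatures: dict, witness_keys: dict) -> int:
    """Count cosignatures that verify under a witness the relying party knows."""
    valid = 0
    for name, signature in cosignatures.items():
        key = witness_keys.get(name)
        if key is not None and hmac.compare_digest(cosign(key, checkpoint), signature):
            valid += 1
    return valid


def checkpoint_accepted(checkpoint: bytes, cosignatures: dict,
                        witness_keys: dict, quorum: int) -> bool:
    return count_valid_cosignatures(checkpoint, cosignatures, witness_keys) >= quorum
```

With a quorum of two out of three witnesses, any two accepted checkpoints share at least one witness, so a single honest witness that refuses to sign inconsistent views is enough to expose a split.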

What this leaves for server-side TEEs

Bounded usefulness, designed honestly.

Server-side cloud TEEs do not defeat the operator. They produce narrow verifiable claims that a relying party can check against their own trust anchors. This signing service ran this image at this measurement. This certificate was produced by this enclave for this RA. This policy was applied. This key was attested as non-exportable by the HSM that signed. Each of those is a useful property. None of them is “the operator cannot see your data.” Building an architecture that pretends otherwise is how organizations end up with a single point of failure they did not know they had.

I have been building GoodKey CA as a worked example of the bounded-usefulness pattern. A certificate authority is a useful test case for this kind of architecture, because the trust property is sharp and the threat model is well understood. The shape of the answer is mostly classical OS design pulled across machine boundaries, with hardware-anchored trust at the endpoints and a deliberately bounded intermediary in the middle.

Each enclave is a security domain. RA, CA, and HSM are independent compartments. Each has its own measured image, its own keys, and its own attested boundary. Compromising one does not compromise the others. Privilege is separated by design rather than by policy.

The TCB inside each domain is small enough to characterize. Each enclave runs a single-purpose deterministic image. The measurement is one number. The image is reproducible from source. There is no general-purpose runtime to subvert and no orchestration sidecar to gain a foothold from. AWS Nitro Enclaves were the deliberate choice over SGX or TDX. The architecture uses VM-level isolation with dedicated CPU and memory rather than carving enclaves out of shared-cache, shared-core silicon, which reduces a large class of the microarchitectural side-channel exposure that the SGX and TDX families have to grapple with. Dedicated resources, minimal hypervisor, deterministic measurement.

Mediation is complete and inside the boundary. Every signing operation goes through the policy evaluator (Cedar) inside the enclave. Authorization is part of what is attested, not external to it. A compromised RA cannot lie about what policy was applied, because the policy evaluation was inside the measurement.

Trust is not transitive. When the RA tells the CA that a client attestation passed, the CA does not believe it. The CA re-runs the verification itself, against its own registered verifier, before signing anything. This is the cross-machine version of “the kernel does not trust userland’s claim that a syscall is authorized.” The CA does the check itself, every time.
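A minimal sketch of that rule, under stated assumptions. The names SigningRequest, ca_should_sign, and toy_verifier are hypothetical and are not taken from GoodKey CA. The toy verifier is a stand-in: a real one would validate a hardware-rooted attestation chain. The point it illustrates is only that the RA's claim travels with the request and never enters the decision.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SigningRequest:
    csr_der: bytes           # the certificate request to be signed
    client_evidence: bytes   # raw attestation evidence from the client
    ra_says_verified: bool   # the RA's claim; carried along, never trusted


# Hypothetical verifier shape: (evidence, csr_der) -> bool.
Verifier = Callable[[bytes, bytes], bool]


def ca_should_sign(req: SigningRequest, verify: Verifier) -> bool:
    """Decide using only the CA's own verification of the raw evidence."""
    return verify(req.client_evidence, req.csr_der)


def toy_verifier(evidence: bytes, csr_der: bytes) -> bool:
    """Stand-in check for illustration: evidence must equal SHA-256(csr_der)."""
    return evidence == hashlib.sha256(csr_der).digest()


csr = b"example-csr"
honest = SigningRequest(csr, hashlib.sha256(csr).digest(), ra_says_verified=True)
forged = SigningRequest(csr, b"not-real-evidence", ra_says_verified=True)

print(ca_should_sign(honest, toy_verifier))
print(ca_should_sign(forged, toy_verifier))
```

The forged request carries the same "verified" flag as the honest one, and the flag makes no difference to the outcome.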

Per-operation attestation, not per-boot attestation. The CA produces a fresh Nitro attestation for every certificate it signs, with user_data set to SHA-256(certDER || raKeyFingerprint). That binds this specific certificate to this specific enclave with this specific RA on the other end of the conversation. A boot-time attestation tells you the box looked right when it started. A per-operation attestation tells you the box looked right when it did the thing you actually care about.
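The binding can be sketched in a few lines. This assumes plain byte concatenation with no length prefix, and it assumes the RA key fingerprint is a SHA-256 digest of the RA public key. Both are illustrative guesses at the encoding, not the confirmed wire format, and the function names are hypothetical.

```python
import hashlib
import hmac


def attestation_user_data(cert_der: bytes, ra_key_fingerprint: bytes) -> bytes:
    """Compute SHA-256(certDER || raKeyFingerprint) for the user_data field."""
    return hashlib.sha256(cert_der + ra_key_fingerprint).digest()


def binding_matches(user_data: bytes, cert_der: bytes,
                    ra_key_fingerprint: bytes) -> bool:
    """Relying-party check: does the attested user_data bind this certificate
    to this RA? Uses a constant-time comparison."""
    expected = attestation_user_data(cert_der, ra_key_fingerprint)
    return hmac.compare_digest(user_data, expected)


# Illustrative inputs only; real values would be a DER certificate and the
# fingerprint of the RA's actual public key.
cert_der = b"\x30\x82example-certificate-bytes"
ra_fp = hashlib.sha256(b"example-ra-public-key").digest()

user_data = attestation_user_data(cert_der, ra_fp)
print(binding_matches(user_data, cert_der, ra_fp))
print(binding_matches(user_data, cert_der + b"\x00", ra_fp))
```

A relying party would run the second function only after validating the attestation document's signature and measurement against its own trust anchors. The hash comparison proves the binding, not the authenticity of the document that carries it.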

Hardware-anchored trust at the endpoints. The signing keys themselves live in a hardware HSM with discrete-silicon attestation rooted in the Marvell manufacturer chain. The clients prove they hold hardware-protected keys via TPM or device attestation. The Nitro layer in the middle does not have to defeat AWS to be useful, because the actual key material is protected by a different boundary that AWS does not own, and the evidence on the wire is anchored in trust roots the relying party already trusts.

Operations published to a transparency log. The CA’s attested measurements, policy versions, and issuance records get logged to an append-only structure with multi-witness cosigning. The operator still chooses what to submit. What the operator does not get is the ability to retract entries after the fact, modify history, or serve a different version of the log to a different relying party without those parties detecting the divergence. A relying party’s confidence that the system has been running honestly over time stops being a function of trust in the operator’s audit logs and starts being a function of properties that hold against the operator. This is the same shape Certificate Transparency gives the WebPKI, applied to the CA’s own operational claims about itself.
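For readers who have not looked inside a transparency log, here is a sketch of the underlying check, assuming the Merkle tree construction Certificate Transparency uses (RFC 6962 hashing, with the inclusion-proof verification steps from RFC 9162). It covers only inclusion of one entry under one tree head. Witness cosignatures and consistency proofs between tree heads are out of scope, and the sample entries are made up.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly less than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries: list) -> bytes:
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = split_point(n)
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def inclusion_path(index: int, entries: list) -> list:
    """Audit path for entries[index], as the log would compute it."""
    n = len(entries)
    if n == 1:
        return []
    k = split_point(n)
    if index < k:
        return inclusion_path(index, entries[:k]) + [merkle_root(entries[k:])]
    return inclusion_path(index - k, entries[k:]) + [merkle_root(entries[:k])]


def verify_inclusion(entry: bytes, index: int, tree_size: int,
                     path: list, root: bytes) -> bool:
    """Relying-party check: recompute the root from the entry and its path."""
    if index >= tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(entry)
    for p in path:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            r = node_hash(p, r)
            if not fn & 1:
                # Skip levels where this node has no right sibling.
                while fn != 0 and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


log = [b"measurement-v1", b"policy-v7", b"issuance-0001",
       b"issuance-0002", b"issuance-0003"]
root = merkle_root(log)
proof = inclusion_path(2, log)

print(verify_inclusion(b"issuance-0001", 2, len(log), proof, root))
print(verify_inclusion(b"issuance-9999", 2, len(log), proof, root))
```

The verifier needs only the entry, its index, the tree size, the path, and a root it trusts. It does not need the rest of the log, which is what lets a relying party check a single issuance record cheaply.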

Failure modes are bounded by design. Certificates are valid for seven days. ACME Renewal Information lets the CA shorten renewal windows targeted at specific machines or specific profiles, and goodenroll polls for those signals on its own schedule. The fleet rotates without an emergency window and without anyone touching a machine. The exposure window becomes a configuration choice rather than a function of certificate lifetime, and revocation infrastructure stays out of the critical path of the threat model.
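A sketch of the client side of that loop, assuming the ARI response carries a suggestedWindow object with RFC 3339 start and end timestamps, and that the client picks a uniformly random time inside the window. The function names are hypothetical and this is not goodenroll's actual logic. A real client would pick the time once and persist it rather than re-rolling on every poll.

```python
import random
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 timestamp; map a trailing 'Z' to an explicit offset."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pick_renewal_time(renewal_info: dict, rng=random) -> datetime:
    """Pick a uniformly random instant inside the suggested window."""
    window = renewal_info["suggestedWindow"]
    start, end = parse_ts(window["start"]), parse_ts(window["end"])
    if end <= start:
        raise ValueError("suggestedWindow end must be after start")
    return start + (end - start) * rng.random()


def should_renew_now(renewal_info: dict, now: datetime, rng=random) -> bool:
    """Renew if the chosen time has already passed."""
    return pick_renewal_time(renewal_info, rng) <= now


# Illustrative response: the CA has pulled the window forward for this machine.
info = {"suggestedWindow": {"start": "2025-01-02T04:00:00Z",
                            "end": "2025-01-03T04:00:00Z"}}

chosen = pick_renewal_time(info, random.Random(0))
print(chosen.isoformat())
print(should_renew_now(info, datetime(2025, 1, 4, tzinfo=timezone.utc)))
```

Spreading renewals across the window keeps a fleet from hitting the CA all at once when the window is shortened.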

Post-quantum where it counts. ML-DSA-65 (FIPS 204) for certificate signing, ML-KEM-768 (FIPS 203) as the subject key for TLS key-exchange certificates. ARI is what makes the migration tractable on the deployed fleet, because you do not have to wait for natural expiry to do the work.

Nitro is a bounded-trust intermediary. AWS still owns the silicon it runs on. What the architecture buys you is that the property the relying party has to verify is narrow and concrete, and that the actual long-lived secrets are protected by hardware that AWS does not own. Against an AWS-internal threat with full physical access and unbounded effort, Law #3 still applies. Against the attacks the architecture is actually defending against (software compromise of the CA pipeline, a rogue admin pulling secrets through the management plane, a tampered build reaching production), the bounded property is exactly the property you need.

The substrate

An architecture like this only works if the underlying primitives are right. Three pieces of infrastructure I have been spending time on are upstream of GoodKey CA.

PeculiarVentures/scp is GlobalPlatform Security Domain key management in Go. The name is not a coincidence. Smart cards and HSMs have been doing security domains in hardware for two decades, with separate keysets, separate trust roots, and isolation between issuer, verifier, and application code on the same chip. The library implements SCP03 and SCP11 and a typed Security Domain management layer for key lifecycle, certificate provisioning, and trust validation, against verified profiles with byte-exact validation against independent reference implementations. This is the unglamorous work of “make sure the keys you are putting on hardware are actually being put on hardware in the way you think they are.” If the key on the device is not where you think it is, every downstream signature is asserting something false.

draft-ietf-acme-device-attest, which I co-author, is the cross-machine extension on the client side. It standardizes how a device proves to an ACME server that the key in a certificate request lives in attested hardware on a specific device. The recent revisions resolved several interoperability gaps that had blocked broad implementation, including the Apple-specific attToBeSigned semantics around sha256(token) versus sha256(keyAuth), an explicit identifier-verification step, the badAttestationStatement error type, and a hardware-module identifier type. The point of the work is to make the client side of the trust chain as verifiable as the CA side. An attested signing service that issues credentials to anyone who asks is not solving the problem. It is moving it.

PeculiarVentures/attestation closes the loop. It is the verifier side that consumes attestation evidence from these various sources (TPMs, YubiKeys, Apple devices, Nitro Enclaves) and reduces it to claims a relying party can act on. Without a verifier, attestation is marketing. With a verifier, it is engineering.

These are not separate efforts. They are what makes hardware-anchored cross-machine trust mean anything in the wild. The transparency-log side of the same problem is being standardized in parallel through transparency.dev, C2SP, and the Merkle Tree Certificates draft, which together extend the same model to issuance auditability at WebPKI scale.

What this asks builders to do

The Dai Zovi prescription is operating-systems hygiene applied to the whole stack. The verifiability corollary is the same hygiene extended across machines you do not own. Both are old. AI is what is making them mandatory.

Pick small. Compartmentalize. Strip privilege to what each component genuinely needs. Make each component’s TCB small enough that one person can characterize it in a sitting. Single-purpose services, deterministic builds, dedicated resources rather than shared microarchitectural state, single-image enclaves rather than orchestrated runtimes.

Make it provable across machines. Per-operation attestation rather than per-boot. Independent re-verification at every hop, not transitive trust. Authorization decisions inside the attested boundary. Evidence bundles the relying party can run a verifier against, with their own trust anchors. Short lifetimes with active rotation rather than long-lived credentials backstopped by revocation. And publish the operations themselves to a transparency log with independent witnesses, so the proofs survive disagreement about who saw what when, and so a single dishonest operator cannot serve different versions of reality to different relying parties.

Anchor trust in hardware whose threat model you can actually evaluate. Where you can put the long-lived secret on hardware the user owns, do that. YubiKey, Apple Secure Enclave, TPM in the laptop on the engineer’s desk, smart card in the operator’s pocket. Where you cannot, use a cloud TEE as a bounded-trust intermediary that produces narrow verifiable claims, and design the architecture so the long-lived material lives in a different boundary that the cloud operator does not own.

And know what your assurance is buying you. Cloud TEEs are not how you defeat the operator. They are how you make narrow operations verifiable to relying parties while accepting that absolute properties against the operator are not on offer. The places where attestation delivers what it advertises are the places where the user owns the silicon. Law #3 has not been repealed, and AI has only raised the cost of pretending otherwise.

Smaller is the easy half. Provable is most of the engineering. On hardware you own is where the property actually holds.

The Illusion of Constant Acceleration

Spend enough time around AI right now and you start to get the feeling that everything is speeding up, all the time.

Every week there is a new model, a new capability, a new claim that some industry is about to be remade. It starts to feel like rapid change is just the new baseline. Like history has bent into a permanently steeper slope.

I do not think that is right.

What I think is closer to the truth is that we have gotten used to confusing motion with progress, and delay with inevitability. Some things are moving very quickly. Others are barely moving at all. We treat the former as inevitable and the latter as unavoidable.

Neither is true.

My father was born in 1942. That is not ancient history. When he was born, there was still a lot of basic infrastructure left to build.

Within a little more than a decade, nonstop transcontinental passenger air service became viable. Less than eight years after that, a human entered space. Eight years later, people were walking on the Moon.

That is a staggering amount of change in a very short period of time.

In one person’s early life, we went from making coast-to-coast air travel practical to landing human beings on another celestial body. Not as a thought experiment. Not as a roadmap. We just did it.

And it was not only aerospace. The Golden Gate Bridge was built in about four years. The first transcontinental railroad was completed in about six. These were massive physical undertakings that reshaped how people moved and how economies functioned, delivered on timelines that would feel almost implausible now.

The easy way to dismiss this is to say that software is fast and physical infrastructure is slow. That if AI looks fast and transit looks slow, that is just how the world works.

But that does not really hold up.

Ukraine did not build its drone ecosystem on leisurely timelines. Tesla compressed what many assumed would be a slow industrial transition into something the rest of the auto industry had to react to. When something actually matters, physical systems move. Supply chains get reorganized. Tradeoffs get made. Bureaucracies get bent. Talent concentrates. People stop explaining why something is hard and start figuring out how to get it done.

That is part of what makes Artemis interesting.

This is not a criticism of Artemis. It is an ambitious and serious effort. But it is also a reminder that progress is not self-sustaining. Apollo is often remembered as a triumph of technology, but it was just as much a triumph of focus, alignment, and urgency. Artemis reminds us that those things matter just as much as the rockets do.

There is another force that shows up in systems like this.

At Google, there was a name for it: slime mold.

It is what happens when layers of process, approvals, coordination costs, and local incentives build up over time until forward motion gets harder even when nobody involved is being unreasonable. Everything makes sense on its own. The system just moves more slowly.

Technology policy has its own versions of slime mold.

We saw it in the crypto wars, when policymakers convinced themselves that math could be slowed down with policy, as if cryptographic reality were open to negotiation. It was not. What that produced was not real control. It produced friction, workarounds, and the illusion of governance.

You can see the same instinct showing up again in parts of the conversation around AI. When institutions feel outpaced, they respond with process. That instinct is understandable, but it rarely solves the problem. You do not make systems safer by pretending inevitabilities are optional. You make them safer by building the infrastructure, incentives, and accountability needed to deal with what is actually happening.

But that is not how we tend to think about progress.

We talk about technological achievement as if it were mostly about invention, as if once something has been demonstrated it remains latent in society, ready to be called back into service whenever we need it.

That is not how any of this works.

The ability to do ambitious things quickly depends on organizational memory, industrial capacity, political alignment, tolerance for risk, and a culture that still expects big things to happen on human timescales.

Lose enough of that, and even getting back to where you once were becomes hard.

You can see it in infrastructure. Projects that once would have been treated as urgent now take decades, often in fragments so small that earlier generations would have treated them as preliminary milestones. Over time, that changes expectations. Slowness starts to look like responsibility. Ambition starts to sound naive.

That is the trap.

The problem is not just that progress slows. It is that people get used to it. What would once have looked like drift starts to look like process. What would once have sounded like an excuse starts to sound like maturity.

Meanwhile, in domains where urgency and incentives line up, things still move very quickly. ChatGPT was released publicly in late 2022. In a few years, AI went from something most people associated with research labs to something embedded in everyday workflows, products, and policy debates.

AI did not prove that everything is accelerating.

It proved that when enough capability, capital, and attention line up, rapid change is still possible.

That is the point.

The world is not uniformly speeding up. Some parts of it are. Others are not. And the difference has less to do with atoms versus bits than with whether we have decided something actually matters.

That ought to make us a little less complacent.

People like to tell themselves that once a technology is important enough, the rest somehow sorts itself out. The problems get solved. The risks get managed. The surrounding systems catch up.

History does not really support that.

Things were only all right in the past because people worked very hard to make them all right. The systems that made aviation safe, that made infrastructure dependable, that made computing usable in high-trust environments, none of that appeared on its own.

The same will be true here.

If we want AI to be safe, trustworthy, and broadly useful, that will not happen as a side effect of capability gains. Security will not emerge on its own. Governance will not emerge on its own. The infrastructure needed to make these systems worthy of dependence will not emerge on its own.

Those things only happen when people decide they matter.

That is the real problem with the idea that everything is accelerating. It makes it easy to believe that progress takes care of itself.

It does not.

Progress happens when people decide it needs to, and then do the work.

Confidential Computing’s Inconvenient Truth

This is part of a series on confidential computing. See also: Confidential Computing: What It Is, What It Isn’t, and How to Think About It for practical deployment guidance, and Why Nobody Can Verify What Booted Your Server for the attestation infrastructure gap. Two companion reference documents provide the evidence base: the TEE Vulnerability Taxonomy and TPM Attestation and PCR Verification: The Infrastructure Gap.

Confidential computing has a vulnerability record that grows every year, an attestation infrastructure that does not work at scale, and a hardware root of trust with a demonstrated shelf life. This piece explains why.

I want to be clear about where I stand before cataloging problems. I believe in this technology. What Signal has done with Private Contact Discovery and Sealed Sender using SGX enclaves, building systems where even Signal’s own servers cannot see who is contacting whom, is exactly the kind of architecture that confidential computing makes possible. Apple’s Private Cloud Compute takes the model further. Every production build is published to a transparency log, user devices will only communicate with nodes whose attested measurements match the log, and Apple released a virtual research environment so anyone can verify the claims independently. Moxie Marlinspike’s Confer applies the same idea to AI inference, with all processing inside a TEE and remote attestation so the service provider never has access to your conversations. These are real systems delivering real privacy guarantees that would be hard to achieve any other way.

More broadly, TEEs make systems more verifiable. Instead of asking users to take on faith that a service handles their data correctly, the service can prove it through attestation. I wrote earlier about attestation as the MFA for machines and workloads, and I explored the same idea in 2022 in the context of certificate authorities. If the CA runs open-source software on attesting hardware with reproducible builds, you can verify its behavior rather than trusting an annual audit. That shift, from asserted trust to verifiable trust, is genuinely important, and confidential computing is what makes it possible.

But “the direction is right” is not the same as “the current state is adequate.” We should not make perfection the enemy of good. This technology delivers real value today. But we also cannot afford to mistake the current state for the desired end state. Getting to where this technology needs to be requires seeing clearly where it actually is. That is what this piece is about.

The answer is not “the implementations are buggy.” The answer is structural. These technologies were designed for threat models that do not match how they are being deployed. Smart cards and HSMs were physically discrete devices with clear trust boundaries. TPMs were designed for boot integrity on enterprise desktops. Intel SGX was designed for desktop DRM. Each was repurposed for the cloud because the technology existed and the market needed something now. The repurposing created systematic security gaps that the research community has spent a decade documenting and the market has spent a decade deploying through.

In March 2025, I published a technical reference on security hardware and an in-depth companion document that categorized how these technologies fail. One of those failure categories was “Misuse Issues”: vulnerabilities that occur when security technology is adopted beyond its original design. A year later, with TDXRay reconstructing LLM prompts from inside encrypted VMs, TEE.Fail extracting attestation keys with a $1,000 device, and the SGX Global Wrapping Key extracted from hardware fuses, that observation warrants a much fuller treatment.

Timeline

Year | Event | Category
1968 | Smart card patents (Dethloff, Moreno). Special-purpose computers in tamper-resistant packages. The original TEE. | Hardware TEE
1980s | IBM secure coprocessors for banking. US government funds kernelized secure OS research. | Hardware TEE
1996 | nCipher founded. nShield HSMs with CodeSafe: custom application code inside tamper-resistant hardware. | Hardware TEE
1998 | IBM 4758 commercially available. Arbitrary code execution inside tamper-responding enclosure. FIPS 140-1 Level 4. | Hardware TEE
2003 | TCG founded, TPM standardized. Designed for boot integrity from ring -x. Hardware root of trust, measurement chains, attestation concepts established. | Institutional
2006 | AWS launches EC2. Public cloud computing begins. Workloads move to shared infrastructure owned by someone else. | Cloud
2006 | BitLocker ships with TPM support. TPMs reach millions of enterprise devices. Reference value infrastructure never materializes. | Hardware TEE
2008-2010 | Cloud goes mainstream. Azure (2010), GCP (2008), OpenStack (2010). Multi-tenant shared infrastructure becomes the default enterprise compute model. | Cloud
2012 | AlexNet wins ImageNet. Deep learning proven at scale on GPUs. AI workloads begin moving to cloud GPU infrastructure. | AI
2013 | Apple Secure Enclave Processor (iPhone 5s). Physically separate processor on SoC. First mass-market TEE. Invisible to users. | Hardware TEE
2015 | Intel SGX (Skylake). Enclaves inside the CPU. Designed for desktop DRM: single-tenant threat model. Cloud providers begin evaluating for multi-tenant use. | CPU TEE
2016 | AMD SEV. VM-level memory encryption. First CPU TEE designed with virtualization in mind. | CPU TEE
2017 | Transformer architecture published (“Attention Is All You Need”). Foundation for the model scale that will drive confidential computing demand. | AI
2017 | First SGX side-channel attacks. Cache-timing, Spectre adaptation. Desktop design meets multi-tenant reality. | Vulnerability
2018 | Foreshadow (L1TF) reads arbitrary SGX memory. SEVered remaps SEV guest pages. Desktop-to-cloud threat model gap exploited. | Vulnerability
2019 | Confidential Computing Consortium founded (Google, Microsoft, IBM, Intel, Linux Foundation). Repurposing becomes official strategy. | Institutional
2019 | Plundervolt, ZombieLoad, RIDL. Three distinct attack classes against SGX in one year. | Vulnerability
2020 | GPT-3 (175B parameters). Model weights become billion-dollar assets. Protecting weights on shared infrastructure becomes a business requirement. | AI
2020 | AWS Nitro Enclaves. Purpose-built for cloud, not repurposed from desktop. The exception to the pattern. | Cloud
2020 | AMD SEV-SNP, Intel TDX announced. VM-level TEEs designed for cloud but still sharing microarchitectural resources. Azure/GCP ship confidential VMs with vTPMs. | Cloud
2021 | Intel deprecates SGX on consumer CPUs (11th/12th gen Core). Desktop DRM cannot sustain the technology alone. | CPU TEE
2022 | ChatGPT launches (Nov). AI goes mainstream. Every enterprise begins evaluating LLM deployment on cloud infrastructure. | AI
2022 | ÆPIC Leak, SGX.Fail. Vulnerable platforms remain in TRUSTED attestation state months after disclosure. | Vulnerability
2023 | GPT-4, Llama 2, Claude 2. Foundation model race accelerates. EU AI Act passed. | AI
2023 | Downfall (SGX), CacheWarp (SEV-SNP). CacheWarp is first software-based attack defeating SEV-SNP integrity. NVIDIA H100 confidential GPU ships. | Vulnerability
2024 | Confidential AI goes mainstream. Azure, GCP, AWS all position confidential computing for AI. TDXdown and Heckler attacks hit TDX. HyperTheft extracts model weights via ciphertext side channels. | AI / Vulnerability
2025 Feb | Google finds insecure hash in AMD microcode signature validation (CVE-2024-56161). Malicious microcode loadable under SEV-SNP. | Vulnerability
2025 May | Google announces confidential GKE nodes with NVIDIA H100 GPUs. Confidential AI training and inference on GPU clusters. | AI
2025 Oct | TEE.Fail. $1K DDR5 bus interposer extracts attestation keys from Intel TDX and AMD SEV-SNP. Attestation forgery demonstrated. | Vulnerability
2025 Dec | IDC survey: 75% of organizations adopting confidential computing, 84% cite attestation validation as top challenge. Gartner predicts 75% of untrusted-infra processing uses CC by 2029. | Institutional
2025 Dec | IETF RATS CoRIM reaches draft-09. Reference value format standards mature. Vendor adoption of publishing measurements remains minimal. | Institutional
2026 Jan | StackWarp (CVE-2025-29943). Stack Engine synchronization bug enables deterministic stack pointer manipulation inside SEV-SNP guest via MSR toggling. Affects AMD Zen 1 through Zen 5. USENIX Security 2026. | Vulnerability
2026 | TDXRay (IEEE S&P 2026). Reconstructs LLM user prompts word-for-word from encrypted TDX VMs by monitoring tokenizer cache access patterns. No crypto broken. UC San Diego, CISPA, Google. | AI / Vulnerability
2026 Mar | NVIDIA publishes zero-trust AI factory reference architecture. CPU TEE + confidential GPU + CoCo + KBS. Model weights encrypted until attestation passes. | AI
2026 Mar 31 | Ermolov extracts SGX Global Wrapping Key from Intel Gemini Lake. Root key extraction via arbitrary microcode. Unpatchable (hardware fuses). | Vulnerability

Trusted Platform Modules: Boot Integrity and System State

The idea that hardware should measure and attest to software integrity goes back to the late 1990s. The Trusted Computing Group, formed in 2003, standardized the Trusted Platform Module, a discrete chip that stores cryptographic keys and maintains Platform Configuration Registers recording the boot chain as a sequence of hash measurements.

The TPM was designed to solve a specific problem: bootloader-level attacks. Rootkits and bootkits that compromised the system before the OS loaded were invisible to any software-based security tool. The TPM sat below the OS, measuring each boot stage before execution. It could answer a question that no operating system could answer about itself: did this machine boot the software it was supposed to boot?

Each boot stage measures the next before handing off execution. The measurements are extended into PCRs using a one-way hash chain: PCR_new = Hash(PCR_old || measurement). The TPM can produce a signed quote of its PCR values, and a remote verifier can check whether the system booted the expected software stack.
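The extend operation is small enough to show directly. This sketch assumes a SHA-256 PCR bank that starts as 32 zero bytes, and it uses made-up component names in place of real firmware images. It replays a list of measurements the way a verifier replays an event log, then compares the result with a reference value.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR


def measure(component: bytes) -> bytes:
    """Digest of a boot component, taken before it is executed."""
    return hashlib.sha256(component).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """PCR_new = Hash(PCR_old || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()


def replay(measurements) -> bytes:
    """Recompute the final PCR value from an ordered list of measurements."""
    pcr = bytes(PCR_SIZE)  # assumed initial value: all zeros
    for m in measurements:
        pcr = extend(pcr, m)
    return pcr


# Illustrative boot chain; real measurements cover firmware, bootloader,
# kernel, and configuration, as recorded in the platform's event log.
boot_chain = [measure(b"firmware"), measure(b"bootloader"), measure(b"kernel")]
reference_value = replay(boot_chain)

tampered = [measure(b"firmware"), measure(b"evil-bootloader"), measure(b"kernel")]
reordered = list(reversed(boot_chain))

print(replay(boot_chain) == reference_value)
print(replay(tampered) == reference_value)
print(replay(reordered) == reference_value)
```

Because each value folds in the previous one, a PCR cannot be set to a chosen value, only extended. A changed or reordered component yields a different final value. The comparison is only meaningful if the verifier has a reference_value it trusts.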

TPMs shipped in millions of enterprise laptops and servers. BitLocker used TPM-sealed keys for disk encryption. Linux distributions added measured boot support. But TPMs never achieved the broad security impact their designers envisioned. The problem was practical: to verify a TPM quote, you need to know what the correct PCR values should be, and nobody built the infrastructure to distribute and maintain those reference values at scale.

The TPM could tell you what booted. It could not tell you whether what booted was good.

What TPMs did accomplish was laying the conceptual groundwork for everything that followed. Hardware root of trust, measurement chains, remote attestation, platform state quotes. All of this vocabulary originated in the TPM ecosystem. Modern CPU TEEs inherited these concepts even as their architectures diverged significantly from the TPM model.

Hardware-Isolated Execution: Older Than You Think

Running code inside a tamper-resistant hardware boundary did not start with Intel or Apple. It started with smart cards.

Smart cards emerged in the late 1960s as special-purpose computers embedded in plastic cards. By the 1980s, they were executing cryptographic operations in banking, telecommunications, and government ID. A smart card is a tiny computer with its own processor, memory, and operating system, running inside a tamper-resistant package. That is a trusted execution environment by any reasonable definition, even if nobody called it that at the time.

HSMs extended the same concept to server-class computing. IBM’s 4758, commercially available in the late 1990s, provided a tamper-responding enclosure with its own processor, battery-backed memory, and secure boot chain. If someone tried to open the case, drill through it, or expose it to extreme temperatures, the device would zeroize its keys. The 4758 ran arbitrary code inside the boundary.

nCipher (founded 1996, later acquired by Thales) took this further with CodeSafe on the nShield HSM line, a development framework for deploying custom applications inside the HSM. This was general-purpose computation inside a hardware trust boundary, exactly the model that SGX would later attempt to replicate in silicon without a separate physical device. I spent years working with these HSMs. They ran custom signing logic, policy engines, tokenization routines, and key derivation functions, all inside the tamper-resistant module where the host OS could not observe or interfere.

The difference between these earlier systems and modern confidential computing is not the concept. It is the integration point. Smart cards and HSMs are discrete devices with well-defined physical boundaries. You can see the trust boundary. You can hold it in your hand. SGX, TDX, and SEV moved the trust boundary inside the CPU itself, eliminating the separate device but also eliminating the physical clarity. When the trust boundary is a set of microarchitectural state bits inside a processor with billions of transistors and a microcode layer updated quarterly, the attack surface becomes much larger.

Apple’s Secure Enclave Processor, introduced with the iPhone 5s in 2013, sat between these two models. It was a physically separate processor on the SoC with its own encrypted memory, dedicated to protecting biometric data and cryptographic keys. Even a fully compromised application processor with root privileges could not reach the Secure Enclave’s memory.

The SEP succeeded, where HSMs had stayed confined to data centers, for two reasons. It was invisible to users. Nobody configured it or provisioned it. And it protected something users cared about: their fingerprints and their money. The security was a means to a consumer feature, not a product in itself.

Intel SGX: Designed for the Desktop

Intel SGX, introduced with Skylake processors in 2015, brought the enclave concept to general-purpose computing. Instead of a separate processor, SGX created isolated memory regions within the main CPU. Code and data inside an enclave are encrypted in memory and protected from all other software on the system. The enclave’s measurement (MRENCLAVE) is a hash of exactly what was loaded, making attestation straightforward. One binary, one deterministic hash.

SGX was designed for the desktop. Its primary use cases were single-tenant scenarios like content protection, DRM key management, and Ultra HD Blu-ray playback. The threat model is clear. One machine, one user, and the enclave protects the content owner’s code from that user.

This is a single-tenant threat model. The attacker is the machine owner. There is no hypervisor. There are no co-tenant workloads competing for shared microarchitectural resources. The side-channel attack surface exists, but the economic incentive is limited. The attacker gains access to one DRM key or one media stream.

Enterprise adoption beyond DRM was limited. SGX enclaves had severe memory constraints (initially 128MB). Programming for SGX required partitioning applications into trusted and untrusted components. Intel deprecated SGX from consumer processors in 2021. The desktop DRM use case was not enough to sustain the technology.

Cloud Adoption and the Threat Model Mismatch

The cloud introduced a fundamentally different threat model, and this is where the problems began.

In the desktop DRM model, you protect your code from one user on one machine. In the cloud, you protect your code and data from the infrastructure provider, co-tenant workloads, the hypervisor, firmware, and anyone with physical access to a shared data center. The provider controls the hardware, the hypervisor, the firmware, the physical facility, and the scheduling of workloads across shared CPU cores.

The industry took technologies designed for the desktop single-tenant model and applied them to this multi-tenant cloud model. The architectural mismatch opened attack surfaces that the original designs did not anticipate.

On a desktop, an SGX enclave shares caches, branch predictors, execution ports, and power delivery only with software run by that machine’s single user. On a cloud server, those same resources are shared with co-tenant workloads controlled by different parties, each potentially adversarial. Cache-timing attacks that were theoretical on a desktop became practical in the cloud because the attacker could run arbitrary code on the same physical core. The side-channel catalog that accumulated against SGX from 2017 onward was not a series of implementation bugs. It was a consequence of deploying a single-tenant design in a multi-tenant environment.

AMD SEV and Intel TDX were designed with the cloud threat model more explicitly in mind, protecting entire virtual machines rather than individual enclaves. But they still share fundamental hardware resources with the hypervisor and co-tenants. CPU caches, memory buses, power delivery, and microarchitectural scheduling state. CacheWarp, StackWarp, WeSee, and Heckler all exploit the interfaces between the confidential VM and the hypervisor that manages it.

Virtual TPMs are another instance of the same pattern. Physical TPMs provide hardware-rooted trust because they are discrete chips with their own silicon. A vTPM is software running inside the hypervisor or a confidential VM. Cloud providers adopted vTPMs because provisioning hardware TPMs per VM is impractical at scale. The vTPM’s trust root is the software stack that hosts it. If the hypervisor is compromised, the vTPM is compromised.

The Repurposing Pattern

This is a recurring pattern in security technology, and it is one I have watched play out multiple times in my career. Build X for threat model Y, then repurpose X for threat model Z because X already exists and deploying it is cheaper than building something new.

SMS was designed for person-to-person messaging. It was repurposed for two-factor authentication because every phone could receive an SMS. The threat model assumed the cellular network was trusted. SIM swapping, SS7 interception, and malware-based SMS capture exploited the gap between “messaging channel” and “authentication channel.” NIST proposed deprecating SMS-based 2FA in its 2016 draft of SP 800-63B and classified it as a restricted authenticator in the final guidance. SMS OTP is still everywhere because deployment inertia exceeds the security community’s ability to move the market.

SSL was designed for securing web browsing sessions. It was repurposed for API authentication, IoT device communication, email encryption, and VPN tunneling. Each repurposing exposed assumptions in the original design that did not hold in the new context. The ecosystem spent two decades fixing the gaps through Certificate Transparency, HSTS, and progressively stricter CA/Browser Forum requirements. I was part of that ecosystem. The fixes were not inevitable. They required sustained institutional effort.

TPMs were designed for boot integrity on enterprise desktops. They were repurposed as vTPMs for cloud VM attestation, trading hardware isolation for scalability. SGX was designed for desktop DRM. It was repurposed for cloud confidential computing, trading single-tenant simplicity for multi-tenant attack surface. Each repurposing followed the same logic. The technology existed, the market needed something, and “available now with known limitations” beat “purpose-built but years away.”

The repurposed technology works well enough to create adoption. The adoption creates dependency. The dependency makes it difficult to replace even after the threat model gap is well understood. And the security research community spends years documenting the consequences while the market continues deploying.

AWS took a different path with Nitro Enclaves. Rather than building on CPU instruction extensions designed for desktops, Nitro Enclaves are isolated virtual machines on a purpose-built hypervisor with no persistent storage, no network access, and no access from the host. The Nitro model sidestepped many of the shared-resource problems because the hypervisor is minimal and the enclave has dedicated resources. The measurement model is clean. One image, one deterministic measurement.

Azure and GCP followed with confidential VM offerings on AMD SEV-SNP and Intel TDX. Google has positioned confidential computing as foundational to AI, expanding support across Confidential VMs, Confidential GKE Nodes, and Confidential Space with Intel TDX and NVIDIA H100 GPUs.

NVIDIA entered with confidential GPU support on H100 and Blackwell architectures. Their reference architecture for “zero-trust AI factories” combines CPU TEEs with confidential GPUs, Confidential Containers via Kata, and a Key Broker Service that releases model decryption keys only after remote attestation succeeds. Model weights remain encrypted until the hardware proves the enclave is genuine. This positions confidential computing as IP protection for model owners deploying on infrastructure they do not control.
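The attestation-gated key release described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not NVIDIA's Key Broker Service or any real API. The names `HARDWARE_KEY`, `sign_report`, and `release_model_key` are hypothetical, and an HMAC stands in for the asymmetric, certificate-chained signature that real hardware produces. A freshness nonce is omitted for brevity.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# ASSUMPTION: HMAC with a shared secret stands in for the hardware vendor's
# asymmetric attestation signature, purely to keep the sketch dependency-free.
HARDWARE_KEY = secrets.token_bytes(32)
APPROVED_MEASUREMENT = hashlib.sha256(b"approved-inference-image").digest()
MODEL_KEY = secrets.token_bytes(32)  # the model decryption key being guarded


def sign_report(measurement: bytes) -> bytes:
    """Simulated hardware: sign the measurement it observed."""
    return hmac.new(HARDWARE_KEY, measurement, hashlib.sha256).digest()


def release_model_key(measurement: bytes, signature: bytes) -> Optional[bytes]:
    """Release the key only for a genuine report of the approved image."""
    if not hmac.compare_digest(signature, sign_report(measurement)):
        return None  # report was not produced by the trusted hardware
    if not hmac.compare_digest(measurement, APPROVED_MEASUREMENT):
        return None  # genuine hardware, but not the approved workload
    return MODEL_KEY


good = APPROVED_MEASUREMENT
bad = hashlib.sha256(b"tampered-image").digest()

granted = release_model_key(good, sign_report(good))
denied_wrong_image = release_model_key(bad, sign_report(bad))
denied_forged = release_model_key(good, b"\x00" * 32)
```

The broker makes two separate decisions: whether the evidence is genuine, and whether the measured workload is one it has approved. Both depend on the hardware signing key staying secret.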

Intel launched Trust Authority as a SaaS attestation service independent of the cloud provider. If the cloud provider both runs your TEE and verifies its attestation, you are still trusting the provider. An independent verifier breaks that circularity.

By 2025, every major hardware vendor and every major cloud provider had a confidential computing offering. The question was no longer whether the technology existed. It was whether anyone could make it work at scale.

Why It Never Hit Mass Adoption

Despite the investment, confidential computing did not achieve mass adoption through the SGX era or the first wave of confidential VMs. Several problems compounded.

Attestation is hard to operationalize. The verification step requires infrastructure that most organizations do not have and that the ecosystem has not built. I wrote about this problem in detail in Why Nobody Can Verify What Booted Your Server. The short version: in an IDC survey, 84% of IT leaders cited attestation validation as their top adoption challenge.

The performance overhead was non-trivial in early implementations. SGX had significant costs from enclave transitions and limited memory. Confidential VMs with SEV-SNP and TDX reduced this to single-digit percentage overhead for most workloads, but the perception of “secure means slow” persisted.

The developer experience was poor. SGX required application partitioning and a specialized SDK. Confidential VMs improved this by running unmodified applications, but attestation integration, key management, and secret provisioning still required specialized knowledge. As of early 2026, deploying a confidential workload still requires expertise that most teams do not have.

The vulnerability narrative undermined confidence. The side-channel attacks against SGX were not random bugs. They were a predictable consequence of deploying a single-tenant design in a multi-tenant environment. Each new attack generated press coverage and reinforced the perception that the technology could not deliver. Security teams found a long list of CVEs, academic attacks, and “known limitations” that made the risk-benefit calculus uncertain.

And without AI, the use cases were niche. DRM, financial services MPC, healthcare analytics, sovereign cloud compliance. Real markets, but not mass markets. Not enough volume to drive the ecosystem maturity needed for broad adoption.

The Vulnerability Record

The side-channel attacks did not stop with SGX’s partial deprecation. They followed the technology into the cloud.

Intel TDX still shares microarchitectural resources with the hypervisor. TDXdown demonstrated single-stepping and instruction counting against TDX trust domains. PortPrint showed that CPU port contention reveals distinctive execution signatures across SGX, TDX, and SEV alike, and because it exploits instruction-level parallelism rather than thread-level parallelism, disabling SMT does not help.

The attack that most directly undermines the “Private AI” narrative is TDXRay (IEEE S&P 2026, UC San Diego, CISPA, Google). TDXRay produces cache-line-granular memory access traces of unmodified, encrypted TDX VMs. The researchers reconstructed user prompts word-for-word from a confidential LLM inference session. No cryptography was broken. The attack works because standard LLM tokenizers traverse a hash map to find token IDs, and that traversal creates a memory access pattern observable at 64-byte cache-line resolution. The host watches which hash map nodes the tokenizer visits and stitches the prompt back together. The encryption protects the data in memory. The computation pattern leaks it through the cache.
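A toy model shows why an access pattern can leak plaintext even when the data itself is encrypted. This is not the TDXRay technique, which recovers cache-line addresses from the host. It only illustrates the principle under simplifying assumptions: a whole-word vocabulary, a hypothetical `bucket` function built on CRC32, and an observer who sees which hash-map bucket each lookup touches.

```python
import zlib

# ASSUMPTION: a tiny whole-word vocabulary; real tokenizers use subword units.
VOCAB = ["the", "patient", "has", "diabetes", "cancer", "no", "history", "of"]
NUM_BUCKETS = 64


def bucket(token: str) -> int:
    """Deterministic bucket index, standing in for a hash-map slot address."""
    return zlib.crc32(token.encode()) % NUM_BUCKETS


def tokenize_and_trace(prompt: str) -> list:
    """What the observer sees: the bucket touched for each word looked up."""
    return [bucket(word) for word in prompt.split()]


def reconstruct(trace: list) -> list:
    """Map each observed bucket back to the vocabulary entries stored there."""
    by_bucket = {}
    for token in VOCAB:
        by_bucket.setdefault(bucket(token), []).append(token)
    return [by_bucket.get(b, ["?"]) for b in trace]


prompt = "the patient has no history of cancer"
trace = tokenize_and_trace(prompt)  # contains no plaintext, only indices
candidates = reconstruct(trace)     # candidate words for each position
```

The observer never reads the prompt. The vocabulary and hash function are public, so the sequence of buckets touched is enough to narrow each position to a short candidate list.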

TEE.Fail (ACM CCS 2025) is the most dramatic recent finding. Researchers built a $1,000 physical interposer that monitors the DDR5 memory bus and extracted ECDSA attestation keys from Intel’s Provisioning Certification Enclave, the keys that underpin the entire SGX and TDX attestation chain. Attestation can be forged. The attack requires physical access, which limits applicability. But cloud providers have physical access to every server they operate.

On March 31, 2026, Mark Ermolov announced the extraction of the SGX Global Wrapping Key from Intel Gemini Lake. This is not a side-channel leak. It is extraction of the root cryptographic key that protects SGX sealing operations. The key wraps Fuse Key 0, which means the entire key hierarchy rooted in hardware fuses is compromised for that platform generation. No microcode update can change fuses. Ermolov’s assessment: “its fundamental break means that the HW Root of Trust approach is not unshakable.”

Gemini Lake is a low-power consumer chip, not a Xeon server processor. The same attack has not been demonstrated on current server-class implementations. But the research trajectory is clear. Each generation of hardware trust primitives has been broken by the next generation of hardware security research.

Why the Pattern Persists: Five Broken Design Assumptions

The vulnerability record is not a collection of unrelated bugs. It is the predictable result of specific design assumptions that held in the original use cases but fail in the cloud and AI contexts where the technology is now deployed.

The attacker does not share physical hardware with the victim. SGX was designed for a desktop where one user runs one workload. In the cloud, co-tenants share CPU cores, caches, branch predictors, TLBs, execution ports, memory controllers, and power delivery. CacheWarp, StackWarp, and TDXRay all exploit resources that remain shared because complete resource partitioning would make the hardware unusable for general-purpose computing.

The platform owner is not the adversary. TPMs and early SGX assumed the platform owner was the user or a trusted IT department. In the cloud, the provider controls the hypervisor, firmware, BMC, physical facility, and scheduling. The interfaces between the TEE and the provider-controlled environment become the attack surface. WeSee, Heckler, and SEVered exploit these interfaces. TEE.Fail exploits the provider’s physical access to the memory bus.

The hardware root of trust is immutable. The attestation model depends on root keys being beyond the reach of software attacks. This assumption has been violated repeatedly. Ermolov reached fuse-based keys through microcode. Google’s disclosure of CVE-2024-56161 showed that AMD’s microcode signature validation relied on an insecure hash. Sinkclose provided universal Ring -2 (System Management Mode) escalation on AMD CPUs back to 2006.

Attestation verification is someone else’s problem. The specifications define how to produce attestation evidence but not how to verify it at scale. In the desktop DRM case, one binary produced one hash. In the cloud, PCR values are combinatorial across firmware, bootloader, kernel, and boot configuration.

Performance and security tradeoffs are invisible. On a desktop running DRM playback, a 5% performance hit is imperceptible. On a cloud server running AI inference at scale, every percentage point is cost. Disabling SMT, applying Downfall mitigations, and enabling inline encryption all have measurable overhead. Organizations are pressured to disable countermeasures for performance, reopening the attack surface.

These assumptions compound. The attacker shares hardware with a platform owner who is the adversary, exploiting a hardware root of trust that has a shelf life, verified through attestation infrastructure that does not exist at scale, with mitigations that carry performance costs the deployment context cannot absorb. No single patch addresses the compound effect. The assumptions are architectural, not implementational, which is why the vulnerability catalog grows despite continuous investment in mitigations.

The full root cause analysis with specific attack mappings for each assumption is in the companion TEE Vulnerability Taxonomy.

AI Changes the Calculus

All of the problems described above are real and unresolved. None of them are stopping adoption, because AI changed the calculus.

Model weights represent billions of dollars in training investment. A leaked foundation model is a competitive catastrophe. Running inference on shared cloud infrastructure means trusting the cloud provider not to inspect memory, which is the exact problem TEEs solve.

Training data includes regulated information across healthcare, financial services, and government. The EU AI Act, DORA, CCPA, and evolving federal privacy frameworks create compliance pressure that confidential computing directly addresses.

Multi-party AI scenarios (federated learning, collaborative training, secure inference on third-party data) require environments where no single party sees the complete dataset. TEEs provide the isolation boundary. This is why every major hyperscaler is building on confidential computing despite its known limitations.

But AI workloads amplify every weakness. GPU TEEs are new and their attestation models are immature. The attestation chain now spans CPU TEE, GPU TEE, and potentially TPM, each with different measurement schemes. AI workloads run on heterogeneous infrastructure across multiple cloud providers. And AI workloads are the most valuable targets for the attacks TEEs are vulnerable to. An attacker who extracts model weights via a side channel gets a multi-billion-dollar asset.

The market treats the different TEE designs (SGX, SEV, TDX, Nitro, NVIDIA confidential GPU) as interchangeable. They are not. Each has different properties and different security guarantees. Pretending otherwise is how organizations end up deploying against a threat model their chosen TEE was not designed to address.

The Trust Model Gap

The deeper issue is the gap between what is marketed and what is engineered.

Confidential computing marketing says “even the infrastructure provider cannot access your data.”

The engineering reality is different. The infrastructure provider cannot access your data through the software stack, but the hardware has known side-channel leakages that a sufficiently motivated attacker with privileged access can exploit. The attestation infrastructure that proves the TEE is genuine has structural limitations that make verification at scale dependent on each organization building its own reference value databases. And the hardware root of trust that anchors the entire system has a demonstrated shelf life.

This is a reasonable tradeoff for many threat models. Most organizations are defending against curious administrators and software-level compromise while trying to meet regulatory compliance requirements. Side-channel attacks require significant expertise and often physical access. But the market does not present it as a tradeoff.

What Needs to Happen

Closing the gap between the market narrative and the engineering reality requires work that is less exciting than launching new AI services.

Firmware and OS vendors need to publish reference measurements. The standards exist. CoRIM provides the format. RFC 9683 provides the framework. What is missing is the operational commitment to publish signed measurement values for every release. I wrote about the infrastructure that would need to exist and why none of it does yet.

The industry needs honest threat modeling that acknowledges what TEEs protect against and what they do not. TEE.Fail requires physical access, but cloud providers have physical access to every server. TDXdown requires a malicious hypervisor, which is precisely the threat TDX is designed to defend against. These are not edge cases. They are the threat model.

Attestation verification needs to become a commodity. Organizations should not need to build their own reference value databases, write their own event log parsers, and maintain their own golden image registries. This infrastructure should be as standardized and available as Certificate Transparency logs are for the web PKI.

And the security research community’s findings need to be incorporated into the market narrative rather than treated as exceptions. The pattern of continuous vulnerability discovery and mitigation is the normal state of the technology, not an aberration.

Confidential computing is directionally correct. The ability to verify what code is running on hardware you do not control, rather than simply trusting the operator, is a fundamental improvement in how we build systems. Signal proved the model works. The challenge is closing the gap between that promise and the current engineering reality.

The organizations deploying confidential computing for AI workloads today should understand what they are buying. Against the threats they are most likely to face (curious administrators, software-level compromise, regulatory compliance gaps, and unauthorized data access by the infrastructure operator), confidential computing is a significant improvement. Against a well-resourced attacker with physical access to the hardware, side-channel expertise, or the ability to exploit a hardware root-of-trust vulnerability, it is a partial mitigation, not an absolute guarantee.

That is a defensible position. It is just not the one being marketed.


For practical guidance on deployment, see Confidential Computing: What It Is, What It Isn’t, and How to Think About It.

For the full vulnerability catalog and root cause framework, see the TEE Vulnerability Taxonomy and TPM Attestation and PCR Verification.

Previously: TPMs, TEEs, and Everything In Between (March 2025). See also: Why Nobody Can Verify What Booted Your Server.

What Is Confidential Computing, What It Isn’t, and How to Think About It

Confidential computing is the most important security technology that most organizations deploying it do not fully understand.

Last March, I wrote about the terminology confusion in security hardware — how terms like TEE, TPM, secure enclave, and confidential computing get used interchangeably in ways that obscure what these technologies actually do. The accompanying technical reference laid out the foundational concepts and the ways these technologies fail.

A year later, confidential computing is no longer a niche technology. AI has made it urgent. When you run inference on a model worth hundreds of millions in training compute, on hardware you don’t own, in a data center you’ve never visited, the question of what the infrastructure operator can see becomes a business-critical concern. Confidential computing is the industry’s answer.

It is also a technology whose security properties are routinely overstated by the vendors selling it and the cloud providers deploying it. Marketing language like “even the infrastructure provider cannot access your data” appears in product pages from every major hyperscaler. The engineering reality is more constrained than that, and the gap between the marketing and the engineering is where organizations get hurt. None of that means you shouldn’t use it. I use it extensively. It means you need to understand what it actually gives you so you can build architectures that account for what it doesn’t.

What Confidential Computing Actually Does

Confidential computing protects data while it is being processed. Traditional encryption covers data at rest and data in transit. Confidential computing addresses the third state: data in use, the window when data must be decrypted for computation and is therefore exposed in memory.

The mechanism is hardware-based isolation. The CPU (or GPU, in newer implementations) creates an environment where code and data are encrypted in memory and protected from all other software on the system, including the operating system and hypervisor. The cloud provider’s administrators cannot read your data through the software stack even though it is running on their hardware.

The technology comes in several forms. AMD SEV-SNP and Intel TDX protect entire virtual machines. AWS Nitro Enclaves provide isolated execution environments on Amazon’s custom hardware. NVIDIA’s H100 and Blackwell GPUs add hardware-encrypted GPU memory with GPU-specific attestation. Apple’s Secure Enclave protects biometric data and cryptographic keys on a physically separate processor. The implementations differ significantly, but they share a common principle: hardware-enforced boundaries that the software stack cannot cross.

What It Does Not Do

Confidential computing does not make your workload invulnerable. It changes the threat model. Understanding what it does not protect against matters as much as understanding what it does.

Side-channel attacks remain viable. The CPU still shares caches, branch predictors, execution ports, and power delivery with other workloads. Researchers have demonstrated attacks that extract data from inside TEEs without breaking any cryptography. The TDXRay attack (IEEE S&P 2026) reconstructed user prompts word-for-word from an encrypted Intel TDX VM by watching which cache lines the LLM tokenizer accessed. The data was encrypted in memory. The computation pattern leaked it through the cache.

Physical access defeats memory encryption. The TEE.Fail attack (ACM CCS 2025) used a $1,000 device soldered to the DDR5 memory bus to extract attestation keys from Intel TDX and AMD SEV-SNP. Cloud providers have physical access to every server they operate. That is the threat model confidential computing claims to address.

Attestation depends on hardware roots of trust that have a shelf life. Attestation is how a TEE proves to a remote party that it is running expected code on genuine hardware. The proof depends on cryptographic keys embedded in the processor. Those keys have been extracted. The March 2026 extraction of the SGX Global Wrapping Key from Intel Gemini Lake reached root keys burned into hardware fuses. Google’s discovery of an insecure hash in AMD’s microcode signature validation (CVE-2024-56161) allowed loading malicious microcode that could subvert SEV-SNP. When root keys are compromised, attestation can be forged.

Attestation verification infrastructure barely exists. Even if the TEE hardware is sound, verifying attestation at scale requires knowing what the correct measurements should be. For TPM-based attestation, this means maintaining reference values for every combination of firmware, bootloader, kernel, and boot configuration across a heterogeneous fleet. That infrastructure largely does not exist. An IDC survey found that 84% of IT leaders cite attestation validation as their top adoption challenge.

The Vulnerability Record in Context

The security research community has published over 50 distinct attacks against TEE platforms since 2017. The companion TEE Vulnerability Taxonomy catalogs these in detail.

The number is large. It does not mean confidential computing is broken. It means the technology has been subjected to intense scrutiny by some of the best hardware security researchers in the world, and they have found weaknesses. The question is whether the risk after deploying the technology is lower than the risk without it.

For most deployments, the answer is yes. Confidential computing raises the bar significantly. An attacker who could previously read VM memory through a compromised hypervisor now needs a side-channel attack, a physical interposition device, or a root-of-trust compromise. Each of these is substantially harder than the baseline attack.

The vulnerability record matters most when the attacker is well-resourced and has privileged access to the infrastructure — which is exactly the cloud provider threat model that confidential computing is designed to address. The threat model it targets is the one where its limitations are most relevant. That tension is real, and pretending it does not exist does not help anyone making deployment decisions.

The Gartner prediction that 75% of processing in untrusted infrastructure will use confidential computing by 2029 assumes a maturity the technology has not achieved. Treating a bounded isolation primitive as a general trust solution is how organizations end up surprised.

How to Think About Deployment

Confidential computing is one layer in a defense-in-depth architecture. It is not a substitute for the other layers. I have deployed confidential computing in production and these are the principles I have found matter most.

Use it, but don’t rely on it alone. Encrypt data at rest and in transit independently of the TEE. Use application-level encryption for the most sensitive data so that even a TEE compromise does not expose plaintext. The TEE is a defense-in-depth layer, not your sole protection.

Verify attestation, and understand what verification actually proves. A TPM quote or attestation report proves the state of the machine at the time the quote was generated. It does not prove the machine is still in that state five minutes later. It does not prove the machine’s physical location or who has physical access. Build your verification flow with these limitations in mind.
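One way to build those limits into a verification flow is to bind every quote to a single-use challenge and to bound how old an accepted quote may be. The sketch below is illustrative only: `Quote`, `Verifier`, and the 30-second window are hypothetical, and signature verification over the quote is assumed to have happened before `accept` is called.

```python
import hmac
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quote:
    nonce: bytes       # challenge echoed back inside the signed quote
    pcr_digest: bytes  # digest over the quoted PCR values
    # ASSUMPTION: the signature was already verified and is omitted here.


class Verifier:
    def __init__(self, max_age_seconds: float):
        self.max_age = max_age_seconds
        self.pending = {}  # nonce -> time the challenge was issued

    def challenge(self, now: Optional[float] = None) -> bytes:
        nonce = secrets.token_bytes(16)
        self.pending[nonce] = time.monotonic() if now is None else now
        return nonce

    def accept(self, quote: Quote, expected_digest: bytes,
               now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        issued = self.pending.pop(quote.nonce, None)  # nonces are single-use
        if issued is None:
            return False  # unknown or replayed nonce
        if now - issued > self.max_age:
            return False  # the quote describes a state that is too old
        return hmac.compare_digest(quote.pcr_digest, expected_digest)


verifier = Verifier(max_age_seconds=30.0)
expected = bytes(32)

nonce = verifier.challenge(now=100.0)
fresh = verifier.accept(Quote(nonce, expected), expected, now=110.0)
replayed = verifier.accept(Quote(nonce, expected), expected, now=111.0)

late_nonce = verifier.challenge(now=200.0)
stale = verifier.accept(Quote(late_nonce, expected), expected, now=300.0)
```

Even a fresh quote only bounds how old the evidence is. It says nothing about what happens after the quote is generated, so long-lived sessions need periodic re-attestation.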

Know your TEE’s specific threat model. AMD SEV-SNP, Intel TDX, AWS Nitro Enclaves, and NVIDIA GPU CC have different architectures, different shared resource boundaries, and different attestation mechanisms. They are not interchangeable. A TDX trust domain sharing microarchitectural state with a hypervisor has a different side-channel surface than a Nitro Enclave running on a purpose-built hypervisor with dedicated resources.

Plan for the hardware root of trust to eventually fail. The research trajectory is clear: each generation of hardware trust primitives has been broken by the next generation of hardware security research. Build your key management and secret rotation so that a root-of-trust compromise on one platform generation does not expose secrets that have already been rotated.

Ask whether your workload actually needs multi-tenant cloud TEEs. For some use cases, a physically discrete device such as an HSM or a USB Armory, or a Nitro Enclave with dedicated resources, provides stronger isolation than a confidential VM sharing silicon with co-tenants. The multi-tenancy problem is where most of the vulnerability surface lives. If your workload does not require multi-tenant shared infrastructure, you can sidestep the largest attack class entirely.

What a Practical Architecture Looks Like

Consider a Certificate Authority that runs its signing operations inside AWS Nitro Enclaves. The enclave has no persistent storage, no network access, and no access from the host instance. The signing key never leaves the enclave. Attestation documents are signed under the Nitro Attestation PKI and carry deterministic measurements of the enclave image.

Nitro Enclaves were chosen because their architecture sidesteps the shared-resource side-channel problems that affect SGX and TDX. The Nitro hypervisor is purpose-built and minimal. The enclave gets dedicated resources. The measurement model is clean: one image, one deterministic measurement, no combinatorial PCR explosion.

But the enclave is not the only security layer. The signing keys are backed by a hardware root of trust. The enclave image is built from reproducible builds so the expected measurements are verifiable from source. Access to the host instance is controlled through IAM policies that are themselves audited. The architecture is designed so that compromising any single layer does not compromise the signing keys.

Use confidential computing as a meaningful security improvement, understand its specific limitations, and build the rest of your architecture so that the limitations do not become single points of failure.

Where This Is Heading

Confidential computing is not going away. The economic pressure to deploy AI workloads on shared infrastructure guarantees continued investment. NVIDIA’s Blackwell architecture extends confidential GPU support. ARM CCA adds Realm World isolation. The Confidential Computing Consortium continues to drive standardization.

The technology will improve. Side-channel mitigations will get better. Attestation infrastructure will mature: the IETF RATS standards are ready, and what is missing is vendors committing to publish reference values. Performance overhead will continue to decrease.

But the fundamental constraints — shared microarchitectural resources, physically accessible memory buses, the shelf life of hardware roots of trust — are properties of how CPUs and memory work. They will not be eliminated by the next generation of silicon. They will be reduced, mitigated, and worked around.

Confidential computing is a significant improvement in security posture. It is not an absolute guarantee. That is a defensible position to sell. It is just not the one being sold.


For the deeper analysis of why the vulnerability record looks the way it does, see Confidential Computing’s Inconvenient Truth.

For the full vulnerability catalog, attestation gap analysis, and root cause framework, see the TEE Vulnerability Taxonomy and TPM Attestation and PCR Verification.

Previously: TPMs, TEEs, and Everything In Between: What You Actually Need to Know (March 2025). See also: Why Nobody Can Verify What Booted Your Server.

Why Nobody Can Verify What Booted Your Server

There is no public database of known-good TPM measurements. There never has been.

The Trusted Platform Module, a security chip that measures and attests to system integrity, has been a standard for twenty years. TPMs ship in virtually every enterprise laptop and server. Software-emulated versions are available for cloud VMs on Azure, GCP, and AWS. Measured boot is a checkbox in every compliance framework that touches system integrity. The hardware that produces platform measurements is everywhere. The infrastructure to verify those measurements is not.

If you have deployed measured boot at scale, you have hit this wall. I have, more than once. If you haven’t yet, you will.

I wrote about the foundational concepts behind these technologies last year, covering how TPMs, TEEs, HSMs, and secure enclaves differ and where they fail. This post goes deeper on one specific problem that anyone deploying measured boot or confidential VMs hits immediately: the verification gap for PCR values.

What PCRs Are and Why They Exist

A TPM contains a set of Platform Configuration Registers, special-purpose storage locations that record the boot chain as a sequence of cryptographic measurements. Each boot stage measures the next before handing off execution. The measurements are extended into PCRs using a one-way hash chain: the old value is concatenated with the new measurement and hashed to produce the new value. This is irreversible. Given a final PCR value, you cannot determine the individual measurements without replaying the full sequence.
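The extend operation described above is small enough to sketch directly. This is a minimal illustration, assuming a SHA-256 PCR bank and a register that starts as all zeros; the stage names are hypothetical placeholders, not real firmware or kernel images.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 bank


def measure(component: bytes) -> bytes:
    """Hash a boot component to get its measurement digest."""
    return hashlib.sha256(component).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """New PCR value = SHA-256(old PCR value || measurement digest)."""
    return hashlib.sha256(pcr + measurement).digest()


# Hypothetical boot stages, each measured before it runs.
stages = [b"firmware-v1", b"bootloader-v2", b"kernel-6.8"]

pcr = bytes(PCR_SIZE)  # assumed reset value: all zeros
for stage in stages:
    pcr = extend(pcr, measure(stage))

print(pcr.hex())
```

The final value commits to every measurement and to the order they arrived in, but nothing in it lets you read the individual measurements back out. That is why a verifier needs the sequence itself, not just the result.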

A TPM quote is a signed snapshot of these PCR values, which lets a remote verifier assess what software actually booted on the machine. This is remote attestation, and it answers a question no operating system can answer about itself: did this machine boot what it was supposed to boot?

This works fine for a single machine. The problem is fleets.

Why There Is No PCR Registry

You would think someone would have built a public database of known-good PCR values by now, something like CCADB for certificate trust or VirusTotal for malware hashes. Nobody has, and it is not because nobody thought of it. The reasons are structural.

PCR values are combinatorial. A single PCR accumulates measurements from multiple software components. PCR 0 reflects the firmware version, CPU microcode patches, and the UEFI configuration that controls early boot behavior. PCR 4 reflects the bootloader and the shim that validates Secure Boot signatures. On modern Linux distributions using Unified Kernel Images, which bundle the kernel and initial RAM disk into a single signed binary, measurements fragment across PCRs 8, 9, 11, and 12 depending on the distribution and boot configuration. This is messier than the traditional GRUB boot path, and it was already messy.

Any component update produces a completely different PCR value for the affected register. A fleet with 3 firmware versions, 2 bootloaders, 4 kernels, and 3 initrd configurations has 72 valid PCR value combinations for a single hardware model. Five hardware models is 360. Add boot parameters and the number becomes effectively unbounded.
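The arithmetic is just a product of the per-component counts. The sketch below reproduces the numbers above; the final step, ten independent on/off boot parameters, is a made-up assumption to show how quickly the total grows.

```python
from math import prod

# Component counts from the example fleet above.
components = {"firmware": 3, "bootloader": 2, "kernel": 4, "initrd": 3}

per_model = prod(components.values())
fleet = per_model * 5  # five hardware models

# Hypothetical: ten independent on/off boot parameters, each doubling the count.
with_boot_params = fleet * 2**10

print(per_model)         # 72
print(fleet)             # 360
print(with_boot_params)  # 368640
```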

Measurement ordering matters. The hash chain is order-dependent. Extending measurement A then B produces a different result than B then A. Boot is not fully deterministic. Driver initialization order, ACPI table enumeration, and peripheral probe sequences can vary between boots of identical software on identical hardware. The TCG’s own specification acknowledges this directly: operating system boot code is “usually non-deterministic, meaning that there may never be a single ‘known good’ PCR value.”
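Order dependence is easy to see with two measurements. A minimal sketch, assuming a SHA-256 bank and an all-zero starting value; the driver names are hypothetical.

```python
import hashlib


def extend(pcr: bytes, digest: bytes) -> bytes:
    # New PCR value = SHA-256(old value || measurement digest)
    return hashlib.sha256(pcr + digest).digest()


zero = bytes(32)
a = hashlib.sha256(b"driver-A").digest()
b = hashlib.sha256(b"driver-B").digest()

a_then_b = extend(extend(zero, a), b)
b_then_a = extend(extend(zero, b), a)

# Same two components, same software, different final PCR value.
print(a_then_b != b_then_a)  # True
```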

Firmware measurements are opaque. The UEFI event log is the detailed record behind those PCR values, and in practice it is often more useful than the final values themselves. But the event data for firmware blobs is often just a physical memory address and size. No indication of format or purpose. Intel Boot Guard measurements use methods that are under NDA. Dell extends proprietary configuration data into PCR 6 in undocumented formats. A verifier cannot independently reconstruct many of these measurements without vendor-specific knowledge that is not publicly available.

Nobody is obligated to publish reference values. The standards for publishing expected measurements exist. The TCG Reference Integrity Manifest specification defines the formats. The IETF RATS working group developed CoRIM, a compact machine-readable format for publishing reference measurements. RFC 9683, which covers remote integrity verification of network devices containing TPMs, specifies that software suppliers MUST make reference values available as signed tags. The standards are there. Manufacturers are not obligated to follow through, and most do not.

What Everyone Actually Does Instead

PCR value matching fails at scale, so the industry has quietly converged on something else: event log verification.

The TPM does not just produce final PCR values. It also maintains an event log, a sequential record of every individual measurement extended into each PCR during boot. Each entry contains the PCR index, the hash of what was measured, and a description of the event — “loaded bootloader from partition 1” or “Secure Boot certificate db contained these entries.”

The event log is what makes attestation workable in practice. The verifier replays the log by re-computing the hash chain from the individual entries. If the replayed chain produces the same PCR values that the TPM signed in its quote, the log has not been tampered with. The events it describes are the actual events that produced those values. The verifier then evaluates individual events against a policy: is this firmware version on the approved list? Is Secure Boot enabled? Is the kernel signed by a trusted key? Was anything unexpected loaded?
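The replay-then-evaluate flow can be sketched in a few lines. This is an illustration under stated assumptions, not a real parser: the Event type, the function names, and the sample digests are all hypothetical, the log is already parsed, the bank is SHA-256, and the quoted PCR values are assumed to come from a quote whose signature has already been checked.

```python
import hashlib
from dataclasses import dataclass

ZERO = bytes(32)  # assumed PCR reset value


@dataclass(frozen=True)
class Event:
    pcr_index: int
    digest: bytes     # hash of what was measured
    description: str  # informational only in this sketch


def extend(pcr: bytes, digest: bytes) -> bytes:
    return hashlib.sha256(pcr + digest).digest()


def replay(events):
    """Recompute each PCR's hash chain from the individual log entries."""
    pcrs = {}
    for ev in events:
        pcrs[ev.pcr_index] = extend(pcrs.get(ev.pcr_index, ZERO), ev.digest)
    return pcrs


def log_matches_quote(events, quoted_pcrs) -> bool:
    """True if replaying the log reproduces every PCR value in the quote."""
    replayed = replay(events)
    return all(replayed.get(i, ZERO) == v for i, v in quoted_pcrs.items())


def unapproved_events(events, approved):
    """Return events whose digest is not on the approved list for its PCR."""
    return [ev for ev in events
            if ev.digest not in approved.get(ev.pcr_index, set())]


fw = hashlib.sha256(b"firmware-v1").digest()
bl = hashlib.sha256(b"bootloader-v2").digest()
log = [Event(0, fw, "firmware volume"), Event(4, bl, "bootloader")]
quoted = {0: extend(ZERO, fw), 4: extend(ZERO, bl)}
approved = {0: {fw}, 4: {bl}}

ok = log_matches_quote(log, quoted) and not unapproved_events(log, approved)
print(ok)  # True
```

Note that the policy step only means something for PCRs the quote actually covers. Events for registers outside the quote are unauthenticated claims.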

This is more flexible than PCR matching. A firmware update changes one event in the log, not the entire composite hash, so the policy absorbs the change without requiring new reference values.

But event log verification has its own problems. Event data is often insufficient for independent verification. Vendor-specific formats are undocumented. Event types and descriptions are not part of the hash, so they can be manipulated without affecting the signed PCR value. Intel’s CSME subsystem extends measurements that verifiers cannot evaluate without access to Intel’s proprietary documentation.
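The description problem is worth seeing concretely. In this simplified model, which follows the claim above and ignores the details of real event types, each log entry is a digest plus a free-text description, and only the digest is extended. The descriptions are hypothetical.

```python
import hashlib


def replay(events) -> bytes:
    """Replay a single PCR from (digest, description) pairs."""
    pcr = bytes(32)
    for digest, _description in events:
        # Only the digest enters the hash chain; the description never does.
        pcr = hashlib.sha256(pcr + digest).digest()
    return pcr


digest = hashlib.sha256(b"some-boot-component").digest()

honest = [(digest, "loaded bootloader from partition 1")]
relabeled = [(digest, "loaded vendor diagnostics tool")]

# Both logs replay to the same value, so the signed quote cannot tell them apart.
print(replay(honest) == replay(relabeled))  # True
```

The practical consequence is that policy decisions should key on digests, and treat descriptions as hints for a human reading the log.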

Keylime, the most mature open-source attestation framework, says it plainly: direct PCR value matching is “only useful when the boot chain does not change often.” Intel Trust Authority, Google Cloud Attestation, and Azure Attestation all verify event log properties rather than matching literal PCR values.

So every organization deploying TPM attestation at scale ends up building their own reference values by capturing measurements from known-good environments. The “registry” is whatever you build from your own golden images. This is not a sustainable state of affairs, but it is the state of affairs.

vTPMs Add Another Layer

Virtual TPMs make the verification problem worse. A physical TPM’s trust comes from being a discrete chip with its own silicon. A vTPM is software running inside the hypervisor or a confidential VM. Cloud providers adopted vTPMs because provisioning physical TPMs per VM is impractical at cloud scale.

The vTPM’s trust root is the software and hardware stack that hosts it. If the hypervisor is compromised, the vTPM is compromised. If the CPU’s hardware isolation (the TEE that protects the confidential VM) has a side-channel vulnerability, the vTPM’s keys are exposed through that side channel. Verifying vTPM evidence requires also verifying the TEE evidence, because the trust chains through.

Each layer’s trust depends on the layer below, and the bottom layer has a demonstrated shelf life. The March 2026 extraction of the SGX Global Wrapping Key from Intel Gemini Lake and Google’s discovery of an insecure hash in AMD’s microcode signature validation (CVE-2024-56161) are the latest demonstrations that hardware roots of trust are not permanent.

A Practical Approach

The reference value infrastructure does not exist. So what do you actually do?

Pick the verification approach that matches what your deployment can support, and accept the tradeoff. I have listed these from strongest assurance to weakest, which is also from highest operational cost to lowest.

Exact PCR match compares values against a fixed allowlist. Strongest when reference values are correct. Breaks on any component update. Only practical for enclave-style deployments like AWS Nitro Enclaves or Intel SGX, where one image produces one deterministic measurement. If you control the entire image and the measurement is deterministic, this is the easy case.

Event log policy replays the event log and evaluates individual events against policy. Flexible to component updates. Requires an event log parser and per-vendor knowledge of event formats.

Signed baseline accepts any PCR values covered by a signature from a trusted key. The signing key becomes the trust anchor rather than a registry of literal values. When software updates change PCR values, the security team signs a new baseline. This is the PolicyAuthorize pattern that System Transparency documents and pcr-oracle supports: seal secrets to a signing key rather than to specific PCR values, so that software updates do not lock you out of your own data.

Node identity only verifies the TPM’s Endorsement Key identity without PCR verification. Proves hardware identity, not software state. Weakest assurance, lowest operational cost.

Most real-world deployments will use different approaches for different parts of their architecture. Exact match for the most sensitive operations. Event log policy for managed servers. Signed baselines for fleet environments where the security team controls the update cycle. The right answer is almost never one approach for everything.
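The signed baseline approach is the least intuitive of the four, so here is a sketch of the verifier-side logic. One loud caveat: to stay within the Python standard library this uses HMAC as a stand-in for a real asymmetric signature, which means the verifier here holds the signing key. A real deployment would verify with a public key. The function names and data layout are hypothetical, and this is not how a TPM evaluates PolicyAuthorize internally.

```python
import hashlib
import hmac
import json


def canonical(pcrs: dict) -> bytes:
    """Stable byte encoding of a {pcr index: hex digest} baseline."""
    return json.dumps(pcrs, sort_keys=True).encode()


def sign_baseline(key: bytes, pcrs: dict) -> bytes:
    # Stand-in for an asymmetric signature by the security team's key.
    return hmac.new(key, canonical(pcrs), hashlib.sha256).digest()


def accept(key: bytes, quoted: dict, baseline: dict, signature: bytes) -> bool:
    """Accept quoted PCRs only if they equal a baseline authorized by the key."""
    authorized = hmac.compare_digest(sign_baseline(key, baseline), signature)
    return authorized and quoted == baseline


key = b"hypothetical-signing-key"
before_update = {"4": "aa" * 32}
after_update = {"4": "bb" * 32}

sig_before = sign_baseline(key, before_update)
sig_after = sign_baseline(key, after_update)  # security team signs the new baseline

print(accept(key, before_update, before_update, sig_before))  # True
print(accept(key, after_update, after_update, sig_before))    # False
print(accept(key, after_update, after_update, sig_after))     # True
```

The verifier's configuration never changes across the update. Only the key is pinned, which is the whole point: the trust anchor is who signed, not which literal values were signed.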

What Would Need to Exist, and Why It Matters

The gap between what TPM attestation promises and what it delivers at scale comes down to five missing pieces of infrastructure. None of them are technically novel. All of them require cross-vendor coordination, which is the hard part.

Firmware vendors publishing signed reference measurements for every release. If Dell, HP, Lenovo, Supermicro, and Intel published signed CoRIM measurement bundles alongside firmware updates, verifiers could check boot measurements against vendor-provided values instead of building golden image databases. The thousands of organizations currently maintaining their own reference values stop doing that redundant, error-prone work. A firmware update becomes verifiable by any attestation service, not just by organizations that happened to capture the right measurements before deploying. This is the single highest-impact change.

OS vendors publishing signed reference measurements for kernels, bootloaders, and initrd images. Red Hat, Canonical, and SUSE would publish expected measurement values for each package version. The cost of operating measured boot drops from “dedicated team” to “configuration.”

A transparency log for reference measurements. Analogous to Certificate Transparency for the web PKI. Reference value providers submit signed measurements to a log. Verifiers check the log. Monitors detect inconsistencies. The incentive structure shifts from “trust the vendor” to “verify the vendor,” which is the entire point of attestation in the first place.

This is not hypothetical. I worked on firmware transparency at Google, including work with Andrea Barisani to integrate it into the Armored Witness, a tamper-evident signing device built on TamaGo and the USB Armory platform. Google publishes a transparency log for Pixel factory images. The broader Binary Transparency framework has production deployments across Go modules, sigstore, and firmware update pipelines. Researchers are extending the approach to server firmware signing. The pattern works. What is missing is adoption by the server firmware vendors whose measurements actually need verifying.
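The core mechanism a verifier relies on in such a log is an inclusion proof: given a published tree root, show that a specific reference measurement is in the log. The sketch below uses the leaf and node hashing from RFC 6962, the Certificate Transparency Merkle tree. Tagging each proof step with the sibling's side is my simplification; real logs derive it from the leaf index and tree size. The entries are hypothetical, and signed tree heads and consistency proofs are left out.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly less than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def root(leaves) -> bytes:
    """Merkle root of a non-empty list of entries."""
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(root(leaves[:k]), root(leaves[k:]))


def inclusion_proof(leaves, index):
    """Sibling hashes from leaf to root, tagged with the sibling's side."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return inclusion_proof(leaves[:k], index) + [("R", root(leaves[k:]))]
    return inclusion_proof(leaves[k:], index - k) + [("L", root(leaves[:k]))]


def verify_inclusion(entry: bytes, proof, expected_root: bytes) -> bool:
    h = leaf_hash(entry)
    for side, sibling in proof:
        h = node_hash(sibling, h) if side == "L" else node_hash(h, sibling)
    return h == expected_root


# Hypothetical log entries: one signed reference measurement per release.
log = [b"fw-1.0 measurements", b"fw-1.1 measurements", b"fw-1.2 measurements"]
published_root = root(log)
proof = inclusion_proof(log, 1)

print(verify_inclusion(log[1], proof, published_root))                  # True
print(verify_inclusion(b"fw-1.1 backdoored", proof, published_root))    # False
```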

Cross-vendor event log normalization. A library that translates vendor-specific event log formats into a common representation, abstracting away the differences between Dell, HP, Lenovo, and Intel firmware event structures.

Attestation verification as a commodity service. Not vendor-specific, not requiring deep expertise, but as simple as an OCSP responder for certificate revocation: send a TPM quote and event log, get back a signed attestation result.

None of these exist at scale as of April 2026. The standards are ready. The hardware is deployed. The market is adopting confidential computing at a pace that assumes this infrastructure is coming. It is not here yet.

None of this fixes the side-channel vulnerabilities in the TEE hardware itself. None of it extends the shelf life of hardware roots of trust. Those are silicon problems that require silicon solutions. But the attestation infrastructure gap is not a silicon problem. It is a coordination and incentive problem, and those are solvable.

The web PKI went through a similar transition, and I watched it happen from the inside. Certificate mis-issuance was undetectable until Certificate Transparency made it visible. Certificate authorities operated without enforceable standards until the CA/Browser Forum Baseline Requirements created them. There was no shared database of trusted roots until CCADB built one. Each of those required cross-vendor coordination that looked unlikely right up until it shipped. The result is an ecosystem that is not perfect but is dramatically more trustworthy than it was fifteen years ago.

The attestation infrastructure could follow the same path. The standards work is done. What remains is the operational commitment from the vendors who manufacture the hardware and the organizations that rely on it.

Every organization deploying measured boot today is independently solving the same problem with their own golden images, their own event log parsers, and their own reference value databases. I have built some of these myself. The standards are ready, the hardware is deployed, and the economic incentive is growing. What is missing is the willingness to coordinate. That is a solvable problem.


This post is the first in a series on confidential computing. The next two posts are What Is Confidential Computing, What It Isn’t, and How to Think About It, followed by Confidential Computing’s Inconvenient Truth. Two companion reference documents provide the full evidence base: the TEE Vulnerability Taxonomy and TPM Attestation and PCR Verification.

Previously: TPMs, TEEs, and Everything In Between: What You Actually Need to Know (March 2025)

We Built It With Slide Rules. Then We Forgot How.

My father grew up on a subsistence farm, the kind that raised chickens and grew just enough to get by. Farmers were the original hackers. You couldn’t wait for the right tool or the right expert. You fixed what was broken with what you had, because the alternative was worse.

As a kid he taught himself rocket chemistry. Not from a kit. From whatever he could source locally. He was trying to make things burn hotter and fly farther, adjusting mixtures through trial and error long before he had words like specific impulse or oxidizer ratio for what he was doing.

The materials weren’t exotic. Potassium nitrate sold as stump remover. Sulfur and charcoal. Mix them correctly and you have black powder, the same oxidizer-fuel logic underlying every solid rocket motor ever built. More ambitious builders used potassium perchlorate from chemical suppliers, mixed with aluminum powder or sugar to control burn rate and energy density. All of it over the counter. All of it accessible to someone willing to read carefully and try things until they worked.

He wasn’t following a plan. He was just that kind of person.

Most people have forgotten that the Air Force had its own space program before NASA existed. NASA was carved out of NACA in 1958, but the Air Force had been running parallel efforts since the mid-1950s. That generation had grown up on science fiction and wanted to see it happen. When Sputnik launched in October 1957 the country went into a low-grade panic about whether it understood physics well enough to survive, and suddenly the kids who had been dreaming about space since they could read had somewhere to go with it. What followed was one of the rare moments in American history when technical aptitude was a genuine class elevator. The government needed people who understood this stuff, and needed them badly enough to find them wherever they were.

He enlisted in his early twenties, aerospace degree in hand. The Air Force space program was what he was aiming at. He ended up working on attitude control thrusters for reconnaissance satellites, the kind that could resolve fine surface detail on Earth from hundreds of miles up. For that mission attitude control wasn’t a secondary problem. It was the central one. A camera that can’t hold still is useless. The thrusters are what made the intelligence possible. The underlying engineering was the same problem he had been teaching himself: oxidizer, fuel, combustion geometry, now controlled to tolerances that left no margin.

I remember him watching a satellite reenter on the cable news when I was young. I don’t know which one or exactly what year. What I remember is that he cried. He told me later there was a plate on that satellite with his name engraved on it. Work he had done, hardware he had touched, in orbit for years and now gone. Grief with no adequate audience, because the context was secret and the people who would have understood were scattered across programs that didn’t officially exist.

Years later my father was excited to watch the launch of Iridium, Motorola’s commercial satellite constellation, whose first satellites went up in 1997. The same fundamental technology, now accessible to anyone with a phone. His generation had figured out how to do this, quietly, under classification, and here it finally was in the open. The knowledge had propagated. Just not through the channels that were supposed to carry it.

He kept a green chalkboard in the garage. He would pull out his slide rule and work through things with me. Orbital decay, thrust, specific impulse, delta-v, the rocket equation and why it makes everything harder than it looks. He had a worry he came back to often: society had forgotten how to go to the moon. The knowledge existed in aging engineers and partially classified documents and it was not being transmitted. The chalkboard was what he could do about that.

Last year Destin Sandlin, an aerospace engineer who describes himself as a redneck from Alabama, walked into a room full of the most senior people in American space policy and did something worth an hour of your time to watch. He asked questions that people inside the institutional food chain had stopped asking. Starting with the most basic one: how many rockets does it take to fuel the Artemis lunar lander?

The room went quiet. Nervous laughter. Public estimates have varied depending on assumptions about boil-off and reuse, but all point to a strikingly high number of launches and on-orbit refueling operations before a landing attempt. Nobody in the room had a confident answer.

These are not uninformed people. A core operational parameter of their own mission architecture was not common knowledge among the people running it.

Then Destin asked the room a simpler question.

“Is this the simplest solution?”

Silence.

Destin pointed them at NASA SP-287, a document the Apollo engineers wrote and left behind specifically so the next generation wouldn’t have to rediscover everything from scratch. The title is “What Made Apollo a Success.” It has been sitting there, public, for decades. Most of the people in that room had not read it.

The principle at the center of that document is blunt:

“Build it simple and then double up on as many components or systems so that if one fails, the other will take over.”

Simple first. Then redundant. Not complex and hoping.

Simple isn’t just aesthetic preference. Simple is how you keep the system inside your head. Simple is how you build procedures all the way down to bolt cutters and still know what comes next. When a system gets complex enough that a room full of its leaders can’t answer a basic operational question about it, it has exceeded the boundary of what they actually understand. They are renting the complexity along with the capability.

The Apollo engineers meant it literally. When designing the ascent stage separation, the mechanism that gets astronauts off the lunar surface, they didn’t stop at one solution or two. They built redundancy on top of redundancy. Flip the switch. If that fails, go outside and trip the manual release. If that fails, depressurize, suit up, go to the bottom of the spacecraft with bolt cutters, and cut the straps holding the stages together. Harrison Schmitt said there was one more procedure after the bolt cutters. Nobody would say what it was.

That’s not genius. That’s a chicken farmer’s epistemology applied to the hardest engineering problem humans had ever attempted. You don’t wait for perfect conditions or perfect knowledge. You start simple, you build every fallback you can think of, and then you think of one more.

Destin argues that Artemis didn’t follow that logic. The NRHO/Gateway architecture was publicly justified in part on communications, surface access, stability, and operational grounds. Destin’s read, and he makes a detailed case for it, is that it’s an architectural constraint dressed up as a design choice: complexity that accumulated because the real constraints couldn’t be named publicly. The result was a room full of program leaders who couldn’t tell you the basic parameters of the system they were running.

That’s what happens when you lose the thread.

Destin also interviewed an engineer who had worked on the lunar landing training vehicle, the machine that taught Apollo astronauts to land in one-sixth gravity by actually putting them in a vehicle where their life depended on getting it right. Destin asked whether the Apollo engineers were smarter than engineers today. The answer was no. What they had wasn’t superior intelligence. It was a bias toward doing, toward simplicity, toward keeping the system inside human heads rather than delegating it to complexity they couldn’t fully reason about.

NASA SP-287 exists because those engineers understood something important. Capability doesn’t survive on its own. Knowledge doesn’t transmit automatically. You have to codify it deliberately or it dies with the people who held it. It is ownership made explicit. Here is what we understood. Here is why it worked. Here is the playbook so the next generation doesn’t have to rediscover it at the cost of lives.

The space race created a machine for turning hands-on knowledge into national capability. It found people like my father wherever they were because it needed what they had already taught themselves. It was the on-ramp, the forcing function that pulled curiosity into programs that mattered and gave it somewhere to go. That same forcing function generated SP-287, the discipline to write it down, the institutional pressure to transmit it. When the race ended the machine stopped. The on-ramp closed. The knowledge didn’t vanish immediately. It aged out, program by program, engineer by engineer, panel by panel. What remained was credentials and institutional memory of having once known how, which is a different thing entirely from knowing how.

We took that gift and built a lunar return architecture that, at least in its public form, often looks more operationally intricate than the Apollo playbook would have preferred. Estimates ranging from eight to fifteen or more rockets just to fuel the lander. A room full of its leaders who hadn’t read the playbook.

“Is this the simplest solution?”

Silence.

That’s not an aerospace problem. That’s the pattern. The knowledge transmission problem is older than aerospace. I’ve been writing about it in other contexts for a while, starting here.

My father spent my childhood pointing at this from a chalkboard in a garage. I didn’t become an astronaut. That was his hope, not my path. The chalkboard worked anyway. The knowledge moved. The Iridium launches proved it. The knowledge his generation developed under classification eventually became infrastructure anyone could hold in their pocket. You can’t fully control where it lands. You can only decide whether to try.

Now AI is doing to software what the end of the space race did to aerospace. It is consuming the early career tasks that used to serve as scaffolding for building judgment. The debugging, the boilerplate, the routine iteration that taught tradeoffs and edge cases before anyone trusted you with the hard problems. The visible work disappears first. The tacit knowledge becomes unreachable just as it becomes most important. The on-ramp closes. And at some point a room full of senior people goes quiet when someone asks a basic operational question, not because they’re uninformed, but because the complexity was delegated before the understanding had time to form.

That is the cautionary tale. Not that AI is bad. That capability outsourced before it is understood leaves you renting decisions you don’t control while keeping consequences you can’t transfer. The room goes quiet. And eventually nobody even thinks to ask whether this is the simplest solution.

My father saw it coming. That’s what the chalkboard was for.

The question isn’t whether you work in aerospace or software. It’s whether you’ve stopped asking basic questions about the system you’re running. Whether it has exceeded the boundary of what you actually understand. Whether you’re renting complexity along with capability and calling it progress.

You don’t wait for perfect knowledge. You read every playbook you can find. You build redundancy all the way down to bolt cutters. And then you think of one more thing.

The chemicals are still on the shelves. SP-287 is still public. The Destin talk is an hour of your time and worth every minute.

Read the playbook.