The WebPKI’s Moral Hazard Problem: When Those Who Decide Don’t Pay the Price

TL;DR: Root programs answer directly to users, so they prioritize safety, and together with the major CAs they shape WebPKI rules. Most CAs, weighing the risk of distrust against the risk of losing customers, push for leniency, shifting the real risk onto billions of relying parties who have no voice. Subscribers’ demand for convenience fuels that CA resistance. The system needs reform.


The recent Mozilla CA Program roundtable discussion draws attention to a fundamental flaw in how we govern the WebPKI, one that threatens the security of billions of internet users. It’s a classic case of moral hazard: those making critical security decisions face minimal personal or professional consequences for poor choices, while those most affected have virtually no say in how the system operates.

The Moral Hazard Matrix

The numbers reveal a dangerous imbalance between who controls WebPKI policy and who bears the consequences. Browsers, as root programs, face direct accountability: if security fails, users abandon them. CAs, on the other hand, are incentivized to reduce customer effort and boost margins, externalizing risk and leaving billions of relying parties to absorb the fallout.

This is a classic moral hazard structure, with one key distinction: browser vendors, as root programs, face direct consequences (lose security, lose users), which aligns their incentives with safety. CAs, while they do risk distrust or customer loss, often externalize the larger risks to relying parties, betting that they won’t be held accountable for those decisions.

Mapping the Accountability Breakdown

The roundtable revealed a systematic divide in how stakeholders approach CPS compliance issues. CAs, driven by incentives to minimize customer effort for easy sales and reduce operational costs for higher margins, consistently seek to weaken accountability, while root programs and the security community demand reliable commitments:

| Position | Supported By | Core Argument | What It Really Reveals |
| --- | --- | --- | --- |
| “Revocation too harsh for minor CPS errors” | CA Owners | Policy mismatches shouldn’t trigger mass revocation | Want consequence-free policy violations |
| “Strict enforcement discourages transparency” | CA Owners | Fear of accountability leads to vague CPSs | Treating governance documents as optional “documentation” |
| “SLA-backed remedies for enhanced controls” | CA Owners | Credits instead of revocation for optional practices | Attempt to privatize trust governance |
| “Split CPS into binding/non-binding sections” | CA Owners | Reduce revocation triggers through document structure | Avoid accountability while claiming transparency |
| “Human error is inevitable” | CA Owners | Manual processes will always have mistakes | Excuse for not investing in automation |
| “Retroactive CPS fixes should be allowed” | CA Owners | Patch documents after problems surface | Gut the very purpose of binding commitments |
| “CPS must be enforceable promises” | Root Programs, Security Community | Documents should reflect actual CA behavior | Public trust requires verifiability |
| “Automation makes compliance violations preventable” | Technical Community | 65+% ACME adoption proves feasibility | Engineering solutions exist today |

The pattern is unmistakable: CAs consistently seek reduced accountability, while those bearing security consequences demand reliable commitments. The Microsoft incident illustrates this perfectly: rather than addressing the absence of systems that would have automatically caught discrepancies before millions of certificates were issued incorrectly, industry discussion focused on making violations easier to excuse retroactively.

The Fundamental Mischaracterization

Much of the roundtable suffered from a critical misconception: treating the CPS as mere “documentation” rather than what it actually is, the foundational governance document that defines how a CA operates.

A CPS looks like a contract because it is a contract, a contract with the world. It’s the binding agreement that governs CA operations, builds trust by showing relying parties how the CA actually works, guides subscribers through certification requirements, and enables oversight by giving auditors a baseline to check real-world issuance against. When we minimize it as “documentation,” we’re arguing that CAs should be able to violate their core operational commitments with minimal consequences.

CPS documents are the public guarantee that a CA knows what it’s doing and will stand behind it, in advance, in writing, in full view of the world. The moment we treat them as optional “documentation” subject to retroactive fixes, we’ve abandoned any pretense that trustworthiness can be verified rather than simply taken on blind faith.

Strategic Choices Masquerading as Constraints

Much CA pushback treats organizational and engineering design decisions as inevitable operational constraints. When CAs complain about “compliance staff being distant from engineering” or “inevitable human errors in 100+ page documents,” they’re presenting strategic choices as unchangeable facts.

CAs choose to separate compliance from operations rather than integrate them. They choose to treat CPS creation as documentation rather than operational specification. They choose to bolt compliance on after the fact rather than build it into core systems. When you choose to join root programs to be trusted by billions of people, you choose those responsibilities.

The CAs that consistently avoid compliance problems made different choices from the beginning: they integrated policy into operations, invested in automation, and designed systems where compliance violations are structurally difficult. These aren’t companies with magical resources; they’re companies that prioritized operational integrity.

The Technology-Governance Gap

The “automation is too hard” argument collapses against actual WebPKI achievements:

| Challenge | Current State | Feasibility Evidence | CA Resistance |
| --- | --- | --- | --- |
| Domain Validation | Fully automated via ACME | 65+% of web certificates | ✅ Widely adopted |
| Certificate Linting | Real-time validation at issuance | CT logs, zlint tooling | ✅ Industry standard |
| Transparency Logging | All certificates publicly logged | Certificate Transparency | ✅ Mandatory compliance |
| Renewal Management | Automated with ARI | Let’s Encrypt, others | ✅ Proven at scale |
| CPS-to-Issuance Alignment | Manual, error-prone | Machine-readable policies possible | ❌ “Too complex” |
| Policy Compliance Checking | After-the-fact incident reports | Automated validation possible | ❌ “Inevitable human error” |

The pattern is unmistakable: automation succeeds when mandated, fails when optional. With Certificate Transparency providing complete visibility, automated validation systems proven at scale, and AI poised to transform compliance verification across industries, operational CPSs represent evolution, not revolution.
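
To make the “machine-readable policies possible” row concrete, here is a minimal sketch of issuance-time CPS linting in Python. The policy values and field names are hypothetical, not drawn from any real CA’s published CPS; the certificate handling uses the cryptography library.

```python
# pip install cryptography
# Minimal sketch of issuance-time policy linting. POLICY stands in for a
# hypothetical machine-readable CPS fragment, not any real CA's policy.
import sys
from datetime import timedelta

from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import ec, rsa

POLICY = {
    "max_validity_days": 90,  # the lifetime the CPS commits to
    "allowed_keys": {"rsa-2048", "rsa-3072", "secp256r1", "secp384r1"},
}

def describe_key(cert: x509.Certificate) -> str:
    """Summarize the subject public key as 'rsa-<bits>' or a curve name."""
    pub = cert.public_key()
    if isinstance(pub, rsa.RSAPublicKey):
        return f"rsa-{pub.key_size}"
    if isinstance(pub, ec.EllipticCurvePublicKey):
        return pub.curve.name  # e.g. "secp256r1"
    return type(pub).__name__

def lint(cert: x509.Certificate) -> list[str]:
    """Return every way this certificate violates the declared policy."""
    problems = []
    lifetime = cert.not_valid_after_utc - cert.not_valid_before_utc  # cryptography >= 42
    if lifetime > timedelta(days=POLICY["max_validity_days"]):
        problems.append(f"lifetime of {lifetime.days} days exceeds CPS limit")
    key = describe_key(cert)
    if key not in POLICY["allowed_keys"]:
        problems.append(f"key type {key} not permitted by CPS")
    return problems

if __name__ == "__main__":
    with open(sys.argv[1], "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read())
    for problem in lint(cert):
        print("POLICY VIOLATION:", problem)
```

The structural point is that the same policy data a human reads in the CPS gates issuance in code, so a mismatch fails before a certificate exists rather than surfacing in a post-hoc incident report.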

The argument is that these “minor” incidents aren’t smoke, as in “where there’s smoke, there’s fire.” But past distrust events show otherwise: it is nearly always a pattern of mistakes that snowballs, while the most mature CA programs have only occasional issues and handle them well when they do occur.

Trust Is Not an Entitlement

The question “why would CAs voluntarily adopt expensive automation?” reveals a fundamental misunderstanding. CAs are not entitled to be trusted by the world.

Trust store inclusion is a privilege that comes with responsibilities. If a CA cannot or will not invest in operational practices necessary to serve billions of relying parties reliably, they should not hold that privilege.

The economic argument is backwards:

  • Current framing: “Automation is expensive, so CAs shouldn’t be required to implement it”
  • Correct framing: “If you can’t afford to operate securely, accurately, and reliably, you can’t afford to be a public CA”

Consider the alternatives: public utilities must maintain infrastructure standards regardless of cost, financial institutions must invest in security regardless of expense, aviation companies must meet safety standards regardless of operational burden. The WebPKI serves more people than any of these industries, yet we’re supposed to accept that operational excellence is optional because it’s “expensive”?

CAs with consistent compliance problems impose costs on everyone else: subscribers face revocation disruption, relying parties face security risks, and root programs waste resources on incident management. The “expensive automation” saves the ecosystem far more than it costs individual CAs.

When Accountability Actually Works

The example of Let’s Encrypt changing their CPS from “90 days” to “less than 100 days” after a compliance issue is often cited as evidence that strict enforcement creates problems. This completely misses the point.

The “system” found a real compliance issue: inadequate testing of policy against implementation. That is exactly what publishing specific commitments accomplishes, making gaps visible so they can be fixed. The accountability mechanism worked as intended; Let’s Encrypt learned it needed better testing to keep policy and implementation aligned.

This incident also revealed that we need infrastructure like ACME Renewal Information (ARI) so the ecosystem can manage obligations without fire drills. The right response isn’t vaguer CPSs that hide discrepancies; it’s better testing and ecosystem coordination, so CAs can reliably commit to 90 days, and to revocation when mistakes happen.
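
As a sketch of what that better testing could look like, consider a CI check that fails whenever the machine-readable CPS commitment and the issuance pipeline’s configuration drift apart. The file paths and field names below are hypothetical.

```python
# test_cps_alignment.py, a hypothetical CI gate run on every CPS or
# pipeline change; file paths and field names are illustrative only.
import json
from datetime import timedelta

def cps_committed_lifetime(path="cps/policy.json") -> timedelta:
    """What the published, machine-readable CPS promises the world."""
    with open(path) as f:
        return timedelta(days=json.load(f)["max_validity_days"])

def pipeline_lifetime(path="issuance/config.json") -> timedelta:
    """What the issuance system actually stamps into certificates."""
    with open(path) as f:
        return timedelta(days=json.load(f)["certificate_lifetime_days"])

def test_lifetime_commitment_matches_implementation():
    # Fails the build the moment the CPS says "90 days" while the
    # pipeline issues anything longer, the exact gap in the example above.
    assert pipeline_lifetime() <= cps_committed_lifetime()
```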

The Solution: Operational CPSs

Instead of weakening accountability, we need CPSs as the living center of CA operations, machine-readable on one side to directly govern issuance systems, human-readable on the other for auditors and relying parties. In the age of AI, tools like large language models and automated validation can make this dual-purpose CPS tractable, aligning policy with execution.

This means CPSs written by people who understand actual issuance flows, updated in lock-step with operational changes, tied directly to automated linting, maintained in public version control, and tested continuously to verify documentation matches reality.

Success criteria are straightforward:

  • Scope clarity: Which root certificates does this cover?
  • Profile fidelity: Could someone recreate certificates matching actual issuance?
  • Validation transparency: Can procedures be understood without insider knowledge?

Most CPSs fail these basic tests. The few that pass prove it’s entirely achievable when CAs prioritize operational integrity over administrative convenience.
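
As one possible shape for such a document, here is a hypothetical machine-readable CPS fragment organized around the three criteria above. No standard schema exists yet, so every field name is illustrative.

```python
# A hypothetical machine-readable CPS fragment; there is no published
# standard schema, so all field names here are illustrative.
OPERATIONAL_CPS = {
    "version": "2.1.0",  # bumped in lock-step with operational changes
    "scope": {
        # Scope clarity: exactly which roots this document governs.
        "roots": ["Example Root CA 1"],
    },
    "profiles": {
        # Profile fidelity: enough detail to recreate actual issuance.
        "tls-server": {
            "max_validity_days": 90,
            "key_types": ["secp256r1", "rsa-2048"],
            "extended_key_usage": ["serverAuth"],
        },
    },
    "validation": {
        # Validation transparency: methods anyone can understand and verify.
        "methods": ["dns-01", "http-01"],
    },
}
```

The human-readable prose and this structure would be generated from, or at least continuously cross-checked against, the same source, so auditors and issuance systems consume one set of facts.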

Systemic Reform Requirements

Fixing moral hazard requires accountability mechanisms aligned with actual capabilities. Root programs typically operate with 1-2 people overseeing ~60 organizations that issue 450,000+ certificates per hour; these are structural constraints that automation must address.

| Stakeholder | Current State | Required Changes | Implementation |
| --- | --- | --- | --- |
| CAs | Manual CPS creation, retroactive fixes | CPSs as operational specifications | Engineering-written, issuance-system-tied, continuously tested |
| Root Programs | Minimal staff, inconsistent enforcement | Clearer CPS requirements, automated evaluation tools, clear standards | Scalable infrastructure requiring scope clarity, profile fidelity, and validation transparency |
| Standards Bodies | Voluntary guidelines, weak enforcement | Mandatory automation requirements | Updated requirements that drive adoption of automation to ensure commitments are met |
| Audit System | Annual snapshots, limited scope | Continuous monitoring, real-time validation | Integration with operational systems |

Root programs that tolerate retroactive CPS fixes inadvertently encourage corner-cutting on prevention systems. Given resource constraints, automated evaluation tools and clear standards become essential for consistent enforcement.
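
Certificate Transparency already gives monitors everything they need for the “continuous monitoring” row above: every issued certificate is publicly visible through the RFC 6962 log API. A compliance monitor can tail a log and run lints, like the earlier policy sketch, over each new entry. The log URL and lint callback below are placeholders.

```python
# pip install requests
# Skeleton for continuous CT-based compliance monitoring over the
# RFC 6962 log API. LOG_URL is a placeholder; the lint callback would
# decode each entry and apply checks like the earlier policy lint.
import time
import requests

LOG_URL = "https://ct.example.com/log"  # hypothetical RFC 6962 log endpoint

def tree_size(log: str) -> int:
    """Current size of the log, from the signed tree head (get-sth)."""
    return requests.get(f"{log}/ct/v1/get-sth", timeout=10).json()["tree_size"]

def entries(log: str, start: int, end: int) -> list[dict]:
    """Raw entries in [start, end] (get-entries). Real logs cap the batch
    size, so production code would paginate."""
    r = requests.get(f"{log}/ct/v1/get-entries",
                     params={"start": start, "end": end}, timeout=10)
    return r.json()["entries"]

def monitor(log: str, lint, poll_seconds: int = 60) -> None:
    """Tail the log forever, handing every new entry to the lint callback."""
    seen = tree_size(log)
    while True:
        size = tree_size(log)
        if size > seen:
            for entry in entries(log, seen, size - 1):
                lint(entry)  # decode leaf_input/extra_data, then lint the cert
            seen = size
        time.sleep(poll_seconds)
```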

The Stakes Demand Action

Eight billion people depend on this system. We cannot allow fewer than 60 CA-owning organizations to keep treating public commitments as optional paperwork instead of operational specifications.

When certificate failures occur, people lose life savings, have private communications exposed, lose jobs when business systems fail, or face physical danger when critical infrastructure is compromised. DigiNotar’s 2011 collapse showed how single CA failures can compromise national digital infrastructure. CAs make decisions that enable these risks; relying parties bear the consequences.

The choice is stark:

  • Continue excuse-making and accountability avoidance while billions absorb security consequences
  • Or demand that CAs and root programs invest in systems making trust verifiable

The WebPKI’s moral hazard problem won’t solve itself. Those with power to fix it have too little incentive to act; those who suffer consequences have too little voice to demand change.

The WebPKI stands at a turning point. Root programs, the guardians of web privacy, are under strain from the EU’s eIDAS 2.0 pushing questionable CAs, tech layoffs thinning their teams, and the U.S. DOJ’s plan to break up Chrome, a cornerstone of web security. With eight billion people depending on this system, weak CAs could fuel phishing scams, data breaches, or outages that upend lives, as DigiNotar’s 2011 downfall showed. That failure taught us trust must be earned through action. Automation, agility, and transparency can deliver a WebPKI where accountability is built in. Let’s urge CAs, root programs, and the security community to adopt machine-readable CPSs by 2026, ensuring trust is ironclad. The time to act is now; together, we can secure the web for our children and our grandchildren.


For more on this topic, see my take on why CP and CPSs matter more than you think.
