Category Archives: Uncategorized

The Rebirth of Network Access Protection with Microsoft’s Zero Trust DNS

The other day Microsoft announced something it calls Zero Trust DNS. At a high level, it is leveraging clients’ underlying name resolution capabilities to establish and enforce security controls below the application.

In design, it is quite similar to what we did in Windows 2008 and Network Access Protection (NAP), a now deprecated solution that if released today would be considered some flavor of “Zero Trust” network access control.

NAP supported several different enforcement approaches for network policy. One of the less secure methods assessed a client’s posture; if found to be non-compliant with organizational requirements, the DHCP server would assign a restricted IP address configuration. The most secure approach relied on certificate-based IPSEC and IPv6. Healthy clients were provisioned with what we called a health certificate. All participating devices would then communicate only with IPSEC-authenticated traffic and drop the rest.

If ZeroTrust DNS had been released in 2008, it would have been another enforcement mode for Network Access Protection. It operates very similarly, essentially functioning on the basis that :

  1. The enterprise controls the endpoint and, through group policy, dictates DNS client behavior and network egress policy,
  2. Leverage mutual TLS and DNS over HTTPS (DoH) to authenticate clients, and
  3. Transform the DNS server into a policy server for network access.

I was always disappointed to see NAP get deprecated, especially the 802.1X and IPSEC-based enforcement models. Don’t get me wrong, there were many things we should have done differently, but the value of this kind of enforcement and visibility in an enterprise is immense. This is why, years later, the patterns in NAP were reborn in solutions like BeyondCorp and the myriad of “Zero Trust” solutions we see today.

So why might this ZeroTust DNS be interesting? One of the more common concerns I have heard from large enterprises is that the mass adoption of encrypted communications has made it hard for them to manage the security posture of their environment. This is because many of the controls they have historically relied on for security were designed around monitoring clear text traffic.

This is how we ended up with abominations like Enterprise Transport Security which intentionally weakens the TLS 1.3 protocol to enable attackers—erm, enterprises—to continue decrypting traffic. 

One of the theses of this Zero Trust DNS solution appears to be that by turning DNS into what we used to call a Policy Enforcement Point for the network enterprises get some of that visibility back. While they do not get cleartext traffic, they do get to reliably control and audit what domain names you resolve. When you combine that with egress network filtering, it has the potential to create a closed loop where an enterprise can have some confidence about where traffic is going and when. While I would not want my ISP to do any of this, I think it’s quite reasonable for an enterprise to do so; it’s their machine, their data, and their traffic. It also has the potential to be used as a way to make lateral movement in a network, when a compromise takes place, harder and maybe, in some cases, even make exfiltration harder.

Like all solutions that try to deliver network isolation properties, the sticking point comes back to how do you create useful policies that reduce your risk but still let work happen as usual. Having the rules based on high-level concepts like a DNS name should make this better, than with, for example, IPSEC-based isolation models, but it still won’t be trivial to manage. It looks like this will have all of those challenges still but that is true of all network segmentation approaches.

What I appreciate most about this is its potential to reduce an organization’s perceived need to deploy MiTM solutions. From a security protocol design perspective, what is happening here is a trade-off between metadata leakage and confidentiality. The MiTM solutions in use today cause numerous problems; they hinder the adoption of more secure protocol variants and objectively reduce enterprise security in at least one dimension. They also make rolling out new versions of TLS and more secure cryptography that much harder. Therefore, in my opinion, this is likely a good trade-off for some organizations.

To be clear, I do not believe that host-based logging of endpoint communications will lead all enterprises to abandon these MiTM practices. For example, some MiTM use cases focus on the intractable problems of Data Leak Protection, network traffic optimization, or single sign-on and privilege access management through protocol-level manipulation. These solutions clearly require cleartext access, and name-based access control and logging won’t be enough to persuade enterprises that rely on these technologies to move away from MiTM. However, there are some use cases where it might.

So, is this a good change or a bad change? I would say on average it’s probably good, and with some more investment from Microsoft, it could be a reasonable pattern to adopt for more granular network segmentation while giving enterprises more visibility in this increasingly encrypted world without needing to break TLS and other encryption schemes.

How TLS Certificates Can Authenticate DNS TXT Records

Have you found a use case where you think DANE and DNSSEC might be helpful? For example, the discovery of some configuration associated with a domain? Since a practically useful DNSSEC deployment, which requires individual domains (ex: example.com) to adopt DNSSEC and for relevant clients to use a fully validating DNSSEC resolver, which has not happened yet at any reasonable scale, maybe consider using certificates to sign the values you place in DNS instead.

SXG certificates are, in essence, signing certificates tied to a domain, while you could use a regular TLS certificate, for today’s post let’s assume that SXG certificates were the path you chose.

You can enroll for SXG certificates for free through Google Trust Services. This would allow you to benefit from signed data in DNS that is verifiable and deployable today not only once DNSSEC gets broad deployment.

Assuming a short certificate chain, since DNS does have size restrictions, this could be done as simply as:

Get an SXG certificate that you can use for signing….

sudo certbot certonly \
  --server https://dv-sxg.acme-v02.api.pki.goog/directory \
  -d yourdomain.com \
  --manual \
  --eab-kid YOUR_EAB_KID \
  --eab-hmac-key YOUR_EAB_HMAC_KEY

Sign your data….

openssl smime -sign -in data.txt -out signeddata.pkcs7 -signer mycert.pem -inkey mykey.pem -certfile cacert.pem -outform PEM -nodetach

Then put the data into a DNS TXT record….

After which you could verify that using the Mozilla trust list using OpenSSL…

openssl smime -verify -in signeddata.pkcs7 -CAfile cacert.pem -inform PEM

In practice, due to DNS size constraints, you would likely use a simpler signature format, such as encoding just the signing certificate and the signature in Base64 as a type-length-value representation; with a 256-bit ECC certificate and its signature, this record would total around 1152 bytes, which would comfortably fit inside a DNS TXT record, thanks to EDNS which has a capacity of up to 4096 bytes.

While this does not provide the authoritative non-existence property that DNSSEC has, which exists to address downgrade attacks, which means relying parties could not detect if the record was removed, nor does it provide security to DNS overall, but if we are just talking about configuration data, it might be a viable approach to consider.

On the other hand, this form of signing a TXT record using a certificate is never going to take down your domain and DNSSEC adoption cannot say that!

Effortless Certificate Lifecycle Management for S/MIME

In September 2023, the SMIME Baseline Requirements (BRs) officially became a requirement for Certificate Authorities (CAs) issuing S/MIME certificates (for more details, visit CA/Browser Forum S/MIME BRs).

The definition of these BRs served two main purposes. Firstly, they established a standard profile for CAs to follow when issuing S/MIME certificates. Secondly, they detailed the necessary steps for validating each certificate, ensuring a consistent level of validation was performed by each CA.

One of the new validation methods introduced permits mail server operators to verify a user’s control over their mailbox. Considering that these services have ownership and control over the email addresses, it seems only logical for them to be able to do domain control verification on behalf of their users since they could bypass any individual domain control challenge anyway. This approach resembles the HTTP-01 validation used in ACME (RFC 8555), where the server effectively ‘stands in’ for the user, just as a website does for its domain.

Another new validation method involves delegating the verification of email addresses through domain control, using any approved TLS domain control methods. Though all domain control methods are allowed for in TLS certificates as supported its easiest to think of the DNS-01 method in ACME here. Again the idea here is straightforward: if someone can modify a domain’s TXT record, they can also change MX records or other DNS settings. So, giving them this authority suggests they should reasonably be able to handle certificate issuance.

Note: If you have concerns about these two realities, it’s a good idea to take a few steps. First, ensure that you trust everyone who administers your DNS and make sure it is securely locked down. 

To control the issuance of S/MIME certificates and prevent unauthorized issuance, the Certification Authority Authorization (CAA) record can be used. Originally developed for TLS, its recently been enhanced to include S/MIME (Read more about CAA and S/MIME).

Here’s how you can create a CAA record for S/MIME: Suppose an organization, let’s call it ‘ExampleCo’, decides to permit only a specific CA, ‘ExampleCA’, to issue S/MIME certificates for its domain ‘example.com’. The CAA record in their DNS would look like this:

example.com. IN CAA 0 smimeemail "ExampleCA.com"

This configuration ensures that only ‘ExampleCA.com’ can issue S/MIME certificates for ‘example.com’, significantly bolstering the organization’s digital security.

If you wanted to stop any CA from issuing a S/MIME certificate you would create a record that looks like this: 

example.com. IN CAA 0 issuemail ";"

Another new concept introduced in this round of changes is a new concept called an account identifier in the latest CAA specification. This feature allows a CA to link the authorization to issue certificates to a specific account within their system. For instance:

example.com. IN CAA 0 issue "ca-example.com; account=12345"

This means that ‘ca-example.com’ can issue certificates for ‘example.com’, but only under the account number 12345.

This opens up interesting possibilities, such as delegating certificate management for S/MIME or CDNs to third parties. Imagine a scenario where a browser plugin, is produced and managed by a SaaS on behalf of the organization deploying S/MIME. This plug-in takes care of the initial enrollment, certificate lifecycle management, and S/MIME implementation acting as a sort of S/MIME CDN.

This new capability, merging third-party delegation with specific account control, was not feasible until now. It represents a new way for organizations to outsource the acquisition and management of S/MIME certificates, simplifying processes for both end-users and the organizations themselves.

To the best of my knowledge, no one is using this approach yet, and although there is no requirement yet to enforce CAA for SMIME it is in the works. Regardless the RFC has been standardized for a few months now but despite that, I bet that CAs that were issuing S/MIME certificates before this new CAA RFC was released are not respecting the CAA record yet even though they should be. If you are a security researcher and have spare time that’s probably a worthwhile area to poke around 😉

Raising the Bar: The Urgent Need for Enhanced Firmware Security and Transparency

Firmware forms the foundation of all our security investments. Unfortunately, firmware source code is rarely available to the public and as a result is one of the least understood (and least secure) classes of software we depend on today.

Despite this, the hardware industry is known for its lack of transparency and inadequate planning for the entire lifecycle of their products. This lack of planning and transparency makes it hard to defend against and respond to both known and unknown vulnerabilities, especially when the industry often knows about issues for ages, but customers do not.

In today’s world, automation allows builders, defenders and attackers to automatically identify zero-day vulnerabilities with just a binary it has become increasingly important that embargo times for vulnerabilities are as short as possible, allowing for quick defense and, when possible, remediation.

Despite this, organizations like the UEFI Forum are proposing extended disclosure periods, suggesting a 300-day wait from initial reporting to the vendor before notifying customers. During this year-long waiting period, customers are exposed to risks without defense options. The longer the period, the more likely it is that automation enables the attacker to identify the issue in parallel, giving them a safe period to exploit the zero-day without detection.

Simply put, this duration seems way too long, considering the ease of proactively catching issues now — especially given the industry’s overall underinvestment in product security. It would be a different case if these organizations had a history of handling issues effectively, but the reality is far from this. Their apparent neglect, demonstrated by unreliable update mechanisms, continuously shipping models with the same issues that have been resolved in other models, and the frequency of industry-wide issues highlight this reality. More often than any other industry, we see hardware manufacturers often reintroducing previously resolved security issues due to poor security practices and poor management of their complex supply chains. This reality makes this position highly irresponsible. We must do better. Concealing vulnerabilities like this is no longer viable — if it ever was.

It is possible we will see changes as a result of shifts in software liability and regulatory changes, like those in White House Executive Order 1428. This order demands that organizations responsible for “critical software” comply with long-standing best practices. Although “critical software” lacks a clear definition, firmware’s role in underpinning all security investments suggests it likely falls into this category. This executive order starts with basics like publishing known dependencies, which is useful but insufficient, especially in this segment given the prevalence of shared reference code and static dependencies that are not expressed as a library dependencies. This language includes adoption of formal vulnerability management practices, bug bounties, and more. This and the EU Cyber Resilience Act are all efforts to get these and other vendors to align with long-time security best practices, like those captured by efforts like the NIST’s vulnerability management recommendations.

This landscape will likely shift once we see enforcement cases emerge, but customers must insist on higher standards from hardware manufacturers and their suppliers, or nothing will change in the near term.

Words matter in cryptography or at least they used to

I was listening to Security Cryptography Whatever today, and they were discussing a topic that has been bothering me for a while.

A common theme in post-quantum cryptography is its pairing with classical cryptography. This “belts and suspenders” approach seems sensible as we transition to relatively new ways to authenticate and protect data. We have already seen some of these new post-quantum methods fail, which underscores the importance of agility in these systems.

However, merging two approaches like this introduces complexity, which is important since as a general rule, complexity is the root of all security issues. Another concern is the labeling of various strategies for doing this as “Hybrid.” This wording makes it challenging to understand what the different approaches are doing and why.

With this background in mind, let’s explore three different “Hybrid” approaches to PQC and classical cryptography. By giving each a unique name and using simple examples, to see if we we can show how they differ: Nested Hybrid Signatures, Side-by-Side Hybrid Protocols, and the proposed Merged Hybrid Signatures.

Nested Hybrid Signatures: A box within a box

In this approach, imagine verifying the authenticity of a letter. The nested hybrid signature method is like putting this letter within a secure box, protected by a classical signature scheme like ECDSA. But we don’t stop there. This box is then placed within another, even stronger box, strengthened with a post-quantum signature scheme like Dilithium. This nested structure creates a situation where even if one layer is broken, the inner core remains trustable..

Side-by-Side Hybrid Protocols: Simultaneous and Nested

In this method, imagine two separate safes, each protecting a part of your secret message. One safe has a classical lock, while the other has a modern, quantum-resistant lock. To see the entire message, one must unlock both safes, as the full message remains trustable unless both safes are broken into. 

Merged Hybrid Signatures: Holding onto the past

This method tries to mix the elements of classical and post-quantum signature schemes into a single, unified signature format. The goal of this approach is to enable minimal changes to existing systems by maintaining a single field that combines a classical signature with a post-quantum signature. This has several issues and seems misguided to me. Firstly, this mixing of PQC and classical cryptography is a temporary problem; eventually, we should have enough confidence that post-quantum cryptography alone is enough at which point this complexity wouldn’t be needed. It also messes with the current assumptions associated with existing signatures, and while it’s not clear what the issues may be, keeping each of the signatures isolated seems less risky. To stick with the lock analogy, it’s somewhat like designing a safe with two different locks on the same door, which must be unlocked at the same time with the same key.

Conclusion

While it’s tough to find the right words to describe new developments as they happen we can do better to avoid using the same terms for different approaches. This will make it easier for everyone to understand what’s being discussed without having to study each protocol in detail. 

Using Caddy with Google Trust Services

Caddy is a powerful and easy-to-use web server that can be configured to use a variety of certificate authorities (CA) to issue SSL/TLS certificates. One popular CA is Google Trust Services, which offers an ACME endpoint that is already compatible with Caddy because it implements the industry-standard ACME protocol (RFC 8555). 

This means that Caddy can automatically handle the process of certificate issuance and renewal with Google Trust Services, once the External Account Binding (EAB) credentials required have been configured. 

How do I use it? 

Using global options

To use the Google Trust Services ACME endpoint you will need an API key so you can use a feature in ACME called External Account Binding. This enables us to associate your certificate requests to your Google Cloud account and allows us to impose rate limits on a per-customer basis. You may easily get an API key using the following commands:

$ gcloud config set project <project ID>
$ gcloud projects add-iam-policy-binding project-foo \  –member=user:[email protected] \  –role=roles/publicca.externalAccountKeyCreator
# Request a key:
$ gcloud alpha publicca external-account-keys create

You will need to add this API key and specify the Google Trust Services ACME directory endpoint along with your email address in your Caddyfile:

{    
acme_ca https://dv.acme-v02.api.pki.goog/directory
email  [email protected]
acme_eab {
        key_id  <key_id>
        mac_key <mac_key>
    }}

It is important to remember that when you use this configuration Google Trust Services explicitly is used as the only CA.

If you want to use multiple CAs for redundancy Caddy which is recommended the configuration would look something like this:

{     
cert_issuer acme https://dv.acme-v02.api.pki.goog/directory  {
          eab <key_id>  <key>
     }
     cert_issuer acme
}

In this example Google Trust Services will be tried and if there is a problem it will fall back to Let’s Encrypt.

It is also worth noting that the Google Trust Services EAB key is one time use only. This means that once Caddy has created your ACME account these can be safely removed.

Using the tls directive

If you want to use Google Trust Services for only some of your sites, you can use the tls directive in your Caddyfile like you’re used to:

tls [email protected] {
    ca https://dv.acme-v02.api.pki.goog/directory
   eab <key_id> <mac_key>
}

The email address in this example identifies the ACME account to be used when doing enrollment.

In conclusion, using Caddy with Google Trust Services is a simple and simple and secure way to issue and manage SSL/TLS certificates for your websites. With the easy-to-use Caddyfile configuration, you can quickly and easily configure your server to use Google Trust Services for all of your sites or just a select few. With Google Trust Services, you can trust that your websites will be secure and your visitors’ data will be protected.

Top #5 CAs by issuance volume as of 09/19/22

As Google Trust Services has been available for a few weeks I thought it would be interesting to look at where it stands relative to other CAs based on its issuance volume.

#CA OwnerCertificatesPre-Certificates
AllUnexpiredAllUnexpired
1Internet Security Research Group2,972,072,131270,986,3172,689,923,862233,570,311
2DigiCert115,808,4067,603,151443,129,508138,144,685
3Sectigo580,982,48145,262,868517,794,47746,204,659
4Google Trust Services LLC13,909,548467,650120,070,01321,232,287
5Microsoft Corporation17,56717032,448,45316,959,805

For more on this methodology of counting see this post.

Information system security and how little things have changed

When I was a boy my father had me read Plato’s Republic – he wanted to give an oral report on what the key points of the book were and what my personal takeaways were after reading the book.

The first question was easy to answer from the dust jacket or maybe a Cliff Notes (For those of you who have not read the book it is an exploration of the ideas of justice and the ideal government).

With that said, I knew from experience that those personal takeaways are buried in the nuance and no shortcut would satisfy him so off to read I went. What were those takeaways? According to him what I said was: 

  1. The nature of people has not changed much,
  2. The problems we have in government have not changed much.

Why do I bring this up in the context of security? Unfortunately, it is because I do not think things have changed much in security either! I’ll give two examples that stand out to me:

Every program and every privileged user of the system should operate using the least amount of privilege necessary to complete the job.

— Jerome Saltzer, 1974, Communications of the ACM

The moral is obvious. You can’t trust code that you did not totally create yourself. (Especially code from companies that employ people like me.) No amount of source-level verification or scrutiny will protect you from using untrusted code.

Ken Thompson, 1984, Reflections on Trusting Trust

The first quote is the seminal quote referring to the term “least privilege” – a concept we still struggle to see deployed nearly 50 years later. The term is old enough now the marketers have latched onto it so when you speak to many enterprises they talk about it in the scope of group management and not the more fundamental design paradigm it actually represents.

To put this concept in the context of the network in the 90s we talked about how Firewalls, however necessary, were a bit of an antipattern since they represented “the hard candy shell” containing the “soft gooey sweet stuff” the attacker wants to get at and that as a result, it was better to design security into each endpoint. 

A decade later we were talking about using network-level enforcement via “Network Admission Control” at the switch, later yet via DirectAccess and Network Access Protection we were pushing those same decisions down as close to the end device as we could, and in some cases making each of those endpoints capable of enforcing these access requests.

Today we call this pattern ZeroTrust networking, a leading example of this pattern is called BeyondCorp, but again marketers have latched onto ZeroTrust and as a result, it seems almost every enterprise product I hear about these days claims to offer some sort of ZeroTrust story but few objectively meet the criteria I would define for such a lofty term.

Similarly, if we look at the second quote all we have to do is take a look at the recent SolarWinds debacle and realize that almost nothing has changed since Ken Thompson wrote that paper. We also have dozens of examples of keys being compromised being used to attack the software supply chain, or package repositories and open source dependencies being used as attack vectors. Despite us knowing how significant these issues can be for nearly 40 years we have made very little progress in mitigating these issues.

As they say, there is nothing new under the sun, and this appears to be especially true with security. If so why is this the case? How is it we have made so little progress on these fundamental problems as an industry?

Unfortunately, I think it boils down to that customers don’t care until it is too late and this makes it hard for the industry to justify the kinds of fundamental investments necessary to protect the next generation from these decades old.

How do we improve the state of affairs here? Thats really the question, one I don’t have a good answer to.

ResortQuest, Wyndham, Home Away, 2 inches of water, vomit stench and bad management.

We recently had a family vacation in Miramar Beach Florida, where we stayed at the SurfSide in unit #502. The experience was, shall we say less than we expected. The management company was incompetent, did not meet their legal requirements when responding to an emergency and left us in a bad situation despite a ton of attempts on our side to work with them.

To top things off HomeAway refused to provide even the most minimal levels of assistance when the management company failed to live up to their obligations.

Bellow is my unedited review of that experience, hopefully, it will help someone in the future.

This is our second stop while in the Destin area. Our first was as a described and great experience provided by Southern Vacation Rentals, this stay, however, left us, literally with a bad taste in our mouth.

Pulling up to the building it was obvious it had seen better days but on other vacations, we have had similar thoughts and were pleasantly surprised when we got to the room.

This time, however, when we opened the door a sour smell reminiscent of dried vomit🤮 engulfed us. A quick inspection revealed that it originated from the couches and the rugs. They, like everything in this unit, are well used and poorly taken care of. It’s clear this is a rental unit where only tenants or service people visit as I have to believe a owner would take care of the things we had to deal with.

We contacted the management firm and they said they would send house cleaning to take care of the smell.  House cleaning did not show up, so to manage the smell we had to put the couch cushions and slips on the deck as it would not be possible to stay in here with that odor. We called again that night and were told someone would be here in the morning. They did not show.

The furniture is ready to be replaced, the refrigerator was dirty inside and out, to top it off the wires that power the light inside it are hanging out due to a cracked housing.

The unit appears to have been remodeled over a decade ago but only minimal maintenance has been done since then.

Though it is clear the room was lightly cleaned prior to our visit I doubt it’s had a real deep cleaning in a long time as there was splattered food on the wall and the drinking glasses were sticky so we washed them all on arrival.

To top off the above we tried to do a load of laundry and the washer flooded the apartment bathroom, hallway, master bedroom and the hallway leading to the unit.

We called the management firm and they said they could not get ahold of maintenance and asked us to spend our evening to clean up the water as best we can, which we did. They did offer to have someone come in the morning (sound familiar?) to take care of what’s left.

It was clear the management firm was concerned with the potential damage but they did not seem to care about our situation at all. Since they couldn’t reach maintenance we also could get no replacement towels so no showers in the morning.

I should note that it’s clear this flooding has happened before because the trim in the hallway of the unit shows clear signs of past water damage.

We asked the management firm to move us to a different unit, after all, if vomit stench and a flood were not enough to justify that what would be? Unfortunately, the best they could offer was one day at another unit a unit 30 minutes away having to return the next day. Since it was already midnight and it would have only been for that night we passed.

If you recall they said someone would come in the morning to take care of the flooding, you guessed it — they never showed up.

We called again in the evening and spoke to the manager for the site and he apologized for the lack of response on prior calls and promised someone would be here to clean up and provide us towels tonight since we have six people and no towels. Of course, no one showed with towels.

We also tried to warm milk in the microwave today but it too doesn’t work,  yet another work order has been filed.

On our last day, the manager contacted us asking if we had gotten the towels he had sent, we had not. A few minutes later towels do show up, we now have had enough towels for a week but we leave in the morning.

The manager did finally offer a concession for this ridiculousness, $150 for the inconvenience. It took three of us Three hours to clean up the water alone. So it looks like they are valuing our time at a little over $15 an hour, and could care less about the inconvenience (no towels, no clean clothes,  no microwave, disgusting odor, time lost doing basic housekeeping, time wasted trying to get them to do their jobs, etc) and then there is the intangible damage they did to our vacation they place no value on.

The reality is beyond the wasted time and inconvenience we were able to use less than half of this unit. The living room was largely unusable due to lack of a place to sit, the deck was at least 1/4 unusable as it held the stinky couch bits, there was no laundry, no cooking, and no towels.

I guess it’s only fair to share the good stuff too, while dealing with vomit stench,  cleaning up a massive amount of water and failing to use the appliances we had an opportunity to think about the  locations fantastic view, if you choose to stay here despite what I shared you will rest assured you will have an amazing view of the gulf and a nice deck to enjoy it from while your not cleaning up a mess.

 

CAs and SSL and Phishing Oh My!

 

 

 

 

 

 

NOTE: This post reflects my personal beliefs and is not necessarily those of my employer Google, or Let’s Encrypt where I am a member of their Technical Advisory Board.

Introduction

Recently Vincent from The SSL Store published a blog post calling out Let’s Encrypt for issuing certificates to domains that contain the world PayPal.

The TL;DR for his post is he believes that Let’s Encrypt is enabling phishers by issuing them SSL certificates that contain the word “PayPal” and then refusing to revoke them when arbitrary third-parties ask them to.

As a result of his post, several news sources have decided to write articles about how “Let’s Encrypt” is acting as an enabler of these Phishers [1] [2].

Unfortunately, Vincent’s post and the associated articles don’t cover this in the most complete and balanced way so over my morning coffee today I decided to write this post to discuss the other side of the argument.

If this is a topic that interests you please also check out the Let’s Encrypt blog post where they talk about why they have taken this position.

Exploration

Let’s explore the opportunities CAs have to check for phishing, the tools they have available to them, the effectiveness of those tools, the consequences of this approach, how complete a solution based on the tools available to them would be and what the resulting experience would be for users.

Opportunities

The WebPKI’s CAs role, historically, has been that of a Passport office, you present proof you control a domain, and possibly that you are an authorized member of an organization and you get a digital certificate that attests to that.

This certificate could be valid for up to 1095 days. Once the certificate is issued the CA, largely speaking, has no natural opportunity to verify this information again. It is worth noting that this month the CABForum voted to shorten this period to 825 days.

Tools

In the event a CA determines it made a mistake in the issuance of a certificate or has been notified by the subscriber they would like to see a certificate marked invalid, the tool they have available to them is called “revocation”.

The two types of revocation that are under the control of a Certificate Authority are called Certificate Revocation Lists and OCSP responses. The first is a like a phonebook of all known “revoked” certificates while the last is more like a lookup that it enables User Agents to ask the status of a particular set of certificates.

Effectiveness

Earlier we discussed the lifetime of certificates, this is important to understand because the large majority of phishing sites do not start out as Phishing sites, as such issuance time checks seldom net positive results.

After issuance, this leaves you with periodic checks of the site,  third-party reports of phishing and relying on revocation checking as an enforcement mechanism. This is a recipie for failure, there are a few reasons for this, but one of the more significant is the general ineffectiveness of revocation checking.

Revocation checking is the most taxing thing a CA does. This is because the revocation mechanisms available to them will result in every relying party contacting them to download a OCSP response or CRL covering that certificate.

As a result, OCSP has a tendency to be both slow and unreliable. This forced browsers to implement this check as a “soft fail”, in other words, if the connection times out or fails for some reason they assume the certificate as good.

To give that some context about 8% of all revocation checks done by Firefox fail and the median response time is over 200ms.

As a result of this in 2012 Chrome, which is used by about 50% of all users, more-or-less disabled revocation checking except for exceptional circumstances.

What this means is that revocation checking, even for its intended purpose, is far from an effective tool. Expanding its use to include protecting users from phishers would not improve its effectiveness and arguably it would (due to the infrastructure implications) make it even less reliable.

It is also important to note that every wildcard certificate can be used for a hostname containing “PayPal” without the CA ever being made aware, a good example is https://paypal.github.io/ which is protected by a wildcard certificate issued to Github.

Consequences

To understand the consequences of expanding the CAs role include protecting us from phishing we first need to understand what a certificate represents, or more importantly what it does not represent. It does not represent the content, it represents the host that is serving content and it is the content that “phishes”.

Today, in the age of cloud services, there is a good chance the host that is serving the content is a service operated by WordPress, or maybe Amazon’s S3. These services allow users to sign up and post arbitrary content for free or very little money.

If we decide that revocation checking is the right tool to get phishing content off the web we would be saying a CA should revoke WordPress’s certificate if one of it’s users posted something someone reported as phishing content. That would, for the situations where revocation checking takes place and happens to work, take WordPress off the Internet. Is that what we want to happen?

If so, who is it we are asking to perform this task? There are well over 400 CAs in the Microsoft Root Program do we believe these are the right organizations to be policing the internet for the appropriateness of content?

If so what criteria should they use to do so and what do we do if they abuse this censorship role?

Completeness

It is easy to say that a CA should not issue a certificate if it contains the word “PayPal”. I could even see an argument that those that would be hurt by such a rule, for example, http://www.PayPalSucks.com and (a theoretical) PayPalantir.com are an acceptable loss.

This would, however not catch homoglyphs like when a Cyrillic “a” is used instead of the latin “a” which would very likley require a manual review of the name and content to determine the intent of the domain owner which is near impossible to do with any level of accuracy or fairness.

Even with that, what about ING, as one of the world’s largest banks, they too are commonly phished, should a CA be able to issue a certificate to https://www.fishing.com. And if they do and the issuing CA receives a complaint that it is Phishing ING what should they do?

And what about global markets and languages? In Romania there is a company called Amazon that is a cleaning company, should anyone be able to request their website be revoked because it contains the word Amazon?

If we promote the CA to content police, how do we do so in a complete way?

User Experience

With CAs acting as the content police, what would a user see when they encounter a revoked site? While it varies browser to browser the experience is almost always a blocking “interstitial”, for example:

 chrome revoked firefox revoked


If you look closely you will see these are not screens that you can bypass, revoked sites are effectively removed from the internet.

This is in contrast to Safe Browsing and Smartscreen which were designed for this particular problem set and therefore provide the user a chance to visit the site after a contextually relevant warning:

 SafeBrowsing smartscreen

Conclusion

I hope you see from the above that relying on Certificate Authorities as content police as a means to protect users from phishers a bad idea, at a minimum, it would be:

  • Ineffective,
  • Incomplete,
  • Unmanageable,
  • and Duplicative.

But more importantly it would be establishing a large loosely managed group as the de-facto content censors on the internet and as Steven Spielberg said, there is a fine line between censorship, good taste, and moral responsibility.

So what should CAs do about phishing then? It is my position they should check the Google Safe Browsing API prior to issuance (which by the way, Let’s Encrypt does), and they should report Phishers to the Safe Browsing service if they encounter any.

It is also important to answer the question about what users should do to protect themselves from phishing. I understand the desire to say there is only one indicator they need to be worried about, it’s just not realistic.

When I talk to regular users I tell them to do three things, the first of which is to use an up-to-date and modern browser that uses Smart Screen or Safe Browsing. Second, you should only provide data to sites you know and only over SSL. And finally, try to only provide sites information when it was you initiated the exchange of information.

P.S.

Thanks To Vincent Lynch and the others who were kind enough to proof this post before publishing.