Category Archives: Thoughts

How did the Terminal Services Licensing PKI affect you?

The other day I did a post on the age of the Microsoft PKI that was used for Terminal Services Licensing; today I thought I would talk about what that age meant in the context of the vulnerabilities it introduced.

The oldest certificate I have been able to find from the same hierarchy is from April 1999 (that’s the issuance date for the Microsoft Enforced Licensing PCA).

Based on the post from the Microsoft Security Research & Defense blog, we know that the reason the attacker had to do the MD5 collision was that as of Vista there was a change in the way critical extensions were handled.

This change made it so that Vista clients would reject a signing certificate that contained an unknown critical extension, making this an ineffective attack vector against those clients.
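
If you are curious which extensions in a given certificate are flagged critical, a quick check with OpenSSL looks something like this (the file name is just a placeholder):

openssl x509 -in cert.pem -noout -text | grep -i critical

Any extension listed as critical that the client does not understand is exactly what the Vista-era chain engine rejects.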

But what does this mean for the period of time before that? Windows Vista was released in November of 2006; that’s nearly 8 years in which any enterprise with a Terminal Services deployment could have owned a Windows PC “as Microsoft” or potentially attacked a PKI-based system with a “trusted” but fraudulent certificate.

But did it really get better with the release of Vista? According to StatCounter, Windows Vista reached its maximum market share of 23% in October of 2009. Yes, nearly three years after the release of Vista, 77% of the Windows clients on the Internet were still vulnerable as a result of the design of the Terminal Services licensing solution.

Things didn’t really start to get better until XP SP3, which was released in April of 2008 and contained the same certificate validation engine found in Vista.

While I do not have any public statistics I can share, I can say that this service pack was picked up faster than any service pack before it, which says a lot since it was not a “forced” update.

If we are optimistic and say that it took only one year for SP3 to reach full penetration, and if we believe StatCounter’s statistics for XP’s market share of 71% in April 2009, then it took until 2009 to get to roughly 95% of clients patched.

Now these numbers are just for clients on the Internet and not servers. This “fixed” chain validation engine wouldn’t have even found its way into the Windows Server code base until Windows Server 2008, which was released in February 2008 and took some time to become broadly used.

While Windows Servers are not terribly common on the Internet, they are extremely common in the Enterprise, especially in 2009, when it held a 73.9% market share. Again I don’t have numbers, but anecdotally we know that Enterprises are notoriously slow to upgrade or patch.

So where does this leave us in 2012? Still vulnerable, that’s where.

The other day Microsoft released a patch that in essence revokes the PKI in question, and today WSUS announced a patch that introduces additional pinning.

You need to apply both to secure your systems.

Ryan

MSRC 2718704 and the age of the rogue certificates

One of the things I love about security is how passionate security researchers are about their work; for the most part they respect their peers and enjoy learning about attacks, so they are willing to share information.

As a result of this dynamic, over the past few days I have had a chance to look at a number of details about the #Flame malware as well as the rogue certificates it used as part of its Windows Update attack.

With that said, the raw data was already published here quite a long time ago, so I think it’s OK to go into details on this now (another interesting data point is that the linked discussion is about how to bypass the Terminal Services license mechanism).

This certificate chain gives us some insights into the age of the solution Microsoft was using to license Terminal Server Clients; for example, one certificate chain I received shows us the following:

[certificate chain screenshot]

From this we can tell a few things, including:

  1. The licensing solution has been in use since at least 1999 – Its issuance date is Thursday, April 15, 1999 12:00:00 AM; we can assume this is the case as long as we assume the “Microsoft Enforced Licensing Intermediate PCA” key is in a Microsoft vault (very safe to assume).
  2. The “Licensing Server” was originally built to include unverified information in each license – We can assume this since the other certificates we saw included no organization details in the subject name, while this one has a DN of “CN = Terminal Services LS, L = Stradella, S = PV, C = IT, E = ater**@libero.it”.
  3. The PKI was rebuilt around 2004 – In the first certificates we saw, the Microsoft-operated CA that issued the “licensing server” (Microsoft Enforced Licensing Registration Authority CA) expired in 2009; in the most recent certificate the chain is one link shorter but the same logical entity expires in 2004.
  4. The format of the certificates changed around 2004 – In the first certificates we saw only a single proprietary extension; in this one we see four. This suggests that there was a code change in how licenses were validated around this time. It also approximately aligns with when one might expect Windows Server development to begin for the next version (at the time development cycles were 3-4 years).
  5. Attackers were using this in combination with other attacks even then – In 2003 there was a security defect in the way Windows handled Basic Constraints (see MS02-050). This allowed certificates that were not CAs to issue other certificates that would still be trusted.

In this most recent certificate chain we can see that a Terminal Services LS (a license) has issued another certificate; the validity of this certificate was based on a combination of two factors:

    • The root CA being trusted by the entity validating the certificate.
    • The license certificate being able to issue other certificates.

This combination meant an attacker needed access only to a CAL, not even the License Server, to produce a certificate that could be used to attack another system – I will do a post about this later.

  6. Terminal Services doesn’t do full certificate validation on its licenses – The post I got this certificate from states it works on machines as recent as 2008, but the “Terminal Services LS” expires in 2004 and the certificate that was minted below it (the evil “PORTATILE” one) is good until 2038. This would not pass the CryptoAPI certificate validation logic on those platforms due to the BasicConstraints restrictions on the certificate and the date issues alone (a quick way to confirm this is shown below).
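
As a sanity check you can run a chain like this through a proper path validator; the file names below are placeholders for the exported certificates (with the “Terminal Services LS” and the CAs above it bundled into one intermediates file), but OpenSSL’s verify command would refuse the leaf based on the expired intermediate and the Basic Constraints violation alone:

openssl verify -CAfile msroot.pem -untrusted intermediates.pem portatile.pem

The fact that the licensing code accepts the chain anyway tells you it is not using this kind of logic.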

So in summary, this issue has been around a very long time, much longer than Flame or Stuxnet. There are a few very interesting conclusions I come to as I write this:

  1. Even without the Basic Constraints issue, no collision attack would have been needed to make use of this original Licensing PKI.
  2. Despite the changes in the client behavior in Windows Vista in how Critical Extensions are handled, the weak cryptography made it practical to continue to use this PKI maliciously.
  3. Those who knew about this tried to keep it a secret as long as possible as it was very valuable.
  4. Despite Microsoft developing a world-class security program over the last decade, this issue managed to fly under the radar until now.

MSRC 2718704 and Terminal Services License Certificates

When looking at the certificates the Flame authors used to sign the malware in their Windows Update module, one has to wonder what was different between a normal Terminal Services license certificate and the one they used.

Getting at one of these certificates is a little awkward as they are kept in the registry rather than in the traditional certificate stores, but thankfully with a little due diligence you can find one.

[Terminal Services license certificate screenshot]

So from this what can we learn? There are a few things (a quick way to check them yourself follows the list):

  1. Every subordinate CA has the same name “Microsoft LSRA PA”
  2. Every terminal services license has the same name “Terminal Services LS”
  3. The license appears to have a random serial number (even the CA certs did not!)
  4. The license has a 512-bit RSA key.
  5. The license is signed with MD5.
  6. No restrictions have been explicitly placed on this certificate.
  7. This license is good for 257 days.
  8. The CRL URL is invalid.
  9. The Certificate Issuer URL is invalid.
  10. It contains a proprietary extension (1.3.6.1.4.1.311.18)
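
If you export one of these licenses from the registry yourself, a quick way to confirm most of the items above is to decode it with OpenSSL; the file name here is a placeholder, and this assumes the blob you exported is a DER-encoded certificate:

openssl x509 -inform DER -in ts-license.der -noout -text

The signature algorithm, key size, serial number, validity period and the 1.3.6.1.4.1.311.18 extension are all visible in that output.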

As far as best practices go there are clearly a lot of problems here:

  1. Same name for every CA – The point of a CA is to identify an entity; if every CA has the same name this can’t really happen.
  2. Same name for every subject – While the certificates here are for authenticating entitlement to make a connection and not to authenticate an identity, the subject should still be unique.
  3. Over-entitled – The certificate contains no Extended Key Usage or other restrictions; as a result this certificate is only as restricted as its issuer’s issuer is, namely it is good for:
    • Code Signing (1.3.6.1.5.5.7.3.3)
    • License Server Verification (1.3.6.1.4.1.311.10.6.2)
  4. Weak keys – 512-bit RSA was factored in 1999; there should be no CA issuing 512-bit keys, period, especially in a PKI as powerful as this and one that is good for code signing.
  5. Weak hash algorithm – MD5 has been known to be weak since 1993 and has had increasingly effective attacks published against it ever since; there should be no CA issuing MD5-signed certificates in 2012.
  6. Invalid URLs – This indicates that the CA was not actively managed; it also included internal-only file references in the certificate, also a no-no and another indicator of the same.
  7. Poorly thought out proprietary extension – The extension is not well thought out; the OID (1.3.6.1.4.1.311.18) is an arc for Hydra (Terminal Services) and not a unique identifier (or shouldn’t be at least) for a specific extension. This isn’t harmful, but it is a signal the design didn’t go through much review.

The other items are odd, like the 257-day validity window and the proprietary extension, but they are not problematic – at least on the surface.

Well that’s what we know so far.

 

MSRC 2718704 and the Terminal Services Licensing Protocol

There has been a ton of chatter on the internet the last few days about this MSRC incident; it is an example of so many things gone wrong it’s just not funny anymore.

In this post I wanted to document the aspects of this issue as it relates to the Terminal Services Licensing protocol.

Let’s start with the background story; Microsoft introduced a licensing mechanism for Terminal Services that they used to enforce their Client Access Licenses (CAL) for the product. The protocol used in this mechanism is defined in [MS-RDPELE].

A quick review of this document shows the model goes something like this:

  1. Microsoft operates an X.509 Certificate Authority (CA).
  2. Enterprise customers who use Terminal Services also operate a CA; the Terminal Services team, however, calls it a “License server”.
  3. Every Enterprise “License server” is made a subordinate of the Microsoft Issuing CA.
  4. The license server issues “Client Access Licenses (CALs)”, aka X.509 certificates.

By itself this isn’t actually a bad design: short(ish)-lived certificates are used so the enterprise has to stay current on its licenses, and it reuses a bunch of existing technology so that it can be quickly developed and its performance characteristics are known.

The devil however is in the details; let’s list a few of those details:

  1. The Issuing CA is made a subordinate of one of the “Microsoft Root CAs”.
  2. The Policy CA (“Microsoft Enforced Licensing Intermediate PCA”) on the Microsoft side was issued “Dec 10 01:55:35 2009 GMT”.
  3. All of the certificates in the chain were signed using RSAwithMD5.
  4. The CAs were likely standard Microsoft Certificate Authorities (the certificate was issued with a Template Name of “SubCA”)
  5. The CAs were not using random serial numbers (the serial number of one was 7038)
  6. The certificates the license server issued were good for:
    • License Server Verification (1.3.6.1.4.1.311.10.6.2)
    • Key Pack Licenses (1.3.6.1.4.1.311.10.6.1)
    • Code Signing (1.3.6.1.5.5.7.3.3)
  7. There were no name constraints in the “License server” certificate restricting the names that it could make assertions about.
  8. The URLs referred to in the certificates were invalid (URLs to CRLs and issuer certificates) and may have been for a long time (looks like they may not have been published since 2009).

OK, let’s explore #1 for a moment; they needed a Root CA they controlled to ensure they could prevent people from minting their own licenses. They probably decided to re-use the product Root Certificate Authority so that they could re-use CertVerifyCertificateChainPolicy and its CERT_CHAIN_POLICY_MICROSOFT_ROOT policy. No one likes maintaining an array of hashes that you have to keep up to date on all the machines doing the verification. This decision also meant that they could have those who operated the existing CAs manage this new one for them.

This was also the largest mistake they made; the decision to operationally have this CA chained under the production product roots made this system an attack vector for the bad guys. It made it more valuable than any third-party CA.

They made this problem worse by having every license server be a subordinate of this CA; these license servers were storing their keys in software. The only thing protecting them is DPAPI and that’s not much protection.

They made the problem even worse by making each of those CAs capable of issuing code signing certificates.

That’s right, every single enterprise user of Microsoft Terminal Services on the planet had a CA key and certificate that could issue as many code signing certificates as they wanted and for any name they wanted – yes, even the Microsoft name!

It doesn’t end there though; the certificates were signed using RSAwithMD5, and MD5 has been known to be susceptible to collision attacks for a long time – attacks that were shown to be practical by December 2008, before this CA was made.

Also, these certificates were likely issued using the Windows Server 2003 Certificate Authority (remember this was 2009), as I believe the Server 2008 CA used random serial numbers by default (I am not positive though – it’s been a while).

It also seems, based on the file names in the certificates’ AIAs, that the CRLs had not been updated since the CA went live and maybe were never even published; this signals that the CA was not being managed at all since then.

Then there is the fact that even though Name Constraints were supported by the Windows platform at the time this solution was created, they chose not to include them in these certificates – doing so would have further restricted their value to an attacker.
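
For illustration only, here is roughly what a constrained license server certificate could have looked like, expressed as an OpenSSL extension section; the section name and contoso.com are made up, and the real CAs were Microsoft CAs rather than OpenSSL ones:

[ license_server_ext ]
# can issue end-entity certificates only (pathlen:0)
basicConstraints = critical, CA:TRUE, pathlen:0
keyUsage = critical, keyCertSign, cRLSign
# license server verification only, no code signing
extendedKeyUsage = 1.3.6.1.4.1.311.10.6.2
# limit the names it can certify to the enterprise’s own namespace
nameConstraints = critical, permitted;DNS:.contoso.com, permitted;email:.contoso.com

Together with an EKU that excludes code signing, constraints like these would have sharply limited what a stolen license server key was good for.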

And finally there is the fact that MD5 has been prohibited from use in root programs for some time; the Microsoft Root Program required third-party CAs not to have valid certificates that used MD5 after January 15, 2009. The CA/Browser Forum has been a bit more lenient in the Baseline Requirements by allowing their use until December 2010. Now it’s fair to say the Microsoft roots are not “members” of the root program, but one would think they would have met the same requirements.

So how did all of this happen? Honestly, I don’t know.

The Security Development Lifecycle (SDL) used at Microsoft was in full force in 2008 when this appears to have been put together; it should have been caught via that process, and it should have been caught by a number of other checks and balances as well.

It wasn’t though and here we are.

P.S. Thanks to Brad Hill, Marsh Ray and Nico Williams for chatting about this today 🙂

Additional Resources

Microsoft Update and The Nightmare Scenario

Microsoft Certificate Was Used to Sign “Flame” Malware

How old is Flame

Security Advisory 2718704: Update to Phased Mitigation Strategy

Getting beyond 1% — How do we increase the use of SSL?

Today about 1% of the traffic on the Internet is protected with SSL (according to Sandvine); there are a few key issues keeping this number so small, and I thought I would put together a quick post on what I think those issues are.

 

Interoperability

For over a decade we have been working towards migrating to IPv6; despite that we have made little progress. In fact, they say that at the end of 2012 we will run out of IPv4 addresses.

As far as I know not one of the top 10 CAs supports IPv6 yet (yes, not even GlobalSign, though we’re working on it). This means it is impossible to host a pure IPv6 SSL solution today (because of the need for revocation data).

This is also interesting because today many sites are hosted on virtual hosting solutions that share the same IP address. This is primarily because IP addresses are a scarce resource, and it has the side effect of making it hard (sometimes impossible) to deploy SSL on these hosts.

In 2003 an extension to TLS was proposed to address this problem; it’s called Server Name Indication (SNI, now defined in RFC 6066).

Today server support for this extension is quite good, but the same cannot be said for client support (due to the lingering XP population and the influx of mobile devices).
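
If you want to check whether a given site depends on SNI, a simple test is to compare the certificate a server returns with and without the extension; the host name below is just an example:

openssl s_client -connect example.com:443 -servername example.com < /dev/null 2>/dev/null | openssl x509 -noout -subject

openssl s_client -connect example.com:443 < /dev/null 2>/dev/null | openssl x509 -noout -subject

If the two subjects differ, the site relies on SNI, and clients that don’t send it will be shown the wrong certificate.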

In my opinion this is the #1 issue holding back the adoption of SSL everywhere.

 

Complexity

It is amazing to me, but very little has changed in the CA industry since its birth in the mid-90s; certificates are still requested and managed in essentially the same way – it’s a shame, and it’s wasteful.

One of the reasons I joined GlobalSign is that they have been trying to address this issue by investing in both clients and APIs (check out OneClick SSL and CloudSSL); with that said, there is still a lot more that can be done in this area.

Then there is the problem of managing and deploying SSL; the SSL Pulse data shows us it’s hard to get SSL configured right. We are getting better tools for this, but again there is still a ton of room for improvement.

 

Performance

There has been a bunch of work done in this area over the years; the “solutions” relating to the performance of SSL seem to break down into:

  1. Protocol improvements (SPDY, False Start, OCSP Stapling, etc.)
  2. Using different cryptography to make it faster (Smaller keys, DSA, ECDSA, etc.)
  3. Using accelerator products (F5 BIG-IP, NetScaler, SSL accelerators, etc.)

I won’t spend much time on protocol improvements as I think they get a ton of coverage from the likes of Google, who have made several proposals in this area over the last few years. I do have concerns with these protocol changes introducing interoperability issues, but I can’t argue with the performance benefits they offer.

You will notice I also included OCSP Stapling in this group; I think this is a great way to improve revocation checking, but it’s not about security, it’s about performance and reliability. You should just use this today; it’s safe and very likely supported by your servers already.
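
Checking whether your server already staples is easy; a rough check with OpenSSL’s s_client looks like this (swap in your own host name):

openssl s_client -connect example.com:443 -status < /dev/null 2>/dev/null | grep -i "OCSP response"

If the output says “no response sent” the server is not stapling; a stapled response shows up as an OCSP Response Status block instead.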

The use of different cryptography is an interesting one; however, again the issue of compatibility rears its ugly head. Though every implementation of an algorithm will perform differently, the Crypto++ benchmarks are a nice way to get a high-level understanding of an algorithm’s performance characteristics.

There is a lot of data in there, not all of it related to SSL, but one thing that definitely is relevant is the performance characteristics of RSA vs DSA:

Operation                 Milliseconds/Operation   Megacycles/Operation
RSA 1024 Signature        1.48                     2.71
RSA 1024 Verification     0.07                     0.13
DSA 1024 Signature        0.45                     0.83
DSA 1024 Verification     0.52                     0.94

 

You will notice that with RSA it is more expensive to sign than it is to verify; you will also notice that with DSA the opposite is true (and its signing operation is faster in this sampling).

Since in the case of SSL it is the server doing the signing and the client doing the verifying, this is an important fact; it means a server using a DSA certificate will spend less time doing crypto and more time doing other things, like serving content.
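
You can reproduce rough numbers like these on your own hardware with OpenSSL’s built-in benchmark; absolute figures will vary with the implementation and CPU, but the sign/verify asymmetry should be visible:

openssl speed rsa1024 dsa1024

The output reports how many sign and verify operations per second each algorithm manages.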

On the surface this sounds great, but there are of course problems with this. For one, because of the work researchers have done to “break” RSA over the last few years, the browsers are requiring CAs to stop issuing certificates for 1024-bit RSA keys (by 2013), an effort which CAs have also applied to DSA.

Another not-so-trivial factor is that Microsoft only supports DSA keys up to 1024 bits in length, which means larger DSA keys are not viable on those platforms.

So what of the new ciphers like AES and ECDH-ECDSA? These will represent a very large performance boon for web server operators, but they too, like SNI, are not supported by legacy browsers.

What this means for you is that for the next few years we have to make do with the “legacy cipher suites” as a means to facilitate TLS sessions.

Miscellaneous

Not everything fits neatly into the above taxonomy; here are a few common topics that don’t:

  1. Increased cost of operation
  2. Inability to do “legitimate” packet inspection

Increased cost of operation can be summarized as needing more servers for the same load due to the increased computational cost of SSL.

Inability to do “legitimate” packet inspection can be summarized as limiting the practical value of existing security investments in technologies like Intrusion Detection and Network Optimization, since once the traffic is encrypted they become totally ineffective. To work around this issue networks need to be designed with encryption and these technologies in mind.

 

Summary

I personally think the biggest barrier is interoperability, the biggest part of this being the lingering XP installations. The silver lining is that over the last few years XP has lost market share at about 10% per year; at the current rate we are about three years from these issues being “resolved”.

In the meantime there is a lot the industry can do on the topic of complexity; I will write more on that another time.

Browser Revocation Behavior Needs Improvement

Today the best-behaved client when it comes to revocation checking is Windows; in the case of browsers that means IE and Chrome.

With that said, it has a very fundamental problem: if it reaches a CA’s OCSP responder and the responder provides an authoritative “that’s not mine” (aka Unknown), clients built on this platform treat the certificate as good.

You got that right; it treats a certificate that is clearly invalid as good! This unfortunately is a common behavior that all the browsers implement today.

The other browsers are even worse; Firefox, for example:

  1. Does not maintain a cache across sessions – This is akin to your browser downloading the same image every time you open a new browser session instead of relying on a cached copy.
  2. Does OCSP requests over POST vs. GET – This prevents OCSP responders from practically utilizing CDN technology or cost-effectively doing geographic distribution of responders.
  3. Does not support OCSP stapling – IE has supported this since 2008; Firefox even paid OpenSSL to add support around the same time but has yet to ship support itself.

These each seem like fairly small items, but when you look at them as a whole they contribute significantly to the reality we face today – revocation checking isn’t working.

There are other problems as well, for example:

In some cases browsers do support GET as a means to do an OCSP request, but if they receive a “stale” or “expired” response from an intermediary cache (such as a corporate proxy server) they do not retry the request while bypassing the proxy.

All browsers today do synchronous revocation checking; imagine if your browser downloaded only one image at a time, in series. That is in essence what the browsers are doing today.

These and other client behaviors contribute to reliability and performance problems that are preventing Hard Revocation Checking from being deployed. These issues need to be addressed, and the browser vendors need to start publishing metrics on what the failure rates are, as well as under what conditions they fail, so that any remaining issues on the responder side can be resolved.

 

How to do OCSP requests using OpenSSL and CURL

 

It’s pretty easy; the OpenSSL and CURL manuals give you the pieces, but I thought I would put it all here in a single post for you.

First, in these examples I used the certificates from the http://www.globalsign.com site; I saved the www certificate to globalsign.com.cer and its issuer to globalsigng2.cer (the file names used in the commands below).

Next you will find a series of commands used to generate both POSTs and GETs for OCSP:

1. Create an OCSP request to work with; this will also produce a POST to the OCSP responder

openssl ocsp -noverify -no_nonce -respout ocspglobalsignca.resp -reqout ocspglobalsignca.req -issuer globalsigng2.cer -cert globalsign.com.cer -url "http://ocsp2.globalsign.com/gsextendvalg2" -header "HOST" "ocsp2.globalsign.com" -text

2. Base64 encode the DER encoded OCSP request

openssl enc -in ocspglobalsignca.req -out ocspglobalsignca.req.b64 -a

3. URL encode the Base64 blob after removing any line breaks (see http://meyerweb.com/eric/tools/dencoder/ for an encoder/decoder)
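
If you would rather stay on the command line for this step, something along these lines should work, since the only characters in a Base64 blob that need escaping are ‘+’, ‘/’ and ‘=’ (the -A flag asks OpenSSL not to wrap the output; exact behavior can vary between OpenSSL versions):

openssl enc -base64 -A -in ocspglobalsignca.req | sed -e 's/+/%2B/g' -e 's/\//%2F/g' -e 's/=/%3D/g'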

4. Copy the Base64 into the URL you will use in your GET

http://ocsp2.globalsign.com/gsextendvalg2/{URL encoded Base64 Here}

5. Do your GET:

curl --verbose --url http://ocsp2.globalsign.com/gsextendvalg2/MFMwUTBPME0wSzAJBgUrDgMCGgUABBSgcg6ganxiAlTyqPWd0nuk87cvpAQUsLBK%2FRx1KPgcYaoT9vrBkD1rFqMCEhEhD0Xjo%2FV7lgq3ziGoWG69rA%3D%3D

 

If you like you can also replay the request that was generated with OpenSSL as a POST:

curl --verbose --data-binary  @ocspglobalsignca.req -H "Content-Type:application/ocsp-request" --url http://ocsp2.globalsign.com/gsextendvalg2

Hard revocation checking and why it’s not here yet.

If you follow discussions around X.509 and SSL you have likely heard that “Revocation Checking is Broken”; you might even hear that it will never work and therefore we should start over with a technology that isn’t dependent on this concept.

There are some merits to these arguments, but I don’t agree with the conclusion; I thought I would summarize what the problems are in this post.

Fundamentally the largest problem is that, as deployed, all X.509 revocation technologies introduce a communication with a third party (the Certificate Authority).

This isn’t necessarily a deal breaker, but it does have consequences; for example, in the case of SSL:

  1. It can slow down the user’s experience.
  2. It introduces a new point of failure in a transaction.

These issues can be mitigated through intelligent deployment and engineering, but unfortunately this really has not happened; as a result browsers have implemented what is called “soft-fail revocation checking”.

With soft-fail revocation checking browsers ignore all conditions other than an authoritative “revoked” message; in the case of OCSP that means if they reach the responder and it says “I don’t know the status”, or if they fail to reach the responder at all, they assume the certificate is “good”.

This behavior is of course fundamentally flawed; the browsers say they have no choice other than to behave this way (I disagree with this conclusion, but that’s a topic for another post), but why?

The rationale is as follows:

  1. Revocation repositories are not reliable.
  2. Revocation repositories are slow.
  3. Revocation repositories are not always available (captive portals).
  4. Revocation messages are too large to be returned in time.
  5. There are too many revocation messages to be returned in time.

These are all legitimate concerns, ones that are unfortunately as true today as they were almost a decade ago.

They are not, however, insurmountable, and I think it’s time we as an industry did something about it.

SSL/TLS Deployment Best Practices

SSL/TLS seems simple: you go to a CA and prove who you are, they give you a credential, you install it on your server, turn on SSL, and then you are done.

Unfortunately there is more to it than that. I recently had an opportunity to contribute to a Best Practices Guide (PDF) that aims to provide clear and concise instructions to help administrators understand how to deploy it securely.

The intention is to work on an advanced version of this document in the future that covers more details and advanced topics as well (think OCSP Stapling, SPDY, etc).

I hope you find it useful.