The Hidden Cost of a Single Click

I type how I talk, and I tell stories in the voice of the time. It was a different time in corporate IT and a different attitude with the Microsoft stack ranking system. In 2007 I interviewed for the ACE Team at Microsoft and one of my interviewers was Roger Grimes. He asked me “How does EFS work?” in reference to the Encrypting File System feature in Windows Explorer.

EFS is still a feature in Windows today, though it is not embraced as widely as it was in the Windows 2000-XP days. In 2001, when I started at Microsoft, it was the only native file encryption capability and it had a lot of flaws. And I received a majority of the call volume for the product from 2001-2005 as a Support Engineer. Roger’s question wasn’t fair. And I probably hadn’t marketed myself enough. By sheer occupational accident, I was among the world’s experts in the feature. Competitive as I was about getting the position with ACE, I had to approach the question with a bit of fury. In my mind, he asked Picasso how to paint.

For most end-users, EFS is one checkbox.

What exactly happens when you check that checkbox (and hit OK) isn’t trivial. EFS touched several components: Windows Explorer, NTFS, Data Protection API, Crypto API… Several different actions would occur, all transparent to the user. This is why the guy taking the most phone calls and closing the most cases happened to know the most stuff. The people writing the code were only concerned about their one piece. The end user sees one checkbox. I saw every error message and became aware of every shortfall of the product. I knew where the API failed. Because I had to close the most cases.

Fortunately, I was authoritative enough in my answer for Roger and team to bring me on board (and coincidentally give someone named Mark Cooper a knowledge monopoly in the organization I departed). One of my key contributions on ACE was co-authoring the Securing PKI Whitepaper (aka.ms/securingpki). While it was written for Windows Server 2012, many of the concepts will extend far beyond the lifetime of ADCS and well into the next generation of PKI. I tried very specifically to make my parts of the paper as vendor and PKI agnostic as possible. Yes, I worked for Microsoft, but X.509 is X.509.

One part I authored I am particularly proud of is a graphical view of different types of CA compromise and example remediation steps. Here is a graphic of Server OS Compromise.

Source: Securing Public Key Infrastructure (aka.ms/securingpki)

In today’s world, a majority of cyber attacks fall into a single lane…

The CA attack scenario most likely to occur. Online + Remote/Vulnerability + Compromised Key

A lot of these scenarios are still very realistic, but the path of least resistance for most attackers is to use a vulnerability to access the CA and then gain control of the ability to issue certificates. The vulnerability could be environmental or end-user. It does not matter how the attacker got in; the damage can now be done.

I only had enough space in the graphic to suggest a handful of remediation actions. “Patch exploited vulnerability” is a loaded phrase. It could mean “require IPsec for RDP to the CA” or it could mean “Update Windows” or it could mean “Implement a solution for secured administration.” One line item could be a six-month project. The bigger line item is a single click (technically there are two actions, but the main click is below), and I don’t think anyone who hasn’t been through it will understand what happens when that click is made.

“Revoke CA certificate and publish Root CRL” is a LOADED phrase.

Consider what happens when you click on the certificate to revoke it. Absolutely nothing. And for some CAs, that revocation may not be felt for a very long time. Months or possibly more. Because the action of flagging the certificate for revocation doesn’t do anything functionally to end users. In fact, in this scenario that action would occur on an airgapped or offline server.

Publishing a CRL (and ensuring the CRL Distribution Points are updated) is a loaded action.

One of the first things you have to remember is that PKI is always a race. How long can a seemingly unique key last relative to the constantly improving technology that can be used to make that key not unique anymore? That is the reason we put time limits on certificates and CRLs. Cryptography strength vs. Emerging Processing Capability. And Root CAs often have only one or two active issued certificates at any given time, so the risk is lower and the CRL doesn’t need to be refreshed every day or every week in most situations.

So suppose the CRL for the Root that was current before the revocation action occurred has six months of validity left. When you replace the CRL with a new CRL with a year left over, you are just updating the end date, potentially adding certificates to the list, right? Unfortunately, that’s not how it works from an end-user perspective. How does it work? Sadly, it depends on the end-user device or even the application the end-user is running. Some applications will cache the original time-valid CRL and not fetch a new CRL until there is a small amount of time left over on the original CRL. In that case, the revocation action may not be felt for another six months. Some applications might fetch the new CRL immediately, rendering the revoked CA and all of its issued certificates invalid. Some applications won’t check for a new CRL and continue to honor certificates issued using the compromised key. And some applications might need Windows to restart. If the old CRL is distributed using Group Policy… more uncertainty.
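To make the timing concrete, here is a minimal sketch of three of the client behaviors described above. It is a toy model, not how any particular operating system or application actually implements CRL caching. The policy names, the function name, and the seven-day prefetch window are all assumptions I made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical policy names for the client behaviors described in the text.
FETCH_IMMEDIATELY = "fetch_immediately"
CACHE_UNTIL_NEAR_EXPIRY = "cache_until_near_expiry"
NEVER_CHECKS = "never_checks"


def revocation_felt_at(
    policy: str,
    new_crl_published: datetime,
    cached_crl_next_update: datetime,
    prefetch_window: timedelta = timedelta(days=7),  # assumed; real clients vary
) -> Optional[datetime]:
    """Return when a client first honors the new CRL, or None if it never does."""
    if policy == FETCH_IMMEDIATELY:
        # The client picks up the new CRL as soon as it is published.
        return new_crl_published
    if policy == CACHE_UNTIL_NEAR_EXPIRY:
        # The client keeps trusting its cached, still time-valid CRL and only
        # refetches shortly before that cached copy expires.
        return max(new_crl_published, cached_crl_next_update - prefetch_window)
    if policy == NEVER_CHECKS:
        # The client never looks for a newer CRL.
        return None
    raise ValueError(f"unknown policy: {policy}")


# Example dates are made up: the cached Root CRL has about six months left.
published = datetime(2024, 1, 1)
cached_expiry = published + timedelta(days=182)

for policy in (FETCH_IMMEDIATELY, CACHE_UNTIL_NEAR_EXPIRY, NEVER_CHECKS):
    print(policy, "->", revocation_felt_at(policy, published, cached_expiry))
```

Same revocation, same published CRL, and three very different answers to “when does this take effect?” That spread is the part most organizations have never measured in their own environment.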

The action of revoking a CA isn’t exactly a “measure once, cut once” proposition. I am willing to bet there are statistically zero organizations that truly understand what would happen…

And this assumes you have a plan in place to replace the certificates for end-entities or that plan is in motion when you are performing the revocation. If you truly have an attacker in your environment, they are very likely to open back doors and scatter at any sign of remediation. Your new CA could be compromised before the old CA is gone. Revoking a CA needs to be a finely-choreographed process. It needs to be part of your disaster recovery drilling.

Fortunately, PKI Solutions can help in several different ways. If you are concerned about PKI-based attacks, PKI Spotlight is the premier tool for detecting security weaknesses as well as gaining a picture of your environment to assess the repercussions of an action like large-scale revocation. PKI Solutions also offers professional services that can help you prepare for disaster recovery drilling, and we can assist organizations that have identified that their PKI has been compromised.

Let us know how we can help!

About Shawn Rabourn

Chief Technology Officer at PKI Solutions. I have two decades of full-range information security and identity management experience in engineering, design, and architecture roles. My background includes time in the trenches with Azure, Active Directory, Certificate Services/Public Key Infrastructure, Identity Management, Enterprise Governance and Risk Management, Business Continuity, and Compromise Response.
