Monthly Archives: November 2024

From Fairways to the Cloud: Estimating Golf Balls in Flight to Tackling Cloud Workload Scale

Early in my career, I worked in quality assurance at Microsoft. Analytical skills were a core trait we tried to hire for, and at the time “brain teasers” were often used in interviews to assess them. One memorable interview question went like this:

“How would you figure out how many golf balls are in flight at any given moment?”

This question wasn’t about pinpointing the exact number; it was a window into the candidate’s analytical thinking, problem-solving approach, and ability to break down complex problems into manageable parts. It was always interesting to see how different minds approached this seemingly simple yet deceptively complex problem. If a candidate wasn’t sure how to begin, we would encourage them to ask questions or to simply document their assumptions, stressing that it was the deconstruction of the problem—not the answer—that we were looking for.

In engineering, we often need to take big, abstract problems and break them down. For those who aren’t engineers, this golf ball question makes that process a bit more approachable. Let me walk you through how we might tackle the golf ball question:

  1. Number of Golf Courses Worldwide
    • There are approximately 38,000 golf courses globally.
  2. Players and Tee Times
    • On average, each course hosts about 50 groups per day.
    • With an average group size of 4 players, that’s 200 players per course daily.
  3. Shots Per Player
    • An average golfer takes around 90 shots in a full round.
  4. Total Shots Per Course
    • 200 players × 90 shots = 18,000 shots per course per day.
  5. Time a Golf Ball Is in the Air
    • Let’s estimate each shot keeps the ball airborne for about 5 seconds.
  6. Calculating Balls in Flight
    • Over 12 hours of playtime, there are 43,200 seconds in a golfing day.
    • Total airborne time per course: 18,000 shots × 5 seconds = 90,000 seconds.
    • Average balls in flight per course: 90,000 seconds ÷ 43,200 seconds ≈ 2 golf balls.
  7. Global Estimate
    • 2 balls per course × 38,000 courses = 76,000 golf balls in flight at any given moment worldwide.
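The steps above can be sketched as a few lines of Python. This is only an illustration of the arithmetic, using the same assumed inputs as the list; the constant and function names are made up for this example. It also shows how much the per-course rounding affects the final figure.

```python
# Fermi estimate of golf balls in flight.
# Every input below is an assumption taken from the walkthrough above.
COURSES_WORLDWIDE = 38_000
GROUPS_PER_COURSE_PER_DAY = 50
PLAYERS_PER_GROUP = 4
SHOTS_PER_ROUND = 90
SECONDS_AIRBORNE_PER_SHOT = 5
PLAY_HOURS_PER_DAY = 12


def balls_in_flight_per_course() -> float:
    """Average number of balls airborne at one course at any moment."""
    players = GROUPS_PER_COURSE_PER_DAY * PLAYERS_PER_GROUP  # 200
    shots = players * SHOTS_PER_ROUND                        # 18,000
    airborne_seconds = shots * SECONDS_AIRBORNE_PER_SHOT     # 90,000
    play_seconds = PLAY_HOURS_PER_DAY * 3600                 # 43,200
    return airborne_seconds / play_seconds


per_course = balls_in_flight_per_course()
rounded_total = round(per_course) * COURSES_WORLDWIDE   # rounds to 2 per course first
unrounded_total = per_course * COURSES_WORLDWIDE        # no intermediate rounding

print(f"{per_course:.2f} balls in flight per course")
print(f"{rounded_total:,} worldwide (rounding per course first)")
print(f"{unrounded_total:,.0f} worldwide (no intermediate rounding)")
```

Rounding to 2 balls per course gives the 76,000 figure; carrying the unrounded 2.08 through gives roughly 79,000. Either way the estimate lands in the same ballpark, which is the point of the exercise.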

This exercise isn’t about precision; it’s about methodically breaking down a complex question into digestible parts to arrive at a reasonable estimate. As the saying goes, all models are wrong, but some are useful. Our goal here is to find a stable place to stand as we tackle the problem, and this question does a decent job of that. If nothing else, it lets us see how a candidate might approach unfamiliar topics.

Transitioning from the Green to the Cloud

Today, the biggest challenges in cloud workload identity management remind me of these kinds of problems, except far more complex. Unlike in a round of golf, most workloads aren’t individually authenticated today. Instead, they rely on shared credentials, essentially passwords, that are stored and distributed by secret managers, so anything needing access to a resource must have access to that secret.

But with the push for zero trust, rising cloud adoption, infrastructure as code, and the reality that credential breaches represent one of the largest attack vectors, it’s clear we need a shift. The future should focus on a model where every workload is independently authenticated and authorized.

So, let’s put the “golf balls soaring through the air” approach to work here, using the same framework to break down the cloud workload scale:

  1. Global Cloud Infrastructure
    • Major cloud providers operate data centers with an estimated 10 million servers worldwide.
  2. Workloads Per Server
    • Each server might run an average of 100 workloads (virtual machines or containers).
    • 10 million servers × 100 workloads = 1,000 million (1 billion) workloads running at any given time.
  3. Ephemeral Nature of Workloads
    • Let’s assume 50% of these are ephemeral, spinning up and down as needed.
    • 1 billion workloads × 50% = 500 million ephemeral workloads.
  4. Workload Lifespan and Credential Lifecycle
    • If each ephemeral workload runs for about 1 hour, there are 24 cycles in a 24-hour day.
    • 500 million workloads × 24 cycles = 12 billion ephemeral workloads initiated daily.
  5. Credentials Issued
    • Each new workload requires secure credentials or identities to access resources.
    • This results in 12 billion credentials needing issuance and management every day.
  6. Credentials Issued Per Second
    • With 86,400 seconds in a day:
    • 12 billion credentials ÷ 86,400 seconds ≈ 138,889 credentials issued per second globally.
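The same chain of multiplications can be written as a small Python function, which makes it easy to see how sensitive the result is to each assumption. This is a rough sketch only: the function and parameter names are invented for this example, the defaults are the assumed figures from the list above, and the model assumes one credential per workload launch.

```python
def credentials_per_second(
    servers: int = 10_000_000,         # assumed global server count
    workloads_per_server: int = 100,   # assumed average density
    ephemeral_fraction: float = 0.5,   # assumed share of short-lived workloads
    lifetime_hours: float = 1.0,       # assumed lifespan of an ephemeral workload
) -> float:
    """Estimated credentials issued per second, one per workload launch."""
    total_workloads = servers * workloads_per_server           # 1 billion
    ephemeral_workloads = total_workloads * ephemeral_fraction  # 500 million
    cycles_per_day = 24 / lifetime_hours                        # 24 with defaults
    credentials_per_day = ephemeral_workloads * cycles_per_day  # 12 billion
    return credentials_per_day / 86_400                         # seconds in a day


baseline = credentials_per_second()
print(f"{baseline:,.0f} credentials per second")

# Hypothetical variation: workloads that live 15 minutes instead of 1 hour.
shorter_lived = credentials_per_second(lifetime_hours=0.25)
print(f"{shorter_lived:,.0f} credentials per second with 15-minute workloads")
```

With the defaults this reproduces the roughly 138,889 per second figure. Cutting the assumed lifetime to 15 minutes quadruples it, which shows how strongly the issuance rate depends on how ephemeral the workloads really are.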

In this updated example, just as with the golf balls in flight question, we deconstruct a complex system to better understand its core challenges:

  • Scale: The number of workloads and credentials needed to achieve this zero-trust ideal is much higher than we would need to simply pass around shared secrets.
  • Dynamics: These credentialing systems must have much higher availability than static systems to support the dynamism involved.
  • Complexity: Managing identities and credentials at this scale is a monumental task, emphasizing the need for scalable and automated solutions.

Note: These calculations are estimates meant to illustrate the concept. In real-world cloud environments, actual numbers can vary widely depending on factors like workload type distribution, number of replicas, ephemerality of workloads, and, of course, individual workload needs.

Conclusion

This exercise demonstrates a fundamental point: analytical thinking and problem-solving are timeless skills, applicable across domains.

You don’t need to be an expert in any given system to get started; you simply need to understand how to break down a problem and apply some basic algebra.

It also serves as a way to understand the scope and scale of authenticating every workload to enable zero trust end-to-end. Simply put, this is a vastly different problem than user and machine authentication, presenting a unique challenge in managing identities at scale.