Why ‘Root Cause’ Is the Wrong Question to Ask

Challenging our fundamental bias for simplicity in a complex world.

The conference room smelled like stale coffee and fear. Mark, the VP from corporate, leaned forward, his knuckles white on the mahogany table. It was the third time he’d asked the same question, his voice getting quieter and more pointed each time. “I understand the contributing factors. I’ve seen the chart. But I need to know the root cause. What was the one thing?”

Silence. You could feel the collective mental energy of twenty engineers and project managers trying to translate a complex, multi-faceted system failure into a single, digestible, blame-friendly noun. It was impossible. It was like trying to identify the root cause of a traffic jam by pointing at a single brake light.

Insight: This wasn’t a domino chain; it was a web, and it had vibrated in just the right way to collapse.

[Figure: a simple cascade contrasted with an interconnected web]

We love simple stories. We are hardwired for them. A beginning, a middle, an end. A hero, a villain, a conflict. And when something goes terribly wrong, we desperately crave a villain. A single, identifiable culprit allows us to feel a sense of control. If we can find the bad part, the lazy person, the broken rule, we can fix it, punish it, or replace it. Then we can all sleep at night, assured that the monster has been slain.

“The problem is, in any system built with more than three moving parts, the monster is never a single thing. The monster is the system itself, waiting for the stars to align.”

The Conspiracy of Conditions

Just this morning, I was locked out of a critical server for 43 minutes. Why? I typed the password wrong. That’s the simple story. That’s the root cause an executive like Mark would love. “Employee error. Cause: user incompetence. Solution: be more careful.” But what’s the real story?

Contributing factors to the server lockout: policy, hardware, urgency, typing.

The password was a 23-character string of random letters, numbers, and symbols, mandated by a security policy written by people who don’t have to type it 13 times a day. The keyboard on my laptop has a sticky shift key. I was trying to respond to an urgent request that had come in at 3 AM, so my focus was already frayed. The system’s lockout policy is brutally unforgiving, with no gradient for a simple typo versus a genuine hacking attempt. So, what was the root cause? My typing? The policy? The hardware? The urgency?

Yes. All of them. There was no single cause; there was a conspiracy of conditions.

I used to be a firm believer in finding the breaking point. I thought it was a sign of intellectual rigor to drill down past the symptoms to find the one thing that started the cascade. I argued for it. I insisted on it in my own post-mortems. I now realize I was just looking for comfort, not truth.

The truth is messy. The truth is that most disasters, big and small, are preceded by a series of tiny, almost invisible failures. A sensor that’s off by 0.3%. A line of code with a minor edge-case bug. A technician who skipped step 13 of a 33-step checklist because they’d done it a thousand times before. An operator who is tired from a double shift. None of these things is catastrophic on its own. They are latent failures, sleeping dangers. They are holes in the layers of Swiss cheese we call safety protocols. On most days, the holes don’t align. The cheese holds. But one day, through sheer random chance, the holes line up perfectly, and something sails right through all your defenses.

That’s not a story. It’s a condition.
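
The arithmetic behind the Swiss cheese image can be sketched in a few lines of Python. This is a minimal illustration, not a risk model: the four layers and their daily failure rates are hypothetical numbers chosen for the example, and the layers are assumed to fail independently of one another, which real organizations rarely manage.

```python
import random


def alignment_probability(hole_rates):
    """Chance that every layer fails at once, assuming independent layers."""
    p = 1.0
    for rate in hole_rates:
        p *= rate
    return p


def simulate_days(hole_rates, days, seed=0):
    """Count the days on which a hole opens in every layer simultaneously."""
    rng = random.Random(seed)
    breaches = 0
    for _ in range(days):
        if all(rng.random() < rate for rate in hole_rates):
            breaches += 1
    return breaches


# Hypothetical daily failure rates for four latent conditions (assumed values).
layers = {
    "sensor drift": 0.05,
    "edge-case bug": 0.02,
    "skipped checklist step": 0.10,
    "tired operator": 0.20,
}

p = alignment_probability(layers.values())
print(f"chance of alignment on a given day: {p:.6f}")
print(f"average days between alignments: {1 / p:,.0f}")
```

Under these made-up rates, each layer fails fairly often on its own, yet the holes line up only about once in 50,000 days. If the layers share a cause, such as the same budget cuts or the same exhausted crew, the independence assumption breaks and alignment becomes far more likely than the simple product suggests.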

Grief and the Tapestry of Reality

My friend Noah N. is a grief counselor. His job is to sit with people in the wreckage of their lives. He told me the single biggest obstacle to healing is the search for a root cause. The husband who replays the argument before his wife’s car crash. “If only I hadn’t said that, she wouldn’t have left angry.” The mother whose child had a rare allergic reaction. “If only I had chosen a different snack.” They are trying to find the one domino they could have pulled from the line to prevent the whole thing from falling. Noah says his work isn’t to find them an answer, but to help them accept that there isn’t one. The world is too complex for that.

“Grief, like a system failure, is the result of a thousand threads, not a single snapped cord. To obsess over one thread is to ignore the entire tapestry.”

It’s a form of narrative bias that protects us from the terrifying randomness of reality, but it also prevents us from truly seeing the whole picture and learning from it.

When you stop hunting for a culprit, you start seeing the connections.

You see that the flawed inspection protocol and the poorly designed user interface are part of the same organizational mindset. A mindset that prioritizes speed over thoroughness, or cost savings over human factors engineering. You see how a series of small budget cuts over 3 years created a culture where people were afraid to report near-misses for fear of being blamed. These aren’t root causes. They are the fertile soil in which failures grow. Looking for a single broken part is like blaming a single weed for a field of them; you miss the fact that the soil itself is the problem.

Fragility in Logistics

This is especially true in logistics, where every handoff is another place for something to go wrong and small errors compound from one step to the next. A container is scanned incorrectly at the port. That’s a small error. But it’s loaded onto the wrong chassis because the driver is working with a manifest that’s 3 minutes out of date. Then it gets routed to the wrong warehouse, causing a 13-hour delay for a critical manufacturing part, which in turn costs the client $373,000. What was the root cause? The scanner? The data lag? The driver?

1. Container scanned incorrectly
2. Loaded onto wrong chassis
3. Routed to wrong warehouse
Result: costs client $373,000

The real issue is the fragility of the connections between those steps.

A truly competent intermodal drayage services provider understands that their piece of the puzzle is connected to 23 other moving parts, and they build resilience at those connection points.
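
How quickly fragile connections add up can be sketched with a little Python. This is a rough illustration under stated assumptions: the 99% per-handoff success rate and the 90% catch rate are hypothetical figures, the count of 23 is borrowed from the sentence above, and each handoff is assumed to succeed or fail independently.

```python
def end_to_end_success(per_handoff_success, handoffs):
    """Chance that a shipment clears every handoff, assuming independence."""
    return per_handoff_success ** handoffs


def with_check(per_handoff_success, catch_rate):
    """Success rate of one handoff once a verification step catches
    the given fraction of its errors before they propagate."""
    error_rate = 1.0 - per_handoff_success
    return 1.0 - error_rate * (1.0 - catch_rate)


HANDOFFS = 23          # borrowed from the text; treated as a chain here
PER_HANDOFF = 0.99     # assumed: each handoff goes right 99% of the time
CATCH_RATE = 0.90      # assumed: a check at each connection catches 90% of errors

fragile = end_to_end_success(PER_HANDOFF, HANDOFFS)
resilient = end_to_end_success(with_check(PER_HANDOFF, CATCH_RATE), HANDOFFS)

print(f"no checks at the connections:   {fragile:.1%} of shipments arrive clean")
print(f"with checks at the connections: {resilient:.1%} of shipments arrive clean")
```

With these assumed numbers, a chain of individually reliable steps still fails roughly one shipment in five, and a modest check at each connection recovers most of that loss. No single step looks broken in either case, which is the point: the reliability lives in the connections.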

I am not saying that individuals are never accountable. People make mistakes. Some mistakes are negligent. But accountability is not the same as causation.

Blaming a person: Easy. People can be fired. Simplistic comfort.

Blaming the system: Hard. Requires redesign. True transformation.

Blaming the system is hard, because systems have to be redesigned, rethought, and rebuilt. It requires humility. It requires admitting that our creation is flawed and that we don’t have total control.

The Only True Answer

So back in that conference room, when Mark asked for the third time, the lead engineer, a quiet woman named Maria, finally spoke up. She didn’t give him a root cause. She didn’t blame a person or a part.

[Figure: The Solution: A Process-Oriented View. 18 months of slow erosion and 13 decisions.]

“Mark, the cause was a slow erosion of our safety margins over eighteen months. We can give you a timeline showing the 13 decisions that led here. There is no one thing to fix. We have to fix the process that allowed us to make those 13 decisions.”

It wasn’t the answer he wanted. It was simple, but not simplistic. It was the only answer that was true.

Embracing complexity is the first step towards true understanding and lasting solutions.