
The operation was successful but the patient is dead — A guide on balancing systems and outcomes

12–18 minute read

The companion piece to Let it run, this guide adds the critical counterweight: running a system is not the same as running it toward something. For hospital leaders, it covers Goodhart’s Law, Campbell’s Law on metric gaming, Deming’s drive-out-fear principle, Edmondson’s psychological safety, and Mauboussin’s skill-luck continuum — applied to healthcare.

The provocative title of this article originates from surgical medicine. It describes a procedure that, by every technical standard, went exactly as it should have — correct incisions, correct sutures, correct sequence. The team followed protocol to the letter. And yet the patient died. The operation succeeded. The patient did not survive.

It is a disturbing phrase. And it is an increasingly recognisable pattern in hospital administration.


The previous piece in this series made the case for systems — structured, recurring processes with a named owner, defined steps, a set cadence, and a feedback loop. It argued that the question, once a system is live, should be “is the system running like we wanted it to?” and not “are the results here yet?” Gestation takes time. Let it run.

That advice holds. But it needs a companion. Because a system that runs perfectly while changing nothing has not succeeded. It has merely consumed resources with great discipline.

Last time, we left Dr. Sharma having done something unusual for a hospital administrator: he had stopped chasing referrals and started building a system. His GP liaison officer had a visit schedule, a follow-up template, a tracking sheet, and a named owner. The question Dr. Sharma had committed to asking was not “are referrals up?” but “is the system running as designed?” He had decided, wisely, to let it run.

Three months in, the system is running. Now comes the harder question.

The question is not only whether the system is running. It is also whether it is running toward anything.

Three ways a system can fail you while appearing to succeed

The most dangerous systems are not the ones that obviously break down. Those are easy to diagnose. The dangerous ones are the systems that keep running — meetings are held, reports are filed, boxes are checked — while the outcomes they were designed to produce quietly fail to materialise. There are three distinct failure modes that produce this pattern.

Judging too early

I covered this in the previous article. Systems that address Trust and Awareness — building GP referral relationships, establishing a reputation in the community, creating content that educates prospective patients — operate on long gestation cycles. You plant in one season and harvest in another. The temptation to declare a system broken because results have not arrived in week six is a reflex, not a diagnosis.

Measuring the wrong thing

In 1975, British economist Charles Goodhart observed something that has since become one of the most useful principles in management: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” The anthropologist Marilyn Strathern later gave it a cleaner form: “When a measure becomes a target, it ceases to be a good measure.” This is Goodhart’s Law.

The failure mode works like this. A hospital wants to improve GP referrals — a genuine outcome. So they begin tracking the number of GP visits their liaison team makes each month. For a while, this works. More visits, more relationships, more referrals. Then the measure becomes a target. The liaison team, under pressure to hit their visit numbers, starts scheduling short drop-by visits to clinics they already know well. The count looks good. The relationships being built are shallow. Referrals plateau.

The system is running perfectly, as measured. The outcome is stagnating. The measure has become the goal.

Goodhart’s Law is not an argument against measurement — it is an argument for choosing measures that remain honest under pressure, and for watching what behaviour the measure is actually incentivising.

What Goodhart describes passively, Campbell’s Law describes actively. The social scientist Donald T. Campbell articulated a related principle in 1979: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”

Where Goodhart observed that measures degrade over time, Campbell predicted that people would actively game them — especially when the stakes are high. The distinction matters in practice.

  • Goodhart’s version is drift: the liaison team schedules easy visits to familiar clinics without intending to deceive anyone — they are simply optimising for what gets counted.
  • Campbell’s version is deliberate gaming: the liaison logs an inbound referral call — one the GP clinic initiated — as a successful outreach touchpoint. The referral count rises. No outreach was done. Or: a ward team adjusts the recorded discharge time to meet a bed turnover target, while the patient physically leaves at the same hour they always did. The metric moves. The throughput does not.

In management theory, this connects to what is called the principal-agent problem: the agent (the person running the system) has access to information the principal (the hospital leader) does not, and when incentives are misaligned, the agent acts in their own interest rather than the principal’s.

Measurement without verification creates the gap. Performance pressure fills it with gaming.

None of this makes the people gaming systems bad actors. They are, in most cases, rational actors responding to the incentives in front of them.

Deming’s 94% rule applies here too — the system created the conditions; the person found the path of least resistance. The corrective is structural, not moral: design measures that are harder to game, pair quantitative metrics with qualitative spot-checks, and reduce the pressure that makes gaming feel like the rational choice.

Treating system design as fixed

The standard operating procedure (SOP) you wrote when the system launched was a hypothesis, not a verdict. It was your best guess, given what you knew at the time, about how to produce a particular outcome. Evidence from the real world is now flowing back. If evidence suggests the hypothesis needs updating, the system must be updated.

The failure mode here is common in organisations that have invested heavily in process documentation: the SOP becomes scripture. Updating it feels like admitting the original design was wrong, which feels like failure. So the system runs. The SOP stays unchanged. The evidence accumulates on the desk of the system owner, unacted on.

Consider a post-discharge follow-up system. The SOP specified a phone call to every patient 48 hours after discharge — a reasonable design, written at a desk, based on how follow-up had always been done. After two months, the data showed callback rates under 20%. Patients were not answering numbers they did not recognise. The system owner reported this at the monthly review. The SOP was revised: a WhatsApp message first, identifying the hospital and the purpose, followed by a call if there was no response within 24 hours. Callback rates climbed to 65%.

What had changed? Not the goal — post-discharge contact to reduce readmissions and build patient trust. Not the owner, not the cadence. Only the method, updated in light of evidence. The revision did not mean the original design was wrong. It meant the system was doing what a well-designed system should do: learning. The team that resists this revision — because the SOP was agreed upon, because changing it implies someone was mistaken, because revision creates paperwork — will watch their callback rate stay at 20% indefinitely, in full compliance with their process.
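The revised SOP can be read as a small escalation rule. The sketch below is one hypothetical way to express it in Python. The `FollowUp` record, its field names, and the step labels are illustrative assumptions, not part of any real hospital tooling; only the message-first order and the 24-hour window come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowUp:
    # Hours since the WhatsApp message was sent; None if not sent yet.
    hours_since_message: Optional[float]
    # True once the patient has replied to the message or answered a call.
    patient_responded: bool


def next_step(follow_up: FollowUp) -> str:
    """Return the next action under the revised (message-first) SOP."""
    if follow_up.hours_since_message is None:
        # Start with a message identifying the hospital and the purpose.
        return "send_message"
    if follow_up.patient_responded:
        return "done"
    if follow_up.hours_since_message >= 24:
        # No response within 24 hours: escalate to a phone call.
        return "call"
    return "wait"


print(next_step(FollowUp(hours_since_message=30, patient_responded=False)))
```

The goal, owner, and cadence do not appear in the rule at all, which is the point: only the method changed.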

All SOPs must be treated as living documents. A living document is one that changes when the world changes. An SOP that has never been revised is a museum piece, not a management tool.

The review meeting is doing more work than you think

Before getting to the balance framework, there is a structural issue worth naming — one that often makes all three failure modes worse: the way most system review meetings are run.

The default format is something like this. The system owner presents their numbers. The leader listens. Questions are asked. The tone is evaluative. Somewhere in the room, performance pressure is present — perhaps openly, perhaps not. The system owner, sensing this, makes the numbers look as good as they reasonably can.

Once again, it was W. Edwards Deming, the management theorist whose thinking shaped the quality revolution in post-war Japanese manufacturing and much of modern operations management, who identified this dynamic as one of the most destructive forces in any organisation. His eighth point for management: drive out fear. His observation, supported by decades of data: when fear governs behaviour, the data becomes unreliable, not because people are dishonest, but because honesty feels dangerous.

A review meeting that subtly places blame on the system owner for outcomes that the system, not the owner, is responsible for is solving the wrong problem.

Amy Edmondson at Harvard Business School arrived at a related insight through a different route. In her landmark research on hospital nursing teams in the 1990s, she found something counterintuitive: higher-performing teams reported more medication errors than lower-performing ones. This was not because they made more errors, but because they were psychologically safe enough to report them honestly. Teams with high psychological safety brought problems to the surface. Teams without it kept problems hidden until they were impossible to ignore.

The implication for system review meetings is direct. If the meeting feels like an appraisal — of the system owner, of their commitment, of their competence — you will not get as much honesty.

The alternative is for the leader to sit, figuratively, on the same side of the table as the system owner. Not evaluating the person, but examining the system together. The question becomes what we can learn, not who is responsible.

The balance framework: holding process and outcomes separately

The core discipline of this article can be stated plainly. Process compliance and outcome accountability are two different questions, and they need to be tracked separately, with different cadences and different emotional registers.

Robert Kaplan and David Norton introduced the Balanced Scorecard in the Harvard Business Review in 1992 to address a related problem: organisations that managed only their financial results — what had already happened — with no visibility into the drivers of those results, meaning what was happening now that would produce results later.

  • Their framework distinguished between lagging indicators (outcomes: revenue, patient volume, market share) and leading indicators (process health: GP relationships built, referrals initiated, satisfaction scores).
  • The insight was that leading indicators are more actionable and more honest — they tell you what you can still influence, not just what has already occurred.

The process/outcome separation this article is describing follows the same logic. Process metrics are leading indicators: the GP liaison made eight clinic visits this week, the follow-up call was made within 24 hours, the patient satisfaction survey was sent within 48 hours of discharge. These tell you whether the system is running. Outcome metrics are lagging indicators: referral volume, bed occupancy, patient acquisition cost. These tell you whether the system is working.

Tracking them separately gives you three possible diagnoses.

  1. The system is running and outcomes are moving: the system is working. Maintain it.
  2. The system is running but outcomes are not moving: the system may need revision, or the gestation period has not elapsed. Investigate without panicking.
  3. The system is not running: this is a compliance problem. Fix the system first before drawing any conclusions about outcomes.
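The three diagnoses amount to a two-question decision rule, with the compliance question asked first. The sketch below is a minimal illustration in Python; the `Diagnosis` names and the `diagnose` function are hypothetical labels chosen for this example, not part of any established framework.

```python
from enum import Enum


class Diagnosis(Enum):
    MAINTAIN = "System is running and outcomes are moving: maintain it."
    INVESTIGATE = "System is running but outcomes are flat: revise or wait."
    FIX_COMPLIANCE = "System is not running: fix compliance first."


def diagnose(system_running: bool, outcomes_moving: bool) -> Diagnosis:
    """Map the process signal and the outcome signal to a diagnosis."""
    if not system_running:
        # Outcomes say nothing about a system that is not being run.
        return Diagnosis.FIX_COMPLIANCE
    if outcomes_moving:
        return Diagnosis.MAINTAIN
    return Diagnosis.INVESTIGATE


print(diagnose(system_running=True, outcomes_moving=False).value)
```

Checking `system_running` first mirrors the third diagnosis: when the process is not being followed, the outcome signal is ignored rather than interpreted.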

Counting chickens before they hatch

The more dangerous direction, however, is the one that receives less attention: just because results are coming does not mean the system is responsible for them.

And even when the system is responsible, that does not mean you understand why it is working — which means you cannot reliably replicate or scale it.

Annie Duke, in her book Thinking in Bets, gives a name to a cognitive error she calls resulting: the tendency to judge the quality of a decision — or in this case, a process — by the quality of its outcome.

The error runs in both directions. A good process can produce a bad outcome because of factors outside your control. A bad process can produce a good outcome for the same reason. Treating either result as a verdict on the process is a thinking error, not a management methodology.

Michael Mauboussin, in The Success Equation, frames this through what he calls the skill-luck continuum. Outcomes in most real-world domains — business, investing, marketing, sales — contain a mix of skill (what your process contributes) and luck (what the environment contributes). In domains with significant luck, a single good outcome tells you very little about process quality, and neither does a single bad one. What you are looking for is a consistent signal across many cycles, not a verdict from any one.
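A toy simulation can make the point concrete. The sketch below assumes, purely for illustration, that each cycle's outcome is a fixed skill contribution plus random luck. The additive model and every number in it are invented for this example; they are not taken from Mauboussin or from any hospital data.

```python
import random
import statistics


def simulate_outcomes(skill: float, luck_sd: float, cycles: int, seed: int = 0) -> list:
    """Return one outcome per cycle: a constant skill term plus Gaussian luck."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [skill + rng.gauss(0.0, luck_sd) for _ in range(cycles)]


# Hypothetical inputs: the process adds 5 points per cycle, luck swings by about 10.
outcomes = simulate_outcomes(skill=5.0, luck_sd=10.0, cycles=1000)

spread = max(outcomes) - min(outcomes)
average = statistics.mean(outcomes)

print(f"first cycle:            {outcomes[0]:.1f}")
print(f"best-to-worst spread:   {spread:.1f}")
print(f"average over all cycles: {average:.1f}")
```

Under these assumptions any single cycle can land far above or below the process's true contribution, while the average over many cycles settles close to it.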

The practical implication: when results arrive, ask whether the system deserves credit for them. When results do not arrive, ask whether the system deserves the blame. The answer is almost always “partially” — and the discipline is in figuring out which part.

Dr. Sharma’s diagnostic

Three months into running his GP liaison system — weekly visit schedule, templated follow-up messages, progress tracked in a shared sheet — Dr. Sharma called a review meeting with his team.

The numbers were encouraging. Referral volumes from the GP network had risen 18% over the quarter. The team was energised. Someone suggested they had “cracked the code” on referrals.

Dr. Sharma asked a question that changed the room: “Which part of the system produced that 18%?”

Nobody could answer cleanly. The team had been consistent — visits were made, follow-ups sent, tracking maintained. But a large clinic three kilometres away had also closed in February, and their patient load had redistributed across the catchment area. Some portion of the 18% was the system. Some portion was the clinic closure. The ratio was unknown.

Dr. Sharma did not deflate the achievement. He made two notes. First: the system had run as designed — that was genuinely good, and worth holding onto. Second: before assuming the current cadence and template were optimal, they needed more cycles, more data, and a cleaner signal. The next quarter would not have the clinic closure as a variable. That would be the real test.

This is the discipline that separates a systems thinker from someone who merely runs systems. The distinction between “the system ran” and “the system worked” — and between “results arrived” and “the system deserves credit” — is not pedantry. It is the only honest basis for deciding whether to scale a system, revise it, or leave it alone.

The practical test

When reviewing a running system, there is a question worth asking before all others:

“Is this system doing what we designed it to do — and how would we know if it wasn’t?”

This question has two parts for a reason.

  • The first part checks compliance. The second part checks whether your measurement infrastructure is honest.
  • If you cannot answer the second part — if the system could be running perfectly, or running quietly off the rails, and you would not be able to tell the difference from your current data — then the first thing to fix is the visibility, not the system.

Toyota’s engineers call this going to the gemba — the Japanese word for the actual place where value is created. The principle, attributed to Taiichi Ohno and captured in the phrase genchi genbutsu (“go and see for yourself”), is that conclusions drawn from reports and dashboards are always one step removed from reality. The system review should include some direct observation of the system in action: sitting in on a GP liaison visit, reading a sample of the follow-up messages, talking to the receptionist who logs the referrals. Data tells you what happened. The gemba tells you why.

The system review is not complete until both parts of that question have been answered.

Systems are tools, not trophies

This series began with a problem: hospitals that invest in clinical capability but do not see it translate into patient volume. The Jobs to Be Done (JTBD) framework gave us a way to diagnose which stage of the patient journey was creating friction. The previous article gave us a way to address that friction through systems: structured, recurring processes that compound over time rather than reset with every isolated effort.

This article adds the final discipline: the system is not the point. The patient is.

A system that runs impeccably but moves no patients closer to care has failed its purpose. It has become a trophy — something the organisation can point to as evidence of operational discipline, while the gap between capability and patient volume quietly remains.

The surgical metaphor in the title is not an indictment of process. Surgical protocols save lives. It is a reminder of what process is for. The procedure serves the patient. The system serves the outcome. When they diverge — when the operation succeeds and the patient is not helped — the operation has failed, however clean the incision.

Build systems. Let them run. And keep asking what they are running toward.


If this was useful, there’s more where it came from.

I’m Aviral. I help Indian healthcare organisations grow and run better, by putting the right systems in place. Subscribe to stay updated.
