I watched a term sheet drop by $8M in real time.

The company had a 4-hour outage three weeks before due diligence. Pretty bad, but not catastrophic. They fixed it, wrote the postmortem, moved on.

Then the investor’s technical team asked: “What was the total financial impact?”

They didn’t have a number. Nobody had calculated it.

$8M haircut. Four hours of downtime.

The number you don’t know

I ask every CTO I work with: what does one hour of downtime actually cost you?

Most don’t know. The ones who guess are usually off by 3x.

Here’s the formula they should be running:

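# Every term is in dollars per hour of downtime.
hourly_cost = (
    lost_transactions_per_hour  # revenue from lost transactions, not the count
    + (engineering_hourly_rate * engineers_in_war_room)
    + support_ticket_surge_cost
    + sla_penalty_per_hour
    + (enterprise_deal_value * probability_deal_goes_cold)  # expected pipeline loss
)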
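
As a minimal sketch, the formula can be wrapped in a small Python function. The parameter names mirror the formula. Every figure in the usage example is hypothetical, and each term is assumed to already be expressed in dollars per hour of downtime.

```python
def hourly_downtime_cost(
    lost_transactions_per_hour: float,
    engineering_hourly_rate: float,
    engineers_in_war_room: int,
    support_ticket_surge_cost: float,
    sla_penalty_per_hour: float,
    enterprise_deal_value: float,
    probability_deal_goes_cold: float,
) -> float:
    """Estimate the dollar cost of one hour of downtime."""
    if not 0.0 <= probability_deal_goes_cold <= 1.0:
        raise ValueError("probability_deal_goes_cold must be between 0 and 1")
    return (
        lost_transactions_per_hour
        + engineering_hourly_rate * engineers_in_war_room
        + support_ticket_surge_cost
        + sla_penalty_per_hour
        # Expected value of pipeline lost because a prospect saw the outage.
        + enterprise_deal_value * probability_deal_goes_cold
    )


# Hypothetical inputs, for illustration only.
example_cost = hourly_downtime_cost(
    lost_transactions_per_hour=20_000,
    engineering_hourly_rate=150,
    engineers_in_war_room=8,
    support_ticket_surge_cost=2_500,
    sla_penalty_per_hour=5_000,
    enterprise_deal_value=400_000,
    probability_deal_goes_cold=0.05,
)
print(f"${example_cost:,.0f} per hour")
```

With these made-up inputs, the expected pipeline loss ($20,000) is as large as the lost transaction revenue, and it is the term least likely to appear in a postmortem.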

I’ve seen one bad quarter tank a Series C. The company had 99.9% uptime. Sounds great. That’s about 8.8 hours of downtime per year. At their scale, those hours cost $2.1M.
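
The uptime arithmetic is easy to check. Here is a rough sketch, assuming a 365-day year and assuming the $2.1M is spread evenly across the downtime (a simplification, since real incidents vary in cost):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; ignores leap years


def annual_downtime_hours(uptime_percent: float) -> float:
    """Hours of downtime per year allowed by a given uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


for uptime in (99.0, 99.9, 99.99):
    hours = annual_downtime_hours(uptime)
    print(f"{uptime}% uptime -> {hours:.2f} hours down per year")

# Hourly cost implied by $2.1M across the 99.9% downtime budget.
implied_hourly_cost = 2_100_000 / annual_downtime_hours(99.9)
print(f"Implied cost: ${implied_hourly_cost:,.0f} per hour of downtime")
```

At 99.9%, the budget is 8.76 hours a year, which puts the implied cost at roughly $240k per hour.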

Your Series B was probably $15-30M. Do the math.

The costs that don’t show in dashboards

MTTR is a nice metric. Clean. “We fixed it in 47 minutes.” Board nods, moves on.

But I’ve seen what happens after you close that incident.

That enterprise prospect who saw your status page? They went cold. You’ll never know why. Their procurement lead will just say “we’re going another direction.” Three months of pipeline, gone. It won’t show up in any postmortem.

The customer who experienced your outage doesn’t churn that week. They churn when their contract is up. By then you’ve forgotten about the incident. They haven’t.

SOC 2 auditors ask about incident frequency. Your cyber liability insurance tracks every major outage. I’ve seen premiums jump 40% after a bad year. That cost hits 14 months later, and nobody connects it back to that Thursday afternoon when everything went sideways.

Engineers talk. “Oh, that company? Down every other week.” Good luck hiring senior talent when your Glassdoor mentions on-call burnout.

The 60% problem

Only 40% of companies can produce a financial impact report after an outage.

The other 60% are guessing. Or worse, hoping the board doesn’t ask.

Here’s the thing about hidden costs: they compound. The enterprise deal that went cold leads to missed targets, which leads to a lower valuation, which leads to a harder fundraise. One outage, rippling for 18 months.

What actually works

Stop treating reliability as a technical problem.

First, know your number. Calculate the actual cost of your last 5 incidents. Include everything. The delayed effects. The opportunity cost of engineers who should be building features. The deal that went quiet.

Second, report in dollars, not percentages. “99.9% uptime” means nothing to your board. “$340k in incident costs this quarter” means everything.

Third, invest before the bad one happens. Every company has a really bad outage eventually. The question is whether you catch it in staging or production. Whether it’s 10 minutes or 10 hours. Whether you have traces that show you exactly what broke or you’re grepping logs at 3am.
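
The first two steps can be sketched as a small incident ledger that rolls up to a dollar figure per quarter. This is one possible shape, not a prescribed method: the `Incident` fields, the three-way cost split, and every number below are hypothetical, and the delayed costs are estimates by nature.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Incident:
    day: date
    duration_hours: float
    direct_cost: float    # lost revenue, SLA penalties, support surge
    engineer_cost: float  # war-room time plus features not built
    delayed_cost: float   # estimate: cold deals, churn at renewal, premiums

    @property
    def total_cost(self) -> float:
        return self.direct_cost + self.engineer_cost + self.delayed_cost


def cost_by_quarter(incidents: list[Incident]) -> dict[str, float]:
    """Sum total incident cost per calendar quarter, keyed like '2024-Q1'."""
    totals: dict[str, float] = defaultdict(float)
    for incident in incidents:
        quarter = (incident.day.month - 1) // 3 + 1
        totals[f"{incident.day.year}-Q{quarter}"] += incident.total_cost
    return dict(totals)


# Hypothetical incidents, for illustration only.
incidents = [
    Incident(date(2024, 1, 18), 0.8, 40_000, 6_000, 25_000),
    Incident(date(2024, 2, 9), 4.0, 120_000, 19_000, 130_000),
    Incident(date(2024, 5, 2), 0.2, 5_000, 2_000, 0),
]

report = cost_by_quarter(incidents)
for quarter, total in sorted(report.items()):
    print(f"{quarter}: ${total:,.0f} in incident costs")
```

Keeping the delayed costs in their own field makes it obvious which part of the total is measured and which part is an estimate someone has to own and revisit.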

Your next outage is coming. The only question is how much it costs you.

I put together a framework for calculating these hidden costs at reliability costs. Run your numbers before someone else does.

— Youn