A Series C VP of Eng forwarded me their Datadog invoice last month with a one-line message: “explain this.”
Logs were 60% of the bill. They swore they hadn’t changed their logging volume. The bill just kept climbing anyway.
So I walked them through how log billing actually works. Turns out they’d been paying twice for the same log line and didn’t know it.
Two meters, not one
Datadog logs run on two separate meters.
First you pay to ingest. Roughly $0.10 per GB of logs sent. That’s the part everyone knows about.
Then, if you want those logs to be searchable, you pay again to index them. Around $1.70 per million events for 15-day retention. That’s the part that quietly eats your budget.
Ingest is the cover charge. Indexing is the bar tab. The bar tab is where it gets expensive.
50M log events/day, indexed at 15 days:
50M × $1.70/M = $85/day → ~$2,550/mo (indexing)
+ ingest at $0.10/GB (implies ~300 GB/day) ~$900/mo
----------------------------------------
~$3,450/mo for logs you mostly never query
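If you want to plug in your own numbers, here is a minimal sketch of the two-meter math in Python. The rates are the rough list prices quoted above, not a current price sheet. The 300 GB/day figure is my assumption, back-solved from the ~$900/mo ingest line, and the function name is made up for illustration.

```python
def monthly_log_cost(events_per_day, gb_per_day,
                     ingest_per_gb=0.10,        # assumed rate, $ per GB ingested
                     index_per_million=1.70,    # assumed rate, $ per 1M events, 15-day retention
                     days=30):
    """Return (ingest, indexing, total) in dollars per month."""
    ingest = gb_per_day * ingest_per_gb * days
    indexing = events_per_day / 1_000_000 * index_per_million * days
    return ingest, indexing, ingest + indexing


# 50M events/day; 300 GB/day is an assumption that reproduces ~$900/mo of ingest
ingest, indexing, total = monthly_log_cost(50_000_000, 300)
print(f"ingest   ${ingest:,.0f}/mo")
print(f"indexing ${indexing:,.0f}/mo")
print(f"total    ${total:,.0f}/mo")
```

With those inputs the total comes out to about $3,450/mo, and roughly three quarters of it is the indexing meter.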
Most teams index everything by default because that’s the path of least resistance. Six months later, log volume has doubled because they added 12 services, and nobody connected the dots.
It grows even when you do nothing
That’s the trap. Your log bill goes up while you sit still.
Every new service logs. Every deploy adds debug lines someone forgot to remove. Every retry storm during an incident dumps a tower of logs right when you can least afford to think about cost.
The bill is up 30-50% year over year for most teams I see. Not because anyone decided to log more. Because logging more is the default and nobody owns the meter.
How to get it back down
You don’t need to log less. You need to index less.
Split ingest from indexing. Send everything. Index only what you’ll actually search. Datadog lets you set index filters and exclusion rules, so use them. Most teams index 100% and query maybe 5%.
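To see what that does to the bill, here is a rough what-if on the same 50M events/day, varying only the share you index. It assumes the indexing rate stays at about $1.70 per million events, which is a simplification, and the names are hypothetical. Ingest cost is unchanged because you still send everything.

```python
EVENTS_PER_DAY = 50_000_000
INDEX_PER_MILLION = 1.70   # assumed rate, $ per 1M indexed events
DAYS = 30


def monthly_indexing_cost(indexed_fraction):
    """Indexing cost per month if only this fraction of events is indexed."""
    indexed_events = EVENTS_PER_DAY * indexed_fraction
    return indexed_events / 1_000_000 * INDEX_PER_MILLION * DAYS


# Index everything, a quarter, or only the ~5% most teams actually query
for fraction in (1.00, 0.25, 0.05):
    cost = monthly_indexing_cost(fraction)
    print(f"index {fraction:.0%} of events: ${cost:,.2f}/mo")
```

The cost scales linearly with the indexed share, so indexing 5% instead of 100% cuts the indexing line by 95% in this model.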
Tier your retention. Hot searchable logs for 3-7 days. Everything else goes into cheap archive storage on S3 that you can rehydrate if an incident needs it. You almost never need 15-day instant search on debug logs.
Kill the noisy talkers first. Pull the per-service log volume report. It’s always the same shape: two or three services produce 80% of the volume, usually from health checks, retries, or a debug line that shipped to prod by accident. Fix those and the bill drops without touching anything that matters.
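Once you have per-service volumes exported from that report, finding the noisy talkers is a few lines. This is a sketch only: the service names and GB/day figures below are invented sample data, and the function is hypothetical, not part of any Datadog tooling.

```python
def noisy_talkers(volume_by_service, threshold=0.80):
    """Smallest set of top services that together reach `threshold` of total volume."""
    total = sum(volume_by_service.values())
    ranked = sorted(volume_by_service.items(), key=lambda kv: kv[1], reverse=True)
    picked, running = [], 0
    for service, volume in ranked:
        if running >= threshold * total:
            break
        picked.append(service)
        running += volume
    return picked


# Invented sample data: GB/day per service
sample = {
    "checkout": 120,
    "healthcheck-proxy": 95,
    "auth": 30,
    "search": 12,
    "billing": 8,
    "email": 5,
}
top = noisy_talkers(sample)
print(top)
```

In this made-up sample, three of the six services cover 80% of the volume, which is the shape to look for in your own report.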
And sample the high-volume low-value stuff. You do not need every single 200 OK access log. You need a representative sample plus 100% of the errors.
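The sampling rule is simple enough to sketch. This assumes access logs carry an HTTP status code and that "error" means status 400 and up; both are my assumptions, and in practice you would apply the rule in your log shipper or pipeline rather than in application code.

```python
import random


def should_keep(status_code, sample_rate=0.01, rng=random):
    """Keep every error; keep only a random sample of everything else."""
    if status_code >= 400:      # assumption: 4xx and 5xx count as errors
        return True
    return rng.random() < sample_rate


# Invented traffic: 10,000 successful requests and 25 server errors
rng = random.Random(42)         # seeded only so the demo is repeatable
events = [200] * 10_000 + [500] * 25
kept = [status for status in events if should_keep(status, 0.01, rng)]
print(f"kept {len(kept)} of {len(events)} events, "
      f"including {kept.count(500)} of 25 errors")
```

All 25 errors survive regardless of the sample rate, while the 200s shrink to roughly 1% of their original count.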
Nobody owns the bill
That’s the real problem underneath all of this.
Engineers add logging to debug something and never remove it. Finance sees a number go up and asks engineering, who shrug. The bill is everyone’s and therefore no one’s.
The fix isn’t a tool. It’s making one person responsible for the ingest-to-index ratio and reviewing it monthly. Treat telemetry volume like you treat your cloud bill, because it is one now.
One caveat: pricing tiers shift, so check Datadog’s current numbers before you quote them in a board deck. The mechanic, pay to send then pay again to search, is the part that doesn’t change.
If your observability bill is climbing faster than your traffic, I broke down where the money actually goes in my guide on reliability costs.
— Youn