
BigQuery cost spikes usually come from table shape, not queries

When BigQuery spend jumps, the cause is usually model shape, weak incremental design, or unnecessary reprocessing, not a single bad query.

By Ivan Richter

Last updated: Mar 24, 2026

4 min read


The rule

When BigQuery cost spikes, we look at table shape before we look at query cleverness.

Bad queries can waste money. That’s real. But most expensive warehouses aren’t expensive because one person wrote reckless SQL on a Tuesday. They’re expensive because the system keeps doing more work than the business question required, and it does that every day.

By the time someone opens the bill and starts looking for a query to blame, the waste has usually already been built into the model.

Cost problems usually start in the model

A warehouse table is part of the cost model, not just the semantic model.

If a table is too wide, carries duplicated attributes, or still looks like a cleaned-up staging artifact instead of a stable business entity, every downstream query pays for that decision. Consumers scan columns they don’t need. They repeat joins the platform should’ve resolved once. They work around unclear grain because the table never became precise enough to trust.

That’s why decision boundaries matter even when the concern is cost. Good model shape doesn’t just make SQL easier to read. It reduces how much repeated work the warehouse has to do.
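BigQuery's on-demand pricing bills the bytes of the columns a query actually reads, so table width translates directly into spend. A rough sketch of that arithmetic, with entirely made-up column names and per-row byte widths:

```python
# Illustrative only: BigQuery is columnar, so a query pays for the
# columns it references, not the whole row. Sizes below are invented.

AVG_BYTES_PER_VALUE = {
    "order_id": 8,
    "customer_id": 8,
    "order_total": 8,
    "raw_payload_json": 2000,  # staging leftover nobody downstream needs
    "audit_blob": 500,
}
ROWS = 100_000_000

def scanned_bytes(columns, rows=ROWS):
    """Bytes billed for a full scan of the given columns."""
    return sum(AVG_BYTES_PER_VALUE[c] for c in columns) * rows

wide = scanned_bytes(AVG_BYTES_PER_VALUE)            # effectively SELECT *
narrow = scanned_bytes(["order_id", "order_total"])  # what the question needed

print(f"SELECT *         : {wide / 1e12:.2f} TB")
print(f"narrow projection: {narrow / 1e12:.2f} TB")
```

The ratio is the point: a table that carries a fat staging column makes every consumer pay for it on every scan, no matter how clean their SQL is.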

It’s also why cost reviews that begin and end with query tuning usually go nowhere. You can improve the SQL and still keep paying for the same structural mistake because the table is wrong in a way every consumer inherits.

Weak incrementals turn uncertainty into spend

A lot of BigQuery waste comes from incremental models that are fast when everything is clean and expensive the moment trust drops.

That usually happens when change detection is vague. The model can’t say exactly what changed, so the system compensates in predictable ways. Refresh a wider window. Rewrite more partitions than necessary. Rerun the same correction logic to be safe. Pull in more upstream data than the downstream table actually needed.

None of that looks dramatic when you read the SQL in isolation. The query can look perfectly reasonable. The cost comes from how often the system has to fall back to overprocessing because the incremental path isn’t specific enough.

That’s why explicit change detection matters. Once the model loses precision around change, it starts buying safety with compute.
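The difference between vague and explicit change detection can be sketched in a few lines. Everything here is hypothetical, but the shape of the comparison is the one that shows up in real bills: a safety window rewrites every partition it covers, while explicit change records rewrite only the partitions a change actually touched.

```python
from datetime import date, timedelta

# Sketch, not production code: vague lookback window vs explicit
# change detection for a daily-partitioned table. Names are invented.

def window_partitions(run_date, lookback_days):
    """Vague strategy: rewrite every daily partition in a safety window."""
    return {run_date - timedelta(days=d) for d in range(lookback_days)}

def changed_partitions(changed_rows):
    """Explicit strategy: rewrite only partitions with an observed change."""
    return {row["event_date"] for row in changed_rows}

run = date(2026, 3, 24)
changes = [
    {"id": 1, "event_date": date(2026, 3, 23)},
    {"id": 2, "event_date": date(2026, 3, 23)},
    {"id": 3, "event_date": date(2026, 3, 10)},
]

vague = window_partitions(run, lookback_days=30)  # 30 partitions rewritten
precise = changed_partitions(changes)             # 2 partitions rewritten
print(len(vague), "partitions vs", len(precise))
```

Both strategies produce a correct table. Only one of them buys that correctness by rewriting fifteen times more data every run.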

Stale-row fear gets expensive fast

You see the same thing once trust in old rows starts to slip.

At that point, teams usually start paying for confidence with brute force. Fact models get rebuilt more often than they should. Ranges wider than the actual change get reprocessed. Cleanup jobs keep running because nobody wants to find out two weeks later that the table drifted quietly and the dashboard’s been lying with a straight face.

That isn’t a separate cost issue. It’s the operational shadow of a correctness issue.

That’s the logic behind stale-row handling. If stale-row handling is weak, BigQuery doesn’t care what name you give the problem. It still bills the extra work.
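The alternative to rebuilding out of fear is comparing source and target update markers and touching only rows that actually drifted. A minimal sketch, with hypothetical row IDs and integer timestamps standing in for real update metadata:

```python
# Sketch: detect stale and orphaned rows by comparing update markers,
# instead of rebuilding the whole fact table. Data is invented.

source = {"a": 5, "b": 7, "c": 9}   # row_id -> last_updated upstream
target = {"a": 5, "b": 6, "d": 3}   # row_id -> last_updated as materialized

# Rows missing from the target, or older there than upstream, need a rewrite.
stale = {k for k in source if target.get(k, -1) < source[k]}

# Rows that no longer exist upstream need deleting, not another full rebuild.
orphaned = set(target) - set(source)

print("refresh:", sorted(stale))
print("delete :", sorted(orphaned))
```

The scan that computes this comparison is cheap and bounded. A scheduled full rebuild is neither, and it runs whether or not anything drifted.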

Orchestration can multiply waste

Cost also climbs when orchestration starts carrying logic that belongs in the model.

A scheduler branch reruns cleanup that should’ve been expressed once in SQL. A backfill path becomes permanent because nobody wants to remove it. Two tasks end up reading nearly the same data to produce slightly different versions of the same table. Runtime switches create multiple ways to build an output that should only have one path.

Again, there may be no single terrible query in any of this. The waste comes from the total amount of unnecessary work the platform now treats as normal.

That’s why orchestration boundaries matter. Thin orchestration is easier to review, easier to trust, and usually cheaper because it does less accidental data processing.
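The multiplication is mundane but worth making concrete. Assuming two tasks that each fully scan the same source to produce near-duplicate outputs, versus one shared build both consumers read (scan size, run frequency, and price are illustrative, not a quote of current BigQuery pricing):

```python
# Illustrative arithmetic: duplicated orchestration paths bill the same
# scan twice. All numbers below are assumptions, not real pricing.

SOURCE_SCAN_TB = 1.5   # size of one full read of the shared source
PRICE_PER_TB = 6.25    # assumed on-demand price, USD per TB
RUNS_PER_DAY = 24      # hourly schedule

def daily_cost(tasks_scanning_source):
    return tasks_scanning_source * SOURCE_SCAN_TB * PRICE_PER_TB * RUNS_PER_DAY

duplicated = daily_cost(2)  # two tasks, same data, slightly different outputs
shared = daily_cost(1)      # one build, two consumers

print(f"duplicated: ${duplicated:.2f}/day, shared: ${shared:.2f}/day")
```

Nothing in either path looks like a bad query. The second full scan is pure orchestration overhead, and it compounds with every schedule tick.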

What we check first

When BigQuery spend jumps, we don’t start by policing analysts or hunting for one ugly statement in query history.

We check whether the table is shaped around a real business entity or still carrying upstream mess. We check whether the incremental path can identify change precisely enough to avoid rewriting large parts of the table. We check whether orchestration is creating repeated work because the system doesn’t trust its own models.

Those checks usually explain the bill faster than query heroics do.

The point

Query tuning still matters. It’s just rarely the first move.

In most cases, BigQuery gets expensive because the platform keeps reprocessing, rescanning, or compensating for design decisions that were never made cleanly enough in the first place. The bill isn’t just reflecting usage. It’s reflecting structure.
