On-demand vs slots: the SME decision boundary

For SMEs, the question is not which BigQuery pricing model is more sophisticated. The question is when workload classes have become distinct enough to deserve different compute lanes.

Decision memo Data

By Ivan Richter

Last updated: Mar 29, 2026


bigquery cost-control workload-management


Start on on-demand until the warehouse proves it needs lanes

For most SME warehouses, on-demand is the right default while the warehouse is still figuring out what kind of work it actually does. Team size and sophistication don’t decide it. Some queries are real production dependencies. Some are half-formed analysis. Some come from recurring reporting that nobody has admitted is now part of the product. Buying capacity before those boundaries are clear only changes the shape of the invoice.

That’s why we’d rather start with simple cost guardrails and enough visibility to see where pressure is really coming from. On-demand is useful at that stage because mistakes stay relatively cheap and the warehouse is still free to show you what kind of system it’s trying to become.

The real trigger is collision, not sophistication

Workload collisions usually force the change before any grand financial threshold does.

You see it when scheduled transforms get slower because interactive work lands at the wrong time. You see it when dashboards feel fine one hour and sticky the next, and nobody can explain the shift without resorting to vibes. You see it when analysts start talking about the warehouse as if it has moods. That’s the point where one shared compute pool has stopped behaving like a general workspace and started behaving like several incompatible workload classes pretending to coexist peacefully.

Repeated reporting traffic is a common trigger. From the BI side it can look harmless enough, just some dashboards, some recurring refreshes, nothing dramatic. From the warehouse side it can look very different, especially once dashboard churn turns into steady pressure instead of a few expensive one-offs. At that point, the same lane is being asked to serve curiosity, production transforms, and reader-facing workloads with latency expectations. The lane now serves several incompatible jobs.
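One way to replace vibes with a measurement is to look at hourly runtimes split by who submitted the job. The query below is a minimal sketch, not a definitive diagnostic. It assumes the warehouse lives in the `region-eu` location, and it assumes that jobs submitted by service accounts stand in for scheduled or service-facing work while everything else is interactive. The `workload_class` label and the `gserviceaccount.com` suffix test are illustrative choices, so adjust them to however your automation actually authenticates.

```sql
-- Sketch only: hourly p95 runtime and slot usage for the last 7 days.
-- Assumption: jobs submitted by service accounts are scheduled or
-- service-facing work, and everything else is interactive.
select
  timestamp_trunc(creation_time, hour) as hour_start,
  if(ends_with(user_email, 'gserviceaccount.com'), 'automated', 'interactive') as workload_class,
  count(*) as jobs,
  -- 95th percentile of wall-clock runtime, in milliseconds
  approx_quantiles(timestamp_diff(end_time, start_time, millisecond), 100)[offset(95)] as p95_runtime_ms,
  round(sum(total_slot_ms) / 1000 / 60, 1) as slot_minutes
from
  `region-eu`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where
  creation_time >= timestamp_sub(current_timestamp(), interval 7 day)
  and job_type = 'QUERY'
  and state = 'DONE'
group by
  hour_start,
  workload_class
order by
  hour_start,
  workload_class;
```

If the automated p95 climbs in the same hours that interactive slot minutes peak, that is a collision you can name instead of a mood.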

This is mostly a routing decision

Choose between on-demand and slots by asking which workloads have earned separate treatment.

BigQuery gives you the flexibility to mix these models within a region. Some projects can stay on-demand. Others can sit behind reservations. A project that inherits a reservation assignment from its folder or organization can be explicitly assigned to none, which keeps it on on-demand if that’s where it belongs. BigQuery lets the decision stay narrow: certain classes can get their own compute lane without switching the whole warehouse to slots.

That’s a much better frame, because it pushes the conversation toward routing instead of prestige. If recurring reporting, scheduled transforms, or service-facing reads need predictability, use a reservation layout to separate them cleanly.
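As a rough illustration of how narrow that routing can be, the sketch below creates one reservation and assigns a single project to it using BigQuery’s reservation DDL. Every name here (`admin-project`, `reporting`, `reporting-assignment`, `reporting-prod`) is hypothetical, and the edition and slot count are placeholders rather than recommendations. Treat it as a shape to adapt, not as the setup to copy.

```sql
-- Sketch only: every project, reservation, and assignment name is hypothetical,
-- and the edition and slot count are placeholders, not recommendations.
create reservation `admin-project.region-eu.reporting`
options (
  edition = 'ENTERPRISE',
  slot_capacity = 100
);

-- Route query jobs from one project to that reservation.
-- Projects with no direct or inherited assignment keep running on-demand.
create assignment `admin-project.region-eu.reporting.reporting-assignment`
options (
  assignee = 'projects/reporting-prod',
  job_type = 'QUERY'
);
```

Because assignments attach to projects, folders, or organizations, the routing decision usually becomes a question of which project a workload runs in.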

Most teams only need two lanes

For many SMEs, two lanes are enough: one for exploratory work and one for predictable workloads.

Lane 1: on-demand
- analyst exploration
- one-off backfills
- ad hoc debugging

Lane 2: reservations
- dashboards with latency expectations
- scheduled transformations
- recurring executive/reporting workloads

That split is boring, which is exactly why it works. Exploration stays cheap and flexible. Production readers and recurring jobs stop fighting with ad hoc curiosity. And when somebody asks why one query is allowed to fail fast while another is expected to complete predictably, the answer is no longer political. The work is different, so the lane is different.

It also forces a useful question that teams try hard to avoid: should this workload even stay live? Slots can calm a warehouse down, but they aren’t the only way. Sometimes the right answer is a better serving model. Sometimes it’s less dashboard churn. Sometimes the workload belongs higher up the precompute ladder and the live path should have been shortened weeks ago. Buying capacity can hide that smell for a while. It doesn’t make it go away.

Buy slots when the pressure is real, repeated, and named

We want a few things to be true before reservations enter the picture.

First, there has to be a real production workload that has been identified and measured. A vague sense that the warehouse is getting more important doesn’t qualify. Second, that workload has to be recurring enough that dedicated capacity actually solves a steady problem instead of flattering a temporary spike. Third, there needs to be a routing story. If the team still can’t say which jobs belong on on-demand and which belong behind reservations, buying slots is just a more expensive form of indecision.

Teams often end up buying slots because it feels like the grown-up move, then keep the same blurry workload boundaries they had before. Now they’ve got a more complicated warehouse and the same confusion. If the pressure is real and recurring, reservations start earning their keep. If the pressure is still vague, the warehouse probably isn’t ready for them yet.
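To check whether pressure is recurring rather than merely loud, one option is to count how many days each principal was active over the last month. This is a sketch under assumptions: the `region-eu` location, a 30-day window, and a threshold of 20 active days are all illustrative, and grouping by `user_email` is only a stand-in for a proper workload label.

```sql
-- Sketch only: principals that ran query jobs on most days of the last 30.
-- Assumption: 20 active days out of 30 counts as "recurring"; tune to taste.
select
  user_email,
  count(distinct date(creation_time)) as active_days,
  count(*) as jobs,
  round(sum(total_slot_ms) / 1000 / 60, 1) as slot_minutes,
  round(sum(total_bytes_billed) / pow(1024, 4), 3) as tib_billed
from
  `region-eu`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where
  creation_time >= timestamp_sub(current_timestamp(), interval 30 day)
  and job_type = 'QUERY'
  and state = 'DONE'
group by
  user_email
having
  active_days >= 20
order by
  slot_minutes desc;
```

A short list of steady principals with meaningful slot minutes is a named workload. An empty or constantly changing list suggests the pressure is still sporadic.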

Staying on on-demand can be the disciplined move

A lot of teams should stay on on-demand longer than they do. If the warehouse is still mostly exploratory, if the production pressure is still light, and if the expensive moments are sporadic rather than structural, reservations are usually premature. In that stage, the better work is often lower in the stack: tighten table shape, calm down reporting behavior, fix obvious serving problems, and make sure the cost controls are aimed at the right kinds of mistakes.

Those changes age well even if reservations come later. Premature slot layouts often don’t. They get built around fuzzy workload definitions, temporary pains, and whatever happened to be loudest that month. Then the team inherits a routing model it doesn’t really believe in.

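A quick way to check where the warehouse stands is to look at the last day of jobs by project and priority:

-- Jobs and slot usage per project and priority over the last 24 hours.
select
  project_id,
  priority,
  count(*) as jobs,
  -- total_slot_ms is in milliseconds; convert to slot-minutes
  round(sum(total_slot_ms) / 1000 / 60, 1) as slot_minutes
from
  `region-eu`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
where
  creation_time >= timestamp_sub(current_timestamp(), interval 1 day)
group by
  project_id,
  priority
order by
  slot_minutes desc;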

If the data still shows mostly exploratory churn with only light recurring pressure, on-demand remains the right boring answer.

The rule

On-demand is the default. Slots start making sense when workload collisions stop being occasional and start becoming part of normal operations.

At that point you aren’t really choosing a more sophisticated pricing model. You’re admitting the warehouse now does distinct kinds of work, and some of that work deserves its own lane.
