The streaming buffer is your hidden constraint
When BigQuery streaming pain shows up as a DML error, the real problem is usually workload shape. Streaming wants append-and-reconcile thinking, not row-by-row sync fantasies.
The boundary usually shows up through a failed MERGE
The streaming buffer rarely gets treated as a design concern up front. It usually arrives later, through a failure. A MERGE starts rejecting the newest rows. An UPDATE behaves as if recent data is briefly out of reach. A cleanup step lands in an awkward half-working state where the warehouse seems close to supporting the pattern but not quite. That looks like a product quirk if the pipeline was designed around row-level mutation. It isn’t. It’s the platform telling you the ingestion shape and the mutation shape are no longer aligned.
Once fresh rows are arriving continuously, the clean assumption is append first, reconcile later. BigQuery is perfectly capable of sitting inside near-real-time systems, but it does not want to behave like a transaction engine with a nicer SQL dialect. If the pipeline keeps asking for row-by-row certainty against the newest streamed data, the problem is not an odd limitation. The problem is that the workload shape never adapted to the boundary the warehouse already made explicit.
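The failure usually needs nothing more exotic than a streamed row and an immediate mutation. As a sketch, assuming a table fed through the legacy insertAll streaming API (table and column names here are illustrative):

```sql
-- Rows streamed via insertAll sit in the streaming buffer for a while
-- (often up to roughly 90 minutes) before landing in managed storage.
-- A DML statement that would touch those rows is rejected with an error
-- along the lines of "would affect rows in the streaming buffer, which
-- is not supported". The exact window and wording vary; the boundary
-- does not.
update raw.orders_events
set status = 'refunded'
where order_id = 'O123'
  and event_at >= timestamp_sub(current_timestamp(), interval 30 minute);
```

Newer ingestion paths such as the Storage Write API shrink this window, but recently written rows are still off-limits to DML for a period, so the design argument stands either way.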
Streaming changes what “latest state” can honestly mean
Once a table is fed through streaming, the safest starting point is an append-only landing zone that tells the truth about arrival. That is why this topic sits naturally next to event-table partitioning. Raw data should land cheaply, predictably, and in a shape that is easy to revisit. Stronger claims about current status, latest state, or clean upserts belong later, after reconciliation has had a chance to do real work.
That distinction matters because a lot of streaming designs quietly import OLTP expectations into the warehouse. Stream the row. Mutate it immediately. Assume the freshest state is the settled state. That is the part that breaks. BigQuery can support fresh analytical pipelines. It just expects those pipelines to admit they are analytical. Arrival is one thing. Curated truth is another.
Append first, reconcile on purpose, publish state later
The durable pattern here is not complicated. Land raw events. Revisit a window. Collapse duplicates and late arrivals against business identity. Then publish the curated surface. It is less satisfying than the fantasy of immediate perfect state, but it survives contact with real ingestion behavior.
Once the model is shaped that way, a lot of related choices get clearer. Now the pipeline has to care about stale rows, about explicit change detection, and about whether the curated table is revisiting enough history to catch late changes instead of only reprocessing whatever happened to land in the last few minutes.
-- Raw append-only landing
insert into raw.orders_events
  (event_at, order_id, status, payload)
values
  (current_timestamp(), 'O123', 'paid', '{"source":"app"}');
-- Periodic reconciliation into curated latest-state table
merge mart.orders_latest t
using (
  select
    order_id,
    -- pick the most recent status per order, not an arbitrary one
    array_agg(status order by event_at desc limit 1)[offset(0)] as status,
    max(event_at) as updated_at
  from raw.orders_events
  -- revisit a real window so late arrivals get reconciled too
  where date(event_at) >= date_sub(current_date(), interval 3 day)
  group by order_id
) s
on t.order_id = s.order_id
when matched then update set
  status = s.status,
  updated_at = s.updated_at
when not matched then insert
  (order_id, status, updated_at)
values (s.order_id, s.status, s.updated_at);

That shape is less about elegance than about admitting what the system can actually guarantee. Raw arrival stays cheap. Curated state gets published by a process that can revisit enough history to correct itself. The warehouse stops being asked to pretend that streaming and instant mutation are the same thing.
The reconciliation window is part of the model
A revisit window is not just an operational knob. It is part of what the curated table means.
If the reconciliation step only looks back a few minutes when late data, changed children, or retried events can arrive much later, the table is quietly overstating its own certainty. The size of that window should be driven by actual arrival patterns and actual business tolerance for wrongness, not by whatever feels cheapest in the moment. A curated model that does not revisit enough history is not merely under-optimized. It is semantically weak.
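Actual arrival patterns are measurable, so the window does not have to be a guess. A minimal sketch, assuming the raw table carries an ingestion timestamp alongside the business timestamp (inserted_at here is an illustrative column name):

```sql
-- How late do events actually arrive relative to their business time?
-- The high percentiles, not the median, should drive the revisit window.
select
  approx_quantiles(
    timestamp_diff(inserted_at, event_at, minute), 100)[offset(50)]
    as p50_lag_minutes,
  approx_quantiles(
    timestamp_diff(inserted_at, event_at, minute), 100)[offset(99)]
    as p99_lag_minutes
from raw.orders_events
where date(event_at) >= date_sub(current_date(), interval 30 day);
```

If the p99 lag is two days, a three-day window is defensible; a fifteen-minute window is a quiet lie.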
That also affects the serving layer. Fresh raw ingestion does not automatically mean every dashboard now deserves live reads against half-settled state. In many cases the more honest move is to publish a calmer reporting surface higher up the precompute ladder and let the reconciliation process settle before the read path starts making strong promises.
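One way to publish that calmer surface is a scheduled rebuild that deliberately holds back the still-settling edge. A sketch, with illustrative names and an assumed six-hour settling margin:

```sql
-- Publish a settled reporting surface on a schedule, instead of letting
-- dashboards read half-reconciled state directly.
create or replace table reporting.orders_daily as
select
  date(updated_at) as report_date,
  status,
  count(*) as orders
from mart.orders_latest
-- hold back the most recent window that reconciliation may still revise
where updated_at < timestamp_sub(current_timestamp(), interval 6 hour)
group by report_date, status;
```

The margin is a business choice, not a technical constant: it should match how far back the reconciliation step can still change answers.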
Sometimes the right answer is to stop streaming
There is a point where the design pressure becomes clear. If the dominant requirement is immediate row mutation, transactional certainty, or application-grade upsert behavior against the freshest records, then the pipeline has crossed a boundary. At that point the question is not how to work around the streaming buffer more cleverly. The question is whether BigQuery is still the right place for that interaction pattern.
That is also where ingestion shape and storage shape start to connect. Append-heavy raw zones behave very differently from mutation-heavy datasets, which is part of the dataset classification logic in physical versus logical storage. There does not need to be one universal answer across the warehouse. But the answer for each layer does need to match what that layer is actually doing.
The rule
Treat the streaming buffer as a boundary, not as an irritating exception. If data streams in, land it append-only, reconcile it through a real revisit window, and publish curated state afterward. Let raw tables describe arrival. Let curated tables describe settled meaning.
And if the system keeps demanding transactional mutation semantics against the newest rows, stop trying to bend the warehouse into that shape. The ingestion pattern has probably crossed into a different kind of system.