duck-spc

How I Learned to Stop Worrying
and Trust Statistics

Statistical process control over your Parquet, powered by DuckDB —
and the case for one little constant: 2.66

The 3am problem

A metric moved. Someone got paged. Was it real?

Mistake	What it costs
Chasing routine noise	wasted investigations — and tampering: reacting to a stable process provably increases its variation
Dismissing a real shift	the regression ships, the pump fails, the fraud continues
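
The tampering claim is easy to check. A minimal sketch, assuming a stable process with independent Gaussian noise and an operator who moves the setpoint after every reading by the full deviation from target; the function name and parameters are illustrative and not part of duck-spc:

```python
import random
import statistics

def simulate(n, tamper, sigma=1.0, seed=0):
    """Return n readings from a stable process whose target is 0.

    If tamper is True, the operator shifts the setpoint after every
    reading by the negative of that reading's deviation from target.
    """
    rng = random.Random(seed)
    setpoint = 0.0
    readings = []
    for _ in range(n):
        x = setpoint + rng.gauss(0.0, sigma)
        readings.append(x)
        if tamper:
            setpoint -= x  # "correct" for noise that needed no correction
    return readings

hands_off = statistics.pvariance(simulate(20_000, tamper=False))
tampered = statistics.pvariance(simulate(20_000, tamper=True))
variance_ratio = tampered / hands_off
print(f"variance ratio (tampered / hands-off): {variance_ratio:.2f}")
```

Under these assumptions each tampered reading equals the current noise minus the previous noise, so its variance is about twice that of the untouched process.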

Both failure modes come from answering the wrong question: "did the number change?" It always changed.

Every metric wiggles

This process is perfectly stable. Nothing happens — all day, every day. Every point is different. No point has an explanation.

The only question worth asking: did the process that generates the number change?

Two kinds of variation

Common cause

The noise inherent to the process. Routine. Unexplainable point-by-point — and predictable in range.

Response: leave it alone (or improve the system).

Special cause

Variation with a findable, assignable cause that is not part of the process.

Response: go find it. This page is worth answering.

A process behaviour chart exists to tell these apart — so people stop doing it by vibes.

The XmR chart: the whole machine

X̄ = mean(baseline)      mR̄ = mean(|xᵢ − xᵢ₋₁|)
UNPL = X̄ + 2.66·mR̄       LNPL = X̄ − 2.66·mR̄   ← limits frozen, then extended forever
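
The two lines above translate almost directly into code. A minimal sketch in plain Python; the function name, the dictionary keys and the sample baseline are made up for illustration:

```python
from statistics import fmean

def xmr_limits(baseline):
    """Compute XmR natural process limits from a baseline series."""
    if len(baseline) < 2:
        raise ValueError("need at least two points to form a moving range")
    center = fmean(baseline)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = fmean(moving_ranges)
    return {
        "center": center,
        "mr_bar": mr_bar,
        "unpl": center + 2.66 * mr_bar,
        "lnpl": center - 2.66 * mr_bar,
    }

baseline = [10, 12, 11, 13, 12, 10, 11, 12]  # illustrative values only
limits = xmr_limits(baseline)
print(limits)
```

For this sample the center is 11.375 and the mean moving range is 10/7, which puts the limits at roughly 7.575 and 15.175.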

Where 2.66 comes from

2.66 ≈ 3 / 1.128

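1.128 converts the mean moving range into a sigma estimate. It is the bias-correction constant d₂ for subgroups of size 2: for independent consecutive points from a normal reference, E|xᵢ − xᵢ₋₁| = (2/√π)·σ ≈ 1.128 σ. So σ̂ = mR̄ / 1.128, built from point-to-point variation only.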
3 is Shewhart's economic choice, backed by a century of practice: it balances false-alarm cost against missed-signal cost.
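
The arithmetic can be checked directly. A small sketch that computes the constant in closed form and cross-checks it by simulation, assuming independent normal points with sigma = 1; the seed and sample size are arbitrary:

```python
import math
import random

# For independent normal points, E|x_i - x_{i-1}| = (2 / sqrt(pi)) * sigma.
d2 = 2.0 / math.sqrt(math.pi)
scale = 3.0 / d2
print(f"d2 = {d2:.4f}, 3 / d2 = {scale:.4f}")

# Monte Carlo cross-check: mean moving range of simulated unit-variance noise.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
mean_mr = sum(abs(b - a) for a, b in zip(xs, xs[1:])) / (len(xs) - 1)
print(f"simulated mean moving range = {mean_mr:.4f}")
```

The closed form gives about 1.1284 and 2.6587, which round to the 1.128 and 2.66 used on the chart; the simulated mean moving range should land close to the same 1.128.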

People will pressure you to use 2 ("more sensitive") or 3.5 ("fewer pages"). Refuse. Tuning the constant is how a chart degenerates back into an arbitrary threshold.

“But my data isn’t normal!”

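The limits never required it. The 1.128 scaling is calibrated on a normal reference, but the coverage of the ±3σ lines holds far more broadly. Every distribution below is standardized to the same mean and variance, so the ±3σ lines never move. Watch the shape go pathological while the red tail past 3σ stays tiny.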

What you're willing to assume	P(stable point beyond 3σ)
Nothing at all (finite variance) — Chebyshev	≤ 1/9 ≈ 11.1%
Unimodal, that's it — Vysochanskij–Petunin	≤ 4/81 ≈ 4.9%
Normal (the familiar case)	0.27%
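
The three rows of the table follow from standard formulas evaluated at k = 3. A short sketch using only the standard library; the variable names are illustrative:

```python
import math

k = 3.0
# Chebyshev: holds for any distribution with finite variance.
chebyshev = 1.0 / k**2
# Vysochanskij-Petunin: holds for any unimodal distribution when k > sqrt(8/3).
vysochanskij_petunin = 4.0 / (9.0 * k**2)
# Normal: exact two-sided tail probability P(|Z| > k).
normal_tail = math.erfc(k / math.sqrt(2.0))

for name, p in [
    ("Chebyshev", chebyshev),
    ("Vysochanskij-Petunin", vysochanskij_petunin),
    ("Normal", normal_tail),
]:
    print(f"{name:<22} {p:.2%}")
```

The first two are upper bounds on the chance that a stable point falls beyond 3σ on either side; only the normal row is an exact probability.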

The gauntlet: the whole procedure, measured

Simulated stable processes (2,000 trials each): 28-point baseline, mR̄-estimated sigma, frozen limits — then count false alarms on 500 in-control points. Estimation error included. Nothing hidden.

Every monster lands under the unimodal bound and at less than half of Chebyshev's ceiling. You don't need to know your distribution.
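
A scaled-down version of the gauntlet fits in a few lines. This is a sketch, not the notebook's code: it assumes three example distributions of my choosing and uses 200 trials per distribution instead of 2,000 so it runs quickly in pure Python. The function name and parameters are hypothetical.

```python
import random

def false_alarm_rate(draw, trials=200, baseline_n=28, check_n=500, seed=0):
    """Fraction of in-control points that fall outside frozen XmR limits."""
    rng = random.Random(seed)
    alarms = 0
    for _ in range(trials):
        # Estimate limits from a short baseline, estimation error included.
        base = [draw(rng) for _ in range(baseline_n)]
        center = sum(base) / baseline_n
        mr_bar = sum(abs(b - a) for a, b in zip(base, base[1:])) / (baseline_n - 1)
        unpl = center + 2.66 * mr_bar
        lnpl = center - 2.66 * mr_bar
        # Freeze the limits and score fresh points from the same stable process.
        for _ in range(check_n):
            x = draw(rng)
            if x > unpl or x < lnpl:
                alarms += 1
    return alarms / (trials * check_n)

# Example distributions only; the deck's own set is not reproduced here.
samplers = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "uniform": lambda rng: rng.random(),
    "exponential": lambda rng: rng.expovariate(1.0),
}
rates = {name: false_alarm_rate(draw) for name, draw in samplers.items()}
for name, rate in rates.items():
    print(f"{name:<12} {rate:.3%}")
```

Swapping in a heavier-tailed or more skewed sampler is a one-line change, which makes it easy to try the procedure on whatever distribution worries you.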

The trick: sigma from the moving range

Same data, same 3-sigma shift. The global SD is inflated by the very signal you're hunting — its limits swell until the chart goes blind. Never compute limits as mean ± 3·std(data).
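
The effect is easy to reproduce. A minimal sketch, assuming unit-variance Gaussian noise with a 3-sigma step halfway through the series; the seed and series lengths are arbitrary:

```python
import random
import statistics

rng = random.Random(42)
sigma = 1.0
stable = [rng.gauss(0.0, sigma) for _ in range(100)]
shifted = [rng.gauss(3.0 * sigma, sigma) for _ in range(100)]  # 3-sigma shift
data = stable + shifted

# Global SD sees the shift as extra spread and swells accordingly.
global_sd = statistics.stdev(data)

# Moving-range sigma only looks at point-to-point differences, so the
# single jump at the shift barely moves it.
moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
mr_sigma = statistics.fmean(moving_ranges) / 1.128

print(f"global SD          = {global_sd:.2f}")
print(f"moving-range sigma = {mr_sigma:.2f}")
```

The moving-range estimate should stay near the true sigma of 1, while the global SD comes out noticeably larger because it counts the shift itself as noise.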

Compute once. Freeze. Extend forever.

Same shift hits both charts. The frozen limits keep firing; the rolling window quietly swallows the shift into its own baseline and goes blind.

Rolling limits absorb every anomaly into the baseline. The chart adapts to the disease and stops reporting it.
And the window length is just one more arbitrary knob to pick and babysit forever — there's no defensible value.
Re-baseline only on a verified process change — a human act, with provenance. duck-spc encodes this: limits are a saved artifact that records its own baseline window, and check never recomputes.
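
The difference between frozen and rolling limits shows up in a small simulation. A sketch under stated assumptions: Gaussian noise, a sustained 3-sigma shift at index 100, and a 28-point window for both the frozen baseline and the rolling one. The helper name and all parameters are illustrative.

```python
import random

def limits(points):
    """Return (lnpl, unpl) for a list of points using the XmR formulas."""
    center = sum(points) / len(points)
    mr_bar = sum(abs(b - a) for a, b in zip(points, points[1:])) / (len(points) - 1)
    return center - 2.66 * mr_bar, center + 2.66 * mr_bar

rng = random.Random(7)
window = 28
series = [rng.gauss(0.0, 1.0) for _ in range(100)]
series += [rng.gauss(3.0, 1.0) for _ in range(400)]  # sustained shift from index 100

frozen_lo, frozen_hi = limits(series[:window])  # computed once, never updated

frozen_hits = 0
rolling_hits = 0
for i in range(150, len(series)):  # score points well after the shift
    x = series[i]
    if x < frozen_lo or x > frozen_hi:
        frozen_hits += 1
    # Rolling limits are recomputed from the trailing window, which by now
    # contains only post-shift data.
    roll_lo, roll_hi = limits(series[i - window:i])
    if x < roll_lo or x > roll_hi:
        rolling_hits += 1

print(f"frozen limits:  {frozen_hits} signals")
print(f"rolling limits: {rolling_hits} signals")
```

Once the trailing window sits entirely inside the shifted regime, the rolling chart treats the new level as normal, while the frozen limits keep flagging it.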

Only two rules

Rule 1	a point outside the natural process limits
Rule 2	nine consecutive points on one side of the center line (catches sustained smaller shifts)
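
Both rules fit in one pass over the data. A minimal sketch; the function name and the sample values are hypothetical, and treating a point exactly on the center line as breaking a run is an assumption the table does not settle:

```python
def detect(points, center, lnpl, unpl, run_length=9):
    """Return (index, rule) pairs for every point that triggers Rule 1 or Rule 2."""
    signals = []
    run_side = 0   # +1 above the center line, -1 below, 0 on the line
    run_count = 0
    for i, x in enumerate(points):
        # Rule 1: a point outside the natural process limits.
        if x > unpl or x < lnpl:
            signals.append((i, "rule1"))
        # Rule 2: run_length consecutive points on one side of the center line.
        side = (x > center) - (x < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        if run_count >= run_length:
            signals.append((i, "rule2"))
    return signals

points = [10.5, 9, 17, 9.5] + [11] * 9  # illustrative values only
signals = detect(points, center=10, lnpl=4, unpl=16)
print(signals)  # → [(2, 'rule1'), (12, 'rule2')]
```

The point at index 2 sits above the upper limit, and the ninth consecutive point above the center line arrives at index 12. In this sketch Rule 2 keeps firing on every further point while the run continues.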

The Western Electric handbook lists more. Every rule you add buys sensitivity with false alarms — and each false alarm consumes an investigation and erodes trust in the chart.

A minimal rule set follows the same philosophy as the constant: resist the urge to tune.

duck-spc: this math, at bucket scale

$ duck-spc baseline \
    --source 's3://bucket/events/' \
    --value latency_ms \
    --group-by region,service \
    --derive day:p95 \
    --window 2026-01-01:2026-01-29 \
    > limits.json

$ duck-spc check --limits limits.json
# exit 0 → stable. go back to sleep.
# exit 1 → the process changed.
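
One way to wire the exit code into a pager from Python. This is a sketch under assumptions: duck-spc is on PATH, only the `check --limits` invocation shown above is used, and any exit code other than 0 or 1 is treated as a tool failure rather than a verdict (the deck does not say what other codes mean). The function names are hypothetical.

```python
import subprocess

def verdict(exit_code):
    """Map the two documented exit codes to a decision.

    Any other code is treated as a tool error, which is an assumption.
    """
    if exit_code == 0:
        return "stable"
    if exit_code == 1:
        return "changed"
    return "error"

def run_check(limits_path="limits.json"):
    """Run `duck-spc check` and return (verdict, stdout)."""
    result = subprocess.run(
        ["duck-spc", "check", "--limits", limits_path],
        capture_output=True,
        text=True,
    )
    return verdict(result.returncode), result.stdout

# Usage (requires duck-spc and a limits file):
#   status, report = run_check("limits.json")
#   if status == "changed":
#       ...hand `report` to the pager...
```

Keeping "error" separate from "changed" means a crashed check does not page anyone as if the process had shifted.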

All XmR math is SQL pushed into DuckDB over read_parquet() — thousands of streams, one scan, nothing materialized but answers
Frozen limits as artifacts with provenance — every chart's limits are traceable to a baseline
Derived streams (day:p95, day:rate, diff) for seasonal/trending/noisy raw data
JSON on stdout, verdict in the exit code — pipe it at your pager
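
To make "XmR math is SQL" concrete, here is a sketch of a per-group limits query. It is not duck-spc's actual SQL, which this deck does not show. It runs on Python's built-in sqlite3 as a stand-in so the example is self-contained; with DuckDB the FROM clause would presumably read from read_parquet() instead of a table. The table, column names and sample rows are made up.

```python
import sqlite3

XMR_SQL = """
WITH ranged AS (
    SELECT region, service, day, value,
           ABS(value - LAG(value) OVER (
               PARTITION BY region, service ORDER BY day
           )) AS moving_range
    FROM events
)
SELECT region, service,
       AVG(value)                            AS center,
       AVG(moving_range)                     AS mr_bar,
       AVG(value) + 2.66 * AVG(moving_range) AS unpl,
       AVG(value) - 2.66 * AVG(moving_range) AS lnpl
FROM ranged
GROUP BY region, service
ORDER BY region, service
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (region TEXT, service TEXT, day INTEGER, value REAL)")
rows = [("eu", "api", d, v) for d, v in enumerate([10, 12, 11, 13])]
rows += [("us", "api", d, v) for d, v in enumerate([20, 20, 24, 22])]
con.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)

# One scan yields limits for every (region, service) stream.
results = con.execute(XMR_SQL).fetchall()
for row in results:
    print(row)
```

The first point of each stream has no predecessor, so its moving range is NULL and AVG skips it; that matches the definition of mR̄ as the mean of the n − 1 consecutive differences.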

Within natural limits,
nothing happened.

Go back to sleep.

duck-spc · Postgres-and-a-bucket lineage · roadmap: DuckLake sources, nonparametric limits, live ingestion
notebook: notebooks/trust_the_limits.py — every number in this deck is computed there