<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Parquet on Jon Brown&#39;s Webpage</title>
    <link>/tags/parquet/</link>
    <description>Recent content in Parquet on Jon Brown&#39;s Webpage</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Fri, 05 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="/tags/parquet/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>How I Learned to Stop Worrying and Trust Statistics</title>
      <link>/posts/duckspc/</link>
      <pubDate>Fri, 05 Jun 2026 00:00:00 +0000</pubDate>
      
      <guid>/posts/duckspc/</guid>
      <description>&lt;p&gt;It&amp;rsquo;s 3am and a dashboard line moved. Someone got paged. They&amp;rsquo;re awake now,
squinting at a wiggle, trying to decide if it&amp;rsquo;s &lt;em&gt;real&lt;/em&gt;. Here&amp;rsquo;s the uncomfortable
truth about that moment: &lt;strong&gt;the number always changed.&lt;/strong&gt; Every point on every
chart is different from the last one. The question that actually matters — the
only question — is whether the &lt;em&gt;process that generates the number&lt;/em&gt; changed. And
almost nobody answers that question with anything more principled than vibes.&lt;/p&gt;</description>
      <content>&lt;p&gt;It&amp;rsquo;s 3am and a dashboard line moved. Someone got paged. They&amp;rsquo;re awake now,
squinting at a wiggle, trying to decide if it&amp;rsquo;s &lt;em&gt;real&lt;/em&gt;. Here&amp;rsquo;s the uncomfortable
truth about that moment: &lt;strong&gt;the number always changed.&lt;/strong&gt; Every point on every
chart is different from the last one. The question that actually matters — the
only question — is whether the &lt;em&gt;process that generates the number&lt;/em&gt; changed. And
almost nobody answers that question with anything more principled than vibes.&lt;/p&gt;
&lt;p&gt;This post is about answering it with one constant from 1924, a bucket of Parquet
files, and DuckDB.&lt;/p&gt;
&lt;p&gt;Long-time readers know I have a soft spot for venerable, empirical statistics
that punch way above their weight. My
&lt;a href=&#34;https://poisson-confidence-intervals.brojonat.com&#34;&gt;Poisson confidence intervals calculator&lt;/a&gt;
exists because of Gehrels
(&lt;a href=&#34;https://ui.adsabs.harvard.edu/abs/1986ApJ...303..336G/abstract&#34;&gt;1986&lt;/a&gt;) — a
few decades old, basically a lookup table, and still the thing I reach for
whenever &amp;ldquo;we observed zero&amp;rdquo; needs to become an actual limit on a rate (zero
never fluctuates to one, but one sometimes fluctuates to zero). There&amp;rsquo;s a
pattern here: the sturdiest tools in statistics tend to be old, simple, and
allergic to assumptions. Today&amp;rsquo;s entry is older still.&lt;/p&gt;
&lt;p&gt;In &lt;a href=&#34;/posts/you-dont-need-kafka/&#34;&gt;the last post&lt;/a&gt; I replaced Kafka with Postgres
and a bucket, and promised a growing gallery of expensive infrastructure
short-circuited by boring alternatives. One of the items on that ballot was the
observability stack — and before I can short-circuit the &lt;em&gt;storage&lt;/em&gt; side of that
(coming, I promise), I needed the &lt;em&gt;brain&lt;/em&gt;: the thing that looks at a metric and
tells you, with a straight face, whether anything actually happened. That brain
is called &lt;strong&gt;statistical process control&lt;/strong&gt;, the tool is called an &lt;strong&gt;XmR chart&lt;/strong&gt;,
and the whole algorithm fits on a napkin.&lt;/p&gt;
&lt;h2 id=&#34;the-two-expensive-mistakes&#34;&gt;The two expensive mistakes&lt;/h2&gt;
&lt;p&gt;When a metric moves and you have to decide whether to care, there are exactly
two ways to be wrong:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Chase routine noise.&lt;/strong&gt; You burn an investigation on nothing. Worse, if you
&amp;ldquo;fix&amp;rdquo; a stable process in response to individual points, you&amp;rsquo;re &lt;em&gt;tampering&lt;/em&gt;,
and Deming&amp;rsquo;s funnel experiment showed that tampering increases variation. You made it worse by
reacting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dismiss a real shift.&lt;/strong&gt; The regression ships. The pump fails. The fraud
continues.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Most alerting setups handle this tradeoff by&amp;hellip; someone typing a threshold into
a YAML file. Page when latency &amp;gt; 500ms. Says who? Based on what? That number is
a vibe with a uniform. SPC replaces it with limits the &lt;em&gt;process itself&lt;/em&gt; tells
you.&lt;/p&gt;
&lt;h2 id=&#34;the-whole-algorithm&#34;&gt;The whole algorithm&lt;/h2&gt;
&lt;p&gt;Take your metric&amp;rsquo;s values from a baseline period — a few weeks where you believe
nothing weird happened. Compute:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;X̄    = mean(x)
mR̄   = mean(|xᵢ − xᵢ₋₁|)        # average gap between consecutive points
UNPL = X̄ + 2.66 · mR̄            # upper natural process limit
LNPL = X̄ − 2.66 · mR̄            # lower natural process limit
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Freeze those limits. Forever (or until you deliberately change the process).
Then two rules, and only two:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Rule 1&lt;/strong&gt;: a point lands outside the limits → something happened, go find it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Rule 2&lt;/strong&gt;: nine consecutive points on one side of the center line → the level
shifted, go find out why.&lt;/li&gt;
&lt;/ul&gt;
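&lt;p&gt;For the skeptical, here is roughly what the napkin looks like in code. This is a minimal Python sketch, not the implementation from any particular tool: the function names (&lt;code&gt;xmr_limits&lt;/code&gt;, &lt;code&gt;rule1&lt;/code&gt;, &lt;code&gt;rule2&lt;/code&gt;) and the sample numbers are made up for illustration.&lt;/p&gt;

```python
from statistics import mean

def xmr_limits(baseline):
    """Return (center, lower, upper) natural process limits from baseline values."""
    center = mean(baseline)
    # Moving ranges: absolute gaps between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = mean(moving_ranges)
    return center, center - 2.66 * mr_bar, center + 2.66 * mr_bar

def rule1(points, lower, upper):
    """Indices of points that fall outside the frozen limits."""
    return [i for i, x in enumerate(points) if x > upper or lower > x]

def rule2(points, center, run_length=9):
    """Indices where a run of run_length points on one side of center is in progress."""
    hits, run, side = [], 0, 0
    for i, x in enumerate(points):
        current = (x > center) - (center > x)   # +1 above, -1 below, 0 on the line
        if current != 0 and current == side:
            run += 1
        else:
            run = 1 if current != 0 else 0
            side = current
        if run >= run_length:
            hits.append(i)
    return hits

# Hypothetical baseline and new data, chosen only to exercise both rules.
baseline = [10, 12, 11, 13, 12, 10, 11, 12]
center, lower, upper = xmr_limits(baseline)
new_points = [11, 12, 18, 12, 12, 12, 12, 12, 12, 12, 12, 12]

print("center, lower, upper:", round(center, 3), round(lower, 3), round(upper, 3))
print("rule 1 signals at:", rule1(new_points, lower, upper))
print("rule 2 signals at:", rule2(new_points, center))
```

&lt;p&gt;With these numbers Rule 1 flags the spike at index 2, and Rule 2 starts firing at index 9, once nine points in a row sit above the center line. The limits come from the baseline only; new points are scored against them and never fed back in.&lt;/p&gt;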
&lt;p&gt;Inside the limits, with no run of nine? &lt;em&gt;Nothing happened.&lt;/em&gt; Not &amp;ldquo;probably nothing&amp;rdquo;: the honest,
statistically defensible answer is that points that trip neither rule
carry no explanation. There is no root cause to find. Go back to sleep. That&amp;rsquo;s
the entire pitch: this is anomaly detection you can run in your head, and people
have replaced fleets of deep-learning anomaly detectors with exactly this, at
three or four orders of magnitude fewer parameters, because a small team can
actually understand and trust it.&lt;/p&gt;
&lt;h2 id=&#34;where-266-comes-from-and-why-your-weird-data-doesnt-break-it&#34;&gt;Where 2.66 comes from (and why your weird data doesn&amp;rsquo;t break it)&lt;/h2&gt;
&lt;p&gt;Two ingredients. For consecutive points drawn from a stable process, the
following is exact for a normal distribution and roughly true for most others:
&lt;code&gt;E|xᵢ − xᵢ₋₁| = 1.128σ&lt;/code&gt;. So the mean moving range gives you a sigma estimate:
&lt;code&gt;σ̂ = mR̄ / 1.128&lt;/code&gt;. And limits go at ±3 sigma — Shewhart&amp;rsquo;s &lt;em&gt;economic&lt;/em&gt; choice, a
century of practice balancing false-alarm cost against missed-signal cost.
Divide: &lt;code&gt;3 / 1.128 ≈ 2.66&lt;/code&gt;. Not arbitrary, and (this is the part everyone
gets wrong) &lt;strong&gt;not a normality assumption.&lt;/strong&gt;&lt;/p&gt;
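&lt;p&gt;To see how much weight &amp;ldquo;roughly&amp;rdquo; is carrying, the sketch below evaluates the expected absolute difference of two independent draws, in units of σ, for three distributions where it has a closed form. The normal value is the 1.128 from the text; the uniform and exponential rows are added here for comparison only, and the variable names are illustrative.&lt;/p&gt;

```python
import math

# Expected |difference| of two independent draws, divided by sigma.
# Closed-form values for three textbook distributions.
d2_normal = 2 / math.sqrt(math.pi)   # about 1.128, the constant used in the text
d2_uniform = 2 / math.sqrt(3)        # about 1.155
d2_exponential = 1.0                 # exactly 1

cases = [("normal", d2_normal), ("uniform", d2_uniform), ("exponential", d2_exponential)]
for name, d2 in cases:
    # 3 / d2 is the multiplier that would put limits at exactly 3 sigma.
    print(f"{name:12s} d2 = {d2:.3f}   3 / d2 = {3 / d2:.3f}")
```

&lt;p&gt;The implied multiplier moves between about 2.6 and 3.0 across these three shapes, which is why 2.66 is a convention that works well enough everywhere rather than a constant tuned per distribution.&lt;/p&gt;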
&lt;p&gt;&amp;ldquo;But my data isn&amp;rsquo;t normal!&amp;rdquo; Good news: nothing above assumed it was. Suppose the
absolute worst. Suppose an adversary designs your distribution:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;What you&amp;rsquo;re willing to assume&lt;/th&gt;
          &lt;th&gt;P(stable point beyond 3σ)&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Nothing at all (finite variance) — Chebyshev&lt;/td&gt;
          &lt;td&gt;≤ 1/9 ≈ &lt;strong&gt;11.1%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Unimodal, that&amp;rsquo;s it — Vysochanskij–Petunin&lt;/td&gt;
          &lt;td&gt;≤ 4/81 ≈ &lt;strong&gt;4.9%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Normal&lt;/td&gt;
          &lt;td&gt;&lt;strong&gt;0.27%&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Even against the pathological worst case, 3-sigma limits false-alarm on at most
one stable point in nine. Add the single weakest assumption you can check by
squinting at a histogram — one hump — and the ceiling drops under 5%.&lt;/p&gt;
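&lt;p&gt;The three rows of the table above are one-liners, so here they are as a quick check. A small sketch using only the Python standard library; the variable names are illustrative.&lt;/p&gt;

```python
import math

k = 3  # limits placed at k sigma

chebyshev = 1 / k**2                        # any distribution with finite variance
vysochanskij_petunin = 4 / (9 * k**2)       # any unimodal distribution (holds for k above about 1.63)
normal_tail = math.erfc(k / math.sqrt(2))   # two-sided normal tail beyond k sigma

print(f"Chebyshev             {chebyshev:.2%}")
print(f"Vysochanskij-Petunin  {vysochanskij_petunin:.2%}")
print(f"Normal                {normal_tail:.2%}")
```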
&lt;p&gt;But bounds are bounds and vibes are not a benchmark, so I measured the &lt;em&gt;whole
procedure&lt;/em&gt;: 28-point baseline, mR̄-estimated sigma, frozen limits, then count
false alarms on 500 in-control points, 2,000 trials per distribution. Estimation
error included, nothing hidden:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;distribution&lt;/th&gt;
          &lt;th&gt;false alarms per stable point&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;uniform&lt;/td&gt;
          &lt;td&gt;0.03%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;bimodal mixture&lt;/td&gt;
          &lt;td&gt;0.04%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;normal&lt;/td&gt;
          &lt;td&gt;0.81%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;exponential (skewed)&lt;/td&gt;
          &lt;td&gt;3.43%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;lognormal (heavy tail)&lt;/td&gt;
          &lt;td&gt;4.49%&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;pareto α=2.5 (very heavy tail)&lt;/td&gt;
          &lt;td&gt;4.80%&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Every monster lands under the unimodal bound and at less than half of
Chebyshev&amp;rsquo;s ceiling. &lt;strong&gt;You don&amp;rsquo;t need to know your distribution.&lt;/strong&gt; That&amp;rsquo;s not a
slogan, it&amp;rsquo;s a table.&lt;/p&gt;
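&lt;p&gt;A stripped-down version of that experiment fits in a few lines. This is a sketch of the procedure as described, not the code that produced the table: the distribution parameters, the trial count (200 rather than 2,000), and the seed are assumptions chosen for speed, so expect the same shape rather than the same digits.&lt;/p&gt;

```python
import random
from statistics import mean

def false_alarm_rate(draw, trials=200, n_baseline=28, n_check=500, seed=0):
    """Average fraction of in-control points flagged by Rule 1 under frozen XmR limits."""
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        # Estimate limits from a short baseline, then freeze them.
        baseline = [draw(rng) for _ in range(n_baseline)]
        center = mean(baseline)
        mr_bar = mean(abs(b - a) for a, b in zip(baseline, baseline[1:]))
        half_width = 2.66 * mr_bar
        # Score fresh points from the same (unchanged) process.
        alarms = sum(abs(draw(rng) - center) > half_width for _ in range(n_check))
        rates.append(alarms / n_check)
    return mean(rates)

# Parameters below are assumptions, not the ones behind the table.
distributions = {
    "uniform": lambda rng: rng.random(),
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "exponential": lambda rng: rng.expovariate(1.0),
    "lognormal": lambda rng: rng.lognormvariate(0.0, 1.0),
}

results = {name: false_alarm_rate(draw) for name, draw in distributions.items()}
for name, rate in results.items():
    print(f"{name:12s} {rate:.2%}")
```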
&lt;p&gt;One trap worth calling out: you might think the limits are just
&lt;code&gt;mean ± 3·std(data)&lt;/code&gt;. &lt;strong&gt;They are not, and the difference is the whole trick.&lt;/strong&gt;
The global standard deviation is inflated by the very signals you&amp;rsquo;re hunting —
inject a shift and the SD swells, the limits swell with it, and the chart goes
blind to its own signal. The moving range only sees point-to-point variation, so
a shift contaminates exactly one of its terms. Never compute limits with the
global SD. (Also resist everyone who wants to &amp;ldquo;tune&amp;rdquo; 2.66 to 2 or 3.5. That&amp;rsquo;s
how a chart degenerates back into a YAML vibe.)&lt;/p&gt;
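&lt;p&gt;The trap is easy to reproduce. The sketch below builds a hypothetical stream with a sustained shift halfway through and compares the two sigma estimates; the data, seed, and shift size are invented for illustration.&lt;/p&gt;

```python
import random
from statistics import mean, pstdev

rng = random.Random(1)
# Hypothetical stream: 30 stable points, then a sustained upward shift of 5 units.
data = [rng.gauss(100, 1) for _ in range(30)] + [rng.gauss(105, 1) for _ in range(30)]

# Global SD sees the shift as extra spread and swells with it.
sigma_global = pstdev(data)

# The moving range only sees point-to-point gaps, so the shift touches one term.
mr_bar = mean(abs(b - a) for a, b in zip(data, data[1:]))
sigma_mr = mr_bar / 1.128

print(f"global SD estimate:    {sigma_global:.2f}")
print(f"moving-range estimate: {sigma_mr:.2f}")
```

&lt;p&gt;The true point-to-point sigma in this toy stream is 1. The moving-range estimate stays near it, while the global SD is inflated by the shift it was supposed to help detect.&lt;/p&gt;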
&lt;h2 id=&#34;so-i-built-duck-spc&#34;&gt;So I built duck-spc&lt;/h2&gt;
&lt;p&gt;A century-old algorithm is cute; a century-old algorithm running over your
entire telemetry archive in one SQL query is useful. Following
&lt;a href=&#34;/posts/ducklake/&#34;&gt;my own advice&lt;/a&gt;, my data already lives as date-partitioned
Parquet on a bucket. So I built
&lt;a href=&#34;https://github.com/brojonat/duck-spc&#34;&gt;&lt;code&gt;duck-spc&lt;/code&gt;&lt;/a&gt;: the XmR math — limits,
derived streams, both detection rules — pushed down into DuckDB SQL over
&lt;code&gt;read_parquet()&lt;/code&gt;. The column contract is one line: your rows look like
&lt;code&gt;(ts, category…, value[, exposure])&lt;/code&gt;, where the categories define the streams
(think &lt;code&gt;region, service&lt;/code&gt;) and exposure handles observations-per-unit
normalization when rows carry unequal weight. The moving range is just a window
function:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;-- a window function cannot sit inside an aggregate, so compute the ranges first&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;WITH&lt;/span&gt; ranges &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; (
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt; region, service, value,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         &lt;span style=&#34;color:#66d9ef&#34;&gt;abs&lt;/span&gt;(value &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; lag(value) OVER (&lt;span style=&#34;color:#66d9ef&#34;&gt;PARTITION&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;BY&lt;/span&gt; region, service &lt;span style=&#34;color:#66d9ef&#34;&gt;ORDER&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;BY&lt;/span&gt; ts)) &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; mr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt; derived_stream
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;WHERE&lt;/span&gt; ts &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;?&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;AND&lt;/span&gt; ts &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;?&lt;/span&gt;          &lt;span style=&#34;color:#75715e&#34;&gt;-- the frozen baseline window&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt; region, service,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#66d9ef&#34;&gt;avg&lt;/span&gt;(value)                   &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; center,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#66d9ef&#34;&gt;avg&lt;/span&gt;(mr)                      &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; mr_bar,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;       &lt;span style=&#34;color:#66d9ef&#34;&gt;avg&lt;/span&gt;(value) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;.&lt;span style=&#34;color:#ae81ff&#34;&gt;66&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;avg&lt;/span&gt;(mr)  &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; unpl
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt; ranges
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;GROUP&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;BY&lt;/span&gt; region, service ...
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Thousands of streams, one scan, nothing materialized in Python except answers.&lt;/p&gt;
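&lt;p&gt;If you want to poke at the shape of that query without a bucket handy, here is a self-contained sketch. To stay inside the Python standard library it uses SQLite as a stand-in for DuckDB, which is an assumption of this example and not what &lt;code&gt;duck-spc&lt;/code&gt; does; the table contents are invented. The moving range is computed in a CTE first and aggregated afterwards.&lt;/p&gt;

```python
import sqlite3

# SQLite stands in for DuckDB here purely so the example has no dependencies.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE derived_stream (ts TEXT, region TEXT, service TEXT, value REAL)")
rows = [
    ("2026-01-01", "us-east", "checkout", 10.0),
    ("2026-01-02", "us-east", "checkout", 12.0),
    ("2026-01-03", "us-east", "checkout", 11.0),
    ("2026-01-04", "us-east", "checkout", 13.0),
]
conn.executemany("INSERT INTO derived_stream VALUES (?, ?, ?, ?)", rows)

query = """
WITH ranges AS (
    SELECT region, service, value,
           abs(value - lag(value) OVER (PARTITION BY region, service ORDER BY ts)) AS mr
    FROM derived_stream
    WHERE ts >= ? AND ? > ts          -- half-open baseline window
)
SELECT region, service,
       avg(value)                   AS center,
       avg(mr)                      AS mr_bar,
       avg(value) + 2.66 * avg(mr)  AS unpl
FROM ranges
GROUP BY region, service
"""
result = conn.execute(query, ("2026-01-01", "2026-02-01")).fetchall()
for region, service, center, mr_bar, unpl in result:
    print(region, service, round(center, 3), round(mr_bar, 3), round(unpl, 3))
```

&lt;p&gt;The first row of each stream has no predecessor, so its moving range is NULL and &lt;code&gt;avg&lt;/code&gt; skips it, which is the behavior you want.&lt;/p&gt;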
&lt;p&gt;The happy path is one command. Point it at a bucket and BOOM:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;duck-spc look --source &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;s3://my-bucket/events/&amp;#39;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;  --value latency_ms --group-by region,service --derive day:p95
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;── region=us-east, service=checkout ────────────────────────────
                          ●
 335.2 ┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄  UNPL
            ·    ·    ·          ·      ·     ··   ·  ·
 331.1 ──·──··─·───··──·──··──·───··─·───··──·───·──··──── X̄
       · ·    ·   ·   ·  ·   ··  ·   · ·    ·   ·    ·
 326.9 ┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄  LNPL
       2026-01-01 ── dim = baseline, checked from 2026-01-29 ──
 ✗ 1 signal point(s) — first 2026-02-10 (rule1)

2/4 group(s) show special-cause variation — go find the cause(s).
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;ASCII XmR charts per stream, right in your terminal, signals in red. The
&lt;code&gt;--derive&lt;/code&gt; flag handles the dirty secret of real telemetry — raw streams are
seasonal, trending, and noisy, so you chart a &lt;em&gt;derived&lt;/em&gt; stationary stream
instead: &lt;code&gt;day:mean&lt;/code&gt;, &lt;code&gt;day:p95&lt;/code&gt;, &lt;code&gt;day:rate&lt;/code&gt; (that&amp;rsquo;s &lt;code&gt;sum(value)/sum(exposure)&lt;/code&gt;,
computed in-engine because the ratio of sums is not the mean of ratios), or
first differences.&lt;/p&gt;
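&lt;p&gt;The ratio-of-sums point is worth two rows of arithmetic. The values below are hypothetical: one low-exposure row with a high ratio and one high-exposure row with a low ratio.&lt;/p&gt;

```python
# Hypothetical rows for one stream on one day: (value, exposure).
rows = [(1, 1), (10, 100)]

# What a rate should be: total events over total exposure.
ratio_of_sums = sum(v for v, _ in rows) / sum(e for _, e in rows)

# What you get if you average per-row ratios instead.
mean_of_ratios = sum(v / e for v, e in rows) / len(rows)

print(f"ratio of sums:  {ratio_of_sums:.4f}")   # 11 / 101
print(f"mean of ratios: {mean_of_ratios:.4f}")  # (1.0 + 0.1) / 2
```

&lt;p&gt;Averaging the per-row ratios lets the single low-exposure row count as much as the hundred-unit one, so the two answers differ by about a factor of five here.&lt;/p&gt;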
&lt;p&gt;When exploration graduates to production, the verbs decompose Unix-style.
&lt;code&gt;baseline&lt;/code&gt; freezes the limits into a JSON artifact that carries its own
provenance — source, derivation, window, per-stream limits — so every verdict is
traceable to the data that justified it. &lt;code&gt;check&lt;/code&gt; scores new data against the
frozen artifact and puts the verdict in the exit code: 0 means stable (no news
is good news), 1 means signals. And because reports embed their limits,
everything pipes:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;duck-spc baseline --source ... --window 2026-01-01:2026-01-29 &amp;gt; limits.json
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;duck-spc check --limits limits.json          &lt;span style=&#34;color:#75715e&#34;&gt;# cron-friendly: exit code talks&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;duck-spc check --limits limits.json | duck-spc visualize    &lt;span style=&#34;color:#75715e&#34;&gt;# human investigating&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;duck-spc chart --limits limits.json --group us-east,checkout -o incident.png
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That artifact is also where the doctrine lives. Limits are computed once from an
explicit window and &lt;strong&gt;frozen&lt;/strong&gt; — re-baselining is a deliberate act (re-run
&lt;code&gt;baseline&lt;/code&gt; after a verified process change), never automatic. Rolling windows
are the classic self-own here: the limits absorb every anomaly into the
baseline, the chart adapts to the disease, and your monitoring goes quietly,
permanently blind.&lt;/p&gt;
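&lt;p&gt;Here is the self-own in miniature. A sketch with invented data, seed, and window size: a stable baseline followed by a shift that never goes away, scored once against frozen limits and once against limits recomputed from a trailing window.&lt;/p&gt;

```python
import random
from statistics import mean

def limits(window):
    """XmR natural process limits (lower, upper) for a list of values."""
    center = mean(window)
    mr_bar = mean(abs(b - a) for a, b in zip(window, window[1:]))
    return center - 2.66 * mr_bar, center + 2.66 * mr_bar

def outside(x, lower, upper):
    return x > upper or lower > x

rng = random.Random(7)
n_base = 30
# Hypothetical stream: a stable baseline, then a sustained shift that persists.
data = [rng.gauss(100, 1) for _ in range(n_base)] + [rng.gauss(110, 1) for _ in range(60)]

# Frozen: limits computed once from the baseline window.
frozen = limits(data[:n_base])
frozen_signals = sum(outside(x, *frozen) for x in data[n_base:])

# Rolling: limits recomputed from the trailing n_base points before each new point.
rolling_signals = sum(
    outside(data[i], *limits(data[i - n_base:i])) for i in range(n_base, len(data))
)

print("frozen limits: ", frozen_signals, "of 60 shifted points flagged")
print("rolling limits:", rolling_signals, "of 60 shifted points flagged")
```

&lt;p&gt;The frozen chart keeps reporting the shift for as long as it lasts. The rolling chart flags the first stretch, then drags its own center line up to the new level and goes quiet while the problem is still there.&lt;/p&gt;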
&lt;h2 id=&#34;the-numbers&#34;&gt;The numbers&lt;/h2&gt;
&lt;p&gt;Measured on my laptop, because — say it with me — vibes are not a benchmark:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;2.16 million rows, 1,000 streams&lt;/strong&gt;: per-stream limits computed in &lt;strong&gt;0.11s&lt;/strong&gt;;
a full check scoring &lt;strong&gt;62,000 daily points&lt;/strong&gt; against frozen limits in
&lt;strong&gt;0.6s&lt;/strong&gt;. One process, no service, no GPU, no model registry.&lt;/li&gt;
&lt;li&gt;The detection math is cross-checked point-for-point against a reference numpy
implementation in the test suite, and the synthetic data generator plants
known signals (a spike, a sustained shift, a variance change) that the tests
must recover exactly, while the clean stream must stay silent.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;what-its-not&#34;&gt;What it&amp;rsquo;s not&lt;/h2&gt;
&lt;p&gt;The chart tells you &lt;em&gt;whether&lt;/em&gt; and &lt;em&gt;when&lt;/em&gt; the process changed — never &lt;em&gt;why&lt;/em&gt;
(that&amp;rsquo;s your job) and never &lt;em&gt;what happens next&lt;/em&gt; (that&amp;rsquo;s forecasting; the chart
only tells you whether the future is likely to resemble the past). And if you
chart raw seasonal data without deriving a stationary stream first, you&amp;rsquo;ll get
noise — that&amp;rsquo;s not the chart failing, that&amp;rsquo;s the chart faithfully reporting that
Mondays differ from Sundays.&lt;/p&gt;
&lt;h2 id=&#34;whats-next&#34;&gt;What&amp;rsquo;s next&lt;/h2&gt;
&lt;p&gt;This slots straight into the
&lt;a href=&#34;https://github.com/brojonat/short-circuit&#34;&gt;short-circuit&lt;/a&gt; worldview from the
Kafka post. The roadmap: a DuckLake catalog as a source (same API, snapshots and
time travel underneath), nonparametric quantile limits for streams too ugly even
for the gauntlet, and bolting this onto the Postgres-log hot path so live
telemetry flows in one end and boring, trustworthy verdicts come out the other.
The &amp;ldquo;Splunk + Prometheus → DuckLake and a cron job&amp;rdquo; short-circuit I&amp;rsquo;ve written
about previously suddenly has its brain.&lt;/p&gt;
&lt;p&gt;A hundred years ago Shewhart figured out how to tell signal from noise with
arithmetic a clerk could do by hand. We&amp;rsquo;ve spent the last decade or so
re-solving that problem with anomaly-detection services that bill by the
gigabyte and page you about Tuesdays. One constant, two rules, a bucket, and a
duck. Stop worrying. Trust statistics.&lt;/p&gt;
&lt;p&gt;The comments below are a Bluesky thread — tell me what your dashboards paged you
about last week that turned out to be nothing.&lt;/p&gt;
</content>
    </item>
    
  </channel>
</rss>
