<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Kafka on Jon Brown&#39;s Webpage</title>
    <link>/tags/kafka/</link>
    <description>Recent content in Kafka on Jon Brown&#39;s Webpage</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Tue, 26 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="/tags/kafka/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>You Probably Don&#39;t Need Kafka (You Need Postgres and a Bucket)</title>
      <link>/posts/you-dont-need-kafka/</link>
      <pubDate>Tue, 26 May 2026 00:00:00 +0000</pubDate>
      
      <guid>/posts/you-dont-need-kafka/</guid>
      <description>&lt;p&gt;I&amp;rsquo;ve been circling this one for a while. A couple of years ago I wrote about
&lt;a href=&#34;/posts/go-postgres-listen-notify/&#34;&gt;Postgres Listen/Notify&lt;/a&gt; and how you can do
PubSub without dragging a message broker into your stack. Last year I got
&lt;a href=&#34;/posts/ducklake/&#34;&gt;a little too excited about DuckLake&lt;/a&gt; and cheap, decoupled
storage and compute. This post is what happens when those two ideas have a baby.&lt;/p&gt;</description>
      <content>&lt;p&gt;I&amp;rsquo;ve been circling this one for a while. A couple of years ago I wrote about
&lt;a href=&#34;/posts/go-postgres-listen-notify/&#34;&gt;Postgres Listen/Notify&lt;/a&gt; and how you can do
PubSub without dragging a message broker into your stack. Last year I got
&lt;a href=&#34;/posts/ducklake/&#34;&gt;a little too excited about DuckLake&lt;/a&gt; and cheap, decoupled
storage and compute. This post is what happens when those two ideas have a baby.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the thing about Kafka: it&amp;rsquo;s almost never &lt;em&gt;just&lt;/em&gt; Kafka. The moment you
have a real use case, you&amp;rsquo;re not running a message bus, you&amp;rsquo;re running a small
fleet of distributed systems. Brokers, plus ZooKeeper or KRaft for coordination.
A Schema Registry so your payloads don&amp;rsquo;t drift. Kafka Connect workers to get data
&lt;em&gt;out&lt;/em&gt; to your warehouse or object store. Kafka Streams or ksqlDB for the stateful
processing and materialized views you inevitably need. And then tiered storage
(or a whole separate pipeline) so retention doesn&amp;rsquo;t bankrupt you. Each of these
is its own thing to scale, upgrade, secure, and get paged about. Frequently it&amp;rsquo;s
a dedicated platform team and a five-figure monthly bill.&lt;/p&gt;
&lt;p&gt;And here&amp;rsquo;s my heretical little question: what does your project &lt;em&gt;actually&lt;/em&gt; need?&lt;/p&gt;
&lt;p&gt;In my experience, &amp;ldquo;we need Kafka&amp;rdquo; almost always decomposes into four things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Distribute work across a pool of workers.&lt;/li&gt;
&lt;li&gt;Fan an event out to several independent consumers.&lt;/li&gt;
&lt;li&gt;Keep a durable, replayable log of what happened.&lt;/li&gt;
&lt;li&gt;Query the history of that log later.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Postgres does the first three in plain SQL. Object storage holds the cold log for
basically free. DuckDB queries it at warehouse speeds with no server to run. So I
built a thing to prove it to myself, the same way I stood up DuckLake last year:
not by taking anyone&amp;rsquo;s word for it, but by actually running it.&lt;/p&gt;
&lt;h2 id=&#34;the-demo&#34;&gt;The demo&lt;/h2&gt;
&lt;p&gt;The project is a template I&amp;rsquo;m calling
&lt;a href=&#34;https://github.com/brojonat/short-circuit/tree/main/templates/kafka-to-pg&#34;&gt;&lt;code&gt;kafka-to-pg&lt;/code&gt;&lt;/a&gt;.
It streams live geographic
telemetry — a synthetic fleet of ~500 &amp;ldquo;aircraft&amp;rdquo; drifting around the continental
US, publishing position reports at about &lt;strong&gt;1,000 messages per second&lt;/strong&gt; by default
(and you can crank that knob). It renders them on a live map that updates over
&lt;a href=&#34;/posts/websockets/&#34;&gt;Server-Sent Events&lt;/a&gt;, maintains a &amp;ldquo;current position per
aircraft&amp;rdquo; table, ages old data out to Parquet on object storage, and lets you
run SQL over that history. There&amp;rsquo;s a little query console in the browser too.&lt;/p&gt;
&lt;p&gt;The entire thing runs on &lt;strong&gt;one Postgres instance and one bucket.&lt;/strong&gt; No brokers, no
ZooKeeper, no Connect cluster, no Streams app. Just SQL and a handful of small,
stateless Go processes that you can kill and restart whenever you feel like it.&lt;/p&gt;
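&lt;p&gt;To make the &amp;ldquo;current position per aircraft&amp;rdquo; table concrete, here is a minimal sketch of how a consumer
could maintain it with an upsert. The table name &lt;code&gt;aircraft_position&lt;/code&gt; and all of its columns are
hypothetical, not taken from the template. The guard on &lt;code&gt;c_offset&lt;/code&gt; is one way to keep a replayed or
late message from overwriting a newer position.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;-- Hypothetical latest-state table: one row per aircraft.
CREATE TABLE IF NOT EXISTS aircraft_position (
  aircraft_id text             PRIMARY KEY,
  lat         double precision NOT NULL,
  lon         double precision NOT NULL,
  c_offset    bigint           NOT NULL,  -- log offset this row was built from
  updated_at  timestamptz      NOT NULL DEFAULT now()
);

-- $1 = aircraft id, $2 = lat, $3 = lon, $4 = offset of the message.
INSERT INTO aircraft_position (aircraft_id, lat, lon, c_offset)
VALUES ($1, $2, $3, $4)
ON CONFLICT (aircraft_id) DO UPDATE
SET lat        = EXCLUDED.lat,
    lon        = EXCLUDED.lon,
    c_offset   = EXCLUDED.c_offset,
    updated_at = now()
-- Only move forward: ignore messages older than what is already stored.
WHERE aircraft_position.c_offset &amp;lt; EXCLUDED.c_offset;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the write is idempotent for a given offset, a consumer that restarts and replays part of the log
ends up with the same table.&lt;/p&gt;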
&lt;h2 id=&#34;sql-is-the-api-the-whole-way-down&#34;&gt;SQL is the API, the whole way down&lt;/h2&gt;
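&lt;p&gt;The &amp;ldquo;topic&amp;rdquo; is just an append-only table. The part people worry about, handing out
monotonic offsets without a race, takes one statement. The &lt;code&gt;UPDATE&lt;/code&gt; locks the counter row until
commit, so concurrent producers queue up behind it and each one gets its own gap-free block
of offsets, which the same statement then uses to insert the batch:&lt;/p&gt;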
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;WITH&lt;/span&gt; reserve &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; (
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;UPDATE&lt;/span&gt; log_counter
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;SET&lt;/span&gt; next_offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; next_offset &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;cardinality&lt;/span&gt;(&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;::bytea[])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;WHERE&lt;/span&gt; id &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  RETURNING next_offset &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;cardinality&lt;/span&gt;(&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;::bytea[]) &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; first_off
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;INSERT&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;INTO&lt;/span&gt; topic (topic_id, c_offset, payload)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt; &lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;, r.first_off &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; ord &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;, payload
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt; reserve r, &lt;span style=&#34;color:#66d9ef&#34;&gt;unnest&lt;/span&gt;(&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;::bytea[]) &lt;span style=&#34;color:#66d9ef&#34;&gt;WITH&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;ORDINALITY&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; p(payload, ord);
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s the atomicity Kafka gives you, as one SQL statement. Consumer groups are a
&lt;code&gt;consumer_offsets&lt;/code&gt; table and a query that claims a range of offsets and advances
the cursor atomically — same idea, log-based, each group replays independently.
And if what you want is a &lt;em&gt;task queue&lt;/em&gt; rather than fan-out (the RabbitMQ/SQS
shape), that&amp;rsquo;s the classic &lt;code&gt;SELECT ... FOR UPDATE SKIP LOCKED&lt;/code&gt;: many workers pull
from the same table, and a worker that hits a locked row just skips to the next
job instead of waiting. No broker required for any of it.&lt;/p&gt;
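&lt;p&gt;Here is a rough sketch of both shapes. Only the names &lt;code&gt;consumer_offsets&lt;/code&gt;, &lt;code&gt;log_counter&lt;/code&gt;,
&lt;code&gt;topic&lt;/code&gt;, &lt;code&gt;c_offset&lt;/code&gt; and &lt;code&gt;payload&lt;/code&gt; come from the text above. The columns of
&lt;code&gt;consumer_offsets&lt;/code&gt;, the &lt;code&gt;jobs&lt;/code&gt; table, and the parameter order are assumptions made for
illustration, and the template&amp;rsquo;s actual queries may differ.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;-- Assumed cursor table: one row per (consumer group, topic).
CREATE TABLE IF NOT EXISTS consumer_offsets (
  group_id    text   NOT NULL,
  topic_id    int    NOT NULL,
  next_offset bigint NOT NULL,
  PRIMARY KEY (group_id, topic_id)
);

-- Consumer group: claim up to $3 messages for group $1 on topic $2
-- and advance that group&amp;#39;s cursor in the same statement.
WITH prev AS (
  SELECT c.next_offset  AS start_off,
         lc.next_offset AS tip        -- first offset not yet committed
  FROM consumer_offsets c
  JOIN log_counter lc ON lc.id = c.topic_id
  WHERE c.group_id = $1 AND c.topic_id = $2
  FOR UPDATE OF c                     -- serializes members of one group
), claim AS (
  UPDATE consumer_offsets c
  SET next_offset = LEAST(prev.start_off + $3::bigint,
                          GREATEST(prev.tip, prev.start_off))
  FROM prev
  WHERE c.group_id = $1 AND c.topic_id = $2
  RETURNING prev.start_off, c.next_offset AS end_off
)
SELECT t.c_offset, t.payload
FROM topic t
JOIN claim ON t.c_offset &amp;gt;= claim.start_off
          AND t.c_offset &amp;lt; claim.end_off
WHERE t.topic_id = $2
ORDER BY t.c_offset;

-- Task queue: a hypothetical jobs table drained with SKIP LOCKED.
CREATE TABLE IF NOT EXISTS jobs (
  id      bigserial   PRIMARY KEY,
  payload jsonb       NOT NULL,
  done_at timestamptz
);

BEGIN;
-- Each worker locks one pending job; rows locked by other workers are skipped.
SELECT id, payload
FROM jobs
WHERE done_at IS NULL
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... do the work, then mark the claimed job ($1 = its id) as done.
UPDATE jobs SET done_at = now() WHERE id = $1;
COMMIT;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One design note on the claim query: it moves the cursor when the range is claimed, so a consumer that wants
at-least-once delivery should process the batch inside the same transaction and commit only when it is done.&lt;/p&gt;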
&lt;h2 id=&#34;the-part-im-actually-proud-of-retention&#34;&gt;The part I&amp;rsquo;m actually proud of: retention&lt;/h2&gt;
&lt;p&gt;This is where the &amp;ldquo;you&amp;rsquo;ll regret not using Kafka&amp;rdquo; crowd usually has a point.
Logs grow. At 1,000 msg/s mine grows by about &lt;strong&gt;a gigabyte an hour&lt;/strong&gt;. You can&amp;rsquo;t just
let that run.&lt;/p&gt;
&lt;p&gt;Kafka has &lt;code&gt;retention.ms&lt;/code&gt; and tiered storage for this. My version is almost
embarrassingly simple: the &lt;code&gt;topic&lt;/code&gt; table is &lt;strong&gt;partitioned by time&lt;/strong&gt;. A little
sweeper process creates upcoming partitions ahead of time, and for partitions
older than the retention window it does this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;DROP&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;TABLE&lt;/span&gt; topic_p_1779807425;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it. &lt;code&gt;DROP&lt;/code&gt; on a partition is effectively O(1) in row count, because Postgres removes the
partition&amp;rsquo;s files instead of visiting rows, and it leaves &lt;em&gt;zero&lt;/em&gt; dead tuples behind.
Compare that to a &lt;code&gt;DELETE&lt;/code&gt;-based TTL, which churns the heap and leaves you
chasing autovacuum forever. Dropping partitions is &lt;code&gt;retention.ms&lt;/code&gt;, in one line,
with none of the bloat.&lt;/p&gt;
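&lt;p&gt;For reference, a time-partitioned topic table could be declared roughly like this. It is a sketch under
assumptions: the &lt;code&gt;created_at&lt;/code&gt; column, the column types, the one-hour window, and reading the
partition suffix as the window&amp;rsquo;s start in Unix seconds are all guesses, not details confirmed by the
template.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;-- Assumed shape of the log table; created_at is a hypothetical column.
CREATE TABLE topic (
  topic_id   int         NOT NULL,
  c_offset   bigint      NOT NULL,
  payload    bytea       NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  -- A primary key on a partitioned table must include the partition key.
  PRIMARY KEY (topic_id, c_offset, created_at)
) PARTITION BY RANGE (created_at);

-- What the sweeper would run ahead of time for an upcoming window.
-- 1779807425 is 2026-05-26 14:57:05 UTC; the one-hour width is an assumption.
CREATE TABLE topic_p_1779807425 PARTITION OF topic
  FOR VALUES FROM (&amp;#39;2026-05-26 14:57:05+00&amp;#39;) TO (&amp;#39;2026-05-26 15:57:05+00&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Inserts into &lt;code&gt;topic&lt;/code&gt; are routed to the matching partition automatically, which is why the next
window has to exist before the clock reaches it.&lt;/p&gt;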
&lt;p&gt;But before it drops a partition, the sweeper hands the rows to a &lt;strong&gt;sink&lt;/strong&gt;. The
interesting sink writes them out as Zstd-compressed Parquet, partitioned by
date, to object storage. So Postgres only ever holds the hot window — the last
few minutes, or hours, or whatever you configure — and the full history lives on
a bucket as columnar files. That&amp;rsquo;s Kafka Connect and tiered storage, replaced by
a cron-shaped Go program and a sink interface.&lt;/p&gt;
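&lt;p&gt;The sink in the template is a Go program, and this is not its code. As a stand-in, the same archival step
can be sketched in DuckDB SQL, using its Postgres extension to read the expiring partition and a partitioned
&lt;code&gt;COPY&lt;/code&gt; to write Zstd Parquet. The connection string, the local &lt;code&gt;archive&lt;/code&gt; path (standing in
for a bucket URL), the &lt;code&gt;created_at&lt;/code&gt; column, and the idea that &lt;code&gt;kind&lt;/code&gt; and &lt;code&gt;id&lt;/code&gt; are
top-level fields of the JSON payload are all assumptions.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;-- Run in DuckDB, not Postgres. Connection details are placeholders.
INSTALL postgres;
LOAD postgres;
ATTACH &amp;#39;dbname=app host=localhost&amp;#39; AS pg (TYPE postgres, READ_ONLY);

COPY (
  SELECT
    c_offset,
    created_at,
    -- payload is bytea holding JSON: decode to text, then pull fields out.
    json_extract_string(decode(payload), &amp;#39;$.kind&amp;#39;) AS kind,
    json_extract_string(decode(payload), &amp;#39;$.id&amp;#39;)   AS id,
    CAST(created_at AS DATE) AS dt   -- becomes the dt=... directory name
  FROM pg.public.topic_p_1779807425
) TO &amp;#39;archive&amp;#39; (
  FORMAT parquet,
  COMPRESSION zstd,
  PARTITION_BY (dt),
  -- Several partitions land in the same date directory, so allow writing
  -- into it and give each partition&amp;#39;s files a distinct name.
  OVERWRITE_OR_IGNORE true,
  FILENAME_PATTERN &amp;#39;topic_p_1779807425_{i}&amp;#39;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Only after a write like this succeeds would the sweeper go on to drop the partition, so a failed sink run
leaves the rows in Postgres to retry.&lt;/p&gt;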
&lt;h2 id=&#34;then-duckdb-shows-up-and-eats-ksqldbs-lunch&#34;&gt;Then DuckDB shows up and eats ksqlDB&amp;rsquo;s lunch&lt;/h2&gt;
&lt;p&gt;Once your history is Parquet on a bucket, querying it is the easy part, because
&lt;a href=&#34;/posts/ducklake/&#34;&gt;DuckDB is a marvel&lt;/a&gt;. Point it at the files and you have a SQL
analytics engine over your entire event history, no server:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt; kind, &lt;span style=&#34;color:#66d9ef&#34;&gt;count&lt;/span&gt;(&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;) &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; msgs, &lt;span style=&#34;color:#66d9ef&#34;&gt;count&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;DISTINCT&lt;/span&gt; id) &lt;span style=&#34;color:#66d9ef&#34;&gt;AS&lt;/span&gt; entities
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt; read_parquet(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;archive/**/*.parquet&amp;#39;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;GROUP&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;BY&lt;/span&gt; kind;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I wrapped this in a little web console so you can poke at the archive from the
browser. This is the ksqlDB / stream-analytics box on the Kafka architecture
diagram, and it&amp;rsquo;s a &lt;code&gt;read_parquet&lt;/code&gt; call.&lt;/p&gt;
&lt;h2 id=&#34;the-numbers&#34;&gt;The numbers&lt;/h2&gt;
&lt;p&gt;I measured this stuff on my laptop, because vibes are not a benchmark:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A row in Postgres costs about &lt;strong&gt;317 bytes&lt;/strong&gt; on disk (the JSON payload plus row
overhead plus the index).&lt;/li&gt;
&lt;li&gt;The same row in Zstd Parquet after archival is about &lt;strong&gt;42 bytes&lt;/strong&gt; — roughly a
&lt;strong&gt;7.5× reduction&lt;/strong&gt;, and it&amp;rsquo;s columnar, so analytical scans fly.&lt;/li&gt;
&lt;li&gt;A single Postgres handles the messaging load far past where most people assume
it taps out. Published benchmarks for this pattern put it around 5k writes/s
and 25k reads/s on 4 vCPUs, scaling to hundreds of thousands of writes/s on a
big box. Your &amp;ldquo;Kafka-scale&amp;rdquo; workload is probably smaller than you think.&lt;/li&gt;
&lt;/ul&gt;
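&lt;p&gt;The per-row figures above translate into storage rates with a little arithmetic, shown here as a query
that runs in either Postgres or DuckDB. The second statement is one plausible way to get a bytes-per-row
number in Postgres, not necessarily how the 317 was measured. It targets a single partition because a
partitioned parent table has no storage of its own.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;-- Bytes per row x 1,000 msg/s x 3,600 s, expressed in GB per hour.
SELECT
  317 * 1000 * 3600 / 1e9 AS hot_gb_per_hour,      -- about 1.14
  42 * 1000 * 3600 / 1e9  AS archive_gb_per_hour,  -- about 0.15
  round(317 / 42.0, 1)    AS reduction_factor;     -- 7.5

-- Heap plus indexes, divided by row count, for one partition (Postgres only).
SELECT pg_total_relation_size(&amp;#39;topic_p_1779807425&amp;#39;) / NULLIF(count(*), 0) AS bytes_per_row
FROM topic_p_1779807425;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So the hot window costs a bit over a gigabyte an hour, while the archived copy of that same hour is
around 150 MB.&lt;/p&gt;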
&lt;p&gt;The punchline I keep coming back to: &lt;strong&gt;two pieces of infrastructure to operate
instead of a fleet.&lt;/strong&gt; One database you already know how to run, and a bucket.&lt;/p&gt;
&lt;h2 id=&#34;when-you-should-still-use-kafka&#34;&gt;When you should still use Kafka&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;m not going to insult you with &amp;ldquo;Kafka is dead.&amp;rdquo; It isn&amp;rsquo;t, and there are real
reasons it exists. If you genuinely need sustained multi-hundred-MB/s throughput,
huge fan-out to many independent consumer clusters, multi-datacenter replication,
exactly-once processing across topics, or the mature connector ecosystem — use
the right tool. Reach for Kafka, or Pulsar, or Redpanda.&lt;/p&gt;
&lt;p&gt;But that&amp;rsquo;s a &lt;em&gt;narrower&lt;/em&gt; set of requirements than the number of Kafka clusters in
the world would suggest. For a very large fraction of &amp;ldquo;we need a streaming
platform&amp;rdquo; projects, the honest answer is that you need a durable log, a couple of
consumers, and somewhere cheap to keep the history. That&amp;rsquo;s Postgres, a bucket,
and DuckDB. The savings — in dollars, in services, in 3am pages — are not subtle.&lt;/p&gt;
&lt;p&gt;I think this is the most underrated move in backend engineering right now:
look hard at the expensive, operationally heavy thing in your architecture and
ask whether a boring database and some object storage short-circuit the whole
problem. Often they do.&lt;/p&gt;
&lt;p&gt;This Kafka-killer is the first entry in a project I&amp;rsquo;m calling
&lt;a href=&#34;https://github.com/brojonat/short-circuit&#34;&gt;short-circuit&lt;/a&gt; — think of it as a
gallery of savings. The plan is a growing collection of drop-in templates, each
one short-circuiting some expensive, heavyweight piece of infrastructure with a
boring alternative — Postgres and a bucket, or SQLite, or DuckDB, sometimes
just &amp;ldquo;use Postgres&amp;rdquo;. And &amp;ldquo;cheaper&amp;rdquo; really undersells it:
once you add up the brokers you don&amp;rsquo;t run, the platform team you don&amp;rsquo;t staff,
and the 3am pages you don&amp;rsquo;t get, simplicity stops being the budget option and
starts being the way businesses actually get &lt;em&gt;more&lt;/em&gt; for less.&lt;/p&gt;
&lt;p&gt;Kafka&amp;rsquo;s at offset zero, and the backlog is significant — but you get a vote on
what gets consumed next:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Snowflake / BigQuery → a DuckLake lakehouse.&lt;/strong&gt; The warehouse, minus the
warehouse invoice.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Splunk + Prometheus → DuckLake + a cron job.&lt;/strong&gt; Observability without the
per-gigabyte shakedown.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Airflow / Temporal → Postgres-native durable workflows.&lt;/strong&gt; Orchestration with
one fewer system to babysit.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The comments below are a Bluesky thread, so a reply &lt;em&gt;is&lt;/em&gt; a vote. Tell me which
one to short-circuit next.&lt;/p&gt;
</content>
    </item>
    
  </channel>
</rss>
