The Dispatch — SEC Filing Signal Research | Provenance

Latest

Methodology · 6 min read

Why We Put Provenance Inside Claude

A frontier model can reason about anything and verify almost nothing. We spent four years building a citation-bearing data layer, then let Claude query it directly over MCP. On the Finance Agent Benchmark, that one connection took Claude from 0/50 to 10/50 and halved its contradictions. Here is why the MCP layer is the most important distribution decision we have made.

All Posts

May 2026

Why We Put Provenance Inside Claude


April 2026

The Cooking Is the Moat

Why raw data is a commodity and signal memory is not — and why Nike's nineteenth consecutive quarter of operating margin pressure is not in Bloomberg.

What SEC Filings Say When They Say Nothing

We embedded 150 million sentences from SEC filings into 768-dimensional space. HDBSCAN found 551 clusters. One stopped us cold: 3,282 sentences about nothing happening.

March 2026

What 151 Million Sentences Taught Us About Predicting Stock Moves

151 million sentences. 924,000 filings. 472 classifiers — 159 hand-designed, 313 discovered by machines. Here is what a decade of reading every SEC filing taught us about predicting stock moves: why AUC lies, why trajectories beat events, and why 4 of our 5 most predictive signals were concepts nobody thought to look for.

A New Company Classification System from 150 Million Sentences

The SIC system was designed in 1937. GICS in 1999. Both classify companies by what they sell. We classified 3,000 companies by what they say — 469 binary classifiers across 150 million SEC sentences — and found clusters that predict stock behavior better than any existing taxonomy.