From the Lab

Research & Blog

Deep-dive research reports and editorial from the Provenance team — every thesis traceable to the exact filing sentence that triggered it.

The Dispatch

The Cooking Is the Moat

Why raw data is a commodity and signal memory is not — and why Nike's nineteenth consecutive quarter of operating margin pressure is not in Bloomberg.

The Dispatch

What SEC Filings Say When They Say Nothing

We embedded 150 million sentences from SEC filings into 768-dimensional space. HDBSCAN found 551 clusters. One stopped us cold: 3,282 sentences about nothing happening.

The Dispatch

What 151 Million Sentences Taught Us About Predicting Stock Moves

151 million sentences. 924,000 filings. 472 classifiers — 159 hand-designed, 313 discovered by machines. Here is what a decade of reading every SEC filing taught us about predicting stock moves: why AUC lies, why trajectories beat events, and why 4 of our 5 most predictive signals were concepts nobody thought to look for.

The Dispatch

A New Company Classification System from 150 Million Sentences

The SIC system was designed in 1937. GICS in 1999. Both classify companies by what they sell. We classified 3,000 companies by what they say — 469 binary classifiers across 150 million SEC sentences — and found clusters that predict stock behavior better than any existing taxonomy.

Research Report

Biotech & Pharma: Net-Negative Sector Snapshot

822 companies, 7 signal dimensions. Only 1 in 5 names shows a positive echo profile. Abbott and Novo Nordisk at maximum profile mutation.

Research Report

Short Radar

Identify companies with multiple distress signals firing simultaneously. When revenue declines and costs get cut at the same time, pay attention.

Research Report

Momentum Matters

Velocity analysis reveals turning points that linear regression misses. The first derivative tells the story.
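
A minimal sketch of the idea, not the report's actual method: on a signal that rises and then falls, a single least-squares line over the whole window has a slope near zero, while the first difference (a discrete first derivative) flips sign exactly at the peak. The series values and the function names `first_difference`, `turning_points`, and `ols_slope` are hypothetical, chosen only for illustration.

```python
def first_difference(series):
    """Discrete first derivative: change between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


def turning_points(series):
    """Indices in `series` where the first difference changes sign.

    Only strict sign changes are detected; flat stretches (zero velocity)
    are ignored in this simplified version.
    """
    velocity = first_difference(series)
    points = []
    for i in range(1, len(velocity)):
        if velocity[i - 1] * velocity[i] < 0:
            points.append(i)  # series[i] is the local peak or trough
    return points


def ols_slope(series):
    """Slope of an ordinary least-squares line fitted against index 0..n-1."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


# Hypothetical signal (made-up values): four periods up, four periods down.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0]

print(turning_points(signal))  # [4]
print(ols_slope(signal))       # close to zero: the line sees no trend
```

Here the fitted line reports essentially no trend, while the sign change in the first difference marks index 4 as the turn.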

Research Report

From Chaos, Patterns Emerge

How we extract signals from SEC filings and visualize corporate health trajectories across decades.

Research Report

The Wave Rider

Quality biotech oscillators: 82% win rate, 75-day median hold, +62.5% average return per wave.

Research Report

Distress Radar

14 classifiers detect financial distress. COVID shock at -3.5σ, recovery signals tracked across 800+ companies.
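
For readers unfamiliar with the σ notation, one common way to read a figure such as "-3.5σ" is as a z-score: how many standard deviations an observation sits from a baseline mean. The sketch below assumes that reading; the listing does not say how the report defines its baseline or whether it uses the sample standard deviation, and the function name `z_score` and all numbers are hypothetical.

```python
from statistics import mean, stdev


def z_score(value, baseline):
    """Distance of `value` from the baseline mean, in sample standard deviations.

    Assumption: the baseline is a plain list of earlier readings of the same
    signal. The report's actual baseline window is not stated in the source.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation (n - 1 denominator)
    return (value - mu) / sigma


# Made-up readings, not data from the report.
baseline_readings = [8.0, 10.0, 12.0]
shock_reading = 4.0

print(z_score(shock_reading, baseline_readings))  # -3.0
```

A negative result means the reading fell below the baseline mean; the magnitude says how unusual the move was relative to past variation.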

Research Report

ATYR: Turnaround Signal

9 biotech classifiers, 10 years of filings. Strongest signal in company history.

Research Report

Mining Sector Recovery

22 classifiers, 4.2M sentences. COVID shock at -4.7σ, commodity boom at +3.3σ.

Research Report

Healthcare Is Stabilizing

29 classifiers, 1.2M sentences. COVID breakdown at -2.4σ, now recovering toward pre-pandemic baseline.

Research Report

AI Trifurcation in Business Services

Three diverging paths across IT services, staffing, and consulting as AI reshapes demand signals from 300+ companies over 20 years of filings.

Get Access

Ready to run your own thesis?

Request access to the full classifier dataset — every signal, every filing, every sentence.
