Turn Claude Into a Citation-Bearing Analyst
Provenance Connect is a Model Context Protocol (MCP) server that gives Claude the structured data layer it lacks. On its own, Claude tends to hallucinate details about companies, biotech pipelines, fund managers, and news. Connected to Provenance, it answers from a graph it can cite, as of any date, including the failures.
Three limits no amount of model scale fixes
Frontier LLMs are extraordinary at language, reasoning, and pattern-matching across their training text. But three structural gaps make them untrustworthy for primary-source research.
One hard number
We ran the Finance Agent Benchmark public 50-question set — exactly the kind of "as-of corporate filing" queries we built Provenance Connect for.
Eight things Claude can't do alone
Each one is a query Claude either gets wrong or can't attempt without a structured, citation-bearing layer underneath it.
32 production tools across seven domains
Every tool returns provenance — source reference, snippet, and as-of date — on every fact. Every derived number carries its methodology version and caveats.
| Domain | What it covers |
|---|---|
| SEC Filing Signals | Echo, distress, mutation, and entropy across millions of filings · 529 themed classifiers over 152M classified sentences. |
| Quarterly Fundamentals | XBRL revenue, COGS, capex, lease, interest expense, dividends, growth rates — 50 columns, 4,400 tickers, back to 2009. |
| 8-K Event Timeline | Item-code filters — 4.01 auditor change, 4.02 non-reliance, 1.03 bankruptcy, 5.02 officer departure, and more. |
| News & Press Releases | 127M sentences (2020–2026) plus 4M article-level rows with event type, materiality score, and post-publication returns. |
| 13F Institutional Ownership | Top holders, full portfolios (not capped), position history, cohort flow, manager track records, all-notable consensus. |
| 13D/13G Activist Positions | Per-stake campaign status, intent classification, and accumulation trail. |
| Biotech Pipelines | Point-in-time pipeline reconstruction, survivorship-complete base rates, target landscapes, comparables, and risk scores — across 1,676 CIKs, including delisted. |
Plus discover_tools, a semantic catalog meta-tool that Claude calls when it is unsure which tool fits a question, so the tool surface can grow without a quality regression. Most MCP servers don't have one.
Magnitude signals are labeled as magnitude, not direction; replication returns are labeled as replication, not audited fund returns. Every fact is verifiable.
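To make the provenance contract concrete, here is a minimal Python sketch of a fact that carries the fields named above (source reference, snippet, as-of date, methodology version, caveats), plus an as-of filter. The class, field, and function names are illustrative assumptions, not Provenance Connect's actual schema or API, and the sample values are placeholders rather than real data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """One cited fact. All field names here are hypothetical."""
    claim: str
    source_ref: str                 # e.g. a filing or article identifier
    snippet: str                    # supporting text quoted from the source
    as_of: date                     # date from which the fact was knowable
    methodology_version: Optional[str] = None  # only for derived numbers
    caveats: Tuple[str, ...] = ()


def facts_as_of(facts: List[Fact], cutoff: date) -> List[Fact]:
    """Keep only facts knowable on or before `cutoff`, oldest first."""
    return sorted((f for f in facts if f.as_of <= cutoff), key=lambda f: f.as_of)


def is_citable(fact: Fact) -> bool:
    """Treat a fact as citable only if it names a source and quotes it."""
    return bool(fact.source_ref.strip()) and bool(fact.snippet.strip())


# Placeholder records, invented purely for illustration.
sample = [
    Fact("Example claim A", "EXAMPLE-REF-1", "quoted text A", date(2021, 3, 1)),
    Fact("Example claim B", "EXAMPLE-REF-2", "quoted text B", date(2023, 6, 30)),
    Fact("Example claim C", "", "quoted text C", date(2020, 1, 15)),
]

visible = facts_as_of(sample, date(2022, 1, 1))
citable = [f.claim for f in visible if is_citable(f)]
print(citable)  # ['Example claim A']
```

Claim B is excluded because it postdates the cutoff, and claim C is excluded because it has no source reference. Filtering on the as-of date before answering is one simple way to keep later information out of a point-in-time question.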
From EDGAR to your AI assistant in one hop
No pipeline glue. No bespoke integration. One OAuth handshake and you're querying a corpus built over four years.
Why this is hard to replicate
"Couldn't someone just point an LLM at SEC EDGAR?" The short answer is no — and the long answer is five things that took years to build.
Built for research professionals who need data they can defend
One question. One tool. One citation.
Access is currently by invitation. Request access and we'll add you to the allowlist. You'll then find Provenance Connect listed in Anthropic's Claude connector directory and can authenticate in under a minute.
Every signal. Every source. Every time.
Looking for direct data access? Explore the data stream →