What is Provenance Connect?
Provenance Connect is a conversational research connector for SEC filings and corporate disclosure data. It gives Claude access to a curated database of filing signals, classifier-tagged sentences, quarterly fundamentals, and news / press releases — turning natural-language research questions into structured queries against a multi-million-document corpus.
Where most financial data products return filings, Provenance Connect returns signal: language-level patterns computed across the corpus and updated continuously. Users can ask "which companies have rising distress signals" instead of "show me filings with the word bankruptcy."
🔗 Universal endpoint: https://mcp.kscope.io/mcp/. All users connect to the same URL; identity is carried in the OAuth-issued bearer token.
Connection & Authentication
Provenance Connect uses OAuth 2.1 with magic-link sign-in. The flow from discovery to first query takes under two minutes.
Onboarding Flow
1. Find Provenance Connect in the Claude connector directory (search "Provenance Connect" or "Kaleidoscope")
2. Click Connect → Claude redirects to mcp.kscope.io for OAuth
3. Enter your work email → Provenance Connect checks the allowlist
4. One-time magic link sent to your email → click to authenticate
5. OAuth token issued (30-day JWT) → returned to Claude
6. All subsequent tool calls carry the bearer token automatically
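Claude attaches the token automatically, so nothing below is required to use the connector. As an illustration of what step 6 means on the wire, here is a minimal sketch that builds (but does not send) a request carrying the bearer token. The JSON-RPC body and the `tools/list` method name are assumptions based on the general MCP protocol rather than anything this page specifies, a real MCP session also begins with an initialization handshake that is omitted here, and `build_mcp_request` is a hypothetical helper name.

```python
import json
import urllib.request

MCP_URL = "https://mcp.kscope.io/mcp/"  # universal endpoint from this document


def build_mcp_request(token: str, method: str = "tools/list") -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC request that carries the bearer token.

    The body shape and default method name are assumptions for illustration.
    """
    body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": method}).encode("utf-8")
    return urllib.request.Request(
        MCP_URL,
        data=body,
        method="POST",
        headers={
            # Identity travels in this header; the URL is the same for every user.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )


if __name__ == "__main__":
    req = build_mcp_request("example-token")
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```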
Token Details
| Property | Value |
| Token format | 30-day JWT (Bearer) |
| Access model | Allowlist-gated — only invited email addresses can authenticate |
| Revocation | Takes effect on the next request |
| Client registration | RFC 7591 dynamic client registration |
| PKCE | S256 required |
| Standards | OAuth 2.1, RFC 8414, RFC 9728 |
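Because the token is a 30-day JWT, a client can estimate how long it remains valid by reading the payload. The sketch below assumes the token carries the standard `exp` claim from RFC 7519, which this page does not confirm. It decodes the payload without verifying the signature, so it is suitable only for display or for deciding when to re-authenticate, and `jwt_seconds_remaining` is a hypothetical helper name.

```python
import base64
import json
import time
from typing import Optional


def jwt_seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Return seconds until the token's `exp` claim (no signature verification)."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    # Assumption: the issuer sets `exp` as a Unix timestamp in seconds.
    return claims["exp"] - current
```

A negative result means the token has already expired. Revocation is separate: a revoked token can still show time remaining here, yet be rejected on the next request.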
⚠️ SSH / CLI users: Device-flow OAuth (RFC 8628) for Claude Code is on the roadmap. In the meantime, authenticate via the browser-based Claude app, which shares the token.
Data Coverage
| Dataset | Coverage | Count |
| SEC filings (with signal aggregates) | All major forms; 2022–2026 | ~262K filings |
| Filing-classifier observations | Score ≥ 0.50 | ~76M observations |
| Filing sentences (vector-searchable) | 2022–2026 (2024–2026 primary) | ~152M sentences |
| Signal classifiers | Distress, growth, capital, governance, sector-specific | 529 classifiers |
| XBRL quarterly fundamentals | Up to 20 quarters per company | ~456K observations, ~4,400 tickers |
| News / press releases | GlobeNewswire + PRNewswire, 2023–2026 | ~41M sentences |
| Company signal snapshots | Current state, refreshed daily | ~5,500 tickers |
Update Cadence
| Collection | Refresh | Latency |
| SEC filing sentences | Continuous (sync worker every 5s) | Minutes after SEC acceptance |
| News sentences | Continuous | Minutes after article publish |
| Company signal snapshots | Daily batch | Overnight refresh |
| XBRL fundamentals | As XBRL filings arrive | Within hours of filing |
| Classifier metadata | Weekly | When classifier weights are re-tuned |
Authentication Detail
Provenance Connect implements a full OAuth 2.1 authorization server at https://mcp.kscope.io/.
Endpoints
Authorization: https://mcp.kscope.io/authorize
Token: https://mcp.kscope.io/token
Registration: https://mcp.kscope.io/register # RFC 7591
Metadata: https://mcp.kscope.io/.well-known/oauth-authorization-server # RFC 8414
Resource: https://mcp.kscope.io/.well-known/oauth-protected-resource # RFC 9728
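To show how the authorization endpoint is typically used, here is a minimal sketch that assembles an authorization-code request URL with the standard OAuth parameters. The `client_id`, `redirect_uri`, and `state` values are placeholders, the PKCE challenge is expected to be computed by the caller, and `build_authorize_url` is a hypothetical helper. MCP clients such as Claude perform this step themselves.

```python
from urllib.parse import urlencode

AUTHORIZE_URL = "https://mcp.kscope.io/authorize"  # from the endpoint list above


def build_authorize_url(client_id: str, redirect_uri: str,
                        code_challenge: str, state: str) -> str:
    """Assemble an OAuth authorization-code request URL with PKCE (S256)."""
    params = {
        "response_type": "code",
        "client_id": client_id,          # obtained from dynamic client registration
        "redirect_uri": redirect_uri,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",  # the only method this server accepts
        "state": state,                   # opaque anti-CSRF value chosen by the client
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_authorize_url(
        client_id="example-client",
        redirect_uri="http://localhost:8765/callback",
        code_challenge="example-challenge",
        state="example-state",
    ))
```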
Transport
- HTTPS only, TLS 1.2+, served with a wildcard certificate
- PKCE S256 required on all authorization requests
- Dynamic client registration (no pre-registration step for MCP clients)
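The PKCE S256 requirement above means each authorization request carries a challenge derived from a secret verifier, as defined in RFC 7636. A minimal sketch of generating that pair follows. The function names are illustrative, and MCP clients normally do this internally.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def challenge_for(verifier: str) -> str:
    """Compute the S256 code challenge: BASE64URL(SHA256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 64 random bytes encode to 86 URL-safe characters, inside the
    # 43-128 character range RFC 7636 allows for a verifier.
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    print(len(verifier), len(challenge))
```

The client sends the challenge with the authorization request and keeps the verifier private until the token exchange, where the server recomputes the hash and compares.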
Security & Privacy
| Property | Details |
| Authentication | OAuth 2.1 with magic-link sign-in |
| Authorization | Allowlist-gated — only invited email addresses can authenticate |
| Transport | HTTPS only, TLS 1.2+ |
| Access type | Read-only, non-destructive, idempotent — cannot write or delete data |
| PII collected | Email address (allowlist) and OAuth session state only |
| Prompt retention | No prompt content or tool inputs retained beyond standard service logs |
Known Limitations
- No forward-return or backtest performance metrics surfaced through tools
- No index-membership lookups (S&P 500, Russell 1000, etc.)
- No analyst estimates or consensus data
- XBRL fundamentals: gross profit is missing for ~60% of companies due to inconsistent XBRL tagging; a Revenue − COGS fallback is in progress
- Filing sentence coverage: 2024–2026 is primary; 2022–2023 is partial; 2018–2021 not yet loaded
- News ticker tagging occasionally surfaces non-company strings (e.g. exchange names) in ticker fields
- Company signal snapshots are current-state only (no historical time series)
- Device-flow OAuth for SSH / Claude Code users not yet supported
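Two of the limitations above (missing gross profit and non-company strings in ticker fields) can be worked around on the client side until fixes ship. The sketch below is one possible approach, not the connector's own logic. The function names, argument names, and the small set of exchange names are hypothetical and would need adjusting to the actual tool output.

```python
from typing import Iterable, List, Optional

# Hypothetical, non-exhaustive set of exchange names seen in ticker fields.
NON_COMPANY_STRINGS = {"NYSE", "NASDAQ", "AMEX", "OTC"}


def gross_profit_or_fallback(gross_profit: Optional[float],
                             revenue: Optional[float],
                             cogs: Optional[float]) -> Optional[float]:
    """Prefer the reported gross profit; otherwise derive Revenue - COGS."""
    if gross_profit is not None:
        return gross_profit
    if revenue is not None and cogs is not None:
        return revenue - cogs
    return None  # not enough data to derive a value


def drop_non_company_tickers(tickers: Iterable[str]) -> List[str]:
    """Remove strings that match known exchange names (case-insensitive)."""
    return [t for t in tickers if t.strip().upper() not in NON_COMPANY_STRINGS]
```

A static exclusion set is deliberately conservative: it only drops strings that are known not to be companies, so a legitimate ticker is never removed by a heuristic.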
Roadmap
| Item | Status |
| Device-flow OAuth (RFC 8628) for Claude Code / SSH users | Planned |
| Admin CLI for allowlist management | Planned |
| Tiered access (free / pro / enterprise) | Planned |
| XBRL gross-profit fallback (Revenue − COGS) | In Progress |
| Expanded historical sentence coverage (argos_2024 backfill) | In Progress |
| 2018–2021 filing sentence collections | Planned |