Under the Hood

From Filing to Alpha Signal in Four Steps

A rigorous pipeline — every stage auditable, every output reproducible. No black boxes, no discretionary overrides.

01 📡 Ingest: EDGAR · 4–6hr
02 🔬 Parse & Filter: Sentence-level
03 🧠 Classify: 472 models
04 📦 Deliver: Frozen vectors

Filing latency: 4–6hr
Classifiers: 472
Median AUC: 0.956
Coverage start: 2014

01
Ingest

SEC EDGAR Monitoring

ARGOS continuously polls the SEC EDGAR full-text search and RSS feeds. New filings are detected within minutes of publication and queued for processing. The pipeline has continuous coverage going back to 2014 — every 10-K, 10-Q, and 8-K for 4,800+ liquid tickers.

Filing Types: 10-K · 10-Q · 8-K · DEF 14A
Detection Latency: Near real-time
Universe Coverage: 4,800+ liquid US tickers
Historical Coverage: 2014 – present (10+ years)
EDGAR RSS · full-text search API · deduplication · queue-based pipeline · 3.7M+ filings processed
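
The deduplication and queueing step can be pictured with a minimal sketch. It assumes filings are keyed by accession number, which the page names as the delivery key; the `Filing` and `FilingQueue` names, the tickers, and the accession numbers are hypothetical placeholders, not ARGOS internals.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Filing:
    accession: str   # SEC accession number, unique per filing
    form_type: str   # e.g. "10-K", "8-K"
    ticker: str


class FilingQueue:
    """FIFO queue that drops filings whose accession number was already seen."""

    def __init__(self):
        self._seen = set()
        self._pending = deque()

    def offer(self, filing):
        """Enqueue a filing; return False if it is a duplicate."""
        if filing.accession in self._seen:
            return False
        self._seen.add(filing.accession)
        self._pending.append(filing)
        return True

    def take(self):
        """Pop the oldest pending filing, or None if the queue is empty."""
        return self._pending.popleft() if self._pending else None


# The same filing can surface in both the RSS feed and full-text search.
queue = FilingQueue()
feed_items = [
    Filing("0000000000-24-000001", "10-K", "AAA"),  # placeholder values
    Filing("0000000000-24-000002", "8-K", "BBB"),
    Filing("0000000000-24-000001", "10-K", "AAA"),  # duplicate of the first
]
accepted = [queue.offer(f) for f in feed_items]
```

Here `accepted` records which items were newly queued, so the third item (a repeat of the first accession number) is rejected before any processing happens.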
02
Parse & Filter

Sentence-Level Decomposition

Raw HTML/XBRL is stripped and the filing text is segmented into individual sentences. Legal boilerplate — forward-looking statement disclaimers, risk factor repetitions, exhibit lists — is identified and removed using Hyperscan pattern matching before any classifier ever sees the text. This removes 60%+ of low-signal content and concentrates classifier power on substantive disclosure.

Input: Raw SEC EDGAR HTML / XBRL document
Noise Removed: 60%+ boilerplate filtered before classification
Segmentation: sentence → embedding unit
Output: Clean sentence corpus per filing
Hyperscan pattern matching · HTML stripping · XBRL parsing · sentence segmentation · boilerplate removal
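
As a rough illustration of the strip, segment, and filter sequence, the sketch below uses Python's built-in `re` module as a stand-in for Hyperscan. The three boilerplate patterns and the naive sentence splitter are assumptions chosen for the example; the production rule set is not published here.

```python
import re

# Hypothetical boilerplate patterns, for illustration only.
BOILERPLATE_PATTERNS = [
    re.compile(r"forward-looking statements?", re.IGNORECASE),
    re.compile(r"incorporated (herein )?by reference", re.IGNORECASE),
    re.compile(r"^exhibit \d+", re.IGNORECASE),
]

TAG_RE = re.compile(r"<[^>]+>")
# Naive splitter: break after ., ! or ? when followed by whitespace and a capital.
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def strip_html(raw):
    """Replace tags with spaces, then collapse runs of whitespace."""
    return re.sub(r"\s+", " ", TAG_RE.sub(" ", raw)).strip()


def split_sentences(text):
    return [s.strip() for s in SENTENCE_RE.split(text) if s.strip()]


def is_boilerplate(sentence):
    return any(p.search(sentence) for p in BOILERPLATE_PATTERNS)


def clean_corpus(raw_html):
    """Return the sentences that survive the boilerplate filter."""
    return [s for s in split_sentences(strip_html(raw_html)) if not is_boilerplate(s)]


raw = (
    "<p>This report contains forward-looking statements. "
    "We closed two plants in March.</p>"
    "<p>Exhibit 21 lists our subsidiaries. Revenue rose 12% year over year.</p>"
)
sentences = clean_corpus(raw)
```

Of the four sentences in the toy input, the disclaimer and the exhibit reference are dropped, leaving only the two substantive statements for the classifiers.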
03
Embed, Classify & Validate

472 Classifiers. Three Quality Gates.

Each sentence is embedded into 768-dimensional space (E5-base-v2) and scored by every classifier in parallel. 159 are human-curated from investment theses. 313 were discovered by unsupervised clustering on 151M sentences — finding signals humans would never think to look for. But scoring is only half the story. AUC alone doesn’t catch classifiers that fire on the wrong things.

# A classifier with 0.98 AUC scored this sentence with high confidence:
"hired 500 employees to support expansion"
# The classifier was trained to detect layoff announcements.
# AUC said it was excellent. It wasn't.

AUC measures how well a classifier separates its training data. It says nothing about what happens on 151 million real sentences. We built three independent quality gates — each catches a different failure mode.

Gate 1: Separability Test

Cross-validated precision on held-out data. Can the classifier reliably separate its own training examples? If not, it hasn’t learned the concept.

THP@20 ≥ 75% · d′ ≥ 2.5 · Cluster ratio ≥ 0.5
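
For the d′ threshold, one common definition of the sensitivity index is the difference in mean score between positives and negatives divided by their pooled standard deviation. The sketch below assumes that definition; the page does not state the exact formula used, THP@20 and cluster ratio are not defined in this section and are left out, and the held-out scores are made-up toy values.

```python
from math import sqrt
from statistics import mean, variance


def d_prime(pos_scores, neg_scores):
    """Sensitivity index: mean separation in units of pooled standard deviation."""
    pooled_sd = sqrt((variance(pos_scores) + variance(neg_scores)) / 2)
    return (mean(pos_scores) - mean(neg_scores)) / pooled_sd


# Toy held-out scores for one classifier (illustrative, not real output).
held_out_pos = [0.91, 0.88, 0.95, 0.84, 0.90]
held_out_neg = [0.12, 0.20, 0.08, 0.15, 0.10]

separation = d_prime(held_out_pos, held_out_neg)
passes_d_prime = separation >= 2.5  # threshold quoted for Gate 1
```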

Gate 2: Fire Rate

Score 1M real sentences. If a classifier fires on more than a few percent, it’s detecting a topic, not an event. Specificity drives quality — rare signals are tighter signals.

< 5% for human-designed · < 1% for cluster-discovered
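
A fire-rate check along these lines takes only a few lines of code. The firing threshold of 0.5 is an assumption (the page does not say what score counts as a fire), and the sample scores are synthetic.

```python
# Limits quoted for Gate 2, keyed by how the classifier was created.
FIRE_RATE_LIMITS = {"human": 0.05, "cluster": 0.01}


def fire_rate(scores, threshold=0.5):
    """Fraction of sentences whose score meets the firing threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def passes_fire_rate(scores, source, threshold=0.5):
    """True if the classifier fires less often than the limit for its source."""
    return fire_rate(scores, threshold) < FIRE_RATE_LIMITS[source]


# Synthetic sample: 1,000 scores, 20 of which fire (2%).
sample = [0.9] * 20 + [0.1] * 980
rate = fire_rate(sample)
```

With a 2% fire rate, the same classifier would pass as human-designed but fail as cluster-discovered, which is the asymmetry the two limits encode.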

Gate 3: Coherence Audit

Sample 1,000 high-confidence positives and measure whether they cluster in embedding space. A classifier can pass AUC and fire rate but still fire on unrelated sentences that share surface vocabulary. Coherence catches this.

Silhouette ≥ 0.20 · Positives must form a tight concept
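
A silhouette score needs a second group to compare against, and the page does not say which one is used. The sketch below assumes the positives are compared with a background sample of other sentences and uses cosine distance; both choices are assumptions, and the 3-dimensional vectors stand in for 768-dimensional embeddings.

```python
from math import sqrt


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def positive_silhouette(positives, background):
    """Mean silhouette of the positives, treating the background as the other cluster."""
    total = 0.0
    for i, p in enumerate(positives):
        others = [q for j, q in enumerate(positives) if j != i]
        a = sum(cosine_distance(p, q) for q in others) / len(others)          # cohesion
        b = sum(cosine_distance(p, q) for q in background) / len(background)  # separation
        total += (b - a) / max(a, b)
    return total / len(positives)


# Toy embeddings: a tight group of positives, an unrelated background,
# and a scattered set that shares no common direction.
tight = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
scattered = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
background = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.1, 0.9, 0.3]]

coherent = positive_silhouette(tight, background) >= 0.20
incoherent = positive_silhouette(scattered, background) < 0.20
```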
Built: 977 classifiers
Passed separability: 700
Passed fire rate: 586
Passed coherence: 529 active
After boilerplate filter: 472 in production

Removing 62 incoherent classifiers (passed AUC, failed coherence) improved downstream model IC by 48%. They weren’t just useless; they were actively injecting noise. Removing bad signal beats adding good signal.

Browse the full classifier catalog →

Model Architecture: logistic regression (calibrated)
Classifier Sources: 159 human-curated + 313 cluster-discovered
Embedding Model: E5-base-v2 (768-dim)
Quality Gates: Separability + Fire Rate + Coherence
sentence embeddings · logistic regression · Platt scaling · coherence audit · cluster discovery · three-gate validation
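
To make the scoring step concrete, here is a minimal sketch of a logistic-regression classifier with Platt-style calibration applied to a sentence embedding. The class name, classifier names, weights, and calibration parameters are all hypothetical, and the embedding has 3 dimensions instead of 768 to keep the example readable. Sign conventions for Platt scaling vary; this version applies a sigmoid to `a * margin + b`.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


class CalibratedClassifier:
    """Linear classifier on an embedding, followed by Platt-style calibration."""

    def __init__(self, name, weights, bias, platt_a=1.0, platt_b=0.0):
        self.name = name
        self.weights = weights
        self.bias = bias
        self.platt_a = platt_a
        self.platt_b = platt_b

    def score(self, embedding):
        # Raw margin from the linear model, then calibrated to a probability.
        margin = sum(w * x for w, x in zip(self.weights, embedding)) + self.bias
        return sigmoid(self.platt_a * margin + self.platt_b)


def score_sentence(embedding, classifiers):
    """Score one sentence embedding with every classifier."""
    return {c.name: c.score(embedding) for c in classifiers}


classifiers = [
    CalibratedClassifier("layoffs_v1", [2.0, -1.0, 0.0], bias=-0.5),
    CalibratedClassifier("guidance_cut_v1", [0.0, 0.0, 0.0], bias=0.0),
]
embedding = [1.0, 0.5, -0.2]  # stand-in for a 768-dim E5-base-v2 vector
scores = score_sentence(embedding, classifiers)
```

Because every classifier is an independent dot product over the same embedding, scoring all of them for a sentence is cheap once the embedding has been computed.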
04
Aggregate & Deliver

Frozen Filing Vectors

Sentence-level scores are aggregated to the filing level — summed or max-pooled depending on the signal type. The result is a flat vector of classifier scores per accession number, ready to join to your pricing or fundamental data. Classifiers are frozen and immutable: the scores you backtest today will be identical to the scores you receive in production next year.
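
The aggregation step might look roughly like the following sketch, which groups sentence scores by accession number and applies sum or max pooling per classifier. The classifier names, the pooling assignments, and the accession number are hypothetical.

```python
from collections import defaultdict

AGGREGATORS = {"sum": sum, "max": max}


def filing_vectors(sentence_scores, pooling):
    """Collapse (accession, classifier, score) rows into one vector per filing.

    `pooling` maps each classifier name to "sum" or "max".
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for accession, classifier, score in sentence_scores:
        grouped[accession][classifier].append(score)
    return {
        accession: {
            clf: AGGREGATORS[pooling[clf]](scores)
            for clf, scores in by_classifier.items()
        }
        for accession, by_classifier in grouped.items()
    }


rows = [
    ("0000000000-24-000001", "layoffs_v1", 0.25),
    ("0000000000-24-000001", "layoffs_v1", 0.5),
    ("0000000000-24-000001", "guidance_cut_v1", 0.125),
    ("0000000000-24-000001", "guidance_cut_v1", 0.75),
]
vectors = filing_vectors(rows, {"layoffs_v1": "sum", "guidance_cut_v1": "max"})
```

Summing rewards filings that mention a theme repeatedly, while max pooling keeps only the single strongest sentence, which is why the choice depends on the signal type.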

🦆 DuckDB: Single file, query in-process. Zero infrastructure.
📄 Parquet: Columnar, compressed. Drop into Spark, Pandas, or Polars.
☁️ S3 Bucket: Daily incremental drops or full historical dataset.
🔌 REST API: Query by ticker, date range, or accession number.
🔒 SFTP: Scheduled drops to your existing data pipeline.
🤖 MCP Server (coming soon): Direct AI agent access. Query signals in natural language.
sum aggregation · max pooling · per-accession vectors · frozen classifiers · daily incremental · full backfill available
🔒 Frozen Classifiers

Your backtest today is your production model tomorrow.

Once a classifier version ships, its weights and thresholds are locked. We never silently retrain or update deployed classifiers. If we improve a classifier, it ships as a new versioned column — the original remains unchanged.

Backtest scores are identical to live production scores — no look-ahead, no drift
Classifier updates ship as new versioned columns, never overwriting history
Every score is attributable to a specific model version and training dataset
No discretionary adjustments, no editorial overrides — pure pipeline output
Full audit trail: accession number → sentence → score → classifier version
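
One way to picture the frozen, versioned-column guarantee is an append-only store in which each score is keyed by accession number, sentence, and versioned column. This is a sketch under assumed names (`ScoreRecord`, `ScoreStore`, the `_v1`/`_v2` column suffix); it is not a description of the actual storage layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreRecord:
    accession: str
    sentence_index: int
    classifier: str
    version: int
    score: float

    @property
    def column(self):
        # A new classifier version becomes a new column name.
        return f"{self.classifier}_v{self.version}"


class ScoreStore:
    """Append-only store: each (accession, sentence, column) can be written once."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        key = (record.accession, record.sentence_index, record.column)
        if key in self._records:
            raise ValueError(f"{key} is frozen and cannot be overwritten")
        self._records[key] = record

    def columns(self, accession):
        return sorted({k[2] for k in self._records if k[0] == accession})


store = ScoreStore()
store.add(ScoreRecord("0000000000-24-000001", 0, "layoffs", 1, 0.82))
store.add(ScoreRecord("0000000000-24-000001", 0, "layoffs", 2, 0.91))  # improved model, new column
available = store.columns("0000000000-24-000001")
```

Writing a second value for an existing versioned column raises an error, so an improved classifier can only appear alongside the original, never in place of it.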
Get Started

See the signals in action

Explore live classifier output from recent SEC filings — or request a full sample dataset for your backtest.