Most "signal" debates collapse three different things into one bucket. We treat them as three separate layers (recipe, ingredient, scheduler), and only one of them is a moat. This page is the long version of the pillar's A/B/C cards.
Layer A, the recipe, is the shape of the conversion from data into a directional read. There are four canonical families. Most popular blog "signals" are members of one of these, and they're all cheap, all replicable, all commodities.
Indicators are closed-form math on a sliding window of price/volume. Decades old, exhaustively studied, no hidden secret. The most-replicated category in retail and traditional CTA shops.
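To make "closed-form math on a sliding window" concrete, here is a minimal sketch of an RSI pack. It uses the simple-average RSI variant (not Wilder's smoothing), the conventional 30/70 thresholds read as mean-reversion, and a linear confidence mapping. The function names, the confidence mapping, and the default `horizon_s` are illustrative assumptions, not the shipped pack.

```python
from typing import NamedTuple, Sequence


class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period


def rsi(closes: Sequence[float], period: int = 14) -> float:
    """Simple-average RSI over the last `period` price changes."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    window = closes[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains == 0 and losses == 0:
        return 50.0     # flat window: neutral reading
    if losses == 0:
        return 100.0    # only gains in the window
    rs = gains / losses  # ratio of sums equals ratio of averages
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_pack(closes: Sequence[float], period: int = 14,
             low: float = 30.0, high: float = 70.0,
             horizon_s: int = 3600) -> SignalOut:
    """Hypothetical pack: oversold reads long, overbought reads short."""
    value = rsi(closes, period)
    if value <= low:
        # Assumed mapping: deeper below the threshold means more confidence.
        return SignalOut(+1, (low - value) / low, horizon_s)
    if value >= high:
        return SignalOut(-1, (value - high) / (100.0 - high), horizon_s)
    return SignalOut(0, 0.0, horizon_s)
```

Nothing here is secret or hard to copy, which is the point of the category: the whole recipe fits in two short functions.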
An HMM models the market as a hidden state machine. It emits which regime is active (trend / mean-revert / chop) and the probability of a switch. It provides regime context that an indicator alone can't see.
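A regime read of this kind can be sketched as a forward filter over a three-state hidden Markov model. Everything numeric below is an assumption for illustration: the observation is a hypothetical feature (for example a rolling return autocorrelation, positive in trends and negative in mean-reversion), and the transition matrix and Gaussian emission parameters are hand-set, not fitted.

```python
import math
from typing import Sequence

REGIMES = ("trend", "mean_revert", "chop")

# Hand-set, illustrative parameters (assumptions, not fitted values).
PRIOR = (1 / 3, 1 / 3, 1 / 3)
TRANS = (
    (0.90, 0.05, 0.05),   # from trend
    (0.05, 0.90, 0.05),   # from mean_revert
    (0.10, 0.10, 0.80),   # from chop
)
MEANS = (0.3, -0.3, 0.0)  # expected feature value per regime
STDS = (0.15, 0.15, 0.15)


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(obs: Sequence[float], prior=PRIOR, trans=TRANS,
                   means=MEANS, stds=STDS) -> list:
    """Return P(regime at the last step | all observations so far)."""
    belief = list(prior)
    n = len(belief)
    for t, x in enumerate(obs):
        if t > 0:
            # Predict: push the belief through the transition matrix.
            belief = [sum(belief[i] * trans[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight each regime by the likelihood of the observation.
        weighted = [belief[j] * gaussian_pdf(x, means[j], stds[j])
                    for j in range(n)]
        total = sum(weighted)
        if total == 0.0:
            raise ValueError("observation has zero likelihood in every regime")
        belief = [w / total for w in weighted]
    return belief


def switch_probability(belief: Sequence[float], trans=TRANS) -> float:
    """Probability that the next step is in a different regime."""
    stay = sum(belief[i] * trans[i][i] for i in range(len(belief)))
    return 1.0 - stay
```

The active regime is then `REGIMES[belief.index(max(belief))]`, and `switch_probability(belief)` is the second number the pack would report.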
ML is any learned model that predicts return / direction / volatility. Quality scales with the L2 / L3 ingredient available to it. Without confidence calibration, ML outputs are overconfident at the worst moments.
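One simple way to calibrate is histogram binning: group past predictions by raw probability and replace each raw value with the hit rate actually observed in its bin. The sketch below is a generic technique, not a description of the product's calibration step; the function names and the probability-to-confidence mapping are assumptions.

```python
from typing import Optional, Sequence


def fit_bins(raw_probs: Sequence[float], outcomes: Sequence[int],
             n_bins: int = 5) -> list:
    """Observed hit rate per equal-width probability bin (None if empty)."""
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(raw_probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        hits[b] += y
        counts[b] += 1
    return [h / c if c else None for h, c in zip(hits, counts)]


def calibrate(p: float, bins: Sequence[Optional[float]]) -> float:
    """Map a raw probability to the hit rate seen in its bin."""
    b = min(int(p * len(bins)), len(bins) - 1)
    rate = bins[b]
    return p if rate is None else rate  # no history for this bin: pass through


def to_direction_confidence(p_up: float) -> tuple:
    """Assumed mapping from P(up) to the (direction, confidence) pair."""
    direction = 1 if p_up > 0.5 else -1 if p_up < 0.5 else 0
    return direction, abs(2.0 * p_up - 1.0)


# A model that says 0.95 but was right only 6 times out of 10.
bins = fit_bins([0.95] * 10, [1] * 6 + [0] * 4)
calibrated = calibrate(0.95, bins)  # shrinks 0.95 to the observed 0.6
```

Fed the raw 0.95, the mapping would report a confidence of 0.9; fed the calibrated value, it reports roughly 0.2, which is what the track record supports.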
Factor packs compute cross-sectional factor exposures. Each one sorts a universe of symbols by an exposure and produces directional reads from spreads. It needs the L2 factor library; the combinator turns the basket of exposures into one decision per symbol.
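The sort-and-spread step can be sketched as follows, assuming one exposure per symbol, a long read for the top quantile and a short read for the bottom one. The quantile, the rank-based confidence, and the function name are illustrative assumptions.

```python
from typing import NamedTuple


class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period


def cross_sectional_reads(exposures: dict, quantile: float = 0.2,
                          horizon_s: int = 86400) -> dict:
    """Rank symbols by exposure; top quantile reads +1, bottom reads -1."""
    n = len(exposures)
    if n < 2:
        raise ValueError("a cross-section needs at least two symbols")
    ranked = sorted(exposures, key=exposures.get)  # ascending exposure
    k = max(1, int(n * quantile))                  # symbols per tail
    reads = {}
    for pos, symbol in enumerate(ranked):
        if pos < k:
            direction = -1
        elif pos >= n - k:
            direction = +1
        else:
            direction = 0
        # Assumed confidence: distance of the rank from the middle, in [0, 1].
        confidence = abs(2.0 * pos / (n - 1) - 1.0) if direction else 0.0
        reads[symbol] = SignalOut(direction, confidence, horizon_s)
    return reads


reads = cross_sectional_reads(
    {"A": 0.1, "B": 0.5, "C": -0.3, "D": 0.9, "E": 0.0}
)
```

With five symbols and a 20% quantile, only the extremes get a direction: `C` reads short, `D` reads long, and the middle three stay flat.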
The four families are interchangeable, and that's the whole point. The combinator doesn't care whether a directional read came from RSI, an HMM, or a transformer. It consumes a typed SignalOut with a direction and a confidence. The pack is the recipe; the contract is the interface. This separation is what makes fusion productizable.
```python
from typing import NamedTuple

# Every pack (indicator / hmm / ml / factor) emits this.
class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period
```
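Because the contract is the only thing the combinator sees, it is worth checking at the boundary. The sketch below enforces the ranges given in the comments above; the `validate` helper and the requirement that `horizon_s` be positive are assumptions added for illustration.

```python
from typing import NamedTuple


class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period


def validate(out: SignalOut) -> SignalOut:
    """Hypothetical boundary check: reject reads that break the contract."""
    if out.direction not in (-1, 0, 1):
        raise ValueError(f"direction must be -1, 0 or +1, got {out.direction}")
    if not 0.0 <= out.confidence <= 1.0:  # also rejects NaN
        raise ValueError(f"confidence must be in [0, 1], got {out.confidence}")
    if out.horizon_s <= 0:                # assumed: horizons are positive
        raise ValueError(f"horizon_s must be positive, got {out.horizon_s}")
    return out
```

A pack that passes this check can be swapped for any other without the consumer noticing, whatever recipe produced the read.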
Layer B, the ingredient, is the substrate a signal is computed on. Better ingredients raise the ceiling, but a data layer is never a signal in itself. Calling "NLP alt-data" a signal is a category error.
Not every recipe makes sense on every ingredient. The matrix below marks the currently shipping intersections (✓), the roadmap (◐), and the speculative far edge (◌), all of which feed Layer C below.
| Recipe | L1 · OHLCV (price · volume) | L2 · 318-factor (factor library) | L3 · alt-data (NLP · macro · on-chain) |
|---|---|---|---|
| Indicator (RSI · MACD · BB) | ✓ | — | — |
| HMM · n-gram (regime detect) | ✓ | — | — |
| ML (classifier · regressor) | ✓ | ◐ | ◌ |
| Factor (Fama-French · custom) | — | ◐ | ◌ |
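The matrix can also be carried as data, so a recipe is never wired to an ingredient it doesn't ship on. This is a sketch under assumptions: the status names are a reading of the three symbols in the matrix, and the keys, `Status` enum, and `can_run` helper are hypothetical.

```python
from enum import Enum


class Status(Enum):
    SHIPPING = "shipping"        # assumed meaning of the check mark
    ROADMAP = "roadmap"          # assumed meaning of the half circle
    SPECULATIVE = "speculative"  # assumed meaning of the dotted circle
    NOT_APPLICABLE = "n/a"       # the dash cells


S, R, X, N = (Status.SHIPPING, Status.ROADMAP,
              Status.SPECULATIVE, Status.NOT_APPLICABLE)

# recipe family -> data tier (1, 2, 3) -> status, copied from the matrix
MATRIX = {
    "indicator": {1: S, 2: N, 3: N},
    "hmm":       {1: S, 2: N, 3: N},
    "ml":        {1: S, 2: R, 3: X},
    "factor":    {1: N, 2: R, 3: X},
}


def can_run(recipe: str, tier: int) -> bool:
    """True only for combinations marked as shipping today."""
    return MATRIX[recipe][tier] is Status.SHIPPING
```

For example, `can_run("ml", 1)` is true, while `can_run("factor", 2)` is false until the roadmap cell ships.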
Layer C, the scheduler, is the combinator. Read at the right altitude, a combinator is anything that fuses many signals into one decision faster than the market can. It's the only layer in this taxonomy that's a moat, and the only one Alpha Factory productizes.
A marketplace matches buyers and sellers. A combinator weights, schedules, and ranks. Five built-in fusion methods take many noisy reads and emit one bet plus a confidence: the unit a portfolio can act on.
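As an illustration of "many noisy reads in, one bet plus a confidence out", here is a confidence-weighted vote. It is a generic fusion rule chosen for the example and is not claimed to be one of the five built-in methods; the `FusedOut` fields and the `dead_zone` parameter are assumptions.

```python
from typing import NamedTuple, Sequence


class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period


class FusedOut(NamedTuple):  # assumed shape: one bet plus a confidence
    direction: int
    confidence: float


def fuse_confidence_weighted(reads: Sequence[SignalOut],
                             dead_zone: float = 0.1) -> FusedOut:
    """Average the directions, weighting each read by its confidence."""
    total = sum(r.confidence for r in reads)
    if total == 0:
        return FusedOut(0, 0.0)      # nothing confident enough to count
    score = sum(r.direction * r.confidence for r in reads) / total
    if abs(score) < dead_zone:
        return FusedOut(0, abs(score))  # reads cancel out: no bet
    return FusedOut(1 if score > 0 else -1, abs(score))


fused = fuse_confidence_weighted([
    SignalOut(+1, 0.8, 3600),   # e.g. an indicator pack
    SignalOut(+1, 0.4, 3600),   # e.g. an ML pack
    SignalOut(-1, 0.3, 3600),   # e.g. a factor pack
])
```

Here the two long reads outweigh the short one: the weighted score is 0.9 / 1.5, so the fused bet is long with a confidence of about 0.6.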
Medallion's combinator is an ensemble of weak signals. Jane Street's is a cross-asset quoting engine. Same architectural pattern, different surface. Alpha Factory's is the productized, configurable version.
Individual signals decay and get replaced. The combinator is the layer that survives the turnover. That's why Medallion compounds through founder retirement and signal-stack rewrites. The moat is in the fusion layer, not in any one input.
Failures of this taxonomy in the wild. Each one collapses two layers into one, and each one breaks an architectural decision later.
Calling alt-data a signal: the data is L3 substrate. Without a recipe (ML, factor) computing on it and a combinator fusing it, there's no signal yet. The conversation hides which two of the three layers are missing.
Selling a signal as the product: a signal is one read among many that a buyer must fuse. The buyer always brings their own combinator, so they're paying for data, not for fusion. Documented at length in the QuantConnect case.
Running one recipe with nothing above it: a single recipe with no combinator is a single-regime bet. LTCM is the canonical example: one model, levered, no fusion, dead in four months when the regime broke.
Code signatures for the three layers, side by side. The shared shape across A and C is what lets the combinator stay agnostic to which recipe produced a read.
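```python
from __future__ import annotations  # DataSpec, Window, SignalOut live elsewhere
from typing import Protocol

# Layer 1: recipe (Layer A)
class AlphaSignal(Protocol):
    def name(self) -> str: ...
    def required_data(self) -> DataSpec: ...
    def emit(self, w: Window) -> SignalOut: ...
```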
Each recipe declares which data layer it needs and emits one SignalOut per call.
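```python
from __future__ import annotations  # Cadence is defined elsewhere
from typing import NamedTuple

# Layer 0: substrate (Layer B)
class DataSpec(NamedTuple):
    tier: int                 # 1/2/3
    fields: tuple[str, ...]
    cadence: Cadence
    history: int              # bars
```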
A typed specification of what a recipe needs. Never itself a signal, just declares the substrate.
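A spec like this is useful because it can be checked before a recipe ever runs. The sketch below compares a `DataSpec` against what a feed actually offers. The `Cadence` members, the `unmet` helper, and the example field names are hypothetical; the source only names the `Cadence` type.

```python
from enum import Enum
from typing import NamedTuple


class Cadence(Enum):  # hypothetical members, in seconds per bar
    MINUTE = 60
    HOUR = 3600
    DAY = 86400


class DataSpec(NamedTuple):
    tier: int                 # 1/2/3
    fields: tuple
    cadence: Cadence
    history: int              # bars


def unmet(spec: DataSpec, have_fields: set, have_bars: int) -> list:
    """List every requirement the available data fails to meet."""
    problems = [f"missing field: {f}" for f in spec.fields
                if f not in have_fields]
    if have_bars < spec.history:
        problems.append(f"need {spec.history} bars, have {have_bars}")
    return problems


# What a 14-period RSI pack might declare (illustrative values).
rsi_spec = DataSpec(tier=1, fields=("close",), cadence=Cadence.HOUR, history=15)
```

An empty list from `unmet` means the recipe can run; anything else is a wiring error caught before a single read is emitted.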
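```python
from __future__ import annotations  # FusionMethod, FusedOut, Scoreboard live elsewhere
from typing import Protocol

# Layer 2: combinator (Layer C)
class Combinator(Protocol):
    method: FusionMethod
    def fuse(self, ss: list[SignalOut]) -> FusedOut: ...
    def rank(self, all) -> Scoreboard: ...
```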
The layer the recipes know nothing about. Five built-in methods; the method is a runtime parameter.
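"The method is a runtime parameter" can be sketched as a lookup from an enum to a function. The two methods below (majority vote and highest confidence wins) are stand-ins picked for the example, not the five built-in ones; the `FusionMethod` members, the `FusedOut` fields, and the `SimpleCombinator` class are assumptions.

```python
from enum import Enum
from typing import NamedTuple, Sequence


class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period


class FusedOut(NamedTuple):  # assumed shape
    direction: int
    confidence: float


class FusionMethod(Enum):    # hypothetical members
    MAJORITY = "majority"
    MAX_CONFIDENCE = "max_confidence"


def _majority(reads: Sequence[SignalOut]) -> FusedOut:
    net = sum(r.direction for r in reads)
    direction = (net > 0) - (net < 0)  # sign of the net vote
    if direction == 0:
        return FusedOut(0, 0.0)
    agree = sum(1 for r in reads if r.direction == direction)
    return FusedOut(direction, agree / len(reads))


def _max_confidence(reads: Sequence[SignalOut]) -> FusedOut:
    best = max(reads, key=lambda r: r.confidence)
    return FusedOut(best.direction, best.confidence)


_METHODS = {
    FusionMethod.MAJORITY: _majority,
    FusionMethod.MAX_CONFIDENCE: _max_confidence,
}


class SimpleCombinator:
    def __init__(self, method: FusionMethod) -> None:
        self.method = method  # can be reassigned at runtime

    def fuse(self, ss: Sequence[SignalOut]) -> FusedOut:
        if not ss:
            return FusedOut(0, 0.0)
        return _METHODS[self.method](ss)


reads = [SignalOut(1, 0.2, 60), SignalOut(1, 0.3, 60), SignalOut(-1, 0.9, 60)]
by_vote = SimpleCombinator(FusionMethod.MAJORITY).fuse(reads)
by_best = SimpleCombinator(FusionMethod.MAX_CONFIDENCE).fuse(reads)
```

The same three reads give a long bet under majority vote and a short bet under highest confidence, without any recipe changing.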
Three layers. One moat. Don't mix them up.