Signal taxonomy · the framework

Recipe.
Ingredient.
Scheduler.

Most "signal" debates collapse three different things into one bucket. We treat them as three separate layers (recipe, ingredient, scheduler), and only one of them is a moat. This page is the long version of the pillar's A/B/C cards.

A
Signal Pack · recipe
What the signal is made of
4 categories of mathematical recipe · cheap to replicate
B
Data Layer · ingredient
What the signal eats
3 substrate tiers · raises the ceiling, never a signal itself
C
Combinator · scheduler
How the signals decide together
5 fusion methods · ranked scoreboard · the moat
A · Recipe

Signal Pack: four mathematical recipes.

The shape of the conversion from data into a directional read. There are four canonical families. Most popular blog "signals" are members of one of these, and they're all cheap, all replicable, all commodities.

Signal Pack · 4 categories · 1 contract
emit() → SignalOut { dir, conf }
01 / 04

Indicator

RSI · MACD · BB · ATR · Stochastic · Ichimoku

Closed-form math on a sliding window of price/volume. Decades old, exhaustively studied, no hidden secret. The most-replicated category in retail and traditional CTA shops.

Strength: Deterministic, fast, transparent. A perfect first input to fuse, never a standalone strategy.
02 / 04

HMM · n-gram

Hidden Markov · 2/3-state regime · n-gram on volatility

Models the market as a hidden state machine. Emits which regime is active (trend / mean-revert / chop) and the probability of switch. Provides regime context an indicator alone can't see.

Strength: Catches state transitions that break indicator signals. Pair with indicators to gate them: "trade momentum only in trend regime."
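The gating idea quoted above ("trade momentum only in trend regime") might look like the sketch below. It assumes the regime model hands over its most likely state as a string plus a posterior probability; `gate_by_regime`, `regime`, and `p_regime` are hypothetical names for illustration, not part of the page's contract.

```python
from typing import NamedTuple

class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period

def gate_by_regime(momentum: SignalOut, regime: str, p_regime: float,
                   allowed: str = "trend") -> SignalOut:
    """Pass a momentum read through only when the regime model says `allowed`.

    `regime` and `p_regime` stand in for an HMM's most likely state and its
    posterior probability (hypothetical inputs for this sketch).
    """
    if regime != allowed:
        # Wrong regime: go flat rather than trade an indicator that is likely broken.
        return SignalOut(0, 0.0, momentum.horizon_s)
    # Right regime: keep the direction, discount confidence by regime certainty.
    return SignalOut(momentum.direction,
                     momentum.confidence * p_regime,
                     momentum.horizon_s)
```

Multiplying the two confidences is one simple design choice; a hard threshold on `p_regime` would be an equally reasonable gate.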
03 / 04

ML

XGBoost · LightGBM · LSTM · transformer over factor stacks

Any learned model that predicts return / direction / volatility. Quality scales with the L2 / L3 ingredient available to it. Without confidence calibration, an ML model's stated confidence runs ahead of its actual hit rate, and it does so at the worst moments.

Strength: Captures non-linear interactions across many factors. Confidence calibration is non-optional, and is exactly what the combinator needs to fuse correctly.
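The page does not say how calibration is done. One common, simple option is histogram binning: bucket the model's raw confidence, then replace it with the hit rate actually observed in that bucket. The sketch below assumes a history of raw confidences and whether each call turned out right; `fit_bins` and `calibrate` are illustrative names.

```python
def fit_bins(raw_conf: list[float], hit: list[bool], n_bins: int = 5) -> list[float]:
    """Return the empirical hit rate for each equal-width confidence bin."""
    wins = [0] * n_bins
    counts = [0] * n_bins
    for c, h in zip(raw_conf, hit):
        b = min(int(c * n_bins), n_bins - 1)   # c == 1.0 falls in the last bin
        counts[b] += 1
        wins[b] += int(h)
    # Empty bins fall back to the bin midpoint, i.e. roughly the raw score.
    return [wins[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]

def calibrate(c: float, bin_rates: list[float]) -> float:
    """Map a raw confidence to the observed hit rate of its bin."""
    n_bins = len(bin_rates)
    return bin_rates[min(int(c * n_bins), n_bins - 1)]
```

A model that says 0.9 but is right half the time in that bucket would then hand the combinator 0.5, which is the number fusion should be weighting.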
04 / 04

Factor

Fama-French · momentum · quality · low-vol · custom L2

Cross-sectional factor exposures. Sorts a universe of symbols by an exposure and produces directional reads from spreads. Needs the L2 factor library; the combinator turns the basket of exposures into one decision per symbol.

Strength: Diversification across factor risks at the portfolio level. Most useful as a scoreboard re-ranker, not a per-tick signal.
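The cross-sectional sort described for the Factor pack can be sketched in a few lines: rank the universe by one exposure, go long the top slice, short the bottom slice, stay flat in between. The function name and the 20% default quantile are assumptions for illustration.

```python
def factor_reads(exposure: dict[str, float], quantile: float = 0.2) -> dict[str, int]:
    """Long the top quantile by exposure, short the bottom, flat in between."""
    ranked = sorted(exposure, key=exposure.get)   # ascending by exposure
    k = max(1, int(len(ranked) * quantile))       # symbols per leg
    reads = {sym: 0 for sym in ranked}
    for sym in ranked[:k]:
        reads[sym] = -1                           # lowest exposure: short leg
    for sym in ranked[-k:]:
        reads[sym] = +1                           # highest exposure: long leg
    return reads
```

With several factors, each one yields its own dictionary of reads per symbol; collapsing those into one decision per symbol is the combinator's job, not the pack's.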

Shared contract

All four packs emit the same shape.

That's the whole point. The combinator doesn't care whether a directional read came from RSI, an HMM, or a transformer. It consumes a typed SignalOut with a direction and a confidence. The pack is the recipe; the contract is the interface. This separation is what makes fusion productizable.

# Every pack (indicator / hmm / ml / factor) implements this.
from typing import NamedTuple

class SignalOut(NamedTuple):
    direction:  int     # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s:  int     # expected holding period
B · Ingredient

Data Layer: three substrate tiers.

The substrate a signal is computed on. Better ingredients raise the ceiling, but a data layer is never a signal in itself. Calling "NLP alt-data" a signal is a category error.

Data Layer · L1 → L2 → L3 · cost & signal-density rise together
ingredient ≠ signal · category guard
L1 · tier 1
OHLCV · open · high · low · close · volume · per bar
The universal substrate. Every signal pack can compute on it. Free to cheap from any broker or vendor. Decades of literature; the lowest signal density but the broadest accessibility.
cost: cheap
signal density: low

L2 · tier 2
318-factor library · value · quality · momentum · low-vol · liquidity · 318 features total
A curated stack of cross-sectional factors plus engineered features. Sized to feed ML and Factor recipes without each shop rebuilding the pipeline. The first tier where the substrate itself adds informational edge.
cost: moderate
signal density: medium

L3 · tier 3
Alternative data · NLP on filings & news · macro feeds · on-chain · satellite · weather
Non-market substrate that informs market behavior. Most expensive tier, with the highest noise-per-dollar. Useful when fused; calling alt-data "a signal" is precisely the category error this taxonomy is built to prevent.
cost: high
signal density: uneven
A × B

Where recipe meets ingredient.

Not every recipe makes sense on every ingredient. The matrix below marks which intersections ship today, which are on the roadmap, and which sit at the speculative far edge. Every cell feeds Layer C.

Recipe × Ingredient · what Alpha Factory currently fuses
ref · pillar §matrix
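[Matrix: rows are recipes, columns are ingredients, each cell carries one status from the legend]
Columns: L1 · OHLCV (price · volume) | L2 · 318-factor (factor library) | L3 · alt-data (NLP · macro · on-chain)
Rows: Indicator (RSI · MACD · BB) | HMM · n-gram (regime detect) | ML (classifier · regressor) | Factor (Fama-French · custom)
Output: Layer C · Combinator
Legend: shipping · roadmap · R&D · not applicable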
C · Scheduler · moat

Combinator: the only layer that's a moat.

Read at the right altitude: a combinator is anything that fuses many signals into one decision faster than the market can. It's the only layer in this taxonomy that's a moat, and the only one Alpha Factory productizes.

01

It fuses. It doesn't just route.

A marketplace matches buyers and sellers. A combinator weights, schedules, and ranks. Five built-in fusion methods take many noisy reads and emit one bet plus a confidence: the unit a portfolio can act on.
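The page's diagram labels the five methods eq, conf, vote, max, and min but does not define them. The sketch below is one plausible reading, not Alpha Factory's implementation: equal-weight mean, confidence-weighted mean, majority vote, most-confident read wins, and a conservative "act only on unanimity at the weakest confidence". `FusedOut` is given a hypothetical two-field shape.

```python
from typing import NamedTuple

class SignalOut(NamedTuple):
    direction: int      # -1, 0, +1
    confidence: float   # [0.0, 1.0]
    horizon_s: int      # expected holding period

class FusedOut(NamedTuple):   # hypothetical shape; the page does not define FusedOut
    direction: int
    confidence: float

def _sign(x: float) -> int:
    return (x > 0) - (x < 0)

def fuse(reads: list[SignalOut], method: str = "conf") -> FusedOut:
    """Fuse many reads into one bet plus a confidence (assumed semantics)."""
    if not reads:
        return FusedOut(0, 0.0)
    if method == "eq":        # equal weight: plain mean of directions
        score = sum(r.direction for r in reads) / len(reads)
    elif method == "conf":    # confidence-weighted mean of directions
        total = sum(r.confidence for r in reads)
        score = (sum(r.direction * r.confidence for r in reads) / total
                 if total else 0.0)
    elif method == "vote":    # majority vote; confidence = winning share
        net = _sign(sum(r.direction for r in reads))
        share = sum(r.direction == net for r in reads) / len(reads)
        return FusedOut(net, share if net else 0.0)
    elif method == "max":     # follow the single most confident read
        best = max(reads, key=lambda r: r.confidence)
        return FusedOut(best.direction, best.confidence)
    elif method == "min":     # unanimity required, at the weakest confidence
        dirs = {r.direction for r in reads}
        if len(dirs) != 1:
            return FusedOut(0, 0.0)
        return FusedOut(dirs.pop(), min(r.confidence for r in reads))
    else:
        raise ValueError(f"unknown method: {method}")
    return FusedOut(_sign(score), abs(score))
```

Because every method consumes the same `SignalOut` shape, switching methods is a one-argument change and the packs upstream never notice.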

02

It can wear different shapes.

Medallion's combinator is an ensemble of weak signals. Jane Street's is a cross-asset quoting engine. Same architectural pattern, different surface. Alpha Factory's is the productized, configurable version.

03

It outlives its inputs.

Individual signals decay and get replaced. The combinator is the layer that survives the turnover. That's why Medallion compounds through founder retirement and signal-stack rewrites. The moat is in the fusion layer, not in any one input.

[Diagram: RSI · HMM · ML · FACTOR → Combinator (5 methods: eq · conf · vote · max · min) → LONG]
many noisy reads → one confident bet
04 · Category guard

Three common confusions.

Failures of this taxonomy in the wild. Each one collapses two layers into one, and each one breaks an architectural decision later.

Confusion · 01

Calling alt-data "a signal."

"Our alpha is NLP on 10-K filings."
NLP on 10-Ks is a Data Layer.

The data is L3 substrate. Without a recipe (ML, factor) computing on it and a combinator fusing it, there's no signal yet. The claim hides that two of the three layers are missing.

Confusion · 02

Selling signals as a finished product.

"License our momentum signal."
The buyer needs a Combinator.

A signal is one read among many a buyer must fuse. The buyer always brings their own combinator, so they're paying for data, not for fusion. Documented at length in the QuantConnect case.

Confusion · 03

One model = a strategy.

"We have a stat-arb strategy."
You have one Signal Pack.

A single recipe with no combinator above it is a single-regime bet. LTCM is the canonical example: one model, levered, no fusion, dead in four months when the regime broke.

05 · The contracts

One layer, one contract.

Code signatures for the three layers, side by side. The shared shape across A and C is what lets the combinator stay agnostic to which recipe produced a read.

A · Signal Pack · recipe

The recipe contract.

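# Layer 1
from __future__ import annotations   # DataSpec / Window / SignalOut live elsewhere
from typing import Protocol

class AlphaSignal(Protocol):
  def name(self) -> str: ...
  def required_data(self) -> DataSpec: ...
  def emit(self, w: Window) -> SignalOut: ...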

Each recipe declares which data layer it needs and emits one SignalOut per call.
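To make the recipe contract concrete, here is a minimal sketch of one Indicator pack that satisfies it. Everything beyond the three method names is assumed: `Window` is given a hypothetical `closes` field, `Cadence` a hypothetical `HOUR` member, the RSI is the simple-average variant rather than Wilder's smoothed version, and the 30/70 thresholds and one-hour horizon are illustrative defaults.

```python
from enum import Enum
from typing import NamedTuple

class SignalOut(NamedTuple):
    direction: int
    confidence: float
    horizon_s: int

class Cadence(Enum):              # hypothetical; the page never defines Cadence
    HOUR = 3600

class DataSpec(NamedTuple):
    tier: int
    fields: tuple[str, ...]
    cadence: Cadence
    history: int                  # bars

class Window(NamedTuple):         # hypothetical; the page never defines Window
    closes: tuple[float, ...]

class RsiPack:
    """Hypothetical Indicator recipe: fade RSI extremes."""

    def __init__(self, period: int = 14, low: float = 30.0, high: float = 70.0):
        self.period, self.low, self.high = period, low, high

    def name(self) -> str:
        return f"rsi{self.period}"

    def required_data(self) -> DataSpec:
        # L1 closes only; one extra bar so `period` price changes are available.
        return DataSpec(1, ("close",), Cadence.HOUR, self.period + 1)

    def emit(self, w: Window) -> SignalOut:
        closes = w.closes[-(self.period + 1):]
        if len(closes) < self.period + 1:
            return SignalOut(0, 0.0, 3600)        # not enough history: stay flat
        deltas = [b - a for a, b in zip(closes, closes[1:])]
        gain = sum(d for d in deltas if d > 0)
        loss = -sum(d for d in deltas if d < 0)
        if loss == 0:
            rsi = 100.0 if gain > 0 else 50.0     # flat window reads as neutral
        else:
            rsi = 100.0 - 100.0 / (1.0 + gain / loss)
        if rsi >= self.high:                      # overbought: short read
            return SignalOut(-1, (rsi - self.high) / (100.0 - self.high), 3600)
        if rsi <= self.low:                       # oversold: long read
            return SignalOut(+1, (self.low - rsi) / self.low, 3600)
        return SignalOut(0, 0.0, 3600)
```

Nothing in the class mentions fusion. It declares what it eats, emits one `SignalOut`, and leaves the rest to the layer above.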

B · Data Layer · ingredient

The ingredient contract.

# Layer 0: substrate
from __future__ import annotations   # Cadence is defined elsewhere
from typing import NamedTuple

class DataSpec(NamedTuple):
  tier:     int   # 1/2/3
  fields:   tuple[str, ...]
  cadence:  Cadence
  history:  int   # bars

A typed specification of what a recipe needs. Never itself a signal, just declares the substrate.
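One thing a typed spec buys is a cheap pre-flight check: before running a recipe, confirm the feed can actually serve what it declared. The sketch below assumes tiers are cumulative (an L2 feed also carries L1 data) and invents the `Cadence` members and the `can_serve` helper for illustration.

```python
from enum import Enum
from typing import NamedTuple

class Cadence(Enum):              # hypothetical members; not defined on the page
    MINUTE = 60
    HOUR = 3600
    DAY = 86400

class DataSpec(NamedTuple):
    tier: int                     # 1/2/3
    fields: tuple[str, ...]
    cadence: Cadence
    history: int                  # bars

def can_serve(spec: DataSpec, feed_tier: int, feed_fields: set[str],
              bars_available: int) -> bool:
    """True if a feed covers the tier, fields, and history a recipe asked for."""
    return (feed_tier >= spec.tier                # assumes higher tiers include lower
            and set(spec.fields) <= feed_fields   # every requested field is present
            and bars_available >= spec.history)   # enough bars of lookback
```

The check returns a plain boolean and never a direction, which keeps the ingredient layer on its side of the category guard.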

C · Combinator · scheduler · moat

The scheduler contract.

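# Layer 2
from __future__ import annotations   # FusionMethod / FusedOut / Scoreboard live elsewhere
from typing import Protocol

class Combinator(Protocol):
  method: FusionMethod
  def fuse(self, ss: list[SignalOut]) -> FusedOut: ...
  def rank(self, all) -> Scoreboard: ...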

The layer the recipes know nothing about. Five built-in methods; the method is a runtime parameter.


Three layers.
One moat. Don't mix them up.