Strategy Hallucination
The model emits plausible trading logic that looks quantitative but lacks market rationale or statistical grounding, or rests on invalid assumptions.
The resulting code passes syntax checks while smuggling in false causality.
Structured Insight
A structured comparison of four generations of trading software: indicator-era platforms, strategy frameworks, system pipelines, and LLM prompt-to-code workflows.
The key shift is not only language or UX. Each generation changed the core abstraction traders used to express, test, and operationalize strategy ideas.
| Dimension | Gen 1: Indicator Era (~2005-2012) | Gen 2: Strategy Era (~2012-2018) | Gen 3: System Era (~2018-2023) | Gen 4: LLM Era (2023-present) |
|---|---|---|---|---|
| Time period | ~2005-2012 | ~2012-2018 | ~2018-2023 | 2023-present |
| Representative products | MT4 / MQL4 | Backtrader, freqtrade, vnpy | QuantConnect, WorldQuant BRAIN | ChatGPT + brokerage API |
| Core abstraction | Modular indicators | Packaged strategy logic: entry, exit, sizing | Feature engineering pipeline | Natural language to code |
| Typical workflow | Drag indicators, set conditions, backtest | Write strategy code, optimize parameters, backtest | Build feature pipeline, train, validate, execute | Prompt, generate strategy, backtest |
| Main progress | Indicators moved from books and forums into reusable components | Complete strategies became packageable, shareable, and reproducible | The pipeline became the product, with standardized validation | Generation is fast and the entry barrier is low |
| Typical trap | Indicator worship and holy-grail thinking | Parameter overfitting: genetic search finds coincidence, not robustness | Data leakage and survivorship bias | Strategy hallucination and backtest overfitting blindness |
| Failure root cause | Rules are manual and lack systematic validation | The optimization target is backtest return, not robustness | Model complexity hides data-quality problems | The model skips infrastructure accumulated across the first three generations |
LLMs lower the cost of producing strategy code, but they do not remove the need for validation infrastructure.
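Why cheap generation raises the stakes for validation can be sketched with a small simulation. The example below is an illustration under stated assumptions, not a description of any particular platform: every "variant" is pure Gaussian noise with zero true edge, and the helper names (`sharpe`, `best_of_random_variants`), the 250/250 split, and the 1% daily volatility are all hypothetical choices made for the demo. It picks the variant with the best in-sample Sharpe ratio and then scores that same variant on a holdout period it was not selected on.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (mean / sample stdev), not annualized."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def best_of_random_variants(n_variants, n_days=500, split=250, seed=0):
    """Simulate variants with zero true edge and select the best in-sample one.

    Returns (in_sample_sharpe, holdout_sharpe) for the selected variant.
    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    variants = [
        [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        for _ in range(n_variants)
    ]
    # Selection uses only the first `split` days, like a backtest would.
    best = max(variants, key=lambda r: sharpe(r[:split]))
    return sharpe(best[:split]), sharpe(best[split:])


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        in_sample, holdout = best_of_random_variants(n)
        print(f"variants={n:5d}  in-sample={in_sample:+.3f}  holdout={holdout:+.3f}")
```

Because the selection step only ever sees the first half of the data, the best in-sample score can only stay the same or grow as more variants are tried, while the holdout score of a no-edge variant has no reason to follow it. A holdout check like this is one small piece of the validation infrastructure that fast generation does not replace.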
Backtest Overfitting Blindness
The model treats a profitable backtest as validation and misses leakage, parameter mining, unstable regimes, or survivorship bias.
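It accelerates curve fitting because generation speed multiplies untested variants.

```python
# Pseudocode: spread between the two legs, scaled by a hedge ratio
spread = asset_a.close - hedge_ratio * asset_b.close
# Standardize the spread (as written, mean and std use the whole sample)
z_score = (spread - spread.mean()) / spread.std()
if z_score > 2:
    # Spread is high: short A, long B
    short(asset_a)
    long(asset_b)
elif z_score < -2:
    # Spread is low: long A, short B
    long(asset_a)
    short(asset_b)
```

Model + Tools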
LLMs are useful when wrapped in tools. A serious trading stack includes data validation, pricing engines, execution logic, risk controls, monitoring, and deployment discipline.
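As a minimal sketch of what wrapping a generated rule in tools can look like, the z-score pairs rule shown earlier can be placed behind an input check and recomputed with trailing statistics, so each bar is scored only against data that was available before it. The function names (`validate_prices`, `trailing_zscore`, `pair_signal`) and the 20-bar window are assumptions made for illustration; only the ±2 entry threshold comes from the original snippet. This covers the data-validation layer alone and says nothing about pricing, execution, or risk.

```python
import math
import statistics


def validate_prices(prices):
    """Reject series containing missing, non-finite, or non-positive prices."""
    for i, p in enumerate(prices):
        if p is None or not math.isfinite(p) or p <= 0:
            raise ValueError(f"bad price at index {i}: {p!r}")


def trailing_zscore(spread, window):
    """Z-score of each point against the `window` points strictly before it.

    Entries without enough history, or with zero dispersion, are None.
    """
    out = []
    for t in range(len(spread)):
        if t < window:
            out.append(None)
            continue
        hist = spread[t - window:t]  # excludes spread[t], so no look-ahead
        sd = statistics.stdev(hist)
        out.append((spread[t] - statistics.mean(hist)) / sd if sd > 0 else None)
    return out


def pair_signal(close_a, close_b, hedge_ratio, window=20, entry=2.0):
    """Per-bar target: -1 = short A / long B, +1 = long A / short B, 0 = flat."""
    if len(close_a) != len(close_b):
        raise ValueError("price series must be aligned")
    validate_prices(close_a)
    validate_prices(close_b)
    spread = [a - hedge_ratio * b for a, b in zip(close_a, close_b)]
    signals = []
    for z in trailing_zscore(spread, window):
        if z is None:
            signals.append(0)   # not enough history: stay flat
        elif z > entry:
            signals.append(-1)
        elif z < -entry:
            signals.append(1)
        else:
            signals.append(0)
    return signals
```

The design choice worth noting is that the validation step fails loudly instead of silently dropping bad bars, so a data problem surfaces before it can shape a backtest result.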
Jane Street-style systems show the pattern: the model is one layer inside a larger toolchain, not the whole product.
StratCraft Positioning
StratCraft is not a Gen 5 claim. It gives strategy-framework and system-pipeline developers local C++-grade backtesting performance, plugin isolation, and repeatable validation workflows.
Use the local backtest engine and plugin ecosystem to move from generated ideas to validated systems.