Structured Insight

Trading Software Generations: From MT4 to LLM

A structured comparison of four generations of trading software: indicator-era platforms, strategy frameworks, system pipelines, and LLM prompt-to-code workflows.

Four Generations of Trading Software

The key shift is not only language or UX. Each generation changed the core abstraction traders used to express, test, and operationalize strategy ideas.

| Dimension | Gen 1: Indicator Era | Gen 2: Strategy Era | Gen 3: System Era | Gen 4: LLM Era |
| --- | --- | --- | --- | --- |
| Time period | ~2005-2012 | ~2012-2018 | ~2018-2023 | 2023-present |
| Representative products | MT4 / MQL4 | Backtrader, freqtrade, vnpy | QuantConnect, WorldQuant BRAIN | ChatGPT + brokerage API |
| Core abstraction | Modular indicators | Packaged strategy logic: entry, exit, sizing | Feature engineering pipeline | Natural language to code |
| Typical workflow | Drag indicators, set conditions, backtest | Write strategy code, optimize parameters, backtest | Build feature pipeline, train, validate, execute | Prompt, generate strategy, backtest |
| Main progress | Indicators moved from books and forums into reusable components | Complete strategies became packageable, shareable, and reproducible | The pipeline became the product, with standardized validation | Generation is fast and the entry barrier is low |
| Typical trap | Indicator worship and holy-grail thinking | Parameter overfitting: genetic search finds coincidence, not robustness | Data leakage and survivorship bias | Strategy hallucination and backtest overfitting blindness |
| Failure root cause | Rules are manual and lack systematic validation | The optimization target is backtest return, not robustness | Model complexity hides data-quality problems | The model skips infrastructure accumulated across the first three generations |

Two LLM-Era Failure Modes

LLMs lower the cost of producing strategy code, but they do not remove the need for validation infrastructure.

Strategy Hallucination

The model emits plausible trading logic that looks quantitative but lacks a market rationale or statistical grounding, or rests on invalid assumptions.

It creates code that passes syntax checks while smuggling in false causality.
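
One way to probe whether a generated signal has any statistical grounding is a permutation test: shuffle the signal against the returns many times and count how often a shuffled signal performs at least as well as the real one. The sketch below is a minimal illustration using only the standard library. The function names and the synthetic data are assumptions made for this example, and a plain shuffle ignores autocorrelation, so real return series would likely call for a block-based variant.

```python
import random
import statistics


def mean_signal_return(signal, next_returns):
    """Average next-period return earned by trading the sign of the signal."""
    pnl = [(1 if s > 0 else -1 if s < 0 else 0) * r
           for s, r in zip(signal, next_returns)]
    return statistics.fmean(pnl)


def permutation_p_value(signal, next_returns, n_shuffles=2000, seed=0):
    """Share of shuffled signals that do at least as well as the real one."""
    rng = random.Random(seed)
    observed = mean_signal_return(signal, next_returns)
    shuffled = list(signal)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mean_signal_return(shuffled, next_returns) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)


# Synthetic data (hypothetical): one signal unrelated to returns,
# and one constructed to track returns, for contrast.
rng = random.Random(42)
returns = [rng.gauss(0.0, 0.01) for _ in range(500)]
noise_signal = [rng.gauss(0.0, 1.0) for _ in range(500)]
informed_signal = [r + rng.gauss(0.0, 0.01) for r in returns]

p_noise = permutation_p_value(noise_signal, returns)
p_informed = permutation_p_value(informed_signal, returns)
print(f"noise signal p-value:    {p_noise:.4f}")
print(f"informed signal p-value: {p_informed:.4f}")
```

A small p-value does not supply a market rationale; it only says the relationship is unlikely to be a shuffle artifact. A large one is a strong hint that the logic is decorative.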

Backtest Overfitting Blindness

The model treats a profitable backtest as validation and misses leakage, parameter mining, unstable regimes, or survivorship bias.

It accelerates curve fitting because generation speed multiplies untested variants.
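
The multiplication effect can be illustrated with a small simulation. Every variant below has zero true edge (its daily returns are pure noise), yet choosing the best of many by in-sample performance still yields a flattering number. This is a sketch under stated assumptions: the return distribution, sample sizes, and function names are all hypothetical.

```python
import random
import statistics


def simulate_variant(rng, n_days):
    """Daily returns of a strategy with zero true edge (pure noise)."""
    return [rng.gauss(0.0, 0.01) for _ in range(n_days)]


def best_of_n(n_variants, n_days=250, seed=7):
    """Pick the variant with the best in-sample mean, then score it out of sample."""
    rng = random.Random(seed)
    in_sample = [simulate_variant(rng, n_days) for _ in range(n_variants)]
    out_sample = [simulate_variant(rng, n_days) for _ in range(n_variants)]
    means = [statistics.fmean(r) for r in in_sample]
    best = max(range(n_variants), key=means.__getitem__)
    return means[best], statistics.fmean(out_sample[best])


few_in, few_out = best_of_n(n_variants=5)
many_in, many_out = best_of_n(n_variants=500)
print(f"best of 5:   in-sample {few_in:+.5f}, out-of-sample {few_out:+.5f}")
print(f"best of 500: in-sample {many_in:+.5f}, out-of-sample {many_out:+.5f}")
```

The best in-sample figure can only grow as more variants are tried, while the out-of-sample figure for the selected variant is just another draw from noise. Recording how many variants were tested is therefore part of the validation, not an afterthought.
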
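
Z-score pattern often generated by LLMs:

# In a backtest, mean() and std() over the full series let every bar
# see future data (look-ahead bias).
spread = asset_a.close - hedge_ratio * asset_b.close
z_score = (spread - spread.mean()) / spread.std()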

if z_score > 2:
    short(asset_a)
    long(asset_b)
elif z_score < -2:
    long(asset_a)
    short(asset_b)
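
When this pattern is backtested with the mean and standard deviation taken over the whole sample, each historical z-score depends on data that was not yet available. A minimal sketch of a look-ahead-free variant follows, using only the standard library. The window length, the helper names, and the sample spread are assumptions for illustration; the hedge ratio would also need to be estimated from past data only, which is not shown here.

```python
import statistics
from collections import deque


def rolling_z_scores(spread, window=20):
    """Z-score of each spread value against the preceding `window` values only."""
    history = deque(maxlen=window)
    scores = []
    for value in spread:
        if len(history) == window:
            mean = statistics.fmean(history)
            std = statistics.stdev(history)
            scores.append((value - mean) / std if std > 0 else 0.0)
        else:
            scores.append(None)  # not enough history yet
        history.append(value)  # appended after scoring, so a value never scores itself
    return scores


def signal(z, entry=2.0):
    """Map a z-score to the position in asset_a (asset_b takes the opposite side)."""
    if z is None:
        return 0
    if z > entry:
        return -1  # spread is rich: short asset_a, long asset_b
    if z < -entry:
        return 1   # spread is cheap: long asset_a, short asset_b
    return 0


# Hypothetical spread: a quiet oscillation followed by a late jump.
spread = [0.1, -0.1] * 15 + [1.5]
z = rolling_z_scores(spread, window=20)
positions = [signal(v) for v in z]
```

Because each score uses only earlier observations, appending new data never changes a score that was already computed, which is the property a full-sample z-score lacks.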

Model + Tools

A Model Is Not a Trading System

LLMs are useful when wrapped in tools. A serious trading stack includes data validation, pricing engines, execution logic, risk controls, monitoring, and deployment discipline.

Jane Street-style systems show the pattern: the model is one layer inside a larger toolchain, not the whole product.
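
The "one layer inside a larger toolchain" idea can be sketched as a set of gates that a generated strategy must pass before it goes any further. Everything below is hypothetical: the `Candidate` fields, the gate names, and the thresholds are illustrative assumptions, and the sketch covers only validation and risk checks, not pricing, execution, monitoring, or deployment.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """A generated strategy plus the evidence gathered about it (hypothetical fields)."""
    name: str
    in_sample_sharpe: float
    out_of_sample_sharpe: float
    max_drawdown: float  # positive fraction, e.g. 0.15 for 15%
    uses_future_data: bool


# A gate returns a rejection reason, or None when the candidate passes.
Gate = Callable[[Candidate], Optional[str]]


def no_lookahead(c: Candidate) -> Optional[str]:
    return "features use future data" if c.uses_future_data else None


def holds_out_of_sample(c: Candidate) -> Optional[str]:
    # Illustrative threshold: keep at least half of the in-sample Sharpe.
    if c.out_of_sample_sharpe < 0.5 * c.in_sample_sharpe:
        return "performance decays out of sample"
    return None


def within_risk_limits(c: Candidate) -> Optional[str]:
    # Illustrative limit: 20% maximum drawdown.
    return "drawdown exceeds limit" if c.max_drawdown > 0.20 else None


GATES: List[Gate] = [no_lookahead, holds_out_of_sample, within_risk_limits]


def review(c: Candidate, gates: List[Gate] = GATES) -> List[str]:
    """Run every gate and return the rejection reasons (empty means approved)."""
    reasons = []
    for gate in gates:
        reason = gate(c)
        if reason is not None:
            reasons.append(reason)
    return reasons


generated = Candidate("llm_pairs_v1", 3.1, 0.2, 0.35, uses_future_data=True)
vetted = Candidate("pairs_walk_forward", 1.2, 0.9, 0.12, uses_future_data=False)
print(review(generated))
print(review(vetted))
```

The design choice here is that the generator never decides whether its own output is acceptable; the gates do, and each one returns a reason that can be logged.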

StratCraft Positioning

Infrastructure for Gen 2 and Gen 3 Developers

StratCraft does not claim to be a fifth generation. It gives strategy-framework and system-pipeline developers C++-grade backtesting performance on a local machine, plugin isolation, and repeatable validation workflows.

Build on the Infrastructure Layer

Use the local backtest engine and plugin ecosystem to move from generated ideas to validated systems.