Building a walk-forward analysis pipeline in TypeScript from scratch
Design a reproducible walk-forward analysis pipeline in TypeScript: data contracts, rolling IS/OOS splits, metrics, and how to keep results stable enough to trust in production.
Walk-forward analysis (WFA) is not a single function call. It is a pipeline: ingest bars, align timestamps, split history into rolling in-sample (IS) and out-of-sample (OOS) windows, run a strategy evaluator, aggregate metrics, and persist artifacts so you can audit decisions later.
This guide shows how to build that pipeline in TypeScript with a senior-engineering mindset: strict types, deterministic ordering, and tests that fail loudly when you leak future data.
What you are building
A minimal WFA pipeline has five stages:
- Data layer - canonical candle schema, timezone rules, missing-bar policy.
- Split engine - anchored or rolling windows, step size, embargo if you simulate fills on the boundary.
- Evaluator - turns IS parameters (or a search space) into OOS trades or equity curves.
- Metrics - WFE (walk-forward efficiency), OOS retention, drawdown, turnover, cost drag, stability checks.
- Reporting - JSON artifacts + human-readable summary tables.
If any stage is fuzzy, your WFA will look scientific while silently encoding look-ahead bias.
Step 0: freeze definitions
Before code, write down:
- Bar timeframe and whether you allow forward-filled candles.
- Trade accounting - fees, slippage model, position sizing rule.
- Parameter policy - what you optimize on IS only, and what is frozen on OOS.
- Randomness - if hyperparameter search is stochastic, fix seeds per window.
These choices belong in a RunManifest object you persist next to results.
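As a rough sketch, the manifest could look like the type below. Every field name here is an assumption made for illustration (the article only names the RunManifest object itself), and seedForWindow is a hypothetical helper showing one way to derive a fixed seed per window from a single base seed.

```typescript
// Hypothetical manifest shape; all field names are illustrative assumptions.
type Timeframe = "1m" | "5m" | "15m" | "1h" | "4h" | "1d";

interface RunManifest {
  timeframe: Timeframe;
  allowForwardFilledCandles: boolean;
  accounting: {
    feeBps: number; // fee per side, in basis points
    slippageBps: number; // fixed slippage assumption, in basis points
    positionSizing: "fixed-notional" | "fixed-fraction";
  };
  parameterPolicy: {
    optimizedOnIs: string[]; // parameter names tuned on IS only
    frozenOnOos: string[]; // parameter names never touched on OOS
  };
  baseSeed: number; // combined with the window index for stochastic search
}

// Derive a deterministic unsigned 32-bit seed per window.
// The integer mixing steps are a common hash finalizer pattern, not cryptographic.
function seedForWindow(baseSeed: number, windowIndex: number): number {
  let x = (baseSeed ^ Math.imul(windowIndex + 1, 0x9e3779b1)) >>> 0;
  x = Math.imul(x ^ (x >>> 16), 0x85ebca6b) >>> 0;
  x = Math.imul(x ^ (x >>> 13), 0xc2b2ae35) >>> 0;
  return (x ^ (x >>> 16)) >>> 0;
}

const exampleManifest: RunManifest = {
  timeframe: "1h",
  allowForwardFilledCandles: false,
  accounting: { feeBps: 5, slippageBps: 2, positionSizing: "fixed-fraction" },
  parameterPolicy: { optimizedOnIs: ["fastLen", "slowLen"], frozenOnOos: ["fastLen", "slowLen"] },
  baseSeed: 42,
};
```

Persisting the manifest as-is keeps the seed derivation auditable: the same base seed and window index give the same seed on every rerun.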
Step 1: canonical types
Start with boring types. Boring types prevent subtle bugs.
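A minimal sketch of such types, assuming candles are keyed by their open time in UTC epoch milliseconds. The names Candle, UtcMillis, and utcMillis are hypothetical, not taken from any library.

```typescript
// Branded number so a plain number cannot be passed where a UTC timestamp is expected.
type UtcMillis = number & { readonly __brand: "UtcMillis" };

interface Candle {
  readonly openTime: UtcMillis; // bar open, UTC epoch milliseconds
  readonly open: number;
  readonly high: number;
  readonly low: number;
  readonly close: number;
  readonly volume: number;
}

// Parse an ISO-8601 string, insisting on an explicit "Z" suffix so that
// strings without a zone (which would be read as local time) are rejected.
function utcMillis(isoUtc: string): UtcMillis {
  if (!isoUtc.endsWith("Z")) {
    throw new Error(`Expected a UTC ISO string ending in "Z": ${isoUtc}`);
  }
  const ms = Date.parse(isoUtc);
  if (Number.isNaN(ms)) {
    throw new Error(`Unparseable timestamp: ${isoUtc}`);
  }
  return ms as UtcMillis;
}

const sample: Candle = {
  openTime: utcMillis("2024-01-01T00:00:00Z"),
  open: 100,
  high: 105,
  low: 99,
  close: 104,
  volume: 1200,
};
```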
Rules:
- Never mix local time and UTC in the same pipeline. Pick UTC end-to-end.
- Validate strictly: high >= max(open, close) style checks catch bad vendor feeds early.
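The validation rule could be sketched as follows. The Candle shape is repeated so the snippet stands alone, and validateCandle and assertStrictlyIncreasing are hypothetical helper names. Returning a list of problems rather than throwing on the first one is a design assumption that makes vendor-feed reports easier to read.

```typescript
interface Candle {
  openTime: number; // UTC epoch milliseconds
  open: number;
  high: number;
  low: number;
  close: number;
  volume: number;
}

// Returns every rule the candle violates; an empty array means it passed.
function validateCandle(c: Candle): string[] {
  const errors: string[] = [];
  if (!Number.isInteger(c.openTime)) errors.push("openTime must be an integer UTC epoch in ms");
  const values = [c.open, c.high, c.low, c.close, c.volume];
  if (!values.every((v) => Number.isFinite(v))) errors.push("OHLCV values must be finite numbers");
  if (c.high < Math.max(c.open, c.close)) errors.push("high < max(open, close)");
  if (c.low > Math.min(c.open, c.close)) errors.push("low > min(open, close)");
  if (c.volume < 0) errors.push("negative volume");
  return errors;
}

// Deterministic ordering: timestamps must strictly increase, so duplicates fail too.
function assertStrictlyIncreasing(candles: readonly Candle[]): void {
  let previous: number | undefined;
  for (const c of candles) {
    if (previous !== undefined && c.openTime <= previous) {
      throw new Error(`Out-of-order or duplicate bar at openTime=${c.openTime}`);
    }
    previous = c.openTime;
  }
}
```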
Step 2: build rolling windows
Rolling WFA is a loop with a cursor. Keep the logic pure so you can unit test it without running a strategy.
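One way to sketch that loop as a pure function over candle indices. WindowSpec, WfaWindow, and buildRollingWindows are hypothetical names, and treating every end index as exclusive is an assumption, chosen here because it matches Array.prototype.slice.

```typescript
interface WindowSpec {
  isBars: number; // in-sample length, in bars
  oosBars: number; // out-of-sample length, in bars
  stepBars: number; // how far the cursor advances per window
}

interface WfaWindow {
  index: number;
  isStart: number; // inclusive candle index
  isEnd: number; // exclusive
  oosStart: number; // inclusive
  oosEnd: number; // exclusive
}

// Pure: depends only on the bar count and the spec, never on prices.
function buildRollingWindows(totalBars: number, spec: WindowSpec): WfaWindow[] {
  const { isBars, oosBars, stepBars } = spec;
  for (const v of [isBars, oosBars, stepBars]) {
    if (!Number.isInteger(v) || v <= 0) {
      throw new Error("isBars, oosBars and stepBars must be positive integers");
    }
  }
  const windows: WfaWindow[] = [];
  // Emit a window only while a full IS + OOS span still fits in history.
  for (let cursor = 0; cursor + isBars + oosBars <= totalBars; cursor += stepBars) {
    const isEnd = cursor + isBars;
    windows.push({
      index: windows.length,
      isStart: cursor,
      isEnd,
      oosStart: isEnd,
      oosEnd: isEnd + oosBars,
    });
  }
  return windows;
}
```

Setting stepBars equal to oosBars gives non-overlapping OOS segments that can be stitched into one continuous OOS equity curve. An anchored variant would keep isStart at 0 and grow the IS range instead of sliding it.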
Production upgrades you will want next:
- Warm-up bars for indicators so IS optimization does not start on uninitialized MACD/RSI state.
- Embargo if your trade logic can use information near the IS/OOS boundary.
- Liquidity filters applied identically on IS and OOS.
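Warm-up and embargo can both be expressed as adjustments to a window's index ranges. The sketch below assumes bar-index windows with exclusive end indices; BoundaryPolicy, AdjustedWindow, and applyBoundaryPolicy are hypothetical names. Warm-up bars are loaded only to initialize indicator state and are never scored.

```typescript
interface WfaWindow {
  index: number;
  isStart: number; // inclusive
  isEnd: number; // exclusive
  oosStart: number; // inclusive
  oosEnd: number; // exclusive
}

interface BoundaryPolicy {
  warmupBars: number; // bars needed before indicators produce valid values
  embargoBars: number; // bars skipped between the end of IS and the first scored OOS bar
}

interface AdjustedWindow extends WfaWindow {
  isLoadStart: number; // first bar loaded for IS (warm-up included, not scored)
  oosLoadStart: number; // first bar loaded for OOS (warm-up included, not scored)
}

function applyBoundaryPolicy(w: WfaWindow, policy: BoundaryPolicy): AdjustedWindow {
  const oosStart = w.oosStart + policy.embargoBars;
  if (oosStart >= w.oosEnd) {
    throw new Error(`Embargo of ${policy.embargoBars} bars consumes OOS window ${w.index}`);
  }
  return {
    ...w,
    oosStart,
    // Warm-up reaches back into earlier history but never before bar 0.
    isLoadStart: Math.max(0, w.isStart - policy.warmupBars),
    oosLoadStart: Math.max(0, oosStart - policy.warmupBars),
  };
}
```

The OOS warm-up range overlaps IS and embargo bars. That is not a leak, since those bars lie in the past relative to the first scored OOS bar.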
Step 3: isolate the evaluator behind an interface
Your strategy code will change. Your WFA harness should not.
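A possible shape for that interface is sketched below. Evaluator, EvalResult, and the method names are assumptions for illustration, and buyAndHold is a toy implementation that exists only to exercise a harness, not a real strategy.

```typescript
interface Candle {
  readonly openTime: number; // UTC epoch milliseconds
  readonly close: number;
}

interface EvalResult {
  totalReturn: number; // e.g. 0.21 for +21%
  equityCurve: number[]; // equity per bar, normalized to start at 1
}

// P is the strategy's parameter type; the harness never looks inside it.
interface Evaluator<P> {
  readonly version: string; // persisted alongside results
  // Sees IS candles only; the seed fixes any stochastic search.
  optimize(isCandles: readonly Candle[], seed: number): P;
  // Should be pure: same candles and params give the same result.
  evaluate(candles: readonly Candle[], params: P): EvalResult;
}

// Toy evaluator: no parameters, holds from the first bar to the last.
const buyAndHold: Evaluator<Record<string, never>> = {
  version: "toy-0",
  optimize: () => ({}),
  evaluate: (candles) => {
    if (candles.length === 0) return { totalReturn: 0, equityCurve: [] };
    const first = candles[0].close;
    const equityCurve = candles.map((c) => c.close / first);
    return { totalReturn: equityCurve[equityCurve.length - 1] - 1, equityCurve };
  },
};
```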
Pipeline orchestration becomes:
- Slice candles for IS range.
- Run optimization (grid, Bayesian, whatever) - IS only.
- Take best params (or a robustness rule: top-N average).
- Slice candles for OOS range.
- Evaluate once with frozen params.
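The steps above can be sketched as a single function. The types are trimmed-down, hypothetical versions of a candle, a window, and an evaluator, and deriving the seed as baseSeed + window index is only one simple assumption for keeping stochastic searches repeatable.

```typescript
interface Candle { readonly openTime: number; readonly close: number; }

interface WfaWindow {
  index: number;
  isStart: number; // inclusive
  isEnd: number; // exclusive
  oosStart: number; // inclusive
  oosEnd: number; // exclusive
}

interface Evaluator<P> {
  optimize(isCandles: readonly Candle[], seed: number): P;
  evaluate(candles: readonly Candle[], params: P): { totalReturn: number };
}

interface WindowResult<P> {
  window: WfaWindow;
  params: P;
  isReturn: number;
  oosReturn: number;
}

function runWalkForward<P>(
  candles: readonly Candle[],
  windows: readonly WfaWindow[],
  evaluator: Evaluator<P>,
  baseSeed: number,
): WindowResult<P>[] {
  return windows.map((w) => {
    // 1-3. Slice IS and optimize on it; the optimizer never sees OOS bars.
    const isSlice = candles.slice(w.isStart, w.isEnd);
    const params = evaluator.optimize(isSlice, baseSeed + w.index);
    const isReturn = evaluator.evaluate(isSlice, params).totalReturn;
    // 4-5. Slice OOS and evaluate exactly once with the frozen params.
    const oosSlice = candles.slice(w.oosStart, w.oosEnd);
    const oosReturn = evaluator.evaluate(oosSlice, params).totalReturn;
    return { window: w, params, isReturn, oosReturn };
  });
}
```

Keeping the IS return next to the OOS return in each result is what later makes degradation metrics possible without rerunning anything.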
Step 4: compute metrics that survive skepticism
At minimum, store both:
- OOS headline metrics - return, max drawdown, profit factor.
- Stability metrics - sensitivity to parameter jitter, degradation vs IS.
If you only store the best OOS window, you are optimizing again - just with extra steps.
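A few of these metrics can be sketched as pure functions. Definitions vary between practitioners, so treat the formulas below as assumptions: maximum drawdown as the largest peak-to-trough fraction, profit factor as gross profit over gross loss, and WFE as the per-bar OOS return rate divided by the per-bar IS return rate, aggregated over all windows without compounding.

```typescript
// Largest peak-to-trough decline as a fraction of the peak (0.25 = 25%).
function maxDrawdown(equity: readonly number[]): number {
  let peak = -Infinity;
  let worst = 0;
  for (const v of equity) {
    peak = Math.max(peak, v);
    if (peak > 0) worst = Math.max(worst, (peak - v) / peak);
  }
  return worst;
}

// Gross profit divided by gross loss across per-trade PnLs.
function profitFactor(tradePnls: readonly number[]): number {
  let gains = 0;
  let losses = 0;
  for (const pnl of tradePnls) {
    if (pnl > 0) gains += pnl;
    else losses -= pnl;
  }
  if (losses === 0) return gains > 0 ? Infinity : NaN;
  return gains / losses;
}

interface WindowOutcome {
  isReturn: number;
  oosReturn: number;
  isBars: number;
  oosBars: number;
}

// Uses every window, not just the best one. Returns NaN when the IS rate
// is not positive, because the ratio has no useful meaning in that case.
function walkForwardEfficiency(windows: readonly WindowOutcome[]): number {
  let isRet = 0, oosRet = 0, isBars = 0, oosBars = 0;
  for (const w of windows) {
    isRet += w.isReturn;
    oosRet += w.oosReturn;
    isBars += w.isBars;
    oosBars += w.oosBars;
  }
  if (isBars === 0 || oosBars === 0) return NaN;
  const isRate = isRet / isBars;
  return isRate > 0 ? oosRet / oosBars / isRate : NaN;
}
```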
Step 5: persistence and reproducibility
Write artifacts to disk as JSON Lines or SQLite. Each run should include:
- input dataset hash
- window spec
- evaluator version
- dependency versions (package-lock hash)
- params per window
- metrics per window
This is what lets you answer the question: "Why did this pass last month and fail today?"
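A minimal JSON Lines sketch using Node's built-in crypto and fs modules. The record shape and the helper names (WindowRecord, hashDataset, toJsonLines, writeRunArtifacts) are assumptions, as is the choice of SHA-256 over a canonical CSV-like serialization of the candles.

```typescript
import { createHash } from "node:crypto";
import { writeFileSync } from "node:fs";

interface Candle {
  openTime: number;
  open: number;
  high: number;
  low: number;
  close: number;
  volume: number;
}

// Hypothetical record: one line per window in the output file.
interface WindowRecord {
  datasetHash: string;
  lockfileHash: string; // hash of package-lock.json, computed by the caller
  evaluatorVersion: string;
  windowSpec: { isBars: number; oosBars: number; stepBars: number };
  windowIndex: number;
  params: Record<string, number>;
  metrics: Record<string, number>;
}

// Hash a fixed field order so the same candles always give the same digest.
function hashDataset(candles: readonly Candle[]): string {
  const hash = createHash("sha256");
  for (const c of candles) {
    hash.update(`${c.openTime},${c.open},${c.high},${c.low},${c.close},${c.volume}\n`);
  }
  return hash.digest("hex");
}

// One JSON document per line, each terminated by a newline.
function toJsonLines(records: readonly WindowRecord[]): string {
  return records.map((r) => JSON.stringify(r) + "\n").join("");
}

function writeRunArtifacts(path: string, records: readonly WindowRecord[]): void {
  writeFileSync(path, toJsonLines(records), "utf8");
}
```

Comparing datasetHash, lockfileHash, and evaluatorVersion between two runs narrows a changed result down to data, dependencies, or strategy code.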
Step 6: tests that catch look-ahead
Add three tests every serious pipeline needs:
- Shuffle test - randomly permute OOS window labels (not prices) and confirm the splitter still emits windows in time order, with each IS range ending before its OOS range starts.
- Leak test - assert OOS slice cannot access candle indices beyond oosEnd.
- Boundary test - a strategy that should be impossible unless future data leaks (for example, "buy if next bar close is higher") must never show impossible OOS performance.
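The leak and boundary tests can share one mechanism: hand the strategy an accessor that throws as soon as it asks for a bar later than the current one. The sketch below is framework-free, and LookAheadError, runBarByBar, and both strategies are hypothetical. It guards per bar, which is an assumption slightly stricter than only guarding at oosEnd.

```typescript
class LookAheadError extends Error {}

// Accessor handed to the strategy; valid only for bar indices <= current bar.
type PastOnly = (index: number) => number;
type Strategy = (t: number, close: PastOnly) => boolean;

function runBarByBar(
  closes: readonly number[],
  start: number, // inclusive
  endExclusive: number,
  decide: Strategy,
): boolean[] {
  if (start < 0 || endExclusive > closes.length) throw new RangeError("window outside data");
  const signals: boolean[] = [];
  for (let t = start; t < endExclusive; t++) {
    const close: PastOnly = (i) => {
      if (!Number.isInteger(i) || i < 0 || i > t) {
        throw new LookAheadError(`bar ${i} requested while at bar ${t}`);
      }
      return closes[i];
    };
    signals.push(decide(t, close));
  }
  return signals;
}

// Honest: compares the current close with the previous one.
const honest: Strategy = (t, close) => t > 0 && close(t) > close(t - 1);

// Cheating: "buy if next bar close is higher". Must be impossible to run.
const peeking: Strategy = (t, close) => close(t + 1) > close(t);
```

In a test suite, the assertion would be that running peeking over any OOS range throws LookAheadError, while honest completes and yields one signal per bar.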
Operational checklist before you trust results
- Costs are applied on both IS and OOS with the same model.
- Corporate actions / contract rolls are handled or explicitly excluded.
- You log the entire hyperparameter search budget (to reason about multiple testing).