Combinatorial Purged Cross-Validation vs Walk-Forward: pros and cons
Combinatorial purged cross-validation versus walk-forward analysis: leakage control, compute cost, and practical adoption for trading research.
Combinatorial purged cross-validation (CPCV) and walk-forward analysis are two advanced validation schemes for trading research. Both families try to reduce leakage between training and testing, but they optimize for different operational questions.
CPCV in one paragraph
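CPCV splits the history into N contiguous groups and holds out every combination of k groups as the test set, which yields C(N, k) train/test splits. Training observations whose labels overlap a test block are purged, and a short embargo period after each test block is dropped as well, so overlapping labels do not leak information across folds. The combinatorial expansion increases the number of scenarios you average over.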
Pros: strong leakage control for some label constructions.
Cons: compute cost, implementation complexity, and you can still fool yourself with multiple testing across many strategies.
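The split construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name cpcv_splits and its parameters are hypothetical, and it assumes bar-indexed samples where the label for sample i uses information from bars i through i + label_horizon.

```python
from itertools import combinations

import numpy as np


def cpcv_splits(n_samples, n_groups=6, n_test_groups=2, label_horizon=0, embargo=0):
    """Yield (train_idx, test_idx) for every combination of test groups.

    Assumption (hypothetical setup): the label of sample i depends on
    bars i .. i + label_horizon, so labels of nearby samples overlap.
    """
    groups = np.array_split(np.arange(n_samples), n_groups)
    for combo in combinations(range(n_groups), n_test_groups):
        test_idx = np.concatenate([groups[g] for g in combo])
        keep = np.ones(n_samples, dtype=bool)
        for g in combo:
            start, stop = groups[g][0], groups[g][-1]
            # Purge: drop training samples whose label window overlaps
            # the label windows of this test block (before and after it).
            lo = max(0, start - label_horizon)
            # Embargo: additionally drop a few samples after the block.
            hi = min(n_samples, stop + 1 + label_horizon + embargo)
            keep[lo:hi] = False  # also removes the test block itself
        yield np.flatnonzero(keep), test_idx


if __name__ == "__main__":
    splits = list(cpcv_splits(60, n_groups=6, n_test_groups=2,
                              label_horizon=2, embargo=1))
    print(len(splits), "splits")
```

With 6 groups and 2 test groups per split this produces C(6, 2) = 15 splits, which also shows where the compute cost comes from: every split is a separate model fit.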
Walk-forward in one paragraph
Walk-forward analysis (WFA) simulates the passage of time: fit on an older window, test on the next unseen window, then roll both windows forward and repeat.
Pros: maps to how many trading systems are rebuilt and redeployed.
Cons: fewer test splits than CPCV can generate from the same data; results are sensitive to window sizing (see How to choose IS/OOS).
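For comparison, a rolling walk-forward splitter is much simpler. The sketch below is illustrative only: walk_forward_splits and its parameters are hypothetical names, it uses fixed-size rolling windows (an anchored or expanding training window is a common alternative), and it does not purge overlapping labels at the train/test boundary.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_range, test_range) pairs that roll forward in time.

    Each test window starts right after its training window, so the
    model never sees data from the period it is evaluated on.
    """
    step = step or test_size  # default: non-overlapping test windows
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


if __name__ == "__main__":
    for train, test in walk_forward_splits(100, train_size=50, test_size=10):
        print(f"train {train.start}-{train.stop - 1}, "
              f"test {test.start}-{test.stop - 1}")
```

On 100 samples with a 50-bar training window and 10-bar test window this yields 5 splits, against the 15 that the CPCV sketch produces from only 60 samples, which illustrates the trade-off in the number of test scenarios.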
When to prefer walk-forward
If your deployment story is literally "re-fit periodically and trade the next regime slice," WFA is the natural backbone (What is WFA?).
When CPCV shows up in practice
CPCV is more common in cross-sectional research pipelines with careful label timing. For many retail bot workflows, WFA plus honest costs gets you farther faster.
Neither removes bad data
Neither method fixes bad input data or protects against data snooping on its own. Pair either one with Data Quality Guard and Data snooping.