Forecasting: Principles and Practice (Time Series Cross-Validation)
Summary
"Forecasting: Principles and Practice" (3rd edition, Hyndman & Athanasopoulos) is a widely used forecasting textbook that is freely available online, and its material on time series cross-validation is a standard reference for why walk-forward validation is the correct method for temporal data. Standard k-fold cross-validation is inappropriate for time series because random splits mix past and future observations, which creates look-ahead bias.
The chapter covers: (1) the correct way to do cross-validation for time series, (2) the difference between expanding and rolling windows, and (3) evaluation metrics averaged across folds. This note also records (4) the purged cross-validation concept (a gap between train and test to prevent information leakage); the term "purged" comes from the financial machine learning literature rather than from the textbook itself.
Key Concepts
- Why k-fold fails for time series: Randomly splitting temporal data lets the model train on observations that come after the ones it is tested on, which is the opposite of real deployment
- Expanding window: Training set grows over time (all history to T1, all history to T2, etc.). Most common for stable systems.
- Rolling window: Fixed-size training set slides forward. Better when underlying process changes over time (non-stationarity).
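- tsCV (time series CV): Cross-validation that respects temporal order, so every training set contains only observations that precede its test set. tsCV() is the function of this name in R's forecast package
- stretch_tsibble(): The function in R's tidyverts (tsibble package) that builds the expanding-window folds used for walk-forward validation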
- Evaluation: Metrics are averaged across all folds — the mean out-of-sample performance is the key metric
- Purged cross-validation: Adds a buffer/gap between training and test sets to prevent leakage from near-future data influencing training features
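The gap idea in the last bullet can be sketched with plain index arithmetic. The function name `purged_splits` and its parameters (`initial`, `horizon`, `gap`) are illustrative assumptions rather than an API from the textbook or any library; the sketch only shows how skipping `gap` observations keeps the training set away from the test set.

```python
from typing import Iterator, List, Tuple


def purged_splits(
    n_obs: int, initial: int, horizon: int = 1, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists in temporal order.

    `gap` observations directly after the training set are skipped,
    so they are neither trained on nor tested on.
    """
    train_end = initial  # exclusive end of the training indices
    while train_end + gap + horizon <= n_obs:
        train = list(range(0, train_end))
        test_start = train_end + gap
        test = list(range(test_start, test_start + horizon))
        yield train, test
        train_end += 1  # expanding window: grow the training set by one


splits = list(purged_splits(n_obs=8, initial=3, horizon=2, gap=1))
for train, test in splits:
    print("train:", train, "test:", test)
```

Setting `gap=0` reduces this to an ordinary expanding-window split. How large the gap should be depends on how far back the model's features look, which is a modeling decision this sketch does not address.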
Walk-Forward Algorithm
For each fold:
1. Train on all data up to time t
2. Forecast h steps ahead (h = horizon)
3. Compare forecast to actual outcomes
4. Record metric
5. Move forward to t+1
Metrics = mean(metric across all folds)
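The loop above can be written as a short, self-contained sketch. The forecaster here is a placeholder (repeat the last observed value) chosen only so the example runs; `walk_forward`, `naive_forecast`, and the sample series are assumptions for illustration, and mean absolute error stands in for whatever metric the project actually uses.

```python
from statistics import mean
from typing import Callable, List, Sequence


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Placeholder model: repeat the last observed value `horizon` times."""
    return [history[-1]] * horizon


def walk_forward(
    series: Sequence[float],
    forecaster: Callable[[Sequence[float], int], List[float]],
    initial: int,
    horizon: int = 1,
) -> List[float]:
    """Return one mean-absolute-error value per fold."""
    fold_errors = []
    t = initial
    while t + horizon <= len(series):
        train = series[:t]                       # 1. all data up to time t
        predicted = forecaster(train, horizon)   # 2. forecast h steps ahead
        actual = series[t:t + horizon]           # 3. compare to actual outcomes
        fold_errors.append(
            mean(abs(a - p) for a, p in zip(actual, predicted))
        )                                        # 4. record metric
        t += 1                                   # 5. move forward to t+1
    return fold_errors


series = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0]  # made-up illustrative values
errors = walk_forward(series, naive_forecast, initial=3, horizon=1)
overall_mae = mean(errors)
print(errors, overall_mae)
```

Any model with the same `(history, horizon)` call shape could be passed in place of `naive_forecast`; the loop itself never looks at observations beyond the current training cutoff except to score the forecast.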
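Formulas for Expanding vs. Rolling Windows
Expanding window:
$$\text{Training}_t = \{y_1, y_2, \dots, y_t\}$$
$$\text{Test}_t = \{y_{t+1}, y_{t+2}, \dots, y_{t+h}\}$$
Rolling window:
$$\text{Training}_t = \{y_{t-m+1}, y_{t-m+2}, \dots, y_t\}$$
$$\text{Test}_t = \{y_{t+1}, y_{t+2}, \dots, y_{t+h}\}$$
Where m = window size (the number of observations in the training set) and h = forecast horizon.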
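As a quick check on the two definitions, the sketch below returns the 1-based time indices of each set. It assumes m counts the observations in the rolling training window, so that window starts at index t-m+1; the function names are illustrative.

```python
from typing import List, Tuple


def expanding_window(t: int, h: int) -> Tuple[List[int], List[int]]:
    """Training uses indices 1..t; test uses t+1..t+h (1-based)."""
    train = list(range(1, t + 1))
    test = list(range(t + 1, t + h + 1))
    return train, test


def rolling_window(t: int, h: int, m: int) -> Tuple[List[int], List[int]]:
    """Training uses the last m indices ending at t; test uses t+1..t+h."""
    if m > t:
        raise ValueError("window size m cannot exceed t")
    train = list(range(t - m + 1, t + 1))
    test = list(range(t + 1, t + h + 1))
    return train, test


print(expanding_window(t=5, h=2))
print(rolling_window(t=5, h=2, m=3))
```

The two schemes share the same test set for a given t and differ only in how much history the training set keeps.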
Notes
- This is the canonical textbook reference for time series cross-validation. The existing walk-forward-validation.md note covers sports betting specifically, but this source provides the theoretical foundation
- Key insight for sports betting: the textbook explicitly warns against k-fold cross-validation for temporal prediction problems, and this is the mathematical justification for the client's walk-forward requirement
- The expanding window approach is correct for World Cup modeling: team strength accumulates over years and older data is still informative
- The purged cross-validation concept (adding a gap between train and test) is relevant for betting models where near-term data leakage is a concern
- The textbook's evaluation framework (mean metric across all folds with standard deviation) is exactly what the World Cup model's backtesting framework should report
- For the World Cup model: each tournament is one fold, and the expanding window means training on all previous World Cups plus inter-tournament friendlies/qualifiers
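One possible way to lay out tournament-level folds is sketched below. Everything in it is hypothetical: the match records are placeholder (year, competition) pairs, not real data, and `tournament_folds` and `summarize` are illustrative names. For simplicity the sketch treats every match from an earlier calendar year as training data; a real implementation would presumably compare match dates against the tournament start date.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

# Hypothetical match records as (year, competition); placeholder values only.
MATCHES: List[Tuple[int, str]] = [
    (2009, "qualifier"), (2010, "world_cup"),
    (2012, "friendly"), (2013, "qualifier"), (2014, "world_cup"),
    (2016, "friendly"), (2017, "qualifier"), (2018, "world_cup"),
]


def tournament_folds(
    records: Sequence[Tuple[int, str]],
) -> List[Tuple[int, List[int], List[int]]]:
    """One fold per World Cup: train on earlier-year matches, test on that tournament."""
    world_cup_years = sorted({year for year, comp in records if comp == "world_cup"})
    folds = []
    for wc_year in world_cup_years:
        train = [i for i, (year, _) in enumerate(records) if year < wc_year]
        test = [
            i for i, (year, comp) in enumerate(records)
            if year == wc_year and comp == "world_cup"
        ]
        if train:  # skip a tournament with no earlier history to train on
            folds.append((wc_year, train, test))
    return folds


def summarize(fold_scores: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation across folds (needs at least two folds)."""
    return mean(fold_scores), stdev(fold_scores)


folds = tournament_folds(MATCHES)
for wc_year, train, test in folds:
    print(wc_year, "train size:", len(train), "test size:", len(test))
```

Once each fold has a score from the backtest, `summarize` gives the mean and standard deviation to report. With only a handful of tournaments the standard deviation is a rough indicator at best.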