Time Series: Difference between revisions

From BloomWiki
{{BloomIntro}}
AI for time series and forecasting applies machine learning and deep learning techniques to sequential, time-indexed data to predict future values, detect anomalies, and extract patterns. Time series data is ubiquitous: stock prices, electricity demand, web traffic, sensor readings, weather measurements, and patient vital signs all evolve over time. Traditional forecasting relied on statistical models like ARIMA; modern AI-driven approaches — including LSTMs, Temporal Fusion Transformers, and foundation models for time series — now achieve state-of-the-art performance across domains.


== Remembering ==
* '''Time series''' — A sequence of data points indexed in time order, typically at regular intervals.
* '''Forecasting''' — Predicting future values of a time series based on its historical patterns.
* '''Univariate time series''' — A single variable measured over time (e.g., daily sales).
* '''Multivariate time series''' — Multiple variables measured simultaneously over time (e.g., temperature, humidity, and pressure together).
* '''Trend''' — The long-term direction of a time series (upward, downward, or flat).
* '''Seasonality''' — Regular, periodic patterns that repeat at known intervals (daily, weekly, yearly).
* '''Residuals''' — The component remaining after removing trend and seasonality; ideally random noise.
* '''Stationarity''' — A time series is stationary if its statistical properties (mean, variance) do not change over time. Many models require stationarity.
* '''Autocorrelation''' — The correlation of a time series with its own past values (lags).
* '''Lag''' — A prior time step. Lag-1 is yesterday's value; lag-7 is last week's value.
* '''ARIMA''' — AutoRegressive Integrated Moving Average; a classical statistical model for univariate forecasting.
* '''LSTM (Long Short-Term Memory)''' — A type of RNN with gating mechanisms that captures long-range dependencies in sequences.
* '''Temporal Fusion Transformer (TFT)''' — A transformer-based model for multi-horizon time series forecasting, incorporating attention across time.
* '''Anomaly detection''' — Identifying data points, intervals, or patterns that deviate significantly from expected behavior.
* '''Horizon''' — The number of future time steps to forecast (1-step-ahead vs. multi-step/multi-horizon).
* '''Rolling forecast''' — Re-fitting or updating the model as new data arrives, maintaining accuracy over time.
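Several of these definitions are easy to make concrete. A minimal sketch (synthetic data, all numbers illustrative) computing lag-1 autocorrelation with pandas:

```python
import numpy as np
import pandas as pd

# A noisy upward trend: today's value strongly resembles yesterday's
rng = np.random.default_rng(0)
s = pd.Series(np.arange(100, dtype=float) + rng.normal(0, 1.0, 100))

# Lag-1 autocorrelation: correlation of the series with itself shifted one step
r1 = s.autocorr(lag=1)
print(r1 > 0.9)  # the trend dominates the noise, so this is near 1
```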


== Understanding ==
Time series forecasting is inherently a sequential problem: the order of observations matters, and the past contains information about the future. This distinguishes it from tabular classification, where rows are exchangeable.


'''The decomposition framework''' is key to understanding time series:
<syntaxhighlight lang="text">
Observed = Trend × Seasonal × Residual  (multiplicative)
         = Trend + Seasonal + Residual  (additive)
</syntaxhighlight>
Decomposing a series into these components enables targeted modeling: model the trend with regression, the seasonality with Fourier features or indicator variables, and the residual with a neural network or ARIMA.


'''Why deep learning?''' Classical models like ARIMA excel at capturing simple autocorrelation but struggle with:
* Non-linear relationships between variables
* Multiple interacting series (multivariate)
* Complex, multi-scale seasonality
* Incorporating exogenous variables (weather, holidays, promotions)


LSTMs can capture non-linear temporal dependencies and handle arbitrary-length sequences. Transformers add the ability to attend to any past time step directly, avoiding the vanishing gradient problem over long sequences. Foundation models for time series (TimeGPT, MOIRAI, Chronos) pre-trained on billions of time points can zero-shot forecast on new series.
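To make the LSTM idea concrete, here is a minimal one-step-ahead forecaster sketch in PyTorch (shapes and sizes are arbitrary; this is the generic pattern, not a production model):

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """Encode a window of past values with an LSTM; predict the next value."""
    def __init__(self, n_features=1, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):  # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # forecast from the last hidden state

model = LSTMForecaster()
x = torch.randn(8, 60, 1)  # batch of 8 windows, 60 time steps each
print(model(x).shape)      # torch.Size([8, 1])
```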
 
'''Evaluation discipline''': A critical mistake in time series is using random train/test splits. This causes data leakage — future data leaks into the training set. Always use chronological splits: train on the first 70–80%, validate on the next 10–15%, test on the most recent 10–15%.
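The chronological split above is a one-liner in practice; a minimal sketch (the 70/15/15 proportions follow the text):

```python
import numpy as np

# Stand-in series of 100 ordered observations
y = np.arange(100)

# Chronological 70/15/15 split — never shuffle time series data
n = len(y)
train = y[: int(0.7 * n)]
val = y[int(0.7 * n): int(0.85 * n)]
test = y[int(0.85 * n):]
print(len(train), len(val), len(test))  # 70 15 15
```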


== Applying ==
'''Multi-horizon forecasting with Temporal Fusion Transformer (PyTorch Forecasting):'''

<syntaxhighlight lang="python">
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer
from pytorch_forecasting.metrics import QuantileLoss
import lightning.pytorch as pl

# Load data: each row = one time step for one series
df = pd.read_csv("sales_data.csv")
df["date"] = pd.to_datetime(df["date"])                   # parse dates before arithmetic
df["time_idx"] = (df["date"] - df["date"].min()).dt.days  # integer time index

max_encoder_length = 60     # use 60 past days as context
max_prediction_length = 14  # forecast 14 days ahead

# Training dataset
training = TimeSeriesDataSet(
    df[lambda x: x.time_idx <= x.time_idx.max() - max_prediction_length],
    time_idx="time_idx",
    target="sales",
    group_ids=["store_id", "product_id"],      # multiple series
    min_encoder_length=30,
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    static_categoricals=["store_id", "product_id"],
    time_varying_known_reals=["time_idx", "price", "day_of_week", "is_holiday"],
    time_varying_unknown_reals=["sales"],      # only the target is unknown in the future
    target_normalizer="auto",
)

validation = TimeSeriesDataSet.from_dataset(training, df, predict=True, stop_randomization=True)
train_dl = training.to_dataloader(train=True, batch_size=64)
val_dl = validation.to_dataloader(train=False, batch_size=64)

# TFT model
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=64,
    attention_head_size=4,
    dropout=0.1,
    hidden_continuous_size=32,
    output_size=7,          # 7 quantile predictions (p10 to p90)
    loss=QuantileLoss(),
    log_interval=10,
)

trainer = pl.Trainer(max_epochs=30, accelerator="gpu", gradient_clip_val=0.1)
trainer.fit(tft, train_dl, val_dl)
</syntaxhighlight>


; Model selection guide by forecasting scenario
: '''Simple univariate, clean seasonality''' → SARIMA, Prophet (Meta), ETS
: '''Univariate with complex patterns''' → N-BEATS, N-HiTS, PatchTST
: '''Multivariate with known future covariates''' → Temporal Fusion Transformer, DeepAR
: '''Very short series or irregular intervals''' → Gaussian Processes, ARIMA
: '''Many series, zero-shot''' → TimeGPT, Chronos, MOIRAI (foundation models)
: '''Anomaly detection''' → Isolation Forest (tabular features), LSTMAD, Anomaly Transformer


== Analyzing ==
{| class="wikitable"
|+ Time Series Model Comparison
! Model !! Type !! Strengths !! Weaknesses
|-
| ARIMA/SARIMA || Statistical || Interpretable, fast, works on small data || Assumes linearity, one series at a time
|-
| Prophet || Statistical || Handles holidays, trend changepoints || Limited to single series; no covariates
|-
| DeepAR || Deep Learning (LSTM) || Probabilistic, many series || Needs lots of data, slow training
|-
| TFT || Transformer || Multi-horizon, covariate-rich, interpretable || Complex, high data requirement
|-
| N-BEATS || Deep Learning (MLP) || Fast, competitive, no feature engineering || Limited covariate support
|-
| Chronos (foundation) || LLM-style || Zero-shot, no training needed || No covariate support yet; large model
|}


'''Failure modes:'''
* '''Chronological leakage''' — Random train/test splits allow future data to inform past predictions, producing falsely optimistic results. Always split chronologically.
* '''Ignoring non-stationarity''' — Many models assume stationarity. Differencing (ARIMA) or normalization per-series is required.
* '''Ignoring distributional shift''' — Retail models trained pre-COVID performed terribly during COVID. Extreme events cause structural breaks that no model trained on historical data anticipates.
* '''Point forecast overconfidence''' — Reporting only mean forecasts without uncertainty intervals. Downstream planning needs to understand the range of outcomes, not just the median.
* '''Evaluation on last segment only''' — Evaluating only on the final test period may not represent the model's general quality. Use rolling window backtesting across multiple historical windows.


== Evaluating ==
Expert time series evaluation uses multiple metrics and rigorous experimental design:
 
'''Regression metrics''': MAE (Mean Absolute Error), RMSE, MAPE (Mean Absolute Percentage Error), sMAPE. MAPE is undefined when actual=0 and is skewed by near-zero values; sMAPE or MAE are more robust.
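The zero-denominator problem with MAPE is easy to see numerically. A minimal sketch of MAE and sMAPE (hypothetical values):

```python
import numpy as np

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def smape(y, yhat):
    # Symmetric MAPE: bounded, and defined when y == 0 (unless y == yhat == 0)
    return np.mean(2 * np.abs(y - yhat) / (np.abs(y) + np.abs(yhat))) * 100

y = np.array([0.0, 100.0, 200.0])
yhat = np.array([10.0, 110.0, 190.0])
print(mae(y, yhat))              # 10.0
# Plain MAPE would divide by zero on the first point; sMAPE stays finite
print(round(smape(y, yhat), 2))  # 71.55
```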
 
'''Probabilistic metrics''': For probabilistic forecasts (quantile or interval), use CRPS (Continuous Ranked Probability Score) or Winkler score. These reward well-calibrated uncertainty.
 
'''Rolling window backtesting''': Instead of one train/test split, slide a window across history — train on windows [0:T], [0:T+1], … and evaluate on each subsequent step. This tests the model across many historical regimes and avoids cherry-picking a favorable test period.
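The backtesting scheme above amounts to a split generator; a minimal expanding-window sketch (window sizes here are arbitrary):

```python
import numpy as np

def expanding_window_backtest(y, initial, horizon):
    """Yield (train, test) index arrays for an expanding-window backtest."""
    for end in range(initial, len(y) - horizon + 1):
        yield np.arange(end), np.arange(end, end + horizon)

y = np.arange(20)
splits = list(expanding_window_backtest(y, initial=15, horizon=2))
print(len(splits))  # 4 windows: train ends at index 15, 16, 17, 18
```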
 
'''Naive benchmarks''': Always compare to: naive (last value), seasonal naive (same period last cycle), and exponential smoothing. If a complex deep learning model cannot beat seasonal naive, it's not adding value.
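The two simplest benchmarks take a few lines each; a minimal sketch (toy data standing in for two weeks of daily observations):

```python
import numpy as np

def naive_forecast(y, horizon):
    # Repeat the last observed value
    return np.full(horizon, y[-1])

def seasonal_naive(y, horizon, season=7):
    # Repeat the last full seasonal cycle
    last_cycle = y[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_cycle, reps)[:horizon]

y = np.arange(1, 15)          # two "weeks" of data: 1..14
print(naive_forecast(y, 3))   # [14 14 14]
print(seasonal_naive(y, 10))  # [ 8  9 10 11 12 13 14  8  9 10]
```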
 
Expert practitioners report backtesting results as distributions (mean ± std across windows) rather than a single number, and explicitly test for robustness during unusual periods (holidays, pandemics, market crashes).


== Creating ==
Designing a production time series forecasting system:
 
'''1. Data architecture'''
<syntaxhighlight lang="text">
Raw time series sources (databases, IoT, APIs)
    ↓
[Time-indexed storage: InfluxDB, TimescaleDB, or Parquet partitioned by date]
    ↓
[Feature engineering pipeline:]
│  ├── Temporal features: hour, day of week, month, quarter
│  ├── Lag features: lag-1, lag-7, lag-28, rolling mean/std
│  ├── Fourier features for seasonality
│  └── External covariates: weather, holidays, promotions
    ↓
[Stationarity tests + differencing if needed]
    ↓
[Train/val/test split: chronological]
</syntaxhighlight>
 
'''2. Model training and selection'''
<syntaxhighlight lang="text">
Train multiple models (baseline naive, SARIMA, TFT, foundation model)
    ↓
Evaluate each with rolling window backtesting
    ↓
Select winner by held-out test MAPE and CRPS
    ↓
Train ensemble: weighted average of top-3 models (often beats any single model)
    ↓
Register in model registry with evaluation metrics
</syntaxhighlight>
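One common way to weight the ensemble step above is by inverse backtest error; a minimal sketch (the forecasts and MAE values are hypothetical):

```python
import numpy as np

# Hypothetical forecasts from three models, plus their backtest MAEs
forecasts = np.array([
    [102.0, 105.0, 108.0],   # e.g. SARIMA
    [100.0, 104.0, 107.0],   # e.g. TFT
    [104.0, 106.0, 110.0],   # e.g. foundation model
])
mae = np.array([4.0, 2.0, 8.0])

# Inverse-error weighting: lower-MAE models get more weight; weights sum to 1
w = (1 / mae) / (1 / mae).sum()
ensemble = w @ forecasts
print(np.round(w, 3))  # [0.286 0.571 0.143]
```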
 
'''3. Production serving and retraining'''
* Serve forecasts via API with caching (same-day forecast is rarely regenerated)
* Nightly retrain on latest data window (rolling retrain strategy)
* Monitor forecast accuracy vs. actuals in real-time; alert on anomalies
* Detect distribution shift: plot forecast distribution vs. actuals weekly
* Trigger manual review when MAE exceeds historical 95th percentile
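The 95th-percentile review trigger is a one-liner once accuracy history is logged; a minimal sketch (the error distribution here is synthetic):

```python
import numpy as np

# Synthetic history of daily forecast MAEs (stand-in for logged accuracy data)
rng = np.random.default_rng(1)
historical_mae = rng.gamma(shape=2.0, scale=5.0, size=365)

threshold = np.percentile(historical_mae, 95)
today_mae = 40.0
print(today_mae > threshold)  # True here: today's error is unusually large
```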


[[Category:Statistics]]
[[Category:Artificial Intelligence]]
[[Category:Data Science]]
[[Category:Machine Learning]]
[[Category:Finance]]
[[Category:Time Series]]

Revision as of 14:35, 23 April 2026
