Time Series

From BloomWiki
{{BloomIntro}}
Time series analysis is a statistical discipline that deals with sequential, time-indexed data. A time series is a sequence of data points indexed in time order, and such data is ubiquitous: stock prices, electricity demand, web traffic, sensor readings, weather measurements, and patient vital signs all evolve over time. Unlike standard statistics, which assumes data points are independent, time series analysis recognizes that "what happened yesterday" often affects "what happens today." By identifying patterns such as trends, seasonality, and cycles, it allows us to forecast future values, detect anomalies, and understand the dynamic forces that drive a system over time. Traditional forecasting relied on statistical models like ARIMA; modern AI-driven approaches, including LSTMs, Temporal Fusion Transformers, and foundation models for time series, now achieve state-of-the-art performance across domains.


== Remembering ==
* '''Time series''' — A sequence of data points indexed in time order, typically at regular intervals.
* '''Univariate time series''' — A single variable measured over time (e.g., daily sales).
* '''Multivariate time series''' — Multiple variables measured simultaneously over time (e.g., temperature, humidity, and pressure together).
* '''Trend''' — The long-term direction of a time series (upward, downward, or flat), e.g., global warming.
* '''Seasonality''' — Regular, periodic patterns that repeat at known intervals (daily, weekly, yearly), e.g., ice cream sales in summer.
* '''Cyclical component''' — Patterns that repeat but not over a fixed period (e.g., economic "boom and bust").
* '''Noise (irregular component)''' — Random variations that cannot be explained by trend or seasonality.
* '''Stationarity''' — A property where the statistical characteristics of the series (mean, variance) do not change over time. Many models require stationarity.
* '''Residuals''' — The component remaining after removing trend and seasonality; ideally random noise.
* '''Autocorrelation''' — The correlation of a time series with its own past values (lags): how today relates to yesterday.
* '''Lag''' — A prior time step. Lag-1 is yesterday's value; lag-7 is last week's value.
* '''Moving average''' — A series of averages over sliding subsets of the data, used to analyze or smooth it.
* '''Smoothing''' — Removing "noise" from a data set to reveal the underlying pattern.
* '''Exponential smoothing''' — A smoothing technique that gives more weight to recent data points when forecasting.
* '''Seasonally adjusted''' — Data that has had the seasonal component removed to reveal the underlying trend.
* '''Forecasting''' — Predicting future values of a time series based on its historical patterns.
* '''Horizon''' — The number of future time steps to forecast (1-step-ahead vs. multi-step/multi-horizon).
* '''Rolling forecast''' — Re-fitting or updating the model as new data arrives, maintaining accuracy over time.
* '''ARIMA (AutoRegressive Integrated Moving Average)''' — A classical statistical model for univariate forecasting.
* '''LSTM (Long Short-Term Memory)''' — A type of RNN with gating mechanisms that captures long-range dependencies in sequences.
* '''Temporal Fusion Transformer (TFT)''' — A transformer-based model for multi-horizon time series forecasting, incorporating attention across time.
* '''Anomaly detection''' — Identifying data points, intervals, or patterns that deviate significantly from expected behavior.
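The autocorrelation and lag entries above can be made concrete with a short, self-contained sketch in plain Python (no libraries) that computes the sample autocorrelation at a given lag:

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with a lagged copy of itself."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var

# A steadily rising series strongly resembles its own recent past,
# so its lag-1 autocorrelation is strongly positive
series = [10, 12, 13, 15, 18, 21, 25, 30, 36, 43]
print(autocorrelation(series, 1))
```

At lag 0 the function returns exactly 1 (a series is perfectly correlated with itself); a value near 0 at lag 1 would suggest yesterday tells us little about today.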


== Understanding ==
Time series forecasting is inherently a sequential problem: the order of observations matters, and the past contains information about the future. This distinguishes it from tabular classification, where rows are exchangeable. At its core, time series analysis is about ''decomposing'' a complex signal into its parts.


'''The decomposition framework''' is key to understanding time series:
<syntaxhighlight lang="text">
Observed = Trend + Seasonal + Residual  (additive)
         = Trend × Seasonal × Residual  (multiplicative)
</syntaxhighlight>
By pulling these components apart, we can see that a 10% rise in sales might just be seasonality (it's Christmas), not a trend (your business is growing). Decomposition also enables targeted modeling: model the trend with regression, the seasonality with Fourier features or indicator variables, and the residual with a neural network or ARIMA.
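As a rough illustration of decomposition, a centered moving average can estimate the trend of a synthetic monthly series, leaving a seasonal-plus-noise remainder. This is a minimal sketch on made-up data, not a production decomposition (libraries such as statsmodels provide `seasonal_decompose` for real work):

```python
import math

def centered_moving_average(series, window):
    """Estimate the trend with a centered moving average (odd window)."""
    half = window // 2
    trend = [None] * len(series)  # edges cannot be averaged symmetrically
    for i in range(half, len(series) - half):
        trend[i] = sum(series[i - half : i + half + 1]) / window
    return trend

# Two years of synthetic monthly data: linear trend + yearly seasonality
series = [100 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(24)]

trend = centered_moving_average(series, 13)  # odd window spanning ~one season
detrended = [x - t for x, t in zip(series, trend) if t is not None]
# `detrended` now contains mostly the seasonal component
```

Because the window spans a full seasonal period, the seasonal swings cancel inside each average and the estimated trend tracks the underlying line.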


'''Stationarity (the golden rule)''': Most statistical models only work if the data is stationary. If your data has a trend (it's going up), it's not stationary. To fix this, we use '''differencing''': modeling the ''change'' from step to step ($X_t - X_{t-1}$) instead of the raw values.

'''Autoregression (AR)''': The idea that the current value is a linear function of previous values. If it rained yesterday, it is more likely to rain today. The strength of this relationship is measured by '''autocorrelation'''.

'''Why deep learning?''' Classical models like ARIMA excel at capturing simple autocorrelation but struggle with:
* Non-linear relationships between variables
* Multiple interacting series (multivariate)
* Complex, multi-scale seasonality
* Incorporating exogenous variables (weather, holidays, promotions)

LSTMs can capture non-linear temporal dependencies and handle arbitrary-length sequences. Transformers add the ability to attend to any past time step directly, avoiding the vanishing-gradient problem over long sequences. Foundation models for time series (TimeGPT, MOIRAI, Chronos), pre-trained on billions of time points, can zero-shot forecast on new series.
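The differencing fix described above takes only a few lines; note how a linearly trending (non-stationary) series becomes a constant (stationary) one:

```python
def difference(series, lag=1):
    """Differenced series: the change relative to `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# A trending series is non-stationary; its first differences are constant
trend_series = [3 * t + 5 for t in range(10)]   # 5, 8, 11, ..., 32
print(difference(trend_series))                  # → [3, 3, 3, 3, 3, 3, 3, 3, 3]
# difference(series, lag=7) would likewise remove a weekly pattern
```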
 
'''Evaluation discipline''': A critical mistake in time series is using random train/test splits. This causes data leakage — future data leaks into the training set. Always use chronological splits: train on the first 70–80%, validate on the next 10–15%, test on the most recent 10–15%.
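A chronological split reduces to computing two boundary indices; a minimal helper (the fractions are the illustrative 70/15/15 from the text):

```python
def chronological_split(n, train_frac=0.70, val_frac=0.15):
    """Boundary index ranges for a leakage-free chronological split."""
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return range(0, train_end), range(train_end, val_end), range(val_end, n)

# Train on the oldest 70 steps, validate on the next 15, test on the newest 15
train_idx, val_idx, test_idx = chronological_split(100)
```

Every training index precedes every validation index, which precedes every test index, so no future information leaks backward.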


== Applying ==
'''Calculating a Simple Moving Average (SMA):'''

<syntaxhighlight lang="python">
def simple_moving_average(data, window_size):
    """
    Smooths out a time series by averaging over a window.
    """
    averages = []
    for i in range(len(data) - window_size + 1):
        window = data[i : i + window_size]
        averages.append(sum(window) / window_size)
    return averages

# Stock prices over 10 days
prices = [100, 102, 101, 105, 110, 108, 115, 120, 118, 125]
sma_3 = simple_moving_average(prices, 3)

print(f"Original Prices: {prices}")
print(f"3-Day SMA:       {sma_3}")
# Note how the 'zig-zags' are smoothed out, making the upward
# trend easier to see.
</syntaxhighlight>

'''Multi-horizon forecasting with Temporal Fusion Transformer (PyTorch Forecasting):'''

<syntaxhighlight lang="python">
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer
from pytorch_forecasting.metrics import QuantileLoss
import lightning.pytorch as pl

# Load data: each row = one time step for one series
df = pd.read_csv("sales_data.csv", parse_dates=["date"])
df["time_idx"] = (df["date"] - df["date"].min()).dt.days  # integer time index

max_encoder_length = 60     # Use 60 past days as context
max_prediction_length = 14  # Forecast 14 days ahead

# Training dataset
training = TimeSeriesDataSet(
    df[lambda x: x.time_idx <= x.time_idx.max() - max_prediction_length],
    time_idx="time_idx",
    target="sales",
    group_ids=["store_id", "product_id"],      # Multiple series
    min_encoder_length=30,
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    static_categoricals=["store_id", "product_id"],
    time_varying_known_reals=["time_idx", "price", "day_of_week", "is_holiday"],
    time_varying_unknown_reals=["sales"],       # Only target is "unknown" in future
    target_normalizer="auto",
)

validation = TimeSeriesDataSet.from_dataset(training, df, predict=True, stop_randomization=True)
train_dl = training.to_dataloader(train=True, batch_size=64)
val_dl = validation.to_dataloader(train=False, batch_size=64)

# TFT model
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=64,
    attention_head_size=4,
    dropout=0.1,
    hidden_continuous_size=32,
    output_size=7,          # 7 quantile predictions (p10 to p90)
    loss=QuantileLoss(),
    log_interval=10,
)

trainer = pl.Trainer(max_epochs=30, accelerator="gpu", gradient_clip_val=0.1)
trainer.fit(tft, train_dl, val_dl)
</syntaxhighlight>


; Time Series in the Real World
: '''Stock Market''' → Analyzing price "momentum" and volatility.
: '''Epidemiology''' → Tracking the "R-number" (spread of a virus) over time.
: '''IoT / Sensors''' → Detecting an "Anomaly" (e.g., a machine vibrating weirdly) in a constant stream of data.
: '''Inventory Management''' → Predicting how many turkeys to stock before Thanksgiving based on the last 5 years of data.

; Model selection guide by forecasting scenario
: '''Simple univariate, clean seasonality''' → SARIMA, Prophet (Meta), ETS
: '''Univariate with complex patterns''' → N-BEATS, N-HiTS, PatchTST
: '''Multivariate with known future covariates''' → Temporal Fusion Transformer, DeepAR
: '''Very short series or irregular intervals''' → Gaussian Processes, ARIMA
: '''Many series, zero-shot''' → TimeGPT, Chronos, MOIRAI (foundation models)
: '''Anomaly detection''' → Isolation Forest (tabular features), LSTMAD, Anomaly Transformer


== Analyzing ==
{| class="wikitable"
|+ Additive vs. Multiplicative Seasonality
! Type !! Description !! Example
|-
| Additive || Seasonal variation is constant (±10 units) || Monthly gym attendance
|-
| Multiplicative || Seasonal variation grows with the trend (±10%) || Airline passengers (more people travel in summer as the total industry grows)
|}

{| class="wikitable"
|+ Time Series Model Comparison
! Model !! Type !! Strengths !! Weaknesses
|-
| ARIMA/SARIMA || Statistical || Interpretable, fast, works on small data || Assumes linearity, one series at a time
|-
| Prophet || Statistical || Handles holidays, trend changepoints || Limited to single series; no covariates
|-
| DeepAR || Deep Learning (LSTM) || Probabilistic, many series || Needs lots of data, slow training
|-
| TFT || Transformer || Multi-horizon, covariate-rich, interpretable || Complex, high data requirement
|-
| N-BEATS || Deep Learning (MLP) || Fast, competitive, no feature engineering || Limited covariate support
|-
| Chronos (foundation) || LLM-style || Zero-shot, no training needed || No covariate support yet; large model
|}
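A multiplicative series can often be reduced to an additive one with a log transform, since log(Trend × Seasonal) = log(Trend) + log(Seasonal). A toy sketch with made-up numbers:

```python
import math

# Multiplicative toy series: the seasonal swing grows with the level
level = [100 * 1.05 ** t for t in range(24)]
season = [1 + 0.2 * math.sin(2 * math.pi * t / 12) for t in range(24)]
observed = [l * s for l, s in zip(level, season)]

# In log space the seasonal amplitude is constant, so additive
# decomposition and additive models apply
logged = [math.log(x) for x in observed]
```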


'''The Random Walk''': Some time series (like the stock market in the short term) are "random walks": the best prediction for tomorrow is simply "whatever it is today" plus some random noise. In a true random walk, technical analysis (reading charts) is impossible, a hard lesson for many traders.

'''Failure modes:'''
* '''Chronological leakage''' — Random train/test splits allow future data to inform past predictions, producing falsely optimistic results. Always split chronologically.
* '''Ignoring non-stationarity''' — Many models assume stationarity. Differencing (ARIMA) or normalization per-series is required.
* '''Ignoring distributional shift''' — Retail models trained pre-COVID performed terribly during COVID. Extreme events cause structural breaks that no model trained on historical data anticipates.
* '''Point forecast overconfidence''' — Reporting only mean forecasts without uncertainty intervals. Downstream planning needs to understand the range of outcomes, not just the median.
* '''Evaluation on last segment only''' — Evaluating only on the final test period may not represent the model's general quality. Use rolling window backtesting across multiple historical windows.
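The random-walk observation can be checked numerically: simulate a walk and compare the naive forecast (tomorrow = today) against forecasting the historical mean. The walk length and seed are arbitrary:

```python
import random

random.seed(42)   # arbitrary seed, for reproducibility only
walk = [0.0]
for _ in range(999):
    walk.append(walk[-1] + random.gauss(0, 1))

# One-step-ahead naive forecast vs. forecasting the overall mean level
naive_mae = sum(abs(walk[t] - walk[t - 1]) for t in range(1, len(walk))) / 999
mean_level = sum(walk) / len(walk)
mean_mae = sum(abs(walk[t] - mean_level) for t in range(1, len(walk))) / 999
# On a random walk the naive forecast wins decisively; no chart
# pattern in the history improves on "same as today"
```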


== Evaluating ==
Expert time series evaluation uses multiple metrics and rigorous experimental design. When judging a forecast, ask: How far off were the predictions on average (RMSE, MAE)? Does the model work on old data but fail on new data (overfitting)? Does the model "lag" behind reality, only predicting a crash ''after'' it starts? And are the residuals white noise? If the errors still have a pattern, the model is missing something.
 
'''Regression metrics''': MAE (Mean Absolute Error), RMSE, MAPE (Mean Absolute Percentage Error), sMAPE. MAPE is undefined when actual=0 and is skewed by near-zero values; sMAPE or MAE are more robust.
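A sketch of the point metrics. Note that sMAPE conventions vary between implementations; here the "skip when both values are zero" convention is assumed:

```python
def mae(actual, forecast):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root Mean Square Error."""
    return (sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)) ** 0.5

def smape(actual, forecast):
    """Symmetric MAPE (%); stays finite when an actual value is zero."""
    terms = [
        0.0 if a == f == 0 else 2 * abs(f - a) / (abs(a) + abs(f))
        for a, f in zip(actual, forecast)
    ]
    return 100 * sum(terms) / len(terms)

actual = [100, 0, 110]
forecast = [90, 5, 115]
# Plain MAPE divides by the actual value and is undefined on the zero;
# sMAPE and MAE remain well-behaved
```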
 
'''Probabilistic metrics''': For probabilistic forecasts (quantile or interval), use CRPS (Continuous Ranked Probability Score) or Winkler score. These reward well-calibrated uncertainty.
 
'''Rolling window backtesting''': Instead of one train/test split, slide a window across history — train on windows [0:T], [0:T+1], … and evaluate on each subsequent step. This tests the model across many historical regimes and avoids cherry-picking a favorable test period.
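Rolling-origin backtesting reduces to generating index windows; a minimal sketch (the window sizes are illustrative):

```python
def rolling_origin_splits(n, initial_train, horizon, step):
    """Yield (train_end, test_start, test_end) index triples for backtesting."""
    train_end = initial_train
    while train_end + horizon <= n:
        yield train_end, train_end, train_end + horizon
        train_end += step

# 100 observations: first train on 70, then slide the origin a week at a time
splits = list(rolling_origin_splits(n=100, initial_train=70, horizon=7, step=7))
print(splits)   # → [(70, 70, 77), (77, 77, 84), (84, 84, 91), (91, 91, 98)]
```

Each triple means "train on everything before `train_end`, evaluate on the next `horizon` steps"; aggregating the metric across all windows gives the backtest distribution.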
 
'''Naive benchmarks''': Always compare to: naive (last value), seasonal naive (same period last cycle), and exponential smoothing. If a complex deep learning model cannot beat seasonal naive, it's not adding value.
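The naive benchmarks are each only a few lines, which is exactly why they make honest baselines; a sketch:

```python
def naive_forecast(history, horizon):
    """Every future step looks like the last observed value."""
    return [history[-1]] * horizon

def seasonal_naive_forecast(history, horizon, season_length):
    """Every future step looks like the same step one season earlier."""
    return [history[-season_length + (h % season_length)] for h in range(horizon)]

weekly = [20, 22, 25, 30, 45, 60, 50] * 4     # four weeks of daily sales
print(naive_forecast(weekly, 3))              # → [50, 50, 50]
print(seasonal_naive_forecast(weekly, 3, 7))  # → [20, 22, 25]
```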
 
Expert practitioners report backtesting results as distributions (mean ± std across windows) rather than a single number, and explicitly test for robustness during unusual periods (holidays, pandemics, market crashes).


== Creating ==
'''Future frontiers''': (1) deep sequence models such as LSTMs that capture extremely long-term dependencies (e.g., weather patterns over decades); (2) real-time stream processing that analyzes series at millions of points per second (used in fraud detection); (3) richly multivariate forecasting that predicts one variable (e.g., sales) from 50 others (e.g., weather, social media sentiment, competitor prices); (4) causal analysis that determines whether one time series ''causes'' another (Granger causality).

Designing a production time series forecasting system:
 
'''1. Data architecture'''
<syntaxhighlight lang="text">
Raw time series sources (databases, IoT, APIs)
    ↓
[Time-indexed storage: InfluxDB, TimescaleDB, or Parquet partitioned by date]
    ↓
[Feature engineering pipeline:]
│  ├── Temporal features: hour, day of week, month, quarter
│  ├── Lag features: lag-1, lag-7, lag-28, rolling mean/std
│  ├── Fourier features for seasonality
│  └── External covariates: weather, holidays, promotions
    ↓
[Stationarity tests + differencing if needed]
    ↓
[Train/val/test split: chronological]
</syntaxhighlight>
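The feature-engineering stage of the pipeline above might look like this in pandas; the column names and synthetic data are purely illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical daily sales series
idx = pd.date_range("2024-01-01", periods=120, freq="D")
df = pd.DataFrame({"sales": np.random.default_rng(0).poisson(50, 120)}, index=idx)

# Temporal features
df["day_of_week"] = df.index.dayofweek
df["month"] = df.index.month

# Lag and rolling features; shift(1) keeps today's value out of its own window
for lag in (1, 7, 28):
    df[f"lag_{lag}"] = df["sales"].shift(lag)
df["rolling_mean_7"] = df["sales"].shift(1).rolling(7).mean()
df["rolling_std_7"] = df["sales"].shift(1).rolling(7).std()

# Fourier features for weekly seasonality
t = np.arange(len(df))
df["sin_7"] = np.sin(2 * np.pi * t / 7)
df["cos_7"] = np.cos(2 * np.pi * t / 7)

df = df.dropna()   # drop the rows lost to the longest lag
```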
 
'''2. Model training and selection'''
<syntaxhighlight lang="text">
Train multiple models (baseline naive, SARIMA, TFT, foundation model)
    ↓
Evaluate each with rolling window backtesting
    ↓
Select winner by held-out test MAPE and CRPS
    ↓
Train ensemble: weighted average of top-3 models (often beats any single model)
    ↓
Register in model registry with evaluation metrics
</syntaxhighlight>
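The ensemble step above can be sketched as a weighted average of aligned forecasts; the models and weights here are hypothetical (in practice weights might come from inverse validation error):

```python
def ensemble_forecast(forecasts, weights):
    """Weighted average of aligned forecasts from several models."""
    total = sum(weights)
    horizon = len(forecasts[0])
    return [
        sum(w * f[h] for w, f in zip(weights, forecasts)) / total
        for h in range(horizon)
    ]

# Hypothetical 3-step forecasts from three models
sarima = [100, 102, 104]
tft = [98, 101, 107]
chronos = [102, 103, 103]
blended = ensemble_forecast([sarima, tft, chronos], weights=[0.2, 0.5, 0.3])
print([round(x, 1) for x in blended])   # → [99.6, 101.8, 105.2]
```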
 
'''3. Production serving and retraining'''
* Serve forecasts via API with caching (same-day forecast is rarely regenerated)
* Nightly retrain on latest data window (rolling retrain strategy)
* Monitor forecast accuracy vs. actuals in real-time; alert on anomalies
* Detect distribution shift: plot forecast distribution vs. actuals weekly
* Trigger manual review when MAE exceeds historical 95th percentile


[[Category:Artificial Intelligence]]
[[Category:Statistics]]
[[Category:Machine Learning]]
[[Category:Data Science]]
[[Category:Time Series]]
[[Category:Finance]]

Revision as of 14:28, 23 April 2026
