Time Series Analysis: Difference between revisions
Latest revision as of 02:00, 25 April 2026
How to read this page: This article maps the topic from beginner to expert across six levels: Remembering, Understanding, Applying, Analyzing, Evaluating, and Creating. Scan the headings to see the full scope, then read from wherever your knowledge starts to feel uncertain.
Time series analysis is a set of statistical techniques for working with time series data: sequences of data points indexed in time order (e.g., stock prices, weather data, or population growth). Unlike many standard statistical methods, which assume that data points are independent, time series analysis recognizes that "what happened yesterday" often affects "what happens today." By identifying patterns such as trends, seasonality, and cycles, it lets us forecast future values, detect anomalies, and understand the dynamic forces that drive a system over time.
Remembering
- Time Series — A series of data points indexed (or listed or graphed) in time order.
- Trend — The long-term increase or decrease in the data (e.g., global warming).
- Seasonality — Predictable patterns that repeat over a fixed period (e.g., ice cream sales in summer).
- Cyclical Component — Patterns that repeat but not over a fixed period (e.g., economic "boom and bust").
- Noise (Irregular) — Random variations that cannot be explained by trend or seasonality.
- Stationarity — A property where the statistical characteristics of the series (mean, variance) do not change over time.
- Autocorrelation — The correlation of a signal with a delayed version of itself (how today relates to yesterday).
- Lag — The amount of time by which one variable is separated from another.
- Moving Average — A calculation to analyze data points by creating a series of averages of different subsets of the full data set.
- Forecasting — The process of making predictions of the future based on past and present data.
- ARIMA (AutoRegressive Integrated Moving Average) — A popular statistical model for time series forecasting.
- Smoothing — A technique to remove "noise" from a data set to see the underlying pattern.
- Exponential Smoothing — A technique that gives more "weight" to recent data points when forecasting.
- Seasonally Adjusted — Data that has had the seasonal component removed to reveal the true trend.
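The idea behind exponential smoothing, giving more weight to recent points, can be illustrated with a minimal sketch. The function name `exponential_smoothing`, the parameter `alpha`, and the choice to seed the level with the first observation are assumptions made for this example, not something the article prescribes.

```python
def exponential_smoothing(data, alpha):
    """Simple exponential smoothing.

    alpha (0 < alpha <= 1) is the smoothing factor: values near 1 follow
    the latest observation closely, values near 0 change slowly.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not data:
        return []
    smoothed = [data[0]]  # assumption: seed the level with the first value
    for x in data[1:]:
        # New level = weighted blend of the new point and the old level.
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


prices = [100, 102, 101, 105, 110]
levels = exponential_smoothing(prices, alpha=0.5)
print(levels)  # [100, 101.0, 101.0, 103.0, 106.5]
```

Because each level folds in the previous level, the weight on an old observation shrinks by a factor of `1 - alpha` with every new point. In this simple form, the last level can serve as the one-step-ahead forecast.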
Understanding
Time series analysis is about "Decomposing" a complex signal into its parts.
- **The Decomposition Equation**:
$\text{Data} = \text{Trend} + \text{Seasonality} + \text{Cycle} + \text{Noise}$
By pulling these apart, we can see that a 10% rise in sales might just be "Seasonality" (it's Christmas), not a "Trend" (your business is growing).
- **Stationarity (The Golden Rule)**: Many classical models (ARMA, for example) assume the data is "stationary," meaning its mean and variance stay constant over time. If your data has a trend (it keeps going up), it is not stationary. A common fix is **differencing**: modeling the *change* from one step to the next ($X_t - X_{t-1}$) instead of the raw values.
- **Autoregression (AR)**: The current value is modeled as a linear function of previous values, for example $X_t = c + \phi X_{t-1} + \varepsilon_t$ for a first-order model. If it rained yesterday, it is more likely to rain today. The strength of this relationship is measured by **autocorrelation**.
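Differencing and autocorrelation, as described in the list above, can be sketched in a few lines of plain Python. The helper names `difference` and `autocorrelation` and the sample series `trending` are made up for illustration; the autocorrelation uses the common sample estimator that divides the lagged covariance by the overall sum of squared deviations.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def autocorrelation(series, lag=1):
    """Sample autocorrelation at the given lag (a value between -1 and 1)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


# Hypothetical series with a clear upward trend (not stationary).
trending = [10, 12, 15, 17, 20, 22, 25, 27, 30, 32]

changes = difference(trending)
print(changes)  # [2, 3, 2, 3, 2, 3, 2, 3, 2]
print(round(autocorrelation(trending, lag=1), 3))  # 0.708
```

The raw series keeps climbing, so neighbouring values are strongly correlated. After differencing, the values hover around a constant level, which is closer to what stationary models expect.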
Applying
Calculating a Simple Moving Average (SMA):
<syntaxhighlight lang="python">
def simple_moving_average(data, window_size):
    """
    Smooths out a time series by averaging over a sliding window.

    Returns len(data) - window_size + 1 averages (an empty list if the
    window is longer than the data).
    """
    if window_size <= 0:
        raise ValueError("window_size must be a positive integer")
    averages = []
    for i in range(len(data) - window_size + 1):
        window = data[i : i + window_size]
        averages.append(sum(window) / window_size)
    return averages

# Stock prices over 10 days
prices = [100, 102, 101, 105, 110, 108, 115, 120, 118, 125]
sma_3 = simple_moving_average(prices, 3)

print(f"Original Prices: {prices}")
print(f"3-Day SMA: {sma_3}")

# Note how the 'zig-zags' are smoothed out, making the upward
# trend easier to see.
</syntaxhighlight>
Time Series in the Real World:
- Stock Market → Analyzing price "momentum" and volatility.
- Epidemiology → Tracking the "R-number" (the average number of new infections caused by each case) over time.
- IoT / Sensors → Detecting an anomaly (e.g., a machine vibrating abnormally) in a continuous stream of data.
- Inventory Management → Predicting how many turkeys to stock before Thanksgiving based on the last 5 years of data.
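The sensor anomaly use case in the list above can be sketched with a rolling z-score. This is one simple approach among many, not a method the article specifies: the function name `find_anomalies`, the default window of 5 readings, the threshold of 3 standard deviations, and the `vibration` data are all assumptions for illustration.

```python
import statistics


def find_anomalies(readings, window_size=5, threshold=3.0):
    """Flag readings that sit far from the mean of the preceding window.

    Returns a list of (index, value) pairs. Assumption: a reading counts
    as anomalous when it is more than `threshold` standard deviations
    away from the mean of the previous `window_size` readings.
    """
    anomalies = []
    for i in range(window_size, len(readings)):
        window = readings[i - window_size : i]
        mean = statistics.mean(window)
        spread = statistics.stdev(window)
        if spread == 0:
            continue  # flat window: the z-score is undefined, so skip it
        if abs(readings[i] - mean) / spread > threshold:
            anomalies.append((i, readings[i]))
    return anomalies


# Hypothetical vibration readings with one obvious spike at index 7.
vibration = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 0.9, 5.0, 1.0, 1.1]
print(find_anomalies(vibration))  # [(7, 5.0)]
```

One limitation of this sketch: once the spike enters the window, it inflates the standard deviation, so a second anomaly arriving shortly afterwards could go unnoticed.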
Analyzing
Additive vs. Multiplicative Seasonality

| Type | Description | Example |
|---|---|---|
| Additive | Seasonal swing stays roughly constant in absolute terms (e.g., ±10 units) | Monthly gym attendance |
| Multiplicative | Seasonal swing grows in proportion to the level of the series (e.g., ±10%) | Airline passengers (the summer peak gets larger as the industry grows) |
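The difference between the two rows of the table can be made concrete with synthetic data. In this sketch the linear `trend`, the period of 4, and both seasonal patterns are invented for illustration; the point is only to compare how far each series strays from its trend as the level rises.

```python
# Synthetic example: a rising level plus a repeating 4-step seasonal pattern.
trend = [100 + 10 * t for t in range(12)]
additive_pattern = [10, -10, 5, -5]                # fixed swing in units
multiplicative_pattern = [1.10, 0.90, 1.05, 0.95]  # fixed swing in percent

additive = [trend[t] + additive_pattern[t % 4] for t in range(12)]
multiplicative = [trend[t] * multiplicative_pattern[t % 4] for t in range(12)]


def seasonal_deviation(series, trend):
    """Distance of each point from the trend, rounded for readability."""
    return [round(x - m, 2) for x, m in zip(series, trend)]


additive_dev = seasonal_deviation(additive, trend)
multiplicative_dev = seasonal_deviation(multiplicative, trend)

print(additive_dev)        # the same four deviations repeat every cycle
print(multiplicative_dev)  # the deviations get larger as the trend rises
```

In the additive series the first point of every cycle sits 10 units above the trend. In the multiplicative series the same seasonal position sits 10, then 14, then 18 units above it, because 10% of a growing level is a growing amount.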
- **The Random Walk**: Some time series (like stock prices over short horizons) behave approximately like "random walks": tomorrow's value is today's value plus random noise. The best forecast for tomorrow is therefore simply "whatever it is today." In a true random walk, past price patterns carry no information about future moves, so technical analysis (looking at charts) has no predictive power, a hard lesson for many traders.
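A random walk and its "naive" forecast can be simulated in a few lines. This is a sketch under stated assumptions: Gaussian steps, a fixed seed for reproducibility, and the hypothetical helper names `simulate_random_walk` and `naive_forecast`.

```python
import random


def simulate_random_walk(n_steps, start=100.0, step_std=1.0, seed=0):
    """Generate a random walk: each value is the previous one plus noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = [start]
    for _ in range(n_steps):
        values.append(values[-1] + rng.gauss(0.0, step_std))
    return values


def naive_forecast(history):
    """Point forecast for a random walk: the last observed value."""
    return history[-1]


walk = simulate_random_walk(250)

# One-step-ahead errors of the naive forecast at every time step.
errors = [walk[t] - naive_forecast(walk[:t]) for t in range(1, len(walk))]

print(f"Points simulated: {len(walk)}")
print(f"Mean one-step error: {sum(errors) / len(errors):.3f}")
```

The forecast errors here are exactly the random steps themselves, so no pattern is left for a smarter model to exploit. That is what makes the naive forecast hard to beat on a true random walk.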
Evaluating
Evaluating a forecast:
- **RMSE (Root Mean Square Error)**: How far off were our predictions on average? Because errors are squared before averaging, large misses are penalized more heavily than small ones.
- **Overfitting**: Does the model work perfectly on old data but fail on new data? Always score a forecast on data the model has not seen.
- **Lag**: Does the model "lag" behind reality (only predicting a crash *after* it starts)?
- **White Noise**: Are the "residuals" (the errors) random and uncorrelated? If the errors have a pattern, our model is missing something.
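The checks above can be tried on the ten-day price series from the Applying section. This is a minimal sketch: the `rmse` helper, the 7/3 train/test split, and the "repeat the last value" baseline are choices made for this example rather than a recommended workflow.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


prices = [100, 102, 101, 105, 110, 108, 115, 120, 118, 125]

# Hold out the last 3 points so the score reflects unseen data.
train, test = prices[:7], prices[7:]

# Baseline model: repeat the last training value for every future step.
naive_predictions = [train[-1]] * len(test)
residuals = [a - p for a, p in zip(test, naive_predictions)]

print(f"Holdout RMSE: {rmse(test, naive_predictions):.2f}")  # 6.68
print(f"Residuals: {residuals}")  # [5, 3, 10]
```

All three residuals are positive, so the errors are not random: the baseline consistently under-predicts because it ignores the upward trend. That is the kind of leftover pattern the white-noise check is meant to expose.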
Creating
Future Frontiers:
- **LSTM (Long Short-Term Memory)**: A type of recurrent neural network (RNN) designed to capture longer-range dependencies than a plain RNN can (e.g., patterns that span many seasons of weather data).
- **Real-time Stream Processing**: Analyzing time series as they arrive, at rates of up to millions of points per second (used in fraud detection).
- **Multivariate Time Series**: Predicting one variable (e.g., sales) using dozens of others (e.g., weather, social media sentiment, competitor prices).
- **Causal Time Series**: Testing whether one time series helps *predict* another (Granger causality). Passing a Granger test shows predictive value, which is weaker than proof of true causation.
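For the stream-processing item, one building block is a statistic that updates in constant time and memory as each point arrives. The sketch below uses Welford's online algorithm for a running mean and variance; the class name `RunningStats` and the sample readings are assumptions for illustration, and a production system would add windowing, time decay, and fault tolerance on top.

```python
class RunningStats:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the statistics in O(1) time."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two points have arrived."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for reading in [100, 102, 101, 105, 110]:  # stand-in for a live stream
    stats.update(reading)

print(stats.count, round(stats.mean, 2), round(stats.variance, 2))
```

Nothing is stored except three numbers, so the same object can keep running no matter how many points flow through it.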