Introduction
Time series forecasting is a technique for predicting future values from historical data. It is widely used in domains such as finance, sales, and weather forecasting, and it can also be framed as a supervised learning problem for machine learning (ML) models. Doing so raises a few challenges: the data must first be reshaped into the input/target pattern that supervised learning expects before we can apply our usual ML techniques to it.
Using a Sliding Window
Let's say we want to apply this framework to a time series. One approach is known as the sliding window method, or the lag method.
Let's consider an example where we have the monthly average gas price in the state of Washington for the last 10 months, defined as X:
X = [3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44]
We can define the target variable to be the next entry in the series:
X = [?, 3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44]
Y = [3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44, ?]
Now, for every entry in X (with the exception of the first and last), the corresponding entry in Y is the next month's gas price. The first row has no input and the last row has no target, so we delete the first and last entries in both series. The lag in this method is the number of previous time steps used as input, which in this case is 1.
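As a minimal sketch of the framing described above, the snippet below pairs each window of past values with the value that follows it. The function name make_supervised and its lag parameter are illustrative assumptions, not part of any library.

```python
# Gas price series from the example above.
series = [3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44]


def make_supervised(values, lag=1):
    """Pair each window of `lag` past values with the value that follows it.

    Hypothetical helper for illustration; rows that would contain a missing
    value (the '?' entries above) are simply never generated.
    """
    X, y = [], []
    for i in range(lag, len(values)):
        X.append(values[i - lag:i])  # the `lag` observations before step i
        y.append(values[i])          # the observation at step i is the target
    return X, y


X, y = make_supervised(series, lag=1)
for features, target in zip(X, y):
    print(features, "->", target)
```

With a lag of 1 this produces nine input/target pairs from the ten observations. Passing a larger lag widens the window and shortens the training set by the same amount.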
Sliding Window for Multivariate Time Series
We can now apply this to a multivariate time series to build a more robust forecasting model. Classical statistical methods can be harder to extend to several interacting variables, whereas many ML models accept additional input features without modification, so we can use the same sliding window approach for multivariate time series.
Let's say we have the monthly average gas prices, as well as the monthly inflation percentage changes, in the state of Washington:
X = [[3.84, 0.4], [3.94, 0.5], [4.21, 0.7], [4.62, 0.8], [4.24, 0.64], [4.21, 0.72], [5.44, 0.87], [4.73, 0.91], [4.22, 1.01], [4.44, 0.92]]
If we want to predict only the gas prices in Washington, we can apply our sliding window to the gas prices alone, as above. However, if we want to predict both the gas prices and the inflation percentage changes, then each row needs the lagged values of both variables as inputs and the next values of both variables as targets.
With this, we can only predict a single step into the future. How would this scale to multi-step forecasting?
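One way to sketch the multivariate case is to flatten the lagged rows into a single feature vector and pick which columns become targets. The helper name make_supervised_multivariate and the target_columns parameter are assumptions made for this illustration.

```python
# [gas price, inflation % change] pairs from the example above.
rows = [
    [3.84, 0.4], [3.94, 0.5], [4.21, 0.7], [4.62, 0.8], [4.24, 0.64],
    [4.21, 0.72], [5.44, 0.87], [4.73, 0.91], [4.22, 1.01], [4.44, 0.92],
]


def make_supervised_multivariate(rows, lag=1, target_columns=(0, 1)):
    """Build (inputs, targets) from a multivariate series.

    Hypothetical helper: inputs are the previous `lag` rows flattened into
    one list; targets are the selected columns of the current row.
    """
    X, y = [], []
    for i in range(lag, len(rows)):
        features = [value for row in rows[i - lag:i] for value in row]
        targets = [rows[i][c] for c in target_columns]
        X.append(features)
        y.append(targets)
    return X, y


# Predict both variables from the previous month's values of both.
X_both, y_both = make_supervised_multivariate(rows, lag=1)

# Predict only the gas price, still using both variables as inputs.
X_gas, y_gas = make_supervised_multivariate(rows, lag=1, target_columns=(0,))
```

The second call is a variation on the gas-only case: it keeps inflation as an input feature while restricting the target to the gas price.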
Multi-step Forecasting
When it comes to multi-step forecasting, our predictions will start becoming less precise the further out we predict, but as long as the lag is reasonable, it should still yield decent results.
Let's consider our initial example. If we want to predict N months into the future, we can extend the range of our forecast by creating N target variables, shifted by one up to N months, as shown here for N = 2:
X = [?, 3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44]
Y_1 = [3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44, ?]
Y_2 = [3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44, ?, ?]
As before, any row that contains a ? is dropped.
We need to make sure that every target variable is shifted relative to the same input window, so that the size of our sliding window stays intact. If we start increasing the size of the sliding window, we could skew the prediction.
While this technique can work, we would be putting more emphasis on our initial input data, which can skew the results of the prediction, so it's best to keep N small rather than extending it far into the future.
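The multi-step framing above can be sketched by giving each input window a list of N following values as its target. The helper name make_multistep and its horizon parameter (standing in for N) are assumptions for this example, and the model that would be trained on these pairs is left out.

```python
# Gas price series from the initial example.
series = [3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44]


def make_multistep(values, lag=1, horizon=2):
    """Pair each window of `lag` past values with the next `horizon` values.

    Hypothetical helper: windows near the end of the series that do not have
    `horizon` future values available are skipped.
    """
    X, Y = [], []
    for i in range(lag, len(values) - horizon + 1):
        X.append(values[i - lag:i])      # input window
        Y.append(values[i:i + horizon])  # targets for steps 1..horizon ahead
    return X, Y


X, Y = make_multistep(series, lag=1, horizon=2)
for features, targets in zip(X, Y):
    print(features, "->", targets)
```

Each extra step in the horizon removes one usable row from the end of the series, which is one practical reason to keep N small on short series like this one.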
Data Pre-processing Notes
One thing we need to consider is how we prepare our data for time series forecasting with ML models. We need to scale or normalize the inputs that we provide, and the data should be stationary, so we detrend the input data where applicable.
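As a rough sketch of these two steps, the snippet below removes trend with first-order differencing (one common choice, assumed here rather than prescribed by the text) and then applies min-max scaling. All function names are illustrative.

```python
# Gas price series from the earlier example.
series = [3.84, 3.94, 4.21, 4.62, 4.24, 4.21, 5.44, 4.73, 4.22, 4.44]


def difference(values):
    """First-order differencing: the change from each step to the next."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


def invert_difference(first_value, diffs):
    """Rebuild the original series from its first value and its differences."""
    restored = [first_value]
    for d in diffs:
        restored.append(restored[-1] + d)
    return restored


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


diffs = difference(series)                      # detrended series, one shorter
scaled = min_max_scale(diffs)                   # values now lie in [0, 1]
restored = invert_difference(series[0], diffs)  # undo the differencing
```

In practice the scaling bounds would be computed on the training portion only and reused for later data, so that information from the future does not leak into the inputs.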
Conclusion
When it comes to creating good ML models for time series forecasting, we need to decide how we process and train on our input data. Supervised learning, like many of the current techniques for time series forecasting, works well in specific situations, and this is a reminder that there is no one-size-fits-all model. It is crucial to determine which model and technique is right for your given situation, so that you can create better forecasts and models.