Time series forecasting was the topic of week 10’s lecture. Before forecasting, we first need to remove anything that is easy to predict from the data:
- Trends
- Cycles (Cyclical Components)
- Seasonal variations
- ‘Irregular’ variations: the hardest component to predict, and the one our neural networks will attempt to forecast (see the decomposition sketch after this list).
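As a concrete illustration, here is a minimal decomposition sketch in Python; the monthly water-use series, the additive model, and the 12-month period are all assumptions made for the example, using statsmodels’ seasonal_decompose rather than anything specified in the lecture.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly water-use figures: trend + yearly seasonality + noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01", periods=60, freq="MS")
t = np.arange(60)
usage = pd.Series(100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
                  + rng.normal(0, 2, 60), index=idx)

# Additive decomposition into trend, seasonal, and residual components.
decomp = seasonal_decompose(usage, model="additive", period=12)
irregular = decomp.resid  # the 'irregular' part left for a network to forecast
```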
Autocorrelation is generally stronger for recent data items and weakens as we step further back through the time series. Autocorrelation-based forecasting uses past data items in an attempt to predict n time steps into the future. It must be noted that errors incurred at earlier time steps are likely to grow as the prediction steps further forward. Although this point seems obvious, when viewing predictions that appear intuitively correct our own confirmation bias often outweighs our awareness of a model’s limitations.
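A quick sketch of that decay, using a hypothetical AR(1) series (the 0.8 coefficient and the lags printed are arbitrary choices):

```python
import numpy as np
import pandas as pd

# Hypothetical AR(1) series: each value is 0.8 times the previous one plus noise,
# so its correlation with the past fades as the lag grows.
rng = np.random.default_rng(0)
x = np.zeros(200)
for t in range(1, 200):
    x[t] = 0.8 * x[t - 1] + rng.normal()
series = pd.Series(x)

for lag in (1, 3, 6, 12):
    print(f"lag {lag:2d}: autocorrelation = {series.autocorr(lag):.3f}")
```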
Spatio-temporal models incorporate a principal component: a variable (or variables) whose influence on future time steps is significant. An example in our water-use prediction would be the rainfall of previous months; low rainfall would suggest higher water usage. There are many methods for identifying principal components; Karl Pearson pioneered this field, introducing both the product-moment correlation coefficient and principal component analysis.
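For illustration only, a minimal principal component analysis sketch with scikit-learn; the three candidate drivers (rainfall, temperature, population) and the random data are assumptions, not anything from the lecture.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical matrix of candidate drivers: columns for rainfall, temperature, population.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))

pca = PCA(n_components=2)
components = pca.fit_transform(X)     # the principal components themselves
print(pca.explained_variance_ratio_)  # share of variance each component captures
```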
Forecasting linear time series can be conducted using a single-layer perceptron, though it is questionable how much this tool improves on simpler modelling methods. Auto-regressive with exogenous variables [ARX] models use both previous time-series values and the states of the principal components to generate forecasts.
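A hedged sketch of the ARX idea: ordinary least squares over lagged usage values plus last month’s rainfall as the exogenous input; the three-lag window and all of the data are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical series: monthly water use and the corresponding rainfall.
rng = np.random.default_rng(0)
usage = 100 + np.cumsum(rng.normal(0, 2, 60))
rain = rng.gamma(2.0, 5.0, 60)

# ARX features: the last `lags` usage values plus the previous month's rainfall.
lags = 3
X = np.array([np.append(usage[t - lags:t], rain[t - 1]) for t in range(lags, 60)])
target = usage[lags:]

arx = LinearRegression().fit(X, target)
next_month = arx.predict(np.append(usage[-lags:], rain[-1]).reshape(1, -1))
```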
Evaluating model accuracy can be done in rudimentary fashion using the root mean square error [RMSE].
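For reference, RMSE is simply the square root of the mean squared error; a toy computation with made-up numbers:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # hypothetical observations
y_pred = np.array([2.8, 5.4, 2.9, 6.1])  # hypothetical forecasts

rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
print(f"RMSE = {rmse:.3f}")
```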
Moving past the simple single-layer networks, we reviewed time-lagged feedforward networks, which feed a sliding window of past values into a standard feedforward network (a sketch follows below).
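A minimal time-lagged feedforward sketch, approximated here with scikit-learn’s MLPRegressor fed a sliding window of past values; the window length, layer size, and synthetic series are all assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical series to forecast.
rng = np.random.default_rng(0)
y = np.sin(np.arange(200) / 6.0) + rng.normal(0, 0.1, 200)

# A sliding window of the last `window` values is the network's input.
window = 6
X = np.array([y[t - window:t] for t in range(window, len(y))])
target = y[window:]

tlfn = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
tlfn.fit(X, target)
next_value = tlfn.predict(y[-window:].reshape(1, -1))
```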
We then moved on to non-linear auto-regressive with exogenous variables [NARX] networks, which extend the same idea by adding exogenous inputs, such as rainfall, alongside the past values of the series (a sketch follows below).
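A NARX-style sketch along the same lines: the input window now includes past values of a hypothetical exogenous rainfall series alongside past water use; again, every name and number here is an assumption.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical usage and exogenous rainfall series.
rng = np.random.default_rng(0)
usage = 100 + np.cumsum(rng.normal(0, 2, 200))
rain = rng.gamma(2.0, 5.0, 200)

# NARX-style inputs: past usage values and past rainfall values together.
window = 6
X = np.array([np.concatenate([usage[t - window:t], rain[t - window:t]])
              for t in range(window, 200)])
target = usage[window:]

narx = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
narx.fit(X, target)
```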
The same training principles as with standard neural networks apply to time series forecasting. Importantly, the training data must be kept in chronological order, as forecasting demands, in contrast to classification, where examples can be shuffled.
Again, attention must be paid to over- and under-fitting: minimizing RMSE on the training data does not imply an accurate model for unseen or future data.
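A short sketch covering both points: split the series chronologically (earlier observations for training, later ones held out, no shuffling) and compare RMSE on each portion; the data and model are the same hypothetical choices as in the sketches above.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical series and sliding-window features, as in the earlier sketches.
rng = np.random.default_rng(0)
y = np.sin(np.arange(200) / 6.0) + rng.normal(0, 0.1, 200)
window = 6
X = np.array([y[t - window:t] for t in range(window, len(y))])
target = y[window:]

# Chronological split: no shuffling, so the held-out block is genuinely 'future' data.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = target[:split], target[split:]

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

def rmse(pred, true):
    return np.sqrt(np.mean((pred - true) ** 2))

print("train RMSE:", rmse(model.predict(X_train), y_train))
print("test RMSE: ", rmse(model.predict(X_test), y_test))
```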