In one of my earlier blogs, I used a linear forecasting method to predict a team’s future scores. While that came darn close to reality (when we checked against actual results), it did not mathematically account for fluctuations or seasonality, and the outcome was always a straight line.
In this blog, I show another method (one that uses a more sophisticated algorithm) that accounts for seasonality, and I compare the two both in numbers and in charts. We’ll be using the ETS algorithm for forecasting. The ETS (Error, Trend, Seasonal) method is an approach to forecasting univariate time series that focuses on the trend and seasonal components. By its very nature, it’s used in weather, sales, stock market, and many kinds of economic forecasting, as it uses a weighted mean of past values in which recent values count for more than older ones.
The linear forecasting method, by contrast, uses least squares (linear regression) to forecast future values from historical figures. It’s also useful for forecasting sales, store stock requirements, trends, etc.
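To make the least-squares idea concrete, here is a minimal Python sketch of a straight-line forecast. The quarterly figures are made up for illustration (they are not the data from this post), and the function name linear_forecast is my own, not a library call.

```python
def linear_forecast(x_new, xs, ys):
    """Fit y = intercept + slope * x by least squares and evaluate it at x_new."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * x_new


# Hypothetical data: 16 quarters with an upward trend plus a repeating
# quarterly pattern. These numbers are invented for this example.
season = [1.0, 3.0, -1.0, -3.0]
periods = list(range(1, 17))
expenses = [10.0 + 0.5 * t + season[(t - 1) % 4] for t in periods]

# Forecast the four quarters of the following year (periods 17 to 20).
for t in range(17, 21):
    print(t, round(linear_forecast(t, periods, expenses), 2))
```

However bumpy the quarterly figures are, the four forecasts land on one straight line, which is exactly the limitation of the linear method.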
Let’s use them both and compare!
We have the following data set (say, these are the expenses of a business, in hundreds of units, in the Known column) for each quarter over the past four years. We want to predict values for all four quarters of the next year (year 5).
So, we can tabulate the data as follows, where the “?” cells are the values we seek.
I used the linear forecast function in the Forecast (Lin) column and the ETS function in the Forecast (ETS) column, independently. While I didn’t have to calculate for periods 1 through 16 (those are already known in their own column), I wanted to calculate them anyway and see how close each algorithm came to the facts…the known values for the past four years.
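If you want a single number for "how close", one common choice is the mean absolute error between the known values and each forecast column. The Python sketch below is only an illustration: the function name and the tiny lists of numbers are made up, and they are not the figures from this post.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual values and predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up numbers purely for illustration.
known = [10.0, 12.0, 9.0, 13.0]
forecast_a = [10.5, 11.0, 10.0, 12.0]  # stands in for a straight-line fit
forecast_b = [10.1, 11.8, 9.2, 12.9]   # stands in for a seasonal fit

print("forecast_a MAE:", mean_absolute_error(known, forecast_a))
print("forecast_b MAE:", mean_absolute_error(known, forecast_b))
```

The smaller the error, the closer that forecast column tracks the known values.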
Forecast (ETS) is clearly closer to reality. This is mainly because of the nature of the data. If we just plot the known data, we start to see a pattern…a seasonality, as follows:
There are ups and downs at the same quarters every year, even though the overall trend (we’ll get to that soon) looks to be going upward YoY.
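For the curious, here is a rough Python sketch of how a seasonal exponential-smoothing model can follow both the upward trend and the quarterly pattern. It is a simplified additive Holt-Winters routine with smoothing weights I picked arbitrarily; it is not the exact computation behind the ETS function used in this post, and the data is made up.

```python
def holt_winters_additive(ys, season_len, horizon, alpha=0.5, beta=0.1, gamma=0.1):
    """Additive triple exponential smoothing with fixed (assumed) weights.

    Returns (fitted, forecasts): one-step-ahead fits for the known periods
    and forecasts for the next `horizon` periods.
    """
    m = season_len
    if len(ys) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple starting values taken from the first two seasons.
    level = sum(ys[:m]) / m
    trend = (sum(ys[m:2 * m]) - sum(ys[:m])) / (m * m)
    seasonals = [ys[i] - level for i in range(m)]

    fitted = []
    for t, y in enumerate(ys):
        s = seasonals[t % m]
        fitted.append(level + trend + s)  # prediction made before seeing y
        prev_level = level
        # Each component is a weighted mean of new evidence and its old value.
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(ys)
    forecasts = [level + h * trend + seasonals[(n + h - 1) % m]
                 for h in range(1, horizon + 1)]
    return fitted, forecasts


# Hypothetical data: 16 quarters, upward trend plus a repeating pattern.
season = [1.0, 3.0, -1.0, -3.0]
expenses = [10.0 + 0.5 * t + season[(t - 1) % 4] for t in range(1, 17)]

fitted, forecasts = holt_winters_additive(expenses, season_len=4, horizon=4)
print([round(f, 2) for f in forecasts])
```

Unlike a straight line, the forecasts here rise and dip from quarter to quarter, because the model carries a separate seasonal offset for each quarter.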
Although the forecasts seem very credible (especially from ETS), it would be good to understand how close they are…in other words, how confident are we in these predictions? How do we quantify that?
One way is to get the ETS confidence interval at each predicted data point. It tells us how far the actual value is likely to stray from the predicted one. The lower the value, the more precise the forecast. So, I added a Conf. Int. column to calculate them, as shown below:
This is the CONFINT variant of the ETS family of the FORECAST function (in Excel, it is written FORECAST.ETS.CONFINT).
The value 0.43 means the predicted value has a margin of error of +/-0.43, at a 95% confidence level. Pretty darn credible!
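As a rough illustration of what a 95% margin means, the Python sketch below turns a list of past forecast errors into a +/- margin using a normal approximation (a z-score times the root-mean-square error). This is a simplification of my own, not the formula behind the ETS confidence interval, and both the error values and the forecast of 25.0 are made up.

```python
from statistics import NormalDist


def margin_of_error(residuals, confidence=0.95):
    """Approximate +/- margin, assuming roughly normal forecast errors."""
    # Two-sided z-score, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    rmse = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return z * rmse


def interval(forecast, margin):
    """Lower and upper bounds around a forecast."""
    return forecast - margin, forecast + margin


# Made-up forecast errors (known value minus fitted value).
residuals = [0.2, -0.3, 0.1, -0.2, 0.25, -0.15]
margin = margin_of_error(residuals)
low, high = interval(25.0, margin)
print(f"25.0 +/- {margin:.2f} -> between {low:.2f} and {high:.2f}")
```

Read this way, a margin of 0.43 says the actual value is expected to land within 0.43 of the prediction about 95% of the time.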
Another, non-numeric way to see how close we are is to plot all the known and predicted values on top of one another in a multi-series plot, as below:
As we can see, the linear forecast (yellow line) is a good fit that runs right through the data points; however, it doesn’t have the granularity to account for the fluctuations that the seasonal algorithm gives us. The orange line shows the exact pattern of the known values from the known years. The blue line is the mathematically predicted one, and it follows the orange line very closely! So, it’s a great predictor. The vertical gray line marks where the orange line ends…at the end of year 4, since that’s the end of the known values’ time period. The blue line of forecast values continues on and shows us what to expect in the future.
These methods have numerous applications in everyday real-life situations! I hope you enjoyed this blog. I deliberately stayed away from the statistical computations and proofs behind these methods, but if you’d like to learn more about data science/statistics/analytics, you can search the internet for the terms I used here.
Bonus tip: How would you know what seasonality (the number of periods in one season) was used in these complex computations, or auto-detected? Use FORECAST.ETS.SEASONALITY to get that number. It can be customized, but auto-detect worked for me, as it reasonably identified 4.
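If you wonder how a season length could be detected automatically at all, one simple idea is to look for the lag at which the series best matches a shifted copy of itself. The Python sketch below does that with autocorrelation on the period-to-period changes. It is my own illustration on made-up data, and I am not claiming it is the method the seasonality function uses internally.

```python
def autocorrelation(ys, lag):
    """Correlation between a series and itself shifted by `lag` periods."""
    n = len(ys)
    mean = sum(ys) / n
    denom = sum((y - mean) ** 2 for y in ys)
    num = sum((ys[t] - mean) * (ys[t - lag] - mean) for t in range(lag, n))
    return num / denom


def detect_season_length(ys, max_lag=None):
    """Guess the season length as the lag (2 or more) with the highest autocorrelation."""
    # Period-to-period differences remove a steady trend, which would
    # otherwise make every lag look strongly correlated.
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    max_lag = max_lag or len(diffs) // 2
    return max(range(2, max_lag + 1), key=lambda lag: autocorrelation(diffs, lag))


# Hypothetical data: 16 quarters, upward trend plus a repeating pattern.
season = [1.0, 3.0, -1.0, -3.0]
expenses = [10.0 + 0.5 * t + season[(t - 1) % 4] for t in range(1, 17)]

print(detect_season_length(expenses))
```

A simple detector like this can be fooled by noisy data or by patterns that also repeat at half the season length, so it is worth sanity-checking the detected number against what you know about the data.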