Prophet Forecasting
Suravi Mahanta
Senior Consultant at EY GDS | Ex-Accenture | Microsoft Modern Data Platform Expert | Big Data Specialist | AI/ML Engineer | 4X Microsoft Certified | 3X Databricks Certified | Data Architecture
Forecasting is one of the most widely used applications of machine learning in business. Managers in almost every company rely on some kind of forecast to understand their data and make decisions, which makes forecasting increasingly important.
However, the question we should ask ourselves is: “Are the forecasting techniques we have been using for the last 5-10 years flexible and scalable enough for today’s business?” I think the answer is no, because today we are dealing with more complex business problems than before. That is why many engineers keep working to improve and advance forecasting techniques.
Facebook is one such company: it designed its own library, “Prophet”, to make forecasting more scalable and automated for its own business problems and to help others forecast easily. Prophet is based on an additive model whose trend can be linear or non-linear. The library has several advantages over other forecasting methods:
a. It is open source.
b. It is fully automated.
c. It can handle outliers.
d. It can handle missing data.
e. It can handle strong seasonal effects.
f. It is available in Python and R.
g. It is fast and accurate.
h. It gives full control over its parameters.
i. It can handle sudden fluctuations in time series.
The Prophet library works well with daily data, although it also accepts other frequencies such as the monthly data used later in this article.
Statistically, Prophet is an additive model: the forecast is the sum of separate component functions, each designed for one aspect of the series. The trend is a piecewise function whose rate changes are L1-regularized, and seasonality is modelled with a Fourier series. Fitting each component with a function suited to it is what makes Prophet an effective time series forecasting model.
Mathematically:
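The decomposition is commonly written as follows; the symbols are taken from the Prophet paper by Taylor and Letham rather than from this article:

```latex
\[
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
\]
\[
s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)
\]
```

Here g(t) is the trend, s(t) the seasonality, h(t) the effect of holidays, and ε_t the error term. P is the period of the seasonality (for example 365.25 days for a yearly cycle) and N controls how many Fourier terms are used.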
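To install Prophet in Python for an Anaconda notebook, follow these steps:
- Open the Anaconda command prompt and type “conda install gcc”, then type “conda install -c conda-forge fbprophet”. From release 1.0 onward the package is published as “prophet”, so on newer setups use “conda install -c conda-forge prophet”.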
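Since the package has been published under two names over time, it can help to check which one your environment actually has. The helper below is a hypothetical convenience, not part of Prophet; it only uses the standard library:

```python
import importlib.util


def installed_prophet_package():
    """Return the importable Prophet package name, or None if it is missing."""
    # Newer releases are published as "prophet", older ones as "fbprophet".
    for name in ("prophet", "fbprophet"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None


package = installed_prophet_package()
print(package or "Prophet is not installed in this environment")
```

Whichever name is reported is the one to use in the import statement, for example `from prophet import Prophet`.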
To install Prophet in Python for an Azure Databricks notebook, follow these steps:
- Open the Azure Databricks portal and log in.
- Launch the workspace.
- Open and start the cluster.
- Open the cluster on which you want to use this library.
- Open Libraries and install a new library.
To understand Prophet better, let’s work through a quick project. For the demo I downloaded the “Air Passenger” dataset from Kaggle. Let’s start by importing the libraries we will need:
After downloading the dataset, the next step is to load it:
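A minimal sketch of the loading step, assuming pandas is available. To keep it runnable without the download, the first twelve rows of the series are embedded inline; the file name in the comment is an assumption about how the Kaggle file was saved:

```python
import io

import pandas as pd

# Stand-in for the downloaded file: the first year of the Air Passengers series.
# With the real file, pass its path instead, e.g. pd.read_csv("AirPassengers.csv").
sample_csv = io.StringIO(
    "Month,#Passengers\n"
    "1949-01,112\n1949-02,118\n1949-03,132\n1949-04,129\n"
    "1949-05,121\n1949-06,135\n1949-07,148\n1949-08,148\n"
    "1949-09,136\n1949-10,119\n1949-11,104\n1949-12,118\n"
)

df = pd.read_csv(sample_csv)
print(df.shape)
print(df.columns.tolist())
```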
Next, we check the data types and the first few rows to understand the data. Any time series analysis expects two columns, a date and a value such as sales; in our case these are the Month and #Passengers columns. Of the two, the date column must have a datetime data type.
We can now see that the “Month” column is not of datetime type, which is why we need to convert it.
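The inspection and the conversion can be sketched as follows. The small dataframe is a stand-in with the same two columns as the Kaggle file, and the "%Y-%m" format is an assumption based on how the months are written there:

```python
import pandas as pd

# A few rows shaped like the Kaggle file; "Month" arrives as plain text.
df = pd.DataFrame({
    "Month": ["1949-01", "1949-02", "1949-03", "1949-04"],
    "#Passengers": [112, 118, 132, 129],
})

print(df.dtypes)  # "Month" is a text column here, not a datetime one
print(df.head())

# Parse the year-month strings; each value becomes the first day of that month.
df["Month"] = pd.to_datetime(df["Month"], format="%Y-%m")
print(df.dtypes)
```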
Now let’s try to understand the data with the help of a visualization, for which I used the “Matplotlib” library.
The series plotted above is not stationary: it has an upward trend, some seasonality, and noise as well, so it needs some transformation. Prophet can handle seasonality on its own, so we don’t have to worry about that. However, you can still apply a transformation of your choice; this demo applies a log transformation to the passenger counts before modelling.
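One common choice for a series whose seasonal swings grow with its level is a log transformation. A minimal sketch, assuming NumPy and pandas; the column name "log_passengers" is made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Month": pd.date_range("1949-01-01", periods=6, freq="MS"),
    "#Passengers": [112, 118, 132, 129, 121, 135],
})

# The natural log compresses large values more than small ones, which tends to
# turn seasonal swings that grow with the level into roughly constant ones.
df["log_passengers"] = np.log(df["#Passengers"])

# np.exp is the inverse, so values can be mapped back to passenger counts later.
recovered = np.exp(df["log_passengers"])
print(np.allclose(recovered, df["#Passengers"]))
```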
Prophet requires a DataFrame with two columns, “ds” and “y”, so you have to rename the “Month” and “#Passengers” columns to “ds” and “y”.
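The rename can be sketched like this, using a three-row stand-in for the dataset. The name `prophet_df` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Month": pd.to_datetime(["1949-01", "1949-02", "1949-03"], format="%Y-%m"),
    "#Passengers": [112, 118, 132],
})

# Prophet looks for exactly these two names: "ds" (dates) and "y" (values).
# rename returns a new dataframe, so the original column names stay available.
prophet_df = df.rename(columns={"Month": "ds", "#Passengers": "y"})
print(prophet_df.columns.tolist())
```

If you model on a transformed scale, “y” should hold the transformed values.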
Once the dataset is ready, we can create a Prophet object and fit the model.
After fitting the model, we need a dataframe that will hold the dates to forecast, which Prophet then fills with the prediction and its other output fields. To build it, we can use the model’s “make_future_dataframe” function and then start predicting.
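The fit and predict steps can be sketched as one helper. `fit_and_forecast` and its `model` argument are hypothetical names introduced here, while `fit`, `make_future_dataframe` and `predict` are Prophet's own methods. The horizon of 24 months is an arbitrary choice:

```python
import pandas as pd


def fit_and_forecast(history, periods=24, freq="MS", model=None):
    """Fit a Prophet model on `history` and predict `periods` steps ahead.

    `history` needs the columns "ds" and "y". `model` lets you pass a
    pre-configured Prophet instance; by default a plain Prophet() is created.
    """
    if model is None:
        # Imported lazily so the helper can be defined without Prophet installed.
        try:
            from prophet import Prophet  # package name in newer releases
        except ImportError:
            from fbprophet import Prophet  # older package name
        model = Prophet()

    model.fit(history)
    # Monthly data needs freq="MS" (month start); Prophet's default is daily.
    future = model.make_future_dataframe(periods=periods, freq=freq)
    forecast = model.predict(future)
    return model, forecast


# Example input in the required shape (first year of the Air Passengers series).
history = pd.DataFrame({
    "ds": pd.date_range("1949-01-01", periods=12, freq="MS"),
    "y": [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118],
})
```

With Prophet installed and the full dataset loaded, `model, forecast = fit_and_forecast(history)` returns the fitted model and a dataframe that covers both the historical dates and the future ones.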
Along with the predicted value, Prophet returns many other output fields, which we can see above. If you’re interested only in the yhat, yhat_lower and yhat_upper columns, you can check their values as follows.
The results we’re seeing above are on the log scale, because the model was fitted after a log transformation. To see the actual values, we need to convert them back to the original scale.
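Selecting the three columns and undoing the log transformation can be sketched as below. The `forecast` dataframe here is a stand-in with placeholder numbers, not real model output; only its column names match what Prophet returns:

```python
import numpy as np
import pandas as pd

# Stand-in for Prophet's output on log-scale data. The numbers are
# placeholders chosen for illustration, not actual forecasts.
forecast = pd.DataFrame({
    "ds": pd.date_range("1961-01-01", periods=3, freq="MS"),
    "yhat": np.log([450.0, 430.0, 480.0]),
    "yhat_lower": np.log([420.0, 400.0, 450.0]),
    "yhat_upper": np.log([480.0, 460.0, 510.0]),
})

cols = ["yhat", "yhat_lower", "yhat_upper"]
print(forecast[["ds"] + cols].tail())

# Undo the log transform. exp is monotonic, so the ordering
# yhat_lower <= yhat <= yhat_upper is preserved.
original_scale = forecast[["ds"] + cols].copy()
original_scale[cols] = np.exp(original_scale[cols])
print(original_scale)
```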
To visualize the forecast, you can plot the predicted values directly as follows:
To visualize the different components of the time series, you can use the plot_components function as below:
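Both plots come from methods on the fitted model. The wrapper below is a hypothetical helper that simply calls Prophet's `plot` and `plot_components` and hands back the two figures:

```python
def plot_forecast(model, forecast):
    """Return Prophet's two standard figures for a fitted model and its forecast."""
    # Observed points, the fitted and forecast line, and the uncertainty band.
    forecast_fig = model.plot(forecast)
    # One panel per component: the trend plus each fitted seasonality.
    components_fig = model.plot_components(forecast)
    return forecast_fig, components_fig
```

In a notebook the figures render inline; elsewhere they can be saved with, for example, `forecast_fig.savefig("forecast.png")`.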
With this simple demo I hope you can see how Prophet works in practice. If you want to control or tune the model, it provides a range of parameters, which you can list with Python’s help function.
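As a starting point for tuning, here are some of the constructor arguments that are often adjusted, with what I understand to be their default values. `DEFAULT_PARAMS`, `prophet_params` and `build_model` are hypothetical names for this sketch, and `help(Prophet)` shows the full list:

```python
# Commonly adjusted Prophet constructor arguments with their defaults.
DEFAULT_PARAMS = {
    "growth": "linear",               # or "logistic" (needs a "cap" column)
    "n_changepoints": 25,             # number of candidate trend changepoints
    "changepoint_prior_scale": 0.05,  # larger values allow a more flexible trend
    "seasonality_mode": "additive",   # or "multiplicative"
    "seasonality_prior_scale": 10.0,  # smaller values give smoother seasonality
    "interval_width": 0.80,           # width of the yhat_lower/yhat_upper band
}


def prophet_params(**overrides):
    """Merge overrides into the defaults, rejecting names not listed above."""
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        raise ValueError(f"unexpected parameter(s): {sorted(unknown)}")
    return {**DEFAULT_PARAMS, **overrides}


def build_model(**overrides):
    """Create a Prophet model with the merged parameters."""
    # Imported lazily so the parameter helpers work without Prophet installed.
    try:
        from prophet import Prophet  # package name in newer releases
    except ImportError:
        from fbprophet import Prophet  # older package name
    return Prophet(**prophet_params(**overrides))


print(prophet_params(seasonality_mode="multiplicative"))
```

For example, `build_model(changepoint_prior_scale=0.5)` gives a model whose trend is allowed to bend more than the default.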
Conclusion
With this article I tried to explain why Prophet is one of the best forecasting models and to illustrate that with a demo.
I hope you learned something new from this article. If you have any questions, please comment below. I’ll be happy to discuss further.