Experimenting on Facebook Prophet

Chris Shayan

Product Experience Architect | Head of AI

Published: December 12, 2018

If you have ever worked with time series predictions, I am quite sure you are well aware of the strains and pains that come with them. Time series predictions are difficult and usually require a very specialized data scientist to implement them.

Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. You can read the paper here.
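For reference, the additive model described in the Prophet paper can be written as follows, where g(t) is the trend, s(t) the seasonality, h(t) the holiday effects, and the last term is the error not explained by the model:

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```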

So I decided to try it on a small eCommerce business in Vietnam. I have daily data from March to November 2018. The data has to be fed in as a CSV file like the one below, and the headers must be ds and y (case sensitive).

ds,y
1/3/18,1700000
3/3/18,2745000
5/3/18,1665000
6/3/18,1530000
7/3/18,2070000
8/3/18,1665000
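As a minimal sketch of loading this format, the sample rows above can be parsed with pandas. It assumes the dates are day-first (1/3/18 meaning 1 March 2018), which fits a data set running from March to November; the inline string only stands in for the real CSV file.

```python
import io

import pandas as pd

# The six sample rows shown above, inlined in place of the real CSV file.
sample_csv = """ds,y
1/3/18,1700000
3/3/18,2745000
5/3/18,1665000
6/3/18,1530000
7/3/18,2070000
8/3/18,1665000
"""

df = pd.read_csv(io.StringIO(sample_csv))

# Assumption: dates are day-first (d/m/yy). Without dayfirst=True pandas
# would read 1/3/18 as 3 January 2018.
df["ds"] = pd.to_datetime(df["ds"], dayfirst=True)

# Prophet expects exactly these two column names, in lower case.
assert list(df.columns) == ["ds", "y"]

print(df.dtypes)
print(df["ds"].min(), "to", df["ds"].max())
```

Note that 2/3/18 and 4/3/18 are absent from the sample; Prophet does not need the gaps to be filled in.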

I decided to use Prophet for 3 predictions:

Predicting Average Order Value
Predicting number of sold SKUs
Predicting number of Sales Orders
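The article does not show how the three daily series were built. One possible way, sketched here with a hypothetical order-level table (the column names order_date, order_total and sku_count and all values are made up for illustration), is to aggregate per day and rename the result to the ds and y columns Prophet expects:

```python
import pandas as pd

# Hypothetical order-level data; names and values are illustrative only.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2018-03-01", "2018-03-01", "2018-03-03"]),
    "order_total": [1500000, 1900000, 2745000],  # order value in VND
    "sku_count": [2, 3, 4],                      # SKUs sold in the order
})

# One row per day with the three metrics of interest.
daily = orders.groupby("order_date").agg(
    aov=("order_total", "mean"),            # average order value
    skus_sold=("sku_count", "sum"),         # number of sold SKUs
    sales_orders=("order_total", "count"),  # number of sales orders
)


def to_prophet_frame(daily_metrics, metric):
    """Return one metric as a frame with the columns ds and y."""
    return daily_metrics[metric].rename("y").rename_axis("ds").reset_index()


aov_df = to_prophet_frame(daily, "aov")
print(aov_df)
```

Calling to_prophet_frame(daily, "skus_sold") or to_prophet_frame(daily, "sales_orders") gives the input for the other two forecasts in the same shape.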

Predicting Average Order Value

Here is the source code:

import pandas as pd

# Since version 1.0 the package is published as prophet (from prophet import Prophet)
from fbprophet import Prophet
from fbprophet.diagnostics import cross_validation
from fbprophet.diagnostics import performance_metrics

dataFile = pd.read_csv('files/Mar-Nov-18-eCom-aov.csv')
dataFile.head()
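# the dates in the file are day-first (d/m/yy), so parse them explicitly
dataFile['ds'] = pd.to_datetime(dataFile['ds'], dayfirst=True)
# removing the outliers: Prophet handles rows where y is missing
outliers = pd.to_datetime(['2018-10-20', '2018-11-26', '2018-11-27'])
dataFile.loc[dataFile['ds'].isin(outliers), 'y'] = None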

prophet = Prophet(
    growth='linear',
    seasonality_mode='additive')

prophet.fit(dataFile)

future = prophet.make_future_dataframe(freq='D', periods=30*6)
future.tail()

forecast = prophet.predict(future)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

fig1 = prophet.plot(forecast)
fig1.savefig('forecastAOV.png')

fig2 = prophet.plot_components(forecast)
fig2.savefig('forecastComponentsAOV.png')

cross_validation_results = cross_validation(prophet, initial='210 days', period='15 days', horizon='70 days')
print(cross_validation_results)

performance_metrics_results = performance_metrics(cross_validation_results)
print(performance_metrics_results)

Prophet includes functionality for time series cross validation to measure forecast error using historical data. This is done by selecting cutoff points in the history, and for each of them fitting the model using data only up to that cutoff point. We can then compare the forecasted values to the actual values. This cross validation procedure can be done automatically for a range of historical cutoffs using the cross_validation function. We specify the forecast horizon (horizon), and then optionally the size of the initial training period (initial) and the spacing between cutoff dates (period). By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon.
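To make the cutoff selection concrete, here is a simplified sketch of the idea, not Prophet's own implementation. The function name simulated_cutoffs and the date range and window sizes in the example call are illustrative assumptions: the last cutoff leaves one full horizon of history after it, and earlier cutoffs step back by period as long as at least initial days of training data remain.

```python
import pandas as pd


def simulated_cutoffs(first_date, last_date, initial, period, horizon):
    """Return cutoff dates in ascending order (illustrative helper)."""
    first_date = pd.Timestamp(first_date)
    last_date = pd.Timestamp(last_date)
    initial = pd.Timedelta(initial)
    period = pd.Timedelta(period)
    horizon = pd.Timedelta(horizon)

    cutoffs = []
    # The latest cutoff must leave a full horizon of actuals to compare against.
    cutoff = last_date - horizon
    # Every cutoff must keep at least `initial` of history for training.
    while cutoff >= first_date + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


# Illustrative values, not the ones used in the article's run.
cutoffs = simulated_cutoffs("2018-03-01", "2018-11-30",
                            initial="120 days", period="15 days",
                            horizon="30 days")
for c in cutoffs:
    print(c.date())
```

If initial plus horizon is longer than the available history, the list comes back empty, which is a quick way to check whether the chosen windows fit the data.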

The output of cross_validation is a dataframe with the true values y and the out-of-sample forecast values yhat, at each simulated forecast date and for each cutoff date. In particular, a forecast is made for every observed point between cutoff and cutoff + horizon. This dataframe can then be used to compute error measures of yhat vs. y.

The performance_metrics utility can be used to compute some useful statistics of the prediction performance (yhat, yhat_lower, and yhat_upper compared to y), as a function of the distance from the cutoff (how far into the future the prediction was). The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), and coverage of the yhat_lower and yhat_upper estimates. These are computed on a rolling window of the predictions in the cross-validation dataframe (cross_validation_results in the code above, df_cv in the Prophet documentation) after sorting by horizon (ds minus cutoff). By default 10% of the predictions will be included in each window, but this can be changed with the rolling_window argument.

    horizon         mse       rmse        mae      mape  coverage
116  7 days   15.671313   3.958701   3.113743  1.227230  0.892857
232  8 days   13.096140   3.618859   3.005956  1.225235  0.892857
59   8 days   12.537128   3.540781   2.893784  1.217342  0.892857
176  8 days   12.390640   3.520034   2.868738  1.131018  0.892857
6    8 days   12.947606   3.598278   2.934908  1.092784  0.857143
117  8 days   12.441204   3.527209   2.801724  1.076299  0.857143
60   9 days   12.404544   3.522009   2.782781  1.070309  0.857143
177  9 days   11.970364   3.459821   2.725228  0.959787  0.892857
118  9 days   11.792319   3.433995   2.687827  0.917405  0.892857
233  9 days   12.637409   3.554913   2.789919  0.887098  0.857143
61  10 days   12.245963   3.499423   2.718576  0.829174  0.857143
178 10 days   12.042642   3.470251   2.679663  0.783344  0.857143
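The columns in this table can be reproduced by hand from a cross-validation dataframe. The sketch below is a simplification: it groups by exact horizon instead of using the rolling window that performance_metrics applies, and the rows in df_cv are made-up numbers that only mimic the column layout.

```python
import numpy as np
import pandas as pd

# Made-up rows with the columns a cross-validation dataframe contains.
df_cv = pd.DataFrame({
    "ds": pd.to_datetime(["2018-10-02", "2018-10-03", "2018-10-17", "2018-10-18"]),
    "cutoff": pd.to_datetime(["2018-10-01", "2018-10-01", "2018-10-16", "2018-10-16"]),
    "y": [100.0, 200.0, 100.0, 200.0],
    "yhat": [110.0, 180.0, 90.0, 230.0],
    "yhat_lower": [90.0, 150.0, 70.0, 190.0],
    "yhat_upper": [120.0, 210.0, 95.0, 260.0],
})

# Horizon: how far past the cutoff each prediction was made.
df_cv["horizon"] = df_cv["ds"] - df_cv["cutoff"]

error = df_cv["yhat"] - df_cv["y"]
df_cv["se"] = error ** 2                 # squared error
df_cv["ae"] = error.abs()                # absolute error
df_cv["ape"] = (error / df_cv["y"]).abs()  # absolute percent error
# 1.0 when the actual value falls inside the uncertainty interval.
df_cv["covered"] = df_cv["y"].between(
    df_cv["yhat_lower"], df_cv["yhat_upper"]).astype(float)

metrics = df_cv.groupby("horizon").agg(
    mse=("se", "mean"),
    mae=("ae", "mean"),
    mape=("ape", "mean"),
    coverage=("covered", "mean"),
)
metrics["rmse"] = np.sqrt(metrics["mse"])
print(metrics)
```

Note that mape here is a fraction, so 0.1 means 10%.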

This is the forecast of AOV in VND currency:

Predicting number of sold SKUs

The source code and the remaining steps are the same as above. This is the forecast:

Predicting number of Sales Orders

The source code and the remaining steps are the same as above. This is the forecast:

Prediction vs Actual

After a few days, this is the result of comparing yhat with the actual numbers. It is not perfect, but as I am not a data scientist or an expert in time series analysis, I found it pretty good.
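A comparison like this can be done by joining the forecast with the observed values on ds. The sketch below uses made-up numbers in place of the real forecast and actuals; the column names abs_pct_error and within_interval are illustrative.

```python
import pandas as pd

# Made-up numbers standing in for Prophet's forecast output.
forecast = pd.DataFrame({
    "ds": pd.to_datetime(["2018-12-01", "2018-12-02", "2018-12-03"]),
    "yhat": [1800000.0, 2000000.0, 1500000.0],
    "yhat_lower": [1500000.0, 1700000.0, 1200000.0],
    "yhat_upper": [2100000.0, 2300000.0, 1800000.0],
})

# Made-up observed values for the same days.
actual = pd.DataFrame({
    "ds": pd.to_datetime(["2018-12-01", "2018-12-02", "2018-12-03"]),
    "y": [2000000.0, 1900000.0, 2000000.0],
})

# Keep only the days that have both a forecast and an actual value.
comparison = forecast.merge(actual, on="ds", how="inner")

# Relative size of the miss, and whether the actual fell inside the interval.
comparison["abs_pct_error"] = (
    (comparison["yhat"] - comparison["y"]).abs() / comparison["y"])
comparison["within_interval"] = comparison["y"].between(
    comparison["yhat_lower"], comparison["yhat_upper"])

print(comparison[["ds", "y", "yhat", "abs_pct_error", "within_interval"]])
print("Mean absolute percent error:", comparison["abs_pct_error"].mean())
```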

I will wait a few more days to compare the predictions with the actuals and see whether this works. I am thinking of using this for various predictions such as social network, budget, inventory demand, sales forecast, headcount planning, etc. We could also integrate Prophet into our eCommerce platform (Magento) to smartly select which products to feature on our homepage depending on the time and day, as well as for up-sell and cross-sell recommendations.
