Tips from Big Data Spain 17: Prophet
Fernando Bayon
Sean J. Taylor is a computational social scientist on Facebook's Data Science team. Last Friday he was the third speaker at Big Data Spain 17, with a keynote called "The Data Errors we Make" about learning to anticipate errors in the data, models, and predictions we make, and about estimating uncertainty across a wide range of data science applications: surveys, product tests, and forecasts.
In this keynote we learned about ‘Prophet’, a new package that Sean J. Taylor and Ben Letham from Facebook announced some months ago.
Forecasts can be improved a lot by cleaning up the input data, choosing the right model (ARIMA, exponential smoothing, etc.), and tuning its parameters. The problem is that, in reality, most of us don’t have enough knowledge or experience to do this, so we either end up with a forecast that is completely useless or don’t even bother.
And this is why we are super excited to hear about this package. Prophet is basically a library for building forecasting models for time series data, but instead of building the model the traditional way, with ARIMA and the like, it fits additive regression models, an approach also known as ‘curve fitting’. The authors have implemented the core of the procedure in Stan’s probabilistic programming language. Because of this, according to the authors, “Stan performs the MAP optimization for parameters extremely quickly (<1 second), gives us the option to estimate parameter uncertainty using the Hamiltonian Monte Carlo algorithm, and allows us to re-use the fitting procedure across multiple interface languages.”
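To make the ‘curve fitting’ idea concrete, here is a minimal sketch of an additive regression model: an intercept, a linear trend, and a weekly seasonality built from Fourier terms, fitted with ordinary least squares. This is not Prophet's code or API. The function names, the synthetic data, and the use of NumPy least squares instead of Stan are all assumptions made purely for illustration.

```python
import numpy as np

def design_matrix(t, period=7.0, order=2):
    """Columns: intercept, linear trend, then sin/cos pairs for one seasonality."""
    cols = [np.ones_like(t), t]
    for k in range(1, order + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

def fit_additive(t, y, period=7.0, order=2):
    """Fit the additive model by least squares and return its coefficients."""
    X = design_matrix(t, period, order)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_additive(coef, t, period=7.0, order=2):
    """Evaluate the fitted curve at (possibly future) time points."""
    return design_matrix(t, period, order) @ coef

# Synthetic daily series (illustrative only): trend + weekly cycle + noise.
rng = np.random.default_rng(0)
t_hist = np.arange(120, dtype=float)
y_hist = (10.0 + 0.5 * t_hist
          + 3.0 * np.sin(2.0 * np.pi * t_hist / 7.0)
          + rng.normal(0.0, 0.5, t_hist.size))

coef = fit_additive(t_hist, y_hist)

# Forecast the next 14 days by extending the time axis.
t_future = np.arange(120, 134, dtype=float)
forecast = predict_additive(coef, t_future)
```

Because the model is just a sum of simple curves, forecasting amounts to evaluating the same curves at later time points, which is part of what makes this approach easy to reason about.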
It is designed to handle typical data challenges like the following by default:
- A reasonable number of missing observations or large outliers
- Historical trend changes, for instance due to product launches or logging changes
- Trends that are non-linear growth curves, where a trend hits a natural limit or saturates
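The last two items can be pictured with two simple trend shapes. The sketch below uses general textbook forms, not code taken from Prophet: a piecewise linear trend whose slope changes at given changepoints, and a logistic curve that levels off at a capacity. All function and parameter names here are hypothetical.

```python
import math

def piecewise_linear_trend(t, offset, base_rate, changepoints, rate_deltas):
    """Continuous trend whose slope changes by rate_deltas[i] after changepoints[i]."""
    value = offset + base_rate * t
    for s, delta in zip(changepoints, rate_deltas):
        if t > s:
            # Only the time elapsed since the changepoint uses the extra slope,
            # so the curve stays continuous at s.
            value += delta * (t - s)
    return value

def logistic_trend(t, capacity, rate, midpoint):
    """Saturating growth: approaches `capacity` as t grows."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

# A slope of 1.0 that increases by 2.0 after t = 10 (e.g. a product launch).
before = piecewise_linear_trend(5.0, 0.0, 1.0, [10.0], [2.0])
after = piecewise_linear_trend(12.0, 0.0, 1.0, [10.0], [2.0])

# Growth that saturates at a capacity of 100.
half_way = logistic_trend(50.0, 100.0, 0.2, 50.0)
late = logistic_trend(200.0, 100.0, 0.2, 50.0)
```

In this sketch `before` follows the original slope, `after` reflects the steeper slope past the changepoint, and `late` sits just under the capacity.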
And the cool thing is that they use it for their own forecasts at Facebook. So we are trying this library out this weekend, and we are looking for use cases in healthcare to test the tool. Python powered, of course ;)