5 Things to Consider in Developing a Useful and Impactful Predictive Analytics Model
Five Things is a series of thoughts on the art and science of finance, analytics, and corporate development, with occasional forays into leadership, communication, and other topics for the well-rounded professional.
The focus of this article is not to explore the latest statistical technique or "big data" concepts. Rather, it examines a pretty generic, vanilla application of a tried-and-true statistical method and how to make the results useful and impactful.
A few years ago I took on a consulting project for a client who wanted to predict the number of hours required to serve a given customer in a given year. This was important because the customers were not charged on an hourly basis, but on a negotiated fee for a bundle of different services. So predicting the drivers of the time required would really help to drive pricing decisions and understand customer-level profitability.
This was a fun and challenging project, and one that highlighted the key elements of truly useful predictive analytics applied to an important business problem.
So, here are the 5 things gleaned from this project and my experience that can help you make a predictive modeling exercise useful and impactful:
1. Building a predictive model is an exercise in using data and math, but mostly it's about the data
The foundation of success on a project like this is having good, reliable data. The greatest, most mathematically and statistically sound analysis will be useless if applied to bad data. You need to:
- Understand where your data is coming from, including the who, what, why, and when. In this case, the data was gathered by individual time tracking over the past 18 months.
- Ask - is it enough? In this case, we measured the time to serve 2,000 customers for 18 months - not bad.
- Ask - is it good? Do some initial exploration. Look for outliers. In this case, there were some clients where the cost to serve was close to 0 and some where it was very high. This led to some even better discussions about how the data was gathered and led us to (carefully) remove or replace "bad" data.
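As a minimal sketch of that outlier check, the snippet below flags suspiciously low or high cost-to-serve records with a simple IQR fence. The data and column names here are illustrative stand-ins, not the client's actual schema, and the IQR rule is just one reasonable screening choice.

```python
import pandas as pd

# Hypothetical time-tracking data: hours to serve each customer in a year
# (customer IDs and hours are made up for illustration)
df = pd.DataFrame({
    "customer_id": list(range(1, 9)),
    "hours": [120.0, 0.5, 95.0, 110.0, 2400.0, 88.0, 130.0, 105.0],
})

# Flag suspicious records with a 1.5x IQR fence before deciding,
# together with the client, whether to remove or replace them
q1, q3 = df["hours"].quantile([0.25, 0.75])
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
suspects = df[(df["hours"] < low) | (df["hours"] > high)]
print(suspects)  # the near-zero and the extreme record surface here
```

The point is not the fence itself but that flagged rows become conversation starters with whoever gathered the data.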
2. The road to predictive analysis starts with descriptive and diagnostic analysis
Once your data is ready, jumping to the modeling phase is premature. Thinking about Gartner's analytics maturity model (descriptive, diagnostic, predictive, prescriptive), we do not want to jump into predictive analytics just yet. There's too much to be gained in exploring the descriptive and diagnostic analytics first, looking at things like:
- Simple statistics for each potential variable (mean, standard deviation, percentiles, histograms)
- Looking at scatterplots (in this case, does each candidate variable look related to the cost?)
- Delving a bit into diagnostics, explore the correlation matrix. (Which of the potential descriptive variables are correlated with cost? Which are correlated with each other?)
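The exploration steps above can be sketched in a few lines of pandas. The toy dataset and variable names below are invented for illustration; the real "fact pack" covered every candidate variable in the client's data.

```python
import pandas as pd

# Toy stand-in for the client dataset (names and values are illustrative)
df = pd.DataFrame({
    "hours":        [100, 150, 80, 200, 120, 90],   # the Y: cost to serve
    "num_services": [3, 5, 2, 7, 4, 3],             # candidate x
    "headcount":    [40, 60, 30, 90, 50, 35],       # candidate x
})

# Simple statistics per variable: count, mean, std, percentiles
print(df.describe())

# Correlation matrix: which candidate x's track cost ("hours"),
# and which track each other (a warning sign of multicollinearity)?
print(df.corr())
```

Even this small summary surfaces the two questions in the bullet above: relation to cost, and relation among the candidates themselves.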
I found a lot of value in reviewing with my client the "fact pack" of these summary statistics for each variable in that it 1) provided another level of scrutiny in terms of data quality, 2) gave them some very interesting and useful information they had never seen, and 3) was a great "warm up" for what was to come.
3. Balance building the "best" model with building the "most useful" model
Now that we have data that we understand well and have some initial insights, we are ready to build our model, considering:
- What kind of statistical technique? In this case, we used good old multiple linear regression. Essentially, we are looking for a formula of the form Y = b0 + b1x1 + b2x2 + ... + bkxk (where Y is "cost to serve" and the x's are the most important descriptive variables).
- Which variables are important enough to use in our model (the x's)? There are some techniques that make this more of a science (e.g. stepwise linear regression) and then there is the trial-and-error approach. The best method, in my opinion, is to use what we learned in our data exploration to understand the most likely candidates for "the x's" and then to do a bit of trial and error on likely models until you get the right fit. You can also investigate, where it seems to make sense, non-linear forms (e.g. exponential or logarithmic terms) and interactions between variables.
- How many descriptive variables to use? In this case, limiting the number of x variables in our final model was quite important - to ensure effective communication, understanding, and easier implementation. Is a model with 40 variables better than a model with 10 variables if it has slightly better fit statistics? The key is to get to "just right" in terms of the best descriptive variables.
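Fitting such a multiple linear regression can be sketched with plain numpy least squares. The two x variables and their coefficients below are fabricated so the fit is easy to verify by eye; in practice you would also inspect fit statistics and residuals, not just the coefficients.

```python
import numpy as np

# Toy data: cost-to-serve (y) generated from two illustrative drivers
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)  # e.g. number of services
x2 = np.array([5, 3, 8, 2, 7, 4], dtype=float)  # e.g. customer headcount
y = 10 + 20 * x1 + 1.5 * x2                     # known "true" relationship

# Multiple linear regression: y = b0 + b1*x1 + b2*x2, via ordinary
# least squares on a design matrix with an intercept column
X = np.column_stack([np.ones_like(x1), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)  # recovers [10, 20, 1.5] on this noise-free toy data
```

In a production setting a dedicated library (e.g. statsmodels) would also give you p-values and fit diagnostics for the variable-selection discussion above.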
4. Don't trust your model - test it!
Good in-sample and out-of-sample test results are really important in communicating that your model works well in real life. In this case, I ignored, or "held out", at random, 25% of the data for the last calendar year and all of the data for the 6 months of the current year. Since this data wasn't used in building the model in step 3, I could use this data to validate that my model holds up (and, yes, it pretty much did).
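The holdout idea can be illustrated as follows: fit on one portion of the data, then score the model on rows it never saw. The synthetic data and the simple random 25% split below are stand-ins for the client's time-based holdout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the customer data (the real holdout was 25% of
# last year plus the current 6 months; here we just split at random)
n = 200
x = rng.uniform(1, 10, size=n)
y = 10 + 20 * x + rng.normal(0, 5, size=n)

# Hold out 25% of the rows; fit only on the remaining 75%
idx = rng.permutation(n)
test, train = idx[: n // 4], idx[n // 4:]
X_train = np.column_stack([np.ones_like(x[train]), x[train]])
coefs, *_ = np.linalg.lstsq(X_train, y[train], rcond=None)

# Validate: out-of-sample R^2 on the held-out rows only
pred = coefs[0] + coefs[1] * x[test]
ss_res = np.sum((y[test] - pred) ** 2)
ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

A high out-of-sample R-squared is the concrete evidence behind "my model holds up" in the paragraph above.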
5. The real main event: communicating and implementing
My clients were not data scientists or statisticians, but they needed to be comfortable in using this model to make impactful business decisions. So probably one of my most critical tasks was to communicate and gain buy-in for the methodology and results of this model. Specifically, I needed to address questions like:
- What process did you follow? (The five-step process of data gathering, data cleaning, data exploration, model building, and model validation)
- What methodology/technique did you choose and why? (I chose an appropriate model, given the structure of the problem, that is regularly taught in intermediate/MBA-level statistics courses)
- How accurate is this model going to be at estimating the cost of services? (I was able to compare my model vs. the current estimation process and show that it was far better)
- What variables factor into the estimation? Why did you choose variable x? Why didn't you choose variable y? (I had to spend a lot of time discussing how variables were chosen, how many of them are correlated with one another, and how fewer variables can be better)
In Conclusion
I feel good that this outcome was both useful and impactful for my client. The key was not having the most sophisticated modeling technique, using the newest "big data" techniques, or leveraging enormous computing power; rather, success came from following a structured process, getting regular client engagement and feedback, and executing a strong communication plan.
Until next time. . .
Work hard, work smart, and keep in touch!
With 20+ years of experience helping firms make strategically and financially sound decisions that drive profitable growth, Joe Krekelberg has held finance, corporate development, and actuarial leadership positions in multiple industries. He is located in the greater Minneapolis-St. Paul area and can be reached at [email protected].