A True Citizen Data Scientist End-to-End ML Example: Lead Scoring
Walter Adamson
Helping business owners transform every role with AI-Thinking to boost productivity | Empowering human potential one person at a time by enhancing productivity and role deliverables | Beyond AI to AI-Thinking
The power of ever-higher levels of algorithmic abstraction and codification continues to enable non-IT professionals to dramatically improve their efficiency and effectiveness e.g. Microsoft's Power BI.
But for many of these advances, the promises remain elusive. Think of Robotic Process Automation, no-code app builders, and ... the fabled citizen data scientist.
A 600-fold Improvement - Plus The Steak Knives
For the latter, the world has just changed, and I experienced it myself. A new tool reduced 60 hours of someone else's work to 6 minutes of my time, with better results.
Not so many years ago I tagged myself as a citizen data scientist on my LinkedIn profile. It attracted more comments than anything else I've had on my profile, usually along the lines of "what is it?" or "love it".
To be open, I'm a bit of a faux-citizen data scientist as I spent 7 years of my career as a consultant in computational statistics. But I left the service of statistics more than 40 years ago after being told by many people that there was no future in it. Sure, I still know the difference between correlation and causation but not enough else to be cast out of the ranks of citizen data scientists.
According To Gartner
According to Gartner, a citizen data scientist is a person who creates or generates models that leverage predictive or prescriptive analytics, but whose primary job function is outside of the field of statistics and analytics.
"These roles are often promoted as a silver bullet that can accelerate organizations into artificial intelligence (AI) and ML easily and cost-effectively."
The compelling idea of the citizen data scientist is that people whose primary job lies outside statistics and analytics can nonetheless build genuinely useful predictive models.
And the holy grail? These citizen data scientists will be able to utilise and operationalise what they have created almost instantly. Meaning that the models are immediately able to be integrated with enterprise or operational systems and put into action to inform business decision-making.
Sound too good to be true? Up until now, it has been. To quote Gartner (June 2021):
"However, very few organizations have managed to harness the capabilities of citizen data scientists" - Gartner, How to Use Citizen Data Scientists to Maximize Your D&A Strategy
The Astounding Here-Now Capabilities for Citizen Data Scientists
In a 2020 article on Medium, Adam Barnhard explains how he developed a lead-scoring system. The system analyses attributes of each new lead to estimate the chance that the lead actually becomes a customer. With this, a sales team can prioritise its time on the leads most likely to convert.
He walks through the full end-to-end implementation of a custom-built lead-scoring model. This includes pulling the data, building the model, deploying that model, and finally pushing those results directly to where they matter most — the tools that are used by a sales team.
My conservative estimate of the time taken to complete this end-to-end project is 60 hours of effort.
And this is detailed, messy and error-prone work.
For example, Adam cautions: "While building an ML model, you will likely go through multiple iterations and test a variety of model types. It’s important to keep track of metadata about those tests as well as the model objects themselves. What if you discover an awesome model on your 2nd of 100 tries and want to go back to use that?"
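Adam's caution maps naturally onto a small experiment-tracking pattern (tools like MLflow do this at scale). Here is a minimal sketch of the idea: log metadata plus the serialized model object for every training run, so the "awesome model on your 2nd of 100 tries" can be recovered later. The run details and stand-in models below are invented for illustration.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class Run:
    """Metadata for one training attempt, plus the pickled model itself."""
    run_id: int
    model_type: str
    params: dict
    accuracy: float
    model_blob: bytes = field(repr=False)  # serialized model object


runs: list[Run] = []


def log_run(run_id, model_type, params, accuracy, model):
    """Record one training attempt: metadata plus the serialized model."""
    runs.append(Run(run_id, model_type, params, accuracy, pickle.dumps(model)))


def best_run():
    """Return the highest-accuracy run logged so far."""
    return max(runs, key=lambda r: r.accuracy)


# Simulate three tries with stand-in "models" (plain dicts here).
log_run(1, "logistic_regression", {"C": 1.0}, 0.78, {"weights": [0.1]})
log_run(2, "gradient_boosting", {"n_estimators": 200}, 0.82, {"trees": 200})
log_run(3, "logistic_regression", {"C": 0.1}, 0.75, {"weights": [0.2]})

best = best_run()
print(best.run_id, best.model_type, best.accuracy)  # → 2 gradient_boosting 0.82
restored = pickle.loads(best.model_blob)            # recover the model object itself
```

In a real project you would also record the dataset version and a timestamp per run; the point is simply that metadata and model artefacts are kept together, so no "try" is ever lost.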
(Above: Process Overview of Adam Barnhard's Lead-Scoring System ML Project)
I found Adam's article, downloaded the same leads database from Kaggle, uploaded it into Hazlo.ai, deleted two columns of nominal identifier data, selected the variable to predict, and 219 seconds later had a model ready to be deployed to the field.
I'm calling that 6 minutes of work, compared to 60 hours for Adam's lead-scoring project.
And my model has an accuracy of 95.25% compared to Adam's 82% (which he described as "not too shabby").
(Above: My Lead Scoring Model, by Hazlo.ai)
You can try my model and its predictions of sales conversion for yourself here.
Vary any of the parameters and the prediction will show the probability of conversion. (I compared it to the settings in the image in Adam's article and it gave the same result - "likely to convert".)
When you get better data you can simply upload and retrain the model, and compare it to the current model - and I mean simply. And with a single click you can take advantage of Hazlo's ever-evolving algorithms giving better results.
The careful reader will notice that I have glossed over the final integration from Booklet to Intercom, which Adam's project completed. Hazlo provides an API for each model for the same purpose: to build the model directly into business operations.
You'll still need our programming friends for that bit. But I'm still claiming a 600X improvement over Adam's implementation time (60 hours is 3,600 minutes; 3,600 ÷ 6 = 600).
Conclusion
Of course, a little knowledge is dangerous*. But in potential ignorance, I am quite astounded by this result and the opportunity provided by these kinds of web-based systems.
What do you think about their potential and my assessment of the difference with the traditional approach? Where are the traps for young players? What are the organisational pitfalls?
*For example, leaving the Lead Number and the Prospect ID in the data set for analysis gave an even better "accuracy" of 98.18% (up from the 95.25% above). But clearly, those two data items do not carry any more information relevant to the object of the analysis. They are simply more data, not more information, and spurious data at that.
Data doesn't always speak for itself; you need to use your business knowledge to eliminate what is clearly spurious.
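The data-hygiene step described above can be sketched in a few lines: drop identifier columns before training, since they uniquely label rows and only let a model memorise rather than generalise. The column names follow the Kaggle leads dataset; the toy values are invented.

```python
import pandas as pd

# A tiny stand-in for the Kaggle leads dataset (values invented).
leads = pd.DataFrame({
    "Lead Number": [660737, 660728, 660727],   # row identifier, no predictive value
    "Prospect ID": ["7927b2df", "2a272436", "8cc8c611"],  # likewise
    "TotalVisits": [0, 5, 2],
    "Converted":   [0, 0, 1],                  # the variable to predict
})

ID_COLUMNS = ["Lead Number", "Prospect ID"]

# Keep only genuinely informative features for modelling.
features = leads.drop(columns=ID_COLUMNS)
print(list(features.columns))  # → ['TotalVisits', 'Converted']
```

Any column that is unique per row (IDs, timestamps of record creation, row numbers) deserves the same scepticism: if the model can use it to "look up" the answer, the accuracy figure is flattering but meaningless.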
A big thank you to @AdamMBarnhard for his article giving me the inspiration to compare.