Data and Processes First, Machine Learning Second

Michael Dorazio

I help North America startups succeed and deliver strategic value to Woven Capital and Toyota.

Published: June 29, 2020

Machine learning is a hot topic among many businesses right now, with executives constantly hearing pitches that promise keen insights and automation of all kinds of business tasks. After being sold on the potential of machine learning in the workplace, the first instinct of many managers is to try to apply it to anything and everything, from production control to customer experience and contracts management. However, machine learning isn’t a magic bullet: it needs to be applied to the right challenges and, more importantly, it needs to be built on the right foundation to be successful (or even useful). I’ve seen many teams try to implement machine learning in the workplace, only to fail and become disillusioned with the technology, either because they attempted to solve problems that really needed underlying process improvements or because the available data wasn’t up to the task.

Data science insights and machine learning models are only as good as the data and processes on which they are based. Without a strong foundation of clear, streamlined, and “gotcha”-free processes combined with well-structured and logically organized data, these new tools are hamstrung from the start. In fact, industry estimates indicate that data scientists spend up to 60% of their time cleaning and normalizing data sets before they can actually generate insights. And that assumes the data is robust enough and the business use case is tailored enough to get started in the first place. The best way to get started with machine learning is to evaluate data and process quality before spending large amounts of time and money chasing new technologies.

Solid Data = Solid Results

When looking at data science-based approaches to automation and insights, the logical place to start is with the data itself. Many organizations assume that just because they have large volumes of data, it will be easy to plug it into machine learning models or to quickly make sense of it with the right software tools. The reality is almost always very different, with corporate data sets plagued by years of lax standards, completely siloed repositories, and careless or erroneous data entry. Almost half of business leaders believe their corporate data is too siloed to make sense of, and nearly a quarter attribute inaccuracies in forecasts and predictions to poor data quality.

In a very real sense, the quality of machine learning outputs is tied directly to the quality of data inputs. Before embarking on new initiatives to make use of this technology, companies should first be looking at implementing strong data governance policies, cleaning and linking their existing data (with or without expert help), and eliminating silos that prevent access to data across teams. While data science tools can certainly help with some of this effort as part of broader machine learning adoption, there’s no replacement for a strong IT organization with a data-centric worldview.

Better Processes = Better Performance

An increasingly common goal of machine learning in the business world is augmenting or replacing humans in complex tasks like reviewing and updating contracts. But virtually all tasks that might be worth applying machine learning to are built on top of a chain of business processes with multiple inputs, outputs, and potential paths. In most cases, these processes have been built up over years, or sometimes decades, of human experience in response to business challenges. The net result is often a task that, while seemingly simple, actually follows a convoluted path to completion that relies on tribal knowledge to navigate effectively. It should be no surprise that attempting to apply machine learning to these processes without first mapping and streamlining them usually results in poor performance at best.

The upside of process improvement is that it doesn’t just help machines. It also helps people within an organization, especially those who are new to a role and need to get up to speed quickly while minimizing mistakes. And while many people equate process improvement with massive initiatives that require full management systems and lengthy ramp-up times, the truth is that many business activities can be improved by something as simple as creating a detailed process map and looking for obvious sources of confusion. Basic activities like this serve to get everyone on a team aligned with the “right” way to perform a task and help uncover inefficiencies and roadblocks, regardless of whether or not a machine learning initiative is ever undertaken.

Steps to Readiness

If machine learning is on your short list of initiatives to pursue, following a few steps before embarking can save you time, money, and headaches down the road. I recommend clients take these actions as a first step:

Evaluate your data environment objectively and determine if it is really clean enough, organized enough, and unified enough for machine learning
Work with your IT organization to shore up any data issues before hiring outside experts
Perform detailed process mapping for the tasks you want to apply machine learning to
Improve your processes to remove inconsistencies and “gotchas” while standardizing steps and eliminating reliance on tribal knowledge
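
As a rough illustration of the first step above, a quick audit of a data export can show how clean and consistent it really is. The sketch below is only one possible starting point, not a complete assessment: the sample data, the column names, and the `readiness_report` helper are all hypothetical, and it uses only the Python standard library.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample export; column names and values are invented for illustration.
SAMPLE_CSV = """contract_id,customer,region,amount
C-001,Acme Corp,West,1200
C-002,acme corp,west,
C-003,Globex,East,850
C-003,Globex,East,850
C-004,,East,not available
"""


def readiness_report(csv_text, numeric_columns=("amount",)):
    """Summarize basic data-quality signals for a CSV export."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    missing = Counter()          # empty cells per column
    non_numeric = Counter()      # unparseable numbers per numeric column
    spellings = defaultdict(set)  # (column, case-folded value) -> spellings seen

    for row in rows:
        for column, value in row.items():
            text = (value or "").strip()
            if not text:
                missing[column] += 1
                continue
            spellings[(column, text.casefold())].add(text)
            if column in numeric_columns:
                try:
                    float(text)
                except ValueError:
                    non_numeric[column] += 1

    # Exact duplicates: rows whose values match in every column.
    seen = Counter(tuple(row.values()) for row in rows)
    duplicate_rows = sum(count - 1 for count in seen.values())

    # Values that differ only by letter case, counted per column.
    inconsistent = Counter(
        column for (column, _), forms in spellings.items() if len(forms) > 1
    )

    return {
        "rows": len(rows),
        "missing_by_column": dict(missing),
        "non_numeric_by_column": dict(non_numeric),
        "duplicate_rows": duplicate_rows,
        "inconsistent_spellings": dict(inconsistent),
    }


report = readiness_report(SAMPLE_CSV)
print(report)
```

Even on five rows, a report like this surfaces the kinds of issues described earlier: empty fields, free-text values in numeric columns, duplicated records, and the same customer entered under different spellings. Which checks matter, and what counts as "clean enough," depends on the use case.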

Once these steps are complete, teams can more accurately gauge if machine learning even makes sense, if it’s realistically possible, and if the return on investment will be worthwhile. If a machine learning initiative is the right next step, the outcomes of these pre-work items will make an implementation much faster and head off many potential issues along the way. And even if machine learning isn’t the right next step today, cleaner and more structured data combined with well-defined and reviewed processes will pay dividends regardless of their end use.

Nada Alnajafi

Award Winning In-House Counsel | Founder of Contract Nerds | Author of Contract Redlining Etiquette | Keynote Speaker & In-House Trainer

4 years ago

Spot on, Michael! Having implemented contract management systems, I always start by understanding and mapping the current contract review process. The more work you put in on the front end, the less work (and time and resources) you'll have to put in later on. That's one of the things you taught me, and what makes you an excellent Program Manager all around!
