Machine Learning (ML) has the potential to create new business models, make dramatic improvements in efficiency, radically improve the customer experience, and transform entire industries. Although the historic success rate for ML projects has been very low (Gartner estimates that 85% of AI/ML projects fail), Gartner is now estimating that by 2024, 75% of organizations will shift from piloting to operationalizing AI/ML. So the results are in - the potential benefits of ML are real and a large majority of organizations are attempting to use ML to gain a competitive advantage.
There have been many success stories for ML and, unfortunately, even more ‘failure stories’. The good news is that patterns have begun to emerge and we are seeing a common set of principles and practices that separate successful machine learning projects from failures.
In this blog post we will outline five best practices for conducting successful machine learning projects:
- Have a well-defined problem statement and business use case - in the early days of machine learning projects, companies sometimes did not put much thought into the problem statement and business use case. They were typically too excited by the new capability and were eager to get started. Also, ML was often thought of as a panacea with almost ‘magical’ qualities - so it was thought that things would ‘work themselves out’. This contributed to many ML project failures. Now, with some track record, it is a strong best practice to clearly define the problem statement and the business use case. ML is not a panacea and is not the correct solution for all problems. It is important to make sure that ML techniques map well to the problem statement. It is also important to look at the problem statement from a business standpoint - e.g., will it increase revenue? Will it increase profitability? Will it improve customer retention or experience? Will it enable new product innovation? What business outcome is desired?
- Confirm that the required data can be obtained, not just once but at regular intervals - these days, in corporate settings, the majority of machine learning is performed using supervised learning, where machine learning models are trained with labeled data sets. What this means is that the machine learning model is trained by 'giving it the answer' under many different scenarios. Thereafter, the trained ML model should be able to make predictions with a high level of accuracy when presented with a similar data set. However, this requires a lot of data, and not just any data. In the early stages of ML projects, it is important to make sure that the organization has the data required to support the problem statement and business use case. It is also important to determine the viability of obtaining the data at regular intervals to retrain the models and ensure they don't get stale.
- Ensure you have the right expertise - for small projects where people in the organization just want to get familiar with some basic concepts of ML, it is OK to learn as you go and not include credentialed experts. When it comes time to initiate a true ML project, it is important to include people with the right kind of expertise. The three most important skill sets to have on the project are data science, data engineering and operationalizing ML models. Having a professional data scientist on the team will save a lot of time and make sure the goals of the project remain achievable from an ML perspective. We said earlier that ML models are ‘data hungry’, and most of the effort in ML projects goes into obtaining, cleansing and transforming data. A strong set of data engineering skills will greatly help an ML project succeed. Lastly, we now know that the skills required to build, train and fine-tune the ML model are different from the skills required to deploy the ML model to production and fully operationalize it. It is important to have people on the team who have some experience with operationalizing ML models.
- Think about operationalizing the ML model from day one - when you see statistics like 85% of ML projects fail, what that really means is that the ML model fails to make it into production use. If success were defined as being able to create and train a model in a purely test environment with test data, then success rates would be dramatically higher. Now that there is a track record of ML projects and we know what works and what does not, we know it is important to start planning for moving the ML model into production and operationalizing it right at the very start of the project. When you do this, it is more likely that the scope of the project stays focused on reproducible positive business outcomes (and not one-off pet projects without a real business use case).
- Start small and build additional use cases incrementally - another reason for a lot of the failures of ML projects is that the projects are simply too big and ambitious and, as a result, are not defined specifically enough. Often the impact of such ML projects on the organization is too significant from a data gathering, definition and surface area standpoint. To be successful with ML it is better to start with a small, achievable use case with real business benefits. Once successful, it is possible to then broaden out to additional use cases (but still making sure that no single use case gets too big).
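To make the data practice above a little more concrete, here is a minimal Python sketch of supervised learning and a staleness check. Everything in it is an assumption made for illustration: the churn scenario, the feature names, the tiny made-up data set, the nearest-centroid approach and the `RETRAIN_THRESHOLD` value are all hypothetical, and a real project would use far more data and an established ML library.

```python
from collections import defaultdict
from statistics import mean


def train(examples):
    """Learn one average feature vector (centroid) per label from labeled examples."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, examples):
    """Fraction of labeled examples the model predicts correctly."""
    correct = sum(predict(model, features) == label for features, label in examples)
    return correct / len(examples)


# Hypothetical labeled data: (monthly_logins, support_tickets) -> outcome.
# The label is the 'answer' the model is given during training.
training_data = [
    ((20, 1), "stays"), ((25, 0), "stays"), ((18, 2), "stays"),
    ((2, 5), "churns"), ((4, 6), "churns"), ((1, 4), "churns"),
]
model = train(training_data)

# Hypothetical labeled data collected later, used to check for staleness.
fresh_data = [
    ((22, 1), "stays"), ((3, 5), "churns"),
    ((12, 4), "churns"), ((19, 0), "stays"),
]

RETRAIN_THRESHOLD = 0.9  # assumed value; set this from the business use case
fresh_accuracy = accuracy(model, fresh_data)
needs_retraining = fresh_accuracy < RETRAIN_THRESHOLD
print(f"accuracy on fresh data: {fresh_accuracy:.2f}, retrain: {needs_retraining}")
```

The point of the sketch is the workflow rather than the algorithm: the model only learns from labeled examples, and the staleness check is only possible if fresh labeled data keeps arriving, which is why data availability needs to be confirmed early.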
Although there are other important considerations for driving success with ML projects (e.g., executive sponsorship), we think these five best practices, if followed, will have the most positive impact on your ML project’s success.