Why Your Machine Learning Project Might Fail And How to Avoid It
Machine learning (ML) is fast approaching the “plateau of productivity”, according to Gartner analysts, as evidenced by the multi-million-dollar investments made by CIOs and CEOs. Their companies have been hoarding data for decades – it’s time to capitalise on it.
However, it's not that simple. We often see recurring errors in the development of client systems. Companies invest in machine learning projects in the hope of improving their position in the market, but many projects go astray.
ML-based developments are often unreasonably expensive, and the resulting systems are slow. Yet ML projects should not be treated as scientific research: they should bring clear business benefits.
Our team at MindGap has implemented dozens of cost-effective projects based on machine learning technologies. Drawing on that experience, we have formed a basic approach to planning and developing ML solutions. Of course, an approach is not the only thing required to implement an ML solution successfully. You'll need a strong team, motivation, and rare and complex competencies, but practice shows that projects succeed largely thanks to competent planning and aligned expectations.
In this article, we discuss the approach and support it with case studies. We hope it helps businesses avoid a huge waste of resources.
The tip of the iceberg. The top level of our business process consists of only four stages, but it allows you to see in advance the difficulties that threaten to "sink" the project.
Stage 1. Data collection and preparation
Refined, realistic data in the right amount is the ideal foundation on which to build a working system. But one should not mix reality with wishful thinking: data is rarely like that.
Quantity matters
For ML projects, the amount of data is fundamentally important. If there is little data (from 10 thousand to 100 thousand data points), it is extremely difficult to make a prediction. At this level, ML tools will not perform better than statistics, and it is cheaper to use the latter. In this case, we help build the data collection process: we integrate into the customer's systems and start collecting the data points that seem promising to us. Until enough data has been collected, we do not promise anything and do not estimate timelines.
Businesses are sometimes under the pleasant delusion that several years’ worth of data is stored in one relational database (DB). In practice, some of the data may be spread across three different generations of the database, with the rest stored in the ERP system. To work properly, these data sources have to be brought to a common denominator. This translates into a large integration effort, for which the business will need to provide enough resources.
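To make the "common denominator" idea concrete, here is a minimal sketch of mapping records from several sources onto one schema. The source names, field names and date format are all assumptions made for illustration; a real integration would also have to handle missing fields, duplicates and conflicting values.

```python
from datetime import date

# Hypothetical mappings: field name in each source -> field name in the common schema.
FIELD_MAPS = {
    "legacy_db": {"cust_no": "customer_id", "amt": "amount", "dt": "order_date"},
    "current_db": {"customer_id": "customer_id", "total": "amount", "created": "order_date"},
    "erp": {"CustomerRef": "customer_id", "NetValue": "amount", "PostingDate": "order_date"},
}


def to_common_schema(source, record):
    """Rename fields and normalise types for one raw record."""
    mapping = FIELD_MAPS[source]
    row = {common: record[original] for original, common in mapping.items()}
    row["customer_id"] = str(row["customer_id"]).strip()
    row["amount"] = float(row["amount"])
    # Assumes every source can deliver ISO dates (YYYY-MM-DD).
    row["order_date"] = date.fromisoformat(str(row["order_date"]))
    row["source"] = source  # keep provenance for later debugging
    return row


def consolidate(batches):
    """batches: dict of source name -> list of raw records."""
    return [
        to_common_schema(source, record)
        for source, records in batches.items()
        for record in records
    ]
```

Keeping the `source` field on every row makes it easier to trace a suspicious value back to the system it came from.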
Closed Information. Our approach to solving the problem
Businesses rightly treat their company data as capital. It is often hidden even from employees, let alone a third-party development team. In this case, our company uses an approach that we call "lunar rover". Before their decisive departure to the moon, the Soviet lunar rovers were tested on the ground, where conditions close to the real ones were simulated. We have a similar scheme.
If the client is not yet ready to provide access to databases, we ask them for a small dataset: a set of real but anonymised data about the business. We then build a model on it.
The installer, prototype and tester are packaged and sent to a remote server. There, training is run, the results are tested, and conclusions are drawn.
This is a win-win approach. The client does not show us confidential data until we shake hands, and we can see how our model will behave in a real project.
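As an illustration of how a client might prepare such a dataset, here is a minimal sketch that drops direct identifiers and replaces the customer ID with a keyed hash. The field names are hypothetical, and this is pseudonymisation only: on its own it may not meet a regulator's definition of anonymised data, so treat it as a starting point rather than a compliance recipe.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymise(record, secret_key, id_field="customer_id"):
    """Drop direct identifiers and replace the ID with a keyed hash.

    secret_key (bytes) stays with the client, so the development team
    cannot map tokens back to real customers.
    """
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256)
    cleaned[id_field] = digest.hexdigest()[:16]
    return cleaned
```

Because the same ID always maps to the same token, records belonging to one customer can still be linked together for modelling.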
Example: data from health insurance companies
This often happens, for example, with health insurance companies, which are regulated by HIPAA. Another regulation prohibits health insurance companies from denying services to clients, even if they lead an apparently unhealthy lifestyle or have chronic illnesses and are obviously unprofitable to the insurance company.
One of our clients in the United States decided to supply such companies with an ML system that could predict increases in the cost of treating an insured customer. To customers who are at risk of falling ill soon, the company assigns a coordinator: a specialist who carefully guides that person, offering timely tests and encouraging a return to a healthy lifestyle.
This is where (in the case of medical data) our approach performed well. Once the prototype's results satisfied both the client and us, we built the system on real data.
The result is that our client is able to sell the system to health insurance companies, and it significantly reduces the costs associated with treating insured customers.
The system managed to predict not only the possible cost of future treatment, but also specific diseases. For example, the system can even predict the occurrence and development of opioid addiction: we have found a direct relationship with lifestyle changes.
Erroneous data
Sometimes the client's business process does not exclude the introduction of erroneous information into the databases, and these errors are then transferred into the prototype. The prototype, unable to distinguish between erroneous and true data, will simply process both. Then it will show the conclusions, and the business will get the wrong predictions.
We treat errors in the data received from the client especially carefully. Otherwise, the entire project may be at risk.
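One simple way to keep obviously wrong records out of a prototype is to validate them before training. The sketch below is only an illustration: the field names (`age`, `claim_cost`) and the plausibility ranges are assumptions, and real checks would come from the client's domain experts.

```python
def validate(record):
    """Return a list of problems found in one record (empty list = looks plausible)."""
    problems = []
    age = record.get("age")
    if age is None or not (0 <= age <= 120):
        problems.append("age missing or out of range")
    cost = record.get("claim_cost")
    if cost is None or cost < 0:
        problems.append("claim cost missing or negative")
    return problems


def split_valid(records):
    """Separate plausible records from rejected ones, keeping the reasons."""
    valid, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            valid.append(record)
    return valid, rejected
```

Keeping the rejected records together with the reasons lets the client review them and fix the business process that produced them.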
Stage 2. Prototype development
The development of a prototype takes two to four weeks, and a team is allocated for it. It is this stage that helps us understand whether we will be able to achieve the desired level of accuracy in the project.
System accuracy level
Once we have developed the prototype, we look at the level of accuracy we have achieved with the available data and compare it to the client's expectations. If the prototype predicts events with 60-80% accuracy, then there is a high probability of reaching 90-95% on real data. If the prototype shows 30% and the client expects 95%, we look for ways to build a prototype with different data and a different approach.
Example: a neural network that "reads" medical scans with an accuracy of about 95%
One of our clients for the development of an ML-solution was an American medical services company. It provides doctors in private practices with devices for ultrasound examinations of the carotid arteries, as well as a service for decoding the captured video. The device itself fits in a suitcase, and it is convenient for a doctor to carry it to a patient's home. The more patients a doctor visits in a day, the higher their revenue. Therefore, most often the doctor transmits the video for interpretation to the device supplier. This saves time and allows them to make more scans per day.
In turn, the supplier company must quickly and accurately decode the video stream: measure the diameter of the artery and look for cholesterol plaques (if any). This work can only be done by a qualified technician. A lot of time is spent on each interpretation - they need to view the recording several times at slow speed, and manually measure the diameter of the artery. As the customer base grew, more and more expensive technicians had to be hired.
The interpretation of the video recording can be automated only by employing machine learning: the location of a plaque cannot be calculated statistically, and its shape and size are always different. This is the task we were retained for. We were handed ultrasound footage and video findings, and on this test set, we found that we could:
- detect some plaques with good accuracy;
- identify the parts of the artery where it was most convenient to measure it.
The client provided us with the required amount of test data and worked closely with us: the technicians periodically reviewed the result of the predictions. The openness of the client, the quality of the data and the good performance of the prototype made it possible to promise high accuracy.
A neural network helps determine the location of the arteries and recognize cholesterol plaques.
As a result, the search for a clearly captured area of the artery to measure its diameter was carried out with an accuracy of 95%, the detection of frames with a potential plaque - with an accuracy of 80%. The average duration of interpretation of one video by one technician dropped by a factor of 5.
Not all algorithms are created equal
Nowadays there are various algorithms and platforms with catchy names. But not every process warrants an expensive neural network, and not everyone really needs to buy data from TrendForce (a company that conducts marketing research for technology companies). We steer the client away from algorithms that are not suitable for solving their problems, even if a lot has been written about them on the Internet. After all, ML projects are 90% selection of the best algorithm and effort, and only 10% art.
Stage 3. ROI Calculation
The prototype and ROI (Return On Investment) calculation help figure out whether it is worth investing in increasing the model’s accuracy. After all, each percentage point of accuracy improvement can cost tens of times more than the previous one.
It is important to calculate the profitability of the project in advance and agree on a degree of accuracy that will deliver a sufficient impact in comparison with the money spent.
Here it is also necessary to agree on the balance of false positive and false negative predictions. Perhaps the most important thing we have learned from our experience is that 100% accuracy is generally unprofitable for most types of businesses.
Prototyping and ROI calculation are interconnected. After the prototype development, a realistic level of accuracy that is profitable for the business is selected. It is also important to keep profit expectations in mind during the development.
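The trade-off between accuracy and money can be sketched with a few lines of arithmetic. All figures below are invented purely for illustration; in a real project the costs would come from the development estimate and the benefits from the client's business case.

```python
def roi(benefit, cost):
    """Return on investment as a fraction: 1.0 means the net gain equals the cost."""
    return (benefit - cost) / cost


# Hypothetical scenarios: cumulative development cost and expected benefit
# for three target accuracy levels.
SCENARIOS = [
    {"accuracy": 0.80, "cost": 50_000, "benefit": 200_000},
    {"accuracy": 0.90, "cost": 150_000, "benefit": 400_000},
    {"accuracy": 0.95, "cost": 600_000, "benefit": 500_000},
]


def summarise(scenarios):
    """Compute net gain and ROI for each target accuracy."""
    return [
        {
            "accuracy": s["accuracy"],
            "net_gain": s["benefit"] - s["cost"],
            "roi": roi(s["benefit"], s["cost"]),
        }
        for s in scenarios
    ]
```

With these made-up numbers, the most accurate scenario loses money, which is exactly the situation an ROI calculation is meant to expose before development starts.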
Balancing false positives and false negatives
Since the algorithm will inevitably make mistakes, it is important to understand which mistake will cost the business more. A false positive is an incorrect prediction that an event will happen; a false negative is an incorrect prediction that it will not.
No matter how cynical it may seem from a general human point of view, from a business point of view it is sometimes more profitable to miss one patient with a real probability of illness than to generate many false-positive cases and get expensive doctors to analyse them.
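One way to put numbers on this balance is to assign a cost to each kind of mistake and pick the decision threshold that minimises the total cost on historical predictions. The sketch below assumes the model outputs a probability per case and that the business can estimate both costs; the function names are illustrative.

```python
def total_error_cost(scored, threshold, cost_fp, cost_fn):
    """scored: list of (predicted_probability, event_actually_happened) pairs."""
    total = 0.0
    for probability, happened in scored:
        predicted = probability >= threshold
        if predicted and not happened:
            total += cost_fp  # false positive: e.g. an unnecessary review
        elif not predicted and happened:
            total += cost_fn  # false negative: a missed real event
    return total


def cheapest_threshold(scored, thresholds, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest total cost of errors."""
    return min(thresholds, key=lambda t: total_error_cost(scored, t, cost_fp, cost_fn))
```

When false negatives are expensive, the cheapest threshold moves down and the model flags more cases; when false positives are expensive, it moves up.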
Reasons for the results
It is important to understand the reasons behind the predictions, rather than relying only on the system's decision. If the model predicts that an event will not occur, we try to understand why it really did not happen. If, for example, an insurance company's patient did not come for an examination, it is better to find out the reason in advance. Otherwise, the costs for this patient may increase in the future. After all, their absence from the examination does not mean that they are healthy: perhaps they simply do not want to take care of themselves.
Example: the reasons for an increase in conversion
Our client sells first-class air tickets online. Wealthy buyers, typically company executives, make the requests on the website. They don't have time to fill out the request form in detail. At the same time, the agent, before calling the requestor, needs to think over what exactly they can offer and how to lead this call to a sale.
The task of the ML system was to optimize the work of the call centre: to help the agent think over the composition of the proposal to the requestor, and to rank the applications by probability of conversion. The highest-quality applications were moved to the top of the queue: these are the customers who, according to the system, are more likely to pick up the phone immediately and accept the offer. Because of the small amount of data, a minimum goal was set for the prototype stage: to increase sales by at least 6%.
A complex set of data serves as the basis for predictions: the requestor's email and IP address, geolocation, user behaviour on the site, pages viewed, traffic source, request content, and more.
In reality, already at the prototype stage, we increased conversion by 17%. The reason turned out to be that previously all requests went into one queue for processing. The call centre called requestors in order and spent a lot of time on uninterested customers. Meanwhile, customers who really wanted to purchase a ticket waited and could change their minds. The results showed that the client’s customers are very sensitive to the timeliness of call-backs. The algorithm identifies the customers who are ready to make a purchase and will respond immediately to the agent's call. As a result, customers' waiting times for a call decreased. Prioritization of requests delivered a 17% increase in sales in the real project.
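The prioritisation step itself is simple once a model supplies a conversion probability for each request. The following sketch assumes such a probability already exists; the `Request` class and its fields are hypothetical and stand in for whatever the real system stores.

```python
from dataclasses import dataclass


@dataclass
class Request:
    request_id: str
    conversion_probability: float  # assumed to come from a trained model


def build_call_queue(requests):
    """Order requests so that the most likely buyers are called first."""
    return sorted(requests, key=lambda r: r.conversion_probability, reverse=True)
```

Replacing a first-come, first-served queue with this ordering is what shortens the wait for the customers most likely to buy.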
Number of subsystems in the prototype
When people talk about machine learning, they often mean some kind of giant model that produces predictions. We have a different approach: we strive to make models for literally every type of event. In our case, almost every project has five or even ten models that predict different types of situations depending on the processes of the client company. Although not everything is so simple here: such development is much more complicated and requires a lot of skill and deep understanding. But it is precisely this approach that ensures the accuracy of predictions.
For example, on the backend, our disease prediction application has its own model for almost every diagnosis. And in real-time, it can predict not only the cost of treatment, but also the diseases themselves, including exacerbations: a thousand models predict a thousand different diagnoses.
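Structurally, "one model per event type" can be as simple as a dictionary of independent models that are all run on the same input. The sketch below uses a toy rule-based class in place of trained models; the diagnosis names, feature names and cutoffs are placeholders with no clinical meaning.

```python
class ThresholdModel:
    """Toy stand-in for a trained per-diagnosis model."""

    def __init__(self, feature, cutoff):
        self.feature = feature
        self.cutoff = cutoff

    def predict_proba(self, patient):
        # A real model would return a calibrated probability.
        return 1.0 if patient.get(self.feature, 0) >= self.cutoff else 0.0


# Hypothetical registry: one independent model per diagnosis.
MODELS = {
    "diagnosis_a": ThresholdModel("marker_a", 5.0),
    "diagnosis_b": ThresholdModel("marker_b", 2.0),
}


def predict_all(patient, models):
    """Run every per-diagnosis model on the same patient record."""
    return {name: model.predict_proba(patient) for name, model in models.items()}
```

Because each model is independent, one diagnosis can be retrained or replaced without touching the others.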
Stage 4. Solution development
When developing predictive analytics systems, there are two camps: programmers, who have been developing systems on the market for many years, and machine learning specialists, data scientists. A struggle begins between them for precedence.
Developers have more knowledge about development technologies, but not enough knowledge about the intricacies of the data scientist role and the needs of systems. When developing, we are looking to balance the expertise of all team members, keeping in mind the key advantages of the future system.
Operating speed and resource intensity
ML algorithms have different operating speeds. For example, neural networks are slower than linear regression. You can minimize the consumption of computing resources by choosing a less demanding but workable algorithm in advance. It is important to keep track of the resources that the model will require after the implementation.
For example, suppose you are creating a model for real-time predictive analytics of a large number of objects. Could the cost of maintaining it be a million dollars a month? Easily. Is this cost-effective? Hardly.
There is another threat as well. The business provides a refined, error-free historical database from which a prototype is built, but after release, the system is unable to make its predictions in real time. If the business needs real-time data analytics, we make sure in advance that the data also arrives in the system in real time.
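A back-of-the-envelope estimate of serving costs can be made before any model is chosen. The sketch below assumes cost scales linearly with compute time and uses a 30-day month; the parameter names are illustrative, and real cloud pricing is usually more complicated.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def monthly_serving_cost(predictions_per_second,
                         compute_seconds_per_prediction,
                         cost_per_compute_second):
    """Rough monthly cost of running a model continuously."""
    compute_seconds = (
        predictions_per_second * SECONDS_PER_MONTH * compute_seconds_per_prediction
    )
    return compute_seconds * cost_per_compute_second
```

Running this for a slower and a faster candidate algorithm shows how much a difference in inference time is worth in money each month.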
The final (solution development) and the previous (ROI calculation) stages are cyclical: when a solution is developed, you can go back a step and calculate whether it is possible to improve the model and how much it will cost to add a few more percentage points of accuracy.
In summary
To sum up, we have identified four important factors for the success of an ML project:
- Data. For the project to succeed, you need to collect as much data as possible: more than 100 thousand data points. If at first the business is not ready to share it, anonymising a dataset for a trial project is a good way out.
- Prototype. It is better to spend 2-4 weeks developing a pilot project to make sure that the idea is viable.
- ROI. The ROI model should be calculated and recorded at the very beginning of the project. Everyone needs to be convinced that, once the required characteristics are achieved, the client will benefit from the development.
- Solution development. Finally, it is important to anticipate what resources the final model will require and to strike a balance between developers and data scientists.
In conclusion, we can add that there are always many ideas about what can be predicted. For implementation, you should choose the most profitable ones, since investment in machine learning should be made not out of curiosity but for the sake of profit.
----
Ben Gutkovich is the Managing Director of MindGap, where he supports businesses of all sizes with Strategic AI solutions, helping identify revenue and cost-saving opportunities from data, and deliver Artificial Intelligence and Machine Learning solutions to drive real business value.
Could Machine Learning help your business achieve its goals? Get in touch for a free consultation.