Machine Learning as a Product
No spiel on what Machine Learning (ML) is; it doesn't need an intro. We all know what ML is, what it stands for, and the tremendous potential it has. With the cost of big data compute so low, harnessing ML to solve business problems in smarter ways is the new normal, pretty much a baseline. At the same time, teams should steer clear of shiny-object syndrome and avoid pushing their resources to do more with ML frameworks and techniques than the problem needs, because that just makes the whole thing unnecessarily complex and less interpretable.
Machine Learning (ML) is production-grade software engineering applied to data science to make it START:
Scalable, Transferable, Automated, Repeatable, and Testable.
Key Goals of ML-based systems
- Reduce the time spent on operational data science tasks: time to deploy, time to train, keeping data fresh, and running live A/B tests
- Reduce onboarding time for new data scientists and engineers
- Enable self-service for downstream teams as much as possible
- Improve transparency and visibility of ML models to non-technical teams
- Support non-data-science/engineering teams with useful data, tools, and models
- Improve redundancy and spread the data science knowledge
Adoption challenges - Though ML is very powerful, there are challenges that need to be addressed:
- Expertise - in almost all cases, specialized skills such as Data Science and ML Engineering are required
- Infrastructure - teams have to be equipped with big data processing infrastructure on AWS, Azure, or other cloud platforms
- Accessibility - generic models (available on marketplaces) and specific models (built in-house) vary a lot in performance
- Value - if the models cannot be understood and their output cannot be interpreted, people tend to trust them less
- Speed of Delivery and Quality of Product - without a high upfront investment in automation, guard rails, and processes, balancing speed of delivery against quality of the product is a constant battle
Data drives the enterprise
No Data, No Science. Simple. No ML is complete without data flowing from one side to the other. The picture on the right shows the various workstreams, tasks, jobs, and functions that need data processing, with Machine Learning (ML) and Data Science (DS) at the center. Two or more of the red blocks from the diagram can be connected to make an end-to-end operation, e.g. train a model and deploy it for an A/B test, or run a data quality (clean) job and feed its output into an ETL that then stores the data for consumption.
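To make the idea of connecting blocks concrete, here is a minimal sketch in Python of the second example: a clean job feeding an ETL step whose output is stored for consumption. The record fields, the cleaning rule, and the function names are all made up for illustration; a real pipeline would run on your own data platform and tooling.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def clean(records: List[Record]) -> List[Record]:
    """Data quality block: drop records with a missing or negative amount."""
    return [
        r for r in records
        if r.get("amount") is not None and r["amount"] >= 0
    ]


def transform(records: List[Record]) -> List[Record]:
    """ETL block: convert amounts from cents to dollars."""
    return [{**r, "amount": r["amount"] / 100} for r in records]


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Connect blocks end to end: each step consumes the previous step's output."""
    for step in steps:
        records = step(records)
    return records


# Hypothetical raw input, including two bad records.
raw = [
    {"id": 1, "amount": 1250},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -5},
    {"id": 4, "amount": 300},
]

# "Store" here is just a list; in practice it would be a table or a file.
stored = run_pipeline(raw, [clean, transform])
print(stored)
```

Because every block takes and returns the same shape of data, blocks can be reordered or swapped without touching the others, which is the point of treating them as connectable units.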
Model Interpretability is very important. If you cannot explain how your data science model operates and why its output is what it is, then you need to go back to the whiteboard and redo it.
There are vast resources detailing Model Interpretability on the internet, and I suggest checking those out. I have added a few relevant links at the end of this blog.
Now comes my favorite topic,
ML as a Product
If you are a product manager/owner/person working on leveraging ML to solve your business problems, here is what I learned from my work experience. These are solely my own interpretations. I will be more than happy to correct anything that is incorrect or to understand more about ML product management - do send me a LinkedIn message.
As the standard product management process goes, know your users. Most of the time they include Applied Researchers (AR), Data Scientists (DS), Product Analysts, Finance & Forecasting, Marketing, and even ML Engineers! The golden rule applies here too - start with why. Find that why. Why ML vs just plain old engineering? What problems are being solved? Like any other consumer or enterprise product - conduct surveys, probe for more info, and question the intent to figure out whether ML is being thrown on the table because everyone else is doing it.
Data, Development, and Deployment
Pay attention to three main categories: Data, Development, and Deployment. I emphasized the importance of data already. Understand the ways your team deploys the model output. Deployment could be bundling the code into a package such as a Docker container, a zip, or a lib and pushing it into the wild to take live traffic, or it could be running an offline job that creates an output file, e.g. a bidding file for meta marketing, and sending it over to your marketing partner. Development is a vast green field. Make sure you have robust CI/CD in place to maintain standardized development processes. Work with AR/DS and MLE folks to create a productive, reliable, and consistent development process. Even though in principle they should be part of the development process, coding best practices are missing from many teams working with ML. Make code quality a part of your product spec!
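One way to put quality into the product spec is a release gate that the CI/CD pipeline runs before a model is deployed. The sketch below is only an illustration in Python: the metric name, the thresholds, and the `passes_release_gate` function are hypothetical and not taken from any specific tool.

```python
from typing import Dict


def passes_release_gate(
    candidate: Dict[str, float],
    baseline: Dict[str, float],
    min_accuracy: float = 0.80,    # hypothetical absolute floor from the product spec
    max_regression: float = 0.01,  # hypothetical allowed drop vs. the live model
) -> bool:
    """Return True only if the candidate model is good enough to deploy."""
    # Guard rail 1: never ship a model below the agreed absolute floor.
    if candidate["accuracy"] < min_accuracy:
        return False
    # Guard rail 2: never ship a model clearly worse than what is live today.
    if baseline["accuracy"] - candidate["accuracy"] > max_regression:
        return False
    return True


live_model = {"accuracy": 0.85}
print(passes_release_gate({"accuracy": 0.86}, live_model))  # better than live
print(passes_release_gate({"accuracy": 0.82}, live_model))  # regresses too far
print(passes_release_gate({"accuracy": 0.75}, live_model))  # below the floor
```

A check like this would typically run as one step of the pipeline, next to unit tests and linting, so that the same standard applies to every model regardless of who trained it.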
Just like other themes of product prioritization - use RoI, RICE, or the Kano Model to choose your sequence of delivery. Avoid getting blindsided by any biases. Keep an eye on User (desirability), Business (viability), and Technology (feasibility) goals. From a business point of view, make cost optimization your daily mantra. As a PM of an ML-based product, maniacally managing the cost is paramount, especially given what we learned during the COVID-19 period.
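As a quick illustration of RICE, the score is commonly computed as Reach x Impact x Confidence / Effort, and items are delivered in descending order of score. The backlog items and every number below are invented for the example; the units (users per quarter, person-months) are assumptions as well.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    reach: float       # assumed unit: users affected per quarter
    impact: float      # relative impact per user, e.g. 0.25 (minimal) to 3 (massive)
    confidence: float  # 0.0 to 1.0, how sure we are about the estimates
    effort: float      # assumed unit: person-months

    @property
    def rice(self) -> float:
        """RICE score: (reach * impact * confidence) / effort."""
        return self.reach * self.impact * self.confidence / self.effort


# Hypothetical ML platform backlog.
backlog = [
    BacklogItem("Automated retraining", reach=40, impact=2, confidence=0.8, effort=4),
    BacklogItem("Model dashboard", reach=100, impact=1, confidence=0.5, effort=2),
    BacklogItem("Feature store", reach=60, impact=3, confidence=0.5, effort=9),
]

# Highest score first gives the suggested sequence of delivery.
ranked = sorted(backlog, key=lambda item: item.rice, reverse=True)
for item in ranked:
    print(f"{item.name}: {item.rice:.1f}")
```

The score is only as good as the estimates behind it, so treat the ranking as a conversation starter with your users and engineers rather than a final answer.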
Build your North Star vision and attach OKRs to it.
Without objectives and key results to track the progress, it is almost like flying an airplane with a blindfold on.
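Tracking can be as simple as measuring how far each key result has moved from its starting value toward its target. Here is a small sketch in Python; the key results and all the numbers are hypothetical, loosely based on the goals listed earlier (time to deploy, onboarding time), and the simple average is just one possible way to roll key results up into an objective.

```python
from typing import List, Tuple


def key_result_progress(start: float, current: float, target: float) -> float:
    """Fraction of the way from start to target, clamped to the range 0..1.

    Works for targets that go down (e.g. days to deploy) as well as up.
    """
    if target == start:
        return 1.0
    progress = (current - start) / (target - start)
    return max(0.0, min(1.0, progress))


def objective_progress(key_results: List[Tuple[float, float, float]]) -> float:
    """Average progress across key results given as (start, current, target)."""
    scores = [key_result_progress(s, c, t) for s, c, t in key_results]
    return sum(scores) / len(scores)


# Hypothetical key results for an objective like "ship models faster".
key_results = [
    (10, 6, 2),    # days to deploy a model: started at 10, now 6, target 2
    (30, 25, 10),  # days to onboard a new data scientist: 30 -> 25, target 10
]

overall = objective_progress(key_results)
print(f"Objective progress: {overall:.1%}")
```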
For a reference, this is what I have for my team's CICD North Star vision...
As a PM of ML products, it can get tricky building your backlog. Beware of piling up a laundry list (unless you are OK being a product janitor) or a long rocket-science wish list (unless you actually work for NASA ;-)) - neither is useful. Besides the Ops-related work items (no one can escape those stinkers), build your backlog keeping user-business-technology in mind, constantly speak with your user base, seek feedback, and over-communicate about what you are building (the over part comes with a caveat - too much communication is better than too little, but keep a check that the volume does not dilute the value of the message).
I have started attending a lot of ML, AI, Data Science, Big Data, etc. technical meet-ups, summits, and conferences to gather data for my backlog.
Closing Note - If you ask 10 people you will get 10 different points of view on what ML is and how it should be done. If I may, my experience is that we do not need ML to solve every problem. Sometimes a simple and automated data pipeline can get you to your finish line. ML, AI, and Deep Learning each have their own place in every org. That said, ML is a darn pretty, powerful, and useful thing.
Happy Machine Learning!
P.S. Some useful links from my bookmarks