You should learn to deploy your Machine Learning models! How you deploy is dictated by the business requirements, so you should not start any ML development before you know how you are going to deploy the resulting model. There are four main ways to deploy ML models:
- Batch deployment - Predictions are computed at a fixed frequency (for example, daily), stored in a database, and cheaply retrieved when needed. The downsides are that predictions cannot use the most recent data and can quickly become outdated. Look at this article on how Airbnb progressively moved from batch to real-time deployments: “Machine Learning-Powered Search Ranking of Airbnb Experiences”. A minimal batch-scoring sketch follows this list.
- Real-time deployment - The "real-time" label describes a synchronous process: a user requests a prediction, the request reaches a backend service through an HTTP API call, and the backend forwards it to an ML service. It is great if you need personalized predictions that use recent contextual information, such as the time of day or the user's latest searches. The problem is that until the user receives the prediction, the backend and ML services are blocked waiting for it to come back. To handle additional parallel requests from other users, you need multi-threaded processes and horizontal scaling by adding servers. Here are simple tutorials on real-time deployments in Flask and Django: “How to Easily Deploy Machine Learning Models Using Flask”, “Machine Learning with Django”. A minimal Flask sketch follows this list.
- Streaming deployment - This allows for a more asynchronous process, where an event triggers the inference. For example, as soon as you land on your Facebook page, the ads-ranking process can be triggered, and by the time you scroll, the ad is ready to be presented. The request is queued in a message broker such as Kafka, and the ML service handles it when it is ready. This frees up the backend service and saves a lot of computation thanks to efficient queueing. The resulting predictions can be queued as well and consumed by backend services when needed. Here is a tutorial using Kafka: “A Streaming ML Model Deployment”. A minimal Kafka sketch follows this list.
- Edge deployment - The model is deployed directly on the client, such as a web browser, a mobile phone, or an IoT device. This gives the fastest inference and also works offline (disconnected from the internet), but the model usually needs to be pretty small to fit on modest hardware. For example, here is a tutorial on deploying YOLO on iOS: “How To Build a YOLOv5 Object Detection App on iOS”. A minimal model-shrinking sketch follows this list.
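Batch sketch: a minimal nightly scoring job. It assumes a pickled scikit-learn model at model.joblib and a users feature table with columns f1, f2, f3 in a local SQLite database; all of these names are hypothetical.

```python
# Minimal batch-scoring sketch (assumed artifacts: model.joblib, features.db).
import sqlite3

import joblib
import pandas as pd


def run_batch_job(db_path: str = "features.db") -> None:
    model = joblib.load("model.joblib")  # periodically retrained model (assumed)
    conn = sqlite3.connect(db_path)
    features = pd.read_sql("SELECT user_id, f1, f2, f3 FROM users", conn)

    # Score every row at once; these predictions go stale until the next run.
    features["prediction"] = model.predict(features[["f1", "f2", "f3"]])

    # Persist so the serving layer can do a cheap key lookup later.
    features[["user_id", "prediction"]].to_sql(
        "predictions", conn, if_exists="replace", index=False
    )
    conn.close()


if __name__ == "__main__":
    run_batch_job()  # typically scheduled daily with cron or Airflow
```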
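Real-time sketch: the Flask route below illustrates the synchronous pattern described above; it is not the linked tutorial's code, and the model path and feature names f1/f2/f3 are assumptions.

```python
# Minimal synchronous serving sketch with Flask (pip install flask joblib).
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # loaded once at startup (assumed artifact)


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    # The request thread blocks here until inference finishes, which is why
    # real-time serving needs multiple workers/servers to handle load.
    features = [[payload["f1"], payload["f2"], payload["f3"]]]
    return jsonify({"prediction": float(model.predict(features)[0])})


if __name__ == "__main__":
    app.run(port=5000)
```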
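Streaming sketch: a consumer/producer loop using the kafka-python client. The topic names, the JSON event schema, and the request_id field are assumptions for illustration.

```python
# Minimal streaming-inference sketch (pip install kafka-python joblib).
import json

import joblib
from kafka import KafkaConsumer, KafkaProducer

model = joblib.load("model.joblib")  # assumed artifact

consumer = KafkaConsumer(
    "inference-requests",  # hypothetical topic fed by the backend
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda m: json.dumps(m).encode("utf-8"),
)

# The broker buffers events, so the model consumes at its own pace and the
# backend is never blocked waiting for a prediction.
for event in consumer:
    features = [[event.value["f1"], event.value["f2"], event.value["f3"]]]
    producer.send(
        "inference-results",  # hypothetical topic the backend consumes later
        {
            "request_id": event.value["request_id"],
            "prediction": float(model.predict(features)[0]),
        },
    )
```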
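Edge sketch: the iOS tutorial above relies on Apple's tooling, but as a generic illustration of shrinking a model for small hardware, here is how a Keras model could be quantized with TensorFlow Lite before being bundled into an app; the model file name is an assumption.

```python
# Minimal edge-deployment sketch: shrink a Keras model with TensorFlow Lite.
import tensorflow as tf

model = tf.keras.models.load_model("model.h5")  # assumed artifact
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Default optimization quantizes weights, trading a little accuracy
# for a much smaller binary that fits on phones and IoT devices.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)  # ship this file inside the mobile app bundle
```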
Comment (Founder and CEO at Streambased): I feel like all of these depend more on the way in which the data is made available than on the business requirements. With unified batch/streaming approaches like Confluent's Tableflow and Streambased, you may be able to pick and choose the best aspects of each. I can see a future where the model is trained periodically on a larger set of streaming data that is 'downcast' to batch and then iteratively trained in between.