Book Review: Building Machine Learning Powered Applications

Ryan Dorrell

Published: April 15, 2020

Like many of you, I have recently found myself with a bit of free time while waiting for the COVID-19 storm to pass. Time that was previously used for social activities like high school baseball games and tennis practices is now freed up, and there are only so many home projects I can tackle at once. I've been using this newly found time to read!

I’ve been looking for a book about machine learning that wasn’t just about a specific topic, but instead took me through the entire process, at some level of detail, of creating a software solution that uses machine learning. I’ve read and watched enough about these topics to be dangerous, but wanted to get a more tactical look at a complete working process. I think I may have found a book that does this: “Building Machine Learning Powered Applications: Going From Idea To Product” by Emmanuel Ameisen. The title describes perfectly what I had been looking for, and below is my take on this new book.

The book is divided into multiple parts (collections of 2-3 chapters) that cover the end-to-end process of building an ML-powered application, from planning to data pipelining to model iteration to deployment and monitoring.

Part I: Find the Correct ML Approach

Even if you don’t read the rest of the book, this first section is a great intro to ML-powered applications and to the information you need to make good decisions about getting started. It covers various types of approaches and the pros and cons of each, while providing some pragmatic examples and practices. It also discusses data needs and model performance in a very approachable way. There aren’t any code examples yet, and I found these first few chapters easy to digest and understand. It also lays out the example application that the rest of the book takes you through. The example app is an editor that provides writing recommendations for questions to be posted to online forums (such as Stack Overflow). You feed it your question (in the form of one or more sentences) and it provides suggestions like “use fewer adverbs”. At first I struggled with this as an example because it wasn’t highly business-focused, but it works for the purposes of the rest of the book, and you can apply the same thinking to other scenarios. At the end of the day, the model takes some information from the user and provides suggestions on how to improve – that is certainly useful in many business scenarios to support nearly everything that business users do on a daily basis. And while this part doesn’t get technical, if you just read this first section and put the rest of the book down, you’d get value out of it.

Part II: Build a Working Pipeline

Part II dives right into writing code (Python) to start building the “pipeline” for our model. Now, I haven’t written any real code for several years, but the examples are straightforward enough to follow. If you’ve never written code before, the book could be pretty challenging to follow from this part on, as it gets technical really quickly. If you are a non-ML software developer reading this, I’d encourage you to follow along with the examples on your own machine as you go through it – I didn’t do that, but I would bet I’d have gotten more out of it if I had.

What I really like about the approach used here is that it starts with a non-ML model – just some hard-coded heuristics – to help test the pipeline process, and then builds on that. In a way, it reminded me of a Test-Driven Development type approach – write just enough code that works, run a test, tweak the code, run another test, and so on.
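To make the idea concrete, here is a minimal sketch of what a hard-coded heuristic “model” for the writing-recommendation editor could look like. This is my own illustration, not code from the book: the function name `heuristic_suggestions`, the thresholds, and the three rules are all assumptions chosen to keep the example short.

```python
import re

# Hypothetical thresholds, picked only for illustration (not from the book).
MAX_ADVERB_RATIO = 0.05
MAX_WORDS_PER_SENTENCE = 25


def heuristic_suggestions(question: str) -> list[str]:
    """Return writing suggestions using hard-coded rules (no ML involved)."""
    words = re.findall(r"[A-Za-z']+", question)
    if not words:
        return ["Write a question first."]

    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", question) if s.strip()]
    suggestions = []

    # Crude adverb proxy: longer words ending in "ly". A real system
    # would likely use a part-of-speech tagger instead.
    adverbs = [w for w in words if len(w) > 3 and w.lower().endswith("ly")]
    if len(adverbs) / len(words) > MAX_ADVERB_RATIO:
        suggestions.append("Use fewer adverbs.")

    # Flag long average sentence length.
    if len(words) / len(sentences) > MAX_WORDS_PER_SENTENCE:
        suggestions.append("Shorten your sentences.")

    # A forum question should usually contain a question mark.
    if "?" not in question:
        suggestions.append("Phrase it as a question.")

    return suggestions


if __name__ == "__main__":
    print(heuristic_suggestions("I really, truly, badly need help with this code."))
    print(heuristic_suggestions("How do I sort a list in Python?"))
```

Because the function takes text in and returns suggestions, the rest of the pipeline (input handling, display, evaluation) can be built and tested around it, and a trained model with the same interface could presumably be swapped in later.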

This part also covers data in more detail, and has some great advice and examples. A large amount of effort goes into data preparation and processing to build an ML model, and this part covers, at least at a high level, what those steps are. It also explores data types beyond what is needed in the example application.

Part III: Iterate on Models

This part gets even more technical, and I’ll admit to having to re-read paragraphs a few times to make sure I actually understood them before moving on. There is some assumption of knowledge of various math topics, but the author does a good job of giving straightforward explanations when necessary. There are also many references to tools and examples that you can dive into as you read or follow along in code.

The three chapters in this part take you through the core steps of iterating on a model, and get you to a reasonably working model. The author encourages the reader to iterate through the steps again, measuring performance and identifying areas for improvement.

Part IV: Deploy and Monitor

By now in the narrative of the book, we have built a working model which gives writing recommendations, and we are ready to deploy it! There’s some extremely good information in here on topics you may not be familiar with but that are highly valuable. For example, it covers how someone might defeat a model to perform fraudulent activity (think fraudulent credit card transactions). It also covers various deployment models, and the pros and cons of different approaches. The last few chapters cover error management, performance, and monitoring models to improve predictive performance.

Summary

Although a bit technical for my current skillset at times, I thought this was an excellent overview of the end-to-end process for building a machine learning powered application. At around 238 pages, it doesn’t go super-deep on any one topic, and that’s a good thing. It’s consumable while giving you an idea of all the pieces you need to know to go from end to end. I’d recommend this book to any software developer or manager who needs to understand how ML applications are built, and needs to start crossing that bridge from traditional software development to ML-focused software development.

More about the book can be found here: https://mlpowered.com/book/

p.s. Aside from work-related books, I’m also reading the Mitch Rapp series by the late Vince Flynn, but that’s purely for some escapism entertainment. I’m enjoying them – a nice distraction.

Douglas McMurtry

Senior Developer

4 years ago

This is also a great ML primer. https://themlbook.com/

Emmanuel Ameisen

ML research at Anthropic. Author of Building Machine Learning Powered Applications.

4 years ago

Thanks for the thorough review Ryan, I'm glad you enjoyed the book enough to recommend it!

1 reaction

James Parks

Chief Data Scientist at IntelAgree

4 years ago

I’m about halfway through, really good stuff, thx for the recommendation


