A.Iron Chef - Cook up AI in your Data Kitchen
Cupid Chan - A.Iron Chef

One day, two cooks had an argument. One said, "My secret recipe is so good that it can turn any ingredient into a great dish." The other cook disagreed and said, "My fresh ingredients are more important, since those are what a person actually eats…" The story will continue later, but first let me take a guess about you:

You have eaten in the past 24 hours!

Of course you have, and I don’t need any AI to make such a prediction. We treat food as an integral part of our lives. However, if you pay close attention, you will find that the sci-fi-y idea of “AI”, which once appeared only in movies, is now everywhere around us. More interestingly, the core concepts of AI map closely to food and cooking. How?

Imagine that at a friend’s party you meet someone charming, kind, cute and, best of all, as much in love with food as you are. The conversation becomes unstoppable when the two of you start talking about Iron Chef, your favorite Food Network TV show, in which world-class chefs battle for the title of legendary Iron Chefs of America. To keep the conversation going, you invite this attractive acquaintance to a cookout dinner at your home next Sunday, and the response is a resounding “YES”! But now you have another problem: “How do I impress this new friend on the first date?”

I know! Beef Wellington! But your admirer is a vegetarian!!! – Know your Business Requirement

Who doesn’t want a tender, flavor-exploding Beef Wellington, fresh out of the oven, baked at the perfect temperature and served with a rich and savory sauce? The answer: a vegetarian!!! If you think you have the perfect dish to make a great impression on your first date, think twice, because you may have skipped the most important step in making a dish successful: knowing your audience’s taste. Similarly, in an AI project there is no point in spending time building fancy algorithms that cannot be used in practice. I love technology as a kid loves toys. Unfortunately, many technology projects end in failure because people treat technology as a “toy”, so the newest and shiniest one always catches their eye. Without real business value, the coolest technology is only a toy for a geek. Therefore, the first step is to understand what the goal is and why the business wants it. This is true not only for AI but for any technology project in general.

Start shopping for the ingredients – Data Gathering

After stalking your new friend’s Instagram posts, you know that seafood gumbo is their favorite. Now it’s time to shop for the ingredients. You saw that Whole Foods has a weekly special on frozen lobster, and that the local seafood store has the daily catch at a higher price. Your cousin also recommends a new website that imports seafood directly from the Mediterranean. There are many choices. Which one should you pick? For AI, data is your ingredient, and it is critical to producing a result. However, it may not be easy to find the right data set. Sometimes you find multiple data sources but cannot determine which one is more reliable. At other times, you may spend weeks or months and finally identify a data set, only to find on a second look that nothing in it seems appropriate. Only one thing is for sure: to support a sound analysis, you need the right data, ingested through a reliable collection process.

Prepare the ingredients – I mean your data

Lucky you: you finally found all the ingredients for your dish. Now it’s time to do the prep work. Some basic steps, like measuring, cleaning and cutting ingredients, apply to both cooking and AI. You need to measure, or profile, the data to ensure there are no surprises. If you see 100°F in Alaska last winter, something is wrong and you need to “clean” it up. Besides removing bad data points, cleaning also involves handling missing data. At the same time, this process should detect and eliminate duplicate records in the data set. When you shopped for ingredients, you may have purchased more than you needed, just in case, but during preparation you need to decide which ones should really be used. You may need to discard data that doesn’t fit, even if you spent a long time acquiring it. Also, be aware that the portions of a dish for a couple are very different from those of a family meal for 10 people. Therefore, you need to “normalize” the data so that all features are on a comparable, balanced scale.
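
To make the prep work concrete, here is a minimal sketch in plain Python of the steps above: removing duplicates, dropping an implausible reading, filling in a missing value and normalizing. The station names, the temperature values and the plausible range of -80°F to 60°F are all made up for illustration, and filling gaps with the mean is just one of several possible choices.

```python
from statistics import mean

# Hypothetical winter temperature readings (°F) for Alaska; None marks a missing value.
readings = [
    {"station": "A", "temp_f": -5.0},
    {"station": "B", "temp_f": 12.0},
    {"station": "B", "temp_f": 12.0},   # duplicate record
    {"station": "C", "temp_f": 100.0},  # implausible for an Alaskan winter
    {"station": "D", "temp_f": None},   # missing value
    {"station": "E", "temp_f": 20.0},
]


def prepare(rows, low=-80.0, high=60.0):
    """Deduplicate, drop out-of-range values, fill missing ones, then min-max scale."""
    # 1. Remove exact duplicates while keeping the original order.
    seen, unique = set(), []
    for row in rows:
        key = (row["station"], row["temp_f"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Drop readings outside the plausible range (the bounds are an assumption).
    plausible = [
        r for r in unique
        if r["temp_f"] is None or low <= r["temp_f"] <= high
    ]

    # 3. Fill missing values with the mean of the observed ones.
    observed = [r["temp_f"] for r in plausible if r["temp_f"] is not None]
    fill = mean(observed)
    for r in plausible:
        if r["temp_f"] is None:
            r["temp_f"] = fill

    # 4. Min-max normalize to the range [0, 1] (assumes at least two distinct values).
    lo, hi = min(observed), max(observed)
    for r in plausible:
        r["scaled"] = (r["temp_f"] - lo) / (hi - lo)
    return plausible


clean = prepare(readings)
for row in clean:
    print(row)
```

In a real project a data library would usually do this work, but the order of the steps (deduplicate, remove bad points, fill gaps, scale) is the part worth noticing.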

Which recipe should I use? – Choose the right model

You think you have the secret recipe from a top gourmet chef, one that makes everyone drool at the smell alone. Isn’t the equivalent secret recipe in AI “Deep Learning”, which surpasses all other algorithms? If so, why even bother trying other models when we have deep neural networks? There are at least two reasons. First, Deep Learning usually requires a lot more data to train a reliable model, and that much data may not always be at hand. Second, the cost of training a deep neural network is higher than that of a simple model, yet the result may be only slightly better, if not the same. So it’s not always the complicated recipe that wins. Sometimes grandma’s simple old recipe is all you need. Over-engineering wastes your time and may yield only a marginally better result. Less is more!

Speaking of recipes, some chefs invent new recipes as they become more experienced and share them so that we can simply borrow them for our own meals. By the same token, a lot of hardcore data scientists dedicate their time to creating new algorithms or perfecting existing ones to make them faster and more effective. Best of all, they open source them so that we can enjoy the results by applying them in our projects. Take advantage of that instead of reinventing the wheel yourself.

Let’s fire up the stove! – Training 

Ingredients? Check! Recipe? Check! Believe it or not, if you are at this point, you have already completed 60 to 70% of the work in the whole AI process, before even heading to the stove to cook up some data with the algorithm! Just as most recipes have a reminder at the bottom like “Adjust Based on Personal Taste”, training in AI can vary, and we can fine-tune it for a better result. As you feed in more data, the model learns the pattern better and provides a more reliable and trustworthy result. Sometimes you just need to try different hyperparameters to make the process faster and more accurate. Occasionally, in the very worst case, you find that the result is spoiled because it is heavily biased toward certain groups of people, and you may need to trash the whole pot and restart the process by collecting a broader set of data from the beginning. Training is an iterative process.
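
As a taste of what “trying different hyperparameters” can look like, here is a small sketch that trains the same tiny model with several learning rates and keeps the one with the lowest error. The toy data (points on the line y = 2x + 1), the candidate learning rates and the number of epochs are all assumptions picked for illustration, and plain gradient descent stands in for whatever training procedure a real project would use.

```python
# Toy data following y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse(w, b):
    """Mean squared error of the line y = w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(learning_rate, epochs=200):
    """Fit w and b with plain gradient descent, starting from zero."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Try a few learning rates: too small learns slowly, too large overshoots.
results = {lr: mse(*train(lr)) for lr in (0.001, 0.01, 0.05, 0.2)}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"learning_rate={lr}: loss={loss:.3g}")
print("best:", best_lr)
```

The largest learning rate in this sketch makes the error blow up instead of shrinking, which is why tuning is a matter of tasting and adjusting rather than simply turning the heat all the way up.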

Smells good, but how about the taste? – Evaluation

You think the dish is ready because the whole house smells like heaven. But wait until you taste it to make sure it’s all good. This is where Evaluation comes in for the AI process. Even though Training gives you an encouraging result, that doesn’t mean the model is ready to be consumed. Sometimes a model that looks too good is just an indicator of overfitting. You need to test it by having someone else taste it and see whether they also think it is as awesome as you believe it is. There are different approaches to assessing the result, including a simple Confusion Matrix for classification or the Mean Absolute Error (MAE) for numeric predictions. The goal here is to make sure the model works not only on the training data but also generalizes and performs just as well on cases it has not seen before.
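
The two measures mentioned above are simple enough to compute by hand. The sketch below does so in plain Python; the labels and predictions are hypothetical stand-ins for a model's output on held-out data it never saw during training.

```python
def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted numbers."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out labels and the model's predictions for them.
actual_labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_matrix(actual_labels, predicted_labels)
accuracy = (cm["tp"] + cm["tn"]) / len(actual_labels)

# Hypothetical numeric targets and predictions for a regression model.
mae = mean_absolute_error([3.0, 5.0, 2.5, 7.0], [2.5, 5.0, 4.0, 8.0])

print(cm, accuracy, mae)
```

If the same measures look much better on the training data than on held-out data like this, that gap is the usual sign of overfitting.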

Ready to serve – Besides prediction, there is more to consider

Finally, the dish is ready to be served. However, serving is not just about making predictions. Remember the very first step in this article? Ultimately, the business requirement is what we need to fulfill. Prediction can be, and is, a very important component that people always look for. But that doesn’t mean it’s the only consideration. There is still a fine line to draw. For example, in a spam email filter, mis-categorizing an important email as spam does much more harm than letting a junk email into your inbox. On the other hand, if a model is meant to detect Covid-19, a somewhat higher false positive rate is preferable to missing a real case that falls through the cracks and spreads the virus in the community. Where to draw the fine line is a balance between accuracy and practicality.
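
One way to picture this fine line is as a decision threshold on the model's score, chosen according to how costly each kind of mistake is. The sketch below is only an illustration: the scores, the labels, the candidate thresholds and the 10-to-1 cost ratios are all made-up assumptions, not figures from any real spam filter or medical test.

```python
# Hypothetical model scores (higher = more likely positive) with the true labels.
scored = [
    (0.95, 1), (0.80, 1), (0.65, 0), (0.55, 1),
    (0.40, 0), (0.30, 1), (0.20, 0), (0.05, 0),
]


def total_cost(threshold, fp_cost, fn_cost):
    """Sum the cost of every false positive and false negative at this threshold."""
    cost = 0.0
    for score, label in scored:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            cost += fp_cost   # flagged something that was actually fine
        elif predicted == 0 and label == 1:
            cost += fn_cost   # missed a real case
    return cost


def best_threshold(fp_cost, fn_cost, candidates=(0.1, 0.25, 0.5, 0.7, 0.9)):
    """Pick the candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(t, fp_cost, fn_cost))


# Spam filter: wrongly flagging an important email is assumed to cost 10x more.
spam_threshold = best_threshold(fp_cost=10, fn_cost=1)

# Disease screening: missing a real case is assumed to cost 10x more.
screening_threshold = best_threshold(fp_cost=1, fn_cost=10)

print(spam_threshold, screening_threshold)
```

The same scores lead to a strict threshold for the spam filter and a lenient one for screening, which is the point of the paragraph above: the business requirement, not the model alone, decides where the line goes.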

Check please

It would be naïve to think that Iron Chefs like Bobby Flay, Cat Cora or Michael Symon cook only a few dishes in an Iron Chef episode every week. They are the owners of world-class restaurants, turning ingredients and recipes into gourmet dishes for hundreds of customers. The volume in a restaurant is much larger than in a meal for a family of 4. Similarly, running AI at full scale in the Data Kitchen of an enterprise is another level of complexity. But don’t worry: everything still builds on the fundamental principles you have already learned in this article.

So What?!

Remember the story about the two cooks at the beginning? Soon after the argument started, they both realized that neither of them was right, because both good ingredients and a matching recipe are critical for a delicious meal. Also, some recipes only work well with certain ingredients. So what? Making a perfect dish is a science as well as an art. Likewise in AI, a magnificent result always comes from a unique data set paired inseparably with a harmonizing algorithm. With an open mind and an empty stomach, let the AI battle begin! "Allez cuisine!"

Donald Farmer

Data without analysis is a wasted asset. Analytics without action is wasted effort. I write compulsively and advise startups, established software vendors, investors and enterprises on data, analytics and AI strategy.

4 years ago

Great article!

Dalton Ruer

Data Cathedral Architect, Chief Question Officer

4 years ago

Delicious points you make with a spot on analogy. Reminded me of a video that Donald Farmer made 6 years ago to help people understand what an Information Supply Chain was. https://www.youtube.com/watch?v=utOyoywOkjo

Sean Seerey

Data Science and AI: Helping Federal Agencies gain insights from data through Advanced Analytics

4 years ago

Well done Cupid!

Michael Lazar

Sr. Sales Engineer - Starburst

4 years ago

Great example and fun read. You nailed it.
