Will the real ML problem please stand up
We are gradually entering a world filled with, and driven by, data that looks like us, acts like us and talks like us. Those of us who are driving this shift know we are not kidding.
There was a time when most of us (some still do) had only a handful of data in a structured format, plenty of time, and a limited set of actions to take on the basis of it. A good percentage of the world still operates that way. Here, though, I would like to steer our thoughts towards what is slowly becoming the industry standard irrespective of the domain, functional knowledge or business use case: feed the data to a black-box machine learning model and try to sell its outcome, with a confidence score attached, to the people sitting on the opposite side of the table.
When a machine takes a decision on the basis of the data it is fed, the person designing the model should either have ample knowledge of the business or have enough data to establish correlations and identify patterns. The tricky part is that the person sitting on the opposite side is often not aware that a machine learning outcome is not handed down by God himself; it rests entirely on the data the model has been fed and the manipulations done on top of that data.
It is good to know the different types of machine learning systems here:
- Whether the algorithm is trained under human supervision (supervised, unsupervised, semi-supervised or reinforcement learning)
- Whether it can learn incrementally on the fly (online vs. batch learning)
- Whether it compares new data points with known ones, or instead detects patterns in the training data and builds a predictive model (instance-based vs. model-based learning)
The classification above, adapted from a snippet found on the internet, covers the familiar categories.
Supervised learning involves training on labelled data, i.e. examples paired with the desired solutions. A classification algorithm, such as a spam filter deciding spam or not spam, does exactly this.
However, if you have to predict a numeric value, regression comes to the rescue. Some of the key supervised learning algorithms one needs to be aware of are:
a) Linear regression: the fitted line's slope is often enough to tell whether the outcome trends positive or negative.
b) Logistic regression: can also be used to solve a classification problem, and returns a confidence score alongside the predicted class.
c) k-nearest neighbor
d) Support vector machines (SVMs)
e) Decision Trees and Random Forests
f) Neural networks
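To make the supervised idea concrete, here is a minimal k-nearest neighbors classifier (one of the algorithms listed above) written in plain Python. The toy points and the "spam"/"ham" labels are invented for illustration.

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    # Sort training points by squared Euclidean distance to x.
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    # Majority vote among the k closest labels.
    votes = Counter(train_y[i] for i in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy labelled data: two well-separated groups.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
y = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(X, y, (0.15, 0.1)))  # near the "ham" group -> "ham"
print(knn_predict(X, y, (5.05, 5.1)))  # near the "spam" group -> "spam"
```

Notice there is no training step at all: the labelled examples themselves are the model, which is also why k-NN reappears below as the canonical instance-based learner.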
Unsupervised learning is when the training data is unlabeled. The common families are:
- Clustering: handy if you want to detect groups of similar visitors browsing or buying on your site (k-means, Hierarchical Cluster Analysis (HCA), Expectation Maximization)
- Visualization and dimensionality reduction (Principal Component Analysis (PCA), Kernel PCA, Locally-Linear Embedding, t-SNE (t-distributed Stochastic Neighbor Embedding))
- Association rule learning (Apriori, Eclat)
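As a sketch of clustering, here is a bare-bones k-means in NumPy. No labels are supplied; the algorithm discovers the two groups on its own. The data points are made up.

```python
import numpy as np

def kmeans(X, k=2, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct training points as initial centres.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centre.
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # Move each centre to the mean of the points assigned to it.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],     # one group near origin
              [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]])  # one group far away
labels, centers = kmeans(X, k=2)
print(labels)  # the first three points share one label, the last three another
```

Real implementations add empty-cluster handling and convergence checks, but the assign-then-update loop is the whole idea.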
Semi-supervised learning is what Google Photos uses for labeling your images: you apply labels to a few faces, and it propagates them across the rest of your photos.
Reinforcement learning uses a learning system called an agent. The agent learns by itself, acting according to a policy that is rewarded for each right decision and penalized for each wrong one. It keeps iterating until it can minimize its losses (or maximize its reward) in as few steps as possible.
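The reward-and-penalty loop can be sketched with the simplest reinforcement-style setup, a two-armed bandit. The payout probabilities (0.2 and 0.8) are invented and hidden from the agent, which must learn from rewards alone via an epsilon-greedy policy.

```python
import random

random.seed(42)
payout = [0.2, 0.8]    # true win probability of each arm (unknown to the agent)
value = [0.0, 0.0]     # the agent's running reward estimate per arm
counts = [0, 0]

for step in range(2000):
    # Policy: mostly exploit the best-looking arm, occasionally explore.
    if random.random() < 0.1:
        arm = random.randrange(2)
    else:
        arm = 0 if value[0] > value[1] else 1
    reward = 1 if random.random() < payout[arm] else 0  # environment's reply
    counts[arm] += 1
    value[arm] += (reward - value[arm]) / counts[arm]   # incremental mean update

print(value)   # the estimates drift toward the true payout probabilities
print(counts)  # the agent ends up pulling the better arm far more often
```

Full reinforcement learning adds states and multi-step credit assignment (e.g. Q-learning), but the act / get reward / update-policy cycle is already visible here.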
Batch and Online Learning
In batch learning the system is incapable of learning incrementally: it must be trained on all the available data at once. So in a production system, with each new increment to the batch of data, you would need to:
- Train a fresh model in isolation on the full data set, old plus new
- Take the production system offline
- Retire the older model and replace it with the newly trained one
In online learning, by contrast, the system is fed data instances sequentially and learns incrementally from each one. This is a good fit for cases such as stock price prediction or ad and product recommendation on e-commerce websites, where data arrives as a continuous stream.
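A minimal sketch of online learning: a linear model updated one sample at a time with stochastic gradient descent, so it keeps improving as new data streams in instead of being retrained from scratch. The stream y = 3x + noise is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b, lr = 0.0, 0.0, 0.01  # model weights and learning rate

for _ in range(5000):                # each iteration = one new data instance
    x = rng.uniform(-1, 1)
    y = 3.0 * x + rng.normal(0, 0.1)  # the "true" relationship plus noise
    err = (w * x + b) - y             # prediction error on this one sample
    w -= lr * err * x                 # gradient step: no full retraining
    b -= lr * err

print(round(w, 1))  # the slope estimate settles near 3.0
```

The learning rate controls the trade-off the batch/online distinction hinges on: set it high and the model adapts fast but forgets quickly; set it low and it is stable but slow to track change.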
Instance-based vs Model-based Learning
In instance-based learning the system learns the training examples by heart and generalizes to new cases using a similarity measure.
In model-based learning you instead build a model from those examples and use it to predict outcomes, like fitting a linear regression to capture the correlation between life expectancy and average income.
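The model-based side can be sketched in a few lines: fit a line to (average income, life expectancy) pairs and predict for an income level not in the data. The numbers below are invented for illustration, not real statistics.

```python
import numpy as np

income = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # avg income, thousands USD
life = np.array([65.0, 70.0, 74.0, 77.0, 79.0])    # life expectancy, years

slope, intercept = np.polyfit(income, life, 1)     # least-squares line fit

def predict(x):
    # Once fitted, the model alone answers queries; the data can be discarded.
    return slope * x + intercept

print(round(predict(35.0), 2))  # estimate for an unseen income level
```

That last comment is the real contrast with instance-based learning: k-NN must keep every training example around at prediction time, while the fitted model compresses them into two numbers.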
All of this is just a way to stay disciplined in an ever-changing world with so much noise around ML projects, whose main challenges remain data quality, business and functional knowledge, irrelevant or missing data, insufficient quantity of data, and non-representative training data.
I am pretty sure some of us have already been trapped in the dogma of force-fitting an ML model even when a traditional approach would work just fine.
Amen