Unsupervised does not mean automatic

Joni Salminen

I train PhD students to become better.

Published: August 20, 2019

A common misconception among business people (and maybe even some CS folks) is that clustering is somehow "automatic learning". In reality, clustering typically requires one to set hyperparameters, such as the number of clusters or the maximum/minimum distance between clusters. These are usually tuned by hand, so the process is not "automatic learning".

Clustering, however, is considered "unsupervised learning" because we don't use labeled data; instead, we try to create the labels by finding similarities between the data points. So the key thing is this: "unsupervised learning" is not "automatic learning". It's just learning from data that is not labeled [1]. And this learning typically requires manual decisions by a human to set the "correct" hyperparameters.
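To make the point concrete, here is a minimal sketch of k-means on toy one-dimensional data. The helper `kmeans_1d`, the data values, and the evenly spaced starting centers are all assumptions made for illustration; this is not a reference implementation of any library. The thing to notice is that the algorithm never decides `k` on its own: a human has to pass it in.

```python
def kmeans_1d(points, k, iterations=20):
    """Tiny 1-D k-means (illustrative helper). The caller must choose k."""
    ordered = sorted(points)
    # Deterministic start (an assumption of this sketch):
    # take k evenly spaced values from the sorted data as initial centers.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]

    def assign(current_centers):
        # Each point gets the index of its nearest center.
        return [
            min(range(k), key=lambda c: abs(p - current_centers[c]))
            for p in points
        ]

    for _ in range(iterations):
        labels = assign(centers)
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)

    return assign(centers), centers


# Same unlabeled data, two different human choices of k.
data = [1.0, 1.2, 0.8, 4.0, 4.3, 3.9, 9.0, 9.2, 8.8]
labels_k2, centers_k2 = kmeans_1d(data, k=2)
labels_k3, centers_k3 = kmeans_1d(data, k=3)

print("k=2:", labels_k2)
print("k=3:", labels_k3)
```

With `k=2` the six smallest values end up in one group, while `k=3` splits them into two groups. The data never changed; only the hyperparameter did, and nothing in the algorithm says which answer is the "correct" one.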

Sorry to bust your bubble; there is no "intelligence" that would magically solve all your problems!

Reference

[1] Adi Bronshtein: "Clustering is considered unsupervised learning, because there’s no labeled target variable in clustering. Clustering algorithms try to, well, cluster data points into similar groups (or… clusters) based on different characteristics of the data. In supervised learning, we have a labeled target variable we’re trying to predict, estimate (regression) or classify (classification)."
