
Cluster Analysis

Serigne DIAW

Data Engineer | Data Scientist | Data Architect LinkedIn Group Owner

Published: July 24, 2020

What is Cluster Analysis?

Cluster analysis finds groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.

Based on information found in the data that describes the objects and their relationships.
Also known as unsupervised classification.

Many applications

Understanding: group related documents for browsing, or group genes and proteins that have similar functionality.
Summarization: reduce the size of large data sets.

Web Documents are divided into groups based on a similarity metric.

A commonly used similarity metric is the dot product between two document vectors.
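
As a minimal sketch of the dot-product metric mentioned above, the snippet below turns each document into a bag-of-words term-frequency vector and multiplies matching term counts. The helper names (term_vector, dot_product) and the three sample documents are assumptions made for illustration; they do not come from the original slides.

```python
from collections import Counter

def term_vector(text):
    """Bag-of-words term-frequency vector as a Counter (word -> count)."""
    return Counter(text.lower().split())

def dot_product(u, v):
    """Dot product of two sparse vectors; only shared terms contribute."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(count * v[term] for term, count in u.items() if term in v)

# Hypothetical sample documents, made up for this example.
docs = {
    "a": "stock market prices fall",
    "b": "market prices rise as stock demand grows",
    "c": "team wins football match",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

sim_ab = dot_product(vectors["a"], vectors["b"])  # shares stock, market, prices
sim_ac = dot_product(vectors["a"], vectors["c"])  # shares no terms
print(sim_ab, sim_ac)
```

Documents "a" and "b" score higher than "a" and "c", so a similarity-based grouping would tend to place them together. A raw dot product grows with document length; dividing by the two vector norms gives cosine similarity, which is a frequent choice when documents differ a lot in size.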

What is not Cluster Analysis?

Supervised classification.

Class label information is available.

Simple segmentation.

Dividing students into different registration groups alphabetically, by last name.

Results of a query.

Groupings are a result of an external specification.

Graph partitioning.

There is some mutual relevance and synergy with cluster analysis, but the two areas are not identical.

Types of Clusterings

A clustering is a set of clusters.

One important distinction is between hierarchical and partitional sets of clusters.

Partitional Clustering

A division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset.
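
To make the definition above concrete, here is a small sketch using k-means (Lloyd's algorithm), one widely used partitional method. The original text does not name a specific algorithm, so k-means, the function name kmeans, the sample points, and the choice of initial centroids are all assumptions made for illustration.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-means: returns (labels, centroids).

    labels[i] is the index of the single cluster that points[i] belongs to,
    so the result is a partition: non-overlapping, every point assigned once.
    """
    centroids = [tuple(c) for c in centroids]

    def assign(cents):
        # Each point goes to its nearest centroid (Euclidean distance).
        return [min(range(len(cents)), key=lambda j: math.dist(p, cents[j]))
                for p in points]

    for _ in range(max_iter):
        labels = assign(centroids)
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                new_centroids.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return assign(centroids), centroids

# Hypothetical 2-D data: two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

Every point receives exactly one label, which is the property that makes the clustering partitional. In practice the result depends on the initial centroids, so real uses usually try several initializations.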

Hierarchical Clustering

A set of nested clusters organized as a hierarchical tree.
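
The nested structure described above can be sketched with bottom-up (agglomerative) clustering, which repeatedly merges the two closest clusters. Single-link distance, the function names, and the one-dimensional sample points are assumptions chosen for illustration; the original text does not prescribe a particular method.

```python
import math
from itertools import combinations

def single_link(a, b, points):
    """Single-link distance: smallest distance between any two members."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)

def agglomerate(points):
    """Merge the two closest clusters until one remains.

    Returns the hierarchy as nested tuples of point indices.
    """
    # Each cluster is (tree, member_indices); start with one cluster per point.
    clusters = [(i, [i]) for i in range(len(points))]
    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: single_link(clusters[pair[0]][1],
                                         clusters[pair[1]][1], points),
        )
        merged = ((clusters[x][0], clusters[y][0]),
                  clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical 1-D points: two tight pairs and one far-away point.
points = [(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]
tree = agglomerate(points)
print(tree)  # (4, ((0, 1), (2, 3)))
```

The nested tuples play the role of the hierarchical tree: points 0 and 1 merge first, then 2 and 3, then the two pairs, and finally the distant point 4. Cutting the tree at any level yields a partitional clustering, which is how the two types relate.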

DIAW Serigne
Data Engineer, Data Scientist at Business and Decision
Paper: https://www.ieee.org.ar/downloads/Srivastava-tut-pres.pdf
