
Clustering with K-Means

Angad Gupta, MIEEE, BITS-Pilani

Renewable Energy | Clean Tech | DR | VPP| DERMS|EV

Published: June 13, 2020

What is Clustering in Data Mining?

Clustering is the grouping of objects based on their characteristics and similarities. In data mining, it partitions the data into groups suited to the desired analysis using a dedicated grouping algorithm. In a hard partitioning, each object either belongs to a cluster or does not. In a soft partitioning, each object belongs to each cluster to a certain degree. More specific schemes are also possible: objects may belong to multiple clusters, each object may be forced to join exactly one cluster, or hierarchical trees may be built over the group relations. Clustering can be implemented in different ways based on various models. Each model has its own algorithms, which differ in their properties and in the results they produce. A good clustering algorithm identifies clusters regardless of their shape. A clustering algorithm has three basic stages.
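As a rough illustration of the difference between hard and soft partitions, the sketch below clusters the Iris measurements twice with scikit-learn. K-Means returns one label per object (hard), while a Gaussian mixture model returns a degree of membership per cluster (soft). The Gaussian mixture is not mentioned in the article and is used here only as an example of a soft method; the choice of three groups and the `n_init` and `random_state` values are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data  # 150 samples, 4 measurements each

# Hard partition: every sample gets exactly one cluster label.
hard_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Soft partition: every sample gets a degree of membership in each cluster.
# GaussianMixture is an illustrative stand-in, not the article's method.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
soft_memberships = gmm.predict_proba(X)

print(hard_labels[:5])                    # one integer per sample
print(np.round(soft_memberships[:5], 3))  # one row of 3 degrees per sample
```

Each row of `soft_memberships` sums to 1, so a sample that sits between two groups shows up as a split membership rather than a forced single label.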

Methods of Clustering in Data Mining

The different methods of clustering in data mining are as explained below:

Clustering is an unsupervised learning method

Clustering is a powerful machine learning tool for detecting structure in datasets. Unlike supervised methods, clustering works on unlabeled data: datasets that have no outcome (target) variable and in which nothing is known about the relationship between the observations.

Goal of Clustering

Clustering Algorithm:

Illustration of clustering using scikit-learn's built-in, well-known Iris dataset

Building and running the Model

Here we define 3 clusters.
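The article's original code is not preserved here, so the following is only a minimal sketch of how a 3-cluster K-Means model could be built and run on the Iris data with scikit-learn. The `n_init` and `random_state` values, and the decision to use the raw (unscaled) features, are assumptions rather than the author's settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # 150 samples x 4 features
y = iris.target  # true species; not used for fitting, only for later comparison

# n_clusters=3 follows the text; n_init and random_state are assumed values.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
model.fit(X)

labels = model.labels_             # cluster index assigned to each sample
centers = model.cluster_centers_   # one centroid per cluster

print(labels[:10])
print(centers.shape)
```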

Plotting the output of Model in Scatter plots

Here we plot the output as two scatter plots: one colored by the Iris target variable and one colored by the clustering labels. The two sets of labels do not match, because K-Means numbers its clusters arbitrarily.
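A side-by-side comparison like the one described could be drawn roughly as follows with matplotlib. The article does not say which features were plotted, so petal length against petal width is an assumption, as are the colormap and the K-Means settings other than the 3 clusters.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Assumed settings apart from n_clusters=3.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)

# Left: points colored by the true species.
axes[0].scatter(X[:, 2], X[:, 3], c=y, cmap="viridis", s=30)
axes[0].set_title("Iris target variable")

# Right: the same points colored by the K-Means cluster index.
axes[1].scatter(X[:, 2], X[:, 3], c=labels, cmap="viridis", s=30)
axes[1].set_title("K-Means cluster labels")

for ax in axes:
    ax.set_xlabel(iris.feature_names[2])
    ax.set_ylabel(iris.feature_names[3])

fig.tight_layout()
# Call plt.show() in an interactive session, or fig.savefig(...) to write a file.
```

The groups may look alike in shape while their colors differ, since the cluster indices carry no meaning beyond "same group".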

Relabeling and regenerating the plots

Now both scatter plots look similar.
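The article's own relabeling code is not shown, so here is one possible way to do it: map each cluster index to the species that occurs most often inside that cluster. This majority-vote mapping is an assumption, not necessarily the author's approach, and the K-Means settings other than the 3 clusters are assumed as well.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# For each cluster, find the most frequent true label among its members.
mapping = np.zeros(3, dtype=int)
for cluster in range(3):
    counts = np.bincount(y[labels == cluster], minlength=3)
    mapping[cluster] = counts.argmax()

# Translate every cluster index into its matched species label.
relabeled = mapping[labels]

print(mapping)
print(relabeled[:10], y[:10])
```

After this step, `relabeled` uses the same numbering as the target variable, so the two scatter plots can be colored consistently.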

Evaluation of the clustering model

Precision is a measure of the model's relevancy: of the samples assigned to a class, the share that truly belong to it. Recall is a measure of the model's completeness: of the samples that truly belong to a class, the share the model found. High precision together with high recall indicates highly accurate model results.

Here we can see that cluster 0 has 100% precision and recall, so it is very well clustered. Clusters 1 and 2 also perform well, with scores above 70%.

Overall, the model clustered 83% of the samples correctly.
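Per-class precision and recall, plus overall accuracy, can be computed along these lines with scikit-learn's metrics module. This is a sketch under assumed settings (raw features, `n_init=10`, `random_state=0`, majority-vote relabeling), so its numbers need not match the figures quoted in the article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report

iris = load_iris()
X, y = iris.data, iris.target

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Align cluster indices with species labels by majority vote (assumed approach).
mapping = np.array(
    [np.bincount(y[labels == c], minlength=3).argmax() for c in range(3)]
)
relabeled = mapping[labels]

# Precision and recall per class, then the overall share of matching labels.
report = classification_report(y, relabeled)
accuracy = accuracy_score(y, relabeled)

print(report)
print(f"accuracy: {accuracy:.2f}")
```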

Strengths and weaknesses of K-Means

