Chapter 9. Unsupervised Learning Techniques
After the supervised machine learning algorithms, this chapter turns to unsupervised learning.
The main topics are:
1: Clustering: k-means and DBSCAN
There are two methods for choosing the number of clusters:
1: Elbow method
2: Silhouette score
The first method relies on inertia, a performance metric: the sum of the squared distances between the instances and their closest centroids.
How to read the inertia value:
Smaller inertia: the data points are closer to their respective cluster centroids. For a fixed number of clusters this generally signifies better clustering, because the points within each cluster are more tightly grouped.
Larger inertia: the data points are farther from their respective cluster centroids. This typically signifies poorer clustering, because the points within each cluster are more spread out. Note that inertia keeps decreasing as the number of clusters grows, which is why the elbow method looks for the point where the decrease slows down rather than for the minimum.
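To make the definition concrete, here is a minimal sketch of inertia in plain Python. The function name `inertia` and the toy points are made up for illustration; in scikit-learn a fitted `KMeans` model exposes the same quantity as its `inertia_` attribute.

```python
def inertia(points, centroids):
    """Sum of squared distances from each point to its closest centroid."""
    total = 0.0
    for p in points:
        # Squared Euclidean distance to every centroid, keep the smallest.
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centroids
        )
    return total


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

# One centroid per pair: points sit close to their centroid.
tight = inertia(points, [(0.0, 0.5), (10.0, 10.5)])

# A single centroid in the middle: points are far from it.
loose = inertia(points, [(5.0, 5.5)])

print(tight, loose)
```

With these toy points the two-centroid layout gives a much smaller inertia than the single centroid, matching the "smaller is tighter" reading above.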
Using clustering for image segmentation:
1 Color segmentation: pixels with a similar color are assigned to the same segment. Suppose we have an image containing objects of different colors, such as red apples and yellow bananas. We want to segment the image into two segments: one containing all the red pixels (apples) and one containing all the yellow pixels (bananas).
2 Semantic segmentation: all pixels that are part of the same object type are assigned to the same segment. For example, in an image of a street scene, pixels might be classified as "road," "car," "pedestrian," etc.
3 Instance segmentation: all pixels that are part of the same individual object are assigned to the same segment. In the street scene, there would be a different segment for each pedestrian.
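Color segmentation, the first case above, can be sketched as k-means over pixel colors. This is only an illustration in plain Python: the helper names (`nearest`, `kmeans_colors`), the six RGB "pixels", and the choice of initial centroids are all assumptions made for the example, and a real image would normally be handled with NumPy and scikit-learn's `KMeans`.

```python
def nearest(pixel, centroids):
    """Index of the centroid closest to this pixel (squared RGB distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centroids[i])),
    )


def kmeans_colors(pixels, centroids, n_iter=10):
    """Tiny k-means: alternate between assigning pixels and moving centroids."""
    centroids = [tuple(float(v) for v in c) for c in centroids]
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in pixels]
        for i in range(len(centroids)):
            members = [p for p, lab in zip(pixels, labels) if lab == i]
            if members:  # leave a centroid in place if it attracted no pixels
                centroids[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    labels = [nearest(p, centroids) for p in pixels]
    return labels, centroids


# Hypothetical pixels: three reddish (apples), three yellowish (bananas).
pixels = [
    (200, 30, 40), (220, 20, 30), (210, 25, 35),
    (240, 220, 50), (250, 230, 60), (245, 225, 55),
]

# Assumption: start from one red and one yellow pixel as initial centroids.
labels, centers = kmeans_colors(pixels, centroids=[pixels[0], pixels[3]])

# Segmented image: every pixel is replaced by its cluster's mean color.
segmented = [tuple(round(v) for v in centers[lab]) for lab in labels]
print(labels)
print(segmented)
```

Semantic and instance segmentation need more than color clustering (typically neural networks), so this sketch covers only the color case.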
Another clustering algorithm:
DBSCAN:
The density-based spatial clustering of applications with noise (DBSCAN) algorithm defines clusters as continuous regions of high density.
1 ε-Neighborhood:
For each instance in the dataset, the algorithm defines a region around it called the ε-neighborhood. This region includes all instances that are within a small distance ε (epsilon) from the instance. The ε-neighborhood captures the local density around each instance.
2 Core Instances:
An instance is considered a core instance if it has at least min_samples instances (including itself) within its ε-neighborhood. Core instances are located in dense regions of the dataset.
3 Cluster Formation:
All instances within the ε-neighborhood of a core instance are considered part of the same cluster. This means that core instances act as seeds for cluster formation. Since the ε-neighborhood of a core instance may contain other core instances, a long sequence of neighboring core instances can form a single cluster.
4 Anomalies:
Any instance that is not a core instance and does not have a core instance in its ε-neighborhood is considered an anomaly or noise. Anomalies are typically isolated instances that do not belong to any dense region of the dataset.
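The four steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not a reference implementation: the function name `dbscan` and the toy points are made up, the neighborhood search is brute force, and the label -1 for anomalies is borrowed from scikit-learn's `DBSCAN` convention.

```python
from math import dist


def dbscan(points, eps, min_samples):
    n = len(points)
    # Step 1: eps-neighborhood of every instance (an instance is its own neighbor).
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Step 2: core instances have at least min_samples instances in their neighborhood.
    core = [len(nb) >= min_samples for nb in neighbors]

    labels = [-1] * n  # -1 means "anomaly" until a cluster claims the instance
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        # Step 3: grow a cluster outward from this unvisited core instance.
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    if core[k]:  # only core instances keep expanding the cluster
                        stack.append(k)
        cluster += 1
    # Step 4: whatever is still -1 has no core instance nearby and is an anomaly.
    return labels


# Toy data (hypothetical): two dense squares and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (5, 5),
]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

The isolated point ends up with label -1 because it is not a core instance and has no core instance within ε. A non-core instance that does sit within ε of a core instance is absorbed into that cluster but does not expand it further.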
Gaussian mixtures: this model assumes that the instances were generated from a mixture of several Gaussian distributions whose parameters are unknown.
Other clustering algorithms include agglomerative clustering and spectral clustering.
NLP Engineer | AI Master's Degree From Queen's University