Course: R for Data Science: Lunch Break Lessons

Clustering with pam and clara

- [Instructor] Last week I talked about clustering with k-means, which uses distance from a centroid to determine a cluster. Centroids are nothing more than points that sit in the middle of a cluster. The interesting thing about centroids is that, because they're an average, they don't necessarily need to be part of the actual data set. But you may want to do a clustering where the center of each cluster is part of the data set, and in that case you'll want to use something called PAM, which is short for Partitioning Around Medoids. Medoids are like centroids, except that each one is a member of the data set. Incidentally, where k-means uses the average to place the center of a cluster, PAM picks the medoid, the actual data point with the smallest total distance to the other members of its cluster, which makes it the analogue of a median. So let's find out how to do that. The first thing I've done, and you can see this up in line two, is create a vector called simplequakes and into it, I've placed a subset of the quakes…
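
To make the centroid versus medoid distinction concrete, here is a minimal sketch in R. It is not the code from the lesson: the subset `demo_quakes`, the choice of columns, the row range, and `k = 3` are all assumptions made for illustration. It uses `kmeans()` from base R and `pam()` from the `cluster` package, and checks that every PAM medoid is an actual row of the data.

```r
library(cluster)  # provides pam(); a recommended package in standard R installs

# Hypothetical subset for illustration only (not the lesson's simplequakes):
# the first 200 rows of the built-in quakes data, two numeric columns.
demo_quakes <- quakes[1:200, c("depth", "mag")]

set.seed(1)                               # kmeans() starts from random centers
km <- kmeans(demo_quakes, centers = 3)    # centers are cluster means
pm <- pam(demo_quakes, k = 3)             # centers are medoids (real observations)

km$centers   # averages; generally not rows of demo_quakes
pm$medoids   # each row here is an observation from demo_quakes
pm$id.med    # row positions of the medoids within demo_quakes

# Confirm that the medoids are members of the data set
medoid_rows <- as.matrix(demo_quakes[pm$id.med, ])
medoids_are_data <- all(medoid_rows == pm$medoids)
print(medoids_are_data)
```

The cluster assignment for each row is available in `pm$clustering`, in the same way that `km$cluster` holds the k-means assignment.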