Simple Python Script for Clustering Keywords [ Script Included ]

Venkata Pagadala

SEO AI Product Manager | Gen Ai | RAG | Agentic Ai - Ai Agents | Programmatic SEO (PSEO) | Enterprise & Technical SEO

Published: March 16, 2023

Simple Python Script for Clustering Keywords

This Python script clusters keywords using TF-IDF vectorization and the Agglomerative Clustering algorithm. Here is a brief overview of the functions in the code:

read_keywords(file_path) - reads the keywords from the first column of the CSV file specified by file_path and returns them as a list.

write_clusters_to_csv(file_path, clusters, keywords) - writes the clusters of keywords to a CSV file specified by file_path. The clusters are assigned integer labels and are written to the second column of the output file, with the corresponding keyword in the first column.

text_similarity(keywords) - converts the input keywords into a TF-IDF matrix (one row per keyword, one column per term) and returns it. Despite the function's name, this is a feature matrix, not a pairwise similarity matrix.

cluster_keywords(similarity_matrix, num_clusters) - performs agglomerative clustering on the TF-IDF matrix (passed in as similarity_matrix) using num_clusters clusters and returns the cluster labels.

main() - defines the input file, output file, and number of clusters, reads the keywords from the input file, computes the TF-IDF matrix, performs clustering, and writes the clusters to the output file.

Overall, this code can be used to cluster a set of keywords based on their similarity using TF-IDF vectorization and the Agglomerative Clustering algorithm, and write the resulting clusters to a CSV file.
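To make the two core steps concrete, here is a minimal sketch that runs TF-IDF vectorization and agglomerative clustering on four made-up keywords. The sample keywords and the variable names sample_keywords, tfidf_matrix, and labels are assumptions for illustration and are not part of the original script.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample keywords, for illustration only
sample_keywords = [
    "buy running shoes",
    "cheap running shoes",
    "best pizza recipe",
    "easy pizza recipe",
]

# One row per keyword, one column per distinct term
tfidf_matrix = TfidfVectorizer().fit_transform(sample_keywords)

# Default settings: Ward linkage on Euclidean distances between rows.
# AgglomerativeClustering needs a dense array, hence toarray().
labels = AgglomerativeClustering(n_clusters=2).fit_predict(tfidf_matrix.toarray())

for keyword, label in zip(sample_keywords, labels):
    print(label, keyword)
```

The two "shoes" keywords should share one label and the two "pizza" keywords the other, because keywords with overlapping terms sit closer together in TF-IDF space. Which group receives label 0 and which receives label 1 is arbitrary.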

Script Included

import csv
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer


# Read keywords from the first column of the input file
def read_keywords(file_path):
    keywords = []
    with open(file_path, "r", newline='') as f:
        reader = csv.reader(f)
        for row in reader:
            # Skip blank lines, which csv.reader returns as empty rows
            if row:
                keywords.append(row[0])
    return keywords


# Write clustered keywords to output file
def write_clusters_to_csv(file_path, clusters, keywords):
    with open(file_path, "w", newline='') as f:
        writer = csv.writer(f)
        writer.writerow(["Keyword", "Cluster"])
        for keyword, cluster in zip(keywords, clusters):
            writer.writerow([keyword, cluster])


# Build the TF-IDF matrix (one row per keyword, one column per term)
def text_similarity(keywords):
    vectorizer = TfidfVectorizer()
    keyword_matrix = vectorizer.fit_transform(keywords)
    return keyword_matrix


# Perform clustering on the dense TF-IDF matrix
def cluster_keywords(similarity_matrix, num_clusters):
    clustering = AgglomerativeClustering(n_clusters=num_clusters)
    clusters = clustering.fit_predict(similarity_matrix.toarray())
    return clusters


# Main function
def main():
    input_file = "keywordsinput.csv"
    output_file = "Cluster.csv"
    num_clusters = 5

    keywords = read_keywords(input_file)
    similarity_matrix = text_similarity(keywords)
    clusters = cluster_keywords(similarity_matrix, num_clusters)
    write_clusters_to_csv(output_file, clusters, keywords)


if __name__ == "__main__":
    main()

Input File
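The script reads the first column of every row, so the input file is expected to hold one keyword per row with no header row (a header would be clustered as if it were a keyword). The sketch below writes such a file. The helper name write_sample_input and the sample keywords are hypothetical, chosen only to illustrate the layout.

```python
import csv

# Hypothetical sample keywords, for illustration only
sample_keywords = [
    "buy running shoes",
    "cheap running shoes",
    "best pizza recipe",
    "easy pizza recipe",
    "pizza dough recipe",
    "trail running shoes",
]


def write_sample_input(file_path, keywords):
    """Write one keyword per row, with no header row."""
    with open(file_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for keyword in keywords:
            writer.writerow([keyword])


if __name__ == "__main__":
    write_sample_input("keywordsinput.csv", sample_keywords)
```

Note that the file needs at least as many keywords as num_clusters (5 in the script), otherwise the clustering step will fail.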

Output
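The output file lists each keyword next to an integer cluster label. To review the result cluster by cluster instead of row by row, the labels can be grouped as in the sketch below. The helper group_by_cluster is an assumption added for illustration and is not part of the original script.

```python
from collections import defaultdict


def group_by_cluster(keywords, clusters):
    """Return a dict mapping each cluster label to its list of keywords."""
    groups = defaultdict(list)
    for keyword, cluster in zip(keywords, clusters):
        # int() converts NumPy integer labels to plain Python ints
        groups[int(cluster)].append(keyword)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical keywords and labels, for illustration only
    example = group_by_cluster(
        ["buy running shoes", "best pizza recipe", "cheap running shoes"],
        [0, 1, 0],
    )
    for label, members in sorted(example.items()):
        print(label, members)
```

Inside main(), this could be called as group_by_cluster(keywords, clusters) right after the clustering step.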
