NLP Text Similarity in Python

# Define the documents
doc_trump = "How do I make a Directory entry?"

doc_election = "I make a Directory submit"

#doc_putin = "i am feeling very very bad"

documents = [doc_trump, doc_election]

# Scikit Learn
from sklearn.feature_extraction.text import CountVectorizer
import pandas as pd

# Create the document-term matrix.
# Stop words are kept here; pass stop_words='english' to CountVectorizer to filter them out.
count_vectorizer = CountVectorizer()
sparse_matrix = count_vectorizer.fit_transform(documents)

# OPTIONAL: Convert the sparse matrix to a pandas DataFrame if you want to see the word frequencies.
doc_term_matrix = sparse_matrix.toarray()
df = pd.DataFrame(doc_term_matrix,
                  # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
                  columns=count_vectorizer.get_feature_names_out(),
                  index=['doc_trump', 'doc_election'])
# print(df)  # uncomment to display the word-frequency table

# Compute cosine similarity between every pair of documents
from sklearn.metrics.pairwise import cosine_similarity
print(cosine_similarity(df, df))


Output:

[[1.         0.51639778]
 [0.51639778 1.        ]]
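
To see where the 0.516 comes from, the same score can be worked out by hand. The sketch below is only an illustration, not how scikit-learn computes it internally: the helper names `tokenize` and `cosine` are made up for this example, and the regular expression is assumed to match CountVectorizer's default tokenization (lowercased tokens of two or more word characters, so "I" and "a" are dropped).

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical helper: lowercase the text and keep tokens of 2+ word characters,
    # which is assumed to mirror CountVectorizer's default behaviour.
    return re.findall(r"(?u)\b\w\w+\b", text.lower())


def cosine(text_a, text_b):
    # Hypothetical helper: cosine similarity between the word-count vectors of two texts.
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Dot product over the words of the first text (words missing from the second count as 0).
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


score = cosine("How do I make a Directory entry?", "I make a Directory submit")
print(round(score, 8))
```

The two documents share two words ("make" and "directory"), the first has five counted words and the second has three, so the score is 2 / (sqrt(5) * sqrt(3)) = 2 / sqrt(15), which agrees with the off-diagonal value in the output above.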
