How to Perform Zero-Shot Text Classification Using Hugging Face Transformers Library in Python

Zero-shot classification is a machine learning technique in which a model classifies text into predefined categories without having seen any labeled examples from those categories during training. This is useful when there are many possible classes and training a separate model for each one would be impractical or time-consuming. The Hugging Face Transformers library provides a simple way to perform zero-shot classification using pre-trained language models such as BART.

Notebook link: https://github.com/ArjunAranetaCodes/LangChain-Guides/blob/main/zero_shot_classification_facebook_bart_large_mnli.ipynb

Model link: https://huggingface.co/facebook/bart-large-mnli

!pip install transformers        

!pip install transformers: This command installs the Hugging Face Transformers library, which contains the functionality needed for zero-shot classification.
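Note that the Transformers pipeline also needs a deep learning backend such as PyTorch to run the model. Most Colab and Jupyter environments ship with it preinstalled, but if yours does not, a minimal sketch (assuming a pip-based environment) is to install it alongside Transformers:

!pip install transformers torch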

from transformers import pipeline        

from transformers import pipeline: This imports the pipeline function from the transformers module. The pipeline function allows us to easily use pre-trained models for various NLP tasks, including zero-shot classification.

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")        

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli"): Here we create a pipeline instance for the zero-shot classification task. We also specify that we want to use the pre-trained facebook/bart-large-mnli model, a BART model fine-tuned on the MultiNLI natural language inference dataset.
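If a GPU is available, inference will be considerably faster. The pipeline function accepts a device argument: -1 (the default) runs on CPU, while 0 selects the first CUDA GPU. A minimal sketch, assuming a machine with a CUDA-capable GPU:

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli", device=0)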

sequence_to_classify = "one day I will see the world"        

sequence_to_classify = "one day I will see the world": This variable holds the sequence of text that we want to classify. In this example, we are trying to determine what the person is interested in based on their statement.

candidate_labels = ['travel', 'cooking', 'dancing']        

candidate_labels = ['travel', 'cooking', 'dancing']: These are the candidate labels that our model can choose from. Importantly, these labels can be arbitrary: the model never saw them as training classes, which is exactly what makes the approach zero-shot.
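Under the hood, the pipeline turns each candidate label into a natural language inference hypothesis using a template (the default is "This example is {}.") and asks the model whether the input entails it. You can customize this via the hypothesis_template parameter; as a sketch, a template like the one below may better fit topic-style labels:

classifier(sequence_to_classify, candidate_labels, hypothesis_template="This text is about {}.")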

classifier(sequence_to_classify, candidate_labels)        

classifier(sequence_to_classify, candidate_labels): Finally, we call the classifier object with two arguments - the sequence we want to classify and the list of candidate labels. The call returns a dictionary containing three keys:

  • 'labels' - The candidate labels, sorted from highest to lowest score (not necessarily in the order they were passed in).
  • 'scores' - A list of scores corresponding to each label, representing how confident the model is that the given input belongs to that particular class. The higher the score, the more confident the model. With the default settings (exactly one correct label assumed), the scores sum to 1.
  • 'sequence' - The input sequence passed to the method.

Result

{'sequence': 'one day I will see the world', 'labels': ['travel', 'dancing', 'cooking'], 'scores': [0.9938650727272034, 0.003273802110925317, 0.002861041808500886]}


By analyzing the output, we can conclude that the most likely category for the sentence "one day I will see the world" is 'travel'.
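Since the returned 'labels' list is sorted by descending score, the top prediction can also be read off programmatically. A minimal sketch:

result = classifier(sequence_to_classify, candidate_labels)
print(result['labels'][0])  # prints 'travel', the highest-scoring label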


The result is the dictionary returned by the zero-shot classification pipeline. Let me break down its parts:

  • 'sequence': 'one day I will see the world': The input sequence that was classified.
  • 'labels': ['travel', 'dancing', 'cooking']: The candidate labels considered during classification, ordered from highest to lowest score.
  • 'scores': [0.9938650727272034, 0.003273802110925317, 0.002861041808500886]: The probability associated with each label, in the same order: roughly 0.994 for 'travel', 0.003 for 'dancing', and 0.003 for 'cooking'. The highest score by far is assigned to 'travel', indicating that the model confidently identifies travel as the dominant theme of the sequence - which matches our own reading of the sentence.
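By default the pipeline assumes exactly one candidate label is correct, so the scores are normalized to sum to 1. If several labels could apply to the same text at once, you can pass multi_label=True, which scores each label independently. A minimal sketch:

classifier(sequence_to_classify, candidate_labels, multi_label=True)

With multi_label=True the returned scores no longer sum to 1; each score falls between 0 and 1 on its own, since every label is judged independently of the others.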
