Getting Started With Hugging Face: Installation and Setup


Hub Client Library:

The huggingface_hub library allows us to interact with the Hugging Face Hub, a machine learning platform. Hundreds of pre-trained models and datasets are hosted on the Hub, and models and datasets that we create can be hosted there too.

The library is written for Python.

INSTALLATION:

To install huggingface_hub:


pip install git+https://github.com/huggingface/huggingface_hub        

Repositories on the Hub are git version controlled, and we can download a single file or the whole repository. Files are downloaded with hf_hub_download(), which downloads and caches a file on our local disk.


from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="google/pegasus-xsum", filename="config.json")        

Login

In many cases, we must be logged in with a Hugging Face account to interact with the Hub, for example to download private repositories or to upload files.

huggingface-cli login 
huggingface-cli login --token $HUGGINGFACE_TOKEN        

OR

We can also log in programmatically using login() in a notebook or a script:

from huggingface_hub import login
login()        
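Calling login() with no arguments prompts for a token interactively. The same token used with the command line above can also be passed in directly; a minimal sketch, reading it from the HUGGINGFACE_TOKEN environment variable so the token stays out of the source code:

```python
import os

from huggingface_hub import login

# Read the token from the environment (the variable name matches the
# CLI example above); only attempt the login if it is actually set.
token = os.environ.get("HUGGINGFACE_TOKEN")
if token:
    login(token=token)
```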

Create a repository

To create a repository, use create_repo():


from huggingface_hub import HfApi
api = HfApi()
api.create_repo(repo_id="super-cool-model")        
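create_repo() also accepts keyword arguments for common variations. A small sketch (the repo ID is a placeholder) creating a private dataset repository, with exist_ok=True so the call does not fail if the repository already exists:

```python
from huggingface_hub import HfApi

api = HfApi()
# Create a private *dataset* repository instead of the default model
# repository; exist_ok=True makes the call idempotent.
api.create_repo(
    repo_id="username/my-dataset",
    repo_type="dataset",
    private=True,
    exist_ok=True,
)
```

Note that this call requires being logged in, since it creates the repository under your account.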

Download files

Repositories on the Hub are git version controlled, and users can download a single file or the whole repository. The hf_hub_download() function can be used to download files; it downloads and caches a file on your local disk.

The repository ID and the filename are required to download a file.

Example: consider the Pegasus model's config file.

Pegasus:

  • Pegasus’ pretraining task is intentionally similar to summarization: important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary.
  • Pegasus achieves SOTA summarization performance on all 12 downstream tasks, as measured by ROUGE and human eval.

Tips:

  • Sequence-to-sequence model with the same encoder-decoder model architecture as BART. Pegasus is pre-trained jointly on two self-supervised objective functions: Masked Language Modeling (MLM) and a novel summarization specific pretraining objective, called Gap Sentence Generation (GSG).
  • MLM: encoder input tokens are randomly replaced by a mask token and have to be predicted by the encoder (as in BERT).
  • GSG: whole encoder input sentences are replaced by a second mask token and fed to the decoder, which has a causal mask to hide the future words, like a regular auto-regressive transformer decoder.



from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="google/pegasus-xsum", filename="config.json")        
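hf_hub_download() returns the local path of the cached file, so the downloaded config can be opened like any other local file; a small sketch reading it as JSON:

```python
import json

from huggingface_hub import hf_hub_download

# The function returns the path of the file in the local cache.
config_path = hf_hub_download(
    repo_id="google/pegasus-xsum", filename="config.json"
)

# The config is plain JSON, so it can be parsed with the standard library.
with open(config_path) as f:
    config = json.load(f)

print(config["model_type"])
```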


Upload files

To upload files, use upload_file().

The following must be specified:

1. The path of the file to upload

2. The path of the file in the repo

3. The repo ID of the repository to add the file to.

from huggingface_hub import HfApi
api = HfApi()
api.upload_file(
    path_or_fileobj="/home/dir/...",
    path_in_repo="Readme.md",
    repo_id="username/test_dataset",
)        

Installation

Install with pip


pip install --upgrade huggingface_hub        

Install optional dependencies


pip install 'huggingface_hub[tensorflow]'        


pip install 'huggingface_hub[cli,torch]'
        


Install from source


pip install git+https://github.com/huggingface/huggingface_hub

or
        


pip install git+https://github.com/huggingface/huggingface_hub@my-feature-branch        


Editable install



git clone https://github.com/huggingface/huggingface_hub.git

cd huggingface_hub
pip install -e .        


Installing with Conda


conda install -c conda-forge huggingface_hub        
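Whichever installation method is used, the install can be checked by importing the package and printing its version:

```shell
python -c "import huggingface_hub; print(huggingface_hub.__version__)"
```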

Conclusion:

This article covered what Hugging Face is, along with the installation and setup. The following articles will cover more concepts related to Hugging Face.
