Getting Started With Hugging Face: Installation and Setup


Hub Client Library:

The huggingface_hub library allows us to interact with the Hugging Face Hub, a machine learning platform. Hundreds of pre-trained models and datasets are hosted on the Hub, and models and datasets that we create can be hosted there too.

The library is written for Python.

INSTALLATION:

To install huggingface_hub:


pip install git+https://github.com/huggingface/huggingface_hub        

Repositories on the Hub are git version controlled, and we can download a single file or the whole repository. Files are downloaded with hf_hub_download(), which downloads and caches a file on our local disk.


from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="google/pegasus-xsum", filename="config.json")        

Login

In many cases, we must be logged in with a Hugging Face account to interact with the Hub, for example to download private repositories or to upload files.

huggingface-cli login 
huggingface-cli login --token $HUGGINGFACE_TOKEN        

OR

We can also log in programmatically using login() in a notebook or a script:

from huggingface_hub import login
login()        
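Calling login() with no arguments prompts for a token interactively. The same token used with the command line above can also be passed in directly; a minimal sketch, reading it from the HUGGINGFACE_TOKEN environment variable so the token stays out of the source code:

```python
import os

from huggingface_hub import login

# Read the token from the environment (the variable name matches the
# CLI example above); only attempt the login if it is actually set.
token = os.environ.get("HUGGINGFACE_TOKEN")
if token:
    login(token=token)
```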

Create a repository

To create a repository, use create_repo():


from huggingface_hub import HfApi
api = HfApi()
api.create_repo(repo_id="super-cool-model")        
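create_repo() also accepts keyword arguments for common variations. A small sketch (the repo ID is a placeholder) creating a private dataset repository, with exist_ok=True so the call does not fail if the repository already exists:

```python
from huggingface_hub import HfApi

api = HfApi()
# Create a private *dataset* repository instead of the default model
# repository; exist_ok=True makes the call idempotent.
api.create_repo(
    repo_id="username/my-dataset",
    repo_type="dataset",
    private=True,
    exist_ok=True,
)
```

Note that this call requires being logged in, since it creates the repository under your account.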

Download files

Repositories on the Hub are git version controlled, and users can download a single file or the whole repository. The hf_hub_download() function can be used to download files; it downloads and caches a file on your local disk.

The repository ID and the filename are required to download a file.

Example: consider the Pegasus model's config file.

Pegasus:

  • Pegasus’ pretraining task is intentionally similar to summarization: important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary.
  • Pegasus achieves SOTA summarization performance on all 12 downstream tasks, as measured by ROUGE and human eval.

Tips:

  • Sequence-to-sequence model with the same encoder-decoder model architecture as BART. Pegasus is pre-trained jointly on two self-supervised objective functions: Masked Language Modeling (MLM) and a novel summarization specific pretraining objective, called Gap Sentence Generation (GSG).
  • MLM: encoder input tokens are randomly replaced by a mask token and have to be predicted by the encoder (as in BERT).
  • GSG: whole encoder input sentences are replaced by a second mask token and fed to the decoder, which has a causal mask to hide the future words, like a regular auto-regressive transformer decoder.



from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="google/pegasus-xsum", filename="config.json")        
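hf_hub_download() returns the local path of the cached file, so the downloaded config can be opened like any other local file; a small sketch reading it as JSON:

```python
import json

from huggingface_hub import hf_hub_download

# The function returns the path of the file in the local cache.
config_path = hf_hub_download(
    repo_id="google/pegasus-xsum", filename="config.json"
)

# The config is plain JSON, so it can be parsed with the standard library.
with open(config_path) as f:
    config = json.load(f)

print(config["model_type"])
```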


Upload files

To upload files, use upload_file().

The following must be specified:

1. The path of the file to upload

2. The path of the file in the repo

3. The repo ID of the repository to add the file to.

from huggingface_hub import HfApi
api = HfApi()
api.upload_file(
    path_or_fileobj="/home/dir/...",
    path_in_repo="Readme.md",
    repo_id="username/test_dataset",
)        

Installation

Install with pip


pip install --upgrade huggingface_hub        

Install optional dependencies


pip install 'huggingface_hub[tensorflow]'        


pip install 'huggingface_hub[cli,torch]'
        


Install from source


pip install git+https://github.com/huggingface/huggingface_hub

or
        


pip install git+https://github.com/huggingface/huggingface_hub@my-feature-branch        


Editable install



git clone https://github.com/huggingface/huggingface_hub.git

cd huggingface_hub
pip install -e .        


Installing with Conda


conda install -c conda-forge huggingface_hub        
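Whichever installation method is used, the install can be checked by importing the package and printing its version:

```shell
python -c "import huggingface_hub; print(huggingface_hub.__version__)"
```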

Conclusion:

This article covered what Hugging Face is, along with the installation and setup. The following articles will cover more concepts related to Hugging Face.
