Large Language Models - the new best thing since sliced bread.
If you've been curious about LLMs (large language models), you're not alone. Everywhere you turn, people are raving about the incredible capabilities of these machine learning models that use deep learning algorithms to process and understand natural language. They are the new best thing since sliced bread, and it's no wonder why. Midjourney can create artwork, ChatGPT can write poetry - the possibilities seem endless.
But how do you get started with LLMs? It can be overwhelming to navigate this new space, especially if you're not familiar with the science behind it. That's where this post comes in. I’ll focus on how to use existing LLMs in a simple standalone (but not optimized by any means) Python program to get some results. Once you've mastered this, you'll be able to take the leap to integrate LLMs into your own applications with ease.
But first, let's answer a basic question:
What are LLMs, exactly? In simple terms, they are machine learning models that use deep learning algorithms to process and understand natural language. They are typically trained on vast amounts of textual data, which means they take a long time to train (at least, the bigger and more generic ones). For example, GPT-4 reportedly took between four and seven months to train.
There are many LLMs available for use (more than 100K by some counts), and a few of the most popular ones are Bloom, GPT-4, BERT, XLNet, T5, and RoBERTa. Integrating with LLMs can seem daunting, but Hugging Face provides a simple hub of open-source models for natural language processing, computer vision, and other AI areas. With their APIs and a little bit of setup, you'll be able to start using LLMs in your own applications in no time.
To leverage the Hugging Face APIs, start by creating an account and verifying your email address. Once you have an active account, click on your profile icon in the top-right corner of the page, navigate to Settings, and select Access Tokens from the left menu bar. From there, create a new token and set its role to "write." This token will be used later to authorize the API calls in the exercises.
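In the exercises below I keep a literal {API_TOKEN} placeholder in the code for simplicity, but in practice you'd rather not hard-code the token. Here's a minimal sketch of reading it from an environment variable instead (HF_API_TOKEN is just a name I picked for this example, and the whoami-v2 endpoint is, as far as I know, what the huggingface_hub client itself uses to validate a token):

import os
import requests

# Read the access token from an environment variable instead of hard-coding it.
# HF_API_TOKEN is an arbitrary name chosen for this example.
API_TOKEN = os.environ['HF_API_TOKEN']
headers = {'Authorization': f'Bearer {API_TOKEN}'}

# Optional sanity check: ask Hugging Face who this token belongs to.
response = requests.get('https://huggingface.co/api/whoami-v2', headers=headers)
print(response.status_code, response.json())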
What’s next? The exciting part – let’s write some code!
Exercise 1: Translation from English to German. Head over to Hugging Face to see what pre-trained, translation-specific natural language processing models are available to us. The first one in the list is the t5-base model, which supports translation between English, French, German, and Romanian.
In your favorite Python IDE (I use Wing), type in and execute the following code:
import requests
from pprint import pprint

API_URL = 'https://api-inference.huggingface.co/models/t5-base'
# {API_TOKEN} is just a placeholder; replace it with your own Hugging Face API access token.
headers = {'Authorization': 'Bearer {API_TOKEN}'}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

params = {'min_length': 10, 'max_length': 400, 'top_k': 2, 'temperature': 0.9}
output = query({
    'inputs': 'Translate to German: This post is about how to use large language models',
    'parameters': params,
})
pprint(output)
--output--
[{'translation_text': 'In diesem Beitrag geht es um die Verwendung großer '
                      'Sprachmodelle.'}]
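A quick aside before moving on: the very first request to a model is sometimes answered with an error while Hugging Face loads the model onto a server. Below is a hedged sketch of a more patient query function; the wait_for_model option comes from the Inference API documentation, while the retry count and delay are arbitrary values I chose.

import time
import requests

API_URL = 'https://api-inference.huggingface.co/models/t5-base'
headers = {'Authorization': 'Bearer {API_TOKEN}'}  # placeholder, as above

def query_with_retry(payload, retries=3, delay=10):
    # Ask the API to hold the request until the model is loaded,
    # and retry a few times if we still get a non-200 response.
    payload = dict(payload, options={'wait_for_model': True})
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 200:
            return response.json()
        time.sleep(delay)
    response.raise_for_status()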
Exercise 2: Question & Answer. There are many conversational pretrained models you can choose from. I chose Microsoft's DialoGPT-medium.
import requests
from pprint import pprint

API_URL = 'https://api-inference.huggingface.co/models/microsoft/DialoGPT-medium'
# {API_TOKEN} is just a placeholder; replace it with your own Hugging Face API access token.
headers = {'Authorization': 'Bearer {API_TOKEN}'}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

params = {'min_length': 10, 'max_length': 400, 'top_k': 2, 'temperature': 0.9, 'max_tokens': 150, 'pad_token_id': 50256}
output = query({
    'inputs': 'Can you travel to the sun?',
    'parameters': params,
})
pprint(output)
--output--
{'conversation': {'generated_responses': ["I can't, but I can travel to the "
                                          "moon."]}}
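The single question above is stateless; to hold an actual back-and-forth conversation, the conversational task lets you pass the earlier turns back in. Here is a sketch that builds on the query function above; the past_user_inputs / generated_responses / text structure follows the format documented for conversational models, but it's worth double-checking against the current API docs.

# Feed the previous exchange back in so the model sees the conversation so far.
output = query({
    'inputs': {
        'past_user_inputs': ['Can you travel to the sun?'],
        'generated_responses': ["I can't, but I can travel to the moon."],
        'text': 'How long would that take?',
    },
})
pprint(output)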
Exercise 3: Summarize a movie plot. For this exercise, I copied the plot of the movie Forrest Gump from Wikipedia into a locally stored doc file. I named this file forest_gump.docx and stored it in the same folder as the Python program. To summarize the plot, I used Facebook’s bart-large-cnn model.
import requests
import docx2txt
from pprint import pprint

API_URL = 'https://api-inference.huggingface.co/models/facebook/bart-large-cnn'
# {API_TOKEN} is just a placeholder; replace it with your own Hugging Face API access token.
headers = {'Authorization': 'Bearer {API_TOKEN}'}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

params = {'do_sample': False, 'max_length': 800, 'min_length': 120}

def text_from_file(file):
    # Extract the raw text from the .docx file and normalize tabs to spaces.
    text = docx2txt.process(file)
    if text:
        return text.replace('\t', ' ')
    return None

to_summarize = text_from_file('forest_gump.docx')
output = query({
    'inputs': to_summarize,
    'parameters': params,
})
pprint(output)
--output--
[{'summary_text': 'As a boy in 1956, Forrest Gump has an IQ of 75 and is '
                  'fitted with leg braces to correct a curved spine. He lives '
                  'in Greenbow, Alabama, with his mother, who runs a boarding '
                  'house and encourages him to live beyond his disabilities. '
                  'In 1981, a man named Forrest Gump recounts his life '
                  'story to strangers who happen to sit next to him at a bus '
                  'stop. Forrest is awarded the Medal of Honor for his heroism '
                  'by President Lyndon B. Johnson. In 1974, Forrest is '
                  'honorably discharged from the Army and returns to Greenbow '
                  'where he makes a company that makes ping-pong.'}]
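One caveat: bart-large-cnn can only attend to a limited number of input tokens, so a very long plot may get truncated. Here's a rough sketch that builds on the code above and summarizes the document in paragraph-sized chunks instead; the 3,000-character budget is an arbitrary stand-in for the model's real token limit, not an exact figure.

def chunk_text(text, max_chars=3000):
    # Group paragraphs into chunks of at most roughly max_chars characters.
    chunks, current = [], ''
    for paragraph in text.split('\n\n'):
        if current and len(current) + len(paragraph) > max_chars:
            chunks.append(current)
            current = ''
        current += paragraph + '\n\n'
    if current:
        chunks.append(current)
    return chunks

# Summarize each chunk separately and stitch the partial summaries together.
summaries = []
for chunk in chunk_text(to_summarize):
    result = query({'inputs': chunk, 'parameters': params})
    summaries.append(result[0]['summary_text'])
print('\n\n'.join(summaries))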
Notes: If Python complains about a missing module (requests or docx2txt, for example), install it with:
>>python -m pip install <missing_module_name>
In this post, we've learned how to use Hugging Face APIs to interact with pretrained models (t5-base, DialoGPT-medium, and bart-large-cnn) through a simple Python program. We've explored three different use cases, including translation, Q&A, and summarization.
To continue building on this knowledge, a few suggested next steps: try other pretrained models and tasks from the Hugging Face Hub, experiment with the generation parameters, and integrate these API calls into your own applications. By exploring these possibilities, you can further enhance your skills in using Hugging Face's APIs and building more sophisticated applications with natural language processing capabilities.