A Small Overview and Demo of Google Flan-T5 Model
Balayogi G
Interdisciplinary enthusiast | Ph.D. Candidate in Computer Science | UGC NET Certified | Human-Computer Interaction | Accessibility | Usable Security | Artificial Intelligence | Computational Security
This article presents a short overview and demo of Google's FLAN-T5 model.
The content of this post:
What is the FLAN-T5 model?
Packages for running the FLAN-T5 model
Demo of the Google FLAN-T5 model
What is the FLAN-T5 model?
FLAN-T5 is a combination of two things: FLAN (Finetuned LAnguage Net), an instruction fine-tuning approach, and T5, a text-to-text language model developed and published by Google in 2020. FLAN-T5 improves on the original T5 model by fine-tuning it on instruction-formatted tasks, which makes its zero-shot learning more effective. Google has developed and published several other language models, including BERT (2018), PaLM (2022), and LaMDA (2022).
The FLAN-T5 model comes in several variants that differ in the number of parameters: small, base, large, XL, and XXL.
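As a quick reference, these variants map to checkpoint names on the Hugging Face Hub. The identifiers below are the publicly listed ones; it is worth verifying them on the Hub before use.
# FLAN-T5 variants and their Hugging Face Hub checkpoint names
# (listed here for convenience; verify on the Hub before downloading)
FLAN_T5_CHECKPOINTS = {
    "small": "google/flan-t5-small",
    "base": "google/flan-t5-base",
    "large": "google/flan-t5-large",
    "xl": "google/flan-t5-xl",
    "xxl": "google/flan-t5-xxl",
}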
Packages for running the FLAN-T5 model
The following packages can be installed using the pip command in the console:
pip install transformers
pip install sentencepiece
pip install accelerate
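A quick way to confirm the installation worked is a minimal import check, assuming the packages were installed into the active Python environment:
python -c "import transformers, sentencepiece, accelerate; print(transformers.__version__)"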
Demo of Google Flan-T5 model
Step 1: Import the packages and download the Google FLAN-T5 model. (In this example, I used the Google FLAN-T5 large model, which has 780M parameters.)
from transformers import T5Tokenizer, T5ForConditionalGeneration
# Load the tokenizer and model; device_map="auto" lets accelerate place the model on the available device(s)
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large", device_map="auto")
Step 2: Write a function that passes the query to the model and generates the result. (This function is adapted from a post by Koki Noda.)
def inference(input_text):
    # Tokenize the input and move it to the GPU (assumes a CUDA device is available)
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
    # Generate up to 200 tokens
    outputs = model.generate(input_ids, max_length=200, bos_token_id=0)
    # Decode the generated tokens back to text and print the result
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(result)
Step 3: Pass the input text to the model and print the results.
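For example, a minimal call might look like this (the prompt below is only an illustration, not the exact query used in the original post):
# Illustrative prompt; any instruction-style query can be substituted
inference("Translate the following sentence to German: The house is wonderful.")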
As this model variant is relatively small (780M parameters), it may answer some queries correctly and others incorrectly.