RAG System with Video
Kiruthika Subramani
Innovating AI for a Better Tomorrow | AI Engineer | Google Developer Expert | Author | IBM Dual Champion | 200+ Global AI Talks | Master's Student at MILA
Hello everyone, it’s Friday, and guess who’s back? Hope you all had a fantastic week! This week, let’s dive into building a RAG system with a YouTube video.
No more procrastination this time, I promise! I’ll bring it to your devices on time.
If the article is delivered on time, don’t forget to leave a tip and a 5-star rating. I mean, likes and comments! Back in India, I used to hear this whenever I ordered food.
Now, I’m hungry just thinking about it. Let me finish this and grab my breakfast! We all love YouTube, right? Swiping through videos, each one not more than a minute long. And the most common phrase we hear is, “Like, share, subscribe!” I heard someone got a BMW with their YouTube income. Impressive! I spend 15 minutes every day contributing to someone’s income. Proud of myself!
I watch countless cooking videos, but I still end up making curd rice every day. I love that. Even in the freezing winter, haha.
I’ve noticed something: we’re not watching YouTube videos, just the Shorts. The moral of the story is, when you feel sleepy, sleep. Don’t strain yourself swiping for leisure entertainment. Not more than 15 minutes a day!
We have tons of resources on YouTube. How about building a RAG that takes the video, extracts the transcript, and answers my questions?
Let’s say some YouTubers say, “Watch until the end to know the truth.” Just take the video, don’t waste your time. Give it to this RAG, and it will give you the answer instantly. Come on, let’s start building it!
Disclaimer: no YouTuber’s income is harmed here.
Officially, welcome to the second episode of AI Weekly with Krithi!
Example of RAG with a YouTube Video
We download the transcript of a YouTube video and use an LLM to extract information from it. That’s what we are going to build.
Install the Dependencies
!pip3 install langchain
!pip3 install langchain_pinecone
!pip3 install langchain[docarray]
!pip3 install docarray
!pip3 install pypdf
!pip3 install youtube_transcript_api
Why do we need to install these? langchain provides the orchestration framework, docarray (together with langchain[docarray]) powers the in-memory vector store we’ll use, and youtube_transcript_api fetches the video transcript. langchain_pinecone and pypdf are optional extras here, handy if you later want to swap in a Pinecone index or load PDFs.
Download an Example Transcript from a YouTube Video
You can change the video ID to download other transcripts. We save the content to a file.
import os
from youtube_transcript_api import YouTubeTranscriptApi

srt = YouTubeTranscriptApi.get_transcript("SWm86rBsECw")  # CHANGE THE ID OF THE VIDEO

os.makedirs("./files", exist_ok=True)  # make sure the output folder exists
with open("./files/youtube_transcription.txt", "w") as file:
    for i in srt:
        file.write(i['text'] + " ")  # add a space so segments don't run together
This code gets the transcript from a YouTube video with the ID "SWm86rBsECw" and saves it as a text file named "youtube_transcription.txt".
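If you only have the full YouTube URL rather than the ID, a small helper like this can pull the ID out for you. This is my own sketch, not part of the original notebook, and it assumes a standard watch URL.

from urllib.parse import urlparse, parse_qs

def video_id_from_url(url):
    # Works for URLs like https://www.youtube.com/watch?v=SWm86rBsECw
    query = parse_qs(urlparse(url).query)
    return query["v"][0]

video_id_from_url("https://www.youtube.com/watch?v=SWm86rBsECw")  # -> 'SWm86rBsECw'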
What’s the video about?
Let’s ask our RAG, but if you’re really curious, here’s a hint: it’s Mr. Bean’s birthday party, celebrated alone. It evokes different emotions depending on your mindset.
Don’t you believe me? If you watch it from an audience perspective, you will laugh. If you imagine yourself as Mr. Bean in the situation, it hurts.
Beloved Birthday Wishes from Kiruthika to whoever is reading this article.
I heard your mind’s voice. Today is not my birthday. I forgot to add these terms - belated or advance, whichever applies to you.
But the wishes from my heart are heartfelt.
And the next part is
Select the LLM model to use
The model must be downloaded locally to be used, so if you want to run llama3, you should run:
ollama pull llama3
Check the list of models available for Ollama here: https://ollama.com/library
You need to choose which LLM you want to use. Ollama offers a variety of models, including Llama 3, Phi 3, Mistral, Gemma 2, and more. Once you’ve selected the model, you need to download it locally to use it.
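If you’re not sure which models you have already pulled to your machine, Ollama’s CLI can list them. This is just a quick check, not a required step:

ollama list   # show the models already downloaded on this machine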
We instantiate the model and the embeddings
That is, we create the model object and the embeddings object that turns text into numerical vectors, which we’ll use for similarity search.
#MODEL = "gpt-3.5-turbo"
#MODEL = "mixtral:8x7b"
#MODEL = "gemma:7b"
#MODEL = "llama2"
MODEL = "llama3" # https://ollama.com/library/llama3
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)
This code sets up and initializes the Llama 3 model (or whichever model you uncomment above) and its embeddings. It imports the necessary classes from the langchain_community library and creates instances of the model and the embeddings.
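Before moving on, a quick sanity check is worth running. This snippet is my own addition and assumes the Ollama server is running locally with the chosen model already pulled:

# Smoke test: confirm the model responds and the embeddings produce a vector
print(model.invoke("Reply with the single word: ready"))
print(len(embeddings.embed_query("hello")))  # length of one embedding vector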
Now, let us load the transcription previously saved using TextLoader
from langchain_community.document_loaders import TextLoader
loader = TextLoader("./files/youtube_transcription.txt")
text_documents = loader.load()
text_documents
Let us split the document into chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
text_documents = text_splitter.split_documents(text_documents)[:5]
text_documents
This code imports the RecursiveCharacterTextSplitter class from the langchain.text_splitter module and uses it to split the text into smaller chunks. It then keeps only the first five chunks of the split documents.
Why first 5 chunks?
The choice to take only the first five chunks is for demonstration and testing purposes. By selecting a manageable number of chunks, the code can easily showcase how the text is split without overwhelming the user with too much output, and it makes the splitting step quick to inspect. For a real use case you would keep all the chunks so the whole video is searchable, as shown in the variation below.
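Here is a minimal variation (my own sketch, reusing the same loader and splitter from above) that indexes the entire transcript instead of only the first five chunks:

# Index the whole transcript instead of only the first five chunks
all_chunks = text_splitter.split_documents(loader.load())
print(f"Total chunks: {len(all_chunks)}")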
Store the transcript chunks in a vector store.
DocArrayInMemorySearch is a tool that stores documents in your computer’s memory, making it easy to quickly search through small sets of documents without needing a full database. It’s great for simple, small-scale projects where you want fast and straightforward document searching.
from langchain_community.vectorstores import DocArrayInMemorySearch
vectorstore = DocArrayInMemorySearch.from_documents(text_documents, embedding=embeddings)
retriever = vectorstore.as_retriever()
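To see what the vector store actually returns, you can run a quick similarity search. This check is my own addition, and the query string is just an example:

# Peek at the chunks most similar to a sample query
for doc in vectorstore.similarity_search("birthday", k=2):
    print(doc.page_content)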
We instantiate the parser
from langchain_core.output_parsers import StrOutputParser
parser = StrOutputParser()
But why?
StrOutputParser is likely used to process and parse output data into a string format. This can be useful when you need to convert various types of output into a consistent string format for further processing or display.
Generate the conversation template
from langchain.prompts import PromptTemplate
template = """
Answer the question based on the context below. If you can't
answer the question, answer with "Your video doesn't talk about this, I don't know".
Context: {context}
Question: {question}
"""
prompt = PromptTemplate.from_template(template)
prompt.format(context="Here is some context", question="Here is a question")
This code creates a prompt template and formats it with given context and question.
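As an aside, the same pieces can also be wired together with LangChain's pipe (LCEL) syntax. This is a sketch of that alternative, not the code used in the next step:

# Optional: compose retriever, prompt, model, and parser into one chain
from langchain_core.runnables import RunnablePassthrough

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | parser
)
# chain.invoke("What is the video about?")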
Let us extract the information from the video!
questions = [
    "What did Mr. Bean order at his lonely birthday party that he didn't like?"
]

for question in questions:
    # Retrieve the chunks most relevant to this question, then ask the model
    retrieved_context = retriever.invoke(question)
    formatted_prompt = prompt.format(context=retrieved_context, question=question)
    response_from_model = model.invoke(formatted_prompt)
    parsed_response = parser.parse(response_from_model)
    print(f"Question: {question}")
    print(f"Answer: {parsed_response}")
    print()
What dish is it? Do you have any guesses? Run the code and let me know in the comment section!
View the full code Here
We successfully built a Retrieval-Augmented Generation (RAG) system using a YouTube video.
Thank you for joining me on the second episode of AI Weekly with Krithi!
I hope you found it informative and engaging.
See you next week for more exciting AI topics and practical demos.
Have a great week ahead and stay tuned!
Cheers,
Kiruthika Subramani.