GenerativeAI - RAG Application using LLaMa2 LLaMaIndex Replicate for Named Entity Recognition (NER)

Arun R

AI/ML Consultant @ Deloitte USI | AI | ML | Generative AI | Deep Learning | Data Science

Published: April 2, 2024

Retrieval-Augmented Generation (RAG) pairs a language model with a retrieval step, so the model answers from relevant passages of your own documents rather than from its training data alone. This article applies that pattern to Named Entity Recognition (NER) using Llama 2, an openly available Large Language Model (LLM) developed by Meta. Llama 2 was trained on a large text corpus, which lets it read and generate text across many domains, a useful property when entity names vary from document to document.

LlamaIndex supplies the indexing and retrieval layer: it loads the documents, splits them into chunks, embeds them, and fetches the chunks most relevant to a query so they can be passed to Llama 2. Replicate is a cloud API for running hosted models; here it serves the Llama 2 70B chat model, so the LLM does not have to run on local hardware.

Together, RAG, Llama 2, LlamaIndex, and Replicate give a compact pipeline for extracting entities from your own documents.

1) Install Packages

pip install llama_index pypdf sentence_transformers llama-index-llms-huggingface llama-index-llms-replicate llama-index-embeddings-langchain

pip install -q transformers bitsandbytes accelerate langchain

2) Import Dependencies

import os  # needed in step 5 to set REPLICATE_API_TOKEN

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ServiceContext, PromptTemplate
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms.replicate import Replicate
import torch
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index.embeddings.langchain import LangchainEmbedding

3) Load All the Required Documents

All the documents to be processed are loaded with the SimpleDirectoryReader class from LlamaIndex.

documents = SimpleDirectoryReader("YOUR_DIRECTORY_PATH").load_data()

4) Write the Prompt

LLMs often perform poorly on NER out of the box. In my experience, though, the prompt below gave good results. Adapting it with domain-specific entity names, and with other names that appear in your documents, should improve the output further.

system_prompt="""
You are a Named Entity Recognizer.
You need to extract the given entities from the data.
The extraction must be accurate"""

query_wrapper_prompt=PromptTemplate("<|USER|>{query_str}<|ASSISTANT|>")
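
The `<|USER|>` and `<|ASSISTANT|>` tags in the wrapper above resemble the StableLM chat tags. Llama 2 chat models were trained on a different layout that uses `[INST]` and `<<SYS>>` markers. The sketch below shows that layout in plain Python. `format_llama2_prompt` is a hypothetical helper, not part of LlamaIndex, and whether the Replicate-hosted model already applies this template on its side is an assumption worth checking before you swap it in.

```python
def format_llama2_prompt(system_prompt: str, query_str: str) -> str:
    """Wrap a system prompt and a user query in the Llama 2 chat layout."""
    return (
        "[INST] <<SYS>>\n"
        f"{system_prompt.strip()}\n"
        "<</SYS>>\n\n"
        f"{query_str.strip()} [/INST]"
    )


# Illustrative inputs only; substitute your own system prompt and query.
example = format_llama2_prompt(
    "You are a Named Entity Recognizer.",
    "Extract Company_Name from the invoice.",
)
print(example)
```

If the extracted entities look noisy, comparing results with and without this layout is a cheap experiment.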

5) Load LLaMa2-70B Model using Replicate

Here I use the Llama 2 70B chat model hosted on Replicate.

os.environ["REPLICATE_API_TOKEN"] = "YOUR_REPLICATE_API_KEY"

llama2_70b = "replicate/llama70b-v2-chat:2d19859030ff705a87c746f7e96eea03aefb71f166725aee39692f1476566d48"

llm = Replicate(
    context_window=4096,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.02, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    model=llama2_70b,
    device_map="auto")
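
The API token in this step does not have to be written into the script. A minimal sketch, assuming the token was exported in the shell beforehand: `get_replicate_token` is a hypothetical helper that reads `REPLICATE_API_TOKEN` from a mapping such as `os.environ` and fails early with a readable message if it is missing.

```python
import os
from typing import Mapping


def get_replicate_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Replicate token from the environment, or fail with a clear message."""
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REPLICATE_API_TOKEN is not set; export it before running the script."
        )
    return token


# Typical use at the top of the script (raises if the variable is missing):
# token = get_replicate_token()
```

This keeps the key out of notebooks and version control, which matters once the code is shared.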

6) Load the Embedding Model

embed_model=LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2"))

service_context=ServiceContext.from_defaults(
    chunk_size=1024,
    llm=llm,
    embed_model=embed_model
)
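
The `chunk_size=1024` setting controls how the documents are cut into pieces before they are embedded. The sketch below illustrates the idea only: it counts whitespace-separated words, whereas LlamaIndex counts tokens and tries to respect sentence boundaries. `split_into_chunks` and its `overlap` parameter are hypothetical names introduced for this example.

```python
def split_into_chunks(text: str, chunk_size: int = 1024, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size words, sharing `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in split_into_chunks(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Smaller chunks make retrieval more precise but can separate an entity from its surrounding context; larger chunks do the opposite.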

7) Load the Vectors to Vector Store

The vectors produced by the embedding model are loaded into VectorStoreIndex, which by default keeps them in an in-memory vector store. Any other vector store can be used here.

index=VectorStoreIndex.from_documents(documents,service_context=service_context)

query_engine = index.as_query_engine()
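
Conceptually, the query engine embeds the query and returns the chunks whose vectors are most similar to it. The toy sketch below shows that ranking with cosine similarity in plain Python. The three-dimensional vectors and the chunk names are made up for illustration; real embeddings from all-mpnet-base-v2 have 768 dimensions, and this is not LlamaIndex's actual implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up vectors standing in for embedded chunks.
toy_vectors = {
    "invoice_header": [0.9, 0.1, 0.0],
    "payment_terms": [0.2, 0.8, 0.1],
    "footer": [0.0, 0.1, 0.9],
}
print(top_k([1.0, 0.2, 0.0], toy_vectors, k=2))
# → ['invoice_header', 'payment_terms']
```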

8) Shoot the Query

This is the query I used to extract the entities from the document. The query must include something that identifies the target document, such as a unique name or ID. Otherwise the retrieved context, and therefore the answer, may mix information from several documents.

response = query_engine.query("""From {SOME REFERENCE OF THE DOCUMENT}, extract the following entities
Invoice_Date (i.e., When was the invoice generated?),
Payment_Increase% (in $),
Company_Name,
Invoice Address""")

print(response)
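
When the same extraction runs over many documents, the query string can be assembled from a document reference and a list of entities. A minimal sketch: `build_ner_query` is a hypothetical helper, and the document ID `INV-001` is a made-up placeholder, not a value from my data.

```python
def build_ner_query(document_ref: str, entities: dict[str, str]) -> str:
    """Build an extraction query from a document reference and {entity: hint} pairs."""
    if not document_ref.strip():
        raise ValueError("a document reference is needed to target one document")

    lines = [f"From {document_ref.strip()}, extract the following entities:"]
    for name, hint in entities.items():
        # Add the clarifying hint in parentheses only when one was given.
        lines.append(f"{name} ({hint})," if hint else f"{name},")
    lines[-1] = lines[-1].rstrip(",")  # no trailing comma on the final line
    return "\n".join(lines)


query = build_ner_query(
    "invoice INV-001",  # hypothetical document ID
    {
        "Invoice_Date": "when was the invoice generated?",
        "Company_Name": "",
    },
)
print(query)
```

The resulting string can be passed to `query_engine.query(...)` in place of the hand-written one.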

And that's it. With the tuned prompt and query, the response was good: 7 out of 8 entities were extracted correctly. I expect the results to improve further with more prompt and query tuning, a different embedding model, or a different LLM.
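
To compare prompts, embedding models, or LLMs, it helps to turn "7 out of 8" into a number you can recompute after each change. The sketch below scores extracted entities against hand-labelled ones. `entity_accuracy` is a hypothetical helper, the sample values are made up, and it assumes the model's response has already been parsed into a dictionary.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(value.lower().split())


def entity_accuracy(predicted: dict[str, str], expected: dict[str, str]) -> float:
    """Fraction of expected entities whose predicted value matches after normalization."""
    if not expected:
        raise ValueError("expected must contain at least one entity")
    correct = sum(
        1
        for name, gold in expected.items()
        if normalize(predicted.get(name, "")) == normalize(gold)
    )
    return correct / len(expected)


# Made-up labels and predictions, for illustration only.
expected = {
    "Invoice_Date": "2024-01-15",
    "Company_Name": "Acme Corp",
    "Invoice Address": "1 Main Street",
    "Payment_Increase%": "5",
}
predicted = {
    "Invoice_Date": "2024-01-15",
    "Company_Name": "acme  corp",      # matches after normalization
    "Invoice Address": "1 Main Street",
    "Payment_Increase%": "50",         # wrong value
}
print(entity_accuracy(predicted, expected))
# → 0.75
```

By the same measure, 7 correct entities out of 8 is an accuracy of 0.875.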
