Generative AI: RAG Application Using Llama 2, LlamaIndex, and Replicate for Named Entity Recognition (NER)
Retrieval Augmented Generation (RAG) pairs a language model with a retrieval step, so the model answers from passages pulled out of your own documents rather than from its training data alone. That makes it a good fit for tasks like Named Entity Recognition (NER) over documents such as invoices. This walkthrough uses Llama 2, an open Large Language Model (LLM) released by Meta AI, which was trained on a large text corpus and can understand and generate text across many domains.
LlamaIndex supplies the indexing and retrieval layer: it loads the documents, splits them into chunks, embeds them, and retrieves the chunks most relevant to a query so they can be passed to Llama 2. Replicate is a cloud API that hosts the model, so the 70B variant can be run without local GPUs.
Together, RAG, Llama 2, LlamaIndex, and Replicate give developers and researchers a practical toolkit for extracting entities from their own documents.
1) Install Packages
pip install llama_index pypdf sentence_transformers llama-index-llms-huggingface llama-index-llms-replicate llama-index-embeddings-langchain
pip install -q transformers bitsandbytes accelerate langchain
2) Import Dependencies
import os
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index.core import PromptTemplate, ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.langchain import LangchainEmbedding
from llama_index.llms.replicate import Replicate
3) Load All the Required Documents
All the documents to extract entities from are loaded with the SimpleDirectoryReader class from LlamaIndex, which reads every supported file in the given directory.
documents = SimpleDirectoryReader("YOUR_DIRECTORY_PATH").load_data()
4) Write the Prompt
LLMs usually don't perform well on NER tasks out of the box. But trust me, tweaking the prompt this way gave me good results. Adding domain-specific names, and other names that appear in the document, to the prompt should improve the output further.
system_prompt="""
You are a Named Entity Recognizer.
You need to extract the given entities from the data.
The extraction must be accurate"""
query_wrapper_prompt=PromptTemplate("<|USER|>{query_str}<|ASSISTANT|>")
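One way to make the "tweak the prompt with domain-specific names" tip repeatable is to build the system prompt from a few inputs. This is only a sketch: build_system_prompt, its parameters, and the sample names are hypothetical and not part of LlamaIndex.

```python
def build_system_prompt(domain, entity_names, known_names=()):
    """Compose an NER system prompt that names the domain and target entities."""
    lines = [
        "You are a Named Entity Recognizer for " + domain + " documents.",
        "Extract only these entities: " + ", ".join(entity_names) + ".",
        "The extraction must be accurate.",
    ]
    if known_names:
        # Names that appear in the documents give the model something to anchor on.
        lines.append("Names that may appear: " + ", ".join(known_names) + ".")
    return "\n".join(lines)


example_prompt = build_system_prompt(
    domain="invoice",
    entity_names=["Invoice_Date", "Company_Name"],
    known_names=["Acme Corp"],  # hypothetical name, for illustration only
)
print(example_prompt)
```

The resulting string can be used in place of the hand-written system prompt, which makes it easier to try different domain hints per document set.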
5) Load the Llama 2 70B Model Using Replicate
Here, I have used the Llama 2 70B chat model hosted on Replicate.
os.environ["REPLICATE_API_TOKEN"] = "YOUR_REPLICATE_API_KEY"
llama2_70b = "replicate/llama70b-v2-chat:2d19859030ff705a87c746f7e96eea03aefb71f166725aee39692f1476566d48"
llm = Replicate(
    model=llama2_70b,
    temperature=0.02,
    context_window=4096,
    system_prompt=system_prompt,
    # Generation settings for the hosted model go in additional_kwargs
    additional_kwargs={"max_new_tokens": 256},
)
# tokenizer_name, device_map, generate_kwargs and query_wrapper_prompt are HuggingFaceLLM arguments for locally loaded models, so they are not passed to Replicate
6) Load the Embedding Model
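embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
)
# Note: newer LlamaIndex releases deprecate ServiceContext in favor of llama_index.core.Settings
service_context = ServiceContext.from_defaults(
    chunk_size=1024,
    llm=llm,
    embed_model=embed_model,
)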
7) Load the Vectors to Vector Store
The vectors produced by the embedding model are loaded into VectorStoreIndex, which keeps them in a simple in-memory vector store by default. Any other vector store can be used here.
index=VectorStoreIndex.from_documents(documents,service_context=service_context)
query_engine = index.as_query_engine()
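For intuition, the retrieval step behind the query engine can be pictured as a nearest-neighbor search over vectors. The toy sketch below is not how VectorStoreIndex is implemented: it swaps the real embedding model for simple word counts, and the functions embed, cosine, and top_k as well as the sample chunks are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (a real model returns dense floats)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "invoice date 12 march 2024 issued by acme corp",  # made-up sample text
    "shipping terms and delivery schedule",
]
best = top_k("when was the invoice issued", chunks)
print(best)
```

In the real pipeline the retrieved chunks are then placed into the prompt that is sent to the LLM, which is why a good embedding model matters for extraction quality.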
8) Shoot the Query
This is the query I used to extract the entities from the document. Include some reference to the target document, such as a unique name or ID, so that the answer comes from that document alone. Otherwise the retrieved information may come from multiple documents.
response = query_engine.query("""From {SOME REFERENCE OF THE DOCUMENT}, extract the following entities
Invoice_Date (i.e., when was the invoice generated?),
Payment_Increase% (in $),
Company_Name,
Invoice Address""")
print(response)
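If you run the same extraction over many documents, it can help to assemble the query from a document reference and a list of entities instead of editing the string by hand. The helper below is a hypothetical sketch: build_query, the reference "invoice INV-001", and the entity hints are assumptions for illustration.

```python
def build_query(document_ref, entities):
    """Build an extraction query scoped to one document.

    entities maps each entity name to a short hint ("" for no hint).
    """
    parts = []
    for name, hint in entities.items():
        parts.append(f"{name} ({hint})" if hint else name)
    return (
        f"From {document_ref}, extract the following entities:\n"
        + ",\n".join(parts)
    )


query = build_query(
    "invoice INV-001",  # hypothetical document reference
    {
        "Invoice_Date": "i.e., when was the invoice generated?",
        "Company_Name": "",
        "Invoice_Address": "",
    },
)
print(query)
```

The resulting string can be passed to query_engine.query() in the same way as the hand-written query.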
And that's it. The response was good for me with the tweaked prompt and query: 7 out of 8 entities were accurate. I expect the results to improve further with more tuning of the prompt and query, a different embedding model, or a different LLM.
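To keep track of whether a prompt or model change actually helps, you can compare the extracted values against a small hand-labeled set. This is a minimal sketch with made-up values: score_extraction and both dictionaries are hypothetical, and it assumes you have already parsed the model's response into a name-to-value dictionary.

```python
def score_extraction(expected, extracted):
    """Count how many expected entities were extracted with the right value."""
    correct = [
        name
        for name, value in expected.items()
        # Compare case-insensitively and ignore surrounding whitespace.
        if extracted.get(name, "").strip().lower() == value.strip().lower()
    ]
    wrong = [name for name in expected if name not in correct]
    return len(correct), len(expected), wrong


# Made-up values, for illustration only.
expected = {"Invoice_Date": "2024-03-12", "Company_Name": "Acme Corp"}
extracted = {"Invoice_Date": "2024-03-12", "Company_Name": "ACME Inc"}

hits, total, wrong = score_extraction(expected, extracted)
print(f"{hits} of {total} correct; check: {wrong}")
```

Exact matching is strict, so for dates or addresses you may want to normalize the values before comparing.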