Zifo Semantic Search Service - technical details
Author: Ross Burton
Two weeks ago, we shared the exciting development of Zifo Semantic Search Service, a game-changing solution that offers AI-powered search of documents both within and outside your company's knowledge base. Today, we want to share how we developed this solution, with an overview of the technologies employed.
Our semantic search service is built around retrieval: documents are converted into “embeddings” using large language models (LLMs), stored in a specialised vector database, and retrieved by ranking them against a query using a cosine similarity score.
A general overview is given in the diagram above. The first step is to build a vector database for the domain of interest. In our demonstration, titles and abstracts were used to create this vector database: 200,000 obtained from the PubMed Open Access dataset and a further 75,000 from the EBI BioStudies database. An LLM converts these documents into vector embeddings, numeric representations of the text in a shared embedding space, so that documents can be compared based on their semantic similarity. The embeddings are stored in a specialised vector database, and vectors are compared using cosine similarity. When presented with a new document, query text, or a question (step 2), the LLM computes an embedding that can be compared to those stored within the vector database. Relevant documents are found by ranking the stored vectors by their cosine distance to the query.
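As an illustrative sketch of this ranking step, the idea looks something like the snippet below. It uses an open-source sentence-embedding model from HuggingFace as a stand-in; the model choice and example texts are placeholders rather than our demonstration's configuration.

```python
# Sketch: embed a handful of abstracts and rank them against a query by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # placeholder model choice

abstracts = [
    "CRISPR-Cas9 enables targeted genome editing in human cells.",
    "Deep learning models can predict protein structure from amino acid sequence.",
    "A randomised trial of statins for primary cardiovascular prevention.",
]
query = "machine learning for protein folding"

# Encode the documents and the query into the same embedding space.
doc_vectors = model.encode(abstracts, normalize_embeddings=True)
query_vector = model.encode(query, normalize_embeddings=True)

# With normalised vectors, cosine similarity is simply the dot product.
scores = doc_vectors @ query_vector

# Rank documents from most to least similar to the query.
for rank, idx in enumerate(np.argsort(-scores), start=1):
    print(f"{rank}. score={scores[idx]:.3f}  {abstracts[idx]}")
```

Because the vectors are normalised, cosine similarity reduces to a dot product, which is what makes this ranking cheap to compute at scale.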
To achieve what is described above, we utilised several key technologies that, in combination, delivered our technical demonstration. The diagram below provides an overview of our infrastructure:
We use Haystack by deepset to orchestrate document embedding and retrieval with LLMs. Open-source models are obtained from HuggingFace and stored within our infrastructure during deployment. The raw, unprocessed text of our documents is kept in a MongoDB database, and the vector embeddings in a Weaviate vector database. When performing document retrieval, Weaviate is first queried for semantically similar vector embeddings, and the corresponding raw text is then retrieved from MongoDB.
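To make this two-step lookup concrete, here is a simplified sketch written against Haystack 1.x-style APIs; the host, index, model, and field names are placeholders rather than our production configuration.

```python
# Sketch: retrieve semantically similar documents from Weaviate via Haystack,
# then fetch the corresponding raw text from MongoDB.
from haystack.document_stores import WeaviateDocumentStore
from haystack.nodes import EmbeddingRetriever
from haystack.pipelines import DocumentSearchPipeline
from pymongo import MongoClient

# Weaviate holds the vector embeddings (index name is a placeholder).
document_store = WeaviateDocumentStore(host="http://localhost", port=8080, index="Abstract")

# The retriever embeds the incoming query with an open-source model from HuggingFace.
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",  # placeholder model choice
)
pipeline = DocumentSearchPipeline(retriever)

# MongoDB holds the raw, unprocessed document text (database and collection names are placeholders).
mongo = MongoClient("mongodb://localhost:27017")
raw_texts = mongo["semantic_search"]["abstracts"]

result = pipeline.run(query="protein structure prediction", params={"Retriever": {"top_k": 5}})
for doc in result["documents"]:
    # Assume each embedded document carries the ID of its raw-text record in its metadata.
    record = raw_texts.find_one({"_id": doc.meta.get("source_id")})
    print(doc.score, record["title"] if record else "(raw text not found)")
```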
To serve queries in production, we use a FastAPI server that provides a REST API for interacting with Haystack and our databases. A Streamlit application then uses this REST API to serve the interface seen in our demonstration. The entire solution runs on Amazon Web Services (AWS), with each component deployed as a Docker container. The Streamlit application and the databases are hosted on Amazon Elastic Compute Cloud (EC2). The LLM layer, built on Haystack and exposed through FastAPI, is computationally expensive and is therefore deployed on Amazon Elastic Container Service (ECS) behind an auto-scaling load balancer.
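As a rough sketch of what such a REST layer can look like (the endpoint, request, and response shapes here are illustrative, not the actual API behind our demonstration):

```python
# Sketch: a FastAPI endpoint that exposes the retrieval pipeline as a REST API,
# which a Streamlit front end can call over HTTP.
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Semantic search API (sketch)")

class SearchRequest(BaseModel):
    query: str
    top_k: int = 5

class SearchHit(BaseModel):
    title: str
    score: float

@app.post("/search", response_model=List[SearchHit])
def search(request: SearchRequest) -> List[SearchHit]:
    # In the real service this handler would call the Haystack retrieval pipeline
    # and the MongoDB lookup described above; a canned result keeps the sketch self-contained.
    return [SearchHit(title=f"Placeholder result for: {request.query}", score=1.0)]
```

The Streamlit front end then only needs to issue HTTP requests against this endpoint and render the ranked results it gets back.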
We hope you have enjoyed this technical overview of our demonstration app. If you haven’t already, visit https://semanticboost.zifo-tech.com/ and try out our app. Stay tuned, because in the coming weeks we will be adding more models to our demo, including our own fine-tuned model for document retrieval. If you are interested in how Zifo’s Data Science team can support your use case or search challenges, please contact our team directly at [email protected]. We are here to help you solve your data integration and information search challenges.