Custom Enterprise LLM/RAG with Real-Time Fine-Tuning

Read the full article here, with code, data (an anonymized corporate corpus from a Fortune 100 company), embeddings, other contextual backend tables, illustrations, and so on.

This article features an application of xLLM to extract information from a corporate corpus, using prompts referred to as "queries". The goal is to serve the business user (typically an employee of the company, or someone with authorized access) with condensed, relevant pieces of information in response to professional queries: links, examples, PDFs, tables, charts, definitions, and so on.

My custom sub-LLM, designed from scratch, does not rely on any Python library or API, and performs better than search tools available on the market in terms of speed and relevancy of results. It lets the user fine-tune parameters in real time, and can detect user intent to deliver appropriate output. The strong performance comes from the quality of the well-structured input sources, combined with smart crawling that retrieves the embedded knowledge graph and integrates it into the backend tables. Traditional tools rely mostly on tokens, embeddings, billions of parameters, and frontend tricks such as prompt engineering to fix backend issues.
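
To make the "fine-tune parameters in real time" idea concrete, here is a minimal, hypothetical sketch: a few explainable parameters govern retrieval, and changing them requires no retraining. The table contents, parameter names, and scoring rule below are invented for illustration and are not taken from xLLM.

# Hypothetical sketch: explainable, user-tunable retrieval parameters.
BACKEND = {  # toy backend table: multitoken -> {item: association score}
    "data governance": {"policy.pdf": 2.1, "glossary.html": 0.7},
    "access control": {"policy.pdf": 1.4, "iam-chart.png": 1.9},
}

def answer(query_terms, params):
    scores = {}
    for term in query_terms:
        for item, assoc in BACKEND.get(term, {}).items():
            if assoc >= params["min_score"]:      # user-tunable relevancy cutoff
                scores[item] = scores.get(item, 0.0) + assoc
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:params["max_results"]]         # user-tunable result count

print(answer(["data governance", "access control"],
             {"min_score": 1.0, "max_results": 3}))
# [('policy.pdf', 3.5), ('iam-chart.png', 1.9)]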

My approach, on the contrary, focuses on building a solid backend foundational architecture from the ground up. Tokens and embeddings are not the most important components, by a long shot. Cosine similarity and dot products are replaced by pointwise mutual information. There is no neural network, no training, and only a small number of explainable parameters, easy to fine-tune.
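
For readers unfamiliar with the measure: pointwise mutual information (PMI) quantifies how much more often a token and a context co-occur than independence would predict, PMI(x, y) = log( p(x, y) / (p(x) p(y)) ). A minimal sketch with invented toy counts:

import math

# Toy co-occurrence counts, assumed purely for illustration.
co_counts = {("data", "governance"): 40, ("data", "banana"): 1}
token_count = {"data": 100, "governance": 50, "banana": 30}
total = 1000  # total observed pairs

def pmi(token, context):
    p_xy = co_counts.get((token, context), 0) / total
    p_x, p_y = token_count[token] / total, token_count[context] / total
    return math.log(p_xy / (p_x * p_y)) if p_xy > 0 else float("-inf")

print(pmi("data", "governance"))  # positive: co-occur more than chance (log(0.04/0.005) ≈ 2.08)
print(pmi("data", "banana"))      # negative: co-occur less than chance (log(0.001/0.003) ≈ -1.10)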

When you think about it, the average human being has a vocabulary of about 30,000 words. Even if you add variations and other pieces of information (typos, plurals, grammatical tenses, product IDs, street names, and so on), you end up with a few million entries at most, not trillions. Indeed, in expensive multi-billion-parameter systems, most tokens and weights are just noise: the majority are rarely fetched to serve an answer. This noise is a source of hallucinations.

Read more and access the code and data here.

Steve Naples

Data Governance | Data Architecture | Data Landscaping

4 months

Vincent, can this work for a code model too? If so, how would you expect to seed the model? I'm curious and would like to do so.

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

Please teach me more strategies with these types of articles; I can make better systems with your help.

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

Part 2:

import math

def calculate_pmi(hash_table, token, context):
    # PMI of a token/context pair, computed from raw co-occurrence counts.
    total_occurrences = sum(sum(contexts.values()) for contexts in hash_table.table.values())
    token_occurrences = sum(hash_table.get_contexts(token).values())
    context_occurrences = sum(hash_table.get_contexts(context).values())
    co_occurrences = hash_table.get_contexts(token).get(context, 0)
    if co_occurrences == 0:
        return 0
    p_token = token_occurrences / total_occurrences
    p_context = context_occurrences / total_occurrences
    p_token_context = co_occurrences / total_occurrences
    return math.log(p_token_context / (p_token * p_context))

def calculate_all_pmis(hash_table):
    # PMI for every stored token/context pair.
    pmi_table = {}
    for token, contexts in hash_table.table.items():
        pmi_table[token] = {context: calculate_pmi(hash_table, token, context) for context in contexts}
    return pmi_table

def query(hash_table, token):
    # Return a token's contexts ranked by PMI, highest first.
    if token not in hash_table.table:
        return []
    contexts = hash_table.get_contexts(token)
    pmi_table = {context: calculate_pmi(hash_table, token, context) for context in contexts}
    sorted_contexts = sorted(pmi_table.items(), key=lambda item: item[1], reverse=True)
    return sorted_contexts

# Example usage
Continued...
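
The example usage was truncated in the original comment. Here is a minimal sketch of how the two parts could be driven together, assuming the Part 1 code below is loaded first; the sample texts are invented:

texts = [
    "Data governance defines policies for data access.",
    "Data governance and data architecture work together.",
]
hash_table = build_nested_hash_table(texts, n=2)  # Part 1: count co-occurrences
pmi_table = calculate_all_pmis(hash_table)        # Part 2: score every pair
print(query(hash_table, "data governance"))       # contexts ranked by PMI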

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

Part 1:

import re

class NestedHashTable:
    # Nested hash table: token -> {context: co-occurrence count}.
    def __init__(self):
        self.table = {}

    def update(self, token, context):
        if token not in self.table:
            self.table[token] = {}
        if context not in self.table[token]:
            self.table[token][context] = 0
        self.table[token][context] += 1

    def get_contexts(self, token):
        return self.table.get(token, {})

    def __repr__(self):
        return str(self.table)

def preprocess_text(text):
    # Basic text cleaning: lowercase, strip punctuation, split on whitespace.
    text = re.sub(r'\W+', ' ', text.lower())
    tokens = text.split()
    return tokens

def create_multitokens(tokens, n=2):
    # Sliding n-gram "multitokens" over the token list.
    multitokens = [' '.join(tokens[i:i+n]) for i in range(len(tokens)-n+1)]
    return multitokens

def build_nested_hash_table(texts, n=2):
    # Count each multitoken against its immediate neighbors (one on each side).
    hash_table = NestedHashTable()
    for text in texts:
        tokens = preprocess_text(text)
        multitokens = create_multitokens(tokens, n)
        for i, token in enumerate(multitokens):
            context = multitokens[max(0, i-1):i] + multitokens[i+1:min(len(multitokens), i+2)]
            for ctx in context:
                hash_table.update(token, ctx)
    return hash_table

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

This system is GREAT, Vincent: you bypass many of the complexities of a neural network. Here's my simplified version.
