Custom Enterprise LLM/RAG with Real-Time Fine-Tuning

Read the full article here, with code, data (an anonymized corporate corpus from a Fortune 100 company), embeddings, other contextual backend tables, illustrations, and so on.

This article features an application of xLLM to extract information from a corporate corpus, using prompts referred to as "queries". The goal is to serve the business user (typically an employee of the company, or someone with authorized access) with condensed, relevant pieces of information in response to professional queries: links, examples, PDFs, tables, charts, definitions, and so on.

My custom sub-LLM, designed from scratch, does not rely on any Python library or API, and performs better than search tools available on the market in terms of speed and relevancy of results. It lets the user fine-tune parameters in real time, and can detect user intent to deliver appropriate output. The strong performance comes from the quality of the well-structured input sources, combined with smart crawling that retrieves the embedded knowledge graph and integrates it into the backend tables. Traditional tools rely mostly on tokens, embeddings, billions of parameters, and frontend tricks such as prompt engineering to fix backend issues.
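
To make the "fine-tune parameters in real time" idea concrete, here is a minimal, hypothetical sketch: a few explainable parameters govern retrieval, and changing them requires no retraining. The table contents, parameter names, and scoring rule below are invented for illustration and are not taken from xLLM.

# Hypothetical sketch: explainable, user-tunable retrieval parameters.
BACKEND = {  # toy backend table: multitoken -> {item: association score}
    "data governance": {"policy.pdf": 2.1, "glossary.html": 0.7},
    "access control": {"policy.pdf": 1.4, "iam-chart.png": 1.9},
}

def answer(query_terms, params):
    scores = {}
    for term in query_terms:
        for item, assoc in BACKEND.get(term, {}).items():
            if assoc >= params["min_score"]:      # user-tunable relevancy cutoff
                scores[item] = scores.get(item, 0.0) + assoc
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:params["max_results"]]         # user-tunable result count

print(answer(["data governance", "access control"],
             {"min_score": 1.0, "max_results": 3}))
# [('policy.pdf', 3.5), ('iam-chart.png', 1.9)]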

My approach, on the contrary, focuses on building a solid backend foundational architecture from the ground up. Tokens and embeddings are not the most important components, by a long shot. Cosine similarity and dot products are replaced by pointwise mutual information. There is no neural network, no training, and only a small number of explainable parameters, easy to fine-tune.
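
For readers unfamiliar with the measure: pointwise mutual information (PMI) quantifies how much more often a token and a context co-occur than independence would predict, PMI(x, y) = log( p(x, y) / (p(x) p(y)) ). A minimal sketch with invented toy counts:

import math

# Toy co-occurrence counts, assumed purely for illustration.
co_counts = {("data", "governance"): 40, ("data", "banana"): 1}
token_count = {"data": 100, "governance": 50, "banana": 30}
total = 1000  # total observed pairs

def pmi(token, context):
    p_xy = co_counts.get((token, context), 0) / total
    p_x, p_y = token_count[token] / total, token_count[context] / total
    return math.log(p_xy / (p_x * p_y)) if p_xy > 0 else float("-inf")

print(pmi("data", "governance"))  # positive: co-occur more than chance (log(0.04/0.005) ≈ 2.08)
print(pmi("data", "banana"))      # negative: co-occur less than chance (log(0.001/0.003) ≈ -1.10)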

When you think about it, the average human being has a vocabulary of about 30,000 words. Even if you add variations and other pieces of information (typos, plurals, grammatical tenses, product IDs, street names, and so on), you end up with a few million entries at most, not trillions. Indeed, in expensive multi-billion-parameter systems, most tokens and weights are just noise: the majority are rarely fetched to serve an answer. This noise is a source of hallucinations.

Read more and access the code and data here.

Steve Naples

Data Governance | Data Architecture | Data Landscaping

4 months

Vincent, can this work for a code model too? If so, how would you expect to seed the model? I'm curious and would like to do so.

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

Please teach me more strategies with these types of articles; I can make better systems with your help.

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

Part 2:

import math

def calculate_pmi(hash_table, token, context):
    # PMI of a token/context pair, computed from raw co-occurrence counts.
    total_occurrences = sum(sum(contexts.values()) for contexts in hash_table.table.values())
    token_occurrences = sum(hash_table.get_contexts(token).values())
    context_occurrences = sum(hash_table.get_contexts(context).values())
    co_occurrences = hash_table.get_contexts(token).get(context, 0)
    if co_occurrences == 0:
        return 0
    p_token = token_occurrences / total_occurrences
    p_context = context_occurrences / total_occurrences
    p_token_context = co_occurrences / total_occurrences
    return math.log(p_token_context / (p_token * p_context))

def calculate_all_pmis(hash_table):
    # PMI for every stored token/context pair.
    pmi_table = {}
    for token, contexts in hash_table.table.items():
        pmi_table[token] = {context: calculate_pmi(hash_table, token, context) for context in contexts}
    return pmi_table

def query(hash_table, token):
    # Return a token's contexts ranked by PMI, highest first.
    if token not in hash_table.table:
        return []
    contexts = hash_table.get_contexts(token)
    pmi_table = {context: calculate_pmi(hash_table, token, context) for context in contexts}
    sorted_contexts = sorted(pmi_table.items(), key=lambda item: item[1], reverse=True)
    return sorted_contexts

# Example usage
Continued...
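
The example usage was truncated in the original comment. Here is a minimal sketch of how the two parts could be driven together, assuming the Part 1 code below is loaded first; the sample texts are invented:

texts = [
    "Data governance defines policies for data access.",
    "Data governance and data architecture work together.",
]
hash_table = build_nested_hash_table(texts, n=2)  # Part 1: count co-occurrences
pmi_table = calculate_all_pmis(hash_table)        # Part 2: score every pair
print(query(hash_table, "data governance"))       # contexts ranked by PMI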

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

Part 1:

import re

class NestedHashTable:
    # Nested hash table: token -> {context: co-occurrence count}.
    def __init__(self):
        self.table = {}

    def update(self, token, context):
        if token not in self.table:
            self.table[token] = {}
        if context not in self.table[token]:
            self.table[token][context] = 0
        self.table[token][context] += 1

    def get_contexts(self, token):
        return self.table.get(token, {})

    def __repr__(self):
        return str(self.table)

def preprocess_text(text):
    # Basic text cleaning: lowercase, strip punctuation, split on whitespace.
    text = re.sub(r'\W+', ' ', text.lower())
    tokens = text.split()
    return tokens

def create_multitokens(tokens, n=2):
    # Sliding n-gram "multitokens" over the token list.
    multitokens = [' '.join(tokens[i:i+n]) for i in range(len(tokens)-n+1)]
    return multitokens

def build_nested_hash_table(texts, n=2):
    # Count each multitoken against its immediate neighbors (one on each side).
    hash_table = NestedHashTable()
    for text in texts:
        tokens = preprocess_text(text)
        multitokens = create_multitokens(tokens, n)
        for i, token in enumerate(multitokens):
            context = multitokens[max(0, i-1):i] + multitokens[i+1:min(len(multitokens), i+2)]
            for ctx in context:
                hash_table.update(token, ctx)
    return hash_table

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas

4 months

This system is GREAT, Vincent: you bypass many of the complexities of a neural network. Here's my simplified version.
