LLM/RAG: Knowledge Graphs, Multi-Agents, Ultrafast Fine-tuning, No Latency

In this presentation, I explain all the components of a ground-breaking architecture with applications to local, enterprise LLMs and high-quality search targeted to advanced users and busy professionals. Some of the main features:

  • Multi-LLM with 2000 specialized, small sub-LLMs covering the whole of human knowledge.
  • LLM router as top layer that decides which sub-LLM to call, or lets the user choose (see the routing sketch after this list).
  • Smart crawling to recover taxonomies and knowledge graphs embedded in carefully selected, high-quality input sources. Augmented with synonym and abbreviation maps based on glossaries, indexes, and so on.
  • Ultrafast: no deep neural network; instead, parametric weights governed by a few explainable parameters, rather than billions of weights optimized via training.
  • Customizable relevancy scores attached to each item returned to the user (URLs, related concepts, tables, and so on), to help the user decide what to look for.
  • Self-tuning (global, or local to a sub-LLM) based on users' favorite hyperparameters and customized results. Local self-tuning is very fast, and a first step before global optimization.
  • Fast embedding search based on a probabilistic algorithm. Variable-length embeddings with contextual and multi-token keys. Better metrics than dot product or cosine distance (see the similarity sketch after this list).
  • Using the model evaluation metric as your loss function to achieve better relevancy, introducing the concept of an adaptive loss function.
  • Augmentation and refinement based on integrating user prompt elements into back-end tables.
  • Application to content clustering and predictive analytics based on text only, using nested hashes that leverage the sparsity of keyword association tables: no huge, sparse similarity matrix involved (see the nested-hash sketch after this list).
  • Model evaluation based on knowledge graph reconstruction (category assignments) and comparison against the native graph.
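
Below is a minimal sketch of the routing layer described in the list above: score each sub-LLM by keyword overlap with the query, unless the user explicitly picks one. The names (SubLLM, route_query) and the keyword-overlap scoring are hypothetical illustrations, not the actual implementation behind the presentation.

```python
class SubLLM:
    """Placeholder for one of the ~2000 specialized sub-LLMs."""
    def __init__(self, topic, keywords):
        self.topic = topic
        self.keywords = set(keywords)

    def answer(self, query):
        return f"[{self.topic}] response to: {query}"

def route_query(query, sub_llms, user_choice=None):
    """Return the user's chosen sub-LLM, or the one whose keyword set
    best overlaps the query tokens."""
    if user_choice is not None:
        return user_choice
    tokens = set(query.lower().split())
    return max(sub_llms, key=lambda llm: len(tokens & llm.keywords))

sub_llms = [
    SubLLM("statistics", ["regression", "variance", "sampling"]),
    SubLLM("nlp", ["embedding", "token", "corpus"]),
]
print(route_query("how do embedding tables work", sub_llms).topic)  # nlp
```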
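
The "better metrics" bullet does not spell out the metric used, so the following is only a stand-in: a weighted-overlap similarity on sparse, variable-length embeddings stored as {token: weight} dictionaries, normalized by the smaller embedding so that embeddings of different lengths compare fairly.

```python
def overlap_similarity(emb_a, emb_b):
    """Similarity in [0, 1] between two sparse embeddings of possibly
    different lengths; no dot product or cosine distance involved."""
    shared = set(emb_a) & set(emb_b)
    if not shared:
        return 0.0
    overlap = sum(min(emb_a[t], emb_b[t]) for t in shared)
    norm = min(sum(emb_a.values()), sum(emb_b.values()))
    return overlap / norm

# Multi-token keys are allowed, and the two embeddings differ in length.
e1 = {"knowledge graph": 0.9, "taxonomy": 0.4, "crawl": 0.2}
e2 = {"knowledge graph": 0.7, "ontology": 0.5}
print(overlap_similarity(e1, e2))  # 0.7 / 1.2 ≈ 0.58
```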
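
Finally, a short sketch of the nested-hash idea from the clustering bullet: keyword associations live in a dictionary of dictionaries, so only pairs that actually co-occur consume memory, and no dense similarity matrix is ever built. The helper name build_associations is hypothetical.

```python
from collections import defaultdict

def build_associations(documents):
    """Count keyword co-occurrences per document; only observed pairs
    are stored, exploiting the sparsity of the association table."""
    assoc = defaultdict(lambda: defaultdict(int))
    for doc in documents:
        words = set(doc.lower().split())
        for w in words:
            for v in words:
                if v != w:
                    assoc[w][v] += 1
    return assoc

docs = ["graph embeddings for search", "fast search with nested hashes"]
assoc = build_associations(docs)
print(dict(assoc["search"]))  # keywords co-occurring with "search"
```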

Download the free PowerPoint presentation from here, with links to full source code on GitHub, datasets, documentation, related books & articles, and free training on the topic.

Awesome information, and I'm still digesting it!

Are there any evaluation metrics reported for this system? I mean something used in the literature for LLM comparison, for example accuracy on MMLU?

Gia N.

Sr Data Scientist

4 months

Thank you, Vincent, for sharing this. A few questions: how would you maintain and update the 2000 sub-LLMs, and what would be the environmental impact when all of them are running?

Tariq Mohammed

Research Scientist - Research Engineer - Complexity Scientist - Inventor of Modular Formulas - AILOS 1.0

4 months

I like the step "covering the whole of human knowledge." Having a master LLM router to call sub-LLMs is a smart modular approach. Your intelligent design to recover taxonomies and knowledge graphs through selected input sources is good systems engineering. Are parametric weights governed by a few explainable parameters a new form of modular AI system? I like the customizable and fast-tuning features. Your use of a probabilistic algorithm, variable-length embeddings, and better metrics is commendable for its mathematical rigor. I like your adaptive loss functions, as more modular, adaptive systems are needed. Augmentation in back-end databases is a great approach to refinement. Using nested hashes in association tables is a great modular approach. The continual use of diverse graphs makes the system even more robust. This is a great, modern, modular, powerful, mathematically based system, Vincent. I love it!

Brendan Humphrey

Corporate Banking | SWIFT Bank Instrument Issuance | Blockchain, Ripple & AI Trade Advisor | Luxembourg Bank Trade Desk | AI & CyberSecurity | VIP & Asset ex-Special Forces Security | Ghetto Kids NGO | Paying it Forward

4 months

Imagine 2000 mini LLMs running around like specialized minions, each armed with their own little knowledge toolkit. They zip around the internet, not with neural networks, but with super-efficient parametric weights—kind of like choosing the right tool for the job instead of carrying the whole toolbox. And hey, they even let you pick which one to send on a quest for knowledge! It's like having a team of highly caffeinated researchers at your beck and call, ready to fetch exactly what you need faster than you can say "ultrafast search."
