LLMs have revolutionized AI. Do we still need knowledge models and taxonomies, and why?

I have heard this question more often in recent months than in all the years before, yet it is really just a restatement of what is probably the most fundamental question in AI:

How much human (or symbolic AI) does statistical AI need?

With every major leap that machine learning makes, this question is asked again. It was asked in the 1950s, semantic web pioneers faced it in the early 2000s, and, to venture a prediction, it will still be asked in the year 2100 (see: AI history). The answers will always be the same.

Here are the 10 main arguments for building state-of-the-art AI systems on a hybrid architecture that combines LLMs with semantic knowledge models such as taxonomies and ontologies (aka "Semantic RAG").

1. Organizing and Structuring Knowledge

Knowledge models and taxonomies provide a structured way of organizing knowledge. They arrange concepts, entities, and information in a logical structure, making it easier for both humans and machines to understand and navigate complex sets of data. While LLMs are excellent at processing and generating text based on patterns in data, taxonomies structure this data in a way that makes sense to users and can guide the LLM (Keyword: Prompt Engineering) in generating and retrieving information more accurately. For example, what would a faceted search look like if LLMs were to classify internal company documents on their own, without a shared set of categories?
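To make the faceted-search point concrete, here is a minimal sketch in Python. The mini-taxonomy, the document texts, and the function names are all hypothetical, and the simple label matching stands in for whatever tagging method a real system would use. The point is only that every document is assigned to the same controlled set of categories, so the facets stay consistent.

```python
import re
from collections import defaultdict

# Hypothetical mini-taxonomy: concept id -> preferred label, synonyms, parent concept.
TAXONOMY = {
    "finance": {"label": "Finance", "alt": [], "broader": None},
    "invoice": {"label": "Invoice", "alt": ["bill"], "broader": "finance"},
    "hr": {"label": "Human Resources", "alt": ["HR"], "broader": None},
    "contract": {"label": "Employment contract", "alt": ["work contract"], "broader": "hr"},
}


def tag_document(text):
    """Return ids of all concepts whose labels occur in the text, plus their ancestors."""
    lowered = text.lower()
    found = set()
    for concept_id, concept in TAXONOMY.items():
        labels = [concept["label"]] + concept["alt"]
        if any(re.search(r"\b" + re.escape(label.lower()) + r"\b", lowered) for label in labels):
            # Walk up the hierarchy so a hit on "Invoice" also counts for "Finance".
            node = concept_id
            while node is not None:
                found.add(node)
                node = TAXONOMY[node]["broader"]
    return found


def build_facets(docs):
    """Map each preferred label to the sorted list of document ids tagged with it."""
    facets = defaultdict(list)
    for doc_id, text in docs.items():
        for concept_id in tag_document(text):
            facets[TAXONOMY[concept_id]["label"]].append(doc_id)
    return {label: sorted(ids) for label, ids in facets.items()}
```

Because the facet labels come from the taxonomy rather than from free-form model output, two documents about the same thing cannot end up under differently worded categories.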

Someone who knows what they are talking about has a knowledge model at the core of their language model.

2. Enhancing Search and Knowledge Discovery

Taxonomies play a crucial role in enhancing search and discovery capabilities. By organizing information into categories and subcategories, taxonomies make it easier to extract relevant information, for both humans and machines. This is particularly useful in large databases or content management systems where agents and users need to filter through vast amounts of information. LLMs can leverage taxonomies to provide more accurate search results and content recommendations (see also: RAG).
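One common way a taxonomy improves retrieval is query expansion: a search for a broad concept also matches documents that only mention narrower concepts or synonyms. The following Python sketch illustrates the idea under simplifying assumptions. The `NARROWER` and `SYNONYMS` tables and the sample data are invented for illustration, and plain keyword matching stands in for the retrieval step of a RAG pipeline.

```python
import re

# Hypothetical taxonomy fragments: narrower concepts and synonyms per term.
NARROWER = {
    "renewable energy": ["solar power", "wind power"],
    "solar power": ["photovoltaics"],
}
SYNONYMS = {
    "photovoltaics": ["pv"],
}


def expand(term):
    """Return the term together with all narrower concepts and synonyms, transitively."""
    seen = set()
    stack = [term.lower()]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(NARROWER.get(current, []))
        stack.extend(SYNONYMS.get(current, []))
    return seen


def search(query, docs):
    """Return sorted ids of documents that mention any expanded term as a whole word."""
    terms = expand(query)
    hits = []
    for doc_id, text in docs.items():
        lowered = text.lower()
        if any(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in terms):
            hits.append(doc_id)
    return sorted(hits)
```

In this sketch a query for "renewable energy" finds documents about photovoltaics or wind power even though none of them contains the query string itself. The retrieved passages could then be handed to an LLM as context.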

3. Improving Data Interoperability

Taxonomies and ontologies work like a semantic layer over the various systems and their implicit semantics, thus facilitating the interoperability and comparability of data across different systems and platforms. By standardizing the categorization of information, taxonomies ensure that data can be easily shared, understood, and processed by different systems (see for example: EU Taxonomy). This is essential in today’s interconnected world where data needs to flow seamlessly between different applications and services. LLMs can benefit from this standardization, as it makes it easier for them to process and integrate information from diverse sources.
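The semantic-layer idea can be sketched as a mapping from each system's local vocabulary to shared concept identifiers. Everything in this Python example is hypothetical: the system names, the local type codes, and the `ex:` identifiers are placeholders rather than parts of any real taxonomy.

```python
from collections import defaultdict

# Hypothetical semantic layer: (system, local type code) -> shared concept identifier.
CONCEPT_MAP = {
    ("crm", "cust"): "ex:Customer",
    ("erp", "client"): "ex:Customer",
    ("erp", "vendor"): "ex:Supplier",
}


def harmonize(records):
    """Group record names by shared concept; unknown local codes land under 'unmapped'."""
    grouped = defaultdict(list)
    for record in records:
        key = (record["system"], record["type"])
        concept = CONCEPT_MAP.get(key, "unmapped")
        grouped[concept].append(record["name"])
    return dict(grouped)
```

Records that the CRM calls "cust" and the ERP calls "client" end up under the same concept, while anything without a mapping is surfaced for review instead of being guessed at.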

4. Enhancing Machine Understanding

While LLMs are adept at understanding and generating human-like text, taxonomies enhance machine understanding by providing clear definitions and relationships between concepts. This semantic understanding is crucial for tasks that require precision and context, such as content classification, knowledge extraction, and semantic search. Taxonomies can guide LLMs in interpreting the nuances of language and context more effectively, especially in critical and regulated industries.

5. Exploring the Limits of AI Creativity

For those intrigued by the creative potential of LLMs, knowledge models represent a means to guide and shape the creative outputs of AI. Whether it's generating art, writing, or music, taxonomies can help define the parameters within which AI operates, enabling enthusiasts to explore the limits of AI-driven creativity (aka "hallucination") within a structured framework that ensures coherence and relevance.

6. Supporting Specialized Domains

In specialized domains such as medicine, law, and ESG, the accuracy and specificity of information are paramount. Taxonomies in these fields are meticulously developed to reflect the latest research and consensus among experts. They provide a framework for organizing domain-specific knowledge that LLMs can leverage to ensure the accuracy and relevance of their outputs in these specialized areas.

7. Facilitating Learning and Adaptation

Semantic knowledge models are not static; they evolve as new knowledge emerges. This dynamic aspect of taxonomies and ontologies is crucial for the continuous learning and adaptation of AI systems, including LLMs. When knowledge models are updated with new concepts and relationships, the LLM-based systems that draw on them can stay current with advancements in various fields and adjust their outputs accordingly.

8. Bridging AI and Human Knowledge

LLM enthusiasts often marvel at the potential for AI to augment human knowledge and capabilities. Taxonomies serve as a bridge between the vast, unstructured data that LLMs can process and the organized, hierarchical structures of human knowledge. This bridging enables more effective collaboration between human intelligence and artificial intelligence, opening up new possibilities for learning, discovery, and innovation. People are excited by the potential for LLMs to transform specialized fields such as healthcare, law, and education. Taxonomies in these domains provide a critical foundation for LLMs to operate with the precision, accuracy, and depth of understanding required in these contexts, making AI a powerful tool for professionals and researchers.

9. Promoting Ethical and Responsible AI Use

A growing number of users are concerned with the ethical implications of AI technology. Taxonomies related to ethics, privacy, and data security are seen as vital tools for ensuring that LLMs are developed and used in ways that are responsible, transparent, and aligned with societal values (see also: EU AI Act). Knowledge models can also serve as vital elements of an explainable AI architecture (XAI).

10. LLMs Are Not a Knowledge Database

Put simply, LLMs are not a knowledge database. They generate text that may be factually incorrect. LLMs should therefore not be used where a human cannot verify their output (see also: Human-in-the-loop design principle). This makes them useless as databases for looking up information that we do not already know. There are also many use cases where incorrect information is too great a risk, such as in medicine, law, and finance.

However, LLMs can complement any IT system and be used to fill gaps, for example by providing a natural language interface to expert technology. They can serve as translators for query engines, and they can lower the barrier to tasks such as filling in forms or writing summaries. LLMs can save a lot of time, but they mainly function as a supporting technology in the role of an assistant.
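As a rough illustration of the "translator for query engines" role, the Python sketch below turns a natural-language question into a structured query and checks the result against a controlled vocabulary before it is used. The `stub_llm` function is a stand-in that returns a fixed answer, not a real model call, and the topic list and JSON shape are assumptions made for this example.

```python
import json

# Hypothetical controlled vocabulary taken from a taxonomy.
ALLOWED_TOPICS = {"Finance", "Human Resources"}


def stub_llm(question):
    """Stand-in for a real LLM call; always returns the same JSON string."""
    return '{"topic": "Finance", "year": 2023}'


def question_to_query(question, llm=stub_llm):
    """Ask the LLM for a structured query and reject anything outside the vocabulary."""
    parsed = json.loads(llm(question))
    topic = parsed.get("topic")
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"Topic not in taxonomy: {topic!r}")
    year = parsed.get("year")
    if not isinstance(year, int):
        raise ValueError(f"Year must be an integer, got: {year!r}")
    return {"topic": topic, "year": year}
```

The design choice here is that the LLM only proposes a query. The knowledge model decides whether the proposal is valid, and the actual answer comes from the underlying system rather than from generated text.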


Thanks to Artem Revenko, Johannes Trippl, Tomas Knap, William Sandri, Robert David and ChatGPT for helping me write this article!



Nice piece! This is an interesting question. There are many benefits of knowledge models and taxonomies, as you outline in your article. The most important feature of knowledge models is the ability to provide LLMs with context. Take a few examples: "I park my car on a driveway, but I drive on a parkway." Or "Your feet smell, and your nose runs, but we smell with our nose and run on our feet." I have seen other issues with Gen AI getting the context wrong, such as using "do" vs. "due." Hallucinations are a byproduct of LLMs not having enough context. So, yes, we need knowledge models (graphs) more than ever, especially if you want to keep Gen AI from hallucinating!

Heimo Hänninen

Business Information Expert

9 months ago

Thanks for the great article, a really good summary of the topic. People understandably have a desire to look for quick solutions, but the reliability of the answers can be a matter of life and death. A chatbot that recommends a favorite wine can hallucinate a little, but the instructions of a heart surgeon's operative assistant must be verifiable and based on verified information.

Jennifer Woodward-Greene

Artificial Intelligence & Precision Agriculture Enthusiast

9 months ago

Well put!

Aart Houweling

Driving Digital Transformation | Expert in AI for Impact | Strategic Growth Leader and Pioneer | Business Development Expert

9 months ago

Hi Andreas Blumauer, thanks again for such a clear exposé on this very relevant topic: the necessity of combining knowledge graphs and LLMs, and learning how to do so. Happy to continue our conversations and cooperation on this topic.
