FOIS 2024: Where AI succeeds with help from knowledge graphs and fails without
#fois #ontology #knowledgegraph #ai #ml #neurosymbolicai
Combining knowledge graphs and machine learning delivers results where simple AI fails. This is one of the most fascinating topics that emerged from the 2024 edition of FOIS, the conference on formal ontology in information systems, held at the University of Twente. The conference was a great success, and I am happy and proud to have been involved. It was no surprise that neurosymbolic AI, more specifically AI that combines LLMs and knowledge graphs, turned out to be a recurring topic. It was the focus of a one-day workshop organized and moderated by John Beverley. In addition, quite a few talks, keynotes among them, presented fascinating insights and results. In this blog post I share some of the most inspiring stories on this topic that I had a chance to attend, summarizing ideas and highlighting real-world results.
Compared with my previous blog post on the topic, which reported the news from SEMANTiCS 2023, it is striking that where in 2023 people mainly talked about possibilities and experimental results, we now see the ideas getting more concrete and real-world results starting to come in. We have also gained more insight into the underlying drivers that make knowledge graphs and AI such a golden combination. These are exciting times.
A good place to start is the workshop keynote on LLMs and ontologies by Barry Smith, one of the creators of the highly successful BFO ontology that is now widely embraced in US government organizations, including the military. Barry gave us a preview of the upcoming revised edition of the book he co-authored, entitled Why machines will never rule the world. It will come out in February 2025 and will offer much new material, including chapters about LLMs and quantum computing: a source of inspiration and an absolute must-read for anyone with an interest in AI and ontologies.
His point is, essentially, that AIs apply statistical models in a deterministic way. They are simple systems, not complex systems, at whatever scale you implement them. Humans are complex systems and can, therefore, act non-deterministically. The list of shortcomings in AI is endless; it will never mimic, or even come close to, human intelligence. This line of reasoning assumes that “true” intelligence only arises from complex systems. The current state of affairs is that there is no broad consensus on this, and so the debate continues.
Empiricism, rationalism, or both
In a plenary keynote, philosopher (and biochemist) Mieke Boon compared the continuing debate between proponents of machine learning and the symbolic reasoning crowd to the age-old debate between empiricists and rationalists. Since the days of Plato, the central question has been: how do we know things? Rationalists, like Descartes, think that knowledge can be deduced from first principles, using the rules of logic and math, and starting from the slogan “I think, therefore I am.” Empiricists, like Francis Bacon, retort that the only way to know things is by perceiving them through the senses, generalizing our perceptions as we go along.
This debate, like the AI-related one, never stops, because both sides of the argument start from an incomplete view of human cognition. This was when many people in the audience, myself included, held their breath. Immanuel Kant, the originator of modern ontology as a foundation for cognition and on whose shoulders we all stand, argued at the end of the 18th century that things must align themselves with human cognition before we can even perceive them as things. We start, as babies, with a limited set of ground concepts and the basic capability of deciding which concept goes with which percept. As perceptions accumulate, we create new concepts based on them, so that we learn to perceive and know new kinds of things. As the slogan goes, concepts without percepts are empty, and percepts without concepts are blind.
Ontologies and machine learning
Based on all of this, Mieke concluded that the only way forward in AI is to bring the human perspective back into the mix. This was the starting point for professor Frank van Harmelen's keynote titled “Ontologies for Machine Learning.” This is not about “artificial general intelligence” and creating a human mind: we just want to create useful machines. The idea is that ontologies and knowledge graphs in combination with machine learning will help us do that. Frank's overall message was, essentially, that machine learning runs into serious problems; that these problems can be solved using knowledge representation; and that this realization receives growing support.
Already quite a few years ago, Frank, together with his colleague Annette ten Teije, published their now rather well-known “boxology” (a taxonomy expressed in a notation of boxes and arrows) of ways to combine knowledge graphs and machine learning [1]. For instance, you can use an ontology to preprocess training data. Or you can use an ontology to validate results returned by the machine, to filter out nonsense. Quite interesting are approaches where the machine learning algorithm uses the knowledge graph to learn intermediate abstractions, thus enhancing its capability for generalization.
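To give a feel for what such a pattern looks like in practice, here is a minimal sketch, entirely my own illustration and not code from the paper, of the “ontology validates ML output” pattern: predictions that contradict a disjointness axiom in the ontology are filtered out before they reach the user.

```python
# Minimal sketch (my own illustration, not from the boxology paper) of the
# "ontology validates ML output" pattern: predictions that contradict a
# disjointness axiom in the ontology are rejected.
from rdflib import Graph, Namespace
from rdflib.namespace import OWL

EX = Namespace("http://example.org/onto#")   # hypothetical namespace

ontology = Graph()
ontology.add((EX.Person, OWL.disjointWith, EX.Organization))

def consistent(predicted_types, onto):
    """Return False if any two predicted types are declared disjoint in the ontology."""
    for a in predicted_types:
        for b in predicted_types:
            if (a, OWL.disjointWith, b) in onto or (b, OWL.disjointWith, a) in onto:
                return False
    return True

# Suppose a (made-up) entity classifier tags the same record as both Person and
# Organization; the ontology tells us to discard that prediction.
predicted = {EX.Person, EX.Organization}
print(consistent(predicted, ontology))   # False: the prediction is filtered out
```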
There are a number of basic combination patterns, and these patterns can themselves be combined with each other. A recent survey by a Swiss research group turned up some really cool examples of how such approaches deliver amazing results. Frank also had some of his own. Let me recount just one example. In a research project, an AI was trained to find early signs of colon cancer in general practitioners' patient files. This failed completely. Next, they enriched the AI architecture with an ontology so that, for instance, medicines of different brands with different names could be grouped by their active ingredient, helping the machine make generalizations. Now, the machine was able to pick up signals and make valid predictions. That is a pretty awesome result. I will come back to the use of ontology-assisted generalization to lift signals out of the noise below.
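As an illustration of that kind of ontology-assisted generalization, here is a small sketch of my own (with made-up brand and ingredient names, not the project's actual data or code): brand-level features are rolled up to the active ingredient the ontology maps them to, so the learner sees one informative feature instead of many sparse ones.

```python
# Sketch of ontology-assisted generalization (illustrative only; drug names are
# made up): prescriptions recorded under different brand names are rolled up to
# the active ingredient defined in a small ontology before training.
from rdflib import Graph, Namespace

MED = Namespace("http://example.org/med#")   # hypothetical namespace

onto = Graph()
onto.parse(data="""
    @prefix med: <http://example.org/med#> .
    med:BrandA med:hasActiveIngredient med:mesalazine .
    med:BrandB med:hasActiveIngredient med:mesalazine .
    med:BrandC med:hasActiveIngredient med:omeprazole .
""", format="turtle")

def roll_up(prescribed_brands):
    """Replace each brand name by its active ingredient, if the ontology knows it."""
    features = []
    for brand in prescribed_brands:
        ingredient = onto.value(subject=MED[brand], predicate=MED.hasActiveIngredient)
        features.append(str(ingredient) if ingredient else brand)
    return features

# Two patients on different brands now contribute the same feature value.
print(roll_up(["BrandA"]))   # ['http://example.org/med#mesalazine']
print(roll_up(["BrandB"]))   # ['http://example.org/med#mesalazine']
```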
A quite inspiring and innovative set-up was presented by professor Barend Mons, one of the visionaries behind the FAIR Data movement. Here, the knowledge graph does the heavy lifting, and an LLM is used to write scientific papers about the results. This will be the topic of a separate blog post coming out soon.
Providing context and promoting generalization
At this point, I would like to briefly discuss a landmark paper by Juan Sequeda and others [2]. It was not presented at FOIS, but it is quite recent and eminently relevant here. They took a large, moderately complex relational database and manually converted the data and the model into an RDF knowledge graph (technically, a virtual layer on top of the relational database). Next, they used generative AI (GPT-4) to convert natural language questions (“give me all insurance claims with such-and-such properties”) into SQL queries for the database and SPARQL queries for the knowledge graph. The results unequivocally show that the knowledge graph outperforms the relational database by a wide margin.
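In rough outline, the set-up looks like the sketch below. This is a heavily simplified rendition of my own; the schema strings, prompt and model call are placeholders, not the benchmark's actual code.

```python
# Rough sketch of the benchmark's set-up (my own simplification; schema strings,
# prompt and model name are placeholders): the same natural-language question is
# translated once against the relational schema and once against the ontology,
# and the resulting queries are run against the respective back-ends.
from openai import OpenAI

client = OpenAI()   # assumes OPENAI_API_KEY is set in the environment

relational_schema = "CREATE TABLE claim (id INT, filed DATE, payout DECIMAL);"   # toy schema
owl_ontology = "ex:Claim a owl:Class . ex:payout a owl:DatatypeProperty ."       # toy ontology

def generate_query(question: str, schema: str, dialect: str) -> str:
    """Ask the LLM to translate a question into a query for the given schema."""
    prompt = (
        f"Given this {dialect} schema:\n{schema}\n\n"
        f"Write a {dialect} query that answers: {question}\n"
        "Return only the query."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

question = "List all insurance claims filed in 2023 with a payout above 10,000."
sql_query = generate_query(question, relational_schema, "SQL")
sparql_query = generate_query(question, owl_ontology, "SPARQL")
```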
Juan and colleagues explain this by saying that the knowledge graph expresses contextual information that is absent in the relational database. Let me elaborate on this. Consider the statement “Castor is 39 years old.” This gives us some information, but not much: something that has existed for 39 years can be anything.
Compare this to a statement like “Castor is our son,” or “Castor is our dog,” or “Castor is our boat.” Assigning something to the class Person provides a wealth of information: its form, behavior, mode of existence, lifespan, capabilities, needs, and so much more spring, as it were, to mind. This point is one of the mainstays of modern formal ontology, and was at the core of Giancarlo Guizzardi's 2005 dissertation in which he launched UFO and OntoUML, one of the most prominent ontological frameworks of today [3].
Knowledge is compressed information
The defining property of knowledge representation, and hence of RDF knowledge graphs, is that they make such (highly informative) statements explicit as part of the data. Conversely, a relational database typically states that Castor is 39 years old, but not that it is a Person. This latter fact can only be deduced indirectly, by noticing at the metalevel that its primary key is defined in the Person table. The same goes, more or less, for other data representations (as opposed to knowledge representations), such as CSV, XML, property graphs and so on. Making ontological statements directly available as such to the machine learning algorithm greatly enhances its performance. It boosts its capabilities for generalization.
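A toy illustration of how little it takes to make that context explicit in RDF (my own example, not anyone's production data): the class membership is an ordinary triple, sitting right next to the age fact.

```python
# Toy illustration: in RDF, the fact that Castor is a Dog is an ordinary triple
# in the data itself, on a par with his age, so any algorithm consuming the
# graph sees it directly instead of having to infer it from table metadata.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/#")   # hypothetical namespace

g = Graph()
g.add((EX.Castor, EX.age, Literal(39, datatype=XSD.integer)))   # what a table row also tells you
g.add((EX.Castor, RDF.type, EX.Dog))                            # the context a table leaves implicit

for s, p, o in g.triples((EX.Castor, None, None)):
    print(s, p, o)
```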
Nele Köhler and Fabian Neuhaus presented a neat paper investigating the top-level ontology that GPT itself uses internally, by asking it questions like “What is the difference between a monkey and a hammer?” [4]. Underlying the behavior of GPT is a ridiculously complex statistical model, but nobody knows exactly what it is a model of. Research such as this gives us clues. With AI, we are still in the earliest stages. We try things, and sometimes it works and sometimes it doesn't. Most often, we have no idea why. Frank van Harmelen pointed out in his keynote that we suffer from a great lack of theoretical insight into these things.
Next, a great paper by Michael DeBellis and colleagues [5] describes a service that helps dentists stay abreast of the literature, especially in countries where this is more challenging than elsewhere, such as India and other lower-income countries. They combine a so-called RAG architecture with a knowledge graph of paper metadata, using Franz Inc.'s AllegroGraph platform, which combines an RDF graph database with a native vector store. Initial tests with real users show that this works extremely well. In contrast, using plain GPT to answer these questions does not lead to useful responses. Again, here is an example of how combining knowledge graphs and AI leads to great results, this time in a system that is scheduled to go into production soon.
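To give an impression of what such a set-up involves, here is a stripped-down sketch under my own assumptions: AllegroGraph is replaced by rdflib plus an in-memory list of embeddings, and OpenAI stands in for the embedding and generation models, so none of this is the authors' actual code.

```python
# Stripped-down graph-backed RAG sketch (my own assumptions throughout:
# AllegroGraph is replaced by rdflib plus an in-memory embedding list, and
# OpenAI stands in for the embedding and generation models).
import numpy as np
from openai import OpenAI
from rdflib import Graph

client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def retrieve(question: str, papers, metadata: Graph, k: int = 3):
    """papers: list of (uri, abstract, embedding) tuples built at indexing time."""
    q = embed(question)
    scored = sorted(
        papers,
        key=lambda p: float(np.dot(q, p[2]) / (np.linalg.norm(q) * np.linalg.norm(p[2]))),
        reverse=True,
    )[:k]
    snippets = []
    for uri, abstract, _ in scored:
        # Enrich each hit with its metadata (authors, journal, year) from the RDF graph.
        rows = metadata.query("SELECT ?p ?o WHERE { ?s ?p ?o }", initBindings={"s": uri})
        facts = "; ".join(f"{p} {o}" for p, o in rows)
        snippets.append(f"{abstract}\n[{facts}]")
    return snippets

def answer(question: str, papers, metadata: Graph) -> str:
    context = "\n\n".join(retrieve(question, papers, metadata))
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"Answer using only these sources:\n{context}\n\nQuestion: {question}"}],
    )
    return resp.choices[0].message.content
```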
At Taxonic, we not only follow these developments with much interest, we are also actively experimenting with RAG architectures and knowledge graphs. More on this soon.
Wrapping it up
The conclusion is that there is quite a lot of research being conducted on how to combine knowledge graphs and machine learning, and that real-world results are being achieved. This combination brings the human perspective into the mix, and promises to solve some of the hardest problems in today's AI: explainability, hallucinations, shallowness and bias.
In this blog post, I summarized some of the ideas about knowledge graphs and AI presented at FOIS; in a follow-up blog post, I'll cover some more, focusing on explainability. Of course, this topic was only one of many, far too many to sum up here. A big thank you is in order to the organizers, who worked tirelessly to ensure a smooth conference, with Giancarlo Guizzardi as general chair and Tiago Prince Sales as local organizer and primary point of contact. I am already looking forward to the next edition of this great conference!
References
[1] Frank van Harmelen and Annette ten Teije, “A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems,” Journal of Web Engineering, Vol. 18 (1–3), 2019, 97–124. URL: https://www.cs.vu.nl/~frankh/postscript/JWE2019.pdf
[2] Juan Sequeda, Dean T. Allemang and Bryon Jacob, “A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases,” Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), 2023. URL: https://www.semanticscholar.org/paper/A-Benchmark-to-Understand-the-Role-of-Knowledge-on-Sequeda-Allemang/b66c5d17424b37c46980d50bd2796c568e1e926f
[3] Giancarlo Guizzardi, Alessander Benevides, Claudenir Fonseca, Daniele Porello, João Almeida and Tiago Prince Sales, “UFO: Unified Foundational Ontology,” Applied Ontology 17 (1), 2022. DOI: 10.3233/AO-210256
[4] Nele Köhler and Fabian Neuhaus, “The Mercurial Top-Level Ontology of Large Language Models.” To appear. URL: https://www.utwente.nl/en/eemcs/fois2024/resources/papers/neuhaus-the-mercurial-top-level-ontology-of-large-language-models.pdf
[5] Michael DeBellis, Nivedita Dutta, Jacob Gino and Aadarsh Balaji, “Integrating Ontologies and Large Language Models to Implement Retrieval Augmented Generation (RAG),” Applied Ontology, 2024, IOS Press. URL: https://www.academia.edu/122164296/Integrating_Ontologies_and_Large_Language_Models_to_Implement_Retrieval_Augmented_Generation_RAG