Taxonomy, Ontology and Knowledge Graph: How Do They Come Together?
Almost two years ago, when I started my journey with Generative AI, it was obvious to me that Generative AI based solutions would only be as good as the underlying representation of the knowledge they have access to. This realization was based on my understanding of the transformer architecture, which I consider an incremental innovation over what we had in the form of RNNs. Transformer architectures made it possible to extract the relationships between words across longer sentences. What gets encoded is the association between words, not the meaning of the content in relation to the domain of the knowledge corpus. This is where language models and humans differ in the way they acquire knowledge. We humans do not just speak the language; we have also developed the skill to associate language with concepts and to identify the object that a concept refers to. This relationship between symbol, concept and object is called the semiotic triangle, and it helps humans convey a shared meaning of their world model. This meaning is formed through an implicit and explicit internal knowledge model that we have developed. Language models do not inherently have that knowledge model, so we need to equip them with it. And this is where we can use Taxonomy, Ontology and Knowledge Graph to develop that knowledge model.
In this blog I am going to talk about each of these three topics. But before getting into them, let me introduce you to knowledge organization systems (KOS). Taxonomies and ontologies are both types of KOS.
A knowledge organization system can be any system that helps to classify, categorize, organize and manage knowledge. A subset of KOS is "controlled vocabularies". A controlled vocabulary is a list of terms that has been enumerated explicitly. All terms in a controlled vocabulary must be unambiguous and non-redundant.
A taxonomy is the controlled vocabulary of a knowledge corpus organized in a hierarchical structure. Its vocabulary is based on unambiguous concepts rather than just words. For example, Human is a concept which is part of the broader concept of Mammal. The concepts are arranged in a structure of hierarchies, categories and facets to organize them for better search and retrieval. Taxonomies, done properly, can evolve into ontologies. The main difference between the two is that a taxonomy captures only the hierarchical (broader and narrower) relationships between concepts, while an ontology also captures the semantic relationships between them. From a taxonomy, I can create a thesaurus that captures the associative relationships between the concepts.
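To make the hierarchy idea concrete, here is a minimal sketch of a taxonomy as a mapping from each concept to its broader concept. Only the Human/Mammal pair comes from the example above; the other concepts (`Dog`, `Animal`) and the names `BROADER`, `broader_chain` and `narrower` are hypothetical, added purely for illustration.

```python
# Toy taxonomy: each concept points to its broader concept.
# "Human" -> "Mammal" is from the text; the other entries are hypothetical.
BROADER = {
    "Human": "Mammal",
    "Dog": "Mammal",
    "Mammal": "Animal",
}


def broader_chain(concept):
    """Return all broader concepts of `concept`, nearest first."""
    chain = []
    seen = {concept}
    while concept in BROADER:
        concept = BROADER[concept]
        if concept in seen:
            # A taxonomy is a hierarchy, so a cycle means the data is broken.
            raise ValueError(f"cycle detected at {concept!r}")
        seen.add(concept)
        chain.append(concept)
    return chain


def narrower(concept):
    """Return the direct narrower concepts of `concept`, sorted by name."""
    return sorted(c for c, parent in BROADER.items() if parent == concept)


if __name__ == "__main__":
    print(broader_chain("Human"))
    print(narrower("Mammal"))
```

Note that the only relationship this structure can express is broader/narrower. Anything richer, such as "a Human owns a Dog", has no place in it, and that is the gap an ontology fills.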
An ontology, on the other hand, is not just about knowledge organization. It is about knowledge representation, and it is purpose based. I create an ontology for a use case or a particular purpose for which I need to represent the knowledge explicitly, so that a machine can also understand that representation. The ontology can overlay and connect multiple taxonomies to add semantics to the knowledge representation.
Knowledge Graph:
A knowledge graph persists the taxonomy and ontology together with the actual data and the relationships defined by that taxonomy or ontology. Language models can refer to the graph (through retrieval mechanisms) to gain more explicit knowledge about a domain and then respond more contextually to the input provided to them. This also enables them to relate to associated contexts and provide even richer insights.
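As a rough illustration of this idea, the sketch below stores taxonomy links, an ontology relationship and instance data as subject-predicate-object triples in one graph, and retrieves the facts around an entity so they could be passed to a language model as context. Everything here is an assumption made for illustration: the triples, the predicate names (`broader`, `hasTrait`, `instanceOf`) and the helper functions. A real system would typically use a graph database and a proper retrieval layer rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("Human", "broader", "Mammal"),          # taxonomy link
    ("Mammal", "hasTrait", "WarmBlooded"),   # ontology relationship (assumed)
    ("Alice", "instanceOf", "Human"),        # actual data (assumed)
]


def build_index(triples):
    """Index triples by subject for quick outgoing-edge lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def retrieve_context(index, entity, max_hops=2):
    """Collect facts reachable from `entity` within `max_hops` edges.

    Breadth-first traversal; each fact is rendered as a plain sentence
    that could be added to a prompt.
    """
    facts = []
    visited = {entity}
    frontier = [entity]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for predicate, obj in index.get(node, []):
                facts.append(f"{node} {predicate} {obj}")
                if obj not in visited:
                    visited.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts


if __name__ == "__main__":
    index = build_index(TRIPLES)
    for fact in retrieve_context(index, "Alice", max_hops=3):
        print(fact)
```

The `max_hops` parameter is the design choice to watch: a larger value pulls in more associated context, at the cost of a longer and noisier prompt.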
How can they come together to power the language agents with knowledge?
I see two patterns, depending on the use case. They are described below.
2. Insights-delivering knowledge agents: This pattern not only answers with the facts, but can also bring in additional insights from related concepts by leveraging the semantic relationships between multiple taxonomies. For example, if I ask which plants I should buy during summer, it should also be able to tell me what plant care I should consider for those types of plants. The solution design would then look like the one below.
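As a rough code-level sketch of this pattern (not the full solution design), the plant example can be modeled as two small taxonomies, one for plants and one for plant care, joined by an ontology-style relationship. The plant names, care topics and the names `SUITABLE_SEASON`, `REQUIRES_CARE`, `plants_for` and `insights_for` are all hypothetical placeholders, not real horticultural data.

```python
# Plant taxonomy entries annotated with a season (illustrative values only).
SUITABLE_SEASON = {
    "Hibiscus": {"Summer"},
    "Marigold": {"Summer"},
    "Pansy": {"Winter"},
}

# Ontology-style link from a plant concept to concepts in a separate
# plant-care taxonomy (illustrative values only).
REQUIRES_CARE = {
    "Hibiscus": ["Regular watering"],
    "Marigold": ["Full sun", "Deadheading"],
}


def plants_for(season):
    """Fact lookup: which plants suit the given season?"""
    return sorted(p for p, seasons in SUITABLE_SEASON.items() if season in seasons)


def insights_for(season):
    """Insight lookup: the plants plus the related care concepts.

    Follows the plant -> care relationship so that the answer covers
    more than the literal question.
    """
    return {plant: REQUIRES_CARE.get(plant, []) for plant in plants_for(season)}


if __name__ == "__main__":
    for plant, care in insights_for("Summer").items():
        print(plant, "->", ", ".join(care) or "no care notes")
```

`plants_for` on its own answers only the literal question, while `insights_for` adds the related care concepts by crossing from one taxonomy to the other.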
Taxonomies, ontologies and knowledge graphs are easier said than done. They require domain expertise to model both the knowledge classification and the knowledge representation. They also require a good understanding of what type of response we expect from the knowledge representation. But once we put in the initial effort to create the knowledge representation and then manage it continuously, it has the potential to greatly increase the value we derive from language agents. And when we mix these two patterns, we will come very close to Cognitive Agents.