Learning Relations between Knowledge Graph Relations

One key way that engineers build LLMs like ChatGPT is with a process called transductive learning. In this approach, we create a huge collection of documents – a "corpus" – and then split it. About 80% we use to train the system and build language models. The other 20% we set aside: we don't use it for training, only for testing. So we create a model based on the original corpus with a broad-coverage vocabulary and test with a different sample from that same vocabulary – ensuring that the test vocabulary is comparable to the training vocabulary.
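
To make the setup concrete, here is a minimal sketch of that 80/20 split, assuming a toy corpus and scikit-learn's train_test_split; the documents are hypothetical stand-ins for a real collection:

```python
# A minimal sketch of the 80/20 transductive split, using scikit-learn.
# The corpus here is a hypothetical stand-in for a huge document collection.
from sklearn.model_selection import train_test_split

corpus = [
    "the cat sat on the mat",
    "knowledge graphs encode relations between concepts",
    "large corpora drive large language models",
    "models learn patterns of vocabulary from text",
    "the mat sat under the cat",
]

# ~80% of documents for training, ~20% held out for testing only.
train_docs, test_docs = train_test_split(corpus, test_size=0.2, random_state=42)

# Both splits are samples of the same corpus, so they share vocabulary.
train_vocab = {tok for doc in train_docs for tok in doc.split()}
test_vocab = {tok for doc in test_docs for tok in doc.split()}
print("test vocabulary also seen in training:", test_vocab & train_vocab)
```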

But these models can't generalize or guess well on different or unseen vocabulary – in a sense, the models are too focused, too dependent on the specific sample of training documents. They work well on new instances of familiar vocabulary but falter on new terms. We try to sidestep this weakness by using truly ginormous training corpora.

This kind of model building, then, can be seen as a weak kind of transfer learning – it works because it finds patterns among familiar items and "transfers" those patterns to model different samples of the same corpus. In this case, the patterns that we can transfer are the vocabulary items shared across training and test samples, or across training and what users give us in production.

But what can we do to address the unknown vocabulary, the mismatches between the training and test corpora? How can we get the model to generalize or guess better when faced with new terms?

A different, more recent method – called inductive learning – addresses this problem directly, finding and transferring other patterns, not just shared vocabulary. In the clearest test case, there may be no shared vocabulary at all – we can (and should) double-check that none of the test items have appeared in training.

With this method, there are at least three other kinds of patterns (or invariant features) that we might work with when we model the original documents (each sketched in code after the list):

  • Sub-tokens, i.e., re-combinable pieces or substrings of target items, like suffixes, prefixes, byte pairs, etc.
  • Context tokens, i.e., re-combinable vocabulary around or in the context of the target item
  • Item features, i.e., re-combinable formal or semantic characteristics of the target item
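
Here is a hypothetical sketch of all three pattern types for a single target word. The specific feature choices (character trigrams as sub-tokens, a two-word context window, hand-coded item features) are illustrative assumptions, not a standard recipe:

```python
# A hypothetical sketch of the three pattern types for one target word.

def sub_tokens(word: str, n: int = 3) -> set:
    """Character n-grams: a simple stand-in for byte pairs or affixes."""
    padded = f"<{word}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def context_tokens(sentence: list, target: str, window: int = 2) -> set:
    """Re-combinable vocabulary around the target item."""
    i = sentence.index(target)
    return set(sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window])

# Item features: formal or semantic characteristics, hand-coded here.
item_features = {"ends_in_ness", "noun", "abstract_concept"}

sentence = "her unfailing kindness surprised everyone".split()
print(sub_tokens("kindness"))                # {'<ki', 'kin', 'ind', ...}
print(context_tokens(sentence, "kindness"))  # {'her', 'unfailing', 'surprised', 'everyone'}
print(item_features)
```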

In all of these cases, we learn patterns of sub-tokens, context tokens, and/or features instead of (or in addition to) patterns of vocabulary. When something new appears, like a new word, it may match any of these other patterns even if it doesn't exactly match a vocabulary item. Rather than rely on a brittle exact match to a full vocabulary string – like we would with a database, for example – we can rely on a robust, flexible fuzzy match to a collection of features – like we would with other machine learning techniques. In other words, we build a more robust model:

We make our internal representation of the original item more general and more flexible: instead of a specific string, we represent it as a collection of patterns.
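
A minimal sketch of that contrast, with a hypothetical two-word vocabulary: exact string lookup fails on an unseen word, while a fuzzy match over feature sets (Jaccard overlap here) still finds the closest known item:

```python
# A minimal sketch of brittle exact match vs. robust fuzzy match.
# The two "known" words and their feature sets are hypothetical.
known = {
    "kindness": {"<ki", "kin", "ind", "ness>", "noun"},
    "happily":  {"<ha", "hap", "ily>", "adverb"},
}

def jaccard(a: set, b: set) -> float:
    """Overlap between two feature sets, from 0.0 to 1.0."""
    return len(a & b) / len(a | b)

new_word = "boldness"
new_features = {"<bo", "bol", "old", "ness>", "noun"}

print(new_word in known)  # False: exact string lookup fails on unseen words

# A fuzzy feature match still finds the closest known item.
best = max(known, key=lambda w: jaccard(known[w], new_features))
print(best, jaccard(known[best], new_features))  # kindness 0.25
```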

Patterns with semantic item features are particularly important in the case of knowledge graphs: the graph neighborhood of each item in a graph is a collection of rich conceptual or semantic features – i.e., all the other items it is explicitly related to and the type of relation for each. This is a kind of conceptual unpacking – we document the key components and characteristics of each concept.
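
As a sketch with invented triples, unpacking each node into its neighborhood of (direction, relation, neighbor) features takes only a few lines:

```python
# A sketch of conceptual unpacking over invented triples: each node is
# represented by its graph neighborhood instead of by a bare string.
from collections import defaultdict

triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "is_a", "analgesic"),
    ("ibuprofen", "treats", "headache"),
    ("ibuprofen", "is_a", "analgesic"),
]

neighborhood = defaultdict(set)
for head, relation, tail in triples:
    neighborhood[head].add(("out", relation, tail))  # outgoing edges
    neighborhood[tail].add(("in", relation, head))   # incoming edges

# "aspirin" unpacks into its explicit relations and related items:
print(neighborhood["aspirin"])
# {('out', 'treats', 'headache'), ('out', 'is_a', 'analgesic')}
```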

Conceptual unpacking is what gives knowledge graphs their superpowers.

Finding and leveraging patterns of this kind will not only give us more robust and reliable models (through better inductive and transfer learning), but will also allow us to approximate other forms of reasoning much more directly – because with knowledge graphs we make models for concepts, not only models of strings.

This inductive learning approach is particularly important for matching and merging knowledge graphs. It is common to have a model of the vocabulary (the node and relation labels) for one knowledge graph but not for another. This is because today's knowledge graphs and ontologies are often built and curated manually by domain experts, so the depth of the experts' training, along with a lack of effective tooling, forces them to create many smaller graphs, each focused on a very specific domain. Moreover, different perspectives and different priorities lead to knowledge graphs with overlapping items but differing triples. But then how can we combine them? Inductive learning can help to merge and cross-validate different knowledge graphs, ontologies, or taxonomies, even when their nodes and relations have different labels.

One excellent example of inductive learning over knowledge graphs that have different vocabularies is the very recent paper by Mikhail Galkin et al.: Towards Foundation Models for Knowledge Graph Reasoning. In this paper, the authors focus on knowledge graph relations (not the nodes) to illustrate the usefulness of this approach. They model patterns in the context of target relations (which relations co-occur with them) – a graph of relations – to establish similarities between relations from different sources. Their features focus on context tokens: same-head relations (when the head of relation1 matches the head of relation2), same-tail relations, head-matches-tail relations, and tail-matches-head relations. And they do this without modeling the head or tail entities at all.
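
Based on that description, a simplified sketch of building such a graph of relations might look like the following. The triples are invented, and the paper's actual models (learned GNN representations over this graph) are far richer than this toy construction:

```python
# A simplified sketch of a "graph of relations": relations become nodes,
# connected by the four edge types described above whenever they share
# head or tail entities in the underlying triples.
from itertools import product

triples = [
    ("marie_curie", "born_in", "warsaw"),
    ("marie_curie", "worked_in", "paris"),
    ("warsaw", "capital_of", "poland"),
]

heads, tails = {}, {}
for h, r, t in triples:
    heads.setdefault(r, set()).add(h)
    tails.setdefault(r, set()).add(t)

relation_edges = set()
for r1, r2 in product(heads, repeat=2):
    if r1 == r2:
        continue
    if heads[r1] & heads[r2]:
        relation_edges.add((r1, "same_head", r2))
    if tails[r1] & tails[r2]:
        relation_edges.add((r1, "same_tail", r2))
    if heads[r1] & tails[r2]:
        relation_edges.add((r1, "head_matches_tail", r2))
    if tails[r1] & heads[r2]:
        relation_edges.add((r1, "tail_matches_head", r2))

for edge in sorted(relation_edges):
    print(edge)
# ('born_in', 'same_head', 'worked_in')
# ('born_in', 'tail_matches_head', 'capital_of') ...
```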

To be clear, the "knowledge graph reasoning" that they focus on is simply "reasoning" about the similarity of relations. Their results systematically document that leveraging the patterns in this relation similarity graph indeed enables much better transfer and generalization across differing relation vocabularies. In this way they can find which known relation is closest to a new, unknown relation.
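
To illustrate the idea (my simplification, not the paper's method): if we reduce each relation to a label-free structural signature over those four edge types, the signatures can be compared across graphs whose vocabularies share nothing:

```python
# My simplification, not the paper's method: reduce each relation to a
# label-free signature of edge-type counts in its own relation graph.
# Because the signature mentions no entity or relation labels, it can be
# compared across graphs with completely disjoint vocabularies.
from collections import Counter

def signature(rel, relation_edges):
    """Counts of relation-graph edge types around one relation."""
    return Counter(etype for r1, etype, r2 in relation_edges if r1 == rel)

# Relation graphs from two sources with no shared relation labels:
edges_a = {("born_in", "same_head", "worked_in"),
           ("born_in", "tail_matches_head", "capital_of"),
           ("worked_in", "same_head", "born_in")}
edges_b = {("birthplace", "same_head", "employer"),
           ("birthplace", "tail_matches_head", "capital"),
           ("employer", "same_head", "birthplace")}

def distance(s1, s2):
    return sum(abs(s1[k] - s2[k]) for k in set(s1) | set(s2))

new = signature("birthplace", edges_b)
best = min(["born_in", "worked_in"],
           key=lambda r: distance(signature(r, edges_a), new))
print(best)  # born_in: its structural signature matches birthplace exactly
```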

This work provides plenty of food for thought.

It reminds us that how you train and test is very important. Even though it's widely used, the transductive approach is really too weak for robust applications and should not be used in industry. In production data, there are too many surprises.

These authors look at patterns of unanalyzed context tokens, and even this has a positive impact. So we can expect that richer, more informative semantic item features (like knowledge graph neighborhoods of these tokens) will allow us to extend this work to more complex forms of knowledge graph reasoning.

This interesting study is focused on what seems to be a very small, very technical problem. But it suggests a solution for a much broader one. Shared standards are notoriously difficult to develop, to agree on, and to comply with – just ask anyone who's been on a standards committee. The many ill-fated proposals for a standard vocabulary of knowledge graph relations (see one review in Helbig, 2006) are just one more example of this well-known difficulty. To the extent that we can relate differing knowledge graph relations reliably and automatically with an approach like the one described here, agreeing on a set of must-match-exactly relations is no longer necessary. Instead of exact-match, systematic standardization for knowledge graph relations, maybe we can work with fuzzy matches and avoid the difficulties of standardization altogether.


Mark Spivey

Helping us all "Figure It Out" (Explore, Describe, Explain), many Differentiations + Integrations at any time.

1y

“Meaning is usage, structure emerges from usage” … and “usage” is individual and independent. Many people mistakenly reify semantics, assuming and expecting it to be formal, expressed, relatable, etc. You can’t point to “semantics”: people are only ever pointing to usages of signs and symbols by specific people at specific times for specific reasons.
Paul Sumares

Senior Software Engineer

1y

Mike, if I recall correctly, you were until recently Open To Work ... did some smart company snap you up?