Superpowers of Knowledge Graphs, part 3: Taming Unruly Language Models

Knowledge graphs have many superpowers above and beyond those of ontologies, taxonomies, and mere databases. They can't jump tall buildings in a single bound, but they do have something akin to x-ray vision and other powers. I've talked about a couple of these superpowers in previous posts: data integration across sources, and data aggregation across instances.

Another (very timely) superpower of knowledge graphs is their ability to tame unruly language models. Large language models (LLMs) are all the rage, and the enthusiasm is entirely justified. But LLMs show both boundless potential and a series of undesirable behaviors – many more than just hallucinations – rather like our own lovable, unruly teenage offspring:

  • They make things up to tell you what they think you want to hear.
  • They provide incomplete answers even when they have more information.
  • They don't seem to understand what appear to us to be obvious questions.
  • They're easily influenced by the media and hearsay.
  • They get most of their knowledge from the web.
  • There are a lot of things they still don't know.
  • They provide contradictory information when you ask things piece by piece.
  • They give you very different answers if you ask the same thing in different ways.
  • They give you different answers if you ask the same thing in different languages.
  • We really can't understand how they're thinking; they can't either.
  • It's very hard to change their beliefs and factual knowledge.
  • They clam up when you ask something they don't want to answer.
  • They lecture you when you do or ask something you shouldn't.
  • They're very expensive to create and to maintain.

We hear a lot about hallucinations from LLMs that blithely make things up (and the hallucinations can actually snowball), but all of these other undesirable behaviors have also been carefully documented and cause significant problems. LLMs and teens are not always like this, but often enough to make life frustrating and difficult. At least LLMs don't curse and slam doors or get arrested when you least expect it!

Takeaways for the hurried and harried

Your AI strategy needs more than large language models (LLMs).

  • LLMs like ChatGPT, PaLM, Llama, etc. display many undesirable behaviors, not just hallucinations.
  • LLMs need knowledge graphs to behave better. Knowledge graphs provide the right "adult supervision" to improve the behavior of LLM-based AI systems at every step of their development and use.
  • LLMs cannot replace knowledge graphs, but they can help us build them more effectively.

You need more than LLMs

I was inspired to write this post when I got the same question from two different groups on the same day: My LLMs are great, so I don't need knowledge graphs, right? One group had decided not to start work on knowledge graphs; the other was shutting down their long-standing (and very impactful) efforts to build knowledge graphs. I replied with a very strongly worded "Wrong!" to both. Let me explain.

The bad news is that now, or very soon, you or your company will have to deal with unfamiliar, hard-to-understand "assistants" who display all of the behaviors listed above on a day-to-day basis. But you wouldn't hand over all of your assets and responsibilities to an unsupervised teen, would you? So you shouldn't hand over responsibility for important tasks to an autonomous LLM-based chatbot either.

The good news is that a lot of research has been done, and more is underway, to address and mitigate these undesirable behaviors and the risks they create. The key research finding that I need to emphasize here is this:

Knowledge graphs provide the right "adult supervision" to improve the behavior of these LLM-based AI systems at every step of their development and use.

Let's review the steps of developing LLMs to see just why this is the case. I've included links to a smattering of papers for the technically minded so they can get a better idea of the details.

Note that all of the discussion here and in the linked papers focuses on how knowledge graphs help us build better LLMs. There is also a huge literature about the inverse relation: how LLMs help us build better knowledge graphs (a super important topic for a different day).

The self-reinforcing, virtuous cycle between knowledge graphs and LLMs will play a central role in how, how quickly, and how effectively we deploy and create value from intelligent systems in the coming years.

How knowledge graphs help LLMs

Here are the most common steps involved in building LLMs. Knowledge graphs help each one.

Gathering Inputs. Large language models are built from vast collections of written text: web pages, Wikipedia, Reddit – anything that happens to be lying around. It turns out that when we add knowledge graphs to the mix (the training corpus) – either in a simplified version of their native format or by automatically turning triples into sentences – then LLM performance and behavior improve significantly. Answers are measurably more accurate and there are fewer hallucinations. [See 1, 2, 3, 4, 5, as well as more hands-on descriptions in 1, 2, 3]
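To make this concrete, here is a minimal sketch in Python of what "turning triples into sentences" can look like. The triples, relation names, and templates are illustrative assumptions, not taken from any specific paper or dataset:

# Minimal sketch: verbalize knowledge-graph triples into plain-text
# sentences that can be appended to an LLM training corpus.

TEMPLATES = {
    "capital_of": "{subject} is the capital of {object}.",
    "founded_by": "{subject} was founded by {object}.",
}

def verbalize(subject, relation, obj):
    """Render one (subject, relation, object) triple as a sentence."""
    # Fall back to a generic pattern for relations without a template.
    template = TEMPLATES.get(relation, "{subject} {relation} {object}.")
    return template.format(subject=subject,
                           relation=relation.replace("_", " "),
                           object=obj)

triples = [
    ("Canberra", "capital_of", "Australia"),
    ("Wikipedia", "founded_by", "Jimmy Wales"),
]
print("\n".join(verbalize(*t) for t in triples))
# Canberra is the capital of Australia.
# Wikipedia was founded by Jimmy Wales.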

Model-building ("training"). The algorithms that digest these vast inputs are guided by self-generated feedback from what's called a loss function. After each processing cycle during training, the loss function spits out a measure of how close the model-in-progress is to the input data, i.e., how well the current version of the model can predict the sequences of strings in the input. Training a model is essentially adjusting the model so that it gets better and better at predicting the examples from the input. The loss function guides these adjustments, specifying how large they should be and in what direction (positive or negative) they should go – so it plays a crucial role in determining how good the model is.

It turns out to be very helpful to calculate at the same time how well the current model predicts the information in a knowledge graph as well as the sequences of the input. This two-part loss function (loss with respect to the input, loss with respect to the knowledge graph) yields more systematic guidance and better adjustments to the model, as in physics-informed neural networks. In the end, a model built this way provides not only more consistent and accurate answers but also fewer hallucinations. [See 1, 2, 3]
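As a rough schematic, a two-part loss might look like this in PyTorch-flavored Python. The TransE-style triple score and the 0.1 weighting are illustrative placeholders; the linked papers use a variety of knowledge-graph losses:

import torch
import torch.nn.functional as F

def combined_loss(lm_logits, target_tokens,
                  head_emb, relation_emb, tail_emb,
                  kg_weight=0.1):
    """Two-part loss: language modeling on the text corpus plus a
    knowledge-graph term on embedded triples."""
    # Part 1: standard next-token prediction loss over the text.
    lm_loss = F.cross_entropy(lm_logits.view(-1, lm_logits.size(-1)),
                              target_tokens.view(-1))
    # Part 2: TransE-style score; for a true triple (h, r, t), the
    # embeddings should satisfy h + r close to t.
    kg_loss = torch.norm(head_emb + relation_emb - tail_emb, dim=-1).mean()
    # The weighted sum guides adjustments from both sources at once.
    return lm_loss + kg_weight * kg_loss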

There are different ways to check how well the model can predict the input. One important way is to pick a specific position – a "masked" token (not just the next one) – and have the model predict what should go there, over many, many iterations. Researchers who use a knowledge graph to choose the most important positions to check find that this feedback again leads to models with more accurate answers. [See 1, 2]
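A toy version of knowledge-graph-guided masking, assuming (unrealistically) that every entity is a single token; the entity names here are made up for the example:

import random

def kg_guided_masks(tokens, kg_entities, mask_token="[MASK]", mask_prob=0.5):
    """Prefer masking tokens that name knowledge-graph entities, so the
    model is repeatedly tested on facts the graph says matter."""
    masked = list(tokens)
    for i, token in enumerate(tokens):
        if token in kg_entities and random.random() < mask_prob:
            masked[i] = mask_token
    return masked

tokens = "Marie Curie discovered radium in Paris".split()
entities = {"Curie", "radium", "Paris"}  # from the knowledge graph
print(kg_guided_masks(tokens, entities))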

Another method that researchers have explored is to train on the inputs and on the knowledge graph separately – a "two tower" approach – and then glue together the results. This is called knowledge fusion. This richer, two-part model again leads to more consistent and accurate answers and also to fewer hallucinations. [See 1, 2, 3, 4, 5]
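Schematically, the "glue" can be as simple as concatenating a vector from each tower and projecting. A toy PyTorch module, with made-up dimensions:

import torch
import torch.nn as nn

class KnowledgeFusion(nn.Module):
    """Toy two-tower fusion: combine a text-encoder vector and a
    KG-encoder vector for the same entity or context."""
    def __init__(self, text_dim=768, kg_dim=200, out_dim=768):
        super().__init__()
        self.proj = nn.Linear(text_dim + kg_dim, out_dim)

    def forward(self, text_vec, kg_vec):
        # Concatenate the two towers' outputs, then project.
        return torch.tanh(self.proj(torch.cat([text_vec, kg_vec], dim=-1)))

fusion = KnowledgeFusion()
text_vec = torch.randn(1, 768)  # from the language-model tower
kg_vec = torch.randn(1, 200)    # from the knowledge-graph tower
print(fusion(text_vec, kg_vec).shape)  # torch.Size([1, 768])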

So knowledge graphs provide the "adult supervision" needed to train significantly better foundational models like GPT, Llama, Claude, or PaLM.

Model analysis. Organizations that want to use these foundational models often need to assess how and how well a given model represents or "remembers" the information that it was fed during training – otherwise the model is just a black box and we're not sure how much to trust it or what it's good for. Researchers have used knowledge graphs to probe language models to understand how and how well they capture knowledge – which in turn guides how they improve and evaluate the models. [See 1, 2, 3, 4, 5]
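One common probing recipe (as in the LAMA line of work) turns triples into fill-in-the-blank questions and checks whether the model recovers the object. A sketch using the Hugging Face transformers library; the model choice and probes are illustrative:

from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Cloze probes derived from knowledge-graph triples, paired with the
# object the graph says is correct.
probes = [
    ("The capital of France is [MASK].", "paris"),
    ("Canberra is the capital of [MASK].", "australia"),
]

for prompt, expected in probes:
    top = fill(prompt)[0]["token_str"].strip().lower()
    status = "match" if top == expected else "MISMATCH"
    print(f"{prompt} -> model: {top}, KG: {expected} ({status})")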

Fine tuning. Work with broad-coverage foundational models shows that they can be "tuned" or adjusted to work better for specific tasks like question answering, search, recommendations, etc. or for specific domains (like your company's key information) – and knowledge graphs play a central role in this process. [See 1, 2, 3]
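For example, a knowledge graph can be turned into supervised fine-tuning data. A minimal sketch; the templates and prompt/completion format are illustrative assumptions:

import json

QUESTION_TEMPLATES = {
    "capital_of": "What is {subject} the capital of?",
    "founded_by": "Who founded {subject}?",
}

def triples_to_records(triples):
    """Turn KG triples into prompt/completion records for fine-tuning."""
    for subject, relation, obj in triples:
        template = QUESTION_TEMPLATES.get(relation)
        if template:  # skip relations we have no template for
            yield {"prompt": template.format(subject=subject),
                   "completion": obj}

for record in triples_to_records([("Canberra", "capital_of", "Australia"),
                                  ("Wikipedia", "founded_by", "Jimmy Wales")]):
    print(json.dumps(record))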

Prompt engineering. Once we have built or have access to a tuned LLM, we need to retrieve the information that we need from it. Researchers are doing a huge amount of work on understanding how prompts work with large language models so that we can create the best prompts for each use case. [See 1, 2, 3]

Organizations that focus on prompt engineering instead of building better LLMs, however, make the erroneous assumption that the knowledge in the LLM is complete, accurate, and relevant enough for their needs. The list of well-documented, undesirable behaviors above shows that this assumption is entirely false. So it's a fool's errand to take a team that was building high-quality structured data to improve LLMs and repurpose it to create more and better prompts for an LLM that isn't improving.

Inference. Given a model and a prompt, an AI performs inference to generate one or more responses. When this is done with the help of additional knowledge retrieved from a knowledge graph, once again the responses are more accurate and more reliable. [See the growing body of work on retrieval-augmented knowledge fusion, and research like 1, 2, 3]
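A toy sketch of that retrieval step, assuming the knowledge graph is just a list of triples in memory (a real system would query a graph store, e.g., via SPARQL):

def kg_context(entities, kg):
    """Retrieve triples that mention the question's entities and render
    them as context lines for the prompt."""
    return "\n".join(f"{s} {r.replace('_', ' ')} {o}."
                     for s, r, o in kg
                     if s in entities or o in entities)

kg = [("Canberra", "capital_of", "Australia"),
      ("Canberra", "instance_of", "city")]
question = "What is the capital of Australia?"
prompt = ("Answer using only the facts below.\n"
          f"Facts:\n{kg_context({'Australia'}, kg)}\n"
          f"Question: {question}\nAnswer:")
print(prompt)  # this augmented prompt is what gets sent to the LLM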

Implementing Guardrails. Several tools let developers add structure, type, and quality tests on the outputs of large language models by specifying a knowledge graph with the desired constraints in something like the Reliable AI Markup Language (RAIL). When a given LLM output fails such a guardrail test, the system takes corrective action, like querying the LLM again with an updated prompt.
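Schematically, a guardrail loop looks like the sketch below; call_llm and the allowed-values check are hypothetical stand-ins for whatever client and KG-derived constraints a real system uses:

def answer_with_guardrail(question, allowed_values, call_llm, max_tries=3):
    """Ask the LLM, validate the answer against a constraint derived
    from the knowledge graph, and re-prompt on failure."""
    prompt = question
    for _ in range(max_tries):
        answer = call_llm(prompt).strip()
        if answer in allowed_values:  # the guardrail test
            return answer
        # Corrective action: re-query with an updated prompt.
        prompt = (f"{question}\nYour previous answer '{answer}' is not "
                  f"valid. Valid answers: {sorted(allowed_values)}.")
    return None  # escalate after repeated failures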

At this point in the process, we have the system's output or answer to a given question.

Evaluation. Knowledge graphs are often used as baselines for evaluating the factual accuracy of LLM outputs, much like the probe-based analysis described above. So knowledge graphs help us check LLM behaviors before deployment – to help manage risk.
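A crude sketch of such a factual-accuracy check; exact string matching stands in for the entity linking a real evaluation would use:

def factual_accuracy(model_answers, kg_gold):
    """Score model outputs against gold facts from the knowledge graph."""
    hits = sum(1 for question, answer in model_answers.items()
               if answer.strip().lower() == kg_gold.get(question, "").lower())
    return hits / max(len(model_answers), 1)

kg_gold = {"capital of Australia": "Canberra"}      # from the KG
model_answers = {"capital of Australia": "Sydney"}  # from the LLM
print(f"factual accuracy: {factual_accuracy(model_answers, kg_gold):.0%}")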

Recap

In sum, it's very clear that knowledge graphs play an important role in every step of the development and deployment of LLMs. Many studies show that incorporating knowledge graphs in different ways significantly improves the accuracy and reliability of LLMs and mitigates their other undesirable behaviors.

LLMs need knowledge graphs to behave better, so focusing on one instead of the other will delay how quickly and how effectively you can build, deploy, and create value from these intelligent systems.

You need knowledge graphs to control your LLMs when prompts are not enough. You're already in trouble if you're only doing prompt engineering.

"The future of knowledge graphs is brighter than ever, thanks to a world with language models." (Denny Vrande?i?, Wikimedia, Knowledge Graph Conference Keynote, 2023)


Abe Mammen

Technologist, Entrepreneur, Investor, New Ideas

1 year

Well written article. I agree with your assertions and power of KGs, especially with regard to LLMs.

Simon Collery

Data architect working with enterprise data models, taxonomies, ontologies, metadata and governance.

1 year

Excellent articles, thank you Mike Dillinger, PhD! The third was so good I had to read it twice. One question, as a taxonomy and ontology person: do you think it is possible to extend ontologies to include instances? I've tried to do that, but only out of curiosity. But I don't think ontologies are inherently unsuited to this. Do you?

By the way - even the most ardent proponent of AI will not deny that teenage offspring are very expensive to raise and maintain :) !

Roy Roebuck

Holistic Management Analysis and Knowledge Representation (Ontology, Taxonomy, Knowledge Graph, Thesaurus/Translator) for Enterprise Architecture, Business Architecture, Zero Trust, Supply Chain, and ML/AI foundation.

1 year

I fully agree, especially with your comparison of LLMs to teenage humans (and humans with low and fragmented educations). I had suggested my own holistic, question-based, governed-taxonomy, upper-ontology-derived, generalized KG framework, methodology, and technology-specifications be used to add "fidelity" to any LLM. I described my KG approach in my comment to Part 1 on this topic. I also had briefly worked with a startup that said, with disdainful hubris, that they didn't need to use a KG because they would be using LLMs.
