Scoping Knowledge Graphs

Mike Dillinger, PhD

Published: September 22, 2023

Building knowledge graphs is supposedly a huge and terrifying project, like fighting dragons or sending humans to Mars. I hear or see it time and time again:

Knowledge graphs are too difficult, too time consuming, and too expensive to build.

Think for a second about the people who say this. Mostly software engineers. They're thinking that a knowledge graph has to be enormous – like one of the handful of widely known graphs such as Google's or Wikidata's with billions of triples. And they know that the foundation of a knowledge graph is coherent semantics – scary stuff that they were not trained to do. Of course they'll push back: this kind of project does not set engineers up for success. It's like asking librarians and philosophers to build bulletproof cybersecurity infrastructure.

The vast majority of companies will never build and never need a knowledge graph that covers everything from newts and nebulae to Nubians and nurses.

So "billions of triples" is just a scary bedtime story that engineers tell. Implicit message: Rely on Google or someone else to do it for us.

And the vast majority of software engineers will never do the semantic analysis needed to ensure the conceptual integrity and coherence that make knowledge graphs so valuable. So "it's too hard to do" is another scary bedtime story that engineers tell. Implicit message: Don't ask me to build stuff I don't know how to build.

But tech organizations are overflowing with [add your own adjective here] engineering managers who think (or expect) that engineers can do anything and everything. And there are precious few product managers who are familiar enough with knowledge graphs to frame knowledge graph building in terms that managers and engineers can grok and buy into.

End result: Companies don't implement crucial but unfamiliar tech, like knowledge graphs.

Here's a quick primer so you can get an initial idea of the scope and goals of a knowledge graph project.

Your knowledge graph captures your view of your business, not the universe.

A knowledge graph captures and stores a model of your specific business domain – not the entire universe! It's a domain model like those used in software architecture or more recent "digital twins". It includes the entities, features, and relations that are most important in your domain. You can add or omit them according to the way you see your business or domain – your vision is often part of your special sauce, so it's important to make it crystal clear.

I think of your knowledge graph as a kind of encyclopedia that describes your business in terms (knowledge graphs) that algorithms can actually understand, not just store and print. For example, you might be in HR Tech and summarize your business like this:

We match candidates to jobs using skills so we can optimize hiring.

In this case, you can build your knowledge graph around match, candidate worker, job, skill, and hiring. This makes sense: you can't make much progress as a business in this field without a clear idea of what a good match is, which skills best describe a good candidate and an open job position, what a skill is, and how all this will help your clients to hire. That's just 5 concepts to focus on for starters. Nailing down and aligning on actionable information about these concepts is more challenging and valuable than you might imagine.
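To make the five-concept starting point concrete, here is a minimal sketch of what such a graph could look like as plain (subject, predicate, object) triples. The predicate names (`has_skill`, `requires_skill`, and so on) and the helper functions are assumptions made up for illustration, not a recommended schema.

```python
# Hypothetical starter graph for the HR Tech example. Each fact is a
# (subject, predicate, object) triple; all predicate names are illustrative.
triples = [
    ("Candidate", "has_skill", "Skill"),
    ("Job", "requires_skill", "Skill"),
    ("Match", "links_candidate", "Candidate"),
    ("Match", "links_job", "Job"),
    ("Hiring", "results_from", "Match"),
]


def concepts(graph):
    """Return every concept that appears as a subject or an object."""
    return sorted({node for s, _, o in graph for node in (s, o)})


def facts_about(graph, concept):
    """Return the triples in which the given concept is the subject."""
    return [t for t in graph if t[0] == concept]


print(concepts(triples))
print(facts_about(triples, "Match"))
```

Even a list this small forces the useful questions: what exactly connects a match to a hire, and is a skill the same thing on the candidate side and the job side?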

Can you leverage open source knowledge graphs and ontologies? Sure. But remember that your knowledge graph is there to capture how you think about your business. An off-the-shelf resource will only rarely do that, so you have to be extra careful. On the other hand, comparing your domain model with a public knowledge graph will highlight how your thinking is different from the status quo.

A knowledge graph can be as big or small as you want it to be and still add value.

Of course you need to scope and prioritize your project. You want to start with the keystone concepts for your business and build a graph around those. Keystone concepts are the ones that your documents mention most frequently, the ones that have a huge impact on what your tech stack looks like, the ones that guide marketing and sales. They should show up in a one-sentence summary of your business, like the example above. I've written about knowledge graphs and skills as different keystone concepts, and others like Ideal Channel Partner, Next-Million-Customers Profile, Most-Valuable-Customer Profile, Sales this Quarter, etc. are equally important. Every organization has its keystone concepts and needs to align on them across all teams, and with clients as well.

As an example, at one point, manager (and its thousands (!) of variants) was one of the most common job titles on LinkedIn. Just structuring that one concept more systematically had a visible impact on search, analytics, and recommendations across the board.

How much detail do you need for each concept? Just key information, not every conceivable feature. A knowledge graph describes each concept as an array of features and relations or a collection of triples or facts. You need enough explicit features both to make sure that concepts are distinct (they'll often overlap, which is OK, but no dupes allowed) and to include what's most important to know – the features that are most important to your way of thinking or to your business processes. It's straightforward to add more detailed information (more concepts, features, or triples) incrementally as necessary.
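The "overlap is OK, but no dupes" rule can be checked mechanically once each concept is written down as a set of features. The sketch below assumes a hypothetical representation (a dict from concept name to a set of feature/value pairs) and flags concepts whose descriptions are identical; the concept names and features are invented for illustration.

```python
from itertools import combinations

# Hypothetical concepts, each described by a small set of (feature, value) pairs.
concept_features = {
    "Candidate": {("is_a", "person"), ("seeks", "job")},
    "Applicant": {("is_a", "person"), ("seeks", "job")},
    "Worker": {("is_a", "person"), ("holds", "job")},
    "Job": {("is_a", "position"), ("requires", "skill")},
}


def find_duplicates(concepts):
    """Return pairs of concept names whose feature sets are identical."""
    names = sorted(concepts)
    return [(a, b) for a, b in combinations(names, 2) if concepts[a] == concepts[b]]


def shared_features(concepts, a, b):
    """Return the features two concepts have in common (overlap is allowed)."""
    return concepts[a] & concepts[b]


print(find_duplicates(concept_features))
print(shared_features(concept_features, "Candidate", "Worker"))
```

A flagged pair means one of two things: the concepts really are the same and should be merged, or the feature that distinguishes them has not been made explicit yet.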

Along the way, you'll find that the same term (like manager) can cover two or more concepts (like people manager vs asset manager, which have very different skill sets) and you'll see lots of synonyms (like supervisor or coordinator). These are all important to distinguish: the key idea is to have only one meaning for each concept, along with a list of synonyms.
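One way to picture "one meaning per concept, plus a list of synonyms" is a lookup from surface terms to concept identifiers. This is a minimal sketch under assumed names: the identifiers, the `CONCEPTS` structure, and the synonym "portfolio manager" are hypothetical and only illustrate the idea.

```python
# Hypothetical concept records: one meaning per concept, plus the terms
# that people use to refer to it.
CONCEPTS = {
    "people_manager": {"terms": {"manager", "supervisor", "coordinator"}},
    "asset_manager": {"terms": {"manager", "portfolio manager"}},
}


def candidates_for(term):
    """Return the concepts that a surface term could refer to."""
    normalized = term.strip().lower()
    return sorted(
        concept_id
        for concept_id, record in CONCEPTS.items()
        if normalized in record["terms"]
    )


def is_ambiguous(term):
    """A term is ambiguous when it maps to more than one concept."""
    return len(candidates_for(term)) > 1


print(candidates_for("Manager"))
print(candidates_for("supervisor"))
```

Ambiguous terms are exactly the ones that need extra context, such as the skills listed alongside a job title, before they can be resolved to a single concept.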

How will a knowledge graph add value?

Alignment. By aligning across the business on what each keystone concept means and documenting that understanding in a knowledge graph, you improve communication across teams and increase shared understanding of goals. You can use it to understand how your clients think, too. Think of the knowledge graph as a translator.
Data integration. A knowledge graph helps you link existing data from different silos and double check that everyone's talking about the same things. It makes your existing data more valuable and more useful. This helps with better planning, better tracking, and better cost control. Think of the knowledge graph as a bridge between silos.
Knowledge aggregation. Knowledge graphs accumulate institutional knowledge and expert experience as they grow. This mitigates problems of lost wisdom when people leave the organization and helps to make onboarding new people more effective. Think of the knowledge graph as the bedrock of your operations.
Enabling AI. Knowledge graphs open the door to more reliable use of other AI technologies like machine learning for search and recommendations or large language models for user interfaces. Think of the knowledge graph as an enabler.

Who does what?

Don't ask engineers to build a knowledge graph. Instead, get the right people for the job. For the content of the graph – the coherent knowledge part – you need analytic linguists and people with experience building ontologies to create a knowledge graph worth having. The engineers will be busy enough building the infrastructure to store, serve, and leverage the knowledge graph. And you will need guidance, too, because knowledge graphs are still very new to everyone.

Get your data scientists involved early and often. As you develop your keystone concepts, involve your data scientists and analytics people. The best knowledge graphs are the ones that are supported by (and integrate) data, not separate from it.

The triples that describe your most important concepts in your knowledge graph should match or map to the SELECT clauses that the data scientists use to extract and analyze your data. Predicates in knowledge graph triples should map directly to queries or column labels in your databases, even if the wording is different. Mismatches between knowledge graph concepts and database entities should be welcomed as opportunities to align strategy with data. If a feature is important enough to appear in a query, include it in your knowledge graph.
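As a rough illustration of mapping predicates to column labels, the sketch below keeps an explicit table from predicate names to (table, column) pairs and uses it to build SELECT statements. The schema, table names, and predicate names are all assumptions invented for this example, and an in-memory SQLite database stands in for whatever data store is actually in use.

```python
import sqlite3

# Hypothetical mapping from knowledge graph predicates to (table, column).
PREDICATE_TO_COLUMN = {
    "has_skill": ("candidate_skills", "skill_name"),
    "requires_skill": ("job_requirements", "skill_name"),
}


def select_for(predicate):
    """Build the SELECT statement that retrieves values for a predicate."""
    table, column = PREDICATE_TO_COLUMN[predicate]
    return f"SELECT {column} FROM {table}"


def unmapped(predicates):
    """List predicates that have no database counterpart yet."""
    return [p for p in predicates if p not in PREDICATE_TO_COLUMN]


# Tiny in-memory database standing in for one existing data silo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidate_skills (candidate_id INTEGER, skill_name TEXT)")
conn.execute("INSERT INTO candidate_skills VALUES (1, 'Python')")

rows = conn.execute(select_for("has_skill")).fetchall()
print(rows)
print(unmapped(["has_skill", "results_from"]))
```

The output of `unmapped` is the interesting part: each predicate without a column, or column without a predicate, is one of the mismatches worth discussing with the data team.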

Concepts, categories, and queries are basically the same thing: collections of features. The best knowledge graphs have clear, explicit links between features in the graph and the granular data (in different silos) that you already have. This is how to realize the benefits of one of the key superpowers of knowledge graphs: data integration.

Go for it!

Building a knowledge graph is new and unfamiliar, yes. But it doesn't have to be huge or terrifying. With a bit of guidance and the right talent, you can make it happen.

Dmitry Ulanov

Senior analyst building a second brain

1 yr

Thank you Mike! Very interesting how it scales to an enterprise level. At the personal level, to my understanding, even perfectly linked pieces of knowledge deliver very little value. I can refer to the feedback from users of the Obsidian.md Personal Knowledge Management tool, which I also share, that the notes graph is the most impressive and surprisingly most useless feature of the tool. For manual research, it makes sense only for graphs that are 1-2 levels deep from a topic. The global graph is just a wonderful pic that looks great but is impossible to explore. While technologies are moving forward, it would be great to finally get enterprise-level analysis tools for personal-level knowledge lakes.

Putcha Narasimham

Founder Proprietor at Knowledge Enabler Systems

1 yr

What is the definition of "knowledge" used here, please? How is it distinguished from "intelligence", natural or artificial?

Putcha Narasimham

Founder Proprietor at Knowledge Enabler Systems

1 yr

The structure of Node Link Node corresponding to Subject Predicate Object, each having multiple attributes, is well known and sufficiently expressive. At times the Predicate which links Subject and Object may itself need a link to another Object, as in A killed B "with C (a dagger)". Is there any generic / standard way of linking the Predicate (kill) with its own Object or means or device C, without confusing it with Object B? I have for a long time tackled it by defining an attribute "using" or "byMeansOf" having a "value = dagger". Strictly, "using" or "byMeansOf" is NOT an attribute or inherent property of the Predicate Kill. It is actually a Predicate of Predicate P connecting with C, as Kill is a Predicate of A connecting with B. My present practice violates the principle that "an object cannot be an attribute of another object or here, predicate. It (the new object) should be linked with another Predicate to the first Predicate". I feel that using a second Predicate to link the first Predicate with its own associated Object is structurally meaningful and consistent. Let me know other means of elegantly modeling this common requirement. Another similar requirement is: P fell from Q on R.

Suzanne Jozefowicz

1 yr

Thank you Mike. This is where I struggle with vector databases... they don't seem to bring together the human & the machine element.

Suzanne Jozefowicz

1 yr

Very succinctly explained - and yes, #Incorvus agrees - that's why we say "No knowledge, no #AI"!
