Inferencing with SHACL using sh:values

Holger Knublauch

Lead Software Developer at TopQuadrant

Published: November 14, 2023

SHACL is best known as a language for representing constraints on the shape of RDF graphs. But the W3C Working Group also produced a companion document called SHACL Advanced Features (SHACL-AF), which is now maintained by the SHACL Community Group. This part of the language includes features to represent inference rules that can be used to derive new statements from existing (asserted) statements.

In my experience these inferencing features are extremely useful for real-world application scenarios. We and our customers have had SHACL inference rules in production for many years and their design is stable. Let me explain how they work and why they are important.

Example: Counting Taxonomy Concepts

As a toy example, let's look at a SKOS Concept Scheme. The scheme instance is linked to 11 top concepts via its property skos:hasTopConcept. As shown below, there is an inferred property that computes this count automatically and on the fly so that it can be displayed to the user. Note that no "top concept count" triple is stored in the graph:

An example Concept Scheme with an inferred property counting the number of top concepts

The example Turtle code below shows how this works. It declares a SHACL property shape for the class skos:ConceptScheme which states that the values of the property ex:topConceptCount shall be computed by counting the values of the property skos:hasTopConcept:

skos:ConceptScheme
    a owl:Class, sh:NodeShape ;
    sh:property skos:ConceptScheme-topConceptCount ;
    # ... (further statements elided in the original)
    .

skos:ConceptScheme-topConceptCount
    a sh:PropertyShape ;
    sh:path ex:topConceptCount ;
    sh:datatype xsd:integer ;
    sh:description "The number of top concepts in this scheme." ;
    sh:maxCount 1 ;
    sh:name "top concept count" ;
    sh:values [
        sh:count [
            sh:path skos:hasTopConcept ;
        ] ;
    ] .

The property sh:values is part of the SHACL Advanced Features and instructs a capable SHACL processor on how the values of this property shall be computed. In this case a little structure of RDF blank nodes with properties such as sh:count and sh:path is used, but later we will show more complex examples.

If you struggle with the syntax, TopBraid includes a little wizard for common design patterns:

TopBraid's ontology editor has convenience features to create sh:values rules

General Syntax of sh:values Rules

The sh:values property links a Property Shape with a so-called Node Expression. SHACL defines a number of different Node Expression types. Most of them take a sequence of RDF nodes as input and produce another sequence of RDF nodes as output. This means that node expressions can be chained together. For example, the sh:count expression from the example above takes the sh:path expression as input and produces a sequence consisting of just a single integer node as output.
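To make the chaining idea concrete, here is a minimal sketch in plain Python. It is not how any SHACL engine is implemented. The in-memory triple set, the helper names `path` and `count`, and the sample data are all assumptions made for illustration. Each node expression is modeled as a function from a list of input nodes to a list of output nodes.

```python
# Toy graph as a set of (subject, predicate, object) triples.
# The data and all helper names are hypothetical, for illustration only.
TRIPLES = {
    ("ex:Scheme", "skos:hasTopConcept", "ex:A"),
    ("ex:Scheme", "skos:hasTopConcept", "ex:B"),
    ("ex:Scheme", "skos:hasTopConcept", "ex:C"),
}


def path(predicate):
    """Node expression: the values of `predicate` for the input nodes."""
    def evaluate(nodes):
        return sorted(o for (s, p, o) in TRIPLES if p == predicate and s in nodes)
    return evaluate


def count(inner):
    """Node expression: a single integer, the size of the output of `inner`."""
    def evaluate(nodes):
        return [len(inner(nodes))]
    return evaluate


# Mirrors: sh:values [ sh:count [ sh:path skos:hasTopConcept ] ]
top_concept_count = count(path("skos:hasTopConcept"))

print(top_concept_count(["ex:Scheme"]))  # [3]
```

The outer expression never looks at the graph itself. It only consumes whatever the inner expression produces, which is what makes the expressions composable.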

Check the SHACL-AF spec for a list of other Node Expression types. The most basic expression type is the constant, represented simply as an IRI or RDF literal. On the more complex end of the spectrum, Node Expressions may perform arbitrary SPARQL SELECT queries to infer new values. And platforms like TopBraid even include an option to use JavaScript as an inference language.

In contrast to OWL inferencing, which was intentionally limited to a formally tractable subset of logic, SHACL rules have almost unlimited expressiveness. What is the point of a nice theoretical foundation in description logic if you cannot even concatenate strings or perform basic maths with OWL?

Example: If-Then-Else Rules

In the example below, the sh:if node expression points at an sh:exists expression that tests whether the current focus node is the capital of some country. If true, it produces "blue", otherwise it produces "red".

g:City-fillColor
    a sh:PropertyShape ;
    sh:path tbgeo:fillColor ;
    sh:datatype xsd:string ;
    sh:name "fill color" ;
    sh:values [
        sh:if [
            sh:exists [
                sh:path [
                    sh:inversePath g:capital ;
                ] ;
            ] ;
        ] ;
        sh:then "blue" ;
        sh:else "red" ;
    ] .

Such complex node expressions can be visualized as diagrams, illustrating that data "flows" from left to right, with each Node Expression taking one or more input values and producing zero or more output values.

Illustration of the data flow between SHACL node expressions

Node Expressions can be linked together like Lego bricks. For example, the input to the sh:then above may be yet another complex node expression instead of the constant "blue".
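The if-then-else rule can be paraphrased in plain Python as follows. This is only a sketch of the data flow, assuming a tiny hand-written triple set. The helper names and the `g:city` predicate are hypothetical and do not come from the original model.

```python
# Made-up sample data; g:city is a hypothetical predicate used only here.
TRIPLES = {
    ("g:France", "g:capital", "g:Paris"),
    ("g:France", "g:city", "g:Lyon"),
}


def inverse_path(predicate):
    """Subjects that point at the focus node via `predicate`."""
    return lambda focus: [s for (s, p, o) in TRIPLES if p == predicate and o == focus]


def exists(inner):
    """A single boolean: true if `inner` produces at least one node."""
    return lambda focus: [len(inner(focus)) > 0]


def constant(value):
    """A constant expression always produces the same single node."""
    return lambda focus: [value]


def if_then_else(condition, then, otherwise):
    """Evaluate `then` if the condition yields true, else `otherwise`."""
    def evaluate(focus):
        branch = then if condition(focus) == [True] else otherwise
        return branch(focus)
    return evaluate


fill_color = if_then_else(
    exists(inverse_path("g:capital")),
    constant("blue"),
    constant("red"),
)

print(fill_color("g:Paris"))  # ['blue']
print(fill_color("g:Lyon"))   # ['red']
```

Because `then` and `otherwise` are themselves expressions, either branch could be swapped for another nested `if_then_else` or a path lookup without changing the outer function.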

Querying Inferred Values

Implementations have some flexibility in what to do with sh:values rules. In our product, TopBraid, sh:values rules are applied as part of all GraphQL queries.

GraphQL queries compute inferred property values on-the-fly

The example GraphQL query above returns JSON with all instances of the class City, and for each City it returns a label and the fillColor, which is computed at query time using the if-then-else rule further above. The GraphQL user does not even need to know that fillColor is a completely virtual field that has no RDF triples asserted in the graph.

This works nicely and efficiently from GraphQL because the GraphQL engine knows in advance that the fillColor field is backed by an inference rule. It knows this because the surrounding query context is the City class, and the SHACL for City includes the sh:values rule.

Likewise, our Active Data Shapes (ADS) JavaScript framework computes inferred values whenever they are needed. Again, this is possible because the JS engine has enough context from the surrounding object to determine in advance if a property is inferred or not. So there is little performance overhead.

TopBraid's JavaScript framework also computes sh:values rules automatically

Processing sh:values rules is more difficult for a SPARQL engine, where typically no such context exists and all you have are BGP (basic graph pattern) triples. Querying inferred values in the "inverse" direction is hard without materializing the triples first, and materialization is difficult if the data changes often. In TopBraid we have therefore elected not to compute the inferences at SPARQL query time, but have added a special "magic" property function to request them explicitly. This may, however, change in the future, and other SPARQL engines may compute such values on the fly too.

Note that sh:path expressions inside SHACL Node Expressions are designed to apply nested inferences on demand. So when an sh:values rule depends on an sh:path expression that is itself backed by another sh:values rule, the nested rule is computed when needed. This is similar to backward chaining.
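The on-demand behavior can be sketched in a few lines of Python. This is a deliberate simplification: real sh:values rules are attached to property shapes of a class, whereas this sketch keys rules by property only. The `RULES` registry, the `values` helper and the property `ex:hasTopConcepts` are hypothetical, and there is no protection against rules that depend on each other in a cycle.

```python
# Asserted triples (made-up data).
TRIPLES = {
    ("ex:Scheme", "skos:hasTopConcept", "ex:A"),
    ("ex:Scheme", "skos:hasTopConcept", "ex:B"),
}

# Hypothetical registry: property -> function(focus) -> list of inferred values.
RULES = {}


def values(focus, predicate):
    """Return inferred values if a rule exists for the property, else asserted ones."""
    rule = RULES.get(predicate)
    if rule is not None:
        return rule(focus)
    return sorted(o for (s, p, o) in TRIPLES if s == focus and p == predicate)


# First rule: count the asserted top concepts.
RULES["ex:topConceptCount"] = lambda f: [len(values(f, "skos:hasTopConcept"))]

# Second rule: depends on the first one, which is only computed when asked for.
RULES["ex:hasTopConcepts"] = lambda f: [c > 0 for c in values(f, "ex:topConceptCount")]

print(values("ex:Scheme", "ex:hasTopConcepts"))  # [True]
print(values("ex:Empty", "ex:hasTopConcepts"))   # [False]
```

Nothing is materialized here. Asking for `ex:hasTopConcepts` triggers the count rule, which in turn reads the asserted triples, so the evaluation runs backwards from the requested value to the facts it needs.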

Example: Complex Inferences and SPARQL

In this example, the SKOS taxonomy was enriched with an inferred property that computes the total number of narrower concepts (children and all of their descendants) of a given Concept:

A SKOS Concept with an inferred property counting the number of children as shown in TopBraid EDG

The sh:values rule for this property computes the sh:count of a complex SHACL sh:path expression that walks into the concept hierarchy using sh:oneOrMorePath of the inverse of skos:broader:

skos:Concept-totalNarrowerCount
    a sh:PropertyShape ;
    sh:path g:totalNarrowerCount ;
    sh:datatype xsd:integer ;
    sh:name "total narrower count" ;
    sh:values [
        sh:count [
            sh:path [
                sh:oneOrMorePath [
                    sh:inversePath skos:broader ;
                ] ;
            ] ;
        ] ;
    ] .

Here is the same example using a SPARQL SELECT Expression:

   sh:values [
       sh:prefixes <https://topbraid.org/skos.shapes> ;
       sh:select """
           SELECT (COUNT(?narrower) AS ?count)
           WHERE { 
               ?narrower skos:broader+ $this .
           } """ ;
   ] .
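Both variants walk the hierarchy downwards by following skos:broader in the inverse direction, one or more steps at a time, and count the distinct concepts reached. A minimal Python sketch of that traversal, assuming a small made-up hierarchy stored as a dictionary (the `BROADER` mapping and the function name are hypothetical):

```python
from collections import deque

# child -> list of parents, i.e. the asserted skos:broader triples (made-up data).
BROADER = {
    "ex:Mammal": ["ex:Animal"],
    "ex:Bird": ["ex:Animal"],
    "ex:Dog": ["ex:Mammal"],
    "ex:Cat": ["ex:Mammal"],
}


def total_narrower_count(concept):
    """Count distinct concepts reachable via one or more inverse skos:broader steps."""
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child, parents in BROADER.items():
            if current in parents and child not in seen:
                seen.add(child)
                queue.append(child)
    return len(seen)


print(total_narrower_count("ex:Animal"))  # 4
print(total_narrower_count("ex:Mammal"))  # 2
print(total_narrower_count("ex:Dog"))     # 0
```

The `seen` set keeps each concept from being counted twice when it is reachable through more than one parent, which matches counting the distinct solutions of the `skos:broader+` path.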

SPARQL can be regarded as the ultimate fallback that gives a lot of expressiveness for things that cannot be covered using other Node Expressions. Almost all SHACL Node Expression types have a direct translation into SPARQL, but there is not (yet) a concept of variables that could be used for joins.

Example: Filtering by Shapes

Here we define a property that only contains the preferred label(s) that have a German language tag:

skos:Concept-germanLabel
    a sh:PropertyShape ;
    sh:path ex:germanLabel ;
    sh:name "German label" ;
    sh:values [
        sh:nodes [
            sh:path skos:prefLabel ;
        ] ;
        sh:filterShape [
            sh:languageIn ( "de" ) ;
        ] ;
    ] .

The sh:values rule above uses sh:filterShape, which takes the values of the path skos:prefLabel as its input and only keeps those that conform to the given SHACL shape. Here, each preferred label is checked for a language tag that matches one of the languages listed in sh:languageIn. You may also use any other shape here, implementing complex filter conditions with SHACL features like sh:minCount, sh:node and sh:hasValue.
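The filtering step can be sketched in Python as follows. The labels are made-up data and the helper names are hypothetical. The matching function is a simplified version of SPARQL's langMatches, which sh:languageIn builds on: a tag matches if it equals the range or extends it with a subtag, so "de-AT" passes a filter for "de".

```python
# (text, language tag) pairs standing in for skos:prefLabel literals (made-up data).
# An empty tag stands for a literal without a language tag.
LABELS = [
    ("Hund", "de"),
    ("dog", "en"),
    ("chien", "fr"),
    ("Hunderl", "de-AT"),
    ("canis", ""),
]


def lang_matches(tag, lang_range):
    """Simplified langMatches: exact match or the range followed by a subtag."""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def filter_by_language(literals, allowed):
    """Keep only the literals whose language tag matches one of `allowed`."""
    return [
        (text, tag)
        for (text, tag) in literals
        if tag and any(lang_matches(tag, r) for r in allowed)
    ]


print(filter_by_language(LABELS, ["de"]))
# [('Hund', 'de'), ('Hunderl', 'de-AT')]
```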

Example: Rules using ADS JavaScript

Within TopBraid, rules may be backed by arbitrary ADS JavaScript snippets. Here is an example from the Software Knowledge Graph from my previous article. This property rule performs a regular expression search over documentation markup to extract links to components that are mentioned in the markup:

A sh:values rule backed by JavaScript code

We have several other examples where we use JavaScript to infer rdf:HTML literals to produce custom renderings of values on forms. And yes, such JavaScript rules can also make web service calls, for example to query an LLM when needed...

While these capabilities go way beyond the W3C specification at this stage, they illustrate our commitment to delivering practical solutions to real-world problems that are typically found in enterprise settings.

Where to Go From Here

If you want to play with sh:values rules, you can use the open source TopBraid SHACL API. Enterprise users can find comprehensive inference support in the TopBraid EDG product line. I cannot say which other vendors are supporting them at this time.

Now that SHACL has been a W3C standard since 2017 and is well established, it is quite possible that official efforts towards a next generation of SHACL will be relaunched. The next version of SHACL may be the result of another full-blown formal W3C Working Group process, but SHACL could also become a "Living Standard" where features are added incrementally once enough implementations exist. I would very much hope that the SHACL specifications get restructured and widened in scope to include SHACL Inferencing as a dedicated document. SHACL (Core) already defines the framework for defining shapes, targets and property definitions, so adding just one more property for inferencing is a natural and sensible extension. How do other vendors and users feel about this?

Jesse Lambert

Knowledge Graph Engineer / Solutions Engineer at TopQuadrant

1 year ago

Nice article, Holger. I always liked using the sh:values > sh:select pattern because I could query the full SPARQL query out of the graph anytime I wanted.

Thomas Francart

Web de Données · Knowledge Graphs · Ontologies · sparna.fr

1 year ago

Thanks Holger. Let's keep SHACL for what it is: shapes to specify the structure of a graph. While making inferences like this is undoubtedly useful, I think mixing it into the same vocabulary/standard as shapes is confusing for users.

Emiliano Reynares

Boosting quality of data in healthcare and life sciences.

1 year ago

Always glad to see solutions like this one, driven by real-world application scenarios instead of theoretical purism. Thanks for sharing it at such a level of detail!

Robin F.

1 year ago

Hi Holger, nice piece! About your last paragraph: what do you mean by "adding one more property"? Would it be a matter of putting the inferencing logic in some special property shapes, and then being able to run the same command to do both validation and inferencing? Curious how the problem of sequence would be solved in that case (do you infer first and then validate your data, or vice versa?)

Martin Voigt

free your time to create.

1 year ago

Thanks for the summary. We also make use of this nice feature.

