Ontology Modeling with SHACL: SPARQL-based Constraints

Holger Knublauch

Lead Software Developer at TopQuadrant

Published: December 5, 2023

In this part of our SHACL tutorial we will show how to express complex conditions with the help of the RDF query language SPARQL. The article builds on the first part, Getting Started, which introduced a simple ontology to represent the state of a Chess game and its pieces.

In addition to declaring the structure of entities in a knowledge graph (e.g. chess pieces), SHACL can be used to define constraints such as "There must always be exactly two kings on a Chess board: a white king and a black king" from the previous article on Qualified Cardinality Constraints. In general, it is ideal if the purely declarative shapes of SHACL (Core) are sufficient to express such constraints. But there are scenarios where even more complex conditions need to be expressed that do not fit into shapes alone. For many of these use cases, SHACL-SPARQL is a good choice.

Use Case

The example constraint that we will develop here represents one of the implicit rules of Chess:

The two kings cannot be placed on adjacent squares

For those who are not familiar with Chess, this is because kings cannot move into a position where they are under attack, and kings can attack all 8 squares around them. So in the following position, the black king could never have been moved next to the white king without running into a check:

How to express this as a constraint?

There MAY be a way to model this with pure SHACL Core but I believe it would require a very complex ontology where all pieces would need explicit links to the pieces that they attack. In such cases it is better to use a richer constraint language than to artificially bloat the ontology. Therefore: SPARQL to the rescue!

Playing with SPARQL

As this is a surprisingly complex condition to express even with SPARQL, let's try to break the problem down. One algorithm would be to

1. Find the squares of the two Kings on the board
2. Convert the squares into (x, y) coordinates, e.g. "a2" would become (1, 2)
3. Check whether the x and y coordinates of the two squares each differ by at most 1
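Before turning to SPARQL, the three steps can be sketched in plain Python to make the logic concrete. This is only an illustration under assumptions: the board is modelled as a dictionary from square names to (color, piece) tuples, and the function names are hypothetical and not part of the ontology.

```python
# Sketch of the three steps in plain Python (not part of the SHACL model).
# Assumption: a board is a dict mapping square names to (color, piece) tuples.

def find_king_squares(board):
    """Step 1: return the squares of the white and the black king."""
    white = next(s for s, p in board.items() if p == ("white", "king"))
    black = next(s for s, p in board.items() if p == ("black", "king"))
    return white, black


def square_to_xy(square):
    """Step 2: convert a square such as 'a2' into 1-based (x, y) integers."""
    x = ord(square[0]) - ord("a") + 1
    y = int(square[1])
    return x, y


def kings_adjacent(ws, bs):
    """Step 3: True if both coordinates differ by at most 1."""
    wx, wy = square_to_xy(ws)
    bx, by = square_to_xy(bs)
    return abs(bx - wx) <= 1 and abs(by - wy) <= 1


board = {
    "e1": ("white", "king"),
    "e2": ("black", "king"),
    "a8": ("black", "rook"),
}
ws, bs = find_king_squares(board)
print(ws, bs, kings_adjacent(ws, bs))  # e1 e2 True
```

Python can use ord() for the letter-to-number conversion; standard SPARQL has no direct equivalent, which is why the queries need a workaround for step 2.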

Now let's write some SPARQL queries to collect the information we need. Here is the query for step 1 above:

# Query to find the squares of the two kings for $this Game
SELECT ?ws ?bs
WHERE {
    $this shess:piece ?whiteKing .
    ?whiteKing a shess:King .
    ?whiteKing shess:color shess:White .
    ?whiteKing shess:square ?ws .
    $this shess:piece ?blackKing .
    ?blackKing a shess:King .
    ?blackKing shess:color shess:Black .
    ?blackKing shess:square ?bs .
}

For step 2, given a square such as "a1" here is a query to produce (x, y) coordinates that are easier to process in maths.

# Query to convert Chess square string "a1" to x,y integers
SELECT ?x ?y
WHERE {
    BIND ("a1" AS ?s) .
    BIND (xsd:integer(SUBSTR(?s, 2, 1)) AS ?y) .
    BIND (SUBSTR(?s, 1, 1) AS ?l) .
    BIND (STRLEN(STRBEFORE("abcdefgh", ?l)) + 1 AS ?x) .
}

Note that this is a rather ugly query as, to my knowledge, SPARQL does not have an easy way to turn characters into their ASCII code, nor a built-in function to find the position of a character in a string. Thus the hack with STRLEN/STRBEFORE.
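To see why the STRLEN/STRBEFORE combination yields the column number, here is a small Python emulation. The helper names strbefore and letter_to_x are made up for this sketch; strbefore mimics the documented SPARQL behaviour of returning an empty string when the search string does not occur.

```python
def strbefore(text, sub):
    """Emulate SPARQL STRBEFORE for plain strings: the part of text before
    the first occurrence of sub, or "" if sub is empty or does not occur."""
    if sub == "":
        return ""
    head, sep, _ = text.partition(sub)
    return head if sep else ""


def letter_to_x(letter):
    """Same arithmetic as STRLEN(STRBEFORE("abcdefgh", ?l)) + 1."""
    return len(strbefore("abcdefgh", letter)) + 1


print([letter_to_x(c) for c in "abcdefgh"])  # [1, 2, 3, 4, 5, 6, 7, 8]
print(letter_to_x("z"))  # 1, the same result as for "a"
```

The last line shows a side effect of the hack: a letter outside a..h is silently mapped to column 1, so the approach relies on the square values being valid in the first place.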

With these problems solved, we can now write a single SPARQL query that delivers all squares ?ws and ?bs where ?ws contains a white king and ?bs the black king, when they are both on adjacent squares:

Running test SPARQL queries (screenshot of TopBraid EDG)

Once we are happy with the query and have tested it with some examples and counter-examples, we can turn it into a proper SHACL constraint.

SPARQL-based Constraints

The official SHACL specification consists of the SHACL Core features (which are now implemented by basically all RDF triple stores) and the SHACL-SPARQL features. One of the SPARQL-related features is SPARQL-based Constraints.

Below is an example declaration of the constraint that the two kings must not be on adjacent squares. We have attached it to the class/shape shess:Game using the property sh:sparql:

shess:Game
    ...
    sh:sparql shess:KingsCannotAttackEachOtherConstraint .

shess:KingsCannotAttackEachOtherConstraint
    a sh:SPARQLConstraint ;
    rdfs:label "Kings cannot attack each other constraint" ;
    sh:message "King on {?ws} cannot attack King on {?bs}" ;
    sh:select """

    PREFIX shess: <https://example.org/shess#>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    SELECT $this ?ws ?bs
    WHERE {
        $this shess:piece ?whiteKing .
        ?whiteKing a shess:King .
        ?whiteKing shess:color shess:White .
        ?whiteKing shess:square ?ws .
        $this shess:piece ?blackKing .
        ?blackKing a shess:King .
        ?blackKing shess:color shess:Black .
        ?blackKing shess:square ?bs .
        BIND (xsd:integer(SUBSTR(?ws, 2, 1)) AS ?wy) .
        BIND (SUBSTR(?ws, 1, 1) AS ?wl) .
        BIND (STRLEN(STRBEFORE("abcdefgh", ?wl)) + 1 AS ?wx) .
        BIND (xsd:integer(SUBSTR(?bs, 2, 1)) AS ?by) .
        BIND (SUBSTR(?bs, 1, 1) AS ?bl) .
        BIND (STRLEN(STRBEFORE("abcdefgh", ?bl)) + 1 AS ?bx) .
        FILTER (?bx >= ?wx-1 && ?bx <= ?wx+1 && 
                ?by >= ?wy-1 && ?by <= ?wy+1) .
    }
    """ .

As you can see, the query is basically the same as from the screenshot above. We only needed to also make sure that the query returns $this in addition to the squares ?ws and ?bs. During validation, the SELECT query will be executed for each instance of shess:Game, which can be used in the query with the variable $this. Then, for every result row, one constraint violation will be reported. In other words, you need to formulate a SPARQL query that finds the negative cases where the condition is violated.
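The "one violation per result row" behaviour, together with the {?var} substitution in sh:message, can be mimicked in a few lines of Python. This is a simplified sketch and not how any particular SHACL engine is implemented; result rows are assumed to be plain dictionaries keyed by variable name, and the focus node ex:Game1 is invented for the example.

```python
import re


def render_message(template, row):
    """Replace {?var} and {$var} placeholders with values from a result row.
    Placeholders without a binding are left unchanged."""
    def substitute(match):
        name = match.group(1)
        return str(row.get(name, match.group(0)))
    return re.sub(r"\{[?$](\w+)\}", substitute, template)


def validate(result_rows, template):
    """Produce one violation per result row of the SELECT query."""
    return [
        {"focusNode": row["this"], "message": render_message(template, row)}
        for row in result_rows
    ]


# Hypothetical result of the query for a game with adjacent kings
rows = [{"this": "ex:Game1", "ws": "e1", "bs": "e2"}]
for violation in validate(rows, "King on {?ws} cannot attack King on {?bs}"):
    print(violation["focusNode"], "-", violation["message"])
# ex:Game1 - King on e1 cannot attack King on e2
```

An empty result set produces an empty list, which corresponds to a game that conforms to the constraint.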

The constraint violations will contain the sh:message where the variables ?ws and ?bs can be used to produce helpful output:

Constraint violation as rendered by TopBraid EDG

So: Mission Accomplished... although for my taste the query looks rather complex. Let's see how this complexity can be reduced.

User-Defined SPARQL Functions

The SPARQL language has a built-in extension point where engines can define new functions in addition to the built-in functions. Most SPARQL engines have their own extension functions, but there is no widely accepted mechanism to make these functions interoperable.

The SHACL Advanced Features specification has introduced a mechanism that allows anyone to declare new SPARQL functions and to distribute them like linked data, in RDF. Note that few SHACL (or SPARQL) engines currently support them. The open-source TopBraid SHACL API does support them, and the feature is widely used by customers of the TopBraid EDG enterprise platform.

Here are some user-defined SPARQL functions that will make writing SHACL constraints much easier, because they encapsulate reusable query logic.

The function shess:getKingSquare takes a Game and a Color as arguments and returns the square of the King with the given color:

shess:getKingSquare
    a sh:SPARQLFunction ;
    rdfs:label "get King square" ;
    sh:parameter [
        sh:path shess:game ;
        sh:class shess:Game ;
        sh:order 0 ;
    ] ;
    sh:parameter [
        sh:path shess:color ;
        sh:class shess:Color ;
        sh:order 1 ;
    ] ;
    sh:returnType xsd:string ;
    sh:select """
        PREFIX shess: <https://example.org/shess#>
        SELECT ?s
        WHERE {
            $game shess:piece ?king .
            ?king a shess:King .
            ?king shess:color $color .
            ?king shess:square ?s .
        }""" .

The function shess:getSquareX converts a square string such as "a1" into its X position as an xsd:integer between 1 and 8:

shess:getSquareX
    a sh:SPARQLFunction ;
    rdfs:label "get square X" ;
    sh:parameter [
        sh:path shess:square ;
        sh:datatype xsd:string ;
    ] ;
    sh:returnType xsd:integer ;
    sh:select """
        SELECT ?x
        WHERE {
            BIND (SUBSTR(?square, 1, 1) AS ?letter) .
            BIND (STRLEN(STRBEFORE("abcdefgh", ?letter)) + 1 AS ?x) 
        }""" .

Finally, the function shess:getSquareY extracts the digit from a square such as "a1" and converts it into an xsd:integer such as 1:

shess:getSquareY
    a sh:SPARQLFunction ;
    rdfs:label "get square Y" ;
    sh:parameter [
        sh:path shess:square ;
        sh:datatype xsd:string ;
    ] ;
    sh:returnType xsd:integer ;
    sh:select """
        PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
        SELECT ?y
        WHERE {
            BIND (xsd:integer(SUBSTR(?square, 2, 1)) AS ?y) .
        }""" .

With these helper functions, we can now significantly simplify the constraint:

SELECT $this ?ws ?bs
WHERE {
    BIND (shess:getKingSquare($this, shess:White) AS ?ws) .
    BIND (shess:getKingSquare($this, shess:Black) AS ?bs) .
    BIND (shess:getSquareY(?ws) AS ?wy) .
    BIND (shess:getSquareX(?ws) AS ?wx) .
    BIND (shess:getSquareY(?bs) AS ?by) .
    BIND (shess:getSquareX(?bs) AS ?bx) .
    FILTER (?bx >= ?wx-1 && ?bx <= ?wx+1 && 
            ?by >= ?wy-1 && ?by <= ?wy+1) .
}

Not only is this particular constraint now much shorter, we can also reuse the same business logic in other scenarios. For example, we could use the functions to compute all squares that are reachable by a given King. User-defined SPARQL functions are like Lego bricks or stored procedures.
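As an illustration of that reuse idea, the coordinate conversion can be turned around to list the squares a King could step to. The sketch below is plain Python rather than a SPARQL function, the name king_moves is hypothetical, and it deliberately ignores other pieces and checks, so it only covers the geometry of the board.

```python
FILES = "abcdefgh"


def king_moves(square):
    """Return all on-board squares adjacent to the given square."""
    x = FILES.index(square[0]) + 1  # same idea as shess:getSquareX
    y = int(square[1])              # same idea as shess:getSquareY
    moves = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            # Skip the king's own square and anything off the board
            if (dx, dy) != (0, 0) and 1 <= nx <= 8 and 1 <= ny <= 8:
                moves.append(FILES[nx - 1] + str(ny))
    return moves


print(king_moves("a1"))       # ['a2', 'b1', 'b2']
print(len(king_moves("e4")))  # 8
```

With this helper, the adjacency constraint reduces to checking whether the black king's square is in king_moves of the white king's square.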

Summary and Outlook

SHACL-SPARQL constraints can express complex conditions: any condition that can be captured as a SPARQL query can be attached to SHACL shapes for validation purposes. We have shown one particular (complex) use case, but there are countless others. We have also shown how user-defined SPARQL functions can improve the modularity of your constraints and queries.

There are several other SPARQL-based features in SHACL that we didn't cover in this article. For example there are SPARQL-based targets that can be used to fine-tune which constraints apply to which nodes. Even more importantly, there are SPARQL-based Constraint Components that are a mechanism to extend SHACL itself by introducing new constraint types backed by SPARQL queries. This topic will be covered in the next article.

Even with SPARQL there are limitations. While we were able to (somehow) express what we needed for this article within the official SPARQL standard, it would arguably have been more natural to express the condition in an even richer language that offers better string processing features. In our product we have added the ability to express constraints and other ontology features using the Active Data Shapes (ADS) JavaScript framework. This offers basically unlimited expressiveness while retaining the declarative nature of RDF-based ontologies. We will likely write more about this in the future. Meanwhile, the previous articles on sh:values and the software company knowledge graph had some ADS examples.

Appendix: Prefix Declarations

One final detail on the SHACL syntax: how to declare namespace prefixes so that they do not need to be repeated in each query. We keep getting questions about this, and the specification is poorly written on this point (yes, I know).

Here is the thing: namespace prefixes are not part of the RDF data model but rather live in the serializations such as Turtle and SPARQL only. For a SHACL engine to understand them, the namespace prefix declarations need to be lifted into the data model, as RDF triples. Here is how to do this correctly:

<https://example.org/shess>
    a owl:Ontology ;
    rdfs:label "Chess in SHACL Example Ontology" ;
    owl:imports <http://datashapes.org/dash> ;
    sh:declare [
        a sh:PrefixDeclaration ;
        sh:namespace "https://example.org/shess#"^^xsd:anyURI ;
        sh:prefix "shess" ;
    ] .

shess:KingsCannotAttackEachOtherConstraint
    a sh:SPARQLConstraint ;
    sh:prefixes <https://example.org/shess> ;
    sh:select """

       SELECT $this ?ws ?bs
       WHERE {
           $this shess:piece ?whiteKing .
    ...

In this case, the sh:select query can use the prefix shess: without having to explicitly declare it in the query string. This is because the sh:prefixes point at an RDF resource that holds the sh:declare statement for it.

It is not correct to write sh:prefixes shess: unless the resource shess: carries the sh:declare triple. In the case above it doesn't, because shess: would expand to <https://example.org/shess#> instead of <https://example.org/shess>.

The usual design pattern is to attach the prefix declaration to the resource that represents the graph itself (aka base URI). That resource is often an owl:Ontology that owl:imports other graphs. The SHACL prefix mechanism will walk into these other graphs and collect all prefix declarations from them. For example, when your graph owl:imports <http://datashapes.org/dash> your SHACL-SPARQL queries can use the commonly needed namespaces such as xsd: and rdfs: for free.
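The walk over owl:imports can be pictured with a toy model. The sketch below is not an engine implementation: graphs are simplified to Python dictionaries with "declare" and "imports" entries, https://example.org/imported is an invented stand-in for an imported library graph, and conflicting declarations for the same prefix are treated as an error here by assumption.

```python
def collect_prefixes(graphs, start):
    """Collect prefix -> namespace pairs declared at the start resource
    and at every graph it imports, directly or transitively."""
    prefixes = {}
    visited = set()
    stack = [start]
    while stack:
        uri = stack.pop()
        if uri in visited or uri not in graphs:
            continue
        visited.add(uri)
        node = graphs[uri]
        for prefix, namespace in node.get("declare", {}).items():
            if prefixes.setdefault(prefix, namespace) != namespace:
                raise ValueError("conflicting namespaces for prefix " + prefix)
        stack.extend(node.get("imports", []))
    return prefixes


# Toy data: the ontology resource declares shess: and imports another graph
graphs = {
    "https://example.org/shess": {
        "declare": {"shess": "https://example.org/shess#"},
        "imports": ["https://example.org/imported"],
    },
    "https://example.org/imported": {
        "declare": {
            "xsd": "http://www.w3.org/2001/XMLSchema#",
            "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        },
        "imports": [],
    },
}

print(sorted(collect_prefixes(graphs, "https://example.org/shess")))
# ['rdfs', 'shess', 'xsd']
print(collect_prefixes(graphs, "https://example.org/shess#"))
# {} because the namespace URI with '#' carries no declarations
```

The second call mirrors the mistake described above: pointing sh:prefixes at the namespace rather than at the ontology resource finds no declarations at all.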

Joshua Cornejo

Information and Knowledge Architect

1 year ago

I think SHACL is a big step forward from OWL in terms of clarity, conciseness and artefacts to work with knowledge/ontologies. But ... I can't get my head around SPARQL as a step forward in the same direction - it gives an extremely decorated and complex to trace phrases (that's the general opinion when you try to find large scalable graph databases that support SHACL/SPARQL).
