Why You Should Be Using XSLT 3.0

Eighteen years ago, the originators of the XML specification faced a problem: how to use the new language to generate a book-publishing format. What emerged were two new languages: the first, XSL Formatting Objects (XSL-FO), for describing the various functional parts of a publication in XML, and the second, XSL Transformations (XSLT), for transforming XML-formatted content into the XSL-FO language.

XSL-FO is still in use today, though the number of formatting languages in XML has grown beyond the initial scope of FO. Additionally, CSS has been quietly overtaking FO for many simpler document transformations, to the extent that many eBooks (specifically those based upon the ePub standard) are essentially HTML + CSS. However, XSLT has taken its own remarkable trajectory, as people began to realize that the problem of transforming XML transcended just publishing books and covered transforms from any format to any other.

One problem that XSLT adoption has faced is the difficulty of getting older implementations upgraded. Java ships with Xalan. Xalan has not been improved since it was first incorporated into Java back in 2000 and it still uses the very first version of XSLT, standardized in 1999. The Linux-based libxslt processor is similar; while it is a good implementation for the Linux platform, it has not been upgraded beyond XSLT 1.0 since it was written in the early 2000s. Since then there have been two more major versions of the XSLT standard: the first (XSLT 2.0) became a recommendation in January 2007, and the second (XSLT 3.0) is scheduled to be released this year. These versions are backwards compatible, which means that XSLT 1.0 stylesheets written fifteen years ago should still work today in contemporary XSLT engines with little to no modification.

Moreover, swapping out XSLT versions is typically as simple as dropping a more contemporary engine, such as the Saxon processor, into a folder in your Java project and changing a line in a configuration file. Most Java developers could do it in under ten minutes, and there are both open source and commercial versions available, priced from free up to a fairly modest licensing fee. Upgrading similar systems on Windows (such as Altova's XSLT server or the Quixslt XSLT processor) is usually nearly as easy. There really are very few reasons why you should not upgrade.

The question, of course, is what benefits do you get for that upgrade? There are a number of them, but it's worth going through the key ones to understand why upgrading (preferably to XSLT 3.0) is so worthwhile.

JSON Transformations

In XSLT 3.0, an inbound document can be in JSON, rather than XML. The processor can take that document, use the json-to-xml() function to convert it into a specific known XML format, process that through the templates, then convert the resulting output back into JSON (or can convert it into HTML 5 among other formats).

For instance, the following inbound JSON content

{"employees":{
    "jd101":{
        "firstname":"Jane",
        "surname":"Doe",
        "department":"IT",
        "manager":"kp102"
        },
    "kp102":{
        "firstname":"Kitty",
        "surname":"Pride",
        "department":"IT",
        "manager":"jh104"
        },
    "cx103":{
        "firstname":"Charles",
        "surname":"Xavier",
        "department":"Management"
        },
    "jh104":{
        "firstname":"James",
        "surname":"Howlett",
        "department":"Security",
        "manager":"cx103"
        }        
    }}

will get transformed to an internal XML representation through the json-to-xml() function:

<j:map xmlns:j="http://www.w3.org/2005/xpath-functions">
   <j:map key="employees">
      <j:map key="jd101">
         <j:string key="firstname">Jane</j:string>
         <j:string key="surname">Doe</j:string>
         <j:string key="department">IT</j:string>
         <j:string key="manager">kp102</j:string>
      </j:map>
      <j:map key="kp102">
         <j:string key="firstname">Kitty</j:string>
         <j:string key="surname">Pride</j:string>
         <j:string key="department">IT</j:string>
         <j:string key="manager">jh104</j:string>
      </j:map>
      <j:map key="cx103">
         <j:string key="firstname">Charles</j:string>
         <j:string key="surname">Xavier</j:string>
         <j:string key="department">Management</j:string>
      </j:map>
      <j:map key="jh104">
         <j:string key="firstname">James</j:string>
         <j:string key="surname">Howlett</j:string>
         <j:string key="department">Security</j:string>
         <j:string key="manager">cx103</j:string>
      </j:map>
   </j:map>
</j:map>

Now, suppose that you wanted to map this to a different data structure, such as an array of objects. The templates to do so would look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!-- j: is bound to the namespace that json-to-xml() uses in the final XSLT 3.0 recommendation -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://www.w3.org/2005/xpath-functions/math"
    xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
    xmlns:emp="https://www.semanticalllc.com/ns/employees#"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    xmlns:j="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="xs math xd h emp"
    version="3.0"
    expand-text="yes"
    >
    <xsl:output method="text" media-type="application/json"/>
    <!-- The source document wraps the JSON text; json-to-xml() parses its string value -->
    <xsl:variable name="employees-a" select="json-to-xml(/)"/>
    <xsl:template match="/">
       <!-- Build the restructured XML form first, then serialize it as JSON -->
       <xsl:variable name="persons-b">
           <xsl:apply-templates select="$employees-a/*"/>
       </xsl:variable>
       {xml-to-json($persons-b, map{'indent':true()})}
    </xsl:template>
    <!-- Turn the outer "employees" object into an array called "persons" -->
    <xsl:template match="/j:map">
            <j:map>
                <j:array key="persons">
                    <xsl:apply-templates select="j:map[@key='employees']/j:map" mode="employee"/>
                </j:array>
            </j:map>
    </xsl:template>
    <!-- One array entry per employee; the original object key becomes the "id" property -->
    <xsl:template match="j:map" mode="employee">
        <j:map>
            <j:string key="id">{@key}</j:string>
            <j:string key="fullName">{j:string[@key='firstname'] || ' ' || j:string[@key='surname']}</j:string>
            <j:string key="reverseName">{j:string[@key='surname'] || ', ' || j:string[@key='firstname']}</j:string>
            <xsl:copy-of select="*[@key=('firstname','surname','department')]"/>
            <xsl:if test="j:string[@key='manager']">
                <j:string key="reportsTo">{j:string[@key='manager']/text()}</j:string>
            </xsl:if>
        </j:map>
    </xsl:template>
</xsl:stylesheet>

As a side note, the namespace for the XML-ified version of the JSON (the namespace referred to by the j: prefix) changed several times over the course of the XSLT 3.0 recommendation process before settling on http://www.w3.org/2005/xpath-functions in the final specification, so with an older processor it's probably worth experimenting with the json-to-xml() function to see what namespace it actually uses.

This can then be converted back to JSON with the xml-to-json() function, resulting in the following output:


  { "persons" : 
    [ 
      { "id" : "jd101",
        "fullName" : "Jane Doe",
        "reverseName" : "Doe, Jane",
        "firstname" : "Jane",
        "surname" : "Doe",
        "department" : "IT",
        "reportsTo" : "kp102" },
      
      { "id" : "kp102",
        "fullName" : "Kitty Pride",
        "reverseName" : "Pride, Kitty",
        "firstname" : "Kitty",
        "surname" : "Pride",
        "department" : "IT",
        "reportsTo" : "jh104" },
      
      { "id" : "cx103",
        "fullName" : "Charles Xavier",
        "reverseName" : "Xavier, Charles",
        "firstname" : "Charles",
        "surname" : "Xavier",
        "department" : "Management" },
      
      { "id" : "jh104",
        "fullName" : "James Howlett",
        "reverseName" : "Howlett, James",
        "firstname" : "James",
        "surname" : "Howlett",
        "department" : "Security",
        "reportsTo" : "cx103" } 
    ] 
}

Attribute and Text Value Templates

One argument that has dogged XSLT from the beginning is its verbosity, along with similar arguments that certain expressions are difficult to manage when the @select attribute is used for evaluating text expressions (specifically with the <xsl:value-of> element). With XSLT 3.0, an innovation that first appeared in XQuery has made its way into the XSLT language: the use of text value templates (attribute value templates have been available since XSLT 1.0).

The idea here is simple: any time an expression can be evaluated as a string or similar atomic value, the <xsl:value-of> statement can be replaced with braces "{}", provided that expand-text="yes" is in effect on the stylesheet or an enclosing element. Such expressions are called text value templates, or TVTs. For instance, the following generates the full name of an employee from the first and last name in the XSLT 1.0 style.

<j:string key="fullName"><xsl:value-of
    select="fn:concat(j:string[@key='firstname'], ' ', j:string[@key='surname'])"/></j:string>

In XSLT 3.0, this can be rewritten as

<j:string key="fullName">{
    j:string[@key='firstname'] || ' ' || j:string[@key='surname']
}</j:string>

Not only does the latter require fewer keystrokes (reducing the verbosity of the language) but it is also easier to follow, especially with the concat operators ("||") replacing the concat() function. This also solves a big problem with XSLT 1 where you had a select expression which needed both single and double quotes within attributes.
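As a rough illustration of the two kinds of value template side by side, the following sketch turns an employee entry into a small XML element. It assumes the j: prefix is bound to whatever namespace your processor uses for json-to-xml() output; the person, name and note elements, the id and dept attributes, and the employee-xml mode name are all hypothetical names chosen for the example.

```xml
<!-- Hypothetical output vocabulary: person, name, note, @id and @dept are illustrative only -->
<xsl:template match="j:map" mode="employee-xml" expand-text="yes">
    <!-- Attribute value templates: braces inside attributes of a literal result element -->
    <person id="{@key}" dept="{j:string[@key='department']}">
        <!-- Text value templates: braces inside text, enabled by expand-text="yes" -->
        <name>{j:string[@key='surname']}, {j:string[@key='firstname']}</name>
        <!-- Doubled braces produce literal braces in the output -->
        <note>{{generated}}</note>
    </person>
</xsl:template>
```

Setting expand-text on the template rather than on the whole stylesheet keeps the change local, which may be convenient when retrofitting older stylesheets whose text already contains literal braces.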

Evaluating XPaths Dynamically

A related capability within XSLT 3.0 is the <xsl:evaluate> instruction, which evaluates the expression in its @xpath attribute to obtain a string, then evaluates that string in turn as an XPath expression to select the appropriate nodes.

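<xsl:template match="j:map" mode="employee">
    <xsl:variable name="prop" select="'j:string'"/>
    <xsl:variable name="key" select="'firstname'"/>
    <!-- @xpath builds the expression text, which is then evaluated against the current node -->
    <output><xsl:evaluate xpath="$prop || '[@key=&quot;' || $key || '&quot;]'" context-item="."/></output>
</xsl:template>

In this particular case, the expression

$prop || '[@key="' || $key || '"]'

(with the double quotes escaped as &quot; inside the attribute) gets converted into the XPath string:

j:string[@key="firstname"]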

This is then evaluated to retrieve (for the first entry) the value

"Jane"

This can evaluate both individual string or similar atomic values and sequences of nodes. Note that this is less efficient than using static content because it's harder to optimize, but invaluable in those cases where you are building stylesheets that build stylesheets (a surprisingly common design pattern, by the way).
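When the varying part of a dynamic expression is a value rather than a path, one alternative to string concatenation is to pass it in as a parameter. The sketch below assumes the same j: and xs: prefixes as the earlier listings; the employee-lookup mode name and the $expr variable are hypothetical, and in practice the expression text might come from a stylesheet parameter or a configuration file.

```xml
<xsl:template match="j:map" mode="employee-lookup">
    <!-- Hypothetical expression text; note that it refers to a variable named $key -->
    <xsl:variable name="expr" as="xs:string" select="'j:string[@key = $key]'"/>
    <output>
        <!-- xsl:with-param makes $key visible inside the dynamically evaluated expression -->
        <xsl:evaluate xpath="$expr" context-item=".">
            <xsl:with-param name="key" select="'surname'"/>
        </xsl:evaluate>
    </output>
</xsl:template>
```

Passing values this way avoids quoting problems when a key contains an apostrophe or quotation mark, since the value never becomes part of the expression text.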

Functions and Types

An XSLT 2.0 addition carried over into XSLT 3.0 is the introduction of functions and typed variables. These provide a huge boost over named templates.

In XSLT 1.0, if you wanted to evaluate a "function" you needed to use a named template:

<xsl:template name="multiply-template">
    <xsl:param name="a"/>
    <xsl:param name="b"/>
    <xsl:value-of select="$a * $b"/>
</xsl:template>

This could only be invoked outside of attributes, so going from:

<multiply a="10" b="20"/>

to

<multiply-result value="200"/>

would look something like:

<xsl:template match="multiply">
     <!-- Capture the named template's output in a variable -->
     <xsl:variable name="result">
     <xsl:call-template name="multiply-template">
          <xsl:with-param name="a" select="number(@a)"/>
          <xsl:with-param name="b" select="number(@b)"/>
     </xsl:call-template>
     </xsl:variable>
     <multiply-result>
         <xsl:attribute name="value">
             <xsl:value-of select="$result"/>
         </xsl:attribute>
     </multiply-result>
</xsl:template>

With XSLT 3 (really XSLT 2), you can simplify this considerably by creating functions and using datatypes:

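<xsl:function name="myNS:multiply" as="xs:double">
    <xsl:param name="a" as="xs:double"/>
    <xsl:param name="b" as="xs:double"/>
    <!-- xsl:sequence returns the typed value itself rather than a text node -->
    <xsl:sequence select="$a * $b"/>
</xsl:function>

<xsl:template match="multiply">
    <!-- The untyped attribute values are converted to xs:double when the function is called -->
    <multiply-result value="{myNS:multiply(@a, @b)}"/>
</xsl:template>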

There are several key insights here. First, functions are defined in their own namespaces and can be imported as libraries. The namespace is declared in the xsl:stylesheet header if not inline. A local function overrides an imported function of the same name and arity, while with <xsl:include> the included functions have the same precedence as local ones, so a duplicate definition is an error.

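<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:myNS="https://www.example.com/ns/myNS#"
    exclude-result-prefixes="myNS"
    >
    <xsl:import href="myFunctions.xsl"/>
    <!-- templates and local functions follow -->
</xsl:stylesheet>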

This can provide portability of functions across platforms in a consistent manner, as you can move from Java to C++ to eventually JavaScript without needing to change function code. Extending functions with native code is similarly supported. Functions can be invoked anywhere that an XPath expression can be evaluated, and can both accept and return nodes and sequences of nodes. Sequences of nodes (and mixed types) also replace the node-sets of XSLT 1.0, giving much more flexibility without the requirement of the node-set() function (which is usually retained for backward compatibility but is now essentially just a pass-through function).

Additionally, with XSLT 2.0/3.0, you can now declare datatypes, giving much more control over both input and output. These are optional rather than required, but are useful in providing functional validation. Note that functions are distinguished by name and number of parameters only, not by parameter types, so a variant that validates for certain types but not others needs its own name (myNS:multiply-int here is an illustrative choice):

<xsl:function name="myNS:multiply-int" as="xs:integer">
    <xsl:param name="a" as="xs:integer"/>
    <xsl:param name="b" as="xs:integer"/>
    <xsl:sequence select="$a * $b"/>
</xsl:function>

XSLT 3.0 also includes the concept of function packages. For instance, a complex number package may look something like the following:

<xsl:package
  name="https://example.org/complex-arithmetic.xsl"
  package-version="1.0"
  version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:f="https://example.org/complex-arithmetic.xsl">
  
  <xsl:function name="f:complex-number" 
                as="map(xs:integer, xs:double)" visibility="public">
    <xsl:param name="real" as="xs:double"/>
    <xsl:param name="imaginary" as="xs:double"/>
    <xsl:sequence select="map{ 0:$real, 1:$imaginary }"/>
  </xsl:function>
  
  <xsl:function name="f:real" 
                as="xs:double" visibility="public">
    <xsl:param name="complex" as="map(xs:integer, xs:double)"/>
    <xsl:sequence select="$complex(0)"/>
  </xsl:function>
  
  <xsl:function name="f:imag" 
                as="xs:double" visibility="public">
    <xsl:param name="complex" as="map(xs:integer, xs:double)"/>
    <xsl:sequence select="$complex(1)"/>
  </xsl:function>
  
  <xsl:function name="f:add" 
                as="map(xs:integer, xs:double)" visibility="public">
    <xsl:param name="x" as="map(xs:integer, xs:double)"/>
    <xsl:param name="y" as="map(xs:integer, xs:double)"/>
    <xsl:sequence select=" f:complex-number( f:real($x) + f:real($y), f:imag($x) + f:imag($y))"/>
  </xsl:function>
  
  <xsl:function name="f:multiply" 
                as="map(xs:integer, xs:double)" visibility="public">
    <xsl:param name="x" as="map(xs:integer, xs:double)"/>
    <xsl:param name="y" as="map(xs:integer, xs:double)"/>
    <xsl:sequence select=" f:complex-number( f:real($x)*f:real($y) - f:imag($x)*f:imag($y), f:real($x)*f:imag($y) + f:imag($x)*f:real($y))"/>
  </xsl:function>
  
  <!-- etc. -->
  
</xsl:package>

This is pulled in with the <xsl:use-package> element. The advantage that packages have over normal imports is that they provide the ability to maintain different versions and consequently establish version dependency.
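A consuming stylesheet might look something like the sketch below. It assumes the processor has already been told where to find the package (how packages are located is left to the implementation), and it reuses the package name and f: namespace exactly as written in the listing above. The sum element and its attributes are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="https://example.org/complex-arithmetic.xsl"
    exclude-result-prefixes="f">

    <!-- package-version="1.*" accepts any 1.x release of the package -->
    <xsl:use-package name="https://example.org/complex-arithmetic.xsl" package-version="1.*"/>

    <xsl:template name="xsl:initial-template">
        <!-- (1 + 2i) + (3 + 4i), built only from the package's public functions -->
        <xsl:variable name="sum" select="f:add(f:complex-number(1, 2), f:complex-number(3, 4))"/>
        <sum real="{f:real($sum)}" imaginary="{f:imag($sum)}"/>
    </xsl:template>
</xsl:stylesheet>
```

Because the version constraint lives in the using stylesheet, a later 2.x release of the package with incompatible changes would not be picked up by accident.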

One final addition in the world of functions is the incorporation of an <xsl:try>/<xsl:catch> block. For instance, a function can make use of such a block to catch divide-by-zero errors (note: this is a deliberately simple example).

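<!-- Assumes xmlns:err="http://www.w3.org/2005/xqt-errors" is declared on the stylesheet or package -->
<!-- xs:decimal is used because xs:double division by zero yields INF rather than raising an error -->
<xsl:function name="f:divide"
                as="xs:decimal?" visibility="public">
    <xsl:param name="x" as="xs:decimal"/>
    <xsl:param name="y" as="xs:decimal"/>
    <xsl:try>
        <xsl:sequence select="$x div $y"/>
        <!-- err:FOAR0001 is the standard division-by-zero error code -->
        <xsl:catch errors="err:FOAR0001">
            <xsl:message>
                <div data-code="{$err:code}">Divide by Zero Error</div>
            </xsl:message>
        </xsl:catch>
    </xsl:try>
  </xsl:function>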

The try construction contains a sequence of items to be evaluated. If any of these fail, control passes to the <xsl:catch> block, where data about the error is available in variables in the err: namespace (such as $err:code and $err:description). This extends to functions the kind of exception handling that had largely been the province of templates in 1.0 (with more capabilities).

Extended Function Set, Sequences, Arrays and Maps

The initial function set for XSLT 1.0 was essentially the XPath 1.0 function set, and was very limited. Minimal math support, no regular expression support, minimal string manipulation capabilities, no support for set (sequence) operations, no support for dates: it's a very bare-bones function set and one reason why many people have the impression that XSLT is underpowered. XSLT 1.0 is underpowered. XSLT 3.0 is not.

The XPath 3.0 specification (a recommendation as of April 2014) supports a far larger function list, covering strings, regular expressions, numerics and math, dates and durations, sequences, and higher-order functions.

XSLT 3.0 adds a few functions to this list that are specific to the XSLT language, some as traditional functions, some as elements. These include additional support for sorting, grouping, numbering, higher order functions (functions as arguments to other functions), map/reduce capabilities, regular expression analysis and so forth. It also includes support for reading (and writing) XML, text and binary resources under separate threads, giving it much more control for orchestrating processes (this combined with XProc makes XSLT3 a major player in any orchestration system).
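Grouping is a good example of the element-based additions. The following sketch regroups the employee data from the JSON section by department, assuming the j: prefix is bound to the namespace your processor uses for json-to-xml() output; the by-department mode name and the "departments" key are hypothetical.

```xml
<xsl:template match="j:map[@key='employees']" mode="by-department" expand-text="yes">
    <j:map key="departments">
        <!-- One group per distinct department value -->
        <xsl:for-each-group select="j:map" group-by="j:string[@key='department']">
            <xsl:sort select="current-grouping-key()"/>
            <!-- Each department becomes an array holding the ids of its employees -->
            <j:array key="{current-grouping-key()}">
                <xsl:for-each select="current-group()">
                    <j:string>{@key}</j:string>
                </xsl:for-each>
            </j:array>
        </xsl:for-each-group>
    </j:map>
</xsl:template>
```

The result stays in the same XML vocabulary, so it could be handed on to xml-to-json() once it is placed inside a top-level map.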

XSLT 3.0 also includes support for maps. Maps are analogous to objects in JavaScript, making it possible to create entities which contain name-value pairs that can be built up dynamically. Such maps are (like all XSLT structures) immutable: a map:put() operation on a map returns a new map and leaves the original untouched. This is actually in accordance with a growing sentiment in the programming community that mutable programming introduces too many potential side-effects that lead to hard-to-maintain code.
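A minimal sketch of that immutability, using the standard map function namespace; the result, before and after elements and their attributes are hypothetical names used only to display the values.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="map">

    <xsl:template name="xsl:initial-template">
        <xsl:variable name="employee" select="map{'firstname':'Jane', 'surname':'Doe'}"/>
        <!-- map:put returns a new map; $employee itself is unchanged -->
        <xsl:variable name="updated" select="map:put($employee, 'department', 'IT')"/>
        <result>
            <before size="{map:size($employee)}" has-department="{map:contains($employee, 'department')}"/>
            <after size="{map:size($updated)}" department="{$updated('department')}"/>
        </result>
    </xsl:template>
</xsl:stylesheet>
```

The original map still has two entries and no department key after the call, while the new map has three.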

Similarly, the language also provides support for both sequences (from XSLT 2.0 onward) and arrays (XSLT 3.0). The distinction between the two is subtle: in a sequence, if you add a new sequence to an existing sequence, the result is just another sequence - there are no boundaries of containment. An array is similarly a list of items, but you can have an array of arrays.

<xsl:variable name="sequence" select="('a','b','c',('d','e',('f')))"/>
{$sequence}
=> ('a','b','c','d','e','f')
{$sequence[2]}
=> 'b'   // sequences are 1-based
<xsl:variable name="array" select="[[1,2],[3,4],[5,6]]"/>
{$array}
=> [[1,2], [3,4], [5,6]]
{$array(2)}   // arrays are also 1-based, and are indexed with a function-style call rather than a [] predicate
=> [3,4]
{$array(2)(1)}
=> 3

This completes the equivalency between JSON and XML within XSLT - with XSLT you can work with all of the structures that either has. It's also worth noting that this makes it possible to use XSLT for certain RDF operations, because RDF can also be represented as either JSON or XML.

Streaming and Performance

One final benefit of the XSLT 3.0 standard - it supports streaming. The real world has moved beyond files - data comes in streams, from activity streams generated by Twitter or Facebook to location streams coming from cell phones to gigabyte sized files that can only be consumed as chunked streams. XSLT 3.0 can be configured to handle streamed content, with some limitations that come from not necessarily knowing completion points until they arrive.
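As a sketch of what that configuration can look like, the stylesheet below declares its default mode streamable and counts elements in a single pass. It assumes a processor that implements the optional streaming feature, and the employees/employee input vocabulary and employee-count output element are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Declares the default mode streamable; the processor then checks each template rule against the streamability rules -->
    <xsl:mode streamable="yes"/>

    <xsl:template match="/">
        <!-- A single downward selection consumed by count(), so the input never needs to be held in memory as a tree -->
        <employee-count><xsl:value-of select="count(employees/employee)"/></employee-count>
    </xsl:template>
</xsl:stylesheet>
```

The limitation mentioned above shows up here as a design constraint: a template body can generally make only one downward pass over the streamed input, so expressions that need to look back at earlier content have to be restructured.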

Performance is a little harder to measure: both Xalan and libxslt are relatively basic and consequently haven't been optimized much over the years. This means that in simple transformations these XSLT 1.0 processors may have a slight edge over XSLT 3.0 processors like Saxon, but for even moderate-weight transformations, any real speed benefits disappear because so much post-processing needs to be done. Running XSLT 3.0 in a streaming mode can provide a huge amount of caching, and some processors (notably Saxon) also support full or partial compilation.

Wrap Up

XSLT 3.0 represents a major upgrade of the XSLT 1.0 (and even XSLT 2.0) standards to become a general purpose transformation language for the most common data storage and messaging formats. The language has become integral in publishing pipelines, is increasingly responsible for managing transformations between complex data structures and data mappings, and is accessible from almost all known languages (Saxon-C, an XSLT 3.0 compliant version of Saxon, is now available for C/C++ on Linux and shortly for Windows, with bindings for languages such as PHP and node.js).

So, even if XML is not part of your normal processing pipeline, XSLT 3.0 is still very much a worthwhile investment to learn and integrate into your own systems.

You can also read this article (and many others) on XML and related technologies at the newly updated xml.com.

Author Kurt Cagle has been writing about XSLT from its early days in 1999, and is tickled that 3.0 is on its way.

#TheCagleReport

Marco Brandizi

Senior Software Engineer, specialised in IT solutions to manage data, especially life science data.

6 years ago

Came to this post from Google, while looking for implementations of XSLT 2 or 3 which can also do streaming processing (I have XML files with up to 10M nodes). I cannot find anything. It seems everyone has run away from XSL and the only serious implementation around is Saxon, which supports streaming, but only if you buy the enterprise edition.

Patrick Durusau

Owner at Patrick Durusau

8 years ago

Kurt Cagle, just a minor typo "accessible from almost known languages," err, I think you meant: "accessible from *most* known languages...." Yes? Although, "almost known languages" may be to see if readers are paying attention.

Luis Larrea

Software Engineer at JPMorgan Chase & Co.

8 years ago

This has been a long time coming. Is this going to finally bring XML to the front end world full force?

Review your history Kurt? Not quite right. Interesting view, though (as Mike says) 3.0 is likely needed by <20% of xslt users.

Jonathan Bisson

Computers, Chemistry, Biology, Humans, Cats and everything in between.

8 years ago

Great, thanks for presenting that, I didn't know that XSLT was that powerful. Would that be convenient with JSON-LD? Or is that still worth it to stay with RDF?