xAPI was designed to break down silos. So, don’t put it in a silo.
Shelly Blake-Plock
Co-Founder, CEO @ Yet Analytics | xAPI and Data Tech for Learning
You’ve heard the rationale a million times.
Organizations were suffering from what amounted to data silos. For learning and training professionals, this meant that data was (often literally) stuck in an LMS. It also meant that if they onboarded new technologies like mobile apps or desktop simulations, it usually required an exorbitant amount of manual labor and ETL to extract and normalize the data produced by the new tech before it could be compared to the LMS data in any meaningful way.
So, here comes xAPI and the ability to emit commonly structured data from any source — including the LMS, the mobile app, and the sim (and whatever else came along). And it seems great — in theory.
But, the learning curve is real. And any sort of data integration — let alone semantic data integration dependent upon a bulletproof identity management protocol — is notoriously difficult. And when those integrations don’t come together the way that you think they will, the result is a lot of bad xAPI data.
And so many organizations — rather than taking on the (admittedly difficult) challenge of multi-source streaming data — simply started collecting whatever xAPI data they could in order to normalize it against data in their Excel spreadsheets and BI databases. Teams and business models were either built or reconfigured to do this kind of hand-jamming.
And the result, by and large, was that for many organizations that ever got beyond the proof-of-concept stage, xAPI became another silo.
Just because it is xAPI doesn’t mean it is interoperable
For the most part, when people talk about the “interoperability” of xAPI, what they are talking about is the interoperability of applications that produce xAPI data. In reality, it’s not that xAPI “makes” things interoperable. It’s that if you have two data sources and they are both producing xAPI data, you are able to validate and store that data in a way that ensures that the data conforms to the IEEE 9274.1.1 standard for xAPI.
Just because these pools of data are structured as xAPI does not mean that they are interoperable with anything else. It doesn’t even mean that they will play nicely with one another. For example, two different applications could each produce xAPI data, yet handle the identity of the learner in different ways. Though “valid”, the discrepancies in identity management can cause frustrating difficulties in correlating the data between the two apps. Multiply this by thousands of learners and multiple applications, and you end up with the exact opposite of interoperability.
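To make that concrete, here is a minimal sketch (shown as Python dictionaries; the app names, email, account name, and activity IRIs are invented for illustration) of two statements that are both valid xAPI yet identify the same learner in two different ways:

```python
# Two syntactically valid xAPI statements about the same person.
# App A identifies the learner by email (mbox); App B uses a local account.
# Both pass LRS validation, but nothing ties the mbox to the account name
# without an explicit identity-mapping step.

statement_from_app_a = {
    "actor": {"objectType": "Agent", "mbox": "mailto:jane@example.com"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {"id": "https://lms.example.com/courses/safety-101"},
}

statement_from_app_b = {
    "actor": {
        "objectType": "Agent",
        "account": {"homePage": "https://sim.example.com", "name": "jdoe-4471"},
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/attempted",
        "display": {"en-US": "attempted"},
    },
    "object": {"id": "https://sim.example.com/missions/evac-drill"},
}
```

Nothing in the standard ties the mbox in the first statement to the account in the second; that mapping is an identity-management decision your implementation has to make.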
Therefore, if the goal is interoperability between xAPI data sources, it is not enough to ensure that each data source is producing xAPI. You also have to consider how the design of your xAPI data relates to your objectives and your technical infrastructure. For example, if your competency assertion engine is expecting xAPI data according to one data profile and it receives data that is incongruous with that data profile, the result can be a bottleneck.
Therein lies a tale
xAPI — standardized as IEEE 9274.1.1 — is part of the standards portfolio of the IEEE Learning Technology Standards Committee (LTSC). xAPI Profiles (P9274.2) is an important companion specification that provides the ability to describe and model learning experiences in machine-readable language. Understanding the xAPI standards and the components of that portfolio — especially those standards relevant to the specifications of what’s called the Total Learning Architecture — is key both to getting the most out of xAPI’s true capabilities when it comes to interoperability and to breaking xAPI out of its silo.
Besides xAPI — which covers the domain of activity data — the LTSC portfolio contains standards regarding xAPI Profiles (P9274.2), learning activity metadata (P2881), shared competency definitions (1484.20.3), and learner records.
These standards are core to the Total Learning Architecture (TLA) as designed and specified by Advanced Distributed Learning. As an enterprise streaming data architecture, the TLA represents the combined capability of any variety of modular configurations of the underlying standards and their expression in software applications.
For example, the well-documented STEEL-R project leverages the TLA approach to manage the flow of real-time data between activities occurring in a synthetic training environment and the processing of these activities through an automated competency assertion engine. At a high level: learners train with an AI tutor in the synthetic environment, which emits xAPI data about their activity; a filter-forwarder passes along only the statements that match a defined pattern; and the competency assertion engine consumes that filtered stream to assert competencies in real time.
In this scenario, three of the LTSC standards are implemented — the data is captured from the AI tutor as xAPI (9274.1.1), the filter-forwarder is governed by the patterns coded into an xAPI Profile (P9274.2), and the competency definitions that give the competency assertion engine machine-readable representations follow the Shared Competency Definition model (1484.20.3).
Therefore, for STEEL-R to work, it is not enough that the system produces xAPI data. Rather, the system needs to produce xAPI data and then filter it according to the requirements necessary for the competency assertion system to assert competencies described in standardized, machine-readable language. The xAPI data format does not “ensure” interoperability. Rather, from the point of view of data design, the xAPI model provides a way to ensure that the business system — in this case the competency engine — has exactly (and only) the data it needs in order to efficiently and accurately provide its value in real time.
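A rough sketch of that flow, assuming a hypothetical profile pattern keyed on verbs and a generic competency-engine callback (this is illustrative, not STEEL-R’s actual code):

```python
# Illustrative TLA-style flow: AI tutor emits xAPI -> filter-forwarder applies
# profile rules -> competency assertion engine consumes the filtered stream.

# Verbs that the (hypothetical) xAPI Profile pattern says the engine cares about.
PROFILE_VERBS = {
    "http://adlnet.gov/expapi/verbs/passed",
    "http://adlnet.gov/expapi/verbs/failed",
}

def matches_profile(statement: dict) -> bool:
    """Keep only statements whose verb appears in the profile pattern."""
    return statement.get("verb", {}).get("id") in PROFILE_VERBS

def filter_forward(statements: list[dict], assert_competency) -> None:
    """Filter-forwarder: hand profile-conformant statements to the engine."""
    for stmt in statements:
        if matches_profile(stmt):
            # The engine maps this evidence onto 1484.20.3-style competency definitions.
            assert_competency(stmt)
```

The point is that the filter sits between raw xAPI capture and the competency engine, so the engine only ever sees the statements the profile says it should.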
If the project were satisfied merely with the fact that the AI tutor was producing valid xAPI data, it would fail. Because the xAPI data would be in one silo and the competency engine would be in another silo and it would take a manual ETL process to hand-jam the two together.
It is not enough to be conformant
It can’t be overstated how important data design is to all of this.
And when we’re talking about data design, we’re not just talking about the ability to create valid xAPI.
Most business systems do not care that you’ve produced xAPI data.
The value is not that it is designed “as” xAPI. The value is that it is designed to work. This isn’t magic. The specific value of the xAPI format is that it ensures the ability to align learning activity data — in a trusted and standard format — with other expressions of business rules and data models which are also instantiations of trusted and standard formats. Data standards are not pieces of art to print out and hang on a wall. They are only relevant if they produce a clear business value within a technical infrastructure.
Another way to think about this: when you are designing xAPI data, your goal should not simply be to make sure that the data is conformant to the data standard. Your goal should be that the design meets the business rules and analytics needs that will rely on the data — and in an implementation of the Total Learning Architecture, that means that the xAPI data design may reflect and be reflected in the data design of the Learning Metadata, machine-readable Competencies, and roll-up Learner Records.
Let’s imagine a hypothetical TLA implementation. In this implementation, we’ll have an LMS that is connected to a vast digital course catalog and a simulation that is disconnected from the LMS, but which produces a stream of xAPI data about the experience and behaviors of a learner engaged in a sim. The goal is to gauge increasing levels of competency under a range of difficulty levels (assessed both through quizzes on the LMS and through missions undertaken in the sim). When learners reach a certain threshold of competency, a record is written to what we’ll call a learner record repository. Unlike a typical transcript, the record in this repository indicates “state” — meaning that the measurement that is recorded is not fixed and can go up or down over time based on ongoing or later quiz attempts in the eLearning module or assessed activities within the sim.
Remember that we’re applying the TLA approach to our hypothetical learning environment.
So, we’re going to ensure that the LMS produces xAPI data aligned with the xAPI data emitted from the sim. But, as we observed previously, it is not enough to just emit valid xAPI. Because we’re dealing with two distinct and unconnected systems, we need to make sure that however we collect the identity of the learner — whether as an email address, a student number, or an anonymized hash — it is possible to correlate that “Learner X” in an LMS session is also “Learner X” in a sim session.
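One way to do that, sketched below, is to have both the LMS adapter and the sim adapter resolve their local identifiers to a shared canonical ID and derive the xAPI actor from it in exactly the same way (the salt, homePage, and student number here are placeholders, not part of any standard):

```python
import hashlib

# Both the LMS adapter and the sim adapter resolve their local identifier
# (email, player ID, etc.) to the organization's canonical ID, then build the
# xAPI actor the same way, so statements from both systems join cleanly.

SALT = "org-wide-secret-salt"

def make_actor(canonical_id: str) -> dict:
    anonymized = hashlib.sha256((SALT + canonical_id.lower()).encode()).hexdigest()[:16]
    return {
        "objectType": "Agent",
        "account": {"homePage": "https://identity.example.org", "name": anonymized},
    }

lms_actor = make_actor("student-00123")  # LMS resolves Learner X to her student number
sim_actor = make_actor("student-00123")  # the sim resolves its player ID to the same number
assert lms_actor == sim_actor            # so her LMS and sim statements correlate
```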
Second, we’re going to want to look at the purpose this data will serve. We know that the end goal is the collection of these representations of state in the learner record repo. And that it will be through the assertion of competency gains or decreases that we’ll be forming an opinion on this state. So, we need to make sure that the xAPI data that is consumed by the competency engine is relevant to our objectives.
There’s another thing. The eLearning sessions and the sim sessions represent different modalities and different assessment formats. By applying the Learning Metadata standard, we can tag the eLearning and sim content in a variety of ways in order to add context to the experience data. For example, in the eLearning course, we might present info using two different instructional strategies. In the first case, we may be assessing reading comprehension, whereas in the second we may be assessing the learner’s ability to adapt to changing challenges. We could tag the metadata of the first case to recognize the comprehension objective, and likewise tag the metadata of the second case to recognize the adaptability objective.
The metadata then can be referenced in the xAPI data. And, thanks to the P2881 standard, all we need is the activity id in order to correlate. We’ll make templates of these contextualized data statements in an xAPI Profile. And we’ll have one profile pattern that tracks and represents comprehension while an alternate pattern tracks and represents adaptability.
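Here is a sketch of what one statement instantiating the comprehension template might look like (the activity and profile IRIs are invented for illustration); the object id is the same IRI the metadata record describes, and the context category points at the profile whose template the statement follows:

```python
# One eLearning quiz result shaped by the (hypothetical) "comprehension" template.
# The object id is the same IRI described by the P2881 metadata record, so
# downstream systems can look up instructional strategy and objective by id alone.

comprehension_statement = {
    "actor": {
        "objectType": "Agent",
        "account": {"homePage": "https://identity.example.org", "name": "a1b2c3d4"},
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/answered",
        "display": {"en-US": "answered"},
    },
    "object": {"id": "https://lms.example.com/activities/quiz-07/item-3"},
    "result": {"score": {"scaled": 0.85}, "success": True},
    "context": {
        "contextActivities": {
            # The category activity declares which xAPI Profile this statement follows.
            "category": [{"id": "https://profiles.example.org/tla-demo/v1"}]
        }
    },
}
```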
The competency assertion engine will be looking for certain qualifications related to comprehension and certain qualifications related to adaptability. It will assert competencies based on these met or unmet qualifications either independently or in conjunction with the information gathered from the simulation.
Ah, yes, the simulation. Remember that the sim is also issuing xAPI data statements. And a note about sims — they tend to push out a lot more data than things like LMS-based quizzes. So, we’re going to want to make sure that we only forward the necessary data on to be joined down the road with the eLearning data. To do that, we’ll set up a filter-forwarder and we’ll govern what it filters according to a pattern in an xAPI Profile. So, in this case, say we only want data relevant to achieving a score in a specific range. Now, only that data will be forwarded to a downstream “transactional” Learning Record Store where it will be joined with the comprehension and adaptability information from our eLearning assessments.
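A minimal sketch of that filter-forwarder, assuming the profile pattern cares about scaled scores in a placeholder range of 0.7 to 1.0 and a generic send function standing in for the transactional LRS endpoint:

```python
# Forward only sim statements whose scaled score falls inside the range the
# (hypothetical) profile pattern cares about; drop the rest of the firehose.

SCORE_MIN, SCORE_MAX = 0.7, 1.0

def in_scoring_pattern(statement: dict) -> bool:
    score = statement.get("result", {}).get("score", {}).get("scaled")
    return score is not None and SCORE_MIN <= score <= SCORE_MAX

def forward_sim_stream(sim_stream, send_to_transactional_lrs) -> None:
    for stmt in sim_stream:
        if in_scoring_pattern(stmt):
            # Only these statements get joined downstream with the eLearning data.
            send_to_transactional_lrs(stmt)
```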
Based on the data that it consumes from the transactional LRS, the automated competency assertion system makes its assertions and sends the relevant roll-up information to the learner record repo.
The learner record repo requires the information to be sent according to a standardized data model — because this competency info is not going to be the only info held there. Like any transcript, the records kept in the repo will also include things like past course registrations, grades, and relevant demographics. As a “state machine”, the repo will also update as new information becomes available. This could be something practical, like a change in a student’s address, or something more germane to the process described above, like an updated competency.
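As a toy illustration of that “state machine” behavior (the field names and competency IRI are made up, not a standardized learner record schema), the latest assertion simply overwrites the stored state, so a level can move down as well as up:

```python
from datetime import datetime, timezone

# Toy learner-record repo: the latest assertion overwrites the stored state,
# so a competency level can move down as well as up as new evidence arrives.

learner_records: dict[tuple[str, str], dict] = {}

def update_competency_state(learner_id: str, competency_iri: str, level: str) -> None:
    learner_records[(learner_id, competency_iri)] = {
        "level": level,  # e.g. "novice", "proficient"
        "asserted_at": datetime.now(timezone.utc).isoformat(),
    }

update_competency_state("a1b2c3d4", "https://competencies.example.org/evac-planning", "proficient")
update_competency_state("a1b2c3d4", "https://competencies.example.org/evac-planning", "novice")
# The repo now reflects the most recent assertion: state went down, not just up.
```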
It starts with a single xAPI statement
The entire set of processes and relationships described above starts with the design of a single xAPI data statement. But that design does not happen in a vacuum. Rather, the xAPI statement inherently contains the context of where it comes from and where it might go. When designing the statements, we must consider the entirety of the process, the requirements of each component of the system, and how the design decisions that we make will influence or bias the way the results are interpreted.
Most importantly, recognize that neither xAPI nor any of the data standards that describe the other aspects of our business system naturally gravitate towards living in silos. They are all designed to be interconnected and they thrive on interconnectedness.
They only end up in silos if you put them in silos.
Capture and verify learning records with direct source, real-time data || Solutions Architect @ Stride, Inc. || EdTech Breakthrough Awarded
3 days ago
I think it really highlights for me that solutions, specifications, etc., are only impactful if they are adopted and coupled with additional perspective efforts. Which dominoes into asking… how? Do we expect the consumer to know? Commercializing the LRS impacted and contributed to the widespread adoption of xAPI. But that learning curve you mention is REAL. The more I look at it and learn about xAPI, the more I start to view it as a language art. I’m actively trying to program and build out an LRS (albeit about as slow as molasses (also, your SQL LRS is awesome)). This field/focus for me has really been wildly insightful into what works, what doesn’t, and just a really fun / interesting challenge.
CEO and Strategic Consultant at Build Capable
3 days ago
What I think is interesting about this is the assumption that data translation has to happen after the data is sent to the LRS. We get ahead of it by pre-packaging distribution that allows capturing the actor consistently, with a data translator specific to the content type that is set up by profiles aligned with your data strategy. This gets cleaner, consistent data into the LRS across different providers. If you control the data in, you minimize the need for forwarding and filtering through transactional LRSs. Am I missing something?