What is a vocabulary Map? An observable experience

Introduction

A vocabulary map is a visual representation of a set of interrelated definitions of terms.

I created an Observable notebook to demonstrate how such a map can be generated automatically, either from a worksheet providing terms and descriptions or directly from a notebook cell in which the terms and definitions are entered.

The table has two columns, one with the term and the other with the definition. For the graph generation, it is possible to recognize the other defined terms within a definition and to mark them (here they are bolded with the boldTerms function defined in this notebook).
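
As an illustration of this marking step, here is a minimal sketch in plain JavaScript. The notebook's own boldTerms function is not reproduced in this article, so the helper below (markTerms, a hypothetical name) is only an assumption about how such marking could work: it wraps each known term found in a definition in <b> tags, trying longer terms first so that a short term does not split a longer one.

```javascript
// Hypothetical helper, not the notebook's boldTerms: wraps known terms in <b> tags.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function markTerms(definition, terms) {
  if (terms.length === 0) return definition;
  // Longest terms first, so "vocabulary map" wins over "vocabulary".
  const alternatives = [...terms]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  // Single pass over the text: already marked terms are never re-scanned.
  return definition.replace(new RegExp(alternatives, "g"), "<b>$&</b>");
}

const sample = markTerms("A vocabulary map links each term to a definition.", [
  "term",
  "definition",
  "vocabulary map"
]);
console.log(sample);
```

Like the substring test used later for the edges, this sketch is case sensitive and would also match a term embedded in a longer word.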

The vocabulary map is created as a visual map made of nodes, each containing a defined term and its definition, and of arcs pointing to the nodes of the defined terms used in that definition. The intent is to support people dealing with multiple terminologies, such as those defined through international standardization, or projects whose participants have to align their vocabulary in order to communicate, and therefore collaborate, efficiently.

In a first stage, the focus is on content meant mainly to be interpreted by people. This should later be extended to support computable representations, such as knowledge or model repositories, which means considering web ontologies, semantic graphs, and models produced with general-purpose or domain-specific modeling languages.

How is it realized within the Observable notebook?

The cell definedTermsWithTSV loads the data from an attached TSV file, "Term Definition-Vocabulary Map.tsv", which is an export from Numbers. TSV is used in place of CSV because it is more robust.

definedTermsWithTSV = FileAttachment("Term-Definition-Vocabulary [email protected]").tsv()
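
For readers unfamiliar with what the .tsv() call returns: it yields an array of objects, one per row, keyed by the header names. The sketch below imitates that shape in plain JavaScript with a deliberately naive parser. parseTsv is a hypothetical helper and the sample rows are invented for illustration; the sketch assumes a header line with Term and Definition columns and no quoted or multi-line fields.

```javascript
// Naive TSV parser for illustration only: no quoting, no escaped tabs.
function parseTsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const cells = line.split("\t");
    // Build one object per row, keyed by the header names.
    return Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]));
  });
}

// Invented sample data with the two columns used in this article.
const sampleTsv = [
  "Term\tDefinition",
  "term\tA word or phrase that designates a concept.",
  "definition\tA statement that describes the meaning of a term."
].join("\n");

const rows = parseTsv(sampleTsv);
console.log(rows[0].Term, "->", rows[0].Definition);
```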

Then an array "terms" is created from a map of the data in the previous table.
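
The cell that builds this array is not reproduced in the article. A minimal sketch of such a step in plain JavaScript, assuming the rows carry a Term column as in the nodes cell further down (the rows below are invented sample data):

```javascript
// Invented sample rows in the shape produced by the TSV import.
const sampleRows = [
  { Term: "term", Definition: "A word or phrase that designates a concept." },
  { Term: "definition", Definition: "A statement that describes the meaning of a term." }
];

// Assumed equivalent of the "terms" array: one entry per row, keeping only the term.
const terms = sampleRows.map((row) => row.Term);
console.log(terms);
```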

Then the nodes, edges and options for a vis.js network are created, in order to produce the interactive visual map below.

Nodes:

nodes = definedTermsWithTSV.map((currentElement, index) => ({
  id: index,
  font: { multi: true },
  color: {
    background: "white",
    border: "black",
    highlight: { background: "white", border: "red" }
  },
  // label: "<b>" + currentElement.Term + "</b>" + ":" + boldTerms(currentElement.Definition),
  definition: currentElement.Definition,
  term: currentElement.Term,
  image: SVGNode(currentElement.Term, currentElement.Definition),
  shape: "image"
}))
        
Edges:

// Create one edge per defined term found in a node's definition
edges = {
  const arr = [];
  nodes.forEach((node, index) => {
    termsWithId.forEach((entry) => {
      // Plain substring test: case sensitive, and also matches inside longer words
      if (node.definition.includes(entry.term)) {
        arr.push({
          from: index,
          to: entry.termId,
          // label: "references",
          arrows: { to: { enabled: true, type: "arrow" } }
        });
      }
    });
  });
  return arr;
}
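
To see the effect of this cell on concrete data, the same matching logic can be run outside Observable. The sketch below is plain JavaScript with invented sample nodes. buildEdges is a hypothetical wrapper name and, unlike the cell above, it skips a term found in its own definition, which is an assumption about the desired behaviour rather than something the notebook does.

```javascript
// Hypothetical standalone version of the edge-building logic.
function buildEdges(nodes, termsWithId) {
  const edges = [];
  nodes.forEach((node) => {
    termsWithId.forEach((entry) => {
      // Assumption: ignore a term mentioned in its own definition (no self-loops).
      if (entry.termId === node.id) return;
      if (node.definition.includes(entry.term)) {
        edges.push({ from: node.id, to: entry.termId });
      }
    });
  });
  return edges;
}

// Invented sample data.
const sampleNodes = [
  { id: 0, term: "vocabulary map", definition: "A graph of each term and its definition." },
  { id: 1, term: "term", definition: "A word that designates a concept." },
  { id: 2, term: "definition", definition: "A statement of the meaning of a term." }
];
const sampleTerms = sampleNodes.map((n) => ({ term: n.term, termId: n.id }));

console.log(buildEdges(sampleNodes, sampleTerms));
```

Here the node for "vocabulary map" points to "term" and "definition", and the node for "definition" points to "term".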

Options:

options = {
  const myOptions = {
    edges: {
      font: {
        size: 12
      }
    },
    nodes: {
      shape: "box",
      font: {
        bold: {
          color: "#0077aa"
        }
      }
    }
  };
  return myOptions;
}

Then an array termsWithId is created from the nodes, attaching the node ID to each term and sorting the terms by length, longest first.

termsWithId = nodes
  .map((currentElement) => ({
    term: currentElement.term,
    termId: currentElement.id
  }))
  // Compare the lengths of the term strings (longest first), not of the wrapper objects
  .sort((a, b) => b.term.length - a.term.length)

The visual map can be created from any uploaded table, simply by replacing the file attached in definedTermsWithTSV.

network = new vis.Network(graph, { edges, nodes, options })        

The result is an interactive force-directed graph, as captured in the snapshot shown earlier.

Although described sequentially here, each part of the notebook is a cell, as in a worksheet, so the order of the cells does not actually matter. Each time the input data change, the graph is automatically recreated (this is what is called reactive).

Limitations and potential future improvements concerning the vocabulary map

  • The matching is case sensitive => the algorithm should be made case insensitive
  • The referenced terms are not highlighted in the definitions => this could be changed, e.g. by underlining the terms in the definition or changing their color
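
Both points can be approached with a regular expression instead of includes. The following is only a sketch under assumptions: findReferencedTerms and underlineTerms are hypothetical helpers, matching is made case insensitive with the i flag, and \b word boundaries are added so that "term" is not found inside "terminology" (this works for plain ASCII terms, not for every language).

```javascript
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the terms that a definition refers to, ignoring case.
function findReferencedTerms(definition, terms) {
  return terms.filter((term) =>
    new RegExp(`\\b${escapeRegExp(term)}\\b`, "i").test(definition)
  );
}

// Underlines every referenced term, keeping the casing used in the definition.
function underlineTerms(definition, terms) {
  if (terms.length === 0) return definition;
  // Longest terms first, so multi-word terms are preferred over their parts.
  const sorted = [...terms].sort((a, b) => b.length - a.length);
  const pattern = new RegExp(`\\b(?:${sorted.map(escapeRegExp).join("|")})\\b`, "gi");
  return definition.replace(pattern, "<u>$&</u>");
}

const text = "A Definition explains a term used in a terminology.";
console.log(findReferencedTerms(text, ["term", "definition"]));
console.log(underlineTerms(text, ["term", "definition"]));
```

Aliases, synonyms and other languages remain out of reach of this kind of pattern matching.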

Also, it does not handle aliases or synonyms, and it is not multilingual.

Addressing this would require relying on other technologies, such as ontology definition languages (e.g. OWL 2) or Natural Language Processing techniques.

The content of the map relates to the concepts defined for creating a terminology. More generally, it could be enriched from complementary disciplines, such as semiotics or ontology, and from their concrete usage through knowledge bases, semantic web technologies, Natural Language Processing or model-based approaches (System Engineering, Enterprise Architecture, etc.).

The article "What is semiotics? An introduction to the use of the courageous uninitiated" is written in French and gives quite interesting input on language and how it relates to terminology. It provides a context of usage for these definitions, which will probably have to be extended. Here is the list of terms, sorted by size, which is used for searching within the definitions and for creating the edges tagged "references".

Conclusion

In this notebook, we prototyped the creation of a vocabulary map as an interactive force-directed graph relying on vis.js. For this, we explored the different ways of providing the data, coming from a simple worksheet and made available as a CSV or TSV file. We also described how to enter them directly as CSV in a cell.

We detailed how the graph is created from this data, with transformations performed with map to build the set of nodes and the set of arcs that make up the displayed network. We had to find appropriate display options so that the terms and definitions, with their cross references, are shown in a way that makes the dependencies easy to see and to analyse.

Observable makes it quite simple to produce interactive documents that integrate interactive visualizations from different kinds of datasets and data sources. It opens new opportunities for providing dynamic content to Content Management Systems. I really enjoyed using it, and I'm quite excited about its potential.

The vocabulary map itself comes with some limitations, in particular concerning word sense disambiguation. Several ways will be explored later on, relying on concrete examples, in order to produce semantic cartographies linking natural language and information systems relying on automation, and ultimately Standards which are Machinable, Applicable, Readable and Transferable.

More to come. Let's follow ...
