COMPARATIVE DATA MODELS

By W H Inmon

When the first program was written, there wasn’t much need for a data model. But as the corporation grew and applications multiplied, it became obvious that a data model was needed. There are many good reasons for having a data model in the modern IT environment:

• With a data model you can look across vistas of data

• A data model is essential in finding ways to meaningfully connect disparate data elements

• A data model sets the stage for future expansion

These are just a few of the many reasons for a data model.

Mature IT organizations have known about the need for a data model for years. But the data model that almost all companies know is one based on structured data. And at one time, structured data was pretty much all that the corporation had.

But today there is a new and important type of data that has surfaced in the corporation that is not structured. That type of data is textual data. There is tremendous business value tied up in textual data that is just now being unlocked and explored. And as in the case of structured data, there is a need for a data model for textual data as well.

However, the data model for textual data looks nothing like the data model for structured data. Ironically, the two different types of data models serve pretty much the same purpose for their types of data. But the actual models themselves are as different as chalk and cheese.

The essential elements of the structured data model include:

• A record

• A key

• An attribute

• Some relationships

The essential elements of the textual data model include taxonomies.
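
As a rough illustration of the contrast, the sketch below puts the two kinds of model side by side in Python. The `Customer` and `Order` records, their field names, and the sample tree names are assumptions made up for this example; they do not come from any particular system.

```python
from dataclasses import dataclass


# Structured model: a record with a key, attributes, and a relationship.
@dataclass(frozen=True)
class Customer:
    customer_id: int  # key
    name: str         # attribute
    city: str         # attribute


@dataclass(frozen=True)
class Order:
    order_id: int     # key
    customer_id: int  # relationship: refers to Customer.customer_id
    amount: float     # attribute


# Textual model: a taxonomy is a classification plus the words that belong to it.
# There is no record, no key, and no attribute (hypothetical contents).
TREE_TAXONOMY = {"tree": ["oak", "elm", "pine", "maple"]}

customer = Customer(customer_id=1, name="Acme Ltd", city="Denver")
order = Order(order_id=100, customer_id=customer.customer_id, amount=250.0)
```

In the structured half, the meaning of each value comes from the field it sits in. In the taxonomy half, the word list says nothing until it is matched against actual text.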

The following figure illustrates these types of data models.


Most people are familiar with the classical structured data model. And most people are not familiar at all with the textual data model – the taxonomy.

So how does the textual data model differ from the structured data model?

There are many differences:

1) The textual data model demands that context be inferred. The structured data model has context too, but there the context is embedded in the model itself, whereas in the textual data model the context is inferred from the text itself. This is a very large, very stark, very important difference.

2) The textual taxonomy is nothing more than a list of words, all of which relate to a common classification. An interesting aspect of the taxonomy is that it is rarely if ever complete. Consider a taxonomy of the types of trees. There is not an infinite number of types of trees, but many types grow in only one locale and are obscure or unknown elsewhere in the world. This indeterminate list of elements is common to almost all taxonomies. The elements that should go into the taxonomy are not every possibility, but every reasonable and likely possibility that is relevant to the data being addressed. The probability that an element is relevant to the text being addressed determines whether or not it should be included in the taxonomy.

3) Taxonomies can be for ANY classification. There are indeed an infinite number of possible classification types. The architect has to select the classification types that are relevant to the text being analyzed. For example, a designer probably would not include a taxonomy of the types of African hippopotamus when analyzing South American soccer results.

4) Once the taxonomies are selected, there can be interrelationships among the elements of the different taxonomies relevant to the text being analyzed.

5) Taxonomies are like Lego bricks. They can be combined and interrelated on an as-needed basis.

6) There are no records, keys or attributes in the taxonomy.

7) Structured data is usually created by the occurrence of an event. Textual elements that belong to the taxonomy exist outside of the business of the corporation, and a few elements may be irrelevant or tangential to the text that is being analyzed.

And there are probably many other differences between these two types of data models. But this list covers most of the ground.
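
A minimal Python sketch of several of these points follows: taxonomies as plain word lists, context inferred from the text itself, taxonomies combined as needed, and elements kept only when they are relevant to the text at hand. The taxonomy contents and the `tokenize`, `classify`, and `prune` helper names are assumptions chosen for illustration. Treating "occurs in the corpus" as the test of relevance is a deliberate simplification, and real textual analysis needs far more than word lookup.

```python
import re

# Each taxonomy is a classification name plus a list of words (hypothetical contents).
TAXONOMIES = {
    "tree": ["oak", "elm", "pine", "maple"],
    "symptom": ["fever", "cough", "rash"],
    "soccer": ["goal", "penalty", "offside"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, taxonomies):
    """Infer context: return (word, classification) pairs found in the text."""
    lookup = {}
    for classification, words in taxonomies.items():
        for word in words:
            lookup.setdefault(word, []).append(classification)
    return [(tok, c) for tok in tokenize(text) for c in lookup.get(tok, [])]


def prune(words, corpus):
    """Keep only the taxonomy words that actually occur in the corpus."""
    seen = set()
    for doc in corpus:
        seen.update(tokenize(doc))
    return [w for w in words if w in seen]


text = "The patient reported a fever and a cough after the penalty shootout."

# Combine only the taxonomies relevant to this text, like snapping bricks together.
selected = {name: TAXONOMIES[name] for name in ("symptom", "soccer")}
matches = classify(text, selected)

# Trim a taxonomy down to the elements that matter for a given body of text.
relevant_trees = prune(TAXONOMIES["tree"], ["An old oak stood beside a pine."])
```

Here `matches` pairs each recognized word with the classification it was found under, which is the inferred context, and `relevant_trees` keeps only the tree names that appear in the sample sentence.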

In any case, data models, both structured and textual, are an essential part of the mature IT organization's plans to cope with the future and the chaos of today.

Bill Inmon lives in Denver, Colorado with his wife and his two Scotty dogs, Jeb and Lena. The other night Bill accidentally stepped on Jeb’s foot. Not a hard step but one that scared Jeb (and Bill). The next morning Jeb had forgiven Bill. Jeb got his belly rubbed. Then he wanted his cookie, which he got.

Thomas Otto

Experienced leader of Data Teams, Customer Success or Prof. Services. Exposure to vendor, consulting and end-user organisations.

7 months ago

We did an X (back then Twitter) Analytics that grabs tweets in NZ to share msgs about sicknesses. We pulled data through the API into a classical DWH. Had a taxonomy and very simple table that helped us contextualise the conversation and classify. By far nothing super special, but enough. We combined the "cleansed" and now-structured data into a dim model (sorry Kimball ;-) ) and analysed it with a BI Tool. Including a map to show hot-spots of potential outbreaks. You could pick a disease or symptom (a list of relevant ones) and see. Great early warning system. We correlated the noisy/low-qual data with current, low latency reference (cases data) to verify the validity and to adjust or interpret the "correct way". We could do so as we also hosted the national disease db, that was filled by GP, hospitals etc. Without that, the whole thing would be fancy geek trash ... This was back in 2014.

Scott Johnson

Transforming Data into Strategic Assets | GCP Certified | AI & Machine Learning Innovator | Master Data Management (MDM) Expert, Google Cloud Innovator, Informatica Partner

7 months ago

Good to know! This is great!!

Karim El Helaly

Attorney at Law

7 months ago

Insightful!

Cindi Meyersohn

Chief Operating Officer at DataVaultAlliance Holdings and President of DataRebels

7 months ago

Excited for your new book, Bill, and the progression on your new ventures!! Sending my love to you and Sylvia.

John O'Gorman

Disambiguation Specialist

7 months ago

Bill Inmon - "taxonomies can be for ANY classification. There are indeed an infinite number of possibilities of classification types." Yeah, there's the rub. To Robert Vane's point (I think) structured and unstructured information use exactly the same vocabularies. Elements of both also have the same classification and attribution requirements *independent of their very different structural representations.* The problem with taxonomies and domain-specific ontologies is the same one we find with tables, schema, and models: They don't play well with others since they prefer to "live in their own domains" also known as silos. If the objective is to get structured collections to talk to other structured collections - and by extension have meaningful "conversations" with their less structured cousins - then we have to start with the identification, classification, and semantic equivalences of business language, including metadata. From there you can orchestrate existing and/or build new collections independent of structural considerations.
