COMPARATIVE DATA MODELS
Bill Inmon
Founder, Chairman, CEO, Best-Selling Author, University of Denver & Scalefree Advisory Board Member
By W H Inmon
When the first program was written, there wasn’t much need for a data model. But as the corporation grew and many applications started to appear, it became obvious that a data model was needed. There are many good reasons for having a data model in the modern IT environment –
• With a data model you can look across vistas of data
• A data model is essential in finding ways to meaningfully connect disparate data elements
• A data model sets the stage for future expansion
These are among the many reasons for a data model.
Mature IT organizations have known about the need for a data model for years. But the data model that almost all companies know is one based on structured data. And at one time, structured data was pretty much all that the corporation had.
But today there is a new and important type of data that has surfaced in the corporation that is not structured. That type of data is textual data. There is tremendous business value tied up in textual data that is just now being unlocked and explored. And as in the case of structured data, there is a need for a data model for textual data as well.
However, the data model for textual data looks nothing like the data model for structured data. Ironically, the two different types of data models serve pretty much the same purpose for their types of data. But the actual models themselves are as different as chalk and cheese.
The essential elements of the structured data model include –
• A record
• A key
• An attribute
• Some relationships
The essential elements of the textual data model include taxonomies.
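The contrast can be sketched roughly in code. The names used here (`Customer`, `Order`, `TREES`) are hypothetical and chosen only for illustration. The structured model declares its records, keys, attributes and relationships up front, while the taxonomy is simply a classification and a list of words.

```python
from dataclasses import dataclass

# Structured data model: records with keys, attributes, and a relationship.
# Customer and Order are hypothetical examples, not taken from the article.
@dataclass(frozen=True)
class Customer:
    customer_id: int   # key
    name: str          # attribute

@dataclass(frozen=True)
class Order:
    order_id: int      # key
    customer_id: int   # relationship: refers to Customer.customer_id
    amount: float      # attribute

# Textual data model: a taxonomy is a classification plus a list of words.
# It has no records, keys, or attributes.
TREES = {"classification": "tree", "words": ["oak", "elm", "pine", "maple"]}

customer = Customer(customer_id=1, name="Acme")
order = Order(order_id=100, customer_id=customer.customer_id, amount=250.0)
```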
The following figure illustrates these types of data models.
Most people are familiar with the classical structured data model. And most people are not familiar at all with the textual data model – the taxonomy.
So how does the textual data model differ from the structured data model?
There are many differences –
1) The textual data model demands that context be inferred. The structured data model also has context, but there the context is embedded in the model itself, whereas in the textual data model the context is inferred from the text itself. This is a very large, very stark, very important difference.
2) The textual taxonomy is nothing more than a list of words, all of which relate to a common classification. An interesting aspect of the taxonomy is that it is rarely if ever complete. Consider a taxonomy of the types of trees. There is not an infinite number of types of trees, but many types grow in only one locale and are obscure or unknown in the rest of the world. This open-ended list of elements is common to almost all taxonomies. The elements that should go into the taxonomy are not every possibility but every reasonable and likely possibility that is relevant to the data being addressed. The probability that an element will be relevant to the text being addressed determines whether or not it should be included in the taxonomy.
3) Taxonomies can be for ANY classification. There are indeed an infinite number of possible classification types. The architect has to select the classification types that are relevant to the text being analyzed. For example, a designer probably would not include a taxonomy of the types of African hippopotamus when analyzing South American soccer results.
4) Once the taxonomies are selected, there can be interrelationships among the elements of the different taxonomies relevant to the text being analyzed.
5) Taxonomies are like Lego bricks. They can be combined and interrelated on an as-needed basis.
6) There are no records, keys or attributes in the taxonomy.
7) Structured data is usually created by the occurrence of an event. Textual elements that belong to the taxonomy exist outside of the business of the corporation, and a few elements may be irrelevant or tangential to the text that is being analyzed.
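Several of the points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular product: the taxonomies, the `infer_context` helper and the simple whole-word matching are all invented for the example.

```python
import re

# Hypothetical taxonomies: each classification maps to a set of words.
# Neither is complete; each holds only words likely to appear in the text.
TAXONOMIES = {
    "tree": {"oak", "elm", "pine", "maple"},
    "disease": {"flu", "measles", "asthma"},
}

def infer_context(text, taxonomies):
    """Return (word, classification) pairs for every taxonomy word found in text."""
    words = re.findall(r"[a-z]+", text.lower())
    found = []
    for word in words:
        for classification, members in taxonomies.items():
            if word in members:
                found.append((word, classification))
    return found

# Taxonomies combine like building blocks: select only those relevant to the text.
relevant = {name: TAXONOMIES[name] for name in ("tree",)}

print(infer_context("The old oak fell on the pine fence.", relevant))
```

Adding the `disease` taxonomy to `relevant` would let the same function classify a different kind of text, with no record layout, key or attribute to redesign.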
And there are probably many other differences between these two types of data models. But this list covers most of the ground.
In any case, data models – structured and textual – are an essential part of the mature IT organization's plans to cope with the future and the chaos of today.
Bill Inmon lives in Denver, Colorado with his wife and his two Scotty dogs – Jeb and Lena. The other night Bill accidentally stepped on Jeb’s foot. Not a hard step, but one that scared Jeb (and Bill). The next morning Jeb had forgiven Bill. Jeb got his belly rubbed. Then he wanted his cookie, which he got.
Experienced leader of Data Teams, Customer Success or Prof. Services. Exposure to vendor, consulting and end-user organisations.
7 months ago
We did an X (back then Twitter) analytics project that grabbed tweets in NZ sharing messages about sicknesses. We pulled data through the API into a classical DWH. We had a taxonomy and a very simple table that helped us contextualise the conversation and classify it. By far nothing super special, but enough. We combined the "cleansed" and now-structured data into a dimensional model (sorry Kimball ;-) ) and analysed it with a BI tool, including a map to show hot-spots of potential outbreaks. You could pick a disease or symptom (from a list of relevant ones) and see. Great early warning system. We correlated the noisy, low-quality data with current, low-latency reference data (case data) to verify the validity and to adjust or interpret it the "correct way". We could do so as we also hosted the national disease DB, which was filled by GPs, hospitals etc. Without that, the whole thing would be fancy geek trash ... This was back in 2014.
Transforming Data into Strategic Assets | GCP Certified | AI & Machine Learning Innovator | Master Data Management (MDM) Expert, Google Cloud Innovator, Informatica Partner
7 months ago
Good to know! This is great!!
Attorney at Law
7 months ago
Insightful!
Chief Operating Officer at DataVaultAlliance Holdings and President of DataRebels
7 months ago
Excited for your new book, Bill, and the progression on your new ventures!! Sending my love to you and Sylvia.
Disambiguation Specialist
7 months ago
Bill Inmon - "taxonomies can be for ANY classification. There are indeed an infinite number of possibilities of classification types." Yeah, there's the rub. To Robert Vane's point (I think), structured and unstructured information use exactly the same vocabularies. Elements of both also have the same classification and attribution requirements *independent of their very different structural representations.* The problem with taxonomies and domain-specific ontologies is the same one we find with tables, schemas, and models: they don't play well with others, since they prefer to "live in their own domains", also known as silos. If the objective is to get structured collections to talk to other structured collections, and by extension have meaningful "conversations" with their less structured cousins, then we have to start with the identification, classification, and semantic equivalences of business language, including metadata. From there you can orchestrate existing collections and/or build new ones independent of structural considerations.