TELLING YOUR DATA STORY
Chris is the proud owner of the most popular grocery store in town. He has spent his whole life building it. There is a special cafeteria and a playground within the store, which has turned this grocery store into a community centre. Elderly people meet and greet, youngsters hang around and kids play. On their way back home, they purchase their daily necessities from the store.
Joe is Chris's granddaughter. She recently graduated from college and joined the family business. Joe sees an opportunity to scale up the business and reduce the running costs at the same time. She proposed two innovative ideas to her grandfather to implement this Christmas.
#1, she proposed to take away the burden of filling in the attendance register from the employees. This will save their time. Additionally, it will remove the need for the supervisor who maintains the attendance register. So Joe installed a sophisticated camera which recognises employees when they enter the building. The system behind the camera logs the employee in the attendance register.
#2, Joe proposed to open a temporary gift shop within the grocery store so that customers need not roam around multiple places for their Christmas shopping. There was one challenge, though: this would require some temporary employees to work in the gift shop during the Christmas holidays. But who works during Christmas? Joe got a brilliant idea. She made a job vacancy poster asking “Who wants to help Santa this year?” They got a tremendous response. In fact, the whole town wanted to pitch in and work part-time in the gift shop.
Joe and Chris made a special Christmas-themed uniform to distinguish this special task force from the regular employees. Their employment contract and salary structure are also different from those of the others. After all, they are helpers of Santa Claus!
Everything was going as per plan until the HR team reported that, according to the new attendance registration system, Santa’s helpers were clocking many more hours than their contracts allowed. This would turn up as a huge unplanned salary cost in December. Joe called the IT department to look into the newly installed camera system in case it had a fault. But it was working as designed.
Sunday is Samantha’s regular day to work in the grocery store. She entered the store and sensed a problem. Samantha offered to help with the root cause analysis. Being a Data Management Expert, she started digging into the data structure, data rules and real-time data entries.
She made two major observations:
1. The IT department has treated the uniform as a special attribute of an employee. When the camera detects an employee in the uniform, the system logs that employee's attendance in the attendance register.
2. Some super-excited members of the special workforce, ‘Santa’s helpers’, are coming to the store on their non-working days wearing their special uniform. Some special workers didn’t bother to wear the uniform while coming to work.
Together, these two things are causing additional attendance registrations.
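To make the two observations concrete, here is a minimal Python sketch of the logging rule as the story describes it. Everything in it is an assumption made for illustration: the `Detection` structure, the helper names, the dates and the contracted days are hypothetical and are not taken from the store's actual system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Detection:
    """One camera sighting at the entrance (hypothetical structure)."""
    employee: str
    day: date
    uniform_detected: bool


def log_attendance(detections):
    """Rule from the story: a detected uniform results in an attendance entry."""
    return {(d.employee, d.day) for d in detections if d.uniform_detected}


saturday = date(2024, 12, 21)  # hypothetical contracted working day
sunday = date(2024, 12, 22)    # hypothetical non-working day

# Assumed contracts: both helpers work on Saturday only.
contracted_days = {("Helper A", saturday), ("Helper B", saturday)}

detections = [
    Detection("Helper A", saturday, uniform_detected=True),   # at work, in uniform
    Detection("Helper A", sunday, uniform_detected=True),     # social visit, in uniform
    Detection("Helper B", saturday, uniform_detected=False),  # at work, without uniform
]

logged = log_attendance(detections)
unplanned = logged - contracted_days  # entries logged outside the contract
missed = contracted_days - logged     # contracted days that were never logged

print("Unplanned:", sorted(unplanned))
print("Missed:", sorted(missed))
```

The rule itself behaves exactly as written. The surprise comes from what it assumes about the real world: that wearing the uniform always means working.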
By the way, do you remember Samantha? She is not one of Santa’s helpers. She is just a regular part-time employee in the store. If you can’t recognize her, this is a good time to pause and take a detour to my previous article on data modelling, ‘Data Modelling Simplified’, published on LinkedIn on July 27, 2024.
What caught my attention in the story?
The most fascinating thing in the above story that caught my attention was the data design, especially the fact that the Attendance Register table has a column ‘Is Christmas themed uniform detected’.
Is it a bad design? We can discuss that later. At this point, let's pay attention to something vital:
There is a misalignment between how the real world works and how the IT systems work. I have seen this happen when the data story is not crystal clear.
What is a data story and why create it?
A small business with a limited set of objectives, customers and employees can usually operate manually. In this business story, the store was running well with one hardcover employee register and a supervisor of the register.
A business can grow by taking on more business objectives or by serving more customers. In either case, the number of employees may grow. This growth brings more and more administrative overhead with it. IT systems are built to help business operations run conveniently. But how would an IT system know about the business, or for that matter about the real world?
Well, engineers embed the business story (you may call it the business knowledge as well) into the IT system in the form of a data story. During execution, an IT system generates data. The system also needs to be told the business expectations, so the engineers embed the business expectations as another data story. Finally, the data generated by the execution of the IT system is yet another data story.
A data story is a representation of business knowledge, business requirements and operations in data language. Well-crafted data stories reduce, if not fully eradicate, the misalignments between Business and IT.
Analytics is a tool that businesses rely on to understand past performance and future trends. Although it is basically also an IT system, there is a difference between how operational and analytical IT systems work. Therefore the data story needs to be rewritten in a form that analytical systems, and for that matter artificial intelligence systems, understand.
Finally, businesses that operate in regulated environments, such as the financial or healthcare sectors, need to face the regulators. They most often ask: are you doing the right thing? Are you doing it right? Essentially, all they want to hear is your data story.
How to create the Data Story?
As defined in the DAMA DMBoK, data modelling is the process of discovering, analysing and scoping data requirements, and then representing and communicating these data requirements in a precise form called a Data Model.
Wouldn’t a Data Model be a structured way to represent and communicate the data story?
In the sections below, I will prepare a prototype of a data model which is in line with the way the business operates. The process of preparing the data model is called Data Modelling. The data model can be embedded into the operational IT systems that use these data. It can be used as input to the data modelling for the analytical data models. Furthermore, it can be presented to the regulator to substantiate that the business is doing the right thing in the right way.
Let’s begin data modelling, keeping in mind that we are creating a prototype to get an idea of the process.
Data Modelling
Step – 1 : Identify the entities
In data modelling, an entity is a thing about which an organization collects information. These are sometimes referred to as nouns of the organization.
In the above business story, the employee is a thing about whom the business is collecting attendance information. The uniform is another thing: the business would like to know whether an employee comes to the store wearing it or not. The attendance register is also a thing, where the business keeps track of the employee information it wants to see.
Thus three entities have been discovered: (1) Employee, (2) Uniform and (3) Attendance Register.
Tips & tricks : If you are searching for entities, pay attention to the answers to the fundamental questions – who, what, when, where, why or how – or combinations of these questions. There lie the secrets.
Step – 2 : Create definition of the entities
A precise definition describes what the entity is. It helps to understand, visualise or describe the entity. Let’s create some basic definitions for the entities identified in the previous step.
Employee is a natural person who provides services to the organization for a set number of hours in exchange for payment in the form of salary or benefits.
Uniform is a standardized set of clothing worn by the employee.
Attendance Register is a storage where the organization logs the information about presence of the employees.
Now the entities, each representing one tangible (or intangible) thing in the real world, are ready. At this stage they are also called Concepts or Terms.
Tips & tricks : If you are describing how to use a real-world thing, that is not a definition. Stick to describing what it is.
Step – 3 : Identify the relationships between the entities
Let us assume that, as per the store policy, an employee can have zero or one uniform. A uniform can belong to zero or one employee of the store. An attendance register can have zero or multiple employee attendance records.
Wait a second, what is an attendance record?
Step – 1A, 2A and 3A : Identify and define the entity and identify the relationships
An attendance record is a thing that describes whether an employee is present or absent in the store on a particular day. Note that this is an intangible concept that the organization is interested in knowing about. Hence we consider it an Entity.
Attendance Record is a thing that represents the presence of an employee in the store on a particular day.
An employee can have zero or multiple Attendance Records. An Attendance Record belongs to only one employee. An Attendance Register can have zero or multiple Attendance Records, whereas an Attendance Record belongs to only one Attendance Register.
At this stage we are ready to create the first draft of the Conceptual Data Model.
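As a rough illustration, the relationship statements from Step 3 and Step 3A can be written down as plain data. This is only a sketch: the `Relationship` class and its `describe` method are hypothetical names of my own, and the last entry (an Attendance Record belongs to exactly one Attendance Register) is my reading of the text rather than a stated store policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    """One direction of a relationship between two entities (hypothetical helper)."""
    source: str
    target: str
    minimum: int
    maximum: Optional[int]  # None stands for "multiple"

    def describe(self) -> str:
        upper = "many" if self.maximum is None else str(self.maximum)
        return f"One {self.source} relates to {self.minimum}..{upper} {self.target}"


CONCEPTUAL_MODEL = [
    Relationship("Employee", "Uniform", 0, 1),
    Relationship("Uniform", "Employee", 0, 1),
    Relationship("Employee", "Attendance Record", 0, None),
    Relationship("Attendance Record", "Employee", 1, 1),
    Relationship("Attendance Register", "Attendance Record", 0, None),
    # Assumption: each record sits in exactly one register.
    Relationship("Attendance Record", "Attendance Register", 1, 1),
]

for relationship in CONCEPTUAL_MODEL:
    print(relationship.describe())
```

Writing each relationship in both directions forces you to state the minimum and maximum on each side, which is where business assumptions tend to hide.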
Step – 4 : Identify the attributes and create their definitions
An attribute is a property that identifies, describes or measures an entity.
Let’s take the Employee entity. In the above story we can see some attributes, such as Employee Id, First Name, Last Name, Workday and Uniform. I will use my author’s privilege to imagine and define them.
Employee Id : a unique identification of an employee.
First Name : the initial part of a person’s name that distinguishes an individual
Last Name : the part of a person’s name that is shared by other members of their family
Attendance Id : identifies one attendance status of an employee on a day
Date of attendance : identifies the date when the employee was present in the store
Attendance Status : identifies whether the employee is present or absent
With these attributes, we are now able to create the first draft of the logical data model.
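A first draft of the logical model could be sketched in Python as follows. The attribute names come from the definitions above; the data types, the `employee_id` reference on the attendance record, the `AttendanceStatus` values and the sample last name are my assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class AttendanceStatus(Enum):
    """Assumed set of allowed values for Attendance Status."""
    PRESENT = "Present"
    ABSENT = "Absent"


@dataclass(frozen=True)
class Employee:
    employee_id: int  # unique identification of an employee (type assumed)
    first_name: str
    last_name: str


@dataclass(frozen=True)
class AttendanceRecord:
    attendance_id: int
    employee_id: int  # assumption: links the record to exactly one employee
    date_of_attendance: date
    attendance_status: AttendanceStatus


# Sample data: the last name "Example" is a placeholder, not part of the story.
samantha = Employee(employee_id=1, first_name="Samantha", last_name="Example")
record = AttendanceRecord(
    attendance_id=1,
    employee_id=samantha.employee_id,
    date_of_attendance=date(2024, 12, 22),  # a Sunday, Samantha's regular day
    attendance_status=AttendanceStatus.PRESENT,
)
print(record)
```

Nothing here depends on a database or any other technology yet, which is the point of a logical model.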
Step – 5 : Interpret the data model the way your system understands
Up to the logical data model, everything was technology-agnostic. Now we come to the technical part. IT systems have an underlying data storage. The logical data model needs to be translated further into the physical data model, keeping all technical considerations in mind. This is called Physical Data Modelling.
For simplicity, let’s assume the IT system of the grocery store has one relational database to store data. A relational database stores data in tables and columns. If we follow the simplest rule, Entity = Table and Attribute = Column, the first draft of the physical data model can be prepared.
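Applying the Entity = Table and Attribute = Column rule, a first physical draft might look like the sketch below. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in for the store's database. The table names, column types, the `employee_id` foreign key and the sample row are assumptions, and only the entities whose attributes were defined in Step 4 are covered.

```python
import sqlite3

# Hypothetical DDL: one table per entity, one column per attribute.
DDL = """
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL
);
CREATE TABLE attendance_record (
    attendance_id      INTEGER PRIMARY KEY,
    employee_id        INTEGER NOT NULL REFERENCES employee (employee_id),
    date_of_attendance TEXT NOT NULL,
    attendance_status  TEXT NOT NULL
        CHECK (attendance_status IN ('Present', 'Absent'))
);
"""

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(DDL)

# Sample data: the last name "Example" is a placeholder.
conn.execute("INSERT INTO employee VALUES (1, 'Samantha', 'Example')")
conn.execute("INSERT INTO attendance_record VALUES (1, 1, '2024-12-22', 'Present')")

rows = conn.execute(
    """
    SELECT e.first_name, a.date_of_attendance, a.attendance_status
    FROM employee AS e
    JOIN attendance_record AS a ON a.employee_id = e.employee_id
    """
).fetchall()
print(rows)
```

The foreign key and the `CHECK` constraint are examples of technical considerations that the logical model leaves open and the physical model has to decide.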
Step – 6 : Repeat 1 to 5
The key to getting a comprehensive data model is iteration. When you restart from step 1, what would you pick next? There are two options.
Option-1 : Take a new piece of data, understand the story behind it and write the story as a data model. For example, the business story mentions part-time work. Employees work on different days. The Employee table also had a column named Work day. It’s worth exploring further because this may have a direct connection to the core business problem, attendance.
Option-2 : Take the same piece of data and dig deeper into the business story. This iteration will allow you to fine-tune the data model and make this piece more precise. For example, in the prototype we briefly touched upon the entity Uniform. It’s worth finding out the complete business story behind it because this also has some correlation with the attendance checking.
My personal choice is to establish a clear big-picture view first, then dig deeper and deeper into each piece and part, because this approach creates alignment along with clarity. I would definitely prefer this approach for cross-functional data elements. For data elements which are relevant to one domain (one part of the business), we can be flexible.
Conclusion
In this article, I have used multiple terms interchangeably, such as business story, data story and data model. Don’t get confused by the jargon; in the end they are different representations of the same thing. Language plays a crucial role in storytelling. We tell the same story in different languages to people in different demographics, right? Similarly, different stakeholders need different representations of the same thing. After all, perspective matters.
I would like to draw your attention to a vital point. Data modelling is an iterative and continuous process. The concept of getting it right the first time doesn’t apply here. During the prototype I intentionally took a detour. In practical situations, data modellers iterate hundreds of times to reach one decent prototype.
Time, patience and a little pressure turn coal into a diamond. I recommend the same recipe to turn business stories into well-rounded data stories.
Writing a data story in the form of data models is one thing, but making the data model precise is the next level.
Like diamonds, data models also need cuts to sparkle.
Are you wondering how to add the cuts to a data model? Well, I will save that story for another time. For now, I wish you a (belated) Merry Christmas and a Splendid New Year.