How to distinguish facts, attributes and relations from bullshit ?
Sylvain Arbaudie
// performance and problem-solving minded data architect // jack of all MariaDB trades (and master of a few) // my posts reflect my personal views // praise the Omnissiah //
We've seen in my previous article ( who said rant ? ) how to easily normalize database items, but how do we know what's what ? It is crucial to clearly define and differentiate between facts, attributes, and relationships to create an efficient and evolutive design . Here’s how you can identify each component and filter out unnecessary information.
Know the domain
It should make perfect sense that nobody ever created a decent data design on a domain it didn't know beforehand. To gather that knowledge or learn the missing bits : collect detailed requirements from stakeholders to understand what data needs to be stored and how it will be used. Without this intimate knowledge, it is impossible to filter out unnecessary information ( aka bullshit ).
Now let's begin by defining the topics :
Ain't it easy(peasy) ? Not that much indeed. Let's now have a look to a method that helps identifying the aforementionned 4 categories ( facts, attributes, relations and bullshit ).
Use natural language
Natural language descriptions are invaluable in distinguishing attributes from facts and relationships : a fact is usually a noun that refers to a core concept of the domain we are modeling. Attributes do provide specific details about the previous item (the fact), and natural language helps identify these details through descriptive terms and context. Relations are very often expressed by verbs linking facts between themselves.
Example "Our e-commerce platform needs to manage customers, their orders, and the products they purchase. Customers have personal contact details such as name, first name and brithdate, and unique public ID. Each order includes an order date and can contain multiple products, each with a name and price."
领英推荐
Caveats
There are three main mistakes that can lead to wrong design. While it usually happens to attributes, some of those could alos lead to wrongly consider an item as a fact.
The two most infamous caveats are phone number and address. The true wording about phone is : "My phone operator granted me a line publicly identified by this phone number". The number has an existence outisde the person ( and also the device used to answer the calls ) : those are different facts linked toghether by relations. Simiarly the true wording about address is : " i live in/possess a home/building/plot of land that sits at this address." Address is a geographical beacon that exists wether there is something or not, whomever possesses the place sitting there. Address is a fact that is linked to a person by a relation. In both cases linking directly the address fact to the person is a simplification but a necessary one most of the time.
False friends
The most infamous false friend is age. Despite being seemingly intrinsic age is not immutable. It is also not that intrinsic since all the people born the same day ahev the very sdame age. What should trigger your understanding is that age is a calculation that need knowing another attribute : birthdate and the day of calculation date. This is a dynamic attribute so it should not be stored nor even considered at this stage of data design.
Conclusion
Using natural language along with this structured approach ensures that you correctly identify the core entities (facts) in your database design, providing a solid foundation for further detailing (attributes &relations) and implementation. Beware to truly addressing the caveats and being aware of false friends. By strictly following those guidelines professionals in charge of data design can effectively identify and define attributes that accurately represent the data requirements of the system, fostering a robust and reliable database schema.
// performance and problem-solving minded data architect // jack of all MariaDB trades (and master of a few) // my posts reflect my personal views // praise the Omnissiah //
4 个月any comments /.thoughts ? Federico Razzoli Patrizio Tamorri Richard Bensley Richard M. Sebastien GIRAUD Anders Karlsson Kathryn Sizemore Kaj Arn? Michael Amadi MariaDB Foundation MariaDB Michael "Monty" Widenius Alejandro Duarte Daniel Black Oli Sennhauser ?