What is data integrity?
Consider that one of your colleagues accidentally deletes the record of your customer, Mark Doe. You not only lose information about a valuable customer, but you also risk losing your relationship with Jodie Doe and Sally Doe, his family members. In another case, after migrating data to the cloud, you find a ton of duplicated data, jeopardizing your operations for two whole days while data engineers try to resolve the issues.
Broken data attributes and relationships, data theft in a virus attack, or a server crash resulting in loss of data are the nightmares of data-driven organizations using data for real-time analytics and decision-making. They need undamaged data that represents real-world entities correctly and consistently. Compromised or poor-quality data can never build trust in their decisions. They need an assurance of data integrity to accept that the data powering their decisions is trustworthy.
What does data integrity mean?
Data integrity refers to the completeness, accuracy, consistency, and security of data throughout its entire life. One indicator of integrity is that any two instances of the same data show no difference, which signifies that the data is intact. Data integrity uses a collection of processes, rules, and standards designed to keep data undamaged and accurate over its life, wherever it may move.
You will find that integrity is not just about data but also about its relationships. For example, if a customer address is changed, all relationships of that attribute must get updated. As data moves and gets transformed across enterprise systems, integrity ensures that data remains intact and correctly connected.
Data integrity is traditionally considered a dimension of data quality. But operationally, you will find it aligned more to data governance. It implements rules and processes to assure data quality while data is entered, stored, moved, and used across systems.
In other words, data integrity uses rules and processes to protect your data from damage during enterprise operations. And it leverages data security to defend your data from any outside damage.
What are the different types of data integrity?
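You can maintain data integrity at two different levels, physical and logical. Physical data integrity protects data while it is stored and retrieved, for example against hardware faults or power loss.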
Logical data integrity is essential for assuring trusted decisions and regulatory compliance. Largely a feature of the database, logical data integrity is of four different types.
Entity integrity
Entity integrity ensures that every record can be uniquely identified, so no record is duplicated and no key field is null. It uses the concept of primary keys, the unique values that identify pieces of data. It is a feature of relational systems that store data in tables, which can be linked and used in a variety of ways.
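As a rough illustration of a primary key at work, here is a minimal sketch using Python's built-in sqlite3 module. The customers table, its columns, and the helper names are hypothetical and chosen only for this example. NOT NULL is declared explicitly because SQLite does not imply it for every primary key column.

```python
import sqlite3


def make_customer_table() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE customers (
            customer_id TEXT PRIMARY KEY NOT NULL,  -- unique and never null
            full_name   TEXT NOT NULL
        )
        """
    )
    return conn


def try_insert(conn: sqlite3.Connection, customer_id, full_name) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO customers (customer_id, full_name) VALUES (?, ?)",
            (customer_id, full_name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_customer_table()
print(try_insert(conn, "C001", "Mark Doe"))   # accepted
print(try_insert(conn, "C001", "Mark Doe"))   # rejected: duplicate primary key
print(try_insert(conn, None, "Jodie Doe"))    # rejected: null primary key
```

The database itself refuses the duplicate and the null key, so the rule holds no matter which application writes to the table.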
Referential integrity
Referential integrity ensures that data is stored and used uniformly. It uses the concept of foreign keys, which either refer to a primary key value of another table or are null. A null foreign key indicates that the relationship either does not exist or is unknown. Rules about how foreign keys may be used are embedded into the database structure. These rules can define constraints to eliminate duplicate data and guarantee data accuracy.
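The foreign key behavior described above can be sketched with Python's built-in sqlite3 module. The customers and orders tables and the helper names are assumptions made for illustration. SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on for the connection.

```python
import sqlite3


def make_shop_db() -> sqlite3.Connection:
    """Create hypothetical customers and orders tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers ("
        " customer_id TEXT PRIMARY KEY NOT NULL,"
        " full_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id TEXT REFERENCES customers(customer_id),"  # may be NULL
        " amount REAL NOT NULL)"
    )
    conn.execute("INSERT INTO customers VALUES ('C001', 'Mark Doe')")
    return conn


def accepted(conn: sqlite3.Connection, sql: str, params=()) -> bool:
    """Run one statement; return False if an integrity constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_shop_db()
insert = "INSERT INTO orders (customer_id, amount) VALUES (?, ?)"
print(accepted(conn, insert, ("C001", 40.0)))   # accepted: customer exists
print(accepted(conn, insert, ("C999", 40.0)))   # rejected: no such customer
print(accepted(conn, insert, (None, 15.0)))     # accepted: relationship unknown
# Deleting a customer that orders still point to would leave orphaned rows.
print(accepted(conn, "DELETE FROM customers WHERE customer_id = 'C001'"))  # rejected
```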
Domain integrity
Domain integrity ensures the accuracy of each piece of data in a domain. A domain is a set of acceptable values a column can contain. The defined constraints limit the format, type, and amount of data entered. For example, the rules prevent users from entering wrong information in the birth date field.
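One way to picture a domain is as a CHECK constraint on a column. The sketch below uses Python's built-in sqlite3 module. The people table, the status values, and the 1900-01-01 lower bound are illustrative assumptions, not rules taken from any particular system.

```python
import sqlite3


def make_people_table() -> sqlite3.Connection:
    """Create a hypothetical people table whose columns have restricted domains."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE people (
            person_id  INTEGER PRIMARY KEY,
            birth_date TEXT NOT NULL
                -- date() returns the normalized YYYY-MM-DD form, or NULL for
                -- input it cannot parse, so only well-formed dates pass.
                CHECK (date(birth_date) IS birth_date
                       AND birth_date >= '1900-01-01'),
            status     TEXT NOT NULL
                CHECK (status IN ('active', 'inactive'))
        )
        """
    )
    return conn


def try_add_person(conn: sqlite3.Connection, birth_date: str, status: str) -> bool:
    """Return True if the values fall inside the column domains."""
    try:
        conn.execute(
            "INSERT INTO people (birth_date, status) VALUES (?, ?)",
            (birth_date, status),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_people_table()
print(try_add_person(conn, "1990-05-17", "active"))    # accepted
print(try_add_person(conn, "17/05/1990", "active"))    # rejected: wrong format
print(try_add_person(conn, "1850-01-01", "active"))    # rejected: out of range
print(try_add_person(conn, "1990-05-17", "unknown"))   # rejected: not an allowed status
```

A rule such as "the birth date must not be in the future" depends on the current date, so in this sketch it would be checked in the application before the insert rather than in the CHECK constraint.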
User-defined integrity
User-defined integrity provides additional rules and constraints to meet specific user requirements. It is typically used when entity, referential, and domain integrity are not sufficient to safeguard data. Business rules are sometimes part of the user-defined integrity constraints.
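A business rule of this kind can be sketched as a database trigger. The example below uses Python's built-in sqlite3 module and a made-up rule: an order may not exceed the customer's credit limit. The tables, the credit_limit column, and the trigger name are all hypothetical.

```python
import sqlite3


def make_credit_db() -> sqlite3.Connection:
    """Create hypothetical tables plus a trigger that enforces a business rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        " customer_id TEXT PRIMARY KEY NOT NULL,"
        " credit_limit REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer_id TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )
    # Business rule (assumed for this sketch): no order above the credit limit.
    conn.execute(
        """
        CREATE TRIGGER enforce_credit_limit
        BEFORE INSERT ON orders
        FOR EACH ROW
        WHEN NEW.amount > (SELECT credit_limit FROM customers
                           WHERE customer_id = NEW.customer_id)
        BEGIN
            SELECT RAISE(ABORT, 'order exceeds customer credit limit');
        END
        """
    )
    conn.execute("INSERT INTO customers VALUES ('C001', 500.0)")
    return conn


def place_order(conn: sqlite3.Connection, customer_id: str, amount: float) -> bool:
    """Return True if the order was stored, False if the trigger rejected it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
        return True
    except sqlite3.DatabaseError:  # parent class of sqlite3.IntegrityError
        return False


conn = make_credit_db()
print(place_order(conn, "C001", 200.0))   # accepted: within the limit
print(place_order(conn, "C001", 900.0))   # rejected by the trigger
```

Putting the rule in a trigger keeps it next to the data. The same rule could instead live in application code, which is easier to change but easier to bypass.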
How can you achieve high data integrity?
As data volume, variety, and speed of arrival increase, managing data becomes challenging. It also exposes data to more risks affecting its integrity.
You can minimize the impact of these factors by limiting data access, using error detection software, and validating data. Regular data backups and data audits also help to improve data integrity.
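Two of these measures, error detection and data validation, can be sketched in a few lines of Python. The example compares SHA-256 checksums to confirm that a copy matches its source, and it lists record keys that appear more than once after a migration. The function names and sample records are hypothetical.

```python
import hashlib
from collections import Counter


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(original: bytes, copy: bytes) -> bool:
    """Error detection: a copy is intact if its checksum matches the original's."""
    return checksum(original) == checksum(copy)


def find_duplicates(keys: list) -> list:
    """Validation: return the keys that occur more than once, sorted."""
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)


source = b"C001,Mark Doe\nC002,Jodie Doe\n"
good_copy = b"C001,Mark Doe\nC002,Jodie Doe\n"
bad_copy = b"C001,Mark Doe\nC002,Jodie Do\n"   # one character lost in transit

print(is_intact(source, good_copy))   # True
print(is_intact(source, bad_copy))    # False
print(find_duplicates(["C001", "C002", "C001", "C003"]))   # ['C001']
```

In practice the checksum would be recorded when the data is written and recomputed after each move or backup, so a mismatch flags the damaged copy.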
What are the best practices for data integrity?
Best practices for achieving high data integrity are focused on the processes of data handling. They also take into account the related practices of data quality, governance, and security.
Getting your business data integrity ready
Data integrity relies on a set of rules and procedures to improve system stability and performance. High data integrity requires that data be unaltered and used correctly. You can use a few questions to assess how your organization scores.
Once you assess what your data processes look like, you can use a simple 5-step process to get your business data integrity-ready.
Improve the overall accuracy, completeness, and reliability of your data sets with data integrity. With Collibra, you are always assured of data integrity to meet the stringent requirements of regulatory compliance. You can also take the pain out of your data movements and migrations. The comprehensive platform scales as your business grows, assuring that your data integrity is always safeguarded.
In summary
Good business decisions are founded on the integrity of data. A thorough understanding of data integrity at different levels prepares you to achieve it in your organization. The best practices for data integrity go hand in hand with data quality, governance, and security.
Choosing the right comprehensive platform helps you with trusted data that can power all your business initiatives.