Vilfredo Pareto - On Data

I entered the world of data by starting with data quality, which made quality one of the foundational themes of all the work I have produced since.

“If I have seen further, it is by standing on the shoulders of Giants” is one of my favorite quotes. It belongs to Isaac Newton and works as a reminder that everything we know and do builds on the work done before us.

One of these giants is Joseph M. Juran, whose work in the field of quality management is still a reference.

So why am I bringing up Juran here? Mainly because he applied the Pareto principle to quality issues, observing that a small percentage of root causes contributes to a high percentage of defects.

The Pareto principle, or 80/20 rule, follows from the observations of economist Vilfredo Pareto, whose studies showed that 80% of the land in Italy was owned by 20% of the population.

I have frequently used this principle while dealing with data quality issues, but it is applied in many different fields, even though there is little scientific analysis that either proves or refutes its validity.

This is also true when reflecting on some of the issues faced by those with responsibilities in data management. Correctly applied, the principle can bring a better understanding of those issues and possibly additional benefits, cutting costs and increasing efficiency, or at the very least serve as a tool to identify priorities.

Put differently, considering data as a corporate asset, the rule allows an organization to identify its best assets and use them efficiently to create maximum value.

Keep in mind that 80-20 is only a guideline; in fact, it is almost a brand name. The two numbers measure outputs and inputs, not necessarily in the same units. So it can easily be 70-30, 50-10, or whatever other combination.
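
To make the ratio concrete, here is a minimal sketch in Python, using entirely hypothetical defect counts per root cause, that computes the smallest share of causes covering roughly 80% of the defects. With these made-up numbers the split comes out closer to 30% of causes for 80% of defects, which is exactly the point: the ratio rarely lands on 80-20.

# A minimal sketch with hypothetical numbers: given a tally of defects per
# root cause, find the smallest share of causes covering ~80% of the defects.
from typing import Dict

def pareto_split(counts: Dict[str, int], output_share: float = 0.8) -> float:
    """Return the fraction of inputs needed to cover `output_share` of outputs."""
    total = sum(counts.values())
    covered = 0
    for i, value in enumerate(sorted(counts.values(), reverse=True), start=1):
        covered += value
        if covered / total >= output_share:
            return i / len(counts)
    return 1.0

# Hypothetical defect counts per root cause -- not real data.
defects = {"missing values": 420, "bad reference data": 310, "late loads": 95,
           "duplicate keys": 60, "format drift": 40, "manual entry": 30,
           "encoding": 20, "timezone": 15, "rounding": 6, "other": 4}

share = pareto_split(defects)
print(f"{share:.0%} of causes account for 80% of the defects")  # prints 30%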

What I am proposing here is a questioning exercise that, in certain situations, will allow a more efficient allocation of resources, or even help define future investments.

Asking questions like:

  • Which 20% of the data produces the most valuable business insights?
  • Which 20% of the data is most critical for business continuity?
  • Which 20% of the data is most exposed to security risks?
  • Which 20% of the data is most frequently accessed?
  • Which 20% of the data is least frequently updated?
  • Which 20% of the data is most critical for regulatory purposes?
  • Which 20% of the data takes the most processing time in loading and transformation processes?
  • Which 20% of the data causes most of the data quality problems?

These are just a few examples of questions that can be asked and that can, in some situations, lead to a change in perspective followed by specific actions, especially when we start crossing the answers to different questions.
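
To illustrate the idea of crossing answers, the following sketch ranks hypothetical data assets by two of the questions above (access frequency and data quality incidents, both made-up numbers) and intersects the top ~20% of each ranking. The table names and metrics are assumptions for illustration only; the point is that assets appearing in more than one "top 20%" are natural priorities.

# A sketch of "crossing the answers": intersect the top ~20% of data assets
# ranked by two different questions. All names and figures are hypothetical.
from typing import Dict, Set

def top_quintile(metric: Dict[str, float]) -> Set[str]:
    """Return the ~20% of assets with the highest metric value."""
    ranked = sorted(metric, key=metric.get, reverse=True)
    cutoff = max(1, round(len(ranked) * 0.2))
    return set(ranked[:cutoff])

# Hypothetical per-table metrics -- placeholders, not real measurements.
access_frequency = {"orders": 9800, "customers": 7200, "invoices": 5100,
                    "products": 900, "suppliers": 300, "logs": 12000,
                    "returns": 450, "campaigns": 200, "stores": 150, "hr": 80}
quality_incidents = {"orders": 35, "customers": 60, "invoices": 12,
                     "products": 4, "suppliers": 2, "logs": 3,
                     "returns": 8, "campaigns": 1, "stores": 0, "hr": 5}

# Assets that are both heavily used and error-prone are obvious priorities.
priority = top_quintile(access_frequency) & top_quintile(quality_incidents)
print(sorted(priority))  # prints ['orders']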

As an example, identifying the 20% of the data that is most valuable to the organization makes it possible to better prioritize and define future data initiatives, to review current ones, or even to adapt ongoing initiatives to maximize the efficiency of the overall data architecture.

Joseph Ezenwa

Managing Consultant

2y

Yes, Jose Almeida, I am very familiar with the Pareto principle, the 80/20 rule that Joseph M. Juran applied to quality. This rule is applicable in different areas. I have done some work on Total Quality Management (TQM), which is one of the quality management systems. It was Juran who talked about the cost of quality. Quality really has a cost. It is not always easy for organizations to fully implement a TQM programme because of the high cost of its implementation. In an organization, for example, it is 20% of the employees that make 80% of the most valuable contributions. Yes, the same is applicable to data.

Jude Juma

Sr. Software Engineering Manager @Safaricom |DevSecOps | SDET | Product Development | Enterprise Architecture | AI & ML | IT Strategy | ISTQB 4x Certified | AWS 2x certified | ITIL 5 certified | Agile Certified

2y

Quite insightful. In quality engineering we always make reference to the “defect clustering” principle, which shows how the majority of defects tend to cluster in a small number of software/application features, causing most of the quality issues. This is 80-20; Pareto in action.
