Business Intelligence & Data analytics | My notes 8'23
Since LinkedIn opened the door for me to collaborate on the Community Specialist's Articles, I've realized that my contributions often get lost in the endless flow of information that passes through this network every day. So I decided to collect and summarize some of my monthly additions to this public knowledge base, both to document them and to make them easy to find for the BI and Data Analytics enthusiasts out there =).
What is data visualization about?
It is about adding value.
That's right. The most important thing to keep in mind as a data analyst is that, regardless of its name, the essence of data visualization is not really the visual representation of data. It is about opening a door to valuable insights by revealing relevant facts and relations otherwise concealed from plain sight. It is about telling a story that the viewers of your report can use as a bridge to those insights, and about helping them understand the matters at hand in a way that generates meaningful impact.
Also, since more often than not less is more (especially for a wider audience), any presentation of a data study should prioritize engagement over the display of raw data. Simplicity is often key to effective communication, and an emotional connection with the audience, beyond a purely logical one, can become the catalyst needed to drive positive change within an organization.
Data Quality
To trust data, we must first be able to trust processes.
Data Quality directly impacts the credibility and success of any and all BI projects. It measures data's fitness, accuracy, completeness, consistency, and reliability for its intended use.
One of the biggest mistakes one can make when working with data is assuming that each data point is a fact, because it is not. Every bit of data is just that: a bit of seemingly meaningless data awaiting your aid to become valuable and gain purpose. The circumstances and methods by which it is produced or collected are the first things that determine its quality. A simple row or column shift in an Excel file, a renamed column in a CSV file, or a seemingly harmless disregard of the proper data governance policies may derail any analytic effort down the road. As a rule of thumb, the quality of the data can only be as good as the quality of the processes it goes through from its production to its consumption.
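A row shift or a renamed column of the kind mentioned above can often be caught before it reaches a report. What follows is only a minimal sketch of such a check using Python's standard library. The column names (`order_id`, `customer`, `amount`) and their validation rules are assumptions made up for illustration, not part of any particular tool.

```python
import csv
import io


def is_number(value: str) -> bool:
    """True if the text can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical schema: expected column names and one validator per column.
EXPECTED_COLUMNS = {
    "order_id": str.isdigit,
    "customer": lambda v: v.strip() != "",
    "amount": is_number,
}


def check_quality(csv_text: str) -> list[str]:
    """Return a list of human-readable data quality issues found in csv_text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    # A renamed or missing column is reported before any row is inspected.
    issues = [f"missing column: {c}" for c in EXPECTED_COLUMNS if c not in header]
    if issues:
        return issues
    # Row-level checks catch shifted cells, blanks and wrong types.
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column, is_valid in EXPECTED_COLUMNS.items():
            value = row[column]
            if value is None or not is_valid(value):
                issues.append(f"line {line_no}: bad {column}: {value!r}")
    return issues
```

Calling `check_quality` on a file whose `customer` column was renamed to `client` reports the missing column, and a row whose cells were shifted one position to the left is flagged line by line. A real pipeline would likely rely on a dedicated profiling or validation tool, but the principle is the same: the process checks the data before anyone trusts it.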
If you want to learn more on this subject, consider exploring topics such as data profiling, cleansing, validation, governance and lineage. On that note, you may also want to check out tools such as Talend, Informatica and Alteryx.
Data Integration - Datalakes
Datalakes are the silver bullet of this era of data.
This is thanks to their inherent simplicity and flexibility. Nonetheless, each organization must vigilantly monitor its practices and protocols to forestall the transformation into a Data Swamp. This is what a Data Lake becomes when an organization's data environment becomes inundated with massive amounts of unstructured and/or inadequately managed data. This data is frequently of subpar quality, lacking proper organization, metadata, or governance, and consequently becomes challenging to navigate, analyze, or employ effectively, unnecessarily amplifying the efforts needed to extract valuable information.
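One simple way to watch for the drift towards a Data Swamp is to audit which datasets in the lake lack basic metadata. The sketch below is only an illustration under stated assumptions: the `LakeEntry` structure, the list of required metadata fields and the example paths are all hypothetical, and a real lake would normally keep this information in a dedicated data catalog.

```python
from dataclasses import dataclass, field

# Hypothetical minimum metadata every dataset should carry;
# adapt this to your own governance policy.
REQUIRED_METADATA = ("owner", "description", "source", "updated")


@dataclass
class LakeEntry:
    """One dataset in the lake together with whatever metadata was recorded."""
    path: str
    metadata: dict = field(default_factory=dict)


def swamp_report(entries):
    """Map each under-documented path to the metadata fields it lacks."""
    report = {}
    for entry in entries:
        # A field counts as missing when it is absent or empty.
        missing = [key for key in REQUIRED_METADATA if not entry.metadata.get(key)]
        if missing:
            report[entry.path] = missing
    return report
```

Running `swamp_report` on a regular schedule gives a rough indicator of how much of the lake is undocumented, which is usually the first symptom of the problem described above.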
Data Integration - DataMarts
Data Marts to save the day.
These are subsets of a data warehouse or data repository that are designed to serve the specific data needs of particular business units, departments, or functional areas within an organization and serve as a valuable complement to data lakes. In certain situations, they can help address some of the challenges and complexities just mentioned about data lakes.
Data marts are valuable for organizations because they make data more accessible and relevant to the people who need it most. They improve data usability, performance, and alignment with business goals. However, it's essential to manage data mart implementations carefully to avoid data silos and ensure data consistency and integrity across the organization.
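As a rough illustration of the idea, a data mart can be pictured as a department-specific view over shared warehouse data. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse. The `sales` table, the `marketing_mart` view and the sample rows are assumptions invented for the example, and real data marts are often separate, materialized stores rather than views.

```python
import sqlite3


def build_warehouse():
    """Create a toy warehouse with one shared table and a marketing mart on top."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
    conn.executescript("""
        CREATE TABLE sales (
            id INTEGER PRIMARY KEY,
            department TEXT NOT NULL,
            region TEXT NOT NULL,
            amount REAL NOT NULL
        );
        INSERT INTO sales (department, region, amount) VALUES
            ('marketing', 'EMEA', 120.0),
            ('marketing', 'APAC', 80.0),
            ('finance', 'EMEA', 300.0);

        -- The "mart": only what the marketing team needs, already aggregated.
        CREATE VIEW marketing_mart AS
            SELECT region, SUM(amount) AS total_amount
            FROM sales
            WHERE department = 'marketing'
            GROUP BY region;
    """)
    return conn


def marketing_totals(conn):
    """Read the mart the way a marketing analyst would."""
    return conn.execute(
        "SELECT region, total_amount FROM marketing_mart ORDER BY region"
    ).fetchall()
```

Because the view is derived from the shared `sales` table instead of a private copy, the marketing figures stay consistent with the rest of the organization, which is one way to limit the data silo risk mentioned above.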
Data Mining
You do not need to reinvent the wheel.
Much like data visualization crafts stories with data, data mining decodes the language of information, unveiling insights hidden within the intricate layers of data complexity. Yet, however unique the complexity of your project may seem, and while it is true that no two problems are identical, a solid set of proven methods is your best ally in most, if not all, data mining scenarios.
Exploring literature on the subject of how others tackled and resolved complex challenges may enhance your current approaches or yield better results for the challenges you may one day encounter yourself. A great place to look for such knowledge is within the pages of scientific papers. Give it a try and you will soon see that becoming acquainted with scientific publications and forming a habit of doing so can quickly become immensely valuable.
Data Modeling
Data modeling encompasses the task of developing and structuring data to align with your business intelligence (BI) objectives and needs. This entails crafting both abstract and tangible representations of your data, including its structures, interconnections, limitations, and operational regulations.
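To make those elements concrete, here is a minimal sketch of a tangible (physical) model expressed as SQLite DDL: two structures, one relationship between them, and two constraints that encode business rules. The `customer` and `customer_order` tables and the rule that an amount must be positive are assumptions chosen purely for illustration.

```python
import sqlite3

# Hypothetical model: structures (tables), a relationship (foreign key),
# and constraints (NOT NULL, CHECK) that encode simple business rules.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount REAL NOT NULL CHECK (amount > 0)
);
"""


def create_model():
    """Build the model in an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if the model rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this model in place, an order that points to a non-existent customer, or one with a negative amount, is rejected by the database itself instead of surfacing later as a questionable number in a dashboard.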
A well-crafted data model streamlines engineering tasks and enhances their efficiency, but there is no universally applicable model and no single, definitive data model that is perfect for every use case, nor should there be. A comprehensive, overarching understanding of the business activities and processes is crucial to its development and maintenance.
But what does it mean?
It means that data models are complex and that they often evolve over time, just as the organizations they serve do. Data models are not static entities; they are living structures that must adapt to challenges such as changing business needs, technological advancements, growing data volume or variety, and new regulations.
The five main types of Data analytics.
Descriptive, diagnostic, predictive, prescriptive and cognitive.
These span a two-dimensional progression. On one hand, there is a temporal progression, from understanding historical data to predicting future outcomes and optimizing the results of the chosen actions. On the other hand, there is a progression from highly structured data to the realm of complex, unstructured data, made possible by ML and AI. This is why they are often depicted as concentric fields on a Cartesian plane whose two axes are labeled potential value for the company and data complexity: descriptive analytics sits at the core, near the origin, and the fields grow outwards towards prescriptive and cognitive analytics.
For a detailed footnote on this subject:
Hope you enjoyed the reading as much as I enjoyed the writing! See you in the next article!
Senior Technical Lead @ Dimpera
1 yr: Excellent, Alejandro Loredo!!! Is a new saga of BI and DA articles on the way??!