Doing it with Data
Data has become part of our everyday vocabulary. It is much more than research data: it is our access to online life and also the information we leave behind – our online footprint.
Bill and Melinda Gates’ annual letter was released this week and their reference to data really caught my attention. One of the surprising things about 2018, according to the powerful couple, was that data was sexist. Can it be? Does data have such power?
Why is data so important? For the individual, having personal information available online raises security concerns, but from a business perspective this information is invaluable. The individual’s online footprint gives companies insight into human behaviour. It helps businesses to predict preferences and trends, and it helps with product development, placement and promotion. Data analysis gives quicker and more in-depth results than older research methods and is very important in guiding decision-making. “Data-driven decision management (DDDM) is an approach to business governance that values decisions that can be backed up with verifiable data. …Data-driven decision management is usually undertaken as a way to gain a competitive advantage.” (Read more)
There are different types of data and I find this article by import.io very handy. Here is how they distinguish between the various types of data:
Personal data – your demographics, location, contact details, etc. The online security of personal data is a contentious issue. On the one hand, sharing your personal details enables personalisation, but on the other hand it can expose you. Targeted social media and online advertising depend on personal data.
Transactional data – when you click on an ad, purchase something online, etc., your transactional data is collected. This data helps businesses determine patterns and trends that they can use as a competitive advantage.
Web data – refers to all the public information you might collect from the internet. It is one of the major ways in which businesses collect information. Other than just general searches, web scraping can provide data in a structured format.
Sensor data – includes everything from smartwatch readings to weather forecasts. It refers to measurements that give either a human or a machine the opportunity to adapt behaviour.
Big Data – all the different types of data contribute to Big Data. “Things like social media, online books, music, videos and the increased amount of sensors have all added to the astounding increase in the amount of data that has become available for analysis.” In contrast to research sampling, Big Data analysis looks at the data in its entirety and gives a more complete picture. For the wine industry, Enolytics is collecting data about consumer insights and producer perceptions to gain a better understanding of the wine consumer. (Read more)
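To make the contrast between sampling and analysing everything a little more concrete, here is a minimal Python sketch. The purchase amounts are randomly generated and purely hypothetical, as are the function names; it only illustrates that a sample gives an estimate, while the full data set gives the actual figure.

```python
import random
import statistics


def simulate_purchases(n, seed=0):
    """Generate n made-up purchase amounts (hypothetical data, not real sales)."""
    rng = random.Random(seed)
    return [round(rng.uniform(5.0, 50.0), 2) for _ in range(n)]


def sample_mean(values, sample_size, seed=1):
    """Estimate the average from a random sample, as traditional research would."""
    rng = random.Random(seed)
    return statistics.mean(rng.sample(values, sample_size))


purchases = simulate_purchases(10_000)

# "Big Data" style: use every record that is available.
full_mean = statistics.mean(purchases)

# Sampling style: look at only 100 of the 10,000 records.
estimate = sample_mean(purchases, 100)

print(f"Average from all records:    {full_mean:.2f}")
print(f"Average from a sample of 100: {estimate:.2f}")
```

The two numbers will usually be close but not identical. The gap between them is the uncertainty that sampling introduces and that analysing the full data set avoids.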
But data, like all types of research, has to be interpreted. Enter the data scientist, a title that has been referred to as the sexiest job of the 21st century… The data scientist has to understand computer science, statistics, analytics and mathematics. Interpreting the results and communicating those interpretations properly are, however, just as essential. It is therefore not all about the data.
Why would Bill and Melinda Gates call data sexist? Two of their remarks intrigued me: “What we choose to measure is a reflection of what society values.” and “Answers depend on the questions we ask.” It is easy to trust the data and to see it as impartial statistics when it comes via online reporting. The problem is that the interpretation is to a large degree determined by our own preconceptions. The data scientist is not to blame. It is our way of thinking that determines what we see and how we react to it.
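A small Python sketch can show how the answer shifts with the question. The records below are invented for illustration only (the groups, field names and hours are all assumptions, not figures from the letter): the same four rows give opposite impressions depending on whether we measure only paid work or all work.

```python
# Invented example records: weekly hours for two hypothetical groups.
records = [
    {"group": "A", "paid_hours": 40, "unpaid_hours": 10},
    {"group": "A", "paid_hours": 38, "unpaid_hours": 12},
    {"group": "B", "paid_hours": 30, "unpaid_hours": 30},
    {"group": "B", "paid_hours": 28, "unpaid_hours": 34},
]


def average_by_group(rows, fields):
    """Average the sum of the chosen fields per group.

    The 'fields' argument is the question we decide to ask of the data.
    """
    totals = {}
    counts = {}
    for row in rows:
        group = row["group"]
        totals[group] = totals.get(group, 0) + sum(row[f] for f in fields)
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}


# Question 1: who does the most paid work?
paid_only = average_by_group(records, ["paid_hours"])

# Question 2: who does the most work, paid or not?
all_work = average_by_group(records, ["paid_hours", "unpaid_hours"])

print("Paid work only:", paid_only)
print("All work:      ", all_work)
```

With only paid hours measured, group A appears to work more. Once unpaid hours are counted as well, group B does. The data did not change, only what we chose to measure.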
Data is one of the most important tools of the modern era, and what I find most valuable is how it helps to predict probability. Having said all that, data as we know it at the moment cannot be our sole guiding factor. The human factor stays relevant. To get the most out of data, we need to ascertain the credibility and accuracy of the data we work with, we have to be reasonable, and we have to keep in mind that our own frames of reference can influence decision-making. We might expect algorithms to have human insight, but in reality these insights are still left to us.