Data Science is NOT Statistics
Paulo Cysne
Data Science is actionable intelligence
by Paulo Cysne Rios Jr
While Statistics is a great science with many achievements, and Data Science makes use of many statistical analyses, the two are not the same. Statistics is mostly about describing data, even when inference is made. Data Science has another goal: predictive modeling towards actionable intelligence. It isn’t about the future as such, but about what to do today to bring about a desirable near future.
Data Science doesn’t really ask questions of the data, but rather performs what I call deep analysis. It is exploratory, trying to discover patterns, trends and relationships in the data that can be used for actionable intelligence. These patterns and relationships must be predictive: what is learned from current data must hold true for unseen data, now and in the future. So it seeks to answer questions such as:
Healthcare: Who will get sick in the near future?
Marketing: Who will respond to what offer?
Production: Which machines on the shop floor will fail next?
Logistics: Which routes/resources will be most profitable/efficient?
The key requirement for new knowledge in Data Science is its ability to predict and not just explain.
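To make the "predict, not just explain" requirement concrete, here is a minimal sketch of checking a learned pattern against data it has not seen. Everything in it is an assumption made for illustration: the machine records are synthetic, the single "vibration" variable and the rule that failure risk rises with it are invented, and the helper names are hypothetical. It is not a recommended modeling approach, only a picture of the idea.

```python
import random


def make_machines(n, rng):
    """Synthetic records: (vibration level, failed within the period?)."""
    rows = []
    for _ in range(n):
        vibration = rng.uniform(0.0, 10.0)
        # Assumed ground truth for this toy example:
        # the chance of failure rises linearly with vibration.
        failed = rng.random() < vibration / 10.0
        rows.append((vibration, failed))
    return rows


def accuracy(rows, threshold):
    """Share of rows where 'vibration > threshold' matches the outcome."""
    hits = sum((vib > threshold) == failed for vib, failed in rows)
    return hits / len(rows)


def fit_threshold(rows):
    """Pick the cut-off (0.0 to 10.0 in steps of 0.1) that fits these rows best."""
    candidates = [t / 10 for t in range(0, 101)]
    return max(candidates, key=lambda t: accuracy(rows, t))


rng = random.Random(42)  # fixed seed so the run is repeatable
data = make_machines(1000, rng)
train, unseen = data[:700], data[700:]  # learn on 700 rows, hold back 300

threshold = fit_threshold(train)          # the "pattern" learned from current data
train_acc = accuracy(train, threshold)    # how well it describes the data it saw
unseen_acc = accuracy(unseen, threshold)  # how well it predicts data it never saw

print(f"threshold={threshold:.1f} train={train_acc:.2f} unseen={unseen_acc:.2f}")
```

The number that matters for actionable intelligence is the one computed on the held-back rows. A rule that scores well only on the data it was fitted to explains the past but does not predict.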
Some people think that prediction is something more akin to fantasy and science fiction. But this is a fundamental misunderstanding of science and technology. Nature is deeply predictive because there is a beautiful underlying order. In contrast, human affairs seem hard to predict, but all sorts of data have patterns and relationships that are invisible to us simply because we are limited in our capacity to analyze many variables at the same time. Computers don’t have this limitation and can search for statistical patterns across 500 or even 5,000 variables at a time.
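As a rough illustration of a computer scanning hundreds of variables at once, the sketch below builds 500 synthetic variables, hides a relationship to a target in one of them, and ranks all 500 by the strength of their correlation with that target. The data, the choice of Pearson correlation as the screening measure, and the position of the hidden variable are all assumptions for the example, not a description of any particular tool or method.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows, n_vars = 200, 500
signal_var = 137  # hypothetical index of the one variable that matters

# 500 columns of pure noise, 200 observations each.
columns = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_vars)]

# The target depends on one column only, plus some noise.
target = [columns[signal_var][i] + rng.gauss(0, 0.5) for i in range(n_rows)]

# Rank every variable by the absolute strength of its correlation with the target.
ranked = sorted(
    range(n_vars),
    key=lambda j: abs(pearson(columns[j], target)),
    reverse=True,
)

best = ranked[0]
print(f"strongest variable: {best}, r={pearson(columns[best], target):.2f}")
```

A person could not inspect 500 columns by eye, while the scan above takes a fraction of a second. Whether a pattern found this way is truly predictive still has to be confirmed on unseen data.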
Data Science also requires skills and experience in software engineering and in computer data modeling and implementation that a statistician isn’t expected to have. This is because Data Science is also about implementing computer algorithms and models that work on data available in digital form, whether structured or unstructured. The immense and ever-increasing availability of data has fueled Data Science.
This article originally appeared on cyzne.com