Data Science: A Tiny Introduction
What Is Data Science?
Data Science is a multidisciplinary field that combines approaches and mathematical techniques from statistics, business, and computing to produce insights and information that help solve a (business) problem.
Experts and sources describe the stages of the Data Science process in different ways. The following is one of them, adapted from “The Data Science Life Cycle” by the Berkeley School of Information:
- Capture: data acquisition, data entry, signal reception, data extraction.
- Maintain: data warehousing, data cleansing, data staging, data processing, data architecture.
- Process: data mining, clustering/classification, data modeling, data summarization.
- Analyze: exploratory/confirmatory, predictive analysis, regression, text mining, qualitative analysis.
- Communicate: data reporting, data visualization, business intelligence, decision making.
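The five stages above can be sketched as a chain of small functions. This is only an illustration: the CSV data, the function names, and the choice of "average sales per region" as the analysis are all assumptions made for this example, not part of the Berkeley life cycle itself.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw data; a real project would read from a database, API, or file.
RAW_CSV = """region,sales
north,120
south,
north,80
south,200
"""

def capture(text):
    """Capture: read raw records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def maintain(rows):
    """Maintain: drop records with a missing sales value and convert types."""
    return [
        {"region": r["region"], "sales": float(r["sales"])}
        for r in rows
        if r["sales"].strip()
    ]

def process(rows):
    """Process: group the sales values by region."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["region"]].append(r["sales"])
    return dict(grouped)

def analyze(grouped):
    """Analyze: compute the average sale per region."""
    return {region: mean(values) for region, values in grouped.items()}

def communicate(averages):
    """Communicate: turn the result into a short, readable report."""
    return "\n".join(
        f"{region}: average sales {avg:.1f}"
        for region, avg in sorted(averages.items())
    )

report = communicate(analyze(process(maintain(capture(RAW_CSV)))))
print(report)
```

In practice each stage is far larger (data warehouses, models, dashboards), but the order of the hand-offs is the same.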
An effective data scientist can identify relevant problems precisely, gather data from various sources, organize information, translate findings into solutions, and communicate these findings in a way that supports business decision making.
Implementation of Data Science
Data Science is now applied in many fields, from marketing, telecommunications, and transportation to government and Smart City projects. Some of these implementations are described below:
Data Science in Marketing
In marketing, Data Science can give marketers valuable information, such as accurate customer data, on which to base effective marketing strategies. Some examples of implementation include:
- Efficient marketing budget. By analyzing marketing expenditure and customer acquisition data, a data scientist can build a budgeting model that distributes the budget more sensibly across locations, channels, and promotion strategies.
- Targeting the right audience. Data science can help marketers identify the right target customers using data collected from various sources, such as economic profiles, interests, purchase history, and other demographic data.
- Determine effective channels. Data science can be used to identify the most appropriate promotion and marketing channels, so that marketers can focus their strategies on those channels and allocate resources more efficiently.
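As a rough illustration of the budgeting and channel analysis described above, the sketch below computes the cost per acquired customer for each channel and splits a budget in proportion to each channel's efficiency. The channel names and figures are made up, and proportional allocation is just one simple rule chosen for this example; a real budgeting model would be considerably richer.

```python
# Hypothetical spend and acquisition figures per channel (not real data).
channels = {
    "search": {"spend": 1000.0, "customers": 50},
    "social": {"spend": 1000.0, "customers": 25},
    "email": {"spend": 500.0, "customers": 25},
}

def cost_per_acquisition(data):
    """Return how much each channel spent per customer acquired."""
    return {name: d["spend"] / d["customers"] for name, d in data.items()}

def allocate_budget(data, total_budget):
    """Split total_budget in proportion to customers acquired per unit of spend.

    Assumption for this sketch: past efficiency is a fair guide to future
    efficiency, which ignores saturation and seasonality.
    """
    efficiency = {name: d["customers"] / d["spend"] for name, d in data.items()}
    total = sum(efficiency.values())
    return {name: total_budget * e / total for name, e in efficiency.items()}

cpa = cost_per_acquisition(channels)
plan = allocate_budget(channels, 10_000)
for name in channels:
    print(f"{name}: cost per acquisition {cpa[name]:.2f}, budget {plan[name]:.2f}")
```

Channels with a lower cost per acquisition receive a larger share, which is the sense in which resources are allocated "more efficiently".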
Data Science in Entertainment
One example of the application of Data Science in the entertainment industry is Netflix, an internet-based provider of streaming video content. To offer the right shows to its customers, Netflix provides recommendations customized to specific customer data, such as watch history, location, age, and movie preferences.
At the time of writing, Netflix had 150 million subscribers. A user base of that size generates a large volume of data to analyze, which in turn yields the insights needed to provide the best service.
Some Data Science innovations at Netflix include (Shah, 2019):
- Netflix’s data scientists and engineers build models to predict the “perfect situation” in which customers continuously receive programs they enjoy. To do so, Netflix assigns each user to 3–5 clusters out of more than 1,300, based on viewing preferences.
- Using Data Science techniques, Netflix created 76,897 unique ways to describe types of movies. These “alt-genres” are what lead to Netflix’s scarily specific movie and show suggestions.
- Cover image personalization. Netflix models a show’s cover image on the colors and styles of successful, similarly tagged programs. It also tries different versions of a cover image to find out which one is more effective for each user.
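To make the idea of assigning a user to a few taste clusters more concrete, here is a minimal sketch that ranks clusters by cosine similarity to a user's preference vector and keeps the closest three. Everything in it is hypothetical: the cluster names, the three taste dimensions, and the use of cosine similarity are assumptions for illustration, not a description of Netflix's actual method.

```python
import math

# Hypothetical cluster centroids over three made-up taste dimensions:
# (action, comedy, documentary).
CENTROIDS = {
    "action_fans": [0.9, 0.1, 0.0],
    "comedy_fans": [0.1, 0.9, 0.0],
    "documentary_fans": [0.0, 0.1, 0.9],
    "action_comedy": [0.6, 0.6, 0.0],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_clusters(user_vector, centroids, k=3):
    """Return the names of the k clusters most similar to the user."""
    ranked = sorted(
        centroids,
        key=lambda name: cosine_similarity(user_vector, centroids[name]),
        reverse=True,
    )
    return ranked[:k]

# A made-up user who mostly watches action, with some comedy.
user = [0.8, 0.5, 0.0]
print(top_clusters(user, CENTROIDS))
```

Letting a user belong to several clusters at once, rather than exactly one, allows recommendations to reflect mixed tastes.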
Data Science in Transportation
Another application of Data Science is in the transportation industry. One of the leading companies applying Data Science in this field is Uber. Uber has a massive database of drivers and customers. All this data is stored and used to predict supply and demand, and also to set rates. Uber can also see how to move people through various cities efficiently and tries to plan routes that account for traffic and other hassles.
With Data Science, Uber can also monitor the speed and acceleration of its drivers, and can reportedly even tell whether drivers are also working for competing companies (like Lyft). All data is collected, grouped, analyzed, and used to predict everything from customer waiting times to where drivers should position themselves on a strategic map to take advantage of the best prices and pick up more passengers. All data processing happens in real time, for both drivers and passengers.
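One simple way to picture how supply and demand can feed into rates is a price multiplier based on the ratio of ride requests to available drivers. The sketch below is a toy rule written for this article, with a hypothetical cap of 3.0 and made-up function names. It is not Uber's actual pricing algorithm.

```python
def surge_multiplier(ride_requests, available_drivers, cap=3.0):
    """Toy rule: price rises with the demand/supply ratio, between 1.0 and cap."""
    if available_drivers <= 0:
        # No drivers available: assume the maximum multiplier applies.
        return cap
    ratio = ride_requests / available_drivers
    return max(1.0, min(cap, ratio))

def fare(base_fare, ride_requests, available_drivers):
    """Apply the multiplier to a base fare and round to two decimals."""
    return round(base_fare * surge_multiplier(ride_requests, available_drivers), 2)

# Made-up situations in one area of a city.
print(fare(10.0, 50, 100))    # more drivers than requests
print(fare(10.0, 150, 100))   # more requests than drivers
print(fare(10.0, 1000, 100))  # demand far above supply, capped
```

A real-time system would recompute these inputs continuously for each area, using predicted rather than merely observed demand.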