Demystifying Data Science
Data Science is the new ‘In Thing’. Everyone seems excited about it and wants to learn it and work with it. But wasn't data important before? Haven't our decisions always been data driven? Then why so much fuss about Data Science now? To answer that, we first need to understand what data science is.
Data science enables stakeholders to take informed decisions based on the knowledge gained by extracting useful information from the available data. Data is therefore the most important input to decision making, and the structure of data (or the lack of it) has seen a sea change over the years. Data no longer comes only in rows and columns; it is as varied and diverse as it can be. With the explosion of multimedia and social media, everything from Facebook and Twitter comments and likes to images, videos and audio is data from which useful insights can be extracted. Traditional systems and software cannot handle the volume, variety and velocity of this incoming data well enough to extract useful information.
Therefore, to address this, we needed our computers to think outside the box! Or rather, we needed computers to be intelligent: intelligent enough to predict future trends based on current data. To build this intelligence into modern computers, these systems are designed to model the human brain. Systems designed this way emulate the human thinking process and decision-making skills.
The discipline of Data Science comprises three main components: MIS (Management Information System), Business Intelligence and Predictive Analytics. My understanding of these three broad terms is as follows. With Predictive Analytics, we predict future trends by looking at present data. The information thus extracted can be used to select the subset of data that is most relevant, which can be classified as Business Intelligence. Based on the knowledge gained, we generate reports for management via the MIS. These reports then serve as the justification for the decisions that management takes.
Let's explore Predictive Analytics a bit. The right prediction helps organizations be proactive in dealing with issues. But a prediction needs solid backing to prove right. That is where past experience and data come in handy. Predictive Analytics mainly comprises two branches: Machine Learning and Artificial Intelligence. Both do the job of predicting, but their approaches differ. While machine learning relies on statistical methods, artificial intelligence emulates human intelligence for prediction. Humans exhibit intelligence while seeing, talking, listening and predicting. For example, when we are asked a question, or when we are conversing in general, we reply based on what we can recollect, or give an opinion based on how we feel once we have understood what was asked. In a similar way, we have Sophia, the first robot to be granted citizenship by a country.
If you watch the video, you will understand how wide the research area is for bringing this robot closer to the human level of speech comprehension.
So the bottom line is that AI works on the principles of the human brain. The most important element of the human brain is the neuron, along with the network of neurons that pass data along, comprehend it and send the reaction back at lightning speed, while incorporating human emotions and traits.
Machine learning, on the other hand, makes predictions using a statistical approach. It takes in past data (called training data) and does some number crunching based on mathematically proven formulae, which reveals patterns or trends present in the data. But before we can predict, we need to validate how accurate those patterns are and how far they deviate from the actual results. So we feed in test data, compare the results of our algorithm with the actual results, and only then make the prediction, while also specifying the error margin.
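To make the train, test and predict steps concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the monthly sales figures are made up, a straight-line least-squares fit stands in for the "mathematically proven formulae", and the error margin is taken to be the mean absolute error on the test data. Real projects would pick the model and the error measure to suit the problem.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept on the training data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Average gap between predicted and actual values."""
    return mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))


# Made-up example data: month number versus units sold.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [12.0, 14.1, 15.9, 18.2, 19.8, 22.1]
test_x = [7, 8]            # held back, never used for fitting
test_y = [24.0, 25.9]

model = fit_line(train_x, train_y)                    # learn from past data
error = mean_absolute_error(model, test_x, test_y)    # validate on test data
forecast = predict(model, 9)                          # predict the future

print(f"forecast for month 9: {forecast:.1f} +/- {error:.1f}")
```

The important design point is that the test data is kept out of the fitting step, so the error it reports is an honest estimate of how far off a new prediction might be.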
So, to answer the question raised at the beginning: what is all the fuss about? Well, it is about systems that process all that varied, humongous data coming in every second to tell you what is right for your business, so that you can take informed decisions. This is data that we could not possibly go through ourselves in such a short span of time, and that is why Data Science is here to stay.