Data Analytics # CAROHITUPADHYAYA
CA ROHIT UPADHYAYA
Group CFO @ Kerchanshe Coffees | 25 + YEARS I FMCGI MANUFACTURING I LOGISTICS I OIL & GAS I AGRICULTURE I HOSPITALITY I RETAIL I MINING I TECHNO COMMERCIAL I AFRICA I Driving financial growth and consolidation
Introduction to Data Analytics
Data analytics is the science of analyzing raw data to draw conclusions from it. It covers the techniques used to analyze data in order to improve the productivity and profit of a business. Data is extracted from different sources and cleaned so that the patterns in it can be analyzed. Many data analytics techniques and processes have been automated into mechanical processes and algorithms that work over raw data and prepare it for human consumption.
Types of Data Analytics
The data analytics process is commonly categorized into three types, based on the purpose of the analysis:
1. Descriptive Analytics
Descriptive Analytics focuses on summarizing past data to derive inferences.
The most commonly used measures to characterize the distribution of historical data quantitatively include:
In recent times, the difficulties and limitations of collecting, storing, and comprehending massive volumes of data have been eased by statistical inference. Generalized inferences about the statistics of a population are deduced by applying sampling methods together with the central limit theorem. For example, a leading news broadcaster gathers the votes cast by randomly chosen voters as they leave a polling station on election day and uses them to infer the preferences of the entire population.
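The exit-poll idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true vote share, the sample size, and the function name `simulate_exit_poll` are hypothetical, and real polls need careful sampling design that is not modelled here.

```python
import random
import statistics

def simulate_exit_poll(true_share, sample_size, rng):
    """Return the share of sampled voters who back candidate A."""
    votes = [1 if rng.random() < true_share else 0 for _ in range(sample_size)]
    return sum(votes) / sample_size

rng = random.Random(42)   # fixed seed so the run is repeatable
TRUE_SHARE = 0.54         # hypothetical population preference (unknown in practice)
SAMPLE_SIZE = 1000        # hypothetical number of voters interviewed

# One poll gives a point estimate and an approximate 95% confidence interval
# (normal approximation, justified by the central limit theorem).
p_hat = simulate_exit_poll(TRUE_SHARE, SAMPLE_SIZE, rng)
std_err = (p_hat * (1 - p_hat) / SAMPLE_SIZE) ** 0.5
ci_low, ci_high = p_hat - 1.96 * std_err, p_hat + 1.96 * std_err

# Repeating the poll many times shows the estimates clustering around the true share.
poll_estimates = [simulate_exit_poll(TRUE_SHARE, SAMPLE_SIZE, rng) for _ in range(500)]
mean_of_estimates = statistics.mean(poll_estimates)
spread_of_estimates = statistics.stdev(poll_estimates)

print(f"single poll: {p_hat:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f})")
print(f"500 polls: mean {mean_of_estimates:.3f}, spread {spread_of_estimates:.3f}")
```

The spread of the repeated estimates should sit close to the theoretical standard error, sqrt(p(1 - p)/n), which is why a sample of about a thousand voters can describe a much larger population.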
2. Predictive Analytics
Predictive Analytics exploits patterns in historical data to estimate future outcomes, identify trends, uncover potential risks and opportunities, or forecast process behavior. Because predictions are uncertain by nature, these approaches employ probabilistic models to measure the likelihood of each possible outcome. For example, the chatbot in the customer service portal of a financial firm proactively learns a customer's intent or need based on his or her past activity on the firm's website. With the predicted context, the chatbot converses interactively with the customer to deliver apt services quickly and achieve better customer satisfaction.
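A very small probabilistic model of this kind can be sketched by counting how often each intent followed each page in past sessions. The history records, page names, intent labels, and function names below are all hypothetical; a production chatbot would use far richer features and models.

```python
from collections import Counter, defaultdict

# Hypothetical history: (last page visited, intent the customer later stated).
history = [
    ("loans", "apply_loan"), ("loans", "apply_loan"), ("loans", "check_rates"),
    ("cards", "block_card"), ("cards", "block_card"), ("cards", "raise_limit"),
    ("loans", "apply_loan"), ("cards", "block_card"),
]

def fit_intent_model(records):
    """Count how often each intent followed each page."""
    counts = defaultdict(Counter)
    for page, intent in records:
        counts[page][intent] += 1
    return counts

def predict_intents(model, page):
    """Return {intent: estimated probability}, most likely intent first."""
    intent_counts = model.get(page, Counter())
    total = sum(intent_counts.values())
    return {intent: n / total for intent, n in intent_counts.most_common()}

model = fit_intent_model(history)
loan_probs = predict_intents(model, "loans")
print(loan_probs)
```

The output is a likelihood for every observed outcome rather than a single guess, which lets the chatbot decide how confidently to act on the prediction.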
3. Prescriptive Analytics
Prescriptive Analytics uses the knowledge discovered in both descriptive and predictive analysis to recommend a context-aware course of action. Advanced statistical techniques and computationally intensive optimization methods are implemented to understand the distribution of the estimated predictions.
More precisely, the impact and benefit of each outcome estimated during predictive analytics are evaluated to make heuristic, time-sensitive decisions for a given set of conditions. For example, a stock market consultancy firm performs a SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis on the predicted prices of the stocks in an investor's portfolio and recommends the best buy and sell options to its clients.
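One simple way to turn predicted outcomes into a recommendation is to weigh each action's payoff by the probability of each scenario. The sketch below uses an expected-value rule, which is a stand-in chosen for illustration and not the SWOT analysis described above; the probabilities, payoffs, and action names are hypothetical.

```python
# Hypothetical output of a predictive step: probability of each price scenario.
scenario_probabilities = [0.5, 0.3, 0.2]   # price rises, dips, falls sharply

# Hypothetical payoff of each action under each scenario, in currency units.
payoffs = {
    "buy":  [1200.0, -400.0, -1500.0],
    "hold": [600.0, -200.0, -750.0],
    "sell": [0.0, 0.0, 0.0],
}

def expected_payoff(action_payoffs, probabilities):
    """Probability-weighted average payoff of one action."""
    return sum(p * v for p, v in zip(probabilities, action_payoffs))

expected = {
    action: expected_payoff(values, scenario_probabilities)
    for action, values in payoffs.items()
}
recommendation = max(expected, key=expected.get)

print(expected)
print("recommended action:", recommendation)
```

An expected-value rule ignores risk appetite; a client who cannot tolerate the sharp-fall scenario might reasonably prefer a different rule, such as maximizing the worst-case payoff.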
Process Flow in Data Analytics
The data analytics process has several stages of data processing, as given below:
1. Data Extraction
Ingesting data from multiple sources of various types, including web pages, databases, and legacy applications, results in input datasets of different formats.
The data formats fed into the data analytics flow can be broadly classified as:
Parsing of structured and semi-structured data is implemented in various ETL tools such as Ab Initio, Informatica, and DataStage, and in open-source alternatives such as Talend.
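Outside a dedicated ETL tool, the same parsing step can be sketched with Python's standard library. The example below reads a structured source (CSV) and a semi-structured source (JSON) into one common record shape; the sample data, field names, and function names are hypothetical.

```python
import csv
import io
import json

# Hypothetical inputs: a CSV export and a JSON payload describing orders.
csv_text = "order_id,amount\n1001,250.00\n1002,99.50\n"
json_text = '{"orders": [{"order_id": 1003, "amount": 40.25, "tags": ["web"]}]}'

def parse_csv_orders(text):
    """Parse structured CSV rows into typed records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"order_id": int(row["order_id"]), "amount": float(row["amount"])}
        for row in reader
    ]

def parse_json_orders(text):
    """Parse semi-structured JSON, keeping only the fields the flow needs."""
    payload = json.loads(text)
    return [
        {"order_id": int(order["order_id"]), "amount": float(order["amount"])}
        for order in payload["orders"]
    ]

# Both sources end up in the same record shape for the later stages.
orders = parse_csv_orders(csv_text) + parse_json_orders(json_text)
print(orders)
```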
2. Data Cleaning and Transformation
Parsed data is cleaned to ensure that it is consistent and that the relevant data is available for the later stages of the process flow.
The major cleansing operations in data analytics are:
Cleansed data is then transformed into a format suitable for analysis.
Data transformations include:
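As an illustration of this stage, the sketch below applies a few operations that are typical in practice: normalizing text, dropping duplicate rows, filling a missing value, and aggregating by a key. These particular operations, the sample rows, and the function names are assumptions made for the example and are not taken from the lists in this article.

```python
import statistics

# Hypothetical parsed rows with inconsistent text, a duplicate, and a missing value.
raw_rows = [
    {"region": " north ", "sales": "120"},
    {"region": "North", "sales": "120"},   # duplicate once the text is normalized
    {"region": "south", "sales": ""},      # missing value
    {"region": "South", "sales": "80"},
    {"region": "east", "sales": "100"},
]

def clean(rows):
    """Normalize text, drop exact duplicates, and fill gaps with the median."""
    seen, cleaned = set(), []
    for row in rows:
        region = row["region"].strip().lower()
        sales = float(row["sales"]) if row["sales"].strip() else None
        key = (region, sales)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"region": region, "sales": sales})
    known = [r["sales"] for r in cleaned if r["sales"] is not None]
    fill_value = statistics.median(known)
    for r in cleaned:
        if r["sales"] is None:
            r["sales"] = fill_value
    return cleaned

def total_by_region(rows):
    """Transform row-level data into one aggregated figure per region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["sales"]
    return totals

cleaned_rows = clean(raw_rows)
region_totals = total_by_region(cleaned_rows)
print(region_totals)
```

Filling a gap with the median is only one possible policy; dropping the row or flagging it for review may be more appropriate, depending on how the figures are used downstream.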
3. KPI/Insight Derivation
Data mining and deep learning methods are used to evaluate Key Performance Indicators (KPIs) or derive valuable insights from the cleaned and transformed data. Depending on the objective of the analytics, the analysis is performed using pattern recognition techniques such as k-means clustering, SVM classification, and Bayesian classifiers, and machine learning models such as Markov models and Gaussian Mixture Models (GMM).
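To make one of these techniques concrete, here is a minimal sketch of k-means clustering on one-dimensional data, written in plain Python. The data values, the starting centroids, and the function name `kmeans_1d` are hypothetical; library implementations handle multiple dimensions, smarter initialization, and convergence checks.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning points and re-centering for a fixed number of rounds."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical KPI values that fall into two visible groups.
values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```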
In the training phase, probabilistic models learn their optimal parameters; in the validation phase, the model is tested using k-fold cross-validation to detect over-fitting and under-fitting. The most commonly used programming languages for data analysis are R and Python. Both have rich sets of open-source libraries (for example SciPy, NumPy, and pandas in Python) for performing complex data analysis.
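The mechanics of k-fold cross-validation can be sketched in plain Python. The example below uses a simple least-squares line as a stand-in model and not a probabilistic one; the data, the fold count, and the function names are hypothetical, and in practice the samples are usually shuffled before the folds are formed.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs so each sample is tested exactly once."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x

def cross_validate(xs, ys, k):
    """Return the mean squared error measured on each held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx]
        scores.append(sum(errors) / len(errors))
    return scores

# Hypothetical noise-free data on the line y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
fold_mse = cross_validate(xs, ys, k=5)
print(fold_mse)
```

A held-out error much larger than the training error points to over-fitting, while a high error on both points to under-fitting.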
4. Data Visualization
Data visualization is the clear and effective presentation of the patterns uncovered and the conclusions derived from the data, using graphs, plots, dashboards, and graphics.