AN APPROACH TO ANALYZING DATA

For someone who loves analyzing data, any data set lying around looks like an opportunity. Often, though, the mind goes in circles trying to settle on an approach, and it can take several iterations before the analysis delivers any value. Even when the customer has defined the problem statement, there is still merit in exploring what other insights the data might offer. This led me to think of a generic approach that I could use whenever I face a new set of data. To make this article concrete, I have taken the following data set as an example.

https://www.transtats.bts.gov/ONTIME/Departures.aspx

1. Start by defining the customers: A look at the data can tell us who the prospective customers are. In the case of the airline data, the customers could be the airlines, the OEMs of the planes, the airport authorities and the passengers.

2. List the potential pain points and/or areas of interest of each customer. For example:

a. Airlines: Delays by airport, time of day, month of year, day of week and cause (weather, carrier, security etc.)

b. OEMs of planes: Delays by plane type

c. Airport authorities: Delays due to security across months and days of the week

d. Passengers: Delay prediction when booking a ticket
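Each pain point listed above amounts to "a delay measure grouped by some dimension". A minimal sketch of that idea follows, using only the Python standard library. The rows and the field names (`dest`, `dep_delay_min`, `security_delay_min`) are made up for illustration and are not the headers of the BTS export.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Made-up sample rows; field names are illustrative, not the BTS headers.
flights = [
    {"date": date(2023, 1, 2), "dest": "ORD", "dep_delay_min": 30, "security_delay_min": 0},
    {"date": date(2023, 1, 2), "dest": "ATL", "dep_delay_min": 5, "security_delay_min": 0},
    {"date": date(2023, 1, 3), "dest": "ORD", "dep_delay_min": 50, "security_delay_min": 10},
    {"date": date(2023, 1, 9), "dest": "ATL", "dep_delay_min": 15, "security_delay_min": 0},
]

def mean_by(rows, key_fn, value_field):
    """Group rows by key_fn(row) and return the mean of value_field per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_fn(row)].append(row[value_field])
    return {key: mean(values) for key, values in groups.items()}

# Airlines: average departure delay by airport and by day of week (0 = Monday).
delay_by_airport = mean_by(flights, lambda r: r["dest"], "dep_delay_min")
delay_by_weekday = mean_by(flights, lambda r: r["date"].weekday(), "dep_delay_min")

# Airport authorities: average security delay by month.
security_by_month = mean_by(flights, lambda r: r["date"].month, "security_delay_min")

print(delay_by_airport)
print(delay_by_weekday)
print(security_by_month)
```

The same helper serves every persona; only the grouping key and the measured field change.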

3. Examine the data thoroughly: Going one column at a time, understand:

a. The meaning of each column heading

b. The type of data in each column

c. The source of the data (which system does it come from, how accurate is it etc.)

d. How many rows of data are missing and what to do about them

For example, where does the delay due to weather come from? Is it entered manually, and by whom? How much credence should we give it? What does a delay due to the National Aviation System mean? This is a very important step for data scientists, as it also helps build an understanding of the domain. I believe a good data scientist has enough breadth of knowledge to put the data in perspective and is smart enough to gain the necessary depth of domain knowledge in a short time.
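The mechanical part of this step (data type and missing values per column) can be automated; the questions about meaning and source cannot. Below is a rough sketch of such a column profile. The CSV text and its headers are invented for the example, and the type inference is deliberately simple.

```python
import csv
import io

# Made-up sample; real BTS exports use different headers and more columns.
SAMPLE_CSV = """carrier,dest,dep_delay_min,weather_delay_min
AA,ORD,30,0
AA,ATL,,0
AA,ORD,50,
AA,DEN,-3,0
"""

def infer_type(values):
    """Return 'int', 'float', or 'str' for a list of non-empty strings."""
    for name, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "str"

def profile(csv_text):
    """Report the inferred type and the number of missing cells per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    report = {}
    for column in rows[0].keys():
        cells = [row[column] for row in rows]
        present = [cell for cell in cells if cell not in ("", None)]
        report[column] = {
            "type": infer_type(present),
            "missing": len(cells) - len(present),
        }
    return report

column_report = profile(SAMPLE_CSV)
for column, info in column_report.items():
    print(column, info)
```

A report like this only tells us where to look. Whether a missing delay value means "no delay" or "not recorded" is a domain question that has to be answered from the data source.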

4. Think about what types of graphs need to be plotted to visualize each customer persona's pain points/areas of interest (descriptive analysis). For the airlines, bar graphs of delays categorized by airport and cause, and trend lines of delays by time of day, day of the month and month of the year, might be enough to begin with. For the passenger, we need to be able to show a predicted delay for the ticket being booked. Working backwards, we would need a prediction model whose parameters are fitted on training data and whose accuracy is checked on a separate test set.
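To illustrate the train/test idea for the passenger case, here is a deliberately simple baseline, not a recommended model. It predicts the delay as the average training delay for the scheduled departure hour and measures the error on held-out records. The data is synthetic, and the pattern it contains (later flights are delayed more) is an assumption made up for the example.

```python
import random
from collections import defaultdict
from statistics import mean

random.seed(0)  # fixed seed so the split is repeatable

# Synthetic records: (scheduled departure hour, delay in minutes).
# The "later flights are delayed more" pattern is invented for illustration.
records = [
    (hour, 2 * hour + random.uniform(-5, 5))
    for hour in range(6, 23)
    for _ in range(20)
]

# Hold out 20% of the records for testing.
random.shuffle(records)
split = int(0.8 * len(records))
train, test = records[:split], records[split:]

# "Model": the mean training delay for each departure hour.
by_hour = defaultdict(list)
for hour, delay in train:
    by_hour[hour].append(delay)
hour_mean = {hour: mean(delays) for hour, delays in by_hour.items()}
overall_mean = mean(delay for _, delay in train)

def predict(hour):
    """Predicted delay for a flight departing in the given hour."""
    return hour_mean.get(hour, overall_mean)

# Evaluate on the held-out set with mean absolute error (MAE).
mae_model = mean(abs(predict(hour) - delay) for hour, delay in test)
mae_naive = mean(abs(overall_mean - delay) for _, delay in test)
print(f"MAE per-hour model: {mae_model:.1f} min, naive mean: {mae_naive:.1f} min")
```

Comparing against the naive overall mean shows whether the chosen feature adds anything at all, which is worth checking before investing in a more elaborate model.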

5. The data will point to areas of interest; the next step is to dive deep into possible causes and remedial actions. For example, if delays due to security peak in certain months, and those months also have a larger number of flights, it would be good to get additional data on the number of passengers. If the security delays turn out to correlate with the number of passengers, augmenting security staff at times of peak forecasted demand could help mitigate the issue.
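A quick way to check a suspected relationship like this is a correlation coefficient. The sketch below computes the Pearson correlation between monthly flight counts and monthly security delay minutes. All figures are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented monthly figures, January to June.
flights_per_month = [900, 850, 1000, 1100, 1300, 1500]
security_delay_minutes = [40, 35, 55, 60, 90, 120]

r = pearson(flights_per_month, security_delay_minutes)
print(f"correlation = {r:.2f}")
```

A high coefficient only says the two series move together. It does not show that more flights cause the delays, which is why the passenger data is worth collecting before acting on it.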

A logical approach like the one above makes it less overwhelming to face a large data set without knowing exactly what to do with it. Do you have any other approaches that have worked for you? Feel free to comment.

