Solving Business Problems Using a Data Science Approach

I spent several hours over the last few weeks researching this topic. In this post, I will share what I learned about the preparation phase of successful data science projects. Some of the content is derived from Google's best practices for data science projects. Please keep in mind that there are no strictly defined rules in any data-related discipline. That’s why it is best to treat this as a set of suggestions rather than a guideline.

The term Data Science has two parts: Data + Science.

The science part is mostly related to mathematics and computing. At the scale of an individual, there are many ways to learn the mathematical foundations and develop computing skills. I will also discuss these in later posts. If you are running a business, given enough financial resources and time, you can retrain employees, hire new talent or invest in new technology.

Data is similar to water. It may be pouring rain or an oasis in the Sahara. Sometimes it is Lake Superior, sometimes it is bottled water. Sometimes you need to purify it before you can drink it, sometimes you can start irrigating immediately. At the scale of an individual, you have open access to tons of data to start a project. If you are running a business, the burden is on your own shoulders most of the time.

As the title implies, data science may be one of the weapons businesses can use when they face problems. The science part brings the solution. However, data is the potion that the application of science turns into magic. If you have the wrong potion, you get an ugly frog. If you have the right potion, you get a swan (sometimes even a black one).

That’s why the outcome depends on the strength of the ties between the defined problem and the available data. Now, let us take a closer look at the steps that strengthen these ties.

1. State Your Business Problem

It all starts with a clearly defined business problem. You should propose a predictive problem that generates or improves business value for your organization. Moreover, you should also know the impact of the predictive result.

Your business problem may be stated in the following form: How might we (goals) so that (impact/outcome)?

Example:

How might we set promotions/discounts for potential customers based on our existing portfolio so that we can increase our share in their expenses?

2. Perform Unit Analysis

The more detailed the problem is, the clearer it becomes. Most of the time, the unit analysis is derived from the business problem. You should define the units from your available data, and they should be consistent throughout the dataset. You must be sure what each row in your data means.

Example:

Business Problem: How much demand is expected over the next several periods so that we can manage our inventory efficiently?

Unit Analysis: Demand (# of orders, SKU, etc.), Time (day/week/month), Inventory (SKU, # of barrels, parcels etc.)
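One way to confirm what each row means is to check that every unit appears only once. The sketch below is a minimal illustration, assuming the unit is "orders for one SKU in one week". The field names (`sku`, `week`, `orders`), the sample rows and the helper `find_duplicate_units` are all hypothetical and not taken from any real dataset.

```python
from collections import Counter

# Hypothetical demand records: each row is assumed to mean
# "orders for one SKU in one week" (the chosen unit of analysis).
rows = [
    {"sku": "A100", "week": "2024-W01", "orders": 12},
    {"sku": "A100", "week": "2024-W02", "orders": 9},
    {"sku": "B200", "week": "2024-W01", "orders": 4},
    {"sku": "B200", "week": "2024-W01", "orders": 6},  # same unit recorded twice
]


def find_duplicate_units(records, key_fields):
    """Return the unit keys that appear in more than one row."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


duplicates = find_duplicate_units(rows, ("sku", "week"))
print(duplicates)  # [('B200', '2024-W01')]
```

If the list is not empty, the rows do not all describe the same unit, and the data needs to be aggregated or cleaned before modelling.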

3. Define Your Variables

Variables are everything that matters for making predictions. Think of variables as the columns in your data. Your variables should be associated with the unit analysis.

All variables should be consistent in measurement. For example, if price is a variable, the currency (USD, etc.) should be defined beforehand.

All variables should also be consistent in specification. For example, if gender is a variable, recording it as both "female" and "woman", or as both "male" and "man", violates consistency.
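The two consistency rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the label map `GENDER_LABELS`, the rate table `RATES_TO_USD` (with a made-up EUR rate) and the sample records are hypothetical.

```python
# Hypothetical label map and exchange rates, for illustration only.
GENDER_LABELS = {"female": "F", "woman": "F", "male": "M", "man": "M"}
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}  # assumed rates, not real quotes


def normalize_gender(value):
    """Map free-text gender labels onto one agreed code, or None if unknown."""
    return GENDER_LABELS.get(value.strip().lower())


def to_usd(amount, currency):
    """Convert a price to USD using the assumed rate table."""
    return round(amount * RATES_TO_USD[currency], 2)


records = [
    {"gender": "Woman", "price": 50.0, "currency": "EUR"},
    {"gender": "male", "price": 40.0, "currency": "USD"},
]

# Every row ends up with one gender code and a price in one currency.
clean = [
    {
        "gender": normalize_gender(r["gender"]),
        "price_usd": to_usd(r["price"], r["currency"]),
    }
    for r in records
]
print(clean)
```

Returning `None` for unknown labels, rather than guessing, keeps unexpected values visible so they can be reviewed by hand.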

Often, variables do not have a fixed structure. For example, your variable set might be the pixels of images, which might also be of different sizes. In such cases, you should standardize the images to ensure consistency.

Example:

Business Problem: What kind of shoes might an online customer be interested in buying?

Variables: Brand, Color, Size, Price, Discount Rate, InStock_YN etc.

4. Define Your Target

The target is what we are trying to predict. It should also be associated with the unit analysis. It may be categorical, such as Hot/Warm/Cold, or continuous, such as temperature in degrees Celsius or Fahrenheit.

Example:

Business Problem: How should we predict the time a pet spends at a shelter before being adopted?

Variables: Type, Breed, Gender, Color, Name, Health Condition, Image of Pet, Profile Description etc.

Target: # of days spent by a pet at the shelter until adoption
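As a rough sketch, this target could be derived from two date columns. The pet records, the field names (`intake`, `adopted`) and the category thresholds below are hypothetical. The second function only illustrates how the same target could be turned into a categorical one.

```python
from datetime import date


def days_until_adoption(intake, adopted):
    """Continuous target: whole days between shelter intake and adoption."""
    if adopted < intake:
        raise ValueError("adoption date precedes intake date")
    return (adopted - intake).days


def adoption_speed(days):
    """Categorical version of the target, using assumed thresholds."""
    if days <= 7:
        return "Fast"
    if days <= 30:
        return "Medium"
    return "Slow"


# Hypothetical shelter records.
pets = [
    {"name": "Milo", "intake": date(2024, 1, 3), "adopted": date(2024, 1, 20)},
    {"name": "Luna", "intake": date(2024, 2, 1), "adopted": date(2024, 3, 1)},
]

targets = {p["name"]: days_until_adoption(p["intake"], p["adopted"]) for p in pets}
print(targets)  # {'Milo': 17, 'Luna': 29}
```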

5. Set the Actions

Plan how you are going to benefit from the predictions. An action is what users should do after getting the predictions. Your actions should be derived from the business problem and be associated with the target. Depending on the results, multiple alternative actions might be necessary.

Example:

Business Problem: How should Uber set the price for a specific journey?

Variables: Distance, Congestion Level, Estimated Travel Time, Customer History, Supply/Demand Around, etc.

Target: Total cost of the journey shown to a potential customer

Action: Feed the customer's response (Accept/Decline) back into the Uber algorithm to optimize pricing.

6. Define Success Criteria

Your success criteria must be defined so that you can measure the impact of the solution on the problem. They might be to either minimize or maximize something. They might be derived from a statistical criterion. It is best to express them in terms of money, time or manpower. Success criteria should be associated with the target.

Example:

Business Problem: What is the right amount of inventory to start production?

Target: Inventory limit at each distribution center for each product

Success Criteria: Minimize average inventory (monetize by holding cost), Maximize % decrease in # of shortages (monetize shortages)
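The two criteria in this example can be turned into numbers with very little code. The sketch below is a minimal illustration. The function names, the daily stock levels, the holding cost of 0.5 per unit and the shortage counts are all made-up assumptions.

```python
def average_inventory_cost(daily_units, holding_cost_per_unit):
    """Average inventory level, monetized by an assumed holding cost per unit."""
    average_units = sum(daily_units) / len(daily_units)
    return average_units * holding_cost_per_unit


def pct_decrease(before, after):
    """Percentage decrease from `before` to `after` (positive means improvement)."""
    return 100.0 * (before - after) / before


# Hypothetical figures for one product at one distribution center.
stock_levels = [100, 120, 80, 100]  # units on hand per day
cost = average_inventory_cost(stock_levels, holding_cost_per_unit=0.5)
shortage_drop = pct_decrease(before=20, after=15)  # number of shortages

print(cost)           # 50.0
print(shortage_drop)  # 25.0
```

Expressing both criteria in the same currency makes it possible to trade one off against the other.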

I like the simplified, unrealistic Uber example I showed above. Here is a full example for that case:

Business Problem: How should Uber set the price for a specific journey so that they can maximize their daily revenue?

Variables: Distance, Congestion Level, Customer History, Estimated Travel Time, Supply/Demand Around, etc.

Target: Total cost of the journey shown to a potential customer

Action: Feed the customer's response back into the Uber algorithm to optimize pricing.

Success Criteria: Increase daily revenue by 15% compared to a static pricing model by maximizing customer acceptance (or target price).

Unit Analysis:

  • Price: (USD)
  • Distance: (miles between origin & target location)
  • Congestion Level: (Low, Medium, High, Extraordinary. Generated by another model)
  • Customer History: (A for Accept, D for Decline, e.g. A-D-A-A)
  • Estimated Travel Time: (mins)
  • Supply/Demand Around: (e.g. 3 free drivers and 7 customer requests, written as 3/7)
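To show how such encodings could become numeric variables, and how the 15% success criterion could be checked, here is a minimal sketch. The function names and the revenue figures are hypothetical, and this is in no way how Uber actually prices journeys.

```python
def acceptance_rate(history):
    """Share of accepted offers in a history string such as 'A-D-A-A'."""
    responses = history.split("-")
    return responses.count("A") / len(responses)


def supply_demand_ratio(encoded):
    """Free drivers per customer request from a string such as '3/7'.

    Assumes at least one customer request (non-zero demand).
    """
    supply, demand = (int(part) for part in encoded.split("/"))
    return supply / demand


def meets_revenue_goal(static_revenue, dynamic_revenue, uplift=0.15):
    """True if revenue grew by at least `uplift` over the static pricing model."""
    return (dynamic_revenue - static_revenue) / static_revenue >= uplift


print(acceptance_rate("A-D-A-A"))        # 0.75
print(supply_demand_ratio("3/7") < 1.0)  # True: fewer drivers than requests
print(meets_revenue_goal(1000.0, 1200.0))  # True
```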
