How Do You Win the Data Science Wars?  You Cheat By Doing The Necessary Pre-work!

If you can keep your head when all about you

     Are losing theirs and blaming it on you,   

If you can trust yourself when all men doubt you,

    But make allowance for their doubting too;   

If you can wait and not be tired by waiting,

    Or being lied about, don’t deal in lies,

Or being hated, don’t give way to hating,

    And yet don’t look too good, nor talk too wise:

“If” by Rudyard Kipling


I’m sure that most data scientists have experienced that moment when they realize that the folks around them have no idea what they do.  That moment when someone walks up to them and says “I’ve got some data.  Can you do some data science on it?”

Many organizations started their data science journey by hiring a “data scientist” and asking him or her to perform magic on the data.  And while there are countless problems with that approach, companies quickly learned that 1) not everyone who calls themselves a data scientist is a data scientist (I can call myself young and dashing, but that don’t make it so) and 2) there is no magic when it comes to data science. Sorry, but as Chris Rock famously said:

There is no sex in the champagne room

Data Science is very hard work, requiring experience and expertise in gathering (scraping in some cases) data from a wide variety of poorly documented and hard-to-access data sources; dealing with the incompleteness, inaccuracies, vagueness and poor documentation about the data; massaging, twisting and torturing that data into some useful form; and trying a seemingly endless number of analytic transformations, enrichments and algorithms in an attempt to find those combinations of variables and metrics that might yield a better predictor of performance (see Figure 1).

Figure 1:  Data Science Analytics Development Process

The key to the successful execution of the data science analytics development process, where success is defined as delivering predictive and prescriptive results that materially impact the operations and success of the business, is the pre-work.  This means not an hour or so of “showing up and throwing up”, but an orchestrated business stakeholder and subject matter expert engagement process.  That process ensures that the data science team thoroughly and intimately understands the business opportunity under consideration; understands the metrics and KPIs against which progress and success will be measured; has identified, validated, valued and prioritized the decisions that need to be optimized in support of the business opportunity; and understands the costs of the analytics being wrong (the costs of False Positives and False Negatives).  So how do we ensure data science success?  We cheat.

We cheat: we do tons of pre-work before we ever “put science to the data”.
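
One piece of that pre-work, agreeing on what a False Positive and a False Negative each cost the business, can be made concrete before any modeling starts.  What follows is only a minimal sketch, not a prescribed method: the names `ErrorCosts`, `total_error_cost` and `pick_threshold`, the dollar figures, and the sample labels and scores are all hypothetical, and it assumes a binary outcome scored between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorCosts:
    """Hypothetical per-error costs agreed with business stakeholders."""
    false_positive: float  # e.g. cost of acting on a wrong alert
    false_negative: float  # e.g. cost of missing a real event


def total_error_cost(y_true, y_pred, costs):
    """Sum the business cost of every wrong prediction (labels are 0 or 1)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp * costs.false_positive + fn * costs.false_negative


def pick_threshold(y_true, scores, costs, candidates):
    """Return the candidate score threshold with the lowest total error cost."""
    def cost_at(threshold):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        return total_error_cost(y_true, y_pred, costs)
    return min(candidates, key=cost_at)


if __name__ == "__main__":
    # Assumed costs: a miss is ten times as expensive as a false alarm.
    costs = ErrorCosts(false_positive=10.0, false_negative=100.0)
    y_true = [0, 0, 1, 1, 0, 1]
    scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]
    print(pick_threshold(y_true, scores, costs, [0.3, 0.5, 0.7]))
```

With these made-up numbers the cheapest threshold is the lowest one, because a missed event is priced so much higher than a false alarm.  Swap the two costs and the answer changes, which is exactly why the costs need to come from the business stakeholders rather than from the data science team.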

Cheating: The Pre-work before “Putting Science to the Data”

As I discussed in the blog “Why Is Data Science Different than Software Development?”, the methodologies and processes that support successful software development do not work for data science projects for one simple reason: software development knows, with 100% assurance, the expected outcomes, while data science (through data exploration, hypothesis testing, failing and learning) discovers those outcomes (see Figure 2).

Figure 2:  Key Differences Between Software Development and Data Science

Software development defines the criteria for success; Data Science discovers them

Consequently, the development folks and management sometimes do not understand and appreciate the significant amount of work that needs to be done “before ever putting science to the data” to give the data science development process the highest probability of success.  And for data scientists, that data science development process is about getting all the key business stakeholders, business executives and subject matter experts to “think like a data scientist”.

Figure 3:  Thinking Like a Data Scientist

It is critical to the success of the data science initiative not only to involve subject matter experts at the beginning of the engagement (because years of hands-on experience give them valuable insights into variables and metrics that might be better predictors of performance), but also to understand their work environment and decision-making processes to help drive the subsequent adoption of the analytics.

Check out the following sources for more details on the “Thinking Like A Data Scientist” process:

Data Science Development Pre-requisites to Success

There are several data science pre-engagement prerequisites that we require before we ever “put science to the data” (see Figure 4). They include:

  • Creating a persona for each stakeholder or constituent that captures roles, responsibilities, pain points and key operational decisions. 
  • Documenting the use cases that comprise the targeted business initiative or opportunity, including the financial, operational and customer benefits and the potential implementation risks for each use case.
  • Identifying, brainstorming and ranking internal and external data sources against the top priority use cases, and assessing the data implementation risks associated with accessibility, completeness, granularity, accuracy, latency, documentation, etc.
  • Leveraging the Prioritization Matrix to conduct an envisioning exercise with key stakeholders and constituents to prioritize use cases (business value vs. implementation feasibility) and create a Use Case Roadmap that identifies use case interdependencies and prerequisites.
  • Developing a Hypothesis Development Canvas to ensure cross-organizational alignment by fleshing out priority use case business and data science requirements, including the KPIs against which to measure success and progress; the financial, customer and operational benefits; and the costs associated with False Positives and False Negatives.

Figure 4:  Guide for Ensuring Data Science Success
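
The prioritization step in the list above (business value vs. implementation feasibility) can be roughed out in a few lines once stakeholders have scored each use case.  This is an illustrative sketch only, not the Prioritization Matrix itself: the 1-to-10 scales, the midpoint of 5, the quadrant labels, the value-times-feasibility ranking rule and the sample use cases are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    business_value: float  # stakeholder score, 1 (low) to 10 (high)
    feasibility: float     # stakeholder score, 1 (hard) to 10 (easy)


def quadrant(use_case, midpoint=5.0):
    """Place a use case in one of four quadrants (labels are assumptions)."""
    high_value = use_case.business_value >= midpoint
    high_feasibility = use_case.feasibility >= midpoint
    if high_value and high_feasibility:
        return "do first"
    if high_value:
        return "plan for later"
    if high_feasibility:
        return "quick win, low payoff"
    return "deprioritize"


def rank(use_cases):
    """Order by value times feasibility, highest first (an assumed rule)."""
    return sorted(
        use_cases,
        key=lambda u: u.business_value * u.feasibility,
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical use cases and scores from an envisioning workshop.
    candidates = [
        UseCase("Predict equipment failure", business_value=9, feasibility=3),
        UseCase("Automate report formatting", business_value=2, feasibility=9),
        UseCase("Reduce customer churn", business_value=8, feasibility=7),
    ]
    for uc in rank(candidates):
        print(f"{uc.name}: {quadrant(uc)}")
```

The scores matter far less than the conversation that produces them.  The point of the exercise is that the stakeholders, not the data scientists, decide where each use case lands.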

So there, I’ve given away all of our secrets.  No reason why anyone should ever just grab some data and expect the data science team to do magic.  Because remember, there is no sex in the champagne room.

