Data Science Keeps Failing

When summer rolls around, there always seems to be a new study on the impact data science teams have on the businesses they support. This summer is no different. This year, thanks to a panel discussion, VentureBeat has reported that 87% of data science projects fail to create any impact on the businesses they support. That is in the range of the 80-90% failure rate I have seen over the past three years. Why is data science such a big failure? There are a number of issues in play that I have seen, both in dealing with many clients and in my past experience building these teams in previous roles.

 

Lack of Collaboration

Many data science teams don’t do well at collaboration. They might talk about engaging a subject matter expert, but that tends to be a check-the-box meeting, not real collaboration where they work together. I have even seen companies that have a separate team communicate with the rest of the org on behalf of the data science team because the DS team didn’t want to actually talk to other people.

 

Unfortunately, data science does attract a lot of people who enjoy just sitting at a computer plugging away and not dealing with other people. Yet the nature of many of the business cases these teams are asked to solve demands a high level of interaction. In some ways, the companies that have another team doing the interacting for data science are right. Data scientists are often the last people you should send in to do the actual communication piece. That’s a skill set not taught to data scientists. Which leads to point #2.

 

 

The Wrong Talent

There has been a strong focus on data science the last few years. Everyone is rebranding themselves as a data scientist, except me; I don’t call myself one. For good reason. A data scientist should focus on the algorithms. What many companies do is hire data scientists, because DS is cool, then load them up with tasks they really shouldn’t be doing. Such as hiring developers for non-DS projects (yes, I saw this). You have an HR team for that, use them! Or having data scientists do data engineering. I know a ton of data scientists who claim to know data engineering; they often don’t. It is a rare breed that has the proficiency to do both engineering and data science as well as a specialist. They will claim it; have them prove it. You might often find that their level of experience is just not good enough for the job at hand. Data engineering has a lot of non-DS aspects to it that are very important to ensure the business has good data pipelines for everyone, not just some data science project.

 

 

Lack of Clear Understanding

One of the things I find very interesting is just how often what people say and what they want do not match when it comes to the analytics space. Case in point: I was dealing with a very large company as a client, and they kept saying predictive analytics was what they wanted. I asked them to provide me a few examples of what they wished to accomplish. From this discussion, I realized they wanted prescriptive analytics. This is endemic: many people say they need data science when in fact they really need data engineering or data strategy or something completely different.

 

The space has evolved over the last decade. Even back in 2012, data engineers and data scientists were very much their own specialized roles. Yet I constantly run into data scientists who think they are engineers. Not so many engineers who think they are data scientists, which is interesting. Many data scientists will also proclaim to be “business experts,” when in fact that is far from the truth. I do data strategy. I manage the build of algorithms and sometimes build them, but I spend a lot more time on the legal, regulatory, design, customer experience and P&L management of the process. Sometimes I get data scientists telling me that they can do my job, until they actually have to do it. To this day, I have yet to meet one who actually could do my job.

 

It really is a lack of understanding that true data strategy is just as complex and labor intensive as the development of other aspects of the data stack. In fact, I would rather spend my time engineering a tech stack than doing a patent dive, because designing a tech stack is far easier and more enjoyable. If you really want a successful data practice, recognize that the skills of a data scientist, a data engineer and a data strategist are very different and you can’t cheap out on any one of them.

 

 

Things are Getting Static

Data science is often seen as cutting edge. So it is rather funny that a lot of data scientists just hate using tools that make life easier. Just yesterday I saw a post ripping on Snowflake Computing for making data science “look easy.” That is kind of the point, to make it easy. Most of the commenters were saying that you should hand code everything! In some situations, sure, but we have to keep in mind that we often have deadlines, and market forces, not data scientists, decide when something goes to market. So, if you need it fast, take the faster route. If you have time, sure, do your hand coding.

 

But this is the issue with data science: for all its marketing hype around being cutting edge, a lot of data scientists are not. They are more than happy to disrupt the life of the data engineer with all kinds of requests for new tools and data, yet when you want to do the same to the data scientists, you will find they are often the most stubborn group to deal with. But life goes on, progress happens and data science is going to go mainstream. We have seen this over and over again. Remember, 20 years ago you needed someone with specialized skills to code a website, and that skill was often costly. Not today. The same will happen to data science. Most of us don’t need supercomputer-level analytics; we need a basic recommender or pattern recognizer. Something that will become as easy as using Microsoft Word in 10 years. This lack of progress on their part is going to be their own downfall.

 

 

Bad Leadership

Finally, the biggest issue is bad leadership. I often see data science thrown in with software development. Sure, there are some similarities, but I can say that about anything. I mean, marketing and finance both use computers, so why not have both report to the CMO? Most people would laugh at that, yet when it comes to data science, we don’t. Data science belongs under the Chief Data Officer or the business leader, and that leader needs to know how to put data science to use.

 

Far too often I see business leaders shy away from actually leading data science. They think their team will magically come up with answers. The team will have answers, but ones that take a long time and cost a lot of money. But don’t blame the team, blame the leader. This is a field where very few really know how to lead, and if you find someone who does, odds are they are a data strategist, because the nature of that job is more in line with leadership than that of the engineer or data scientist. Yet that is often the most overlooked aspect.

 

Has anything changed from previous years? Not really; the same issues continue, and thus we see the same high rate of failure. Until companies really change, I don’t expect this failure rate to come down. But that doesn’t mean companies are all bad. I have worked on several teams where we had an 80% success rate, because we did things differently. Which would you rather have, 80% failure or 80% success?

Scott Smith, PhD

Director of Data Science and Analytics | PhD | AI/ML

5 years ago

Great insight! Reliable, accurate measures over time are also critical to success. There is minimal opportunity when a data lake hasn’t been cleaned and properly cared for.

Rémy Fannader

Author of 'Enterprise Architecture Fundamentals', Founder & Owner of Caminao

5 years ago

The terminology is doubly confusing, first with regard to science, then with regard to data. Properly speaking "data science" is not a science but a technology, and as such a very effective one. Then, there is a confusion between data (what can be known or anticipated from environments), information (categories or types of data managed by systems), and knowledge (data and information put to use). https://caminao.blog/knowledge-architecture/ontologies-business-intelligence/

Roy Roebuck

Holistic Management Analysis and Knowledge Representation (Ontology, Taxonomy, Knowledge Graph, Thesaurus/Translator) for Enterprise Architecture, Business Architecture, Zero Trust, Supply Chain, and ML/AI foundation.

5 years ago

If you think of analytics as a sequence, first you begin descriptive analytics by building a domain class or type descriptive model, also called a metamodel, ontology, knowledge model, or type knowledge graph, to provide an abstract base for human and machine deep learning. Then you build the domain instance descriptive model, also called an architecture, knowledge base, and instance knowledge graph. Then you do diagnostic analytics using the descriptive analytics products, to find the deficiencies, overlaps, gaps, and shortcomings in the domain's description. Then you do prescriptive analytics to build and link a change portfolio of strategies, programs, projects, deliverables, and dependencies at the mission, function, and operation level. Then you do predictive analytics to guide decision-making on destinations, path, and pace of change. It seems data science operates as predictive analytics without integrated, dynamic, and adaptive prescriptive, diagnostic, and descriptive analytics and an underlying core descriptive model. So they're focusing on disjointed twigs and leaves of a knowledge tree, while discounting the interconnected branches, trunk, roots, soil, and surrounding vegetation. It is broken from the beginning.

Gyuri Lajos

Bootstrapping the Open (Mutual) Learning Commons -" Symmathecist in the medium of software” Augmenting Human InterIntellect on the IndyWeb

5 years ago

We are heaping layers upon layers of accidental complexity on false promises (instead of slaying them). https://www.anchormodeling.com/?p=1214 Data lakes are not delivering, so add a layer of machine learning; still no joy, so add knowledge graphs. The reason these ideas can be sold to the enterprise is that there is a nugget of truth in them. Notice that knowledge graphs get traction just when the big boys are ready to capitalize on them (Neptune, etc.). Then there are big technology bets to consider, which way should we jump: https://www.dhirubhai.net/feed/update/urn:li:activity:6572076839991414784 Always ask: is it alchemy or science? https://www.youtube.com/watch?v=gG5NCkMerHU

Enrique Benito Casado

Data Engineer/ Solution Architect @ LHIND

5 years ago

"Everyone is rebranding themselves to be a data scientist, except me, I don’t call myself one." Edward Chenard "Data Strategy, Data Science,Big Data.." very trustly
