A Data Science Discussion
Mark Cornwel-Smith
Enabling projects to scale across Data Engineering | Data Science | Business intelligence | Data Governance | Data Architecture | Data Management | Data & Analytics
I recently met with Kale Temple, Co-founder & Practice Director of Intellify, to get his perspective on the Data Analytics market and understand some of the challenges he has encountered in different organisations throughout his career.
What are the major trends you’re seeing in the Data Analytics space?
‘The maturity of the Analytics and Business Intelligence market has peaked. As a result, a lot of businesses are looking at how they transition from Analytics and Business Intelligence to Data Science and Machine Learning.
The way you solve a Data Analytics problem is very different to the way you solve a Data Science problem. Otherwise, Data Science would just be known as Data Engineering, which follows a very standard, rigorous process, whereas there is actually a very scientific component to the Data Science side.
The market in terms of people is very imbalanced. There are either very good Data Scientists asking for massive salaries or people who call themselves Data Scientists while they are reskilling. This is a good and a bad thing. Good in that it’s creating more supply in the market, but bad in that organisations hiring this talent don’t know how to solve enterprise Data Science problems. A lot of the common pitfalls aren’t being avoided, and because of that it escalates to over-promising and under-delivering on Data Science functionality.
This has meant that, when hiring Data Science and Machine Learning people, there is the additional overhead of trawling through candidates to find the right people.’
What are the top traits that you see from a good Data Scientist? What skills are required?
‘The first is actually good software engineering skills. Just because you can write some code does not necessarily mean that you can develop an enterprise-grade Data Science solution. A lot of people can build a model on a laptop but can’t put it into a robust, secure environment to deploy these Machine Learning models at scale. The top talent in the market actually has this capability. These people are very rare and they cost a lot of money, but they do have this skill set.
The second one is about understanding the business. A lot of Data Scientists are too focused on, or too caught up in, the technical details, rather than trying to understand the problem they are trying to solve. They get caught up in these algorithms, but it is never about the algorithm; it is about trying to achieve some form of business outcome.
The last one is people having more than surface-level skills. Some people will claim that they have statistics, machine learning and maybe some mathematical optimisation knowledge, but most will only have surface knowledge. You have a lot of noise in the market, with people claiming that they have these skills when in reality they don’t actually have them. The problem here is that we’ve had some people at the top end of town who have a surface-level knowledge of machine learning, so they can hold a conversation, but when you start speaking with them about solving complex problems they don’t actually have the theoretical depth required to properly solve them.’
What is a good data strategy?
‘It depends on the size of the organisation and how mature it is in terms of data analytics capability.
The first step is identifying what business problems the organisation is trying to solve with data, because there is no point in capturing data for data’s sake. Data should be the enabler of the business in whatever capacity, whether it is Business Intelligence, Data Science or Machine Learning. Most importantly, you need to understand what the objectives of the organisation are and work your way backwards. Once you have a single source of truth, it’s about putting this data in the hands of users through a self-service capability. It then moves on to how we give them an enterprise reporting platform where a non-technical person can go in, ask basic questions of the data and get some answers.
Once this is done, it starts to lead into predictive and prescriptive analytics, which is about using data to tell us what to do: predictive in that it tells us what will happen, and prescriptive in that a Machine Learning algorithm might tell you what to do to achieve that ultimate goal, whether that is to maximise revenue, save costs or whatever it is.’
What are some of the most common challenges you see in Data Science?
‘The first one is expectation. A lot of people have different views on what Machine Learning and AI will do for organisations. Setting the right expectation about what is possible in this space is imperative; otherwise, you are destined to fail.
The second challenge is around the people component: people do not know how to hire in this particular space because they are not Data Scientists themselves. This can manifest as not knowing what questions to ask, so they may not know how to validate the talent.
The third one is cultural fit, which is important for building a commercial Data Science capability. How do you get a Data Scientist, a Data Engineer and a Machine Learning strategist to think commercially? Most Data Scientists aren’t commercially savvy enough to be able to identify the opportunities inside the organisation.
The fourth one is that most people don’t have trust in data. If you don’t have trust in your organisation’s data, how can you build a good model on top of it? If you don’t have the right data foundations in place, you can’t really deliver good Data Science and Machine Learning.
Lastly, you need someone inside the business who can identify those use cases and set the path forward. This will help determine where we are going to be able to use Data Science and Machine Learning inside the organisation to deliver business value. What we have found is that if you don’t have someone good at identifying the use cases, you’re not going to get executive buy-in, and because of that all your initiatives are going to fail.’
Do you have a view on whether Data Engineering and Data Science should go together or be separate?
‘It depends on your definition of a Data Engineer. For us, a Data Engineer does data ingestion and cloud engineering work. Data Engineers are the facilitators of Data Scientists, and if you don’t have Data Engineers to set up the foundations and infrastructure for Data Scientists to run their models, the Data Scientists can’t be productive. When a Data Scientist comes up with an algorithm that’s supposed to solve a specific business problem, the Data Engineer has to be the facilitator of that algorithm. So, if you have a Data Science team without Data Engineering you’re going to fail, and vice versa: if you have a Data Engineering team without a Data Scientist, you’re going to fail.
Data Engineers are imperative for facilitating the Data Scientist’s work because they create features, productionise Machine Learning models by creating data pipelines, and monitor for model drift and model degradation.’
Thank you for participating in this Data Science discussion, Kale.