The Data Science Hierarchy of Needs

The Data Science Hierarchy of Needs, modelled on Maslow’s Hierarchy of Needs, shows what data science practitioners need at each stage and helps them understand and prioritise their projects. It outlines the phases a data science project should pass through before it can benefit from AI. More data scientists should be familiar with it, as it gives them a better understanding of their projects and helps them decide on the order of activities.

Data collection

Data collection is crucial and therefore sits at the base of the hierarchy. There can be no insightful analysis without access to data in a suitable format.

You need to know how the data is collected, how it flows through the organisation, which analyses are run on it to derive valuable insight, and how that insight feeds into decisions that affect profit.

It is beneficial for data scientists to take part in the data collection process so that they understand the data's history and can make good decisions about its format. Choosing a suitable file format (for example, CSV or Parquet) can improve processing speed for large files.

Activities at this level include:

Recording transactions

Logging errors

Digitising analogue data

Data Management Plan

Data generation

Data platform development and database management

Data Acquisition
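
As a rough illustration of two of the activities above (recording transactions and logging errors), here is a minimal sketch using only Python's standard library. The field names, the validation rule and the in-memory buffer are assumptions made for this example. A real system would write to a file, a database or a message queue.

```python
import csv
import io
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("collection")

FIELDS = ["timestamp", "customer_id", "amount"]  # hypothetical schema


def record_transaction(writer, customer_id, amount):
    """Validate one transaction and write it out; log and skip bad input."""
    try:
        value = float(amount)
        if value <= 0:
            raise ValueError("amount must be positive")
    except (TypeError, ValueError) as exc:
        # Logging the rejection keeps a trace of what was lost and why.
        logger.error("Rejected transaction for %s: %s", customer_id, exc)
        return False
    writer.writerow({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "customer_id": customer_id,
        "amount": f"{value:.2f}",
    })
    return True


buffer = io.StringIO()  # stands in for a real file or database table
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()

results = [
    record_transaction(writer, "C001", "19.99"),         # valid
    record_transaction(writer, "C002", "not-a-number"),  # rejected and logged
    record_transaction(writer, "C003", -5),              # rejected and logged
]
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

Only the valid transaction ends up in `rows`. The two rejected ones appear in the error log, which is the kind of collection history a data scientist will want to know about later.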

Move, store and organise data

Once data scientists have collected their data, they have to move it somewhere safe and accessible. Organising data makes it easy for researchers to find the information they want.

One thing to note about data sets is that they’re often messy. Data scientists need a way to ensure the data they’re collecting is appropriately structured and ready to be analysed. They do this by writing their own scripts or by using dedicated software.

Activities at this level include:

Data migration

ETL / ELT
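
To make the ETL idea concrete, here is a small sketch of an extract, transform, load script, using only Python's standard library. The sample CSV, the `sales` table and the cleaning rules are invented for illustration, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Made-up raw export with stray spaces, a missing amount and mixed-case codes.
RAW_CSV = """customer_id,amount,country
C001, 19.99 ,ng
C002,,uk
C003,42.50,NG
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, drop rows with no amount, normalise country."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        clean.append((
            row["customer_id"].strip(),
            float(amount),
            row["country"].strip().upper(),
        ))
    return clean


def load(conn, rows):
    """Load: store the cleaned rows in a table that analysts can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id TEXT, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real database
load(conn, transform(extract(RAW_CSV)))
totals = conn.execute(
    "SELECT country, COUNT(*), ROUND(SUM(amount), 2) FROM sales GROUP BY country"
).fetchall()
```

In an ELT variant, the raw rows would be loaded first and the clean-up would happen inside the database.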

Explore, transform and analyse data

Next, we need to explore the data, transform it into a suitable format, and analyse it. Data can help us understand what’s happening in the organisation and why. It generally starts with basic data analysis tools, such as reports, dashboards, and KPIs.

As the company matures, more robust solutions such as ETL pipelines, a data warehouse, or a data lake will be put in place.

Activities at this level include:

Building ETL pipelines

Data cleaning

Descriptive analytics

Reporting/dashboards

Exploratory data analysis
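
As one possible illustration of data cleaning followed by descriptive analytics, the sketch below uses Python's built-in `statistics` module. The raw values and the rule for discarding unparseable entries are assumptions. In practice a library such as pandas is often used for this step.

```python
import statistics

# Made-up raw export: strings with stray spaces, blanks and placeholders.
raw_order_values = ["12.5", " 40 ", "", "n/a", "18", "12.5", "25"]


def clean(values):
    """Keep only the entries that parse as numbers."""
    cleaned = []
    for value in values:
        try:
            cleaned.append(float(value.strip()))
        except ValueError:
            continue  # blank or non-numeric entry, drop it
    return cleaned


def describe(values):
    """Return a few basic descriptive statistics."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


orders = clean(raw_order_values)
summary = describe(orders)
```

A summary like this is often the first thing to feed into a report or dashboard, and the gap between the mean and the median is a quick hint that the data is skewed.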

Generate insights from data - predictive, prescriptive, diagnostic, descriptive analytics

Once data are collected, stored, transformed, and analysed, we need to use them to generate insights that drive business decisions. Four types of data analytics can help create insights: descriptive, diagnostic, predictive and prescriptive.

The organisation may incorporate predictive analytics, prescriptive analytics, and machine learning into its data science pipeline.

Activities at this level include:

Statistical analysis

Descriptive analytics – Reporting and Dashboards

Diagnostic analytics – anomaly detection

Predictive analytics – Supervised and unsupervised ML

Prescriptive analytics – AI and machine learning
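
To give a flavour of the diagnostic item in the list above, here is a very simple anomaly detection sketch based on z-scores. The sales figures and the threshold of 2.0 are assumptions chosen for the example. Real projects often use more robust methods, since a single extreme value inflates the standard deviation.

```python
import statistics

daily_sales = [100, 102, 98, 101, 99, 103, 97, 250]  # made-up figures


def find_anomalies(values, threshold=2.0):
    """Return (index, value, z_score) for points far from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    anomalies = []
    for index, value in enumerate(values):
        z = (value - mean) / stdev
        if abs(z) > threshold:
            anomalies.append((index, value, round(z, 2)))
    return anomalies


anomalies = find_anomalies(daily_sales)
```

Here only the last day (250) is flagged. Finding out why that day was so different is the diagnostic question.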

Automate the process

Automation is a critical aspect of data science because data scientists should be spending most of their time solving business problems. That means we need to automate our data science processes.

When appropriately applied, data-driven AI can minimise our costs and maximise our revenue. This type of AI sets the industry leaders apart from everyone else.

Activities at this level include:

ML Pipeline development

Auto ML

A/B testing and experimentation
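
For the A/B testing item, one common way to compare two conversion rates is a two-proportion z-test. The sketch below is an illustration with invented visitor and conversion counts, not a full experimentation framework.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up experiment: variant A converts 200 of 2000 visitors, B 260 of 2000.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
significant = p_value < 0.05  # conventional 5% significance level
```

With these made-up numbers the difference is significant at the 5% level. Once a check like this runs automatically for every experiment, nobody has to redo the calculation by hand.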

Conclusion

The Data Science Hierarchy of Needs model shows the needs of data science practitioners and helps them understand and prioritise their projects. It outlines the different phases a data science project goes through, with a focus on the needs of the team. It’s a model that more data scientists should be familiar with, as it gives them a better understanding of their projects and helps decide the order of activities.

Scroll down and click on the like button if you enjoyed this blog.

Follow me on Medium.

Click here to subscribe to my weekly newsletter for more blog posts.

See you next week. Thank you!


