Roles in Data - and whom to hire first
Photo by Hassan Pasha on Unsplash

Several distinct roles in data deal with different parts of the data stack. Understanding the differences matters even for non-data people, because searching for, or even hiring, the wrong people may slow down your project.

In this article I'll go through the different roles and discuss their background and typical work area as well as the technologies and tools they use. I am somewhat biased toward modern cloud-based data warehouse architectures, which is reflected in the tools listed. Be aware that there are many more out there and I can only give a few examples.

After describing each role individually, I'll explain at the end of the article which roles to hire first when you are entering the data domain.

Tracking Specialist/Web Analyst

Possibly the most diverse position in the field. They need to know how web or mobile apps work (HTTP, REST, SPA, cookies), be able to program in JavaScript and HTML to extract data from websites and apps, and know quite a bit about marketing to manage UTM parameters and send data to marketing partners. Last but not least, they need to be analysts who understand what they measure and can recommend actions. A/B testing may also be among their tasks.

Due to this diverse profile, only a small group of people really master the full spectrum. Web analysts often concentrate on the business/operational side and leave the tracking implementation to IT. This, however, often creates friction, as IT may deprioritize tracking or may not fully understand the business/data needs.
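To make the UTM parameters mentioned above a bit more concrete, here is a minimal sketch in Python of how campaign information can be read from a landing page URL. The URL, the parameter values and the helper name `extract_utm` are made up for illustration; in practice this is usually handled by the tracking tool itself.

```python
from urllib.parse import urlparse, parse_qs


def extract_utm(url: str) -> dict[str, str]:
    """Return the utm_* query parameters of a URL as a flat dict."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list of values per key; keep the first value
    # of every key that starts with "utm_" and ignore everything else.
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


# Hypothetical landing page URL as it might appear in a newsletter link.
landing_url = (
    "https://shop.example.com/sale"
    "?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale&ref=abc"
)

utm = extract_utm(landing_url)
print(utm)
# {'utm_source': 'newsletter', 'utm_medium': 'email', 'utm_campaign': 'spring_sale'}
```

The non-UTM parameter `ref` is dropped, which is the kind of decision a tracking specialist has to make deliberately: what belongs to marketing attribution and what does not.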

Technologies and Tools: Google Analytics, Adobe Analytics, Amplitude, Matomo, Google Tag Manager, Tealium, Adobe Launch, Segment, Optimizely, marketing pixels (Facebook, GA, Google Ads), JavaScript, HTML, and many more

Data Analyst

Probably the most versatile position in the field. They must understand the business to deliver insights that lead to data-driven decisions. To that end, hypotheses about product or business model improvements are generated, which the data analyst either confirms with existing data or tests by generating new data. The analyst also models future developments and thereby supports business decisions further.

Data analysts need to be able to analyze data either in an ad-hoc study or by providing dashboards for questions that need to be answered repeatedly. This requires technical and mathematical skills along with strong analytical thinking.

The business analyst is a variant that focuses more on the business side of analytics than on the technical/data side: less data crunching, more reasoning, and more (and potentially better) communication with stakeholders, as business analysts live more in the business world. The financial analyst is someone who looks specifically into financial data and forecasting.

The analytics engineer is a rather recent development. In contrast to the business analyst, they are more technical and go deeper into the data, including data modeling, data quality, big data and even some level of data engineering.

Technologies and Tools: SQL, Tableau, Data Studio, Looker, Python, R, pandas, Excel, PowerPoint, dbt, SaaS ETL

Data Engineer

Data engineers enable the collection and use of (big) data by providing and maintaining storage and compute resources. This is often done through (cloud) data warehouses or databases in general (the database administrator role). Classically, this includes data modeling, i.e. bringing source data into a clean and joinable format to prepare it for analysis. Nowadays I see this moving into the analyst's domain (in particular as a task for an analytics engineer), given the importance of reflecting business needs in the data models.

Data cleaning and data quality checks are also part of their work, in particular the technical dimensions such as consistency, validity, uniqueness and timeliness. I see completeness and accuracy (does the dashboard show the same revenue as the bank account?) rather in the analytics domain.
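As a rough illustration of what such technical checks can look like, the sketch below tests a handful of order records for uniqueness, validity and timeliness in plain Python. The field names, the sample rows and the 24-hour threshold are assumptions made up for this example, and consistency checks are left out for brevity. In a real stack these rules would more likely live in the warehouse or in a tool like dbt.

```python
from datetime import datetime, timedelta


def check_orders(rows, max_lag=timedelta(hours=24)):
    """Return a dict mapping each rule name to the offending order ids."""
    issues = {"uniqueness": [], "validity": [], "timeliness": []}
    seen = set()
    for row in rows:
        order_id = row["order_id"]
        # Uniqueness: an order id must not appear more than once.
        if order_id in seen:
            issues["uniqueness"].append(order_id)
        seen.add(order_id)
        # Validity: the amount must be present and non-negative.
        if row["amount"] is None or row["amount"] < 0:
            issues["validity"].append(order_id)
        # Timeliness: the record must reach the warehouse within max_lag.
        if row["loaded_at"] - row["created_at"] > max_lag:
            issues["timeliness"].append(order_id)
    return issues


# Hypothetical sample data with one problem per rule.
rows = [
    {"order_id": 1, "amount": 20.0,
     "created_at": datetime(2024, 1, 1, 10, 0), "loaded_at": datetime(2024, 1, 1, 10, 5)},
    {"order_id": 2, "amount": -5.0,
     "created_at": datetime(2024, 1, 1, 11, 0), "loaded_at": datetime(2024, 1, 1, 11, 10)},
    {"order_id": 2, "amount": 15.0,
     "created_at": datetime(2024, 1, 1, 11, 0), "loaded_at": datetime(2024, 1, 1, 11, 10)},
    {"order_id": 3, "amount": 30.0,
     "created_at": datetime(2024, 1, 1, 12, 0), "loaded_at": datetime(2024, 1, 3, 12, 0)},
]

issues = check_orders(rows)
print(issues)
# {'uniqueness': [2], 'validity': [2], 'timeliness': [3]}
```

Note that none of these rules can tell whether an order is missing entirely or whether the amount matches what was actually paid. That is the completeness and accuracy side, which needs business knowledge.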

Data engineering roles have also expanded with the recent shift to the cloud. DataOps (DevOps for data) has emerged, and technologies like Docker and Kubernetes have become standard, as has running data pipelines on MPP technologies like Spark. Data engineers may also support data analysts in optimizing the performance of their analytics/SQL workloads.

Technologies and Tools: SQL, BigQuery, Snowflake, Redshift, Databricks, Airflow, Talend, dbt, Docker, Kubernetes, Spark, Hadoop, Python, Java, Scala

Data Scientist

There is considerable overlap between data scientists and data analysts. As a tendency, I see the analyst's domain in descriptive and diagnostic analytics, i.e. using existing data to calculate deterministic KPIs. The data scientist, in contrast, develops models that combine a) existing data and b) assumptions (i.e. the model) to produce predicted KPIs or classifications, which have an inherent uncertainty. This generally involves machine learning techniques, which usually need statistical evaluation to assess the significance of the prediction.

Data scientists are also responsible for data cleaning and feature engineering. Note that the cleaning needed for analytics differs from the cleaning needed for prediction, and often the same process cannot be used for both. In analytics it is important not to lose any business objects (you don't want to underreport your revenue), whereas when training models it is advisable to filter out incomplete data and work with a subset.

As a special case, the ML engineer deploys the developed prediction models so that they are fed with streaming real-time production data to make predictions (inference) and write the results back into the production system to automate or personalize the experience. ML engineers use techniques similar to those of data engineers.

Technologies and Tools: Python, R, machine learning, pandas, scikit-learn, Keras, PySpark, SQL, Docker/Kubernetes
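The difference between the two kinds of cleaning can be sketched in a few lines of Python. The order records, the feature names and both helper functions are hypothetical and only meant to illustrate the idea.

```python
# Hypothetical order records; None marks a missing attribute.
orders = [
    {"order_id": 1, "revenue": 100.0, "country": "DE", "device": "mobile"},
    {"order_id": 2, "revenue": 50.0, "country": None, "device": "desktop"},
    {"order_id": 3, "revenue": 25.0, "country": "FR", "device": None},
]

# Attributes a (hypothetical) model would use as input features.
FEATURES = ("country", "device")


def reporting_revenue(rows):
    """Analytics view: keep every order, even if attributes are missing."""
    return sum(row["revenue"] for row in rows)


def training_rows(rows, features=FEATURES):
    """Modeling view: keep only rows where all features are present."""
    return [row for row in rows if all(row[f] is not None for f in features)]


total = reporting_revenue(orders)
train = training_rows(orders)

print(total)                            # 175.0
print([r["order_id"] for r in train])   # [1]
```

If the training filter were reused for reporting, revenue would drop from 175.0 to 100.0, which is exactly the underreporting the analytics side has to avoid.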

Data Leader

Data leaders manage a team of data professionals and steer strategic decisions regarding the data infrastructure of an organization. Depending on the team they manage, they need the same skills and methods as data engineers, tracking specialists, data analysts or data scientists, or even all of them together. To make the right strategic decisions, they need a lot of experience in their field(s). While the individual data professionals may be more skilled in how a certain technology is used, it is most important for data leaders to know what technology should be used to reach a certain strategic goal and why.

Whom to hire first

If you are running a web business whose business model is mainly built on a website or web shop, you would hire a tracking specialist / web analyst to set up a decent tracking and analytics solution.

If you want to go beyond that and integrate other data sources, or if you don't even have any web tracking data, you should go for an analytics engineer. As this is a rather new role, you could also search for data analysts. However, make sure they have experience with some level of data integration and data modeling in addition to building dashboards.

It may be overkill to hire a data leader at the start. On the one hand, they may be less "hands-on" than needed at the beginning; on the other hand, they may miss the big strategic and technical challenges, so the engagement may turn out to be short. If you start big and hire a larger team right from the start, however, hiring a leader is certainly a good idea.

Data engineers and data scientists are (with potentially a few exceptions) the wrong people to start with. While they can build amazing technology, it will most certainly not connect, on either the input or the output side, to anything you have at the moment that would bring your business forward.

It is essential that the first hire is not a junior, as they will be the first person with a data background in your organization, so all strategic decisions will rely on them. A good alternative may be to contract external people to build your initial data stack and then have them onboard the people you hire permanently.

Takeaway

There are different roles in data, and it is important to distinguish between them to hire the right people. If you are starting with data, you should go for a web analyst or an analytics engineer, as they bring exactly the skills needed when no data infrastructure is in place yet.

If you have any questions or comments let me know.
