We Don't Need Data Scientists, We Need Data Engineers
Data is all around us, and its volume is only increasing. In the past decade, data science has become a magnet for newcomers eager to delve into this promising field.
Current Trends in Data Science Hiring
As someone developing an educational platform for data professionals, I often consider how the job market for data-centric roles, like machine learning and data science, is shifting.
Many aspiring data professionals, including students from prestigious global institutions, express confusion about which skills are crucial to differentiate themselves and prepare for their future careers.
A data scientist’s job can include a variety of tasks: from machine learning modeling and data visualization to data cleaning, processing, engineering, and even deploying models into production.
Starting Point for Newcomers
To provide concrete guidance, I analyzed job postings from companies that have come out of Y Combinator (YC) since 2012. My research was driven by these questions:
Research Methodology
I focused my analysis on YC portfolio companies that integrate data work into their core services. YC is a good data source because its directory of companies is extensive and easy to search or scrape. The incubator has supported a diverse range of companies for over a decade, which makes it a reasonably representative sample for this analysis. One caveat: large tech companies are not included, and their absence might skew the general trends.
I started by scraping the websites of all YC companies since 2012, the year the machine learning wave took off with the success of AlexNet. This produced an initial list of around 1,400 companies. I narrowed it down by filtering for keywords related to data and machine learning and by excluding companies with non-functional websites.
Because I favored high recall at this stage, the filter let through many candidates that I then narrowed down in a more detailed manual review. Ultimately, I identified about 70 companies actively hiring for data-related roles.
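As a rough illustration of the filtering step, here is a minimal Python sketch. The keyword list, the function names, and the example companies are all hypothetical; the actual terms and tooling used in the analysis are not listed here. The sketch uses plain substring matching, which deliberately favors recall: it also catches loosely related pages (for example, one that only mentions a "database"), and those are exactly what the manual review is meant to weed out.

```python
# Hypothetical keyword list (an assumption for illustration only).
DATA_KEYWORDS = ["data", "machine learning", "analytics", "artificial intelligence"]


def mentions_data_work(page_text, keywords=DATA_KEYWORDS):
    """Return True if any keyword occurs in the text (case-insensitive substring match)."""
    text = page_text.lower()
    return any(keyword in text for keyword in keywords)


def filter_companies(pages):
    """Keep companies whose site loaded and whose text matches a keyword.

    `pages` maps a company name to its scraped text, or to None when the
    website was non-functional.
    """
    return [
        name
        for name, text in pages.items()
        if text and mentions_data_work(text)
    ]


# Made-up example pages, not real scraped data.
pages = {
    "ExampleCo A": "We build machine learning tools for radiology.",
    "ExampleCo B": "Fresh meal kits delivered weekly.",
    "ExampleCo C": None,  # website did not load
    "ExampleCo D": "Our database of recipes updates daily.",
}

candidates = filter_companies(pages)
print(candidates)
```

In this toy run, "ExampleCo D" passes the filter only because "database" contains "data". That is the kind of false positive a high-recall filter accepts on purpose.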
Understanding Data Roles
Before sharing the results, let’s clarify what each data role typically involves:
Analysis of Data Roles
When I charted how often each data role appeared in job postings, data engineers were in significantly higher demand than data scientists, with roughly 55% more openings. Machine learning engineers were in about the same demand as data scientists, while ML scientist roles were fewer.
By consolidating similar job titles, the demand for data engineers appeared even greater, about 70% higher than for data scientists. Machine learning engineers also showed about 40% more openings than data scientists.
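To make the consolidation step concrete, here is a small Python sketch of how similar titles can be merged and compared against a baseline role. The title-to-role mapping and the list of postings are invented for illustration; they are not the mapping or the counts behind the figures above.

```python
from collections import Counter

# Hypothetical mapping from raw job titles to consolidated roles.
# Which titles count as "similar" is an assumption made for this sketch.
TITLE_TO_ROLE = {
    "data engineer": "data engineer",
    "analytics engineer": "data engineer",
    "data scientist": "data scientist",
    "machine learning engineer": "ml engineer",
}


def consolidate(titles):
    """Count postings per consolidated role; unmapped titles keep their own name."""
    roles = (TITLE_TO_ROLE.get(t.lower().strip(), t.lower().strip()) for t in titles)
    return Counter(roles)


def percent_more(counts, role, baseline="data scientist"):
    """How many percent more openings `role` has than `baseline`."""
    return 100.0 * (counts[role] / counts[baseline] - 1)


# Made-up postings, chosen only to keep the arithmetic easy to follow.
postings = (
    ["Data Engineer"] * 3
    + ["Analytics Engineer"]
    + ["Data Scientist"] * 2
    + ["Machine Learning Engineer"] * 3
)

counts = consolidate(postings)
print(counts["data engineer"], counts["data scientist"], counts["ml engineer"])
print(percent_more(counts, "data engineer"))  # 4 vs 2 openings -> 100.0
print(percent_more(counts, "ml engineer"))    # 3 vs 2 openings -> 50.0
```

The same relative measure underlies statements like "about 70% more openings": divide a role's count by the data scientist count, subtract one, and express the result as a percentage.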
Key Takeaways
The increasing demand for data engineers reflects an evolution in the field. While machine learning hype increased the demand for data scientists who could build classifiers, the real bottleneck now is in processing and managing data efficiently.
This shift highlights the need for strong engineering skills focused on data management, which might not seem as glamorous but is crucial. Many educational programs for data professionals still don’t emphasize these engineering skills enough.
Despite the competition and evolving demands, there's still a need for skilled individuals who can analyze data and extract actionable insights from it. However, merely downloading pre-trained models is no longer sufficient to secure a data science job. The market now favors those who can not only use advanced tools like TensorFlow but also understand and contribute to their development.
While machine learning research gets a lot of attention, many companies need practical, scalable solutions more than the latest breakthrough. Most cutting-edge research roles are found in well-funded industrial labs rather than startups.
Conclusion
It’s essential for new entrants to the data field to have realistic expectations. Understanding the current state of the field helps us see the path forward. I hope this post has provided clarity on where things stand today in the data job market.