Switching gears from Dev to Data

Leapfrog Technology, Inc.

Building digital experiences of tomorrow by innovating better, faster.

Published: December 10, 2024

Written by: Anish Silwal, Lead Engineer

Are you looking to start your career in data engineering? Or maybe you're a seasoned full-stack engineer thinking of transitioning into the exciting world of data engineering? Or perhaps you're just curious about what data engineering is? Then you're in luck.

In this blog, I'll share my experience of transitioning from full-stack engineer to data engineer.

The switch

I began my career in IT as a full-stack engineer. When I think of full-stack engineers, I picture software engineers who build all parts of a system and help deploy it: they work on the front end and back end, and manage the deployment infrastructure.

In my opinion, full-stack is a mindset, one that drives you to want to learn everything involved in building software.

This curious mindset led me on a journey from building REST APIs to building ETL pipelines. Eventually, when my team lead approached me about joining a data-centric group internally, I decided to explore data engineering.

Overlapping skill sets

When I started in data engineering, I didn't feel much of a difference, as I was already familiar with writing SQL queries from my work on back-end systems. Since I was also familiar with CI/CD pipelines, the learning curve was smooth.

1. Building the front end: From UI to dashboards

As a full-stack engineer, I was accustomed to crafting intuitive user interfaces. In data engineering, this skill translates to building interactive dashboards. Tools like Tableau, Power BI, or even custom Python-based dashboards allowed me to visualize complex data sets in a user-friendly manner.

For smaller use cases, a data engineer might even need to build dashboards directly with web technologies (HTML, React).

2. Crafting the back end: From databases to data pipelines

My back-end development experience was invaluable in data engineering. While I had worked with relational databases like MySQL or PostgreSQL, as a data engineer, I often delved into big data technologies like Hadoop, Spark, or cloud-based data warehouses like Snowflake or BigQuery. The core concept remained the same: designing, building, and maintaining systems to store and retrieve data.

The other part of working with data was building pipelines that interact with many systems, including reading data from and writing data to a system over REST APIs or SOAP.
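
As a rough illustration of that kind of pipeline, here is a minimal sketch of an extract-transform-load flow in Python. It is not taken from any project I worked on: the payload in `SAMPLE_RESPONSE` is a made-up stand-in for the body of a REST API response, the `users` table is hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import json
import sqlite3

# Hypothetical payload standing in for the body of a REST API response.
SAMPLE_RESPONSE = (
    '[{"id": 1, "name": " Alice ", "visits": "3"},'
    ' {"id": 2, "name": "Bob", "visits": "5"}]'
)


def extract(raw_json: str) -> list[dict]:
    """Parse the raw JSON text returned by the source system."""
    return json.loads(raw_json)


def transform(records: list[dict]) -> list[tuple]:
    """Trim whitespace from names and cast visit counts to integers."""
    return [(r["id"], r["name"].strip(), int(r["visits"])) for r in records]


def load(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Write the cleaned rows into the (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(id INTEGER PRIMARY KEY, name TEXT, visits INTEGER)"
    )
    # INSERT OR REPLACE keeps the load idempotent if the same batch is re-run.
    conn.executemany("INSERT OR REPLACE INTO users VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw_json: str) -> sqlite3.Connection:
    """Run extract, transform, and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(raw_json)))
    return conn


if __name__ == "__main__":
    connection = run_pipeline(SAMPLE_RESPONSE)
    for row in connection.execute("SELECT id, name, visits FROM users ORDER BY id"):
        print(row)
```

In a real pipeline the extract step would call the source API and the load step would target the warehouse, but the three-stage shape tends to stay the same.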

3. Mastering DevOps: From infrastructure to Data Ops

DevOps practices are essential for both full-stack and data engineers. In my previous role, I automated deployments, monitored system health, and implemented CI/CD pipelines. As a data engineer, I extended these skills to data pipelines, automating data ingestion, transformation, and loading processes, ensuring data reliability and efficiency.

Effective Data Ops requires a comprehensive understanding of CI/CD, cloud-native technologies, and load balancing techniques. These skills are crucial for automating data pipelines, ensuring seamless deployments, and optimizing the performance of distributed data systems. Data Ops also includes data governance: making data easily accessible while enforcing proper access control.

Unique challenges and opportunities

While the core principles align, data engineering presents unique challenges and opportunities:

Data quality and integrity: Data engineers must prioritize data quality and ensure data accuracy throughout the pipeline.
Scalability and performance: As data volumes grow, data engineers need to optimize pipelines and storage systems to handle increasing loads.
Data security and privacy: Protecting sensitive data is paramount, requiring strong security measures and compliance with regulations like GDPR and HIPAA.
Staying relevant: While it's easy to get started in data engineering, it takes sustained effort to stay relevant in this landscape. Many tools can do the same task, and choosing the right one can feel daunting.

My journey

As I transitioned into data engineering, I was fortunate to work on a healthcare project involving a Microsoft SQL Server data warehouse. My prior experience with SQL-based backend systems made it relatively straightforward to grasp the business logic and quickly become productive. A strong foundation in SQL can be a significant advantage regardless of the underlying data warehouse technology.

One initial challenge I faced was testing changes in a complex environment with interconnected systems like Airflow and AWS. Unlike traditional software development where local testing is common, data engineering often requires testing in a production-like environment due to data dependencies and system interactions.

Another notable difference is the reduced emphasis on unit testing for data pipelines. Data engineers typically rely on integration testing and data quality checks to validate changes. Tools like Soda Core and dbt can be helpful in automating these checks.
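
To make the idea of a data quality check concrete, here is a minimal sketch in plain Python. It does not use the Soda Core or dbt APIs; the function names, the check names, and the sample `batch` are all hypothetical, chosen only to show the kind of assertions such tools automate (row counts, non-null columns, unique keys).

```python
from collections import Counter


def null_count(rows: list[dict], column: str) -> int:
    """Count rows where the column is missing or None."""
    return sum(1 for row in rows if row.get(column) is None)


def duplicate_values(rows: list[dict], column: str) -> list:
    """Return the non-null values that appear more than once in the column."""
    counts = Counter(
        row[column] for row in rows if row.get(column) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


def run_checks(rows: list[dict]) -> dict[str, bool]:
    """Evaluate each check and report whether it passed."""
    return {
        "row_count_positive": len(rows) > 0,
        "id_not_null": null_count(rows, "id") == 0,
        "id_unique": not duplicate_values(rows, "id"),
    }


# Hypothetical batch with one duplicated id.
batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 12.5},
    {"id": 2, "amount": 12.5},
]

results = run_checks(batch)
failed = [name for name, passed in results.items() if not passed]
print(failed)
```

A pipeline would typically run checks like these after each load and stop or alert when any of them fail, rather than relying on unit tests of the transformation code alone.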

One of the most rewarding aspects of my early data engineering experience was the direct interaction with customer success teams to resolve user issues. Unlike the more layered support structures in full-stack development, data engineers often work closely with delivery teams to address critical data-related problems. This requires a strong problem-solving mindset and the ability to quickly identify and resolve issues.

The treasure map

[Figure: Oversimplified data engineering landscape]

If you are thinking about starting your career in data engineering, try following this roadmap:

Learn the basics: SQL, ETL, data modelling, data warehousing, and a programming language such as Java or Python
Learn orchestration: Apache Airflow, dbt
Delve into big data technologies: Apache Spark, Apache Flink, open table formats (Iceberg, Hudi, Delta Lake), open file formats (Parquet, Avro)
Explore data platforms: Databricks, Dremio

While the data engineering field offers a plethora of tools, it's essential to focus on fundamental concepts. These underlying principles remain consistent across different technologies.

It's important to note that data engineering doesn't always involve complex mathematical models. While machine learning models might occasionally be part of the equation, data engineers typically focus on deploying and maintaining these models, rather than developing them from scratch. Data scientists and machine learning engineers are usually responsible for the complex mathematical aspects. However, data engineering can serve as a stepping stone into machine learning and data science.

Platform engineering is an emerging role that blends elements of full-stack and data engineering. If you're a full-stack engineer interested in data engineering, platform engineering could be a compelling career path.

Conclusion

By leveraging your full-stack expertise and embracing the challenges of data engineering, you can unlock exciting career opportunities. As data continues to drive innovation across industries, skilled data engineers are in high demand. So, if you're ready to take the plunge, start by exploring data engineering concepts, learning new tools, and building hands-on projects.

Stay curious. Keep learning.

What are your thoughts on this piece? Let us know in the comments, or reach out to Anish Silwal via LinkedIn or GitHub.
