4 Things To Keep In Mind As I Begin The Data Engineering Journey Again
To break into the field quickly and grow more efficiently.
Intro
I started sharing what I’ve learned about data engineering two years ago. Fortunately, many of you have supported me. I have exposed my ideas to thousands of readers, learned from talented people, and, luckily, some folks have reached out to me for help.
“How would I enter the field?”
“How would I learn to become a data engineer?”
During those conversations, I realized I had once been in the same situation as they were. I started the journey from zero with a non-CS background, and I made many mistakes and learned many lessons along the way.
Sometimes, I wish I had known some things sooner.
I thought it would be a good idea to share these things with all of you.
#1 Be aware of the data engineer’s responsibilities
It was the first day of my first data job in 2019.
I met the team members, found a place to sit, and got a laptop to code on. Then, I was assigned to take over a Docker deployment of a POC. Back then, I didn’t even know what Docker and POC meant.
It turned out to be a POC of a simple “data” project with a few Docker containers running HDFS, Spark, and Elasticsearch. I spent days learning the concept of Docker (why not just use a VM?) and how to run those containers.
Like a robot, I didn’t care why I needed to do this; as long as those containers were up and running, I was happy.
In my second job, the situation was much the same; the difference was that I had more chances to work with real-life data.
Write a lambda function to load data from S3 to Redshift.
Write a SQL transformation on Redshift.
Oh, S3 sucks. Let’s move to GCP (the bosses told me to do it; don’t ask me why because I don’t know).
I did things just because I had a task and wanted to complete it.
It was dangerous, and it harmed my reasoning ability. I waited for someone to give me a task. I finished it, and I felt awesome.
But over time, the tasks became less interesting than they used to be, and I felt less satisfied. I asked for more tasks. I stayed up late to complete them. The feeling of satisfaction increased a little bit, but it soon faded away. I asked myself, “Why am I doing this?”
This happened to me because I didn’t know exactly how to create value. I had made up my own equation: task completed = value created.
I was stuck for a while before realizing the solution was easy: being aware of the data engineer's responsibilities.
Although the idea was straightforward, those responsibilities were hard to find online back then; even a search for the definition of data engineering could return 10 different answers.
It was not until 2022, when Joe Reis and Matt Housley released the book Fundamentals of Data Engineering, that I was enlightened.
According to the book, data engineering is:
Data engineering is the development, implementation, and maintenance of systems and processes that take in raw data and produce high-quality, consistent information that supports downstream use cases, such as analysis and machine learning. -Source-
…and the data engineer is:
A data engineer manages the data engineering lifecycle, beginning with getting data from source systems and ending with serving data for use cases, such as analysis or machine learning. -Source-
For me, this understanding is the most important thing you should equip yourself with. It will help you see how your work can create value, which things matter, how to spot a problem and solve it proactively, what you should learn, why you should learn it, and more.
Here is my recommendation: read the first two chapters of Fundamentals of Data Engineering, and do whatever else it takes to make sure you understand a data engineer’s responsibilities.
If you’re preparing to enter the field, this will help you be aware of what you should learn and why you should learn it. You will be more motivated than if you were unquestioningly learning a tool or a concept.
If you feel stuck in your data engineering career, this will help you reflect on your contribution to your organization and guide you to seek more opportunities to create impact as a data engineer.