Articles by Tag #dezoomcamp


Data Engineering Zoomcamp 2025 Cohort: Introduction - Self-Study Notes

Overview Course Duration and Structure Duration: 6 modules plus 2...

Jan 25

Study Notes: DE Zoomcamp 1.2.1 - Introduction to Docker

Overview Topic: Introduction to Docker and its importance for data engineers. Purpose:...

Jan 26

Notes on Data Engineering Zoomcamp 2025 - Launch Stream

Overview: Course Edition: Fourth edition of the Data Engineering Zoomcamp. Purpose of...

Jan 24

Study Notes 5.1.1-2 Introduction to Batch Processing & Spark

1. Introduction to Batch Processing What is Batch Processing? Batch processing...

Mar 4

InsightFlow Part 1: Building an Integrated Retail & Economic Data Pipeline - Project Introduction

Introduction I'm thrilled to begin documenting my journey building "InsightFlow" - an...

Apr 9

InsightFlow Part 2: Setting Up the Cloud Infrastructure with Terraform

In this post, I’ll walk you through how I set up the cloud infrastructure for my project,...

Apr 28

Study Notes dlt Workshop: API, Warehouses, Data Lakes

1. Workshop Overview & Introduction Workshop Focus: – How to...

Feb 17

Study Notes 1.2.3: Connecting pgAdmin and PostgreSQL

Overview from lecture 1.2.3. Purpose: To connect pgAdmin, a web-based GUI tool, to a...

Jan 28

Study Note 2.2.5: Orchestrate dbt Models with Postgres in Kestra

Overview This introduces how to use dbt (data build tool) with Kestra to perform data...

Feb 4

InsightFlow Part 3: Building the Data Ingestion Layer with AWS Batch

InsightFlow GitHub Repo In this post, we’ll explore how the data ingestion layer for the InsightFlow...

Apr 29

Study Notes dlt Fundamentals Course: Lesson 5 & 6 - Write Disposition, Incremental Loading, How dlt works

Lesson 5 Write Disposition and Incremental Loading Write Disposition A write...

Feb 17

Study Note DE Zoomcamp 3.1.1 - Data Warehouse and BigQuery

Introduction This lecture covers data warehouses, with a focus on BigQuery. Topics...

Feb 11

InsightFlow Part 7: Data Quality Implementation & Best Practices for InsightFlow

InsightFlow GitHub Repo In this post, we’ll explore how data quality was implemented in the...

Apr 29

Study Notes 6.7-10: Kafka Stream Basics, JOIN, Testing & Windowing

1. Overview Kafka Streams Basics Objective: Learn the fundamental building blocks of...

Mar 18

Study Notes 4.1.2: What is dbt?

1. Introduction to dbt dbt (Data Build Tool) is a transformation workflow used for data...

Feb 25

Study Note 3.3.2: BigQuery Machine Learning Model Deployment using Docker

This lecture outlines the steps to export a BigQuery Machine Learning model, deploy it in a Docker...

Feb 11

Study Notes 4.2.1 | 4.2.2: DBT Project Setup

Introduction to DBT Projects DBT (Data Build Tool) is a framework that helps transform...

Feb 25

Study Note DE Zoomcamp 1.2.4 - Dockerizing the Ingestion Script

Introduction This session of the Data Engineering Zoomcamp focuses on Dockerizing a data...

Feb 4

Study Notes 2.2.4: Managing Scheduling and Backfills with Postgres in Kestra

Introduction The session covers automation of data pipelines using schedules and...

Feb 4

Peer Review 1: Poland's Real Estate Market Dashboards and Insights with Streamlit (Part 2)

Introduction Welcome to the second part of Peer Review 1, where we continue exploring the...

Apr 30

Study Notes 1.3.2: Terraform Basics with GCP

1. Authentication Setup for Terraform Service Account Creation: A service account is...

Feb 4

InsightFlow Part 6: Implementing ETL Processes with AWS Glue for InsightFlow

InsightFlow GitHub Repo In this post, we’ll explore how AWS Glue was used to implement the ETL...

Apr 29

Real-Time Data Processing with PyFlink and Redpanda

Just leveled up my #DataEngineering skills by building real-time data pipelines with PyFlink and...

Mar 17

InsightFlow Part 4: Data Exploration & Understanding the Datasets

InsightFlow GitHub Repo Before diving into building any data pipeline, a crucial first step is Data...

Apr 29

Study Notes 2.2.7: Managing Schedules and Backfills with BigQuery in Kestra

Overview This study note covers the details from the video "DE Zoomcamp 2.2.7 - Manage...

Feb 4

Study Notes 3.1.2: Partitioning and Clustering in BigQuery

1. Introduction Partitioning and clustering are key optimization techniques in Google...

Feb 11

Study Notes 2.2.2: Learning Kestra

Introduction to Kestra Key Resources Getting Started with Kestra...

Feb 4

Study Notes 2.2.1: Workflow Orchestration with Kestra

Introduction to Workflow Orchestration Key Concepts Orchestration...

Feb 4

Study Notes 1.3.1: Terraform Primer

Introduction to Terraform Definition: Terraform is an Infrastructure-as-Code (IaC) tool...

Feb 4

Peer Review 2: TfL Station Footfall Data Analysis Pipeline (Part 1)

Peer reviews are a cornerstone of building high-quality data engineering projects. They don’t just...

May 1