Articles by Tag #dezoomcamp

Study Notes: DE Zoomcamp 1.2.1 - Introduction to Docker

Overview Topic: Introduction to Docker and its importance for data engineers. Purpose:...

Jan 26

Data Engineering Zoomcamp 2025 Cohort: Introduction - Self-Study Notes

Overview Course Duration and Structure Duration: 6 modules plus 2...

Jan 25

Notes on Data Engineering Zoomcamp 2025 - Launch Stream

Overview: Course Edition: Fourth edition of the Data Engineering Zoomcamp. Purpose of...

Jan 24

InsightFlow Part 1: Building an Integrated Retail & Economic Data Pipeline - Project Introduction

Introduction I'm thrilled to begin documenting my journey building "InsightFlow" - an...

Apr 9

Study Notes 5.1.1-2 Introduction to Batch Processing & Spark

1. Introduction to Batch Processing What is Batch Processing? Batch processing...

Mar 4

InsightFlow Part 2: Setting Up the Cloud Infrastructure with Terraform

In this post, I’ll walk you through how I set up the cloud infrastructure for my project,...

Apr 28

Study Notes dlt Workshop: API, Warehouses, Data Lakes

1. Workshop Overview & Introduction Workshop Focus: – How to...

Feb 17

Study Notes 1.2.3: Connecting pgAdmin and PostgreSQL

Overview from lecture 1.2.3. Purpose: To connect pgAdmin, a web-based GUI tool, to a...

Jan 28

Study Notes 4.1.2: What is dbt?

1. Introduction to dbt dbt (Data Build Tool) is a transformation workflow used for data...

Feb 25

InsightFlow Part 3: Building the Data Ingestion Layer with AWS Batch

InsightFlow GitHub Repo In this post, we’ll explore how the data ingestion layer for the InsightFlow...

Apr 29

Study Notes dlt Fundamentals Course: Lesson 5 & 6 - Write Disposition, Incremental Loading, How dlt works

Lesson 5 Write Disposition and Incremental Loading Write Disposition A write...

Feb 17

Study Notes: Data Query on S3 Bucket Using Athena

Introduction to Athena Athena is a serverless interactive query service that allows users...

Feb 11

Study Notes 3.1.2: Partitioning and Clustering in BigQuery

1. Introduction Partitioning and clustering are key optimization techniques in Google...

Feb 11

Study Notes 4.1.1 - Analytics Engineering Basics

1. Introduction to Analytics Engineering Analytics Engineering is a relatively new role...

Feb 25

Study Notes 5.6.1-2 Spark on cloud & local

Study Notes 5.6.1 - Connecting Spark to GCS 1. Overview This guide explains how...

Mar 4

Peer Review 1: Analyzing Poland's Real Estate Market (Part 1)

Introduction Welcome to the first part of Peer Review 1 for DTC DEZOOMCAMP. This two-part...

Apr 30

InsightFlow Part 6: Implementing ETL Processes with AWS Glue for InsightFlow

InsightFlow GitHub Repo In this post, we’ll explore how AWS Glue was used to implement the ETL...

Apr 29

Study Notes 6.3-4: What is Kafka & Confluent Cloud

1. Introduction to Kafka in Stream Processing Context of Stream Processing: Stream...

Mar 18

Peer Review 3: France Data Engineering Job Market Transformations, Visualization, and Feedback (Part 2)

Introduction Welcome back to the last part of the peer review of the France Data Engineering Job...

May 2

Study Notes 5.4.1-3 Anatomy of a Spark Cluster, GroupBy & Joins in Spark

Study Notes 5.4.1 - Anatomy of a Spark Cluster 1. Introduction In this lesson,...

Mar 4

Study Notes 4.2.1 | 4.2.2: DBT Project Setup

Introduction to DBT Projects DBT (Data Build Tool) is a framework that helps transform...

Feb 25

InsightFlow Part 5: Designing the Data Model & Schema with dbt for InsightFlow

InsightFlow GitHub Repo In this post, we’ll dive into how the data model and schema for the...

Apr 29

Study Notes 5.5.1-2 Operations on Spark RDDs & Spark RDD mapPartition

Study Notes 5.5.1 - Spark RDDs 1. Introduction to RDDs on Spark Resilient...

Mar 4

Study Notes 4.5.2: Visualizing Data with Metabase (Alternative B)

1. Introduction to Metabase Purpose: Metabase is a tool for visualizing and exploring...

Feb 25

Study Notes 2.2.7: Managing Schedules and Backfills with BigQuery in Kestra

Overview This study note covers the details from the video "DE Zoomcamp 2.2.7 - Manage...

Feb 4

Study Notes 6.7-10: Kafka Stream Basics, JOIN, Testing & Windowing

1. Overview Kafka Streams Basics Objective: Learn the fundamental building blocks of...

Mar 18

Study Note DE Zoomcamp 3.1.1 - Data Warehouse and BigQuery

Introduction This lecture covers data warehouses, with a focus on BigQuery. Topics...

Feb 11

Study Notes 4.3.2 - Testing and Documenting the Project

1. Introduction to Data Testing Objective: Ensure data delivered to end-users is...

Feb 25

Study Notes 5.6.3-4 Setting up a Dataproc Cluster & Connecting Spark to BigQuery

Study Notes 5.6.3 - Setting Up a Dataproc Cluster in GCP 1. Introduction GCloud...

Mar 4

Real-Time Data Processing with PyFlink and Redpanda

Just leveled up my #DataEngineering skills by building real-time data pipelines with PyFlink and...

Mar 17