Browse our collection of articles on various topics related to IT technologies. Dive in and explore something new!
Overview Topic: Introduction to Docker and its importance for data engineers. Purpose:...
Overview Course Duration and Structure Duration: 6 modules plus 2...
Overview: Course Edition: Fourth edition of the Data Engineering Zoomcamp. Purpose of...
Introduction I'm thrilled to begin documenting my journey building "InsightFlow" - an...
1. Introduction to Batch Processing What is Batch Processing? Batch processing...
In this post, I’ll walk you through how I set up the cloud infrastructure for my project,...
1. Workshop Overview & Introduction Workshop Focus: – How to...
Overview from lecture 1.2.3. Purpose: To connect pgAdmin, a web-based GUI tool, to a...
1. Introduction to dbt dbt (Data Build Tool) is a transformation workflow used for data...
InsightFlow GitHub Repo In this post, we’ll explore how the data ingestion layer for the InsightFlow...
Lesson 5 Write Disposition and Incremental Loading Write Disposition A write...
Introduction to Athena Athena is a serverless interactive query service that allows users...
1. Introduction Partitioning and clustering are key optimization techniques in Google...
1. Introduction to Analytics Engineering Analytics Engineering is a relatively new role...
Study Notes 5.6.1 - Connecting Spark to GCS 1. Overview This guide explains how...
Introduction Welcome to the first part of Peer Review 1 for DTC DEZOOMCAMP. This two-part...
InsightFlow GitHub Repo In this post, we’ll explore how AWS Glue was used to implement the ETL...
1. Introduction to Kafka in Stream Processing Context of Stream Processing: Stream...
Introduction Welcome back to the last part of the peer review of the France Data Engineering Job...
Study Notes 5.4.1 - Anatomy of a Spark Cluster 1. Introduction In this lesson,...
Introduction to dbt Projects dbt (Data Build Tool) is a framework that helps transform...
InsightFlow GitHub Repo In this post, we’ll dive into how the data model and schema for the...
Study Notes 5.5.1 - Spark RDDs 1. Introduction to RDDs on Spark Resilient...
1. Introduction to Metabase Purpose: Metabase is a tool for visualizing and exploring...
Overview This study note covers the details from the video "DE Zoomcamp 2.2.7 - Manage...
1. Overview Kafka Streams Basics Objective: Learn the fundamental building blocks of...
Introduction This lecture covers data warehouses, with a focus on BigQuery. Topics...
1. Introduction to Data Testing Objective: Ensure data delivered to end-users is...
Study Notes 5.6.3 - Setting Up a Dataproc Cluster in GCP 1. Introduction GCloud...
Just leveled up my #DataEngineering skills by building real-time data pipelines with PyFlink and...