Articles by Tag #parquet

Why Parquet Is Everywhere - And What Makes It Actually Fast?

Hey folks 👋, As I kept building more data pipelines, I noticed one file format showing up...

Nov 15

Why I’m Switching to Parquet for Data Storage

The first time I came across Parquet files was during my fourth-year project. I kept seeing Hugging...

Sep 15

The two versions of Parquet

A few days ago, the creators of DuckDB wrote the article: Query Engines: Gatekeepers of the Parquet...

Feb 20

Compression algorithms in Parquet Java

Apache Parquet is a columnar storage format optimized for analytical workloads, though it can also be...

Jan 20

From Python to ClickHouse: Parquet ETL with Go

Hey Devs 👋, If you're exploring modern data engineering stacks or want to try out ClickHouse with Go...

Aug 7

Crawling web sites using “Data Prep Kit”

A hands-on exercise using “data-prep-kit” and storing the result as parquet files. ...

Apr 4

Turning Parquet File into a Queryable RESTful with DuckDB, Quarkus & Kotlin

Parquet files are a powerhouse for storing large, columnar datasets in big data workflows....

Feb 23

The Carpet feature that nobody will use

This week I released a new version of Carpet, the Java library for working with Parquet files. In...

May 15

Working with Parquet files

Parquet files offer significant advantages over traditional formats like CSV or JSON. This is more...

Apr 5