Browse our collection of articles on various topics related to IT technologies. Dive in and explore something new!
History of Parquet File: A Big Data Storage Revolution The Parquet file format has emerged...
Earlier I briefly introduced Apache Iceberg and built an out-of-the-box experiment...
Amazon Kinesis is a streaming service specifically designed to address the complexities of data...
Introduction Large language models (LLMs) have significantly changed the way other developers and I...
If you've been following Apache Iceberg™ at all, you've no doubt heard whispers about "the small file...
Getting Started with Hadoop and Apache Hive: Architecture, Configuration, and Optimization In this article...
Lessons learned through a PoC for a challenging use-case Introduction Apache Iceberg...
Why use data vis When you need to work with a new data source, with a huge amount of data,...
Indexing is commonly used among programmers. Without fully grasping the idea behind the technique, a...
"Without data, you're just another person with an opinion." — W. Edwards Deming In today’s...
Data refers to raw, unprocessed facts, statistics, or information collected for reference, analysis,...
In a world where every click, purchase and interaction generates data, companies that fail to...
Quartz is an open-source Java job scheduling framework that provides powerful capabilities for...
Introduction As a data engineer, I'm always seeking opportunities to experiment with...
Apache Ambari 3.0.0 brings major improvements to cluster management capabilities, featuring Apache Bigtop integration, Java 17 support, and much more!
Using a distributed cluster to process big data is the mainstream approach at present, and splitting a big task...
Hadoop is an open-source software framework designed to handle and process large volumes of data...
When I implemented batch generation of DolphinScheduler tasks and imported them, it was found that...
From all-in-one machines, hyper-convergence, and cloud computing to HTAP, we constantly try to combine...
The journey to becoming a data scientist isn’t for the faint of heart. It’s a demanding but rewarding...
Introduction As a data enthusiast, I’ve always been fascinated by the power of cloud...
Abstract Big data refers to large and complex datasets that require advanced techniques to store,...
The History of Hadoop There are mainly two problems with big data. Storage for a...
Introduction In the digital age, data has become one of the most valuable assets for...
The stock market can often feel like an exhilarating yet unpredictable rollercoaster ride. The second...
What made Men In Black so incredibly awesome? Was it their coordinated suits? Or was it the fact that...
1. Introduction Driven by the wave of digitalization, the growth rate of data has reached...
Are you diving into the world of data storage and processing? Look no further! My latest blog...
It is a fast and general-purpose distributed computing system for big data processing. It provides an...
Reflection 8 Data lakes have emerged as a pivotal component in the realm of big data management,...