Split a large csv file into smaller files #eg45
Judy

Publish Date: Sep 20 '24

A csv file is much larger than 5 MB. Below is part of its data:

[Image: sample rows from the csv file]

Use Java to do this: split the file into smaller files of about 5 MB each, with file names that contain ordinal numbers, such as Orders1.csv and Orders2.csv. Each record must be written whole to exactly one file.

Write the following SPL code:

[Image: the SPL script, whose cells A2 and A3 are explained below]

A2: Compute the number of smaller files (N) that the csv file will be divided into. The \ operator performs integer division and keeps only the integer part of the quotient; adding 1 makes each smaller file slightly smaller than 5 MB.
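The arithmetic in A2 can be illustrated in Java, whose `/` on integer types truncates in the same way as the \ operator described above. This is only a sketch: the class name `PartCount`, the method name `partCount`, the 23 MiB sample size, and the reading of "5M" as 5 × 1024 × 1024 bytes are all assumptions made for illustration.

```java
public class PartCount {
    // Integer division drops the fractional part; +1 rounds the count up,
    // so every part ends up a bit smaller than the target size.
    static long partCount(long fileSizeBytes, long targetBytes) {
        return fileSizeBytes / targetBytes + 1;
    }

    public static void main(String[] args) {
        long target = 5L * 1024 * 1024;   // assumption: "5M" means 5 MiB
        long size = 23L * 1024 * 1024;    // hypothetical 23 MiB file
        long n = partCount(size, target);
        System.out.println(n + " parts of about " + (size / n) + " bytes each");
    }
}
```

Note that a file whose size is an exact multiple of the target still gets one extra part, which is what keeps every part under the limit.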

A3: Loop from 1 to N. The large file is divided into N parts of roughly equal size; in each iteration the ith part is retrieved and written to a new file. Part boundaries are adjusted automatically so that every record stays complete.
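For comparison, the same idea can be sketched in plain Java. This is not the SPL script and not the StackOverflow answer, only a minimal illustration under stated assumptions: records end with `\n`, no quoted field contains a line break, and a header row (if any) stays in the first part only. The names `CsvSplitSketch`, `recordStart`, and `split` are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CsvSplitSketch {
    // Returns the offset of the first record that starts at or after pos,
    // so a record belongs to the part in which its first byte falls.
    static long recordStart(RandomAccessFile raf, long pos) throws IOException {
        long size = raf.length();
        if (pos <= 0) return 0;
        if (pos >= size) return size;
        raf.seek(pos - 1);                 // if this byte is '\n', pos is already a record start
        int b;
        while ((b = raf.read()) != -1) {
            if (b == '\n') return raf.getFilePointer();
        }
        return size;                       // no further line break: the last record runs to EOF
    }

    // Splits source into prefix1.csv, prefix2.csv, ... inside outDir; returns the part count.
    static int split(Path source, Path outDir, String prefix, long targetBytes) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(source.toFile(), "r");
             FileChannel in = raf.getChannel()) {
            long size = raf.length();
            int n = (int) (size / targetBytes + 1);          // same arithmetic as A2
            for (int i = 1; i <= n; i++) {
                long from = recordStart(raf, (i - 1) * size / n);
                long to = recordStart(raf, i * size / n);
                Path out = outDir.resolve(prefix + i + ".csv");
                try (FileChannel dst = FileChannel.open(out,
                        StandardOpenOption.CREATE,
                        StandardOpenOption.WRITE,
                        StandardOpenOption.TRUNCATE_EXISTING)) {
                    long copied = 0;
                    while (copied < to - from) {             // transferTo may copy fewer bytes than asked
                        copied += in.transferTo(from + copied, to - from - copied, dst);
                    }
                }
            }
            return n;
        }
    }
}
```

A call such as `CsvSplitSketch.split(Path.of("Orders.csv"), Path.of("."), "Orders", 5L * 1024 * 1024)` would write Orders1.csv, Orders2.csv, and so on. Because each boundary is moved forward to the next record start, parts can differ slightly in size, and a part can be empty if a single record spans a whole nominal range.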

See How to Call a SPL Script in Java to learn how to integrate SPL into a Java application.

This problem comes from StackOverflow. The conventional solution posted there is quite complicated, whereas the SPL approach is simple and efficient.
