Commit

Merge pull request #430 from openaddresses/set-parquet-row-group-size
Set row group size so that the buffer doesn't grow indefinitely
iandees authored Feb 19, 2025
2 parents 861dfa9 + f63341d commit 6e4ef24
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions task/collect.js
@@ -334,6 +334,7 @@ async function parquet_datas(tmp, datas, name) {
         notes: { type: 'UTF8', optional: true }
     });
     const writer = await parquet.ParquetWriter.openFile(schema, path.resolve(tmp, `${name}.parquet`));
+    writer.setRowGroupSize(16384);
 
     for (const data of datas) {
         const resolved_data_filename = path.resolve(tmp, 'sources', data);
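
The commit message says the row group size is set so that the writer's buffer does not grow indefinitely. The sketch below illustrates that idea in isolation, assuming a writer that holds appended rows in memory and flushes them once the configured row group size is reached. `BufferedRowWriter` and `simulate` are hypothetical names made up for this illustration; this is not the parquet library's code and makes no claim about its internals.

```javascript
// Hypothetical stand-in for a writer that buffers rows per row group.
// It only counts rows; no Parquet encoding or file I/O happens here.
class BufferedRowWriter {
  constructor(rowGroupSize, onFlush) {
    this.rowGroupSize = rowGroupSize; // max rows held before a flush
    this.onFlush = onFlush;           // called with each completed group
    this.buffer = [];
    this.peak = 0;                    // largest buffer length observed
  }

  appendRow(row) {
    this.buffer.push(row);
    this.peak = Math.max(this.peak, this.buffer.length);
    // Flush as soon as the group is full, so memory stays bounded.
    if (this.buffer.length >= this.rowGroupSize) this.flushGroup();
  }

  flushGroup() {
    if (this.buffer.length === 0) return;
    this.onFlush(this.buffer);
    this.buffer = [];
  }

  close() {
    // Write out whatever is left as a final, smaller group.
    this.flushGroup();
  }
}

// Append totalRows rows and report the flushed group sizes and peak buffer.
function simulate(totalRows, rowGroupSize) {
  const groupSizes = [];
  const writer = new BufferedRowWriter(rowGroupSize, (rows) => groupSizes.push(rows.length));
  for (let i = 0; i < totalRows; i++) writer.appendRow({ id: i });
  writer.close();
  return { groupSizes, peak: writer.peak };
}

console.log(simulate(40000, 16384));
```

With a row group size of 16384, the buffer in this sketch never holds more than 16384 rows regardless of how many rows are appended in total; the trade-off is more, smaller row groups in the output.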