Commit f597395

Set the chunk number cap to 200
To avoid an OSError when writing large files, the number of chunks is capped at 200. This number comes from the maximum number of open files allowed by macOS.
1 parent b560475 commit f597395
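
The cap of 200 is tied to the per-process open-file limit. As a rough illustration (not part of this commit), that limit can be inspected from Python with the standard `resource` module on Unix-like systems. `fits_open_file_limit` and its `headroom` default are hypothetical names and values chosen for this sketch, not anything defined in pytd.

```python
import resource


def fits_open_file_limit(num_files, headroom=32):
    """Return True if num_files extra descriptors fit under the soft limit.

    headroom is an assumed allowance for descriptors the process already
    holds (stdin/stdout, sockets, ...); it is not a value taken from pytd.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        # No soft limit is configured, so any count fits.
        return True
    return num_files + headroom <= soft


soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft_limit, "hard limit:", hard_limit)
print("200 temp files fit:", fits_open_file_limit(200))
```

The result depends on the machine and shell settings, so the printed values will differ between environments.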

1 file changed, +5 -3 lines changed

pytd/writer.py

Lines changed: 5 additions & 3 deletions
@@ -450,11 +450,13 @@ def write_dataframe(
                 fps.append(fp)
             elif fmt == "msgpack":
                 _replace_pd_na(dataframe)
-
+                num_rows = len(dataframe)
+                # chunk number of records should not exceed 200 to avoid OSError
+                _chunk_record_size = max(chunk_record_size, num_rows//200)
                 try:
-                    for start in range(0, len(dataframe), chunk_record_size):
+                    for start in range(0, num_rows, _chunk_record_size):
                         records = dataframe.iloc[
-                            start : start + chunk_record_size
+                            start : start + _chunk_record_size
                         ].to_dict(orient="records")
                         fp = tempfile.NamedTemporaryFile(
                             suffix=".msgpack.gz", delete=False
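
To see what the new `_chunk_record_size` line does to the chunk count, here is a small stand-alone sketch. `effective_chunk_size` mirrors the expression in the diff. `count_chunks` and `ceil_chunk_size` are hypothetical helpers, and the ceiling variant is only a possible alternative, not something this commit does. With floor division the count can land slightly above 200 when `num_rows` is not a multiple of 200, and well above it when `num_rows // 200` is small.

```python
MAX_CHUNKS = 200  # cap used in the commit


def effective_chunk_size(chunk_record_size, num_rows):
    # Same expression as the diff: floor division.
    return max(chunk_record_size, num_rows // MAX_CHUNKS)


def ceil_chunk_size(chunk_record_size, num_rows):
    # Hypothetical variant: ceiling division keeps the count <= MAX_CHUNKS.
    return max(chunk_record_size, -(-num_rows // MAX_CHUNKS))


def count_chunks(num_rows, size):
    # Number of iterations of range(0, num_rows, size).
    return len(range(0, num_rows, size))


for rows, chunk in [(5_000_000, 10_000), (2_000_199, 10_000), (399, 1)]:
    floor_size = effective_chunk_size(chunk, rows)
    ceil_size = ceil_chunk_size(chunk, rows)
    print(
        rows,
        "uncapped:", count_chunks(rows, chunk),
        "floor:", count_chunks(rows, floor_size),
        "ceil:", count_chunks(rows, ceil_size),
    )
```

With the default-sized example of 5,000,000 rows and 10,000 records per chunk, the uncapped loop would open 500 temporary files, while the capped size of 25,000 yields exactly 200.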
