Too many open files when benchmarking #178

@unkcpz

I am trying to benchmark add-to-pack operations for files of different sizes with the benchmark test below. However, the run fails with a "Too many open files" OSError.

The test is:

import hashlib

import pytest
from disk_objectstore import CompressMode, Container

@pytest.mark.parametrize(
    "compress_mode,nrepeat",
    [
        # nrepeat: 5 Mi, 5 Ki and 5 repetitions of the 7-byte string "8bytes0"
        (True, 5 * 1024 * 1024),
        (False, 5 * 1024 * 1024),
        (True, 5 * 1024),
        (False, 5 * 1024),
        (True, 5),
        (False, 5),
    ],
)
@pytest.mark.benchmark(group="write_10_packs", min_rounds=2)
def test_packs_write_py(benchmark, tmp_path, compress_mode, nrepeat):
    """Add 10 objects to the container in packed form, and benchmark the write speed."""
    with Container(tmp_path) as cnt:
        cnt.init_container()
        num_files = 10
        data_content = [("8bytes0" * nrepeat).encode("ascii") for _ in range(num_files)]
        expected_hashkeys = [
            hashlib.sha256(content).hexdigest() for content in data_content
        ]

        hashkeys = benchmark(
            cnt.add_objects_to_pack, data_content, compress=compress_mode
        )

        assert len(hashkeys) == len(data_content)
        assert expected_hashkeys == hashkeys

Here is the traceback of the failure. It seems that disk-objectstore leaves some symlinks behind that pytest then tries to clean up:

  File "/home/jyu/WP-MY-IT/dos/python/.venv/lib/python3.12/site-packages/_pytest/tmpdir.py", line 303, in pytest_sessionfinish
    cleanup_dead_symlinks(basetemp)
  File "/home/jyu/WP-MY-IT/dos/python/.venv/lib/python3.12/site-packages/_pytest/pathlib.py", line 357, in cleanup_dead_symlinks
    for left_dir in root.iterdir():
                    ^^^^^^^^^^^^^^
  File "/home/jyu/.local/share/uv/python/cpython-3.12.8-linux-x86_64-gnu/lib/python3.12/pathlib.py", line 1056, in iterdir
    for name in os.listdir(self):
                ^^^^^^^^^^^^^^^^
OSError: [Errno 24] Too many open files: '/tmp/pytest-of-jyu/pytest-4'
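
Errno 24 (EMFILE) means the process reached its per-process limit on open file descriptors, so the failing `os.listdir` call may simply be the first call that needed a new descriptor after the limit was hit. One way to test that reading is to inspect the limit and temporarily raise the soft limit. A minimal sketch using the standard-library `resource` module (Unix only); `raise_soft_nofile_limit` and the default target of 4096 are assumptions for illustration, and raising the limit only postpones the error if descriptors are actually leaking:

```python
import resource


def raise_soft_nofile_limit(target: int = 4096) -> tuple[int, int]:
    """Raise the soft RLIMIT_NOFILE towards `target`, never above the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed hard
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard  # already high enough, nothing to change
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

If the benchmark passes with a higher limit but the number of open descriptors still grows with the number of rounds, that would point to a leak rather than a limit that is merely too low.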
