Close connection of database after initialization correctly #179

Merged
merged 2 commits into from
Feb 10, 2025

Conversation

agoscinski
Contributor

No description provided.

codecov bot commented Feb 4, 2025

Codecov Report

Attention: Patch coverage is 86.53846% with 7 lines in your changes missing coverage. Please review.

Project coverage is 99.57%. Comparing base (17e9243) to head (5ea4ab2).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
disk_objectstore/container.py 86.00% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #179      +/-   ##
==========================================
- Coverage   99.90%   99.57%   -0.33%     
==========================================
  Files          10       10              
  Lines        2099     2118      +19     
==========================================
+ Hits         2097     2109      +12     
- Misses          2        9       +7     

@agoscinski agoscinski changed the title Alway "cache" session to close it Close connection of database after initialization correctly Feb 4, 2025
@agoscinski agoscinski changed the base branch from main to uv-lock February 4, 2025 14:54
@agoscinski agoscinski changed the base branch from uv-lock to main February 4, 2025 18:08
@agoscinski agoscinski marked this pull request as ready for review February 4, 2025 18:08
Member

@unkcpz unkcpz left a comment

Two comments:

  1. Using _operation_session and _container_session to distinguish the two seems to work but increases the complexity. Why not a function init_container that creates the container and handles the DB within the function scope?
  2. Please also update the docs to state that using the container requires a context manager.

Again, I think there is a design flaw in the API and in resource management. In my Rust implementation, rsdos, these issues do not arise in the first place. I'd strongly argue for giving it a try at some point. @giovannipizzi

@agoscinski
Contributor Author

Using _operation_session and _container_session to distinguish the two seems to work but increases the complexity. Why not a function init_container that creates the container and handles the DB within the function scope?

That would require a more significant refactor, as the Container is not required to be used within a context. Almost every test in test_container would need to be adapted, as would its usage in aiida-core. I am not even sure whether separating the Container instantiation (by __init__) from its initialization (by init_container) is intended design, since in aiida-core we often create an instance without initializing it. This PR is intended just to make it work, not to change the whole usage of Container.

Please also update the docs to state that using the container requires a context manager.

The usage of the Container has not changed. So using the context manager is optional as it was before. Whatever is in the doc is still valid.

@agoscinski
Contributor Author

agoscinski commented Feb 5, 2025

Again, I think there is a design flaw in the API and in resource management. In my Rust implementation, rsdos, these issues do not arise in the first place. I'd strongly argue for giving it a try at some point. @giovannipizzi

I agree, this PR is more a temporary fix than a real solution to the problem that the container is not designed to manage resources correctly (acquire and free them). I think any usage that opens a DB connection or leaves file handles open should be required to happen within a context. Any function that leaves a resource open should not be public.
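As an illustration of the idea above, here is a minimal sketch in which the only public way to reach the database is a context manager, and initialization closes its own connection before returning. It uses the standard-library `sqlite3` module instead of the project's real session, and the class `ScopedContainer`, its `session()` method and the `objects` table are hypothetical names, not part of disk_objectstore.

```python
import sqlite3
import tempfile
from contextlib import closing, contextmanager
from pathlib import Path


class ScopedContainer:
    """Hypothetical container that never leaves a DB connection open."""

    def __init__(self, folder):
        # Instantiation only records the path; no resource is acquired here.
        self._index_path = Path(folder) / "pack.idx"

    def init_container(self):
        # The connection lives only inside this function and is always closed.
        with closing(sqlite3.connect(self._index_path)) as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS objects (hashkey TEXT PRIMARY KEY)"
            )
            conn.commit()

    @contextmanager
    def session(self):
        # The only public way to obtain a connection: it is closed on exit,
        # even if the body raises.
        conn = sqlite3.connect(self._index_path)
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()


def demo():
    """Initialize a container, insert one row, and return the row count."""
    with tempfile.TemporaryDirectory() as folder:
        container = ScopedContainer(folder)
        container.init_container()
        with container.session() as conn:
            conn.execute("INSERT INTO objects VALUES (?)", ("abc",))
        with container.session() as conn:
            return conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

The trade-off is the one raised in this thread: enforcing such a design would mean adapting every caller that currently uses the container outside a `with` block.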

@agoscinski
Contributor Author

Please also update the docs to state that using the container requires a context manager.

The usage of the Container has not changed. So using the context manager is optional as it was before. Whatever is in the doc is still valid.

I checked the docs: they never mention closing the container... I will update them...

@unkcpz
Member

unkcpz commented Feb 5, 2025

I checked the docs: they never mention closing the container... I will update them...

Yes, I think it is mandatory to close the resource after using it; otherwise resources will certainly leak. I encountered quite a lot of "Too many open files" errors in this Python implementation when benchmarking.

@unkcpz
Member

unkcpz commented Feb 5, 2025

That would require a more significant refactor, as the Container is not required to be used within a context. Almost every test in test_container would need to be adapted, as would its usage in aiida-core.

Makes sense. Let's just do another workaround :(

@agoscinski
Contributor Author

@giovannipizzi regarding the benchmarks: this PR seems to make things slightly slower, but nothing very significant.
main, current commit 73f14e3

------------------------------------------- benchmark 'check': 1 tests -------------------------------------------
Name (time in ms)          Min       Max      Mean  StdDev    Median     IQR  Outliers     OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------
test_has_objects      156.4821  161.5050  159.0021  1.6613  159.0859  2.1256       2;0  6.2892       7           1
------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------- benchmark 'read': 4 tests ----------------------------------------------------------------------------------------------------------
Name (time in ns)                    Min                        Max                       Mean                    StdDev                     Median                       IQR            Outliers             OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_list_all_loose             208.3146 (1.0)           2,416.6868 (1.05)            216.6023 (1.0)             26.4382 (1.27)            215.9081 (1.0)              1.9103 (1.0)     408;22297  4,616,756.0436 (1.0)      198373          22
test_list_all_packed            211.9569 (1.02)          2,293.4780 (1.0)             221.3234 (1.02)            20.8437 (1.0)             220.9968 (1.02)             3.6089 (1.89)     630;4718  4,518,274.9948 (0.98)     198337          23
test_loose_read          32,417,292.0082 (>1000.0)  33,776,791.9991 (>1000.0)  32,902,562.6594 (>1000.0)    355,679.3920 (>1000.0)  32,815,937.4767 (>1000.0)    340,583.0357 (>1000.0)       8;3         30.3928 (0.00)         26           1
test_pack_read           69,918,000.0229 (>1000.0)  91,280,124.9651 (>1000.0)  73,246,360.1328 (>1000.0)  6,744,398.8266 (>1000.0)  70,587,812.5154 (>1000.0)  1,082,542.0031 (>1000.0)       2;2         13.6526 (0.00)         14           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------ benchmark 'write': 2 tests ------------------------------------------------------------------------------
Name (time in ms)          Min                 Max                Mean            StdDev              Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_pack_write       192.9492 (1.0)      208.5755 (1.0)      197.6553 (1.0)      6.2606 (1.78)     195.9689 (1.0)      5.5105 (1.24)          1;1  5.0593 (1.0)           5           1
test_loose_write      220.8819 (1.14)     228.5237 (1.10)     223.3578 (1.13)     3.5236 (1.0)      222.0129 (1.13)     4.4589 (1.0)           1;0  4.4771 (0.88)          4           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This PR

------------------------------------------- benchmark 'check': 1 tests -------------------------------------------
Name (time in ms)          Min       Max      Mean  StdDev    Median     IQR  Outliers     OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------
test_has_objects      157.8054  162.8718  159.5420  1.8240  159.1264  1.9441       1;0  6.2679       6           1
------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------- benchmark 'read': 4 tests ---------------------------------------------------------------------------------------------------------
Name (time in ns)                    Min                        Max                       Mean                    StdDev                     Median                     IQR            Outliers             OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_list_all_loose             208.3429 (1.0)           2,655.8232 (1.0)             216.1888 (1.0)             21.2598 (1.0)             215.5657 (1.0)            3.6038 (1.0)      468;4303  4,625,587.0375 (1.0)      195120          23
test_list_all_packed            212.1352 (1.02)          5,443.1836 (2.05)            221.6600 (1.03)            31.7571 (1.49)            219.7233 (1.02)           3.7729 (1.05)     762;5800  4,511,414.3196 (0.98)     198373          22
test_loose_read          32,556,083.9847 (>1000.0)  37,034,540.9028 (>1000.0)  33,225,628.1444 (>1000.0)    855,080.8650 (>1000.0)  33,007,062.5059 (>1000.0)  372,165.9305 (>1000.0)       2;2         30.0972 (0.00)         26           1
test_pack_read           69,430,958.9453 (>1000.0)  85,344,749.9685 (>1000.0)  71,959,041.6023 (>1000.0)  5,418,212.5444 (>1000.0)  69,934,458.0062 (>1000.0)  260,979.5483 (>1000.0)       2;4         13.8968 (0.00)         15           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------- benchmark 'write': 2 tests ------------------------------------------------------------------------------
Name (time in ms)          Min                 Max                Mean            StdDev              Median                IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_pack_write       196.4610 (1.0)      215.4258 (1.0)      203.2547 (1.0)      8.7610 (2.77)     197.7985 (1.0)      14.4017 (2.70)          1;0  4.9199 (1.0)           5           1
test_loose_write      227.7670 (1.16)     234.2380 (1.09)     231.2063 (1.14)     3.1606 (1.0)      231.4101 (1.17)      5.3408 (1.0)           1;0  4.3251 (0.88)          4           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

@unkcpz
Member

unkcpz commented Feb 6, 2025

Maybe this fix also solves the issue #178.

@unkcpz
Member

unkcpz commented Feb 6, 2025

The pack write benchmark also shows another problem. In the results @agoscinski posted, the min and max times of this benchmark are similar, but they should not be, since the benchmark runs the test function multiple times. The reason is that when the same content is added to the pack again, it is not checked for duplication but written anyway. (Maybe there is a reason for doing it this way, but performance-wise it is not good if you want to write directly to the pack. Or maybe I am wrong and there is a hash check somewhere before writing to the pack?)

The problem is outside the scope of this PR; I just mention it since I encountered it in my implementation.

import hashlib

import pytest


# `temp_container` and `benchmark` are pytest fixtures (the latter from pytest-benchmark).
@pytest.mark.benchmark(group="write", min_rounds=3)
def test_pack_write(temp_container, benchmark):
    """Add 10'000 objects to the container in packed form, and benchmark write and read speed."""
    num_files = 10000
    data_content = [str(i).encode("ascii") for i in range(num_files)]
    expected_hashkeys = [
        hashlib.sha256(content).hexdigest() for content in data_content
    ]

    hashkeys = benchmark(
        temp_container.add_objects_to_pack, data_content, compress=False
    )

    assert len(hashkeys) == len(data_content)
    assert expected_hashkeys == hashkeys

@giovannipizzi
Member

My design consideration here was the following: if you write directly to packs and want a truly streaming approach, where after reading the stream once to compute the hash it is not obvious that you can rewind and read it again to actually store it, then I don't think there is an efficient way to do it. Maybe the way would be to always stream to the sandbox first, and then copy to the pack. But then you are going to write most data twice... I don't remember how much I benchmarked this, and of course it depends on how much duplication you expect, but my intuition was (and is) that it is better to store everything and do a full repack once at the end, and that this is the fastest option.
Also, let's remember that writing directly is mostly for applications such as importing from an AiiDA archive file, where the files to be imported are already deduplicated.
So you pay only a factor of 2 if you are importing the very same data you already have.
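The sandbox-first alternative described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: `add_stream_deduplicated` is a hypothetical function, `pack` is any seekable binary file object, and `index` is a plain dict standing in for the pack index database.

```python
import hashlib
import io
import tempfile


def add_stream_deduplicated(stream, pack, index, chunk_size=65536):
    """Append `stream` to `pack` unless its hash is already in `index`.

    `pack` is a writable, seekable binary file and `index` a dict mapping
    hashkey -> (offset, length). Both are hypothetical stand-ins for the
    pack file and its index database.
    """
    hasher = hashlib.sha256()
    # First write: spool the stream to a sandbox while hashing it, because a
    # one-shot stream cannot be rewound after the hash has been computed.
    with tempfile.TemporaryFile() as sandbox:
        while chunk := stream.read(chunk_size):
            hasher.update(chunk)
            sandbox.write(chunk)
        hashkey = hasher.hexdigest()
        if hashkey in index:
            return hashkey, False  # duplicate: nothing is appended to the pack
        # Second write: copy the sandbox content to the end of the pack.
        sandbox.seek(0)
        offset = pack.seek(0, io.SEEK_END)
        length = 0
        while chunk := sandbox.read(chunk_size):
            length += pack.write(chunk)
        index[hashkey] = (offset, length)
    return hashkey, True
```

Every new object is written twice (sandbox, then pack), which is the cost mentioned above; in exchange, a duplicate never reaches the pack, so no later repack is needed to remove it.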

@unkcpz
Member

unkcpz commented Feb 6, 2025

Maybe the way would be to always stream to the sandbox first, and then copy to the pack. But then you are going to write most data twice.

For loose to pack it is not a problem, since the hash is already there. For writing directly to the pack, I think it would be a good feature to have an option to write to the sandbox first, so users can choose between a solution optimized for disk space and one optimized for performance.
Maybe there is also a way for the file handle to move its pointer in the file once it finds that the content already exists; I'm not sure, but it would be interesting to check. Anyway, this is another issue outside the scope of this PR. I'll open an issue for it and not contaminate this PR with the discussion.

The session created during initialisation of the container was never
properly closed. Up to py3.12 this unclosed session was garbage
collected since it was unreferenced. With py3.13, however, such
sessions are no longer garbage collected and thus remain open,
resulting in an open file descriptor for `pack.idx` for each
initialisation of the container.

This commit fixes it by keeping track of the session that initialises
the container in `_container_session`. We rename `_session` to
`_operation_session` for a clearer distinction between the two
session types.
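The bookkeeping described in this commit message can be sketched roughly as below. This is a simplified illustration, not the actual diff: it uses the standard-library `sqlite3` module in place of the project's real session, and `TrackedContainer`, `count_objects` and the `objects` table are hypothetical. Only the attribute names `_container_session` and `_operation_session` come from the commit message.

```python
import sqlite3
import tempfile
from pathlib import Path


class TrackedContainer:
    """Hypothetical stand-in for the container; sqlite3 replaces the real session."""

    def __init__(self, folder):
        self._index_path = Path(folder) / "pack.idx"
        self._container_session = None  # opened by init_container()
        self._operation_session = None  # opened lazily for normal operations

    def init_container(self):
        # Keep a reference so close() can release it explicitly instead of
        # relying on garbage collection.
        self._container_session = sqlite3.connect(self._index_path)
        self._container_session.execute(
            "CREATE TABLE IF NOT EXISTS objects (hashkey TEXT PRIMARY KEY)"
        )
        self._container_session.commit()

    def _get_operation_session(self):
        if self._operation_session is None:
            self._operation_session = sqlite3.connect(self._index_path)
        return self._operation_session

    def count_objects(self):
        cursor = self._get_operation_session().execute(
            "SELECT COUNT(*) FROM objects"
        )
        return cursor.fetchone()[0]

    def close(self):
        # Close every session this instance ever opened.
        for name in ("_container_session", "_operation_session"):
            session = getattr(self, name)
            if session is not None:
                session.close()
                setattr(self, name, None)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
```

With this shape, using the context manager stays optional, but a caller who skips it has to call `close()` explicitly to release both sessions.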
@agoscinski agoscinski requested a review from unkcpz February 10, 2025 09:57
@agoscinski
Contributor Author

@nmounet and @giovannipizzi agreed on merging, so I am bypassing the approval.

@agoscinski agoscinski merged commit 6686ad0 into main Feb 10, 2025
29 of 30 checks passed
@agoscinski agoscinski deleted the fix-open-session branch February 10, 2025 10:16