Expand (and pass) nested FSStore tests #709
…asses after changes to FSStore. The second test fails.
Codecov Report
@@ Coverage Diff @@
## master #709 +/- ##
==========================================
+ Coverage 99.92% 99.94% +0.01%
==========================================
Files 28 28
Lines 10408 10537 +129
==========================================
+ Hits 10400 10531 +131
+ Misses 8 6 -2
|
… into fsstore_nested_tests
Ping me when I should have a look |
…python into fsstore_nested_tests
… not test_storage.
Don't look too closely at the git history, it reflects my confusion as I try to understand what is getting tested where :) I would like some input on how best to parameterize |
I thought the fixture-based style in #417 was pretty intuitive. |
I think this is ready to look at. Instead of making any big changes to the testing architecture I just went with the flow and added two more subclasses of Additionally, there are a bunch of failing tests for |
Is pydata/xarray#5028 related, perhaps? |
I think that issue is only mildly related, it's due to this line in https://github.com/zarr-developers/zarr-python/blob/master/zarr/storage.py#L1051 in conjunction with |
I'm not sure what the argument is for normalize to be True versus False by default. Are you saying that there ought to be an obvious way to pass the argument down? |
I don't really have an opinion on that... I think the upper case vs lower case thing is orthogonal to this PR -- the issues I tried to fix in this PR turn on "0.0.0" vs "0/0/0" for chunk keys. |
I am OK with the changes to FSStore.
I note many "added line not covered" notes in the diff.
I'm not sure what to make of the test coverage warnings, since I'm not changing the API at all, and some of the lines of code being flagged are inside the tests themselves. I do note that |
If this is the case, then yes; the intent was definitely to test it, and previously we required ==100% coverage. |
Final commit doesn't appear to have triggered. Trying a re-open. |
c68ed82
to
749747c
Compare
I've rebased and added the tests from #718. Looks like there are test failures from partial reads (#667) cc: @andrewfulton9 |
The one test failure in the non-nested FSStore The tested hex would match for the .zattr below if the line containing
|
Note that if the |
@grlee77 : yes, due to just the hexdigest issues you're mentioning. The initial commits tried deprecating key_separator and always storing dimension_separator. Always storing dimension_separator, however, led to more changes in the |
The remaining nested FSStore test failures not related to hex digests are resolved if the For example, switching the current:
    self.key_path = self.map._key_to_str(store_key)
to
    _key_path = self.map._key_to_str(store_key)
    _key_path = _key_path.split('/')
    _key_path = '/'.join(_key_path[:-1] + [self.chunk_store._normalize_key(_key_path[-1])])
    self.key_path = _key_path
works, but that is just the quick hack I came up with during debugging and I haven't looked too closely to see where this fix should actually live. Recommendations? |
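To illustrate what the hack above does to a key, here is a minimal standalone sketch. It assumes that normalizing a chunk key just means swapping "." for "/" in a purely numeric last segment; `normalize_chunk_key` and `nest_last_segment` are hypothetical names for illustration, not zarr functions, and the real `_normalize_key` may do more (for example key case handling).

```python
def normalize_chunk_key(key: str, separator: str = "/") -> str:
    """Hypothetical stand-in for key normalization on a single path segment."""
    parts = key.split(".")
    # Only rewrite segments that look like chunk keys, e.g. "0.0.0".
    if all(p.isdigit() for p in parts):
        return separator.join(parts)
    # Metadata keys such as ".zarray" are left untouched.
    return key


def nest_last_segment(path: str) -> str:
    """Normalize only the final segment of a '/'-joined path, as in the hack above."""
    segments = path.split("/")
    return "/".join(segments[:-1] + [normalize_chunk_key(segments[-1])])


print(nest_last_segment("data/foo/0.0.0"))
print(nest_last_segment("data/foo/.zarray"))
```

Only the last segment is touched, so any dots in the leading part of the path are preserved.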
@grlee77, it doesn't look insane 🙂 and if all current tests (incl. yours) are happy, I'd say it's an improvement. Based on @d-v-b's earlier evaluation, it seems like we want a clearer delineation between keys as used in storage.py and core.py, where the former can use "/" and the latter never uses "/" (?). |
Disclaimer: I don't know anything at all about the partial read functionality. That being said, it looks like the function in question ( |
Ok. Pushed @grlee77's fix as well as hexdigest fixes. Let's see if we can get it all green! |
5185cee
to
d25b2b2
Compare
Whew! |
I think what's happening with the The code below would also work for fixing the issue, but I haven't tested it, and the above fix works fine too I think.
    self.key_path = self.map._key_to_str(self.chunk_store._normalize_key(store_key)) |
Can confirm that the tests pass with @andrewfulton9's version as well. |
return children | ||
else: | ||
if array_meta_key in children: | ||
# special handling of directories containing an array to map nested chunk |
@martindurant: can you comment if this matches your expectations?
@martindurant : any thoughts?
OK, now I have reminded myself what it's doing, and I suppose this is reasonable. As to its accuracy in all cases, I can only guess it looks right.
It makes you wonder how useful listdir actually is for the case of fsspec. I suppose it remains true that a user might want to start exploring a dataset at some high-level group and descend to a specific array, yet make sure they never list the entire set of files (which could be expensive).
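For readers following the review, the idea behind the special handling can be sketched without fsspec at all. This is only an approximation under stated assumptions: the listing is given as a plain list of paths relative to the directory being listed, a chunk directory is assumed to have a purely numeric name, and `flatten_listing` is a hypothetical helper, not the code in the diff.

```python
import re

# Assumed pattern for a top-level chunk directory name such as "0" or "12".
_chunk_dir = re.compile(r"^\d+$")


def flatten_listing(files, array_meta_key=".zarray"):
    """Map nested chunk paths ("0/1") back to flat chunk keys ("0.1").

    `files` holds paths relative to the listed directory.
    """
    top_level = sorted({f.split("/", 1)[0] for f in files})
    if array_meta_key not in top_level:
        # Not an array directory (e.g. a group): return the plain children.
        return top_level
    children = set()
    for f in files:
        head = f.split("/", 1)[0]
        if _chunk_dir.match(head):
            # Nested chunk file: report it under its flat, dot-separated key.
            children.add(f.replace("/", "."))
        else:
            children.add(head)
    return sorted(children)


print(flatten_listing([".zarray", ".zattrs", "0/0", "0/1", "1/0"]))
print(flatten_listing([".zgroup", "foo/.zarray", "foo/0/0"]))
```

Note that the array branch has to walk every chunk file, which is the cost concern raised above, while the group branch only needs the immediate children.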
Thanks, @martindurant. I'm going to take that as a 👍 and get this into a 2.8.1 release. It seems that this may be something that needs a re-evaluation down the line.
@martindurant: this is perhaps a bit out of scope for this PR since it's specifically focused on
since cc: @will-moore |
I've migrated the |
This PR adds tests to FSStore(..., key_separator="/"). The first test checks that writing to a zarr array backed by a nested FSStore succeeds. This test fails without changes to FSStore.getitems -- the original getitems implementation normalizes keys, calls self.map.getitems(normalized_keys), and returns a dict with those normalized keys. It seems that the code calling FSStore.getitems() may not recognize these keys and thus discard all the values, producing an array with only fill value. I modified FSStore.getitems to return a dict with the input keys.
This PR also expands on a second test (reading the same array with a separate store instance) which currently fails. I will try to get this second test passing, but any pointers would be appreciated. cc @martindurant
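The key mismatch described above can be reproduced with a toy store. This is a sketch under assumptions, not the FSStore code: `NestedStoreSketch` is a hypothetical class, a plain dict stands in for the fsspec mapper, and normalization is reduced to replacing "." with the separator in a numeric last segment.

```python
class NestedStoreSketch:
    """Toy store; `backing` is a plain dict standing in for the fsspec mapper."""

    def __init__(self, backing, key_separator="/"):
        self.backing = backing
        self.key_separator = key_separator

    def _normalize_key(self, key):
        # Rewrite only a numeric last segment: "foo/0.0" -> "foo/0/0".
        *prefix, last = key.split("/")
        parts = last.split(".")
        if all(p.isdigit() for p in parts):
            last = self.key_separator.join(parts)
        return "/".join(prefix + [last])

    def getitems_normalized(self, keys):
        # Behavior described as the original: results keyed by normalized keys.
        normalized = [self._normalize_key(k) for k in keys]
        return {k: self.backing[k] for k in normalized if k in self.backing}

    def getitems(self, keys):
        # Behavior described as the fix: results keyed by the caller's keys.
        out = {}
        for k in keys:
            nk = self._normalize_key(k)
            if nk in self.backing:
                out[k] = self.backing[nk]
        return out


store = NestedStoreSketch({"foo/0/0": b"chunk"})
requested = ["foo/0.0"]
print(store.getitems_normalized(requested))  # keyed by "foo/0/0"
print(store.getitems(requested))             # keyed by "foo/0.0"
```

A caller that looks up `result["foo/0.0"]` finds nothing in the first dict, which would be consistent with the fill-value-only arrays described above.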
TODO: