zarr.core.group.Group does not allow accessing nested groups using file path-like syntax (?) #2765
Comments
Thanks @aladinor for looking into this. I opened pydata/xarray#9984 to track the datatree+zarr3 integration. I thought we had fixed the leading slash issue here in zarr, so I'm wondering if that is still the cause here or if there is something else going on.
Thanks for this report. I think there are a few things to unpack here:
Definitely. As @jhamman notes some of the weirdness you observed might come from
Can you explain why this is needed? In your example,
Thanks, @jhamman and @d-v-b for your prompt reply.
We have a datatree that looks like this:
<xarray.DataTree>
Group: /
│ Dimensions: (x: 128, y: 256)
│ Dimensions without coordinates: x, y
│ Data variables:
│ z (x, y) float64 262kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
│ w (x) float64 1kB 0.01734 0.8962 0.6293 ... 0.2805 0.2753 0.2004
├── Group: /a
│ Dimensions: (x: 128, y: 256)
│ Dimensions without coordinates: x, y
│ Data variables:
│ A (x, y) float64 262kB 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0
├── Group: /b
│ Dimensions: (y: 256, x: 128)
│ Dimensions without coordinates: y, x
│ Data variables:
│ B (y, x) float64 262kB 2.0 2.0 2.0 2.0 2.0 ... 2.0 2.0 2.0 2.0 2.0
└── Group: /c
└── Group: /c/d
Dimensions: (x: 128, y: 256)
Dimensions without coordinates: x, y
Data variables:
G (x, y) float64 262kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
I guess it is because that was the way we handled it in zarr v2. We had a hierarchical structure and then we queried all groups, including the root group. However, I might need to recheck how to get the root-level dataset:
Group: /
│ Dimensions: (x: 128, y: 256)
│ Dimensions without coordinates: x, y
│ Data variables:
│ z (x, y) float64 262kB 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0
│ w (x) float64 1kB 0.01734 0.8962 0.6293 ... 0.2805 0.2753 0.2004
I am still not sure how to get it from the store. Maybe because the store is itself the root group (?).
Thanks @d-v-b for your explanation. However, I think I found a more technical explanation for this behaviour. When opening a datatree using xarray, we need to iterate over all nested groups within the hierarchical structure, including the root group as shown here: Then, each nested group can be accessed from the Thus, my question is, how can we get the root group using the
I don't think we necessarily want
root = {'/': zarr_group}
members = dict(zarr_group.members())
tree = root | members
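The suggestion above can be sketched with plain Python objects so that it runs without a store. This is only an illustration under stated assumptions: it assumes `members()` yields `(relative_path, node)` pairs with no leading slash, as described in this thread, and `FakeGroup` and `build_tree` are hypothetical names, not part of zarr-python. Unlike the snippet above, it also prefixes each member key with `/` so that all keys are absolute.

```python
class FakeGroup:
    """Hypothetical stand-in for a zarr group; not zarr-python API."""

    def __init__(self, name, children=None):
        self.name = name
        self._children = children or {}

    def members(self):
        # Yield (relative_path, node) pairs for every descendant, depth-first.
        for key, child in self._children.items():
            yield key, child
            for sub_key, sub_child in child.members():
                yield f"{key}/{sub_key}", sub_child


def build_tree(root_group):
    """Map absolute, slash-prefixed paths to nodes, with the root under '/'."""
    tree = {"/": root_group}
    for rel_path, node in root_group.members():
        tree["/" + rel_path] = node
    return tree


d = FakeGroup("d")
root = FakeGroup(
    "",
    {"a": FakeGroup("a"), "b": FakeGroup("b"), "c": FakeGroup("c", {"d": d})},
)
tree = build_tree(root)
print(sorted(tree))  # ['/', '/a', '/b', '/c', '/c/d']
```

With a real group, nested members would presumably only show up if `members()` is asked to recurse; as far as I know zarr-python v3 exposes a `max_depth` parameter for that, but it is worth checking against the installed version.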
@d-v-b thanks for your suggestions. We need to do some refactoring, but it makes sense to me. We need to wait for absolute paths before implementing this.
Zarr version
v3.0
Numcodecs version
v0.15
Python Version
3.11
Operating System
Linux
Installation
conda
Description
Hi everyone,
While working on issue #9960 in Xarray, I discovered that the new Zarr-Python version (v3) does not allow access to group members using file path-like syntax.
Steps to reproduce
This is an MCVE:
The group paths in this datatree are ['/', '/c', '/c/d', '/b', '/a']. However, when opening the zarr store again, it returns None when trying to get any of these paths.
Digging a little bit more, I found out that we can get the path for each group in zarr-python v3 using the store.members() method as follows, and this will allow us to get the groups within the zarr store.
Now, we can access the nested groups using these results.
Should zarr-python v3 groups support file path-like syntax to access groups?
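As a rough illustration of what a path-like lookup could do on the caller's side, the sketch below strips leading and trailing slashes before looking a key up and treats `/` as the root group itself. It uses plain strings and a dict as stand-ins; `normalize` and `get_node` are hypothetical helpers, not zarr-python functions, and the `members` mapping is assumed to use relative keys as reported above.

```python
def normalize(path):
    """Turn '/c/d', 'c/d/' or '/' into a relative key ('c/d' or '')."""
    return "/".join(part for part in path.split("/") if part)


def get_node(root, members, path):
    """Return the node at a path-like key, or None if it does not exist.

    `root` is the already-opened root group and `members` maps relative
    paths (no leading slash) to nodes; both are stand-ins here.
    """
    key = normalize(path)
    if key == "":
        return root  # '/' refers to the root group itself
    return members.get(key)


# Stand-in data mirroring the group paths listed in this issue.
root = "root-group"
members = {"a": "group-a", "b": "group-b", "c": "group-c", "c/d": "group-d"}

for p in ["/", "/c", "/c/d", "/b", "/a"]:
    print(p, "->", get_node(root, members, p))
```

Handling `/` as a special case reflects the idea raised in the comments that the opened group is itself the root, so there is nothing to look up for that path.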
Another thing that I noticed is that datasets stored at the root level (ds_rt, which contains the z and w DataArrays) are not represented as a group (root group "/") but are instead represented as zarr Arrays.
How could we access the root group (store.get("/")) instead of the arrays directly (store.get("z"))?
Additional output
Zarr-Python v2 used to return a <class 'zarr.hierarchy.Group'>, which allowed us to access nested groups using file path-like syntax.