Is your feature request related to a problem? Please describe.
When searching a datastore by standard_name, it returns one catalog dataset:
e.g. esm_ds.search(variable_standard_name="sea_surface_height_above_geoid")
However, when .to_dask() is run, the resulting xarray dataset includes every variable in the source files.
Describe the feature you'd like
I would like the resulting dataset to only include the variable which has the requested variable_standard_name
Describe alternatives you've considered
Ignore it; using the CF standard name on the resulting dataset works fine, e.g.:
ds.cf["sea_surface_height_above_geoid"]
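For anyone who wants the trimmed dataset today, the workaround above can be extended to subset by the `standard_name` attribute after opening. This is only a minimal sketch: the helper `names_with_standard_name` and the variable names `var_a`, `var_b`, `var_c` are hypothetical, and a plain dict stands in for the attributes of a real dataset so the snippet runs without any catalog.

```python
def names_with_standard_name(var_attrs, standard_name):
    """Return the variable names whose CF ``standard_name`` attribute matches."""
    return [
        name
        for name, attrs in var_attrs.items()
        if attrs.get("standard_name") == standard_name
    ]


# Stand-in for {name: da.attrs for name, da in ds.data_vars.items()}.
# The variable names here are made up for illustration.
example_attrs = {
    "var_a": {"standard_name": "sea_surface_height_above_geoid"},
    "var_b": {"standard_name": "sea_water_potential_temperature"},
    "var_c": {},  # variables without a standard_name are skipped
}

keep = names_with_standard_name(example_attrs, "sea_surface_height_above_geoid")
print(keep)
```

With a real xarray dataset, building the dict from `ds.data_vars` and then indexing with `ds[keep]` should give a dataset holding only the matching variables.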
Additional context
This is mostly an issue of neatness.
As the dataset is still a dask object, and the source files need opening anyway, the performance benefit of only including the requested variables in the returned dataset is probably small.
Yeah, we've talked about this before @anton-seaice. This is a "feature" of intake-esm datastores, which keep track of a single column for the dataset variables, defined by variable_column_name. Only searches on this column (which is "variable" for most of our datastores) refine the dataset returned by to_dask().
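To make that behaviour concrete, here is a toy model of it. It is not intake-esm's implementation: `ToyDatastore`, `to_variables` and the variable names are all hypothetical, and the only thing it tries to capture is that a query on the tracked column narrows the result while a query on any other column does not.

```python
class ToyDatastore:
    """Toy model of the behaviour described above; not intake-esm's code."""

    def __init__(self, rows, variable_column_name="variable"):
        self.rows = rows
        self.variable_column_name = variable_column_name
        self.requested_variables = None  # None means "keep every variable"

    def search(self, **query):
        def matches(row):
            # Each column holds a list of values; a row matches if it contains them all.
            return all(value in row[column] for column, value in query.items())

        result = ToyDatastore(
            [row for row in self.rows if matches(row)], self.variable_column_name
        )
        # Only a query on the tracked column narrows what is returned later.
        if self.variable_column_name in query:
            result.requested_variables = [query[self.variable_column_name]]
        return result

    def to_variables(self):
        """Stand-in for to_dask(): names of the variables the dataset would hold."""
        names = []
        for row in self.rows:
            for name in row[self.variable_column_name]:
                wanted = (
                    self.requested_variables is None
                    or name in self.requested_variables
                )
                if wanted and name not in names:
                    names.append(name)
        return names


# One catalog row describing files that hold two variables (made-up names).
rows = [
    {
        "variable": ["var_a", "var_b"],
        "variable_standard_name": [
            "sea_surface_height_above_geoid",
            "sea_water_potential_temperature",
        ],
    }
]
store = ToyDatastore(rows)

by_standard_name = store.search(
    variable_standard_name="sea_surface_height_above_geoid"
)
by_variable = store.search(variable="var_a")

print(by_standard_name.to_variables())  # both variables survive
print(by_variable.to_variables())       # only the requested variable
```

In this sketch, supporting the request would mean mapping the matched standard name back to entries in the tracked column before setting `requested_variables`.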
This possibly wouldn't be too hard to change in intake-esm.
I'm also wondering whether we could use a DerivedVariableRegistry to do this on the fly, though we might have difficulty knowing all the possible variants of variable_standard_name needed to open the dataset.
But yeah, it could be a nice feature to add to intake-esm for sure.
I think this is an upstream issue with intake-esm? Pinging @dougiesquire and @charles-turner-1.