DataTree tutorial with GPM_3IMERGHH_07 #307

Draft
wants to merge 6 commits into
base: main

Conversation

eni-awowale

I am still waiting on pydata/xarray-data#31 to be merged so I can fetch the data with pooch. Some other ideas: I want to add an example of DataTree.from_dict that builds a time series tree of the hurricane going inland. This would mean adding a few more granules to the data repository. I was also thinking it could be good to add an example of open_groups.

Let me know what you all think
@TomNicholas, @shoyer, @flamingbear, @keewis, @owenlittlejohns and @aladinor


github-actions bot commented Feb 18, 2025

🎊 PR Preview 16cb113 has been successfully built and deployed to https://xarray-contrib-xarray-tutorial-preview-pr-307.surge.sh

🕐 Build time: 0.01s

🤖 By surge-preview

"outputs": [],
"source": [
"gpm_imerghh_7 = open_datatree(\n",
" '~/Downloads/3B-HHR.MS.MRG.3IMERG.20210829-S073000-E075959.0450.V07B.HDF5', engine='h5netcdf'\n",
Author

This line is going to change once pydata/xarray-data#31 is merged.

@scottyhq
Contributor

Thanks for this @eni-awowale, looks great! It'll be great to have DataTree included on this website. For your notebook to be rendered, please include a link to it in the table of contents. Organizationally I think it makes sense to add the notebook under 'fundamentals', alongside the existing data structures notebook:

xarray-tutorial/_toc.yml

Lines 16 to 18 in 869be4a

- file: fundamentals/01_data_structures.md
  sections:
    - file: fundamentals/01_datastructures

We'll have to add metpy to the environment as well.

@dcherian
Contributor

For a fundamental data structure notebook, I would skip the complex colormap setup. IME it's best to skip extraneous details; that kind of thing really confuses new users.

@eni-awowale
Author

Thanks @scottyhq! Is there a specific naming convention I should follow then? Maybe something like 01_datastructures_datatree

@scottyhq
Contributor

> Thanks @scottyhq! Is there a specific naming convention I should follow then? Maybe something like 01_datastructures_datatree

No convention currently, whatever seems reasonable to you! Also, I recommend merging with the main branch b/c I updated the contributing guide last week with more details on the environment. Any additions you think would be useful there would be welcome:
https://github.com/xarray-contrib/xarray-tutorial/blob/main/CONTRIBUTING.md

@shoyer

shoyer commented Feb 27, 2025 via email

@TomNicholas
Member

TomNicholas commented Mar 5, 2025

Sorry for the slow input here. After @eni-awowale showed me the printed representation of this IMERG data yesterday, we discussed how ideally we would have our tutorial dataset more obviously show something that can only be done with DataTree.

My understanding of this data is that it contains multiple groups, but only as grandparent-parent-child, without multiple siblings at any level. The requirement for parent-child alignment means that by definition these 3 groups could be collapsed into one, meaning this data doesn't really need a DataTree, as opposed to just using a Dataset.

Instead, I suggested we want our example to be something that cannot even be represented without DataTree: for example, a level with more than one sibling, where the siblings have differently sized dimensions. We could probably massage the IMERG data we have here (e.g. by upsampling and/or downsampling) to get some data like that.

5 participants