Conversation

@gmertes
Member

@gmertes gmertes commented Nov 17, 2025

Description

Improve the grib template manager in a backwards compatible way.

  • Improve documentation on how users can provide their own templates
  • For the samples provider, add the option for the sample index to be specified directly in the inference config
  • For the file provider, add an option for the file to contain many parameters, and do a lookup based on the param name.
  • In the case of GRIB input, the template manager automatically injects templates found in the input GRIB at the highest priority. Refactor this into its own standalone template provider called input. This gives more flexibility to enable or disable combinations of template providers and to set their priority.
  • input provider fallback mode: force an output variable to use one of the input variables
  • Improve logging of the template manager. Right now it only logs in the case of errors. Also log in the case of success so it's easier to see which templates are being used from which provider.
  • Tests for everything, including backwards compatibility
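
A minimal sketch of the priority-ordered lookup and success logging described in the list above. The names `DictProvider` and `first_template` are hypothetical and not part of anemoi-inference; the real providers and their configuration may look quite different.

```python
import logging
from typing import Optional

LOG = logging.getLogger("template_manager_sketch")


class DictProvider:
    """Hypothetical provider: maps a param name to a template object."""

    def __init__(self, name: str, templates: dict):
        self.name = name
        self.templates = templates

    def template(self, param: str) -> Optional[str]:
        # Return None when this provider has no template for the param.
        return self.templates.get(param)


def first_template(providers: list, param: str):
    """Return (provider name, template) from the first provider that has one."""
    for provider in providers:  # list order is the priority order
        found = provider.template(param)
        if found is not None:
            # Log successes too, so it is clear which provider was used.
            LOG.info("Template for %s provided by %s", param, provider.name)
            return provider.name, found
    LOG.error("No template found for %s", param)
    return None, None


# Assumed setup: "input" has the highest priority, "file" is tried next.
providers = [
    DictProvider("input", {"2t": "input-2t"}),
    DictProvider("file", {"2t": "file-2t", "tp": "file-tp"}),
]
```

With this ordering, a variable present in both providers resolves to the input one, and removing or reordering entries in the list changes which provider wins.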

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.


📚 Documentation preview 📚: https://anemoi-inference--383.org.readthedocs.build/en/383/

@gmertes gmertes linked an issue Nov 17, 2025 that may be closed by this pull request
5 tasks
@github-actions github-actions bot added tests and removed tests labels Nov 17, 2025
@gmertes gmertes moved this to Now In Progress in Anemoi-dev Nov 17, 2025
@github-actions github-actions bot added the tests label Nov 17, 2025
@gmertes gmertes force-pushed the 315-grib-template-manager branch from b45b878 to 22eb7c9 Compare November 17, 2025 17:05
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 21, 2025
@gmertes gmertes added the enhancement New feature or request label Nov 21, 2025
@gmertes gmertes marked this pull request as ready for review November 24, 2025 11:55
@gmertes gmertes requested a review from HCookie November 24, 2025 11:55
@frazane
Contributor

frazane commented Nov 27, 2025

Hi @gmertes, this looks really good, thanks for working on this! I think our usage will mostly revolve around the "samples" provider. In the documentation you mention that we can use keys such as grid, levtype, param in the matching rules. However, these are all mars keys and I am not sure we can use them to distinguish between all our templates. I was wondering: would other arbitrary GRIB keys (e.g. numberOfPoints, typeOfLevel, etc.) also be supported in the matching filters?

CC @dnerini

@gmertes
Member Author

gmertes commented Nov 28, 2025

It is possible but we would have to tackle that first at the dataset creation side. The rules are matched against the contents of the variable_metadata in the checkpoint metadata, which comes from the training dataset. Right now it only contains the mars language, but we can put anything we want in there at dataset creation. But if the training dataset source contains the mars language there is also no problem and you can use mars language to map to your templates (like you're already doing in the SGM model).
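
As a rough illustration of the point above, assuming a simplified rule format (the actual rule syntax in anemoi-inference may differ): a rule can only match on keys that exist in the per-variable metadata, so an arbitrary GRIB key works only if it was stored at dataset creation. The function `matches` and both metadata dicts are hypothetical.

```python
def matches(rule: dict, metadata: dict) -> bool:
    """True if every key in the rule is in metadata with an accepted value."""
    for key, expected in rule.items():
        # A rule value may be a single value or a collection of alternatives.
        if isinstance(expected, (list, tuple, set)):
            accepted = expected
        else:
            accepted = [expected]
        if metadata.get(key) not in accepted:
            return False
    return True


# Hypothetical metadata holding only mars keys, as is the case today.
mars_only = {"param": "2t", "levtype": "sfc", "grid": "O96"}

# Hypothetical metadata where an extra GRIB key was added at dataset creation.
with_grib_keys = {**mars_only, "typeOfLevel": "heightAboveGround"}
```

A rule such as `{"typeOfLevel": "heightAboveGround"}` never matches `mars_only`, but matches `with_grib_keys`.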

@frazane
Contributor

frazane commented Nov 28, 2025

> It is possible but we would have to tackle that first at the dataset creation side. The rules are matched against the contents of the variable_metadata in the checkpoint metadata, which comes from the training dataset. Right now it only contains the mars language, but we can put anything we want in there at dataset creation. But if the training dataset source contains the mars language there is also no problem and you can use mars language to map to your templates (like you're already doing in the SGM model).

Okay, I'll look into that, thanks. Unfortunately, we tried to use the mars language for our matching filters, but there are no keys in our variables_metadata.mars lookup that can help distinguish between, e.g., COSMO and ICON. Maybe a temporary solution is to manually insert some additional keys in the checkpoint.

Comment on lines +66 to +68
@cached_property
def _data(self):
    return ekd.from_source("file", self.path)
Contributor

@frazane frazane Nov 28, 2025

What do we gain by making this a cached property as opposed to just assigning an attribute inside the __init__ (we would also know earlier if e.g. something goes wrong reading the data)?

Member Author

In theory you could have unused template providers in your config (if the providers at higher priority already provided all the templates). This way the file will not be opened if the template provider is not used. I think from_source already opens the file when it's first called, so delaying that until we are sure it's needed seems like a good design?

Re "knowing if something goes wrong" I think the only thing that is checked at this call is whether the file exists, if something is wrong with the grib you will still only see it when the data is accessed.

Contributor

What you are saying is correct, my question is whether it's worth it. I would personally prefer a less sophisticated design that facilitates debugging and early exit of the program if something is wrong. "Fail fast" is also a good design.

> Re "knowing if something goes wrong" I think the only thing that is checked at this call is whether the file exists, if something is wrong with the grib you will still only see it when the data is accessed.

Ideally one should be able to parse the headers of the data upfront without actually loading it. If that were possible we could just do that already in the __init__ ... but yeah, apparently it's (shockingly) not, so that's not an option.
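
To make the trade-off in this thread concrete, here is a small sketch, not the code in this PR. `LazyFileProvider` is a hypothetical stand-in in which reading bytes replaces `ekd.from_source`, and the `check_exists` flag is an assumption added only to show what a cheap fail-fast check in `__init__` could look like alongside lazy loading.

```python
from functools import cached_property
from pathlib import Path


class LazyFileProvider:
    """Hypothetical stand-in for a file-based template provider."""

    def __init__(self, path: str, check_exists: bool = False):
        self.path = path
        self.loads = 0  # counts how often the file is actually read
        if check_exists and not Path(path).exists():
            # Optional fail-fast check that does not read the file contents.
            raise FileNotFoundError(path)

    @cached_property
    def _data(self) -> bytes:
        # Runs on first access only; the result is cached on the instance.
        self.loads += 1
        return Path(self.path).read_bytes()
```

An instance that is configured but never used does not touch the file, while `check_exists=True` surfaces a missing file at construction time.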

Member

@HCookie HCookie left a comment

Well done, very nicely written

The manager for the template provider.
index_path : str
The path to the index file.
index_path : str | list
Member

Suggested change
index_path : str | list
index : str | list

import earthkit.data as ekd

from anemoi.inference.decorators import main_argument
from anemoi.inference.inputs.ekd import find_variable
Member

Arguably this should be moved into a utils module

if self.variables and variable not in self.variables:
    return None

match self.mode:
Member

Yay first use of a match

Comment on lines +40 to +44

if fallback:
    self.fallback = fallback
else:
    self.fallback = kwargs
Member

This will ignore kwargs if fallback is set. I would suggest a .update(kwargs)?
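
One way the suggestion could look, as a sketch only: `resolve_fallback` is a hypothetical helper, and letting kwargs override entries in `fallback` on conflict is an assumption, not something decided in this PR.

```python
def resolve_fallback(fallback=None, **kwargs) -> dict:
    """Merge an explicit fallback mapping with extra keyword arguments."""
    merged = dict(fallback or {})  # copy so the caller's dict is not mutated
    merged.update(kwargs)  # kwargs are kept even when fallback is set
    return merged
```

Called as `resolve_fallback({"tp": "2t"}, msl="2t")`, both mappings end up in the result instead of the keyword arguments being dropped.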

if len(variables) > 0:
    to_log[provider] = set(variable.param for variable in variables)
    for variable in variables:
        variable._template_manager_logged = True
Member

I'm not sure I like setting a private attribute on this object. Could you not use a set of params to uniquely identify and filter?
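
A sketch of that alternative, assuming that `param` alone identifies a variable (which may not hold in the real code). `Variable` and `SuccessLog` are hypothetical stand-ins, not classes from anemoi-inference.

```python
from collections import namedtuple

# Hypothetical stand-in for the real variable object; only `param` is used.
Variable = namedtuple("Variable", "param")


class SuccessLog:
    """Tracks which params were already reported, without touching variables."""

    def __init__(self):
        self._logged = set()

    def new_params(self, variables) -> set:
        """Return the params not reported before and remember them."""
        params = {variable.param for variable in variables}
        fresh = params - self._logged
        self._logged |= fresh
        return fresh
```

The bookkeeping lives in the logger-side object, so the variable objects stay unmodified.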

Labels

ATS Approval not needed · documentation (Improvements or additions to documentation) · enhancement (New feature or request) · tests

Projects

Status: Now In Progress

Development

Successfully merging this pull request may close these issues.

Make grib template manager more user friendly

4 participants