Download and convert satellite data for use in ML pipelines
Satellite data is a valuable resource for training machine learning models. Forecasting renewable generation requires knowledge of the weather conditions, and those weather conditions can be inferred and enriched using satellite data.
EUMETSAT provide a range of satellite data products, which are readily available in NAT image format. To make this data more accessible for training models, this consumer processes the downloaded data into the Zarr format.
Note
This repo is in early development and so will undergo rapid changes. Breaking changes may occur in the CLI and the API without warning.
Install using the container image:
$ docker pull ghcr.io/openclimatefix/satellite-consumer
$ docker run \
-e SATCONS_COMMAND=consume \
-e SATCONS_SATELLITE=rss \
-e EUMETSAT_CONSUMER_KEY=<your-key> \
-e EUMETSAT_CONSUMER_SECRET=<your-secret> \
-v $(pwd)/work:/work \
ghcr.io/openclimatefix/satellite-consumer
For a description of all the possible configuration options, see Documentation.
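If the container is launched from a Python pipeline rather than a shell, the `docker run` invocation above can be assembled programmatically. The following is a minimal sketch, not part of this project: `build_docker_command` and its parameters are hypothetical names. It relies on the standard Docker behaviour that `-e NAME` without a value forwards that variable from the calling environment, which keeps the EUMETSAT credentials out of the argument list.

```python
import os
from typing import List, Mapping, Sequence

IMAGE = "ghcr.io/openclimatefix/satellite-consumer"


def build_docker_command(
    options: Mapping[str, str],
    workdir: str,
    forward: Sequence[str] = ("EUMETSAT_CONSUMER_KEY", "EUMETSAT_CONSUMER_SECRET"),
) -> List[str]:
    """Build an argument list equivalent to the `docker run` example above.

    Hypothetical helper for illustration only; it does not run anything.
    """
    cmd = ["docker", "run"]
    # Explicit NAME=value pairs, e.g. SATCONS_COMMAND=consume.
    for name, value in sorted(options.items()):
        cmd += ["-e", f"{name}={value}"]
    # `-e NAME` alone makes Docker read the value from the caller's environment.
    for name in forward:
        cmd += ["-e", name]
    # Mount the host working directory at /work, as in the shell example.
    cmd += ["-v", f"{os.path.abspath(workdir)}:/work", IMAGE]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the two credential variables are set in the calling environment.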
When running the satellite consumer using the environment entrypoint (as in the docker container), the following environment variables are available:
Variable | Default | Description |
---|---|---|
SATCONS_COMMAND | | The command to run. |
SATCONS_SATELLITE | | The satellite to consume data from. |
SATCONS_MONTH | | The month to consume data for (when using the archive command). |
SATCONS_TIME | | The time to consume data for (when using the consume command). Leave unset to download latest available. |
SATCONS_VALIDATE | false | Whether to validate the downloaded data. |
SATCONS_HRV | false | Whether to download the HRV channel. |
SATCONS_RESCALE | false | Whether to rescale the downloaded data to the unit interval. |
SATCONS_WORKDIR | /mnt/disks/sat | The working directory. In the container, this is set to /work for easy mounting. |
SATCONS_NUM_WORKERS | 1 | The number of workers to use for processing. |
SATCONS_ZIP | false | Whether to zip the processed data to a latest.zarr.zip file. Only valid for the consume command. |
EUMETSAT_CONSUMER_KEY | | The EUMETSAT consumer key. |
EUMETSAT_CONSUMER_SECRET | | The EUMETSAT consumer secret. |
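
To illustrate how the variables and defaults listed above fit together, here is a minimal sketch that reads them into a typed settings object. It is not the project's actual configuration code: `ConsumerSettings`, `load_settings` and the boolean parsing rules are assumptions made for this example, as is treating the variables without a default as required. The credential variables are left out for brevity.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(raw: str) -> bool:
    """Interpret common truthy strings; anything else counts as False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ConsumerSettings:
    """Hypothetical container for the SATCONS_* variables."""

    command: str
    satellite: str
    month: Optional[str]
    time: Optional[str]
    validate: bool
    hrv: bool
    rescale: bool
    workdir: str
    num_workers: int
    zip_output: bool


def load_settings(env: Mapping[str, str] = os.environ) -> ConsumerSettings:
    # Assumption: variables listed without a default must be provided.
    missing = [k for k in ("SATCONS_COMMAND", "SATCONS_SATELLITE") if not env.get(k)]
    if missing:
        raise ValueError(f"missing required variables: {', '.join(missing)}")

    settings = ConsumerSettings(
        command=env["SATCONS_COMMAND"],
        satellite=env["SATCONS_SATELLITE"],
        month=env.get("SATCONS_MONTH"),  # only used by the archive command
        time=env.get("SATCONS_TIME"),    # unset means "latest available"
        validate=_as_bool(env.get("SATCONS_VALIDATE", "false")),
        hrv=_as_bool(env.get("SATCONS_HRV", "false")),
        rescale=_as_bool(env.get("SATCONS_RESCALE", "false")),
        workdir=env.get("SATCONS_WORKDIR", "/mnt/disks/sat"),
        num_workers=int(env.get("SATCONS_NUM_WORKERS", "1")),
        zip_output=_as_bool(env.get("SATCONS_ZIP", "false")),
    )
    # The table states that zipping is only valid for the consume command.
    if settings.zip_output and settings.command != "consume":
        raise ValueError("SATCONS_ZIP is only valid for the consume command")
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests, for example `load_settings({"SATCONS_COMMAND": "consume", "SATCONS_SATELLITE": "rss"})`.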
Currently the consumer is built to the specific data requirements of Open Climate Fix.
However, adding a new satellite from EUMETSAT shouldn't be too hard, provided it uses the same seviri_l1b_native format and sensor channels: just update the available satellites in config.py.
The Python package contains a CLI entrypoint for ease of use when developing. It is available in your shell as the sat-consumer-cli command, assuming you have built the project in a virtual environment and activated it.
This project uses MyPy for static type checking and Ruff for linting. Installing the development dependencies makes them available in your virtual environment.
Use them via:
$ python -m mypy .
$ python -m ruff check .
Be sure to do this periodically while developing to catch any errors early and prevent headaches with the CI pipeline. It may seem like a hassle at first, but it prevents accidental creation of a whole suite of bugs.
Running the tests requires some additional dependencies. Be sure to pass --extra=dev to the pip install -e . command when creating your virtualenv.
(Or use uv and let it do it for you!)
Run the unit tests with:
$ python -m unittest discover -s src/satellite_consumer -p "test_*.py"
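
For orientation, a test module that the `test_*.py` discovery pattern would pick up might look like the sketch below. The function under test, `rescale_to_unit_interval`, is a hypothetical stand-in loosely inspired by the SATCONS_RESCALE option; it is not taken from this codebase, and the file name `test_rescale.py` is likewise an assumption.

```python
# test_rescale.py (hypothetical file name matching the "test_*.py" pattern)
import unittest
from typing import List, Sequence


def rescale_to_unit_interval(
    values: Sequence[float], minimum: float, maximum: float
) -> List[float]:
    """Linearly map values from [minimum, maximum] onto [0, 1].

    Hypothetical helper defined here only so the test is self-contained.
    """
    span = maximum - minimum
    if span <= 0:
        raise ValueError("maximum must be greater than minimum")
    return [(v - minimum) / span for v in values]


class TestRescale(unittest.TestCase):
    def test_maps_endpoints_and_midpoint(self) -> None:
        result = rescale_to_unit_interval([0.0, 5.0, 10.0], 0.0, 10.0)
        self.assertEqual(result, [0.0, 0.5, 1.0])

    def test_rejects_empty_range(self) -> None:
        with self.assertRaises(ValueError):
            rescale_to_unit_interval([1.0], 2.0, 2.0)
```

In a real module the helper would be imported from the package rather than defined alongside the test.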
On the directory structure:
- The official PyPA discussion on "source" and "flat" layouts.
- PRs are welcome! See the Organisation Profile for details on contributing
- Find out about our other projects here
- Check out the OCF blog for updates
- Follow OCF on LinkedIn
Part of the Open Climate Fix community.