Commit 19b5ea2

remove clone depth from ocf-data-sampler install in readme
1 parent e9837bb commit 19b5ea2

1 file changed: +23 -19 lines changed


README.md

Lines changed: 23 additions & 19 deletions
@@ -1,11 +1,13 @@
11
# PVNet 2.1
2+
23
<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
4+
35
[![All Contributors](https://img.shields.io/badge/all_contributors-8-orange.svg?style=flat-square)](#contributors-)
6+
47
<!-- ALL-CONTRIBUTORS-BADGE:END -->
58

69
[![Python Bump Version & release](https://github.com/openclimatefix/PVNet/actions/workflows/release.yml/badge.svg)](https://github.com/openclimatefix/PVNet/actions/workflows/release.yml) [![ease of contribution: hard](https://img.shields.io/badge/ease%20of%20contribution:%20hard-bb2629)](https://github.com/openclimatefix/ocf-meta-repo?tab=readme-ov-file#overview-of-ocfs-nowcasting-repositories)
710

8-
911
This project is used for training PVNet and running PVNet on live data.
1012

1113
PVNet2 is a multi-modal late-fusion model that largely inherits the same architecture from
@@ -18,7 +20,6 @@ feature vector is put through an output network which outputs predictions of the
1820
future GSP yield. National forecasts are made by adding all the GSP forecasts
1921
together.
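
As a rough illustration of the summation described above, the sketch below adds per-GSP forecasts at each forecast horizon. The dictionary layout, the GSP ids, and the function name `national_forecast` are assumptions made for this example; this is not PVNet's actual code.

```python
def national_forecast(gsp_forecasts):
    """Sum per-GSP forecasts horizon by horizon.

    `gsp_forecasts` maps a GSP id to a list of forecast values, one per
    horizon step (hypothetical layout; all lists must have equal length).
    """
    return [sum(values) for values in zip(*gsp_forecasts.values())]


# Made-up numbers for two GSPs and three horizon steps
example = {1: [1.0, 2.0, 3.0], 2: [0.5, 0.5, 0.5]}
print(national_forecast(example))  # [1.5, 2.5, 3.5]
```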
2022

21-
2223
## Experiments
2324

2425
Our paper based on this repo was accepted into the Tackling Climate Change with Machine Learning workshop at ICLR 2024 and can be viewed [here](https://www.climatechange.ai/papers/iclr2024/46).
@@ -28,8 +29,6 @@ Some slightly more structured notes on deliberate experiments we have performed
2829
Some very rough, early working notes on this model are
2930
[here](https://docs.google.com/document/d/1fbkfkBzp16WbnCg7RDuRDvgzInA6XQu3xh4NCjV-WDA). These are now somewhat out of date.
3031

31-
32-
3332
## Setup / Installation
3433

3534
```bash
@@ -39,9 +38,11 @@ pip install .
3938
```
4039

4140
The commit history is extensive. To save download time, use a depth of 1:
41+
4242
```bash
4343
git clone --depth 1 https://github.com/openclimatefix/PVNet.git
4444
```
45+
4546
This means only the latest commit and its associated files will be downloaded.
4647

4748
Next, in the PVNet repo, install PVNet as an editable package:
@@ -56,8 +57,6 @@ pip install -e .
5657
pip install ".[dev]"
5758
```
5859

59-
60-
6160
## Getting started with running PVNet
6261

6362
Before running any code in PVNet, copy the example configuration to a
@@ -76,29 +75,29 @@ As a minimum, in order to create batches of data/run PVNet, you will need to
7675
supply paths to NWP and GSP data. PV data can also be used. We list some
7776
suggested locations for downloading such datasets below:
7877

79-
**GSP (Grid Supply Point)** - Regional PV generation data\
78+
**GSP (Grid Supply Point)** - Regional PV generation data
8079
The University of Sheffield provides API access to download this data:
8180
https://www.solar.sheffield.ac.uk/api/
8281

8382
Documentation for querying generation data aggregated by GSP region can be found
8483
here:
8584
https://docs.google.com/document/d/e/2PACX-1vSDFb-6dJ2kIFZnsl-pBQvcH4inNQCA4lYL9cwo80bEHQeTK8fONLOgDf6Wm4ze_fxonqK3EVBVoAIz/pub#h.9d97iox3wzmd
8685

87-
**NWP (Numerical weather predictions)**\
86+
**NWP (Numerical weather predictions)**
8887
OCF maintains a Zarr-formatted version of the German Weather Service's (DWD)
8988
ICON-EU NWP model, whose coverage includes the UK, here:
9089
https://huggingface.co/datasets/openclimatefix/dwd-icon-eu
9190

92-
**PV**\
91+
**PV**
9392
OCF maintains a dataset of PV generation from 1311 private PV installations
9493
here: https://huggingface.co/datasets/openclimatefix/uk_pv
9594

96-
9795
### Connecting with ocf-data-sampler for batch creation
9896

9997
Outside the PVNet repo, clone the ocf-data-sampler repo (https://github.com/openclimatefix/ocf-data-sampler) and exit the conda env created for PVNet:
98+
10099
```bash
101-
git clone --depth 1 https://github.com/openclimatefix/ocf-data-sampler.git
100+
git clone https://github.com/openclimatefix/ocf-data-sampler.git
102101
conda create -n ocf-data-sampler python=3.11
103102
```
104103

@@ -114,11 +113,14 @@ Then exit this environment, and enter back into the pvnet conda environment and
114113
pip install -e <PATH-TO-ocf-data-sampler-REPO>
115114
```
116115

116+
If you install a local version of `ocf-data-sampler` that is more recent than the version specified in PVNet, you might receive a warning. However, it should still function correctly.
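
To see which version is actually installed in the active environment, a small check along the following lines may help. It uses the standard-library `importlib.metadata`; the distribution name `ocf-data-sampler` is assumed to match the repo name, and the helper `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Assumption: the distribution is published under the repo name
print(installed_version("ocf-data-sampler"))
```

Comparing this against the version PVNet specifies should show whether a warning is to be expected.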
117+
117118
## Generating pre-made batches of data for training/validation of PVNet
118119

119120
PVNet contains a script for generating batches of data suitable for training the PVNet models. To run the script you will need to make some modifications to the datamodule configuration.
120121

121122
Make sure you have copied the example configs (as already stated above):
123+
122124
```bash
123125
cp -r configs.example configs
124126
```
@@ -127,7 +129,6 @@ cp -r configs.example configs
127129

128130
We will use the following example config file for creating batches: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. You will need to comment out or delete the parts of `example_configuration.yaml` pertaining to the data you are not using.
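
The search for `PLACEHOLDER` can also be scripted. The sketch below lists every line of a config text that still contains the marker; the helper name `find_placeholders` and the sample YAML keys are assumptions for illustration only and may not match the real example config.

```python
def find_placeholders(text, marker="PLACEHOLDER"):
    """Return (line_number, stripped_line) for each line containing `marker`."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if marker in line
    ]


# Hypothetical config fragment, not copied from the real file
sample = """\
gsp:
  zarr_path: PLACEHOLDER.zarr
nwp:
  ukv:
    zarr_path: /data/nwp.zarr
"""

for number, line in find_placeholders(sample):
    print(f"line {number}: {line}")  # line 2: zarr_path: PLACEHOLDER.zarr
```

To check the real file, pass in its contents, for example via `pathlib.Path(...).read_text()`.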
129131

130-
131132
When creating batches, an additional datamodule config located in `PVNet/configs/datamodule` is passed into the batch creation script: `streamed_batches.yaml`. As before, a placeholder variable is used when specifying which configuration to use:
132133

133134
```yaml
@@ -151,6 +152,7 @@ Run the `save_samples.py` script to create batches with the parameters specified
151152
```bash
152153
python scripts/save_samples.py
153154
```
155+
154156
PVNet uses
155157
[hydra](https://hydra.cc/) which enables us to pass variables via the command
156158
line that will override the configuration defined in the `./configs` directory, like this:
@@ -185,7 +187,6 @@ satellite:
185187

186188
ocf-data-sampler is currently set up to use 11 of the 12 satellite channels; the 12th, HRV, is not included.
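
As a minimal sketch of that channel selection, the snippet below drops HRV from a list of 12 channels. The channel names are assumed to follow the usual SEVIRI naming and are not taken from the ocf-data-sampler config, so check them against your own data.

```python
# Assumed SEVIRI-style channel names (verify against your satellite dataset)
ALL_CHANNELS = [
    "HRV", "IR_016", "IR_039", "IR_087", "IR_097", "IR_108",
    "IR_120", "IR_134", "VIS006", "VIS008", "WV_062", "WV_073",
]


def non_hrv_channels(channels):
    """Return the channels with HRV removed, preserving order."""
    return [name for name in channels if name != "HRV"]


selected = non_hrv_channels(ALL_CHANNELS)
print(len(selected))  # 11
```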
187189

188-
189190
### Training PVNet
190191

191192
How PVNet is run is determined by the extensive configuration in the config
@@ -194,13 +195,13 @@ files. The configs stored in `PVNet/configs.example` should work with batches cr
194195
Make sure to update the following config files before training your model:
195196

196197
1. In `configs/datamodule/local_premade_batches.yaml`:
197-
- update `batch_dir` to point to the directory you stored your batches in during batch creation
198+
- update `batch_dir` to point to the directory you stored your batches in during batch creation
198199
2. In `configs/model/local_multimodal.yaml`:
199-
- update the list of encoders to reflect the data sources you are using. If you are using different NWP sources, the encoders for these should follow the same structure with two important updates:
200-
- `in_channels`: number of variables your NWP source supplies
201-
- `image_size_pixels`: spatial crop of your NWP data. It depends on the spatial resolution of your NWP and should match `image_size_pixels_height` and/or `image_size_pixels_width` in `datamodule/configuration/site_example_configuration.yaml` for the NWP, unless a transformation such as coarsening was applied (e.g. as for ECMWF data)
200+
- update the list of encoders to reflect the data sources you are using. If you are using different NWP sources, the encoders for these should follow the same structure with two important updates:
201+
- `in_channels`: number of variables your NWP source supplies
202+
- `image_size_pixels`: spatial crop of your NWP data. It depends on the spatial resolution of your NWP and should match `image_size_pixels_height` and/or `image_size_pixels_width` in `datamodule/configuration/site_example_configuration.yaml` for the NWP, unless a transformation such as coarsening was applied (e.g. as for ECMWF data)
202203
3. In `configs/local_trainer.yaml`:
203-
- set `accelerator: 0` if running on a system without a supported GPU
204+
- set `accelerator: 0` if running on a system without a supported GPU
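
The consistency rule for NWP encoders in step 2 can be sketched as a quick check. The key names `in_channels`, `image_size_pixels`, `image_size_pixels_height` and `image_size_pixels_width` come from the text above; the flat dictionary layout and the function `check_nwp_encoder` are assumptions for illustration and do not mirror the real config structure.

```python
def check_nwp_encoder(encoder_cfg, nwp_data_cfg, n_variables):
    """Return a list of mismatches between encoder and data config.

    An empty list means no mismatch was found. A size mismatch can be
    legitimate if the data is transformed (e.g. coarsened) after cropping.
    """
    problems = []
    if encoder_cfg["in_channels"] != n_variables:
        problems.append("in_channels does not equal the number of NWP variables")
    data_sizes = (
        nwp_data_cfg["image_size_pixels_height"],
        nwp_data_cfg["image_size_pixels_width"],
    )
    if encoder_cfg["image_size_pixels"] not in data_sizes:
        problems.append("image_size_pixels matches neither height nor width")
    return problems


# Made-up values for illustration
encoder = {"in_channels": 2, "image_size_pixels": 24}
data = {"image_size_pixels_height": 24, "image_size_pixels_width": 24}
print(check_nwp_encoder(encoder, data, n_variables=2))  # []
```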
204205

205206
If creating copies of the config files instead of modifying existing ones, update `defaults` in the main `./configs/config.yaml` file to use
206207
your customised config files:
@@ -228,7 +229,6 @@ python run.py
228229

229230
If you have successfully trained a PVNet model and have a saved model checkpoint, you can use it to create a backtest, i.e. forecasts on historical data to evaluate forecast accuracy/skill. To do this, run one of the scripts in this repo, such as [the UK GSP backtest script](scripts/backtest_uk_gsp.py) or [the PV site backtest script](scripts/backtest_sites.py). Further information on how to run these is given in each backtest file.
230231

231-
232232
## Testing
233233

234234
You can run the tests with `python -m pytest tests`.
@@ -238,8 +238,11 @@ You can use `python -m pytest tests` to run tests
238238
Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
239239

240240
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
241+
241242
<!-- prettier-ignore-start -->
243+
242244
<!-- markdownlint-disable -->
245+
243246
<table>
244247
<tbody>
245248
<tr>
@@ -258,6 +261,7 @@ Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/d
258261
</table>
259262

260263
<!-- markdownlint-restore -->
264+
261265
<!-- prettier-ignore-end -->
262266

263267
<!-- ALL-CONTRIBUTORS-LIST:END -->
