[](https://github.com/openclimatefix/PVNet/actions/workflows/release.yml)
4
4
5
-
This project is used for training PVNet and running PVnet on live data.
5
+
This project is used for training PVNet and running PVNet on live data.
6
6
7
7
PVNet2 is a multi-modal late-fusion model that largely inherits the same architecture from
8
-
[PVNet1.0](https://github.com/openclimatefix/predict_pv_yield). The NWP and
8
+
[PVNet1.0](https://github.com/openclimatefix/predict_pv_yield). The NWP (Numerical Weather Prediction) and
9
9
satellite data are sent through some neural network which encodes them down to
10
-
1D intermediate representations. These are concatenated together with the GSP
10
+
1D intermediate representations. These are concatenated together with the GSP (Grid Supply Point)
11
11
output history, the calculated solar coordinates (azimuth and elevation) and the
12
12
GSP ID which has been put through an embedding layer. This 1D concatenated
13
13
feature vector is put through an output network which outputs predictions of the
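The late-fusion step described above can be illustrated with a toy sketch in plain Python. It is not PVNet's implementation: the mean-pooling "encoder", the dictionary used as an embedding table, the single linear output layer, and every function name below are hypothetical stand-ins for the learned networks the README refers to.

```python
def mean_pool_encoder(image, out_dim):
    """Stand-in encoder: reduce a 2D grid of floats to a 1D vector.

    Hypothetical placeholder for a learned neural-network encoder.
    """
    flat = [value for row in image for value in row]
    chunk = max(1, len(flat) // out_dim)
    return [sum(flat[i * chunk:(i + 1) * chunk]) / chunk for i in range(out_dim)]


def late_fusion_features(nwp, satellite, gsp_history, azimuth, elevation,
                         gsp_id, embedding_table, encoded_dim=4):
    """Encode each modality separately, then concatenate into one 1D vector."""
    nwp_vec = mean_pool_encoder(nwp, encoded_dim)        # NWP -> 1D representation
    sat_vec = mean_pool_encoder(satellite, encoded_dim)  # satellite -> 1D representation
    id_vec = embedding_table[gsp_id]                     # GSP ID -> embedding vector
    return (nwp_vec + sat_vec + list(gsp_history)
            + [azimuth, elevation] + list(id_vec))


def linear_output_network(features, weights, bias):
    """Stand-in output network: one linear layer, one row of weights per output."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


# Illustrative inputs only; the shapes and values are assumptions.
embedding_table = {0: [0.1, 0.2]}
features = late_fusion_features(
    nwp=[[1.0] * 4 for _ in range(4)],
    satellite=[[2.0] * 4 for _ in range(4)],
    gsp_history=[0.5, 0.6, 0.7],
    azimuth=180.0,
    elevation=30.0,
    gsp_id=0,
    embedding_table=embedding_table,
)
weights = [[0.0] * len(features) for _ in range(5)]
predictions = linear_output_network(features, weights, [1.0] * 5)
print(len(features), len(predictions))
```

The point of the sketch is the ordering: each image-like input is reduced to a 1D vector on its own, and the modalities only meet at the concatenation that feeds the output network.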
@@ -56,7 +56,7 @@ pip install ".[dev]"
56
56
57
57
## Getting started with running PVNet
58
58
59
-
Before running any code in within PVNet, copy the example configuration to a
59
+
Before running any code in PVNet, copy the example configuration to a
60
60
configs directory:
61
61
62
62
```
@@ -74,14 +74,14 @@ suggested locations for downloading such datasets below:
Ensure that the file paths are set to the correct locations in
127
-
`gcp_configuration.yaml`.
124
+
We will use the following example config file for creating batches: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. You will need to comment out or delete the parts of `example_configuration.yaml` pertaining to the data you are not using.
128
125
129
-
`PLACEHOLDER` is used to indcate where to input the location of the files.
130
126
131
-
For OCF use cases, file locations can be found in `template_configuration.yaml` located alongside `gcp_configuration.yaml`.
127
+
When creating batches, an additional datamodule config located in `PVNet/configs/datamodule` is passed into the batch creation script: `streamed_batches.yaml`. Like before, a placeholder variable is used when specifying which configuration to use:
132
128
133
-
In these configurations you can update the train, val & test periods to cover the data you have access to.
134
-
135
-
136
-
With your configuration in place, you can proceed to create batches. PVNet uses
137
-
[hydra](https://hydra.cc/) which enables us to pass variables via the command
138
-
line that will override the configuration defined in the `./configs` directory.
139
-
140
-
When creating batches, an additional config is used which is passed into the batch creation script. This is the datamodule config located `PVNet/configs/datamodule`.
141
-
142
-
For this example we will be using the `streamed_batches.yaml` config. Like before, a placeholder variable is used when specifing which configuration to use:
143
-
144
-
`configuration: "PLACEHOLDER.yaml"`
129
+
```yaml
130
+
configuration: "PLACEHOLDER.yaml"
131
+
```
145
132
146
-
This should be given the whole path to the config on your local machine, such as for our example it should be changed to:
133
+
This should be given the whole path to the config on your local machine, for example:
In this function the datamodule argument looks for a config under `PVNet/configs/datamodule`. The examples here are either to use "premade_batches" or "streamed_batches".
167
-
168
-
Its important that the dates set for the training, validation and testing in the datamodule (`streamed_batches.yaml`) config are within the ranges of the dates set for the input features in the configuration (`gcp_configuration.yaml`).
158
+
`scripts/save_batches.py` needs a config under `PVNet/configs/datamodule`. You can adapt `streamed_batches.yaml` or create your own in the same folder.
169
159
170
-
If downloading private data from a gcp bucket make sure to authenticate gcloud (the public satellite data does not need authentication):
160
+
If downloading private data from a GCP bucket, make sure to authenticate gcloud (the public satellite data does not need authentication):
171
161
172
162
```
173
163
gcloud auth login
174
164
```
175
165
176
-
For files stored in multiple locations they can be added as list. For example from the gcp_configuration.yaml file we can change from satellite data stored on a bucket:
166
+
Files stored in multiple locations can be added as a list. For example, in the `example_configuration.yaml` file we can supply a path to satellite data stored on a bucket:
accelerator: cpu # Important if running on a system without a supported GPU
318
-
devices: auto
319
-
320
-
min_epochs: null
321
-
max_epochs: null
322
-
reload_dataloaders_every_n_epochs: 0
323
-
num_sanity_val_steps: 8
324
-
fast_dev_run: false
325
-
accumulate_grad_batches: 4
326
-
log_every_n_steps: 50
327
-
```
192
+
1. In `configs/datamodule/local_premade_batches.yaml`:
193
+
- update `batch_dir` to point to the directory you stored your batches in during batch creation
194
+
2. In `configs/model/local_multimodal.yaml`:
195
+
- update the list of encoders to reflect the data sources you are using. If you are using different NWP sources, the encoders for these should follow the same structure with two important updates:
196
+
- `in_channels`: number of variables your NWP source supplies
197
+
- `image_size_pixels`: spatial crop of your NWP data. It depends on the spatial resolution of your NWP and should match `nwp_image_size_pixels_height` and/or `nwp_image_size_pixels_width` in `datamodule/example_configs.yaml`, unless transformations such as coarsening were applied (e.g. as for ECMWF data)
198
+
3. In `configs/local_trainer.yaml`:
199
+
- set `accelerator: cpu` if running on a system without a supported GPU
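The encoder settings in step 2 are easy to get out of sync with the datamodule configuration, so a quick consistency check can help. The following is a minimal sketch that works on plain dictionaries; the function name `check_nwp_encoder` and its arguments are hypothetical and not part of PVNet, and only the keys `in_channels` and `image_size_pixels` come from the steps above.

```python
def check_nwp_encoder(encoder_cfg, n_variables, crop_height, crop_width,
                      coarsened=False):
    """Return a list of mismatch messages (empty if nothing looks wrong).

    encoder_cfg: dict holding the encoder's `in_channels` and `image_size_pixels`.
    n_variables: number of variables the NWP source supplies.
    crop_height, crop_width: values of `nwp_image_size_pixels_height` and
        `nwp_image_size_pixels_width` from the datamodule configuration.
    coarsened: True if a transformation such as coarsening changes the image
        size, in which case the size comparison is skipped.
    """
    problems = []
    if encoder_cfg.get("in_channels") != n_variables:
        problems.append(
            f"in_channels={encoder_cfg.get('in_channels')} "
            f"but the NWP source supplies {n_variables} variables"
        )
    if not coarsened:
        size = encoder_cfg.get("image_size_pixels")
        if size not in (crop_height, crop_width):
            problems.append(
                f"image_size_pixels={size} matches neither the configured "
                f"height ({crop_height}) nor width ({crop_width})"
            )
    return problems


# Illustrative values only.
for message in check_nwp_encoder(
    {"in_channels": 3, "image_size_pixels": 12},
    n_variables=2, crop_height=24, crop_width=24,
):
    print(message)
```

In practice the dictionaries would be read from the YAML files named in the steps; loading them is left out here to keep the sketch free of third-party dependencies.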
328
200
329
-
And finally update `defaults` in the main `./configs/config.yaml` file to use
201
+
If creating copies of the config files instead of modifying existing ones, update `defaults` in the main `./configs/config.yaml` file to use
330
202
your customised config files:
331
203
332
204
```yaml
@@ -350,7 +222,7 @@ python run.py
350
222
351
223
## Backtest
352
224
353
-
If you have succesfully trained a PVNet model and have a saved model checkpoint you can create a backtest using this, e.g. forecasts on historical data to evaluate forecast accuracy/skill. This can be done by running one of the scripts in this repo such as [the UK gsp backtest script](scripts/backtest_uk_gsp.py) or the [the pv site backtest script](scripts/backtest_sites.py), further info on how to run these are in each backtest file.
225
+
If you have successfully trained a PVNet model and have a saved model checkpoint, you can use it to create a backtest, i.e. forecasts on historical data to evaluate forecast accuracy/skill. This can be done by running one of the scripts in this repo, such as [the UK GSP backtest script](scripts/backtest_uk_gsp.py) or [the PV site backtest script](scripts/backtest_sites.py). Further information on how to run these is in each backtest file.