Commit 823020a

Made configs editable without risk of versioning + update readme with detailed getting started instructions
1 parent adeb778 commit 823020a

30 files changed, with 316 additions and 18 deletions.

.gitignore

+2
@@ -1,7 +1,9 @@
 # Custom
 config_tree.txt
+configs/
 lightning_logs/
 logs/
+output/
 checkpoints*
 csv/
 notebooks/

README.md

+313-17
Original file line numberDiff line numberDiff line change
@@ -4,40 +4,336 @@
44

55
This project is used for training PVNet and running PVNet on live data.
66

7-
PVNet2 largely inherits the same architecture from [PVNet1.0](https://github.com/openclimatefix/predict_pv_yield).
8-
The NWP and satellite data are sent through some neural network which encodes them down to 1D intermediate representations.
9-
These are concatenated together with the GSP output history, the calculated solar coordinates (azimuth and elevation) and the GSP ID which has been put through an embedding layer.
10-
This 1D concatenated feature vector is put through an output network which outputs predictions of the future GSP yield.
11-
National forecasts are made by adding all the GSP forecasts together.
7+
PVNet2 largely inherits the same architecture from
8+
[PVNet1.0](https://github.com/openclimatefix/predict_pv_yield). The NWP and
9+
satellite data are sent through some neural network which encodes them down to
10+
1D intermediate representations. These are concatenated together with the GSP
11+
output history, the calculated solar coordinates (azimuth and elevation) and the
12+
GSP ID which has been put through an embedding layer. This 1D concatenated
13+
feature vector is put through an output network which outputs predictions of the
14+
future GSP yield. National forecasts are made by adding all the GSP forecasts
15+
together.
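
As a rough illustration of the data flow described above (not the PVNet implementation), the sketch below uses fixed random linear maps as stand-ins for the encoders and the output network. All function names, vector sizes and seeds are hypothetical; only the concatenation of the five inputs and the summation into a national forecast come from the description.

```python
import random


def linear_stub(values, out_features, seed):
    """Stand-in for a trained network: a fixed random linear map to a 1D vector."""
    rng = random.Random(seed)
    return [sum(rng.uniform(-1, 1) * v for v in values) for _ in range(out_features)]


def embed_gsp_id(gsp_id, dim=4):
    """Stand-in for an embedding layer: one fixed pseudo-random vector per GSP ID."""
    rng = random.Random(gsp_id)
    return [rng.uniform(-1, 1) for _ in range(dim)]


def build_features(nwp, sat, gsp_history, sun, gsp_id):
    """Encode NWP and satellite inputs to 1D, then concatenate with the other inputs."""
    nwp_vec = linear_stub(nwp, 8, seed=1)  # hypothetical encoder output size
    sat_vec = linear_stub(sat, 8, seed=2)  # hypothetical encoder output size
    return nwp_vec + sat_vec + list(gsp_history) + list(sun) + embed_gsp_id(gsp_id)


def forecast_gsp(features, horizon_steps=4):
    """Stand-in for the output network: one value per future time step."""
    return linear_stub(features, horizon_steps, seed=3)


def national_forecast(gsp_forecasts):
    """Sum the per-GSP forecasts step by step."""
    return [sum(step) for step in zip(*gsp_forecasts)]


# Flattened toy inputs; sun = [azimuth, elevation].
forecasts = [
    forecast_gsp(build_features([0.1] * 6, [0.2] * 6, [0.3, 0.4], [180.0, 35.0], gsp_id))
    for gsp_id in (1, 2, 3)
]
print(national_forecast(forecasts))
```

The real encoders are convolutional networks operating on image sequences; the stubs here only show where each input enters the feature vector.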
16+
17+
## Setup / Installation
1218

13-
## Setup
1419
```bash
1520
git clone https://github.com/openclimatefix/PVNet.git
1621
cd PVNet
1722
pip install -r requirements.txt
18-
pip install git+https://github.com/SheffieldSolar/PV_Live-API
1923
```
2024

21-
## Running
25+
### Additional development dependencies
26+
2227
```bash
23-
python run.py
28+
pip install -r requirements-dev.txt
2429
```
2530

26-
## Development
27-
```bash
28-
pip install -r requirements.txt -r requirements-dev.txt
29-
pytest
31+
## Getting started with running PVNet
32+
33+
Before running any code within PVNet, copy the example configuration to a
34+
configs directory:
35+
36+
```
37+
cp -r configs.example configs
38+
```
39+
40+
You will be making local amendments to these configs.
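
On systems without `cp`, the same copy can be done from Python. This is a minimal sketch using only the standard library; the helper name `copy_example_configs` is hypothetical and not part of PVNet. It refuses to touch an existing `configs` directory, so local amendments are not overwritten.

```python
import shutil
from pathlib import Path


def copy_example_configs(src="configs.example", dst="configs"):
    """Copy src to dst unless dst already exists; return True if a copy was made."""
    src_path, dst_path = Path(src), Path(dst)
    if dst_path.exists():
        # Leave existing local configs untouched.
        return False
    shutil.copytree(src_path, dst_path)
    return True


# Usage, from the repository root:
#     copy_example_configs()
```

Note that `cp -r configs.example configs` behaves differently when `configs` already exists (it copies into `configs/configs.example`), whereas this helper simply does nothing.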
41+
42+
### Datasets
43+
44+
As a minimum, in order to create batches of data/run PVNet, you will need to
45+
supply paths to NWP and GSP data. PV data can also be used. We list some
46+
suggested locations for downloading such datasets below:
47+
48+
**GSP (Grid Supply Point)** - Regional PV generation data\
49+
The University of Sheffield provides API access to download this data:
50+
https://www.solar.sheffield.ac.uk/pvlive/api/
51+
52+
Documentation for querying generation data aggregated by GSP region can be found
53+
here:
54+
https://docs.google.com/document/d/e/2PACX-1vSDFb-6dJ2kIFZnsl-pBQvcH4inNQCA4lYL9cwo80bEHQeTK8fONLOgDf6Wm4ze_fxonqK3EVBVoAIz/pub#h.9d97iox3wzmd
55+
56+
**NWP (Numerical weather predictions)**\
57+
OCF maintains a Zarr-formatted version of the German Weather Service's (DWD)
58+
ICON-EU NWP model here:
59+
https://huggingface.co/datasets/openclimatefix/dwd-icon-eu which includes the UK
60+
61+
**PV**\
62+
OCF maintains a dataset of PV generation from 1311 private PV installations
63+
here: https://huggingface.co/datasets/openclimatefix/uk_pv
64+
65+
### Generating pre-made batches of data for training/validation of PVNet
66+
67+
PVNet contains a script for generating batches of data suitable for training the
68+
PVNet models.
69+
70+
To run the script you will need to make some modifications to the datamodule
71+
configuration.
72+
73+
1. First, create your new configuration file in
74+
`./configs/datamodule/configuration/local_configuration.yaml` and paste the
75+
sample config (shown below)
76+
2. Duplicate the `./configs/datamodule/ocf_datapipes.yaml` to
77+
`./configs/datamodule/local_ocf_datapipes.yaml` and ensure the
78+
`configuration` key points to your newly created configuration file in
79+
step 1.
80+
3. Also in this file, update the train, val & test periods to cover the data you
81+
have access to.
82+
4. To get you started with your own configuration file, see the sample config
83+
below. Update the data paths to the location of your local GSP, NWP and PV
84+
datasets:
85+
86+
```yaml
87+
general:
88+
description: Demo config
89+
name: demo_datamodule_config
90+
91+
input_data:
92+
default_history_minutes: 60
93+
default_forecast_minutes: 120
94+
95+
gsp:
96+
gsp_zarr_path: /path/to/gsp-data.zarr
97+
history_minutes: 60
98+
forecast_minutes: 120
99+
time_resolution_minutes: 30
100+
start_datetime: "2019-01-01T00:00:00"
101+
end_datetime: "2019-01-08T00:00:00"
102+
metadata_only: false
103+
104+
nwp:
105+
ukv:
106+
nwp_zarr_path: /path/to/nwp-data.zarr
107+
history_minutes: 60
108+
forecast_minutes: 120
109+
time_resolution_minutes: 60
110+
nwp_channels: # comment out channels as appropriate
111+
- t # live = t2m
112+
- dswrf
113+
- dlwrf
114+
- hcc
115+
- MCC
116+
- lcc
117+
- vis
118+
- r # live = r2
119+
- prate # live ~= rprate
120+
- si10 # 10-metre wind speed | live = unknown
121+
nwp_image_size_pixels_height: 24
122+
nwp_image_size_pixels_width: 24
123+
nwp_provider: ukv
124+
125+
pv:
126+
pv_files_groups:
127+
- label: pvoutput.org
128+
pv_filename: /path/to/pv-data/pv.netcdf
129+
pv_metadata_filename: /path/to/pv-data/metadata.csv
130+
history_minutes: 60
131+
forecast_minutes: 0 # PVNet assumes no future PV generation
132+
time_resolution_minutes: 5
133+
start_datetime: "2019-01-01T00:00:00"
134+
end_datetime: "2019-01-08T00:00:00"
135+
pv_image_size_meters_height: 24
136+
pv_image_size_meters_width: 24
137+
pv_ml_ids: [154,155,156,158,159,160,162,164,165,166,167,168,169,171,173,177,178,179,181,182,185,186,187,188,189,190,191,192,193,197,198,199,200,202,204,205,206,208,209,211,214,215,216,217,218,219,220,221,225,229,230,232,233,234,236,242,243,245,252,254,255,256,257,258,260,261,262,265,267,268,272,273,275,276,277,280,281,282,283,287,289,291,292,293,294,295,296,297,298,301,302,303,304,306,307,309,310,311,317,318,319,320,321,322,323,325,326,329,332,333,335,336,338,340,342,344,345,346,348,349,352,354,355,356,357,360,362,363,368,369,370,371,372,374,375,376,378,380,382,384,385,388,390,391,393,396,397,398,399,400,401,403,404,405,406,407,409,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,429,431,435,437,438,440,441,444,447,450,451,453,456,457,458,459,464,465,466,467,468,470,471,473,474,476,477,479,480,481,482,485,486,488,490,491,492,493,496,498,501,503,506,507,508,509,510,511,512,513,515,516,517,519,520,521,522,524,526,527,528,531,532,536,537,538,540,541,542,543,544,545,549,550,551,552,553,554,556,557,560,561,563,566,568,571,572,575,576,577,579,580,581,582,584,585,588,590,594,595,597,600,602,603,604,606,611,613,614,616,618,620,622,623,624,625,626,628,629,630,631,636,637,638,640,641,642,644,645,646,650,651,652,653,654,655,657,660,661,662,663,666,667,668,670,675,676,679,681,683,684,685,687,696,698,701,702,703,704,706,710,722,723,724,725,727,728,729,730,732,733,734,735,736,737,]
138+
n_pv_systems_per_example: 128
139+
get_center: false
140+
is_live: false
141+
142+
satellite:
143+
satellite_zarr_path: "" # Left empty to avoid using satellite data
144+
history_minutes: 60
145+
forecast_minutes: 0
146+
live_delay_minutes: 30
147+
time_resolution_minutes: 5
148+
satellite_channels:
149+
- IR_016
150+
- IR_039
151+
- IR_087
152+
- IR_097
153+
- IR_108
154+
- IR_120
155+
- IR_134
156+
- VIS006
157+
- VIS008
158+
- WV_062
159+
- WV_073
160+
satellite_image_size_pixels_height: 24
161+
satellite_image_size_pixels_width: 24
162+
```
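
To sanity-check the time settings in the sample config, it can help to convert each history and forecast window into a number of time intervals. The sketch below is plain arithmetic on the values from the sample config; the helper name `n_intervals` is hypothetical, and whether the data loader also includes the sample at the forecast start time is not stated here, so only interval counts are shown.

```python
def n_intervals(minutes, resolution_minutes):
    """Number of whole time intervals that fit in a window of the given length."""
    if minutes % resolution_minutes:
        raise ValueError(
            f"{minutes} min is not a multiple of the {resolution_minutes} min resolution"
        )
    return minutes // resolution_minutes


# (history_minutes, forecast_minutes, time_resolution_minutes) from the sample config
SAMPLE_WINDOWS = {
    "gsp": (60, 120, 30),
    "nwp/ukv": (60, 120, 60),
    "pv": (60, 0, 5),
    "satellite": (60, 0, 5),
}

for name, (history, forecast, resolution) in SAMPLE_WINDOWS.items():
    print(name, n_intervals(history, resolution), n_intervals(forecast, resolution))
```

A window that is not a multiple of its resolution raises an error, which is a cheap way to catch a mistyped value before generating batches.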
163+
164+
With your configuration in place, you can proceed to create batches. PVNet uses
165+
[hydra](https://hydra.cc/) which enables us to pass variables via the command
166+
line that will override the configuration defined in the `./configs` directory.
167+
168+
Run the save_batches.py script to create batches with the following arguments as
169+
a minimum:
170+
171+
```
172+
python scripts/save_batches.py datamodule=local_ocf_datapipes +batch_output_dir="./output" +num_train_batches=10 +num_val_batches=5
173+
```
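
In Hydra's override syntax, `group=option` selects an existing config (here the `datamodule` group), while a leading `+` adds a key that is not already present in the composed config. The sketch below assembles the same command programmatically, which can be convenient when scripting several runs; the helper name `save_batches_command` is hypothetical.

```python
import shlex


def save_batches_command(datamodule, new_keys):
    """Build the argument list for scripts/save_batches.py with Hydra overrides."""
    args = ["python", "scripts/save_batches.py", f"datamodule={datamodule}"]
    # "+key=value" appends a key that the base config does not define.
    args += [f"+{key}={value}" for key, value in new_keys.items()]
    return args


cmd = save_batches_command(
    "local_ocf_datapipes",
    {"batch_output_dir": "./output", "num_train_batches": 10, "num_val_batches": 5},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.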
174+
175+
### Training PVNet
176+
177+
How PVNet is run is determined by the extensive configuration in the config
178+
files. The following configs have been tested to work using batches of data
179+
created using the steps and batch creation config mentioned above.
180+
181+
You should create the following configs before trying to train a model locally,
182+
as follows:
183+
184+
In `configs/datamodule/local_premade_batches.yaml`:
185+
186+
```yaml
187+
_target_: pvnet.data.datamodule.DataModule
188+
configuration: null
189+
batch_dir: "./output" # where the batches are saved
190+
num_workers: 20
191+
prefetch_factor: 2
192+
batch_size: 8
30193
```
31194

32-
Might need to install PVLive
195+
In `configs/model/local_multimodal.yaml`:
196+
197+
```yaml
198+
_target_: pvnet.models.multimodal.multimodal.Model
199+
200+
output_quantiles: [0.02, 0.1, 0.25, 0.5, 0.75, 0.9, 0.98]
201+
202+
#--------------------------------------------
203+
# NWP encoder
204+
#--------------------------------------------
205+
206+
nwp_encoders_dict:
207+
ukv:
208+
_target_: pvnet.models.multimodal.encoders.encoders3d.DefaultPVNet
209+
_partial_: True
210+
in_channels: 10
211+
out_features: 256
212+
number_of_conv3d_layers: 6
213+
conv3d_channels: 32
214+
image_size_pixels: 24
215+
216+
#--------------------------------------------
217+
# Sat encoder settings
218+
#--------------------------------------------
219+
220+
# Ignored as premade batches were created without satellite data
221+
# sat_encoder:
222+
# _target_: pvnet.models.multimodal.encoders.encoders3d.DefaultPVNet
223+
# _partial_: True
224+
# in_channels: 11
225+
# out_features: 256
226+
# number_of_conv3d_layers: 6
227+
# conv3d_channels: 32
228+
# image_size_pixels: 24
229+
230+
add_image_embedding_channel: False
231+
232+
#--------------------------------------------
233+
# PV encoder settings
234+
#--------------------------------------------
235+
236+
pv_encoder:
237+
_target_: pvnet.models.multimodal.site_encoders.encoders.SingleAttentionNetwork
238+
_partial_: True
239+
num_sites: 349
240+
out_features: 40
241+
num_heads: 4
242+
kdim: 40
243+
pv_id_embed_dim: 20
244+
245+
#--------------------------------------------
246+
# Tabular network settings
247+
#--------------------------------------------
248+
249+
output_network:
250+
_target_: pvnet.models.multimodal.linear_networks.networks.ResFCNet2
251+
_partial_: True
252+
fc_hidden_features: 128
253+
n_res_blocks: 6
254+
res_block_layers: 2
255+
dropout_frac: 0.0
256+
257+
embedding_dim: 16
258+
include_sun: True
259+
include_gsp_yield_history: False
260+
261+
#--------------------------------------------
262+
# Times
263+
#--------------------------------------------
264+
265+
# Forecast and time settings
266+
history_minutes: 60
267+
forecast_minutes: 120
268+
269+
min_sat_delay_minutes: 60
270+
271+
sat_history_minutes: 90
272+
pv_history_minutes: 60
273+
274+
# These must be set for each NWP encoder
275+
nwp_history_minutes:
276+
ukv: 60
277+
nwp_forecast_minutes:
278+
ukv: 120
279+
280+
# ----------------------------------------------
281+
# Optimizer
282+
# ----------------------------------------------
283+
optimizer:
284+
_target_: pvnet.optimizers.EmbAdamWReduceLROnPlateau
285+
lr: 0.0001
286+
weight_decay: 0.01
287+
amsgrad: True
288+
patience: 5
289+
factor: 0.1
290+
threshold: 0.002
291+
```
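
Several values in this model config mirror the batch creation config: `in_channels: 10` equals the number of listed `nwp_channels`, `image_size_pixels: 24` equals the NWP image size, and the `ukv` history and forecast minutes are the same in both. The sketch below checks those pairs on hand-copied values. The dictionary layout and the helper `find_mismatches` are hypothetical, and treating these pairs as required to match is an assumption drawn from the sample values rather than from PVNet's code.

```python
# Values copied by hand from the two sample configs.
DATA_CFG = {
    "nwp_channels": ["t", "dswrf", "dlwrf", "hcc", "MCC", "lcc", "vis", "r", "prate", "si10"],
    "nwp_image_size_pixels_height": 24,
    "history_minutes": 60,
    "forecast_minutes": 120,
}
MODEL_CFG = {
    "in_channels": 10,
    "image_size_pixels": 24,
    "nwp_history_minutes": 60,
    "nwp_forecast_minutes": 120,
}


def find_mismatches(data_cfg, model_cfg):
    """Return a list of human-readable mismatches between the two configs."""
    problems = []
    n_channels = len(data_cfg["nwp_channels"])
    if n_channels != model_cfg["in_channels"]:
        problems.append(f"in_channels={model_cfg['in_channels']} but {n_channels} NWP channels")
    if data_cfg["nwp_image_size_pixels_height"] != model_cfg["image_size_pixels"]:
        problems.append("image_size_pixels differs from the NWP image size")
    for key in ("history_minutes", "forecast_minutes"):
        if data_cfg[key] != model_cfg["nwp_" + key]:
            problems.append(f"nwp_{key} differs from the data config")
    return problems


print(find_mismatches(DATA_CFG, MODEL_CFG) or "configs agree")
```

Commenting out an NWP channel in the data config without lowering `in_channels` would show up here as a mismatch.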
292+
293+
In `configs/trainer/local_trainer.yaml`:
294+
295+
```yaml
296+
_target_: lightning.pytorch.trainer.trainer.Trainer
297+
298+
accelerator: cpu # Important if running on a system without a supported GPU
299+
devices: auto
300+
301+
min_epochs: null
302+
max_epochs: null
303+
reload_dataloaders_every_n_epochs: 0
304+
num_sanity_val_steps: 8
305+
fast_dev_run: false
306+
accumulate_grad_batches: 4
307+
log_every_n_steps: 50
308+
```
309+
310+
And finally update `defaults` in the main `./configs/config.yaml` file to use
311+
your customised config files:
312+
313+
```yaml
314+
defaults:
315+
- trainer: local_trainer.yaml
316+
- model: local_multimodal.yaml
317+
- datamodule: local_premade_batches.yaml
318+
- callbacks: null
319+
- logger: csv.yaml
320+
- experiment: null
321+
- hparams_search: null
322+
- hydra: default.yaml
33323
```
34-
pip install git+https://github.com/SheffieldSolar/PV_Live-API#pvlive_api
324+
325+
Assuming you ran the `save_batches.py` script to generate some premade train and
326+
val data batches, you can now train PVNet by running:
327+
328+
```
329+
python run.py
35330
```
36331

37332
## Testing
38333

39-
You can use `pytest` to run tests
334+
You can use `python -m pytest tests` to run tests
40335

41336
## Experiments
42337

43-
Notes on these experiments are [here](https://docs.google.com/document/d/1fbkfkBzp16WbnCg7RDuRDvgzInA6XQu3xh4NCjV-WDA/edit?usp=sharing).
338+
Notes on these experiments are
339+
[here](https://docs.google.com/document/d/1fbkfkBzp16WbnCg7RDuRDvgzInA6XQu3xh4NCjV-WDA/edit?usp=sharing).
15 files renamed without changes.

configs/model/multimodal.yaml configs.example/model/multimodal.yaml

+1-1
@@ -6,7 +6,7 @@ output_quantiles: [0.02, 0.1, 0.25, 0.5, 0.75, 0.9, 0.98]
 # NWP encoder
 #--------------------------------------------
 
-nwp_encoder:
+nwp_encoders_dict:
   ukv:
     _target_: pvnet.models.multimodal.encoders.encoders3d.DefaultPVNet
     _partial_: True
4 files renamed without changes.
