Commit 668a89a

Update Experiment Log (#180)
* Update experiment log
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update MAE value
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Add plots
* Update MAE value
* Add plot and analysis script
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Clean up script
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Run black
* Run black
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Split into separate file
* Add non-meteomatics error
* New analysis script
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update india_windnet_v2.md

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 718989f commit 668a89a

3 files changed, +162 -0 lines changed

experiments/analysis.py

+96
@@ -0,0 +1,96 @@
"""
Script to generate a table comparing MAE values across runs for the 48 hour forecast at 15 minute resolution
"""

import argparse

import matplotlib.pyplot as plt
import wandb


def main(runs: list[str], run_names: list[str]) -> None:
    """
    Compare MAE values across runs for the 48 hour forecast at 15 minute resolution
    """
    api = wandb.Api()
    dfs = []
    for run_id in runs:
        run = api.run(f"openclimatefix/india/{run_id}")

        df = run.history()
        # Get the columns that are in the format 'MAE_horizon/step_<number>/val'
        mae_cols = [col for col in df.columns if "MAE_horizon/step_" in col and "val" in col]
        # Sort them
        mae_cols.sort()
        df = df[mae_cols]
        # Drop all rows with all NaNs
        df = df.dropna(how="all")
        # Select the logged row with the smallest mean MAE across all forecast steps
        min_row_idx = int(df.mean(axis=1).to_numpy().argmin())
        df = df.iloc[min_row_idx]
        # Get the step from the column name and convert it to minutes (15 minutes per step).
        # This assumes every run logs the same set of MAE columns.
        column_timesteps = [int(col.split("_")[-1].split("/")[0]) * 15 for col in mae_cols]
        dfs.append(df)
    # Horizon groups in minutes, as [start, end] with both ends inclusive
    groupings = [
        [0, 0],
        [15, 15],
        [30, 45],
        [45, 60],
        [60, 120],
        [120, 240],
        [240, 360],
        [360, 480],
        [480, 720],
        [720, 1440],
        [1440, 2880],
    ]
    header = "| Timestep |"
    separator = "| --- |"
    for run_name in run_names:
        header += f" {run_name} MAE % |"
        separator += " --- |"
    print(header)
    print(separator)
    for start, end in groupings:
        group_string = f"| {start}-{end} minutes |"
        # Select indices from column_timesteps that are within the grouping, inclusive
        group_idx = [
            idx for idx, timestep in enumerate(column_timesteps) if start <= timestep <= end
        ]
        for df in dfs:
            group_string += f" {df.iloc[group_idx].mean()*100.:0.3f} |"
        print(group_string)

    # Plot the error per timestep for each run
    plt.figure()
    for idx, df in enumerate(dfs):
        plt.plot(column_timesteps, df, label=run_names[idx])
    plt.legend()
    plt.xlabel("Timestep (minutes)")
    plt.ylabel("MAE %")
    plt.title("MAE % for each timestep")
    plt.savefig("mae_per_timestep.png")
    plt.show()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # Example run ID: "5llq8iw6"
    parser.add_argument("--first_run", type=str, default="xdlew7ib")
    parser.add_argument("--second_run", type=str, default="v3mja33d")
    # Lists of run IDs and of the display names to use for them
    parser.add_argument("--list_of_runs", nargs="+")
    parser.add_argument("--run_names", nargs="+")
    args = parser.parse_args()
    # Fall back to the two default runs, and to the run IDs as names, when no lists are given
    runs = args.list_of_runs or [args.first_run, args.second_run]
    run_names = args.run_names or runs
    main(runs, run_names)
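
Because the bounds in `groupings` are inclusive at both ends, a horizon that sits on a boundary shared by two groups is averaged into both rows of the printed table. A small self-contained sketch of that selection logic, using five made-up horizons rather than real run data:

```python
# Horizons in minutes for steps 0..4 at 15 minutes per step (illustrative subset)
column_timesteps = [step * 15 for step in range(5)]
groupings = [[30, 45], [45, 60]]

members = {}
for start, end in groupings:
    # Same inclusive test as the script: start <= timestep <= end
    members[(start, end)] = [t for t in column_timesteps if start <= t <= end]
    print(f"{start}-{end} minutes -> {members[(start, end)]}")
```

The 45 minute horizon lands in both groups, so the "30-45 minutes" and "45-60 minutes" rows are not independent of each other.
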

experiments/india_pv_wind.md

+20
@@ -25,6 +25,26 @@ Overall MAE is 4.9% on the validation set, and forecasts look overall good.
## WindNet

### April-29-2024 WindNet v1 Production Model

[WandB Link](https://wandb.ai/openclimatefix/india/runs/5llq8iw6)

Improvements: Larger input size (64x64), 7 hour delay for ECMWF NWP inputs, to match production.
New, much more efficient encoder for NWP, allowing for more filters and layers, with fewer parameters.
The 64x64 input size corresponds to 6.4 degrees x 6.4 degrees, which is around 700km x 700km. This allows the
model to see the wind over the wind generation sites, which seems to be the biggest reason for the improvement in the model.
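
The degree and kilometre figures above follow from simple arithmetic. A minimal sketch, assuming a grid spacing of 0.1 degrees per pixel (implied by 64 pixels spanning 6.4 degrees) and the common approximation of about 111 km per degree of latitude; the function name and constant are illustrative and not taken from the experiment code:

```python
import math

KM_PER_DEGREE_LAT = 111.0  # approximate value; an assumption for this sketch


def grid_extent(pixels: int, degrees_per_pixel: float = 0.1, latitude: float = 0.0):
    """Return (extent in degrees, north-south km, east-west km) for a square grid."""
    degrees = pixels * degrees_per_pixel
    north_south_km = degrees * KM_PER_DEGREE_LAT
    # The east-west distance per degree shrinks with the cosine of the latitude
    east_west_km = north_south_km * math.cos(math.radians(latitude))
    return degrees, north_south_km, east_west_km


degrees, ns_km, ew_km = grid_extent(64)
print(f"{degrees:.1f} degrees, about {ns_km:.0f} km north-south")
```

Passing a non-zero `latitude` gives a smaller east-west extent, so "around 700km" is best read as an order-of-magnitude figure.
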

MAE is 7.6%, with real improvements seen in production.

There were other experiments with slightly different numbers of filters, model parameters and the like, but generally no
improvements were seen.

## WindNet v1 Results

### Data

We use Wind generation data for India from April 2019-Nov 2022 for training

experiments/india_windnet_v2.md

+46
@@ -0,0 +1,46 @@
### WindNet v2 Meteomatics + ECMWF Model

[WandB Link](https://wandb.ai/openclimatefix/india/runs/v3mja33d)

This newest experiment uses Meteomatics data in addition to ECMWF data. The Meteomatics data is at specific locations corresponding
to the generation sites we know about. It is smartly downscaled ECMWF data, down to 15 minutes and at a few height levels we are
interested in, primarily 10m, 100m, and 200m. The Meteomatics data is a semi-reanalysis, with each block of 6 hours coming from one forecast run.
For example, in one day, hours 00-06 are from the 00 forecast run, and hours 06-12 are from the 06 forecast run. This is important to note
because it is not a true reanalysis, yet it also cannot exactly match the live data, since any forecast steps beyond 6 hours are thrown away.
These results should therefore be taken as a best-case, or better than best-case, scenario: every 6 hours, observations from the future
are incorporated into the Meteomatics input data through the next NWP model run.
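
The six-hour blocking described above can be sketched as a mapping from a valid time to the initialisation time of the forecast run that supplied it. This illustrates the description only and is not the Meteomatics API; the function name is hypothetical, and assigning each boundary hour (06, 12, 18) to the newer run is an assumption:

```python
from datetime import datetime


def source_run_init_time(valid_time: datetime, block_hours: int = 6) -> datetime:
    """Initialisation time of the forecast run assumed to cover valid_time."""
    block_start_hour = (valid_time.hour // block_hours) * block_hours
    return valid_time.replace(hour=block_start_hour, minute=0, second=0, microsecond=0)


for valid_time in (datetime(2024, 4, 29, 5, 45), datetime(2024, 4, 29, 6, 0)):
    init_time = source_run_init_time(valid_time)
    print(f"{valid_time} comes from the {init_time:%H} run, lead time {valid_time - init_time}")
```

Under this mapping the lead time of every archived value stays below 6 hours, whereas a live 48 hour forecast has to rely on NWP steps with much longer lead times.
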

For the purposes of WindNet, Meteomatics data is treated as sensor data that extends into the future.
The model encodes the sensor information the same way as the historical PV, Wind, and GSP generation, using
a simple, single attention head. The encoded output is then concatenated with the rest of the data, as in
previous experiments.

This model also has an even larger input size of ECMWF data, 81x81 pixels, corresponding to around 810km x 810km.
![Screenshot_20240430_082855](https://github.com/openclimatefix/PVNet/assets/7170359/6981a088-8664-474b-bfea-c94c777fc119)

MAE is 7.0% on the validation set, showing a slight improvement over the previous model.

Comparison with the production model:

| Timestep | Prod MAE % | No Meteomatics MAE % | Meteomatics MAE % |
| --- | --- | --- | --- |
| 0-0 minutes | 7.586 | 5.920 | 2.475 |
| 15-15 minutes | 8.021 | 5.809 | 2.968 |
| 30-45 minutes | 7.233 | 5.742 | 3.472 |
| 45-60 minutes | 7.187 | 5.698 | 3.804 |
| 60-120 minutes | 7.231 | 5.816 | 4.650 |
| 120-240 minutes | 7.287 | 6.080 | 6.028 |
| 240-360 minutes | 7.319 | 6.375 | 6.738 |
| 360-480 minutes | 7.285 | 6.638 | 6.964 |
| 480-720 minutes | 7.143 | 6.747 | 6.906 |
| 720-1440 minutes | 7.380 | 7.207 | 6.962 |
| 1440-2880 minutes | 7.904 | 7.507 | 7.507 |
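
To make the table easier to read, the relative reduction in MAE can be computed per horizon group. The sketch below copies the values from the table above; the helper name `relative_reduction` is illustrative and not part of the analysis script:

```python
# (horizon group in minutes, Prod, No Meteomatics, Meteomatics), all MAE %
rows = [
    ("0-0", 7.586, 5.920, 2.475),
    ("15-15", 8.021, 5.809, 2.968),
    ("30-45", 7.233, 5.742, 3.472),
    ("45-60", 7.187, 5.698, 3.804),
    ("60-120", 7.231, 5.816, 4.650),
    ("120-240", 7.287, 6.080, 6.028),
    ("240-360", 7.319, 6.375, 6.738),
    ("360-480", 7.285, 6.638, 6.964),
    ("480-720", 7.143, 6.747, 6.906),
    ("720-1440", 7.380, 7.207, 6.962),
    ("1440-2880", 7.904, 7.507, 7.507),
]


def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which the candidate MAE is lower than the baseline MAE."""
    return (baseline - candidate) / baseline * 100.0


meteomatics_worse = []
for horizon, prod, no_meteomatics, meteomatics in rows:
    print(f"{horizon:>10} min: {relative_reduction(prod, meteomatics):5.1f}% lower than prod")
    # Track groups where adding Meteomatics data gives a higher MAE than leaving it out
    if meteomatics > no_meteomatics:
        meteomatics_worse.append(horizon)

print("Meteomatics worse than No Meteomatics:", meteomatics_worse)
```

On these numbers the Meteomatics run has a lower MAE than the production model in every group, with the largest reductions in the first hour, while the run without Meteomatics has the lower MAE in the 240 to 720 minute groups.
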

![mae_per_timestep](https://github.com/openclimatefix/PVNet/assets/7170359/e3c942e8-65c6-4b95-8c51-f25d43e7a082)

Example plot

![Screenshot_20240430_082937](https://github.com/openclimatefix/PVNet/assets/7170359/88db342e-bf82-414e-8255-5ad4af659fb8)
