Commit e021b41

Merge pull request #51 from openclimatefix/docs/issue-48
update readme and getting started
2 parents 2cfc1f2 + 9c6991e commit e021b41

3 files changed: 94 additions, 7 deletions

README.md

Lines changed: 12 additions & 1 deletion
@@ -18,7 +18,18 @@ Tasks include:
 
 We will begin in the UK to benchmark against OCF results and expand to other countries as the project progresses. 😄
 
----
+
+### Basic Usage Examples
+```bash
+# Archive Met Office UK data for a specific day in zarr format to Hugging Face
+open-data-pvnet metoffice archive --year 2023 --month 12 --day 1 --region uk
+
+# Load data for analysis
+open-data-pvnet metoffice load --year 2023 --month 1 --day 16 --region uk
+
+```
+
+For detailed usage instructions and examples, see our [Getting Started Guide](getting_started.md#command-line-interface-cli).
 
 ## Volunteer Skills/Roles Needed
 
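The README examples above follow a `<provider> <operation> [options]` shape. As a rough illustration only, that shape can be modelled with nested `argparse` sub-commands. This is a minimal sketch, not the project's actual CLI code: `build_parser` is a hypothetical helper, and which options are required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring `open-data-pvnet <provider> <operation> [options]`."""
    parser = argparse.ArgumentParser(prog="open-data-pvnet")
    providers = parser.add_subparsers(dest="provider", required=True)

    # Only the `metoffice` provider from the README examples is sketched here.
    metoffice = providers.add_parser("metoffice")
    operations = metoffice.add_subparsers(dest="operation", required=True)

    # Both operations shown in the README take the same date and region options.
    for name in ("archive", "load"):
        op = operations.add_parser(name)
        op.add_argument("--year", type=int, required=True)
        op.add_argument("--month", type=int, required=True)
        op.add_argument("--day", type=int, required=True)
        op.add_argument("--region", required=True)  # assumed to be required
    return parser


args = build_parser().parse_args(
    ["metoffice", "archive", "--year", "2023", "--month", "12", "--day", "1", "--region", "uk"]
)
print(args.provider, args.operation, args.year, args.month, args.day, args.region)
```

Nesting the sub-parsers keeps each provider's operations separate, so a future provider could define its own options without touching `metoffice`.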
getting_started.md

Lines changed: 75 additions & 4 deletions
@@ -19,6 +19,7 @@ Welcome to the Solar Forecasting project! This document will introduce you to th
 14. [Helpful Knowledge and Skills](#helpful-knowledge-and-skills)
 15. [How This Project Fits into Renewable Energy](#how-this-project-fits-into-renewable-energy)
 16. [Development and Testing Guide](#development-and-testing-guide)
+17. [Command Line Interface (CLI)](#command-line-interface-cli)
 
 ---
 
@@ -149,8 +150,8 @@ APIs play a crucial role in fetching real-time and historical data required for
 3. **Satellite Data APIs**
 Satellite imagery and radiance data are invaluable for analyzing cloud cover and solar irradiance:
 - **Copernicus Atmosphere Monitoring Service (CAMS)**: Provides satellite-based aerosol, cloud, and solar radiation data. [Learn more](https://atmosphere.copernicus.eu/).
-- **NASAs POWER API**: Offers meteorological and solar datasets tailored for renewable energy applications, including European regions. [Learn more](https://power.larc.nasa.gov/).
-- **EUMETSAT**: Europes satellite-based service providing weather and climate data, including cloud cover and solar radiation products. [Learn more](https://www.eumetsat.int/).
+- **NASA's POWER API**: Offers meteorological and solar datasets tailored for renewable energy applications, including European regions. [Learn more](https://power.larc.nasa.gov/).
+- **EUMETSAT**: Europe's satellite-based service providing weather and climate data, including cloud cover and solar radiation products. [Learn more](https://www.eumetsat.int/).
 
 4. **AWS S3 Access**
 You will need access to the AWS S3 bucket containing the NWP data. Ensure you have the required permissions to list and download objects from the bucket.
@@ -261,7 +262,7 @@ Below is a glossary of key terms that might be useful when working on this proje
 
 ### Geospatial Terms
 
-- **Geostationary**: A satellite orbit where the satellite remains fixed relative to a specific point on Earths surface, providing continuous observation of the same region. Commonly used in weather monitoring and solar radiation measurement.
+- **Geostationary**: A satellite orbit where the satellite remains fixed relative to a specific point on Earth's surface, providing continuous observation of the same region. Commonly used in weather monitoring and solar radiation measurement.
 - **Geospatial Data**: Information about objects, events, or phenomena on Earth's surface, represented by geographic coordinates and often used in mapping and analysis.
 - **Latitude**: The angular distance of a location north or south of the equator, measured in degrees. Important for determining solar angles and irradiance.
 - **Longitude**: The angular distance of a location east or west of the prime meridian, measured in degrees. Used in conjunction with latitude to pinpoint geographic locations.
@@ -288,7 +289,7 @@ Below is a glossary of key terms that might be useful when working on this proje
 - **Atmospheric Pressure**: The force exerted by the weight of the atmosphere above a given point, measured in hectopascals (hPa) or millibars (mb). It affects weather patterns and the movement of air masses.
 - **Relative Humidity**: The amount of water vapor in the air compared to the maximum amount the air can hold at a given temperature, expressed as a percentage. It influences cloud formation and precipitation.
 - **Dew Point**: The temperature at which air becomes saturated with moisture and water vapor condenses into dew, clouds, or fog.
-- **Radiative Forcing**: The change in the energy balance of the Earths atmosphere due to factors like greenhouse gases, aerosols, and changes in solar irradiance. It is a key concept in climate change studies.
+- **Radiative Forcing**: The change in the energy balance of the Earth's atmosphere due to factors like greenhouse gases, aerosols, and changes in solar irradiance. It is a key concept in climate change studies.
 - **Turbidity**: A measure of the atmosphere's clarity, influenced by aerosols, dust, and pollution. High turbidity reduces the amount of solar radiation reaching the Earth's surface.
 - **Ozone Layer**: A layer of ozone (O₃) in the stratosphere that absorbs the majority of the Sun’s harmful ultraviolet radiation. Changes in the ozone layer can impact solar irradiance measurements.
 - **Wind Shear**: A change in wind speed or direction over a short distance in the atmosphere. It can influence cloud formation, storm development, and the dispersal of aerosols.
@@ -542,4 +543,74 @@ Use `pytest` to ensure the project works as expected:
 
 ---
 
+## Command Line Interface (CLI)
+
+The `open-data-pvnet` CLI provides various commands for downloading, processing, and loading weather and solar data.
+
+### Basic Structure
+```bash
+open-data-pvnet <provider> <operation> [options]
+```
+
+### Available Providers
+- `metoffice`: UK Met Office weather data
+- `gfs`: Global Forecast System data (coming soon)
+- `dwd`: German Weather Service data (coming soon)
+
+### Operations
+1. **archive**: Download and archive data
+```bash
+# Archive a single hour
+open-data-pvnet metoffice archive --year 2023 --month 12 --day 1 --hour 12 --region uk
+
+# Archive an entire day with parallel processing
+open-data-pvnet metoffice archive --year 2023 --month 12 --day 1 --region uk --workers 4
+```
+
+2. **load**: Load archived data for analysis
+```bash
+# Load a single hour
+open-data-pvnet metoffice load --year 2023 --month 1 --day 16 --hour 0 --region uk
+
+# Load an entire day
+open-data-pvnet metoffice load --year 2023 --month 1 --day 16 --region uk
+
+# Load with custom chunking
+open-data-pvnet metoffice load --year 2023 --month 1 --day 16 --region uk \
+--chunks "time:24,latitude:100,longitude:100"
+```
+
+### Common Options
+- `--region`: Specify data region (`uk` or `global`) for Met Office data
+- `--overwrite`: Force overwrite of existing files
+- `--remote`: Load data remotely without downloading
+- `--chunks`: Specify chunking for data loading
+- `--workers`: Number of parallel workers for archiving (default: 1)
+- `--archive-type`: Type of archive to create (`zarr.zip` or `tar`)
+
+### Examples for Different Use Cases
+
+#### Working with UK Data
+```bash
+# Download Met Office UK weather data
+open-data-pvnet metoffice archive --year 2023 --month 12 --day 1 --region uk --workers 2
+
+# Load and analyze the data
+open-data-pvnet metoffice load --year 2023 --month 12 --day 1 --region uk
+```
+
+#### Remote Data Access
+```bash
+# Load data directly from HuggingFace without downloading
+open-data-pvnet metoffice load --year 2023 --month 1 --day 16 --region uk --remote
+```
+
+### Error Handling
+Common error messages and their solutions:
+- "No datasets found": Check if the specified date has available data
+- "Error loading dataset": Verify your internet connection and credentials
+- "Invalid chunks specification": Ensure chunk string follows the format "dim1:size1,dim2:size2"
+
+---
+
 Thank you for joining us on this journey to advance solar forecasting and renewable energy solutions!
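
The `--chunks` option documented in this diff takes a string of the form "dim1:size1,dim2:size2". The following is a minimal sketch of how such a string could be turned into a dimension-to-size mapping. `parse_chunks` is a hypothetical helper written for illustration, not the function the CLI actually uses, and its validation rules (positive integer sizes, non-empty names) are assumptions.

```python
from typing import Dict


def parse_chunks(spec: str) -> Dict[str, int]:
    """Parse "dim1:size1,dim2:size2" into {"dim1": size1, "dim2": size2}."""
    chunks: Dict[str, int] = {}
    for item in spec.split(","):
        # partition() tolerates a missing ":" so the error below can report it.
        name, sep, size = item.partition(":")
        name = name.strip()
        try:
            value = int(size.strip())
        except ValueError:
            value = 0  # treated as invalid below
        if not sep or not name or value <= 0:
            raise ValueError(f"Invalid chunks specification: {item!r}")
        chunks[name] = value
    return chunks


print(parse_chunks("time:24,latitude:100,longitude:100"))
# {'time': 24, 'latitude': 100, 'longitude': 100}
```

A mapping in this shape matches what xarray accepts for its `chunks` argument when opening a dataset, which is presumably why the option uses dimension names as keys.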

src/open_data_pvnet/nwp/gfs_dataset.py

Lines changed: 7 additions & 2 deletions
@@ -22,6 +22,7 @@
 # Ensure xarray retains attributes during operations
 xr.set_options(keep_attrs=True)
 
+
 def open_gfs(dataset_path: str) -> xr.DataArray:
     """
     Opens the GFS dataset stored in Zarr format and prepares it for processing.
@@ -50,7 +51,9 @@ def open_gfs(dataset_path: str) -> xr.DataArray:
     return gfs_data
 
 
-def handle_nan_values(dataset: xr.DataArray, method: str = "fill", fill_value: float = 0.0) -> xr.DataArray:
+def handle_nan_values(
+    dataset: xr.DataArray, method: str = "fill", fill_value: float = 0.0
+) -> xr.DataArray:
     """
     Handle NaN values in the dataset.
 
@@ -133,7 +136,9 @@ def _get_sample(self, t0: pd.Timestamp) -> xr.Dataset:
         logging.info(f"Generating sample for t0={t0}...")
         interval_start = pd.Timedelta(minutes=self.config.input_data.nwp.gfs.interval_start_minutes)
         interval_end = pd.Timedelta(minutes=self.config.input_data.nwp.gfs.interval_end_minutes)
-        time_resolution = pd.Timedelta(minutes=self.config.input_data.nwp.gfs.time_resolution_minutes)
+        time_resolution = pd.Timedelta(
+            minutes=self.config.input_data.nwp.gfs.time_resolution_minutes
+        )
 
         start_dt = t0 + interval_start
         end_dt = t0 + interval_end
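
The last hunk derives a sample window from `t0` using three configured durations. As a rough sketch of that arithmetic, the snippet below computes the window bounds and the timestamps inside it. It swaps `pd.Timedelta` for the standard library's `timedelta` to stay dependency-free; `sample_times` is a hypothetical helper, the minute values are made up, and treating the end of the window as inclusive is an assumption, since the rest of `_get_sample` is not shown in the diff.

```python
from datetime import datetime, timedelta
from typing import List


def sample_times(
    t0: datetime,
    interval_start_minutes: int,
    interval_end_minutes: int,
    time_resolution_minutes: int,
) -> List[datetime]:
    """Return the timestamps from t0 + start to t0 + end at the given resolution."""
    start_dt = t0 + timedelta(minutes=interval_start_minutes)
    end_dt = t0 + timedelta(minutes=interval_end_minutes)
    step = timedelta(minutes=time_resolution_minutes)
    if step <= timedelta(0):
        raise ValueError("time_resolution_minutes must be positive")

    times = []
    current = start_dt
    while current <= end_dt:  # inclusive end is an assumption
        times.append(current)
        current += step
    return times


# Illustrative values only: a 3-hour window at hourly resolution.
times = sample_times(datetime(2023, 1, 16, 0, 0), 0, 180, 60)
print([t.strftime("%H:%M") for t in times])
# ['00:00', '01:00', '02:00', '03:00']
```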
