Commit 36f42cd

extended readme; added max_scaling_factor to variance scaling method
1 parent 9a52b98 commit 36f42cd

File tree: 4 files changed, +84/-62 lines changed


README.md

Lines changed: 57 additions & 29 deletions
```diff
@@ -1,6 +1,6 @@
 # Bias adjustment/correction procedures for climatic reasearch

-<div style="text-align: center">
+<div align="center">

 [![GitHub](https://badgen.net/badge/icon/github?icon=github&label)](https://github.com/btschwertfeger/Bias-Adjustment-Python)
 [![Generic badge](https://img.shields.io/badge/python-3.7+-green.svg)](https://shields.io/)
```
```diff
@@ -9,12 +9,26 @@

 </div>

-Collection of different scale- and distribution-based bias adjustment techniques for climatic research (see `examples.ipynb` for help).
+This Python module contains a collection of different scale- and distribution-based bias adjustment techniques for climatic research (see `examples.ipynb` for help).

-Bias adjustment procedures in Python are very slow, so they should not be used on large data sets.
-A C++ implementation that works way faster can be found [here](https://github.com/btschwertfeger/Bias-Adjustment-Cpp).
+Since the Python programming language is very slow and bias adjustments are complex statistical transformations, it is recommended to use the C++ implementation on large data sets. This can be found [here](https://github.com/btschwertfeger/Bias-Adjustment-Cpp).

-## About
+---
+
+## Table of Contents
+
+1. [ About ](#about)
+2. [ Available Methods ](#methods)
+3. [ Installation ](#installation)
+4. [ Usage and Examples ](#examples)
+5. [ Notes ](#notes)
+6. [ References ](#references)
+
+---
+
+<a name="about"></a>
+
+## 1. About

 These programs and data structures are designed to help minimize discrepancies between modeled and observed climate data. Data from past periods are used to adjust variables from current and future time series so that their distributional properties approximate possible actual values.

```
````diff
@@ -33,31 +47,41 @@ In this way, for example, modeled data, which on average represent values that a
     src="images/dm-doy-plot.png?raw=true"
     alt="Temperature per day of year in modeled, observed and bias-adjusted climate data"
     style="background-color: white; border-radius: 7px">
-  <figcaption>Figure 2: Temperature per day of year in modeled, observed and bias-adjusted climate data</figcaption>
+  <figcaption>Figure 2: Temperature per day of year in observed, modeled and bias-adjusted climate data</figcaption>
 </figure>

 ---

-## Available methods:
+<a name="methods"></a>
+
+## 2. Available methods:
+
+All methods except the `adjust_3d` function requires the application on one time series.

-- Linear Scaling (additive and multiplicative)
-- Variance Scaling (additive)
-- Delta (Change) Method (additive and multiplicative)
-- Quantile Mapping (additive)
-- Detrended Quantile Mapping (additive and multiplicative)
-- Quantile Delta Mapping (additive and multuplicative)
+| Function name            | Description                                                                                  |
+| ------------------------ | -------------------------------------------------------------------------------------------- |
+| `linear_scaling`         | Linear Scaling (additive and multiplicative)                                                 |
+| `variance_scaling`       | Variance Scaling (additive)                                                                  |
+| `delta_method`           | Delta (Change) Method (additive and multiplicative)                                          |
+| `quantile_mapping`       | Quantile Mapping (additive) and Detrended Quantile Mapping (additive and multiplicative)     |
+| `quantile_delta_mapping` | Quantile Delta Mapping (additive and multiplicative)                                         |
+| `adjust_3d`              | requires a method name and the respective parameters to adjust all time series of a data set |

 ---

-## Usage
+<a name="installation"></a>

-### Installation
+## 3. Installation

 ```bash
 python3 -m pip install python-cmethods
 ```

-### Import and application
+---
+
+<a name="examples"></a>
+
+## 4. Usage and Examples

 ```python
 import xarray as xr
````
````diff
@@ -91,15 +115,13 @@ qdm_result = cm.adjust_3d( # 3d = 2 spatial and 1 time dimension
 Notes:

 - When using the `adjust_3d` method you have to specify the method by name.
-- For the multiplicative linear scaling and delta method is a maximum scaling factor of 10 set. This can be changed by the `max_scaling_factor` parameter.
-
----
+- For the multiplicative linear scaling and the delta method as well as the variance scaling method a maximum scaling factor of 10 is defined. This can be changed by the parameter `max_scaling_factor`.

 ## Examples (see repository on [GitHub](https://github.com/btschwertfeger/Bias-Adjustment-Python))

-`/examples/examples.ipynb`: Notebook containing different methods and plots
+Notebook with different methods and plots: `/examples/examples.ipynb`

-`/examples/do_bias_correction.py`: Example script for adjusting climate data
+Example script for adjusting climate data: `/examples/do_bias_correction.py`

 ```bash
 python3 do_bias_correction.py \
````
````diff
@@ -109,26 +131,32 @@ python3 do_bias_correction.py \
     --method linear_scaling \
     --variable tas \
     --unit '°C' \
-    --group time.month \
+    --group 'time.month' \
     --kind +
 ```

 - Linear and variance, as well as delta change method require `--group time.month` as argument.
-- Adjustment methods that apply changes in distributional biasses (QM, QDM, DQM; EQM, ...) need the `--nquantiles` argument set to some integer.
-- Data sets should have the same spatial resolutions.
+- Adjustment methods that apply changes in distributional biasses (QM, QDM, DQM, ...) need the `--nquantiles` argument set to some integer.
+- Data sets must have the same spatial resolutions.

 ---

-## Notes
+<a name="notes"></a>

-- Computation in Python takes some time, so this is only for demonstration. When adjusting large datasets, its best to the C++ implementation mentioned above.
+## 5. Notes
+
+- Computation in Python takes some time, so this is only for demonstration. When adjusting large datasets, its best to use the C++ implementation mentioned above.
 - Formulas and references can be found in the implementations of the corresponding functions.

-## Space for improvements
+### Space for improvements:
+
+Since the scaling methods implemented so far scale by default over the mean values of the respective months, unrealistic long-term mean values may occur at the month transitions. This can be prevented either by selecting `group='time.dayofyear'`. Alternatively, it is possible not to scale using long-term mean values, but using a 31-day interval, which takes the 31 surrounding values over all years as the basis for calculating the mean values. This is not yet implemented in this module, but is available in the C++ implementation [here](https://github.com/btschwertfeger/Bias-Adjustment-Cpp).
+
+---

-Since the scaling methods implemented so far scale by default over the mean values of the respective months, unrealistic long-term mean values may occur at the month transitions. This can be prevented either by selecting `group='time.dayofyear`. Alternatively, it is possible not to scale using long-term mean values, but using a 30-day interval, which takes the 30 surrounding values over all years as the basis for calculating the mean values. This is not yet implemented in this module, but is available in the C++ implementation [here](https://github.com/btschwertfeger/Bias-Adjustment-Cpp).
+<a name="references"></a>

-## References
+## 6. References

 - Schwertfeger, Benjamin Thomas (2022) The influence of bias corrections on variability, distribution, and correlation of temperatures in comparison to observed and modeled climate data in Europe (https://epic.awi.de/id/eprint/56689/)
 - Linear Scaling and Variance Scaling based on: Teutschbein, Claudia and Seibert, Jan (2012) Bias correction of regional climate model simulations for hydrological climate-change impact studies: Review and evaluation of different methods (https://doi.org/10.1016/j.jhydrol.2012.05.052)
````
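The additive linear scaling listed in the README's method table reduces to a simple mean shift (Eq. 1 in `cmethods/CMethods.py`). A minimal pure-Python sketch of that formula; the function name and sample values here are illustrative, not part of the module:

```python
from statistics import mean

def linear_scaling_additive(obs, simh, simp):
    # Eq. 1: shift the scenario series simp by the mean bias
    # between observations (obs) and the modeled historical run (simh)
    bias = mean(obs) - mean(simh)
    return [x + bias for x in simp]

obs  = [10.0, 12.0, 14.0]  # observed reference period (hypothetical values)
simh = [11.0, 13.0, 15.0]  # modeled, historical period
simp = [12.0, 14.0, 16.0]  # modeled, scenario period
print(linear_scaling_additive(obs, simh, simp))  # [11.0, 13.0, 15.0]
```

In the module the same shift is applied group-wise (e.g. per `time.month`), which is why the notes above insist on the `--group` argument for the scaling methods.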

cmethods/CMethods.py

Lines changed: 24 additions & 30 deletions
```diff
@@ -272,7 +272,7 @@ def linear_scaling(cls,
     > obs=obs[variable],
     > simh=simh[variable],
     > simp=simp[variable],
-    > group='time.month' # optional
+    > group='time.month' # optional, this is default here
     >)

     ----- E Q U A T I O N S -----
```
```diff
@@ -292,13 +292,11 @@ def linear_scaling(cls,
         else:
             if kind in cls.ADDITIVE: return np.array(simp) + (np.nanmean(obs) - np.nanmean(simh)) # Eq. 1
             elif kind in cls.MULTIPLICATIVE:
-                scaling_factor = (np.nanmean(obs) / np.nanmean(simh))
-                if scaling_factor > 0 and scaling_factor > abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)):
-                    return np.array(simp) * abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR))
-                elif scaling_factor < 0 and scaling_factor < -abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)):
-                    return np.array(simp) * -abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR))
-                else:
-                    return np.array(simp) * scaling_factor # Eq. 2
+                adj_scaling_factor = cls.get_adjusted_scaling_factor(
+                    np.nanmean(obs) / np.nanmean(simh),
+                    kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)
+                )
+                return np.array(simp) * adj_scaling_factor # Eq. 2
             else: raise ValueError('Scaling type invalid. Valid options for param kind: "+" and "*"')

     # ? -----========= V A R I A N C E - S C A L I N G =========------
```
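The refactored branch above computes the ratio of means and clamps it via the new `get_adjusted_scaling_factor` helper. A pure-Python sketch of the same behavior, with the clamp written as an equivalent min/max; names and sample values are illustrative:

```python
from statistics import mean

MAX_SCALING_FACTOR = 10  # module-wide default, per the commit message

def linear_scaling_multiplicative(obs, simh, simp, max_scaling_factor=MAX_SCALING_FACTOR):
    # Eq. 2: scale the scenario series by the ratio of means, clamped into
    # [-|max|, +|max|] (same result as get_adjusted_scaling_factor)
    cap = abs(max_scaling_factor)
    factor = min(max(mean(obs) / mean(simh), -cap), cap)
    return [x * factor for x in simp]

# A near-zero historical mean would give a raw factor of about 50;
# the cap limits it to 10, so the scenario values are scaled by 10.
print(linear_scaling_multiplicative([5.0, 5.0], [0.1, 0.1], [1.0, 2.0]))  # [10.0, 20.0]
```

The cap matters exactly in this situation: multiplicative scaling on variables such as precipitation can explode when the historical mean is close to zero.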
```diff
@@ -360,8 +358,12 @@ def variance_scaling(cls,
             VS_1_simh = LS_simh - np.nanmean(LS_simh) # Eq. 3
             VS_1_simp = LS_simp - np.nanmean(LS_simp) # Eq. 4

-            VS_2_simp = VS_1_simp * (np.std(obs) / np.std(VS_1_simh)) # Eq. 5
-
+            adj_scaling_factor = cls.get_adjusted_scaling_factor(
+                np.std(obs) / np.std(VS_1_simh),
+                kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)
+            )
+
+            VS_2_simp = VS_1_simp * adj_scaling_factor # Eq. 5
             return VS_2_simp + np.nanmean(LS_simp) # Eq. 6

     # ? -----========= D E L T A - M E T H O D =========------
```
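Eq. 3-6 above make up the variance step of variance scaling. A pure-Python sketch of those four equations, assuming `LS_simh` / `LS_simp` are the already linear-scaled series produced by the step outside this hunk (names follow the diff; sample values are hypothetical):

```python
from statistics import mean, pstdev  # pstdev matches np.std (population std)

def variance_scaling_core(obs, LS_simh, LS_simp):
    # Eq. 3/4: remove the mean of each linear-scaled series
    VS_1_simh = [x - mean(LS_simh) for x in LS_simh]
    VS_1_simp = [x - mean(LS_simp) for x in LS_simp]
    # Eq. 5: rescale the anomalies to the observed standard deviation
    # (this is the ratio the commit now clamps via get_adjusted_scaling_factor)
    factor = pstdev(obs) / pstdev(VS_1_simh)
    VS_2_simp = [x * factor for x in VS_1_simp]
    # Eq. 6: add the scenario mean back on top of the rescaled anomalies
    return [x + mean(LS_simp) for x in VS_2_simp]

# Observed spread is half the modeled spread, so anomalies shrink by 0.5.
print(variance_scaling_core([9.0, 10.0, 11.0], [8.0, 10.0, 12.0], [18.0, 20.0, 22.0]))
# [19.0, 20.0, 21.0]
```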
```diff
@@ -414,13 +416,11 @@ def delta_method(cls,
         else:
             if kind in cls.ADDITIVE: return np.array(obs) + (np.nanmean(simp) - np.nanmean(simh)) # Eq. 1
             elif kind in cls.MULTIPLICATIVE:
-                scaling_factor = (np.nanmean(simp) / np.nanmean(simh))
-                if scaling_factor > 0 and scaling_factor > abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)):
-                    return np.array(obs) * abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR))
-                elif scaling_factor < 0 and scaling_factor < -abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)):
-                    return np.array(obs) * -abs(kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR))
-                else:
-                    return np.array(obs) * scaling_factor # Eq. 2
+                adj_scaling_factor = cls.get_adjusted_scaling_factor(
+                    np.nanmean(simp) / np.nanmean(simh),
+                    kwargs.get('max_scaling_factor', cls.MAX_SCALING_FACTOR)
+                )
+                return np.array(obs) * adj_scaling_factor # Eq. 2
             else: raise ValueError(f'{kind} not implemented! Use "+" or "*" instead.')


```
```diff
@@ -689,16 +689,10 @@ def get_inverse_of_cdf(base_cdf, insert_cdf, xbins) -> np.array:
         return np.interp(insert_cdf, base_cdf, xbins)

     @staticmethod
-    def load_data(
-        obs_fpath: str,
-        simh_fpath: str,
-        simp_fpath: str,
-        use_cftime: bool=False,
-        chunks=None
-    ) -> (xr.core.dataarray.Dataset, xr.core.dataarray.Dataset, xr.core.dataarray.Dataset):
-        '''Load and return loaded netcdf datasets'''
-        obs = xr.open_dataset(obs_fpath, use_cftime=use_cftime, chunks=chunks)
-        simh = xr.open_dataset(simh_fpath, use_cftime=use_cftime, chunks=chunks)
-        simp = xr.open_dataset(simp_fpath, use_cftime=use_cftime, chunks=chunks)
-
-        return obs, simh, simp
+    def get_adjusted_scaling_factor(factor: float, max_scaling_factor: float) -> float:
+        if factor > 0 and factor > abs(max_scaling_factor):
+            return abs(max_scaling_factor)
+        elif factor < 0 and factor < -abs(max_scaling_factor):
+            return -abs(max_scaling_factor)
+        else:
+            return factor
```
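This hunk drops the `load_data` convenience wrapper and introduces the `get_adjusted_scaling_factor` helper that the scaling methods above now share. The function body below is copied out of the diff so its boundary behavior can be checked standalone:

```python
def get_adjusted_scaling_factor(factor: float, max_scaling_factor: float) -> float:
    # Clamp factor into [-|max_scaling_factor|, +|max_scaling_factor|],
    # preserving its sign (body copied from the diff above)
    if factor > 0 and factor > abs(max_scaling_factor):
        return abs(max_scaling_factor)
    elif factor < 0 and factor < -abs(max_scaling_factor):
        return -abs(max_scaling_factor)
    else:
        return factor

print(get_adjusted_scaling_factor(50.0, 10))   # 10 (capped)
print(get_adjusted_scaling_factor(-50.0, 10))  # -10 (capped, sign kept)
print(get_adjusted_scaling_factor(2.5, 10))    # 2.5 (within bounds)
```

Because `abs()` is applied to the limit, a negative `max_scaling_factor` argument behaves the same as a positive one.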

examples/examples.ipynb

Lines changed: 2 additions & 2 deletions
```diff
@@ -716,7 +716,7 @@
   ],
   "metadata": {
   "kernelspec": {
-   "display_name": "Python 3.7.3 64-bit",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
```
```diff
@@ -730,7 +730,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-  "version": "3.7.3"
+  "version": "3.9.13"
  },
  "vscode": {
   "interpreter": {
```

setup.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -17,7 +17,7 @@

 # What packages are required for this module to be executed?
 REQUIRED = [
-    'xarray', 'numpy', 'tqdm', 'netCDF4' # <- always conflicts on install with tqdm on test.pypi..
+    'xarray>=2022.11.0','netCDF4>=1.6.1', 'numpy', 'tqdm',
 ]

 # What packages are optional?
```
