Commit 8c0372a

Merge pull request #124 from automl/development
Development
2 parents 1732571 + 15c76af commit 8c0372a

71 files changed

+4446
-939
lines changed

.github/workflows/run_singularity_versions.yml

+5
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,11 @@ jobs:
2424
RUN_CONTAINER_EXAMPLES: true
2525
USE_SINGULARITY: false
2626
SINGULARITY_VERSION: "3.7"
27+
- python-version: 3.7
28+
DISPLAY_NAME: "Singularity Container Examples with S3.8"
29+
RUN_CONTAINER_EXAMPLES: true
30+
USE_SINGULARITY: false
31+
SINGULARITY_VERSION: "3.8"
2732

2833
fail-fast: false
2934

.gitignore

+7-1
@@ -130,4 +130,10 @@ dmypy.json
130130

131131
# Misc
132132
.idea/
133-
experiments/
133+
experiments/
134+
.DS_Store
135+
136+
# Vagrant
137+
.vagrant
138+
Vagrantfile
139+
/hpobench/container/recipes_local/

README.md

+49-78
@@ -1,15 +1,22 @@
11
# HPOBench
22

3-
HPOBench is a library for hyperparameter optimization and black-box optimization benchmark with a focus on reproducibility.
3+
HPOBench is a library for providing benchmarks for (multi-fidelity) hyperparameter optimization with a focus on reproducibility.
44

5-
**Note:** HPOBench is under active construction. Stay tuned for more benchmarks. Information on how to contribute a new benchmark will follow shortly.
5+
A list of benchmarks can be found in the [wiki](https://github.com/automl/HPOBench/wiki/Available-Containerized-Benchmarks) and a guide on how to contribute benchmarks is available [here](https://github.com/automl/HPOBench/wiki/https://github.com/automl/HPOBench/wiki/How-to-add-a-new-benchmark-step-by-step).
66

7-
**Note:** If you are looking for a different or older version of our benchmarking library, you might be looking for
8-
[HPOlib1.5](https://github.com/automl/HPOlib1.5)
7+
## Status
8+
9+
Status for Master Branch:
10+
[![Build Status](https://github.com/automl/HPOBench/workflows/Test%20Pull%20Requests/badge.svg?branch=master)](https://https://github.com/automl/HPOBench/actions)
11+
[![codecov](https://codecov.io/gh/automl/HPOBench/branch/master/graph/badge.svg)](https://codecov.io/gh/automl/HPOBench)
12+
13+
Status for Development Branch:
14+
[![Build Status](https://github.com/automl/HPOBench/workflows/Test%20Pull%20Requests/badge.svg?branch=development)](https://https://github.com/automl/HPOBench/actions)
15+
[![codecov](https://codecov.io/gh/automl/HPOBench/branch/development/graph/badge.svg)](https://codecov.io/gh/automl/HPOBench)
916

1017
## In 4 lines of code
1118

12-
Run a random configuration within a singularity container
19+
Evaluate a random configuration using a singularity container
1320
```python
1421
from hpobench.container.benchmarks.ml.xgboost_benchmark import XGBoostBenchmark
1522
b = XGBoostBenchmark(task_id=167149, container_source='library://phmueller/automl', rng=1)
@@ -27,79 +34,45 @@ result_dict = b.objective_function(configuration=config, fidelity={"n_estimators
2734
result_dict = b.objective_function(configuration=config, rng=1)
2835
```
2936

30-
Containerized benchmarks do not rely on external dependencies and thus do not change. To do so, we rely on [Singularity (version 3.5)](https://sylabs.io/guides/3.5/user-guide/).
31-
32-
Further requirements are: [ConfigSpace](https://github.com/automl/ConfigSpace), *scipy* and *numpy*
33-
34-
**Note:** Each benchmark can also be run locally, but the dependencies must be installed manually and might conflict with other benchmarks.
35-
This can be arbitrarily complex and further information can be found in the docstring of the benchmark.
36-
37-
A simple example is the XGBoost benchmark which can be installed with `pip install .[xgboost]`
38-
```python
39-
from hpobench.benchmarks.ml.xgboost_benchmark import XGBoostBenchmark
40-
b = XGBoostBenchmark(task_id=167149)
41-
config = b.get_configuration_space(seed=1).sample_configuration()
42-
result_dict = b.objective_function(configuration=config, fidelity={"n_estimators": 128, "dataset_fraction": 0.5}, rng=1)
43-
44-
```
37+
For more examples see `/example/`.
4538

4639
## Installation
4740

48-
Before we start, we recommend using a virtual environment. To run any benchmark using its singularity container,
49-
run the following:
41+
We recommend using a virtual environment. To install HPOBench, please run the following:
5042
```
5143
git clone https://github.com/automl/HPOBench.git
5244
cd HPOBench
5345
pip install .
5446
```
5547

56-
**Note:** This does not install *singularity (version 3.5)*. Please follow the steps described here: [user-guide](https://sylabs.io/guides/3.5/user-guide/quick_start.html#quick-installation-steps).
48+
**Note:** This does not install *singularity (version 3.6)*. Please follow the steps described here: [user-guide](https://sylabs.io/guides/3.6/user-guide/quick_start.html#quick-installation-steps).
49+
If you run into problems, using the most recent singularity version might help: [here](https://singularity.hpcng.org/admin-docs/master/installation.html)
5750

58-
## Available Containerized Benchmarks
51+
## Containerized Benchmarks
5952

60-
| Benchmark Name | Container Name | Additional Info |
61-
| :-------------------------------- | ------------------ | ------------------------------------ |
62-
| BNNOn* | pybnn | There are 4 benchmark in total (ToyFunction, BostonHousing, ProteinStructure, YearPrediction) |
63-
| CartpoleFull | cartpole | Not deterministic. |
64-
| CartpoleReduced | cartpole | Not deterministic. |
65-
| SliceLocalizationBenchmark | tabular_benchmarks | Loading may take several minutes. |
66-
| ProteinStructureBenchmark | tabular_benchmarks | Loading may take several minutes. |
67-
| NavalPropulsionBenchmark | tabular_benchmarks | Loading may take several minutes. |
68-
| ParkinsonsTelemonitoringBenchmark | tabular_benchmarks | Loading may take several minutes. |
69-
| NASCifar10*Benchmark | nasbench_101 | Loading may take several minutes. There are 3 benchmark in total (A, B, C) |
70-
| *NasBench201Benchmark | nasbench_201 | Loading may take several minutes. There are 3 benchmarks in total (Cifar10Valid, Cifar100, ImageNet) |
71-
| NASBench1shot1SearchSpace*Benchmark | nasbench_1shot1 | Loading may take several minutes. There are 3 benchmarks in total (1,2,3) |
72-
| ParamNet*OnStepsBenchmark | paramnet | There are 6 benchmarks in total (Adult, Higgs, Letter, Mnist, Optdigits, Poker) |
73-
| ParamNet*OnTimeBenchmark | paramnet | There are 6 benchmarks in total (Adult, Higgs, Letter, Mnist, Optdigits, Poker) |
74-
| SurrogateSVMBenchmark | surrogate_svm | Random Forest Surrogate of a SVM on MNIST |
75-
| Learna⁺ | learna_benchmark | Not deterministic. |
76-
| MetaLearna⁺ | learna_benchmark | Not deterministic. |
77-
| XGBoostBenchmark⁺ | xgboost_benchmark | Works with OpenML task ids. |
78-
| XGBoostExtendedBenchmark⁺ | xgboost_benchmark | Works with OpenML task ids + Contains Additional Parameter `Booster |
79-
| SupportVectorMachine⁺ | svm_benchmark | Works with OpenML task ids. |
53+
We provide all benchmarks as containerized versions to (i) isolate their dependencies and (ii) keep them reproducible. Our containerized benchmarks do not rely on external dependencies and thus do not change over time. For this, we rely on [Singularity (version 3.6)](https://sylabs.io/guides/3.6/user-guide/) and for now upload all containers to a [gitlab registry](https://gitlab.tf.uni-freiburg.de/muelleph/hpobench-registry/container_registry)
8054

81-
⁺ these benchmarks are not yet final and might change
55+
The only other requirements are: [ConfigSpace](https://github.com/automl/ConfigSpace), *scipy* and *numpy*
8256

83-
**Note:** All containers are uploaded [here](https://gitlab.tf.uni-freiburg.de/muelleph/hpobench-registry/container_registry)
57+
### Run a Benchmark Locally
8458

85-
## Further Notes
86-
87-
### Configure the HPOBench
59+
Each benchmark can also be run locally, but the dependencies must be installed manually and might conflict with other benchmarks. This can be arbitrarily complex and further information can be found in the docstring of the benchmark.
60+
61+
A simple example is the XGBoost benchmark which can be installed with `pip install .[xgboost]`
8862

89-
All of HPOBench's settings are stored in a file, the `hpobenchrc`-file.
90-
It is a yaml file, which is automatically generated at the first use of HPOBench.
91-
By default, it is placed in `$XDG_CONFIG_HOME`. If `$XDG_CONFIG_HOME` is not set, then the
92-
`hpobenchrc`-file is saved to `'~/.config/hpobench'`. When using the containerized benchmarks, the Unix socket is
93-
defined via `$TEMP_DIR`. This is by default `\tmp`. Make sure to have write permissions in those directories.
63+
```python
64+
from hpobench.benchmarks.ml.xgboost_benchmark_old import XGBoostBenchmark
9465

95-
In the `hpobenchrc`, you can specify for example the directory, in that the benchmark containers are
96-
downloaded. We encourage you to take a look into the `hpobenchrc`, to find out more about all
97-
possible settings.
66+
b = XGBoostBenchmark(task_id=167149)
67+
config = b.get_configuration_space(seed=1).sample_configuration()
68+
result_dict = b.objective_function(configuration=config,
69+
fidelity={"n_estimators": 128, "dataset_fraction": 0.5}, rng=1)
9870

71+
```
9972

100-
### How to build a container locally
73+
### How to Build a Container Locally
10174

102-
With singularity installed run the following to built the xgboost container
75+
With singularity installed, run the following to build a container, e.g. the xgboost container
10376

10477
```bash
10578
cd hpobench/container/recipes/ml
@@ -116,18 +89,23 @@ config = b.get_configuration_space(seed=1).sample_configuration()
11689
result_dict = b.objective_function(config, fidelity={"n_estimators": 128, "dataset_fraction": 0.5})
11790
```
11891

92+
## Configure HPOBench
93+
94+
All of HPOBench's settings are stored in a file, the `hpobenchrc`-file. It is a .yaml file, which is automatically generated at the first use of HPOBench.
95+
By default, it is placed in `$XDG_CONFIG_HOME` (or if not set this defaults to `'~/.config/hpobench'`). This file defines where to store containers and datasets and much more. We highly recommend to have a look at this file once it's created. Furthermore, please make sure to have write permission in these directories or adapt if necessary. For more information on where data is stored, please see the section on `HPOBench Data` below.
96+
97+
Furthermore, for running containers, we rely on Unix sockets which by default are located in `$TEMP_DIR` (or if not set this defaults to `/tmp`).
98+
11999
### Remove all data, containers, and caches
120100

121-
Update: In version 0.0.8, we have added the script `hpobench/util/clean_up_script.py`. It allows to easily remove all
122-
data, downloaded containers, and caches. To get more information, you can use the following command.
101+
Feel free to use `hpobench/util/clean_up_script.py` to remove all data, downloaded containers and caches:
123102
```bash
124103
python ./hpobench/util/clean_up_script.py --help
125104
```
126105

127-
If you like to delete only specific parts, i.e. a single container,
128-
you can find the benchmark's data, container, and caches in the following directories:
106+
If you would like to delete only specific parts, e.g. a single container, you can find the benchmark's data, container, and caches in the following directories:
129107

130-
#### HPOBench data
108+
#### HPOBench Data
131109
HPOBench stores downloaded containers and datasets at the following locations:
132110

133111
```bash
@@ -138,20 +116,20 @@ $XDG_DATA_HOME # ~/.local/share/hpobench
138116

139117
For crashes or when not properly shutting down containers, there might be socket files left under `/tmp/hpobench_socket`.
140118

141-
#### OpenML data
119+
#### OpenML Data
142120

143121
OpenML data additionally maintains its cache which is located at `~/.openml/`
144122

145-
#### Singularity container
123+
#### Singularity Containers
146124

147125
Singularity additionally maintains its cache which can be removed with `singularity cache clean`
148126

149-
### Use HPOBench benchmarks in research projects
127+
### Use HPOBench Benchmarks in Research Projects
150128

151129
If you use a benchmark in your experiments, please specify the version number of the HPOBench as well as the version of
152-
the used container. When starting an experiment, HPOBench writes automatically the 2 version numbers to the log.
130+
the used container to ensure reproducibility. When starting an experiment, HPOBench automatically writes these two version numbers to the log.
153131

154-
### Troubleshooting
132+
### Troubleshooting and Further Notes
155133

156134
- **Singularity throws an 'Invalid Image format' exception**
157135
Use a singularity version > 3. For users of the Meta-Cluster in Freiburg, you have to set the following path:
@@ -160,13 +138,6 @@ the used container. When starting an experiment, HPOBench writes automatically t
160138
- **A Benchmark fails with `SystemError: Could not start an instance of the benchmark. Retried 5 times` but the container
161139
can be started locally with `singularity instance start <pathtocontainer> test`**
162140
See whether in `~/.singularity/instances/sing/$HOSTNAME/*/` there is a file that does not end with '}'. If yes delete this file and retry.
163-
164-
## Status
165-
166-
Status for Master Branch:
167-
[![Build Status](https://github.com/automl/HPOBench/workflows/Test%20Pull%20Requests/badge.svg?branch=master)](https://https://github.com/automl/HPOBench/actions)
168-
[![codecov](https://codecov.io/gh/automl/HPOBench/branch/master/graph/badge.svg)](https://codecov.io/gh/automl/HPOBench)
169141

170-
Status for Development Branch:
171-
[![Build Status](https://github.com/automl/HPOBench/workflows/Test%20Pull%20Requests/badge.svg?branch=development)](https://https://github.com/automl/HPOBench/actions)
172-
[![codecov](https://codecov.io/gh/automl/HPOBench/branch/development/graph/badge.svg)](https://codecov.io/gh/automl/HPOBench)
142+
**Note:** If you are looking for a different or older version of our benchmarking library, you might be looking for
143+
[HPOlib1.5](https://github.com/automl/HPOlib1.5)
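
The README changes above describe where HPOBench keeps its `hpobenchrc` file and its Unix sockets. A minimal Python sketch of that lookup, assuming the README wording is taken literally; `resolve_hpobench_dirs` is a hypothetical helper for illustration and is not part of HPOBench:

```python
import os
from pathlib import Path


def resolve_hpobench_dirs(env=None):
    """Resolve the directories described in the README text (illustrative only)."""
    env = os.environ if env is None else env
    # README: the hpobenchrc file is placed in $XDG_CONFIG_HOME,
    # falling back to ~/.config/hpobench when the variable is not set.
    config_dir = Path(env.get("XDG_CONFIG_HOME", "~/.config/hpobench")).expanduser()
    # README: Unix sockets are located in $TEMP_DIR, falling back to /tmp.
    socket_dir = Path(env.get("TEMP_DIR", "/tmp"))
    return {"config_dir": config_dir, "socket_dir": socket_dir}
```

Calling `os.access(path, os.W_OK)` on each returned path is one way to check the write permissions the README asks for before starting a containerized benchmark.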

changelog.md

+11
@@ -1,3 +1,14 @@
1+
# 0.0.9
2+
* Add new Benchmarks: Tabular Benchmarks.
3+
Provided by @Neeratyoy.
4+
* New Benchmark: ML Benchmark Class
5+
This new benchmark class offers a unified interface for XGB, SVM, MLP, HISTGB, RF, LR benchmarks operating on OpenML
6+
tasks.
7+
Provided by @Neeratyoy.
8+
* This version is the one used for the paper:
9+
"HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO" (Eggensperger et al.)
10+
https://openreview.net/forum?id=1k4rJYEwda-
11+
112
# 0.0.8
213
* Improve container integration
314
The containers had some problems when the file system was read-only. In this case, the home directory, which contains the

ci_scripts/install.sh

+2-2
@@ -4,7 +4,7 @@ install_packages=""
44

55
if [[ "$RUN_TESTS" == "true" ]]; then
66
echo "Install tools for testing"
7-
install_packages="${install_packages}xgboost,pytest,test_paramnet,"
7+
install_packages="${install_packages}xgboost,pytest,test_paramnet,test_tabular_datamanager,"
88
pip install codecov
99

1010
# The param net benchmark does not work with a scikit-learn version != 0.23.2. (See notes in the benchmark)
@@ -65,7 +65,7 @@ if [[ "$USE_SINGULARITY" == "true" ]]; then
6565
sudo make -C builddir install
6666

6767
cd ..
68-
install_packages="${install_packages}singularity,"
68+
install_packages="${install_packages}placeholder,"
6969
else
7070
echo "Skip installing Singularity"
7171
fi

ci_scripts/install_singularity.sh

+2
@@ -20,6 +20,8 @@ elif [[ "$SINGULARITY_VERSION" == "3.6" ]]; then
2020
export VERSION=3.6.4
2121
elif [[ "$SINGULARITY_VERSION" == "3.7" ]]; then
2222
export VERSION=3.7.3
23+
elif [[ "$SINGULARITY_VERSION" == "3.8" ]]; then
24+
export VERSION=3.8.0
2325
else
2426
echo "Skip installing Singularity"
2527
fi
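
The hunk above extends an `elif` chain that maps a minor Singularity version to the exact release to install. The same mapping can be sketched in Python as a table lookup. Only the three branches visible in this diff are included, and both `SINGULARITY_PATCH_VERSIONS` and `resolve_singularity_version` are hypothetical names used for illustration:

```python
# Minor-to-full version mapping as visible in the hunk above. Earlier branches
# of the script are not shown in the diff and are therefore omitted here.
SINGULARITY_PATCH_VERSIONS = {
    "3.6": "3.6.4",
    "3.7": "3.7.3",
    "3.8": "3.8.0",
}


def resolve_singularity_version(minor):
    """Return the full version for a minor version, or None to skip installing."""
    return SINGULARITY_PATCH_VERSIONS.get(minor)
```

With a table like this, supporting a new release is a one-line change, which is what the two added shell lines do for 3.8.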

examples/local/xgboost_local.py

+1-1
@@ -10,7 +10,7 @@
1010
import argparse
1111
from time import time
1212

13-
from hpobench.benchmarks.ml.xgboost_benchmark import XGBoostBenchmark as Benchmark
13+
from hpobench.benchmarks.ml.xgboost_benchmark_old import XGBoostBenchmark as Benchmark
1414
from hpobench.util.openml_data_manager import get_openmlcc18_taskids
1515

1616

extra_requirements/ml_mfbb.json

+4
@@ -0,0 +1,4 @@
1+
{
2+
"ml_tabular_benchmarks": ["tqdm","pandas==1.2.4","scikit-learn==0.24.2","openml==0.12.2","xgboost==1.3.1"],
3+
"ml_mfbb": ["tqdm","pandas==1.2.4","scikit-learn==0.24.2","openml==0.12.2","xgboost==1.3.1"]
4+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
{
2+
"outlier_detection": ["torch==1.9.0", "pytorch_lightning==1.3.8", "scikit-learn==0.24.2"]
3+
}

extra_requirements/tests.json

+2-1
@@ -1,5 +1,6 @@
11
{
22
"codestyle": ["pycodestyle","flake8","pylint"],
33
"pytest": ["pytest>=4.6","pytest-cov"],
4-
"test_paramnet": ["tqdm", "scikit-learn==0.23.2"]
4+
"test_paramnet": ["tqdm", "scikit-learn==0.23.2"],
5+
"test_tabular_datamanager": ["pyarrow", "fastparquet"]
56
}
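
The JSON files under `extra_requirements/` each map an extra name (for example `test_tabular_datamanager`) to a list of packages. A minimal sketch of how such files could be merged into a single dictionary suitable for setuptools' `extras_require`. This assumes the files are consumed that way, which the diff itself does not show; `read_extra_requirements` is a hypothetical helper:

```python
import json
from pathlib import Path


def read_extra_requirements(directory):
    """Merge every *.json file in `directory` into one {extra: [packages]} dict."""
    extras = {}
    for json_file in sorted(Path(directory).glob("*.json")):
        with json_file.open() as fh:
            for name, packages in json.load(fh).items():
                # Refuse silent overrides when two files define the same extra.
                if name in extras:
                    raise ValueError(f"duplicate extra '{name}' in {json_file.name}")
                extras[name] = list(packages)
    return extras
```

Raising on duplicate names is a design choice of this sketch: it keeps a pinned version in one file from being silently replaced by a different pin in another.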
