
Commit ae61580

Fix issue with latest pandas version and add tests
1 parent 241576e commit ae61580

File tree

9 files changed: +208 -71 lines changed


.github/workflows/build.yml

+22
@@ -0,0 +1,22 @@
+name: build
+concurrency:
+  group: build-${{ github.ref }}
+  cancel-in-progress: true
+on: [push, workflow_dispatch]
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ['3.6', '3.7', '3.8', '3.9', '3.10']
+    name: Python ${{ matrix.python-version }} sample
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Python
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - run: python --version
+      - run: pip install --upgrade pip
+      - run: pip install -r requirements.txt
+      - run: pytest -rP -p no:cacheprovider doepy

.testmondata

32 KB
Binary file not shown.

Makefile

+31
@@ -0,0 +1,31 @@
+RUN=docker run -it --rm -p 5000:5000
+DEV_IMAGE=doepy
+docker-images:
+	docker build -t $(DEV_IMAGE) -f docker/Dockerfile.doepy .
+
+
+PYTEST=pytest -o cache_dir=/tmp --testmon --quiet -rP --durations=5
+watch-tests:
+	rm -f src/.testmondata
+	docker run -it --rm -v $(PWD):/src $(DEV_IMAGE) \
+		ptw --runner "$(PYTEST)" --ignore docs
+	rm -f src/.testmondata
+
+bash-tests:
+	docker run -it --rm -v $(PWD):/src $(DEV_IMAGE) bash
+
+# jupyter:
+# 	docker run --rm -v $(PWD)/src:/src -p 8898:8898 $(JUPYTER_IMAGE) jupyter notebook --allow-root --port=8898 --ip 0.0.0.0 --no-browser
+
+
+# isort:
+# 	docker run -it --rm -v $(PWD)/src:/src $(DEV_IMAGE) isort mistat
+
+# mypy:
+# 	docker run -it --rm -v $(PWD)/src:/src $(DEV_IMAGE) mypy --install-types mistat
+
+# pylint:
+# 	docker run -it --rm -v $(PWD)/src:/src $(DEV_IMAGE) pylint mistat
+
+# bash-dev:
+# 	docker run -it --rm -v $(PWD)/src:/src $(DEV_IMAGE) bash

docker/Dockerfile.doepy

+7
@@ -0,0 +1,7 @@
+FROM python:3.9-slim
+RUN mkdir /src
+WORKDIR /src
+COPY ./requirements.txt /requirements.txt
+RUN pip3 install --requirement /requirements.txt
+RUN pip3 install --upgrade pip
+

doepy/build.py

+33 -33
@@ -26,7 +26,7 @@ def full_fact(d):
 def frac_fact_res(d, res=None):
     """
     Builds a 2-level fractional factorial design dataframe from a dictionary of factor/level ranges and given resolution.
-
+
     Parameters
     ----------
     factor_level_ranges : Dictionary of factors and ranges
@@ -37,8 +37,8 @@ def frac_fact_res(d, res=None):
     res : int
         Desired design resolution.
         Default: Set to half of the total factor count.
-
-    Notes
+
+    Notes
     -----
     The resolution of a design is defined as the length of the shortest
     word in the defining relation. The resolution describes the level of
@@ -71,7 +71,7 @@ def frac_fact_res(d, res=None):
     5 5.0 0.3 15.0 3.0 -1.0
     6 1.0 0.7 15.0 3.0 -2.0
     7 5.0 0.7 15.0 7.0 -1.0
-
+
     It builds a dataframe with only 8 rows (designs) from a dictionary with 6 factors.
     A full factorial design would have required 2^6 = 64 designs.
     >>> build_frac_fact_res(d1,5)
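
For context, a minimal usage sketch of the wrapper above. It assumes the wrappers in this file are importable as doepy.build and return a pandas DataFrame, as the docstring describes; the factor names and ranges are illustrative only.

# Illustrative sketch; factor names and ranges are made up, the signature comes from the diff above.
from doepy import build

d1 = {
    'Pressure': [40, 55],
    'Temperature': [290, 320],
    'FlowRate': [0.2, 0.4],
    'Time': [5, 11],
    'Agitation': [10, 30],
    'Catalyst': [1, 2],
}

# res defaults to half the factor count when omitted (per the docstring)
design = build.frac_fact_res(d1, res=3)
print(design.shape)   # far fewer runs than the 2**6 = 64 of a full factorial
print(design.head())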
@@ -89,14 +89,14 @@ def plackett_burman(d):
     Only min and max values of the range are required.
     Example of the dictionary which is needed as the input:
     {'Pressure':[50,70],'Temperature':[290, 350],'Flow rate':[0.9,1.0]}
-
-    Plackett–Burman designs are experimental designs presented in 1946 by Robin L. Plackett and J. P. Burman while working in the British Ministry of Supply. Their goal was to find experimental designs for investigating the dependence of some measured quantity on a number of independent variables (factors), each taking L levels, in such a way as to minimize the variance of the estimates of these dependencies using a limited number of experiments.
-
+
+    Plackett–Burman designs are experimental designs presented in 1946 by Robin L. Plackett and J. P. Burman while working in the British Ministry of Supply. Their goal was to find experimental designs for investigating the dependence of some measured quantity on a number of independent variables (factors), each taking L levels, in such a way as to minimize the variance of the estimates of these dependencies using a limited number of experiments.
+
     Interactions between the factors were considered negligible. The solution to this problem is to find an experimental design where each combination of levels for any pair of factors appears the same number of times, throughout all the experimental runs (refer to table).
-    A complete factorial design would satisfy this criterion, but the idea was to find smaller designs.
-
-    These designs are unique in that the number of trial conditions (rows) expands by multiples of four (e.g. 4, 8, 12, etc.).
-    The max number of columns allowed before a design increases the number of rows is always one less than the next higher multiple of four.
+    A complete factorial design would satisfy this criterion, but the idea was to find smaller designs.
+
+    These designs are unique in that the number of trial conditions (rows) expands by multiples of four (e.g. 4, 8, 12, etc.).
+    The max number of columns allowed before a design increases the number of rows is always one less than the next higher multiple of four.
     """
 
     return build_plackett_burman(d)
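
A hedged example of the wrapper above, using the dictionary format from the docstring; the exact run count is not asserted, only inspected.

# Sketch only; assumes doepy.build exposes the wrappers defined in this file
from doepy import build

d = {'Pressure': [50, 70], 'Temperature': [290, 350], 'Flow rate': [0.9, 1.0]}
pb = build.plackett_burman(d)

# Rows grow in multiples of four as factors are added (see the note above)
print(pb.shape)
print(pb)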
@@ -110,9 +110,9 @@ def sukharev(d, num_samples=None):
     Example of the dictionary which is needed as the input:
     {'Pressure':[50,70],'Temperature':[290, 350],'Flow rate':[0.9,1.0]}
     num_samples: Number of samples to be generated
-
-    A special property of this grid is that points are not placed on the boundaries of the hypercube, but at centroids of the subcells constituted by individual samples.
-    This design offers optimal results for the covering radius regarding distances based on the max-norm.
+
+    A special property of this grid is that points are not placed on the boundaries of the hypercube, but at centroids of the subcells constituted by individual samples.
+    This design offers optimal results for the covering radius regarding distances based on the max-norm.
     """
 
     return build_sukharev(d, num_samples=num_samples)
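
A small sketch of the sukharev wrapper; it only inspects the generated grid, assuming a pandas DataFrame comes back.

from doepy import build  # assumes the module layout shown in this commit

d = {'Pressure': [50, 70], 'Temperature': [290, 350], 'Flow rate': [0.9, 1.0]}
grid = build.sukharev(d, num_samples=8)

# Per the docstring, points sit at sub-cell centroids, so column minima and maxima
# are expected to lie strictly inside the requested ranges
print(grid.min())
print(grid.max())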
@@ -124,12 +124,12 @@ def box_behnken(d, center=1):
     Note 3 levels of factors are necessary. If not given, the function will automatically create 3 levels by linear mid-section method.
     Example of the dictionary which is needed as the input:
     {'Pressure':[50,60,70],'Temperature':[290, 320, 350],'Flow rate':[0.9,1.0,1.1]}
-
-    In statistics, Box–Behnken designs are experimental designs for response surface methodology, devised by George E. P. Box and Donald Behnken in 1960, to achieve the following goals:
-    * Each factor, or independent variable, is placed at one of three equally spaced values, usually coded as −1, 0, +1. (At least three levels are needed for the following goal.)
-    * The design should be sufficient to fit a quadratic model, that is, one containing squared terms, products of two factors, linear terms and an intercept.
-    * The ratio of the number of experimental points to the number of coefficients in the quadratic model should be reasonable (in fact, their designs kept it in the range of 1.5 to 2.6). * The estimation variance should more or less depend only on the distance from the centre (this is achieved exactly for the designs with 4 and 7 factors), and should not vary too much inside the smallest (hyper)cube containing the experimental points.
-    """
+
+    In statistics, Box–Behnken designs are experimental designs for response surface methodology, devised by George E. P. Box and Donald Behnken in 1960, to achieve the following goals:
+    * Each factor, or independent variable, is placed at one of three equally spaced values, usually coded as −1, 0, +1. (At least three levels are needed for the following goal.)
+    * The design should be sufficient to fit a quadratic model, that is, one containing squared terms, products of two factors, linear terms and an intercept.
+    * The ratio of the number of experimental points to the number of coefficients in the quadratic model should be reasonable (in fact, their designs kept it in the range of 1.5 to 2.6). * The estimation variance should more or less depend only on the distance from the centre (this is achieved exactly for the designs with 4 and 7 factors), and should not vary too much inside the smallest (hyper)cube containing the experimental points.
+    """
 
     return build_box_behnken(d, center=center)
 
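
A usage sketch for box_behnken with an explicit three-level dictionary; treating center as the number of centre-point runs is an assumption, not something the diff states.

from doepy import build

# Three levels per factor, as the docstring requires (mid-levels are otherwise
# created automatically by the linear mid-section method)
d = {'Pressure': [50, 60, 70], 'Temperature': [290, 320, 350], 'Flow rate': [0.9, 1.0, 1.1]}
bb = build.box_behnken(d, center=1)  # center assumed to control centre-point runs
print(bb)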

@@ -140,12 +140,12 @@ def central_composite(d, center=(2, 2), alpha="o", face="ccc"):
     Only min and max values of the range are required.
     Example of the dictionary which is needed as the input:
     {'Pressure':[50,70],'Temperature':[290, 350],'Flow rate':[0.9,1.0]}
-
-    In statistics, a central composite design is an experimental design, useful in response surface methodology, for building a second order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment.
-    The design consists of three distinct sets of experimental runs:
-    * A factorial (perhaps fractional) design in the factors studied, each having two levels;
-    * A set of center points, experimental runs whose values of each factor are the medians of the values used in the factorial portion. This point is often replicated in order to improve the precision of the experiment;
-    * A set of axial points, experimental runs identical to the centre points except for one factor, which will take on values both below and above the median of the two factorial levels, and typically both outside their range. All factors are varied in this way.
+
+    In statistics, a central composite design is an experimental design, useful in response surface methodology, for building a second order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment.
+    The design consists of three distinct sets of experimental runs:
+    * A factorial (perhaps fractional) design in the factors studied, each having two levels;
+    * A set of center points, experimental runs whose values of each factor are the medians of the values used in the factorial portion. This point is often replicated in order to improve the precision of the experiment;
+    * A set of axial points, experimental runs identical to the centre points except for one factor, which will take on values both below and above the median of the two factorial levels, and typically both outside their range. All factors are varied in this way.
     """
 
     return build_central_composite(d, center=center, alpha=alpha, face=face)
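
A sketch of the central_composite wrapper. The keyword values simply mirror the defaults in the signature above; what each option selects is not spelled out in the diff, so this is only an illustration.

from doepy import build

d = {'Pressure': [50, 70], 'Temperature': [290, 350], 'Flow rate': [0.9, 1.0]}
# center, alpha and face are passed straight through to the underlying builder
cc = build.central_composite(d, center=(2, 2), alpha='o', face='ccc')
print(cc.shape)   # factorial + centre + axial runs
print(cc.head())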
@@ -159,10 +159,10 @@ def lhs(d, num_samples=None, prob_distribution=None):
     {'Pressure':[50,70],'Temperature':[290, 350],'Flow rate':[0.9,1.0]}
     num_samples: Number of samples to be generated
     prob_distribution: Analytical probability distribution to be applied over the randomized sampling.
-        Accepts one of the following strings:
+        Accepts one of the following strings:
         'Normal', 'Poisson', 'Exponential', 'Beta', 'Gamma'
 
-    Latin hypercube sampling (LHS) is a form of stratified sampling that can be applied to multiple variables. The method is commonly used to reduce the number of runs necessary for a Monte Carlo simulation to achieve a reasonably accurate random distribution. LHS can be incorporated into an existing Monte Carlo model fairly easily, and works with variables following any analytical probability distribution.
+    Latin hypercube sampling (LHS) is a form of stratified sampling that can be applied to multiple variables. The method is commonly used to reduce the number of runs necessary for a Monte Carlo simulation to achieve a reasonably accurate random distribution. LHS can be incorporated into an existing Monte Carlo model fairly easily, and works with variables following any analytical probability distribution.
     """
 
     return build_lhs(d, num_samples=num_samples, prob_distribution=prob_distribution)
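
A sketch of the lhs wrapper; the distribution name comes from the list in the docstring and the sample size is arbitrary.

from doepy import build

d = {'Pressure': [50, 70], 'Temperature': [290, 350], 'Flow rate': [0.9, 1.0]}
# prob_distribution must be one of 'Normal', 'Poisson', 'Exponential', 'Beta', 'Gamma'
samples = build.lhs(d, num_samples=20, prob_distribution='Normal')
print(samples.describe())  # summary of the 20 generated samples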
@@ -199,11 +199,11 @@ def maximin(d, num_samples=None):
     Example of the dictionary which is needed as the input:
     {'Pressure':[50,70],'Temperature':[290, 350],'Flow rate':[0.9,1.0]}
     num_samples: Number of samples to be generated
-
-    This algorithm carries out a user-specified number of iterations to maximize the minimal distance of a point in the set to
-    * other points in the set,
-    * existing (fixed) points,
-    * the boundary of the hypercube.
+
+    This algorithm carries out a user-specified number of iterations to maximize the minimal distance of a point in the set to
+    * other points in the set,
+    * existing (fixed) points,
+    * the boundary of the hypercube.
     """
 
     return build_maximin(d, num_samples=num_samples)
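
Finally, a sketch of the maximin wrapper; only the signature shown above is relied on.

from doepy import build

d = {'Pressure': [50, 70], 'Temperature': [290, 350], 'Flow rate': [0.9, 1.0]}
mm = build.maximin(d, num_samples=12)
# Points are iteratively pushed apart from each other, from fixed points,
# and from the hypercube boundary (see the notes above)
print(mm)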
