Commit 61adde8

make release-tag: Merge branch 'master' into stable
2 parents d728f5f + 1048531 commit 61adde8

67 files changed, +970 -1129 lines changed

.github/ISSUE_TEMPLATE/feature_request.md

+1 -1
@@ -2,7 +2,7 @@
 name: Feature request
 about: Request a new feature that you would like to see implemented in SDV
 title: ''
-labels: new feature, new
+labels: feature request, new
 assignees: ''
 
 ---

.github/workflows/integration.yml

+4 -3
@@ -1,16 +1,17 @@
 name: Integration Tests
 
 on:
-  - push
-  - pull_request
+  push:
+  pull_request:
+    types: [opened, reopened]
 
 jobs:
   unit:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
         python-version: [3.6, 3.7, 3.8, 3.9]
-        os: [ubuntu-latest, macos-10.15, windows-latest]
+        os: [ubuntu-latest, macos-latest, windows-latest]
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}

.github/workflows/lint.yml

+3 -2
@@ -1,8 +1,9 @@
 name: Style Checks
 
 on:
-  - push
-  - pull_request
+  push:
+  pull_request:
+    types: [opened, reopened]
 
 jobs:
   lint:

.github/workflows/minimum.yml

+4 -3
@@ -1,16 +1,17 @@
 name: Unit Tests Minimum Versions
 
 on:
-  - push
-  - pull_request
+  push:
+  pull_request:
+    types: [opened, reopened]
 
 jobs:
   minimum:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
         python-version: [3.6, 3.7, 3.8, 3.9]
-        os: [ubuntu-latest, macos-10.15, windows-latest]
+        os: [ubuntu-latest, macos-latest, windows-latest]
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}

.github/workflows/readme.yml

+4 -3
@@ -1,16 +1,17 @@
 name: Test README
 
 on:
-  - push
-  - pull_request
+  push:
+  pull_request:
+    types: [opened, reopened]
 
 jobs:
   readme:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
         python-version: [3.6, 3.7, 3.8, 3.9]
-        os: [ubuntu-latest, macos-10.15]  # skip windows bc rundoc fails
+        os: [ubuntu-latest, macos-latest]  # skip windows bc rundoc fails
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}

.github/workflows/tutorials.yml

+8 -3
@@ -1,16 +1,17 @@
 name: Run Tutorials
 
 on:
-  - push
-  - pull_request
+  push:
+  pull_request:
+    types: [opened, reopened]
 
 jobs:
   tutorials:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
         python-version: [3.6, 3.7, 3.8, 3.9]
-        os: [ubuntu-latest, macos-10.15, windows-latest]
+        os: [ubuntu-latest, macos-latest, windows-latest]
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}
@@ -34,5 +35,9 @@ jobs:
       run: python -m pip install pywinpty==2.0.1
     - name: Install package and dependencies
       run: pip install invoke jupyter .
+
+    - if: matrix.python-version != 3.6
+      name: Install NBConvert
+      run: pip install nbconvert==6.4.5 nbformat==5.4.0
     - name: invoke tutorials
       run: invoke tutorials

.github/workflows/unit.yml

+4 -3
@@ -1,16 +1,17 @@
 name: Unit Tests
 
 on:
-  - push
-  - pull_request
+  push:
+  pull_request:
+    types: [opened, reopened]
 
 jobs:
   unit:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
         python-version: [3.6, 3.7, 3.8, 3.9]
-        os: [ubuntu-latest, macos-10.15, windows-latest]
+        os: [ubuntu-latest, macos-latest, windows-latest]
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}

HISTORY.md

+23
@@ -1,5 +1,28 @@
 # Release Notes
 
+## 0.17.0 - 2022-09-09
+
+This release updates the code to use RDT version 1.2.0 and greater, so that those new features are now available in SDV. This changes the transformers that are available in SDV models to be those that are in RDT version 1.2.0. As a result, some arguments for initializing models have changed.
+
+Additionally, this release fixes bugs related to loading models with custom constraints. It also fixes a bug that added `NaNs` to the index of sampled data when using `sample_remaining_columns`.
+
+### Bugs Fixed
+
+* Incorrect rounding in Custom Constraint example - Issue [#941](https://github.com/sdv-dev/SDV/issues/941) by @amontanez24
+* Can't save the model if use the custom constraint - Issue [#928](https://github.com/sdv-dev/SDV/issues/928) by @pvk-developer
+* User Guide code fixes - Issue [#983](https://github.com/sdv-dev/SDV/issues/983) by @amontanez24
+* Index contains NaNs when using sample_remaining_columns - Issue [#985](https://github.com/sdv-dev/SDV/issues/985) by @amontanez24
+* Cannot sample after loading a model with custom constraints: TypeError - Issue [#984](https://github.com/sdv-dev/SDV/issues/984) by @pvk-developer
+* Set HyperTransformer config manually, based on Metadata if given - Issue [#982](https://github.com/sdv-dev/SDV/issues/982) by @pvk-developer
+
+### New Features
+
+* Change default metrics for evaluate - Issue [#949](https://github.com/sdv-dev/SDV/issues/949) by @fealho
+
+### Maintenance
+
+* Update the RDT version to 1.0 - Issue [#897](https://github.com/sdv-dev/SDV/issues/897) by @pvk-developer
+
 ## 0.16.0 - 2022-07-21
 
 This release brings user friendly improvements and bug fixes on the `SDV` constraints, to help

Makefile

+1 -1
@@ -134,7 +134,7 @@ test-tutorials: ## run the tutorial notebooks
 	invoke tutorials
 
 .PHONY: test
-test: test-unit test-readme test-tutorials ## test everything that needs test dependencies
+test: test-unit test-integration test-readme test-tutorials ## test everything that needs test dependencies
 
 .PHONY: test-all
 test-all: ## run tests on every Python version with tox

conda/meta.yaml

+11 -11
@@ -1,5 +1,5 @@
 {% set name = 'sdv' %}
-{% set version = '0.16.0' %}
+{% set version = '0.17.0.dev3' %}
 
 package:
   name: "{{ name|lower }}"
@@ -19,29 +19,29 @@ requirements:
     - pytest-runner
     - graphviz
     - python >=3.6,<3.10
-    - faker >=3.0.0,<10
+    - faker >=10,<15
     - python-graphviz >=0.13.2,<1
     - numpy >=1.18.0,<2
     - pandas >=1.1.3,<2
     - tqdm >=4.15,<5
-    - copulas >=0.6.0,<0.7
-    - ctgan >=0.5.0,<0.6
+    - copulas >=0.7.0,<0.8
+    - ctgan >=0.5.2,<0.6
     - deepecho >=0.3.0.post1,<0.4
-    - rdt >=0.6.1,<0.7
-    - sdmetrics >=0.4.1,<0.5
+    - rdt >=1.2.0,<2
+    - sdmetrics >=0.6.0,<0.7
   run:
     - graphviz
     - python >=3.6,<3.10
-    - faker >=3.0.0,<10
+    - faker >=10,<15
     - python-graphviz >=0.13.2,<1
     - numpy >=1.18.0,<2
     - pandas >=1.1.3,<2
     - tqdm >=4.15,<5
-    - copulas >=0.6.0,<0.7
-    - ctgan >=0.5.0,<0.6
+    - copulas >=0.7.0,<0.8
+    - ctgan >=0.5.2,<0.6
     - deepecho >=0.3.0.post1,<0.4
-    - rdt >=0.6.1,<0.7
-    - sdmetrics >=0.4.1,<0.5
+    - rdt >=1.2.0,<2
+    - sdmetrics >=0.6.0,<0.7
 
 about:
   home: "https://sdv.dev"

docs/api_reference/metrics/relational.rst

+3 -6
@@ -35,12 +35,9 @@ Multi Table Statistical Metrics
     CSTest
     CSTest.get_subclasses
     CSTest.compute
-    KSTest
-    KSTest.get_subclasses
-    KSTest.compute
-    KSTestExtended
-    KSTestExtended.get_subclasses
-    KSTestExtended.compute
+    KSComplement
+    KSComplement.get_subclasses
+    KSComplement.compute
 
 Multi Table Detection Metrics
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

docs/api_reference/metrics/tabular.rst

+3 -6
@@ -37,12 +37,9 @@ Single Table Statistical Metrics
     CSTest
     CSTest.get_subclasses
     CSTest.compute
-    KSTest
-    KSTest.get_subclasses
-    KSTest.compute
-    KSTestExtended
-    KSTestExtended.get_subclasses
-    KSTestExtended.compute
+    KSComplement
+    KSComplement.get_subclasses
+    KSComplement.compute
     ContinuousKLDivergence
     ContinuousKLDivergence.get_subclasses
     ContinuousKLDivergence.compute

docs/developer_guides/sdv/tabular.rst

+2 -2
@@ -58,8 +58,8 @@ A part from the previous steps, the ``BaseTabularModel`` also offers a couple of
 functionalities:
 
 * ``get_metadata``: Returns the Table metadata object that has been fitted to the data.
-* ``save``: Saves the complete Tabular Model in a file using ``pickle``.
-* ``load``: Loads a previously saved model from a ``pickle`` file.
+* ``save``: Saves the complete Tabular Model in a file using ``cloudpickle``.
+* ``load``: Loads a previously saved model from a ``cloudpickle`` file.
 
 Implementing a Tabular Model
 ----------------------------
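
The ``pickle`` to ``cloudpickle`` wording change above sits alongside the custom-constraint save/load fixes listed in HISTORY.md. As background only, and not as SDV's actual code, the sketch below uses just the standard library to show one limitation of plain ``pickle`` that could plausibly motivate such a switch: functions are serialized by reference, so a function without an importable name cannot be saved. The helper ``can_pickle`` and the ``is_valid`` lambda are hypothetical names introduced for this illustration.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard-library pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Hypothetical stand-in for a user-defined constraint function created on the fly.
is_valid = lambda row: row["low"] <= row["high"]

# Built-in functions are pickled by reference to their importable name.
print(can_pickle(len))
# A lambda has no importable name, so the by-reference lookup fails.
print(can_pickle(is_valid))
```

The first call prints ``True`` and the second prints ``False``. ``cloudpickle`` is a third-party library that serializes such functions by value, which is why it can handle objects that plain ``pickle`` rejects.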

docs/user_guides/evaluation/evaluation_framework.rst

+4 -4
@@ -98,21 +98,21 @@ are included within the SDV Evaluation framework. However, the list of
 metrics that are applied can be controlled by passing a list with the
 names of the metrics that you want to apply.
 
-For example, if you were interested on obtaining only the ``CSTest`` and
-``KSTest`` metrics you can call the ``evaluate`` function as follows:
+For example, if you were interested on obtaining only the ``CSTest``
+metric you can call the ``evaluate`` function as follows:
 
 .. ipython:: python
     :okwarning:
 
-    evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSTest'])
+    evaluate(synthetic_data, real_data, metrics=['CSTest'])
 
 
 Or, if we want to see the scores separately:
 
 .. ipython:: python
     :okwarning:
 
-    evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSTest'], aggregate=False)
+    evaluate(synthetic_data, real_data, metrics=['CSTest'], aggregate=False)
 
 
 For more details about all the metrics that exist for the different data modalities

docs/user_guides/evaluation/multi_table_metrics.rst

+4 -5
@@ -153,21 +153,20 @@ report back the average score obtained.
 The list of such metrics is:
 
 * ``CSTest``: Multi Single Table metric based on the Single Table CSTest metric.
-* ``KSTest``: Multi Single Table metric based on the Single Table KSTest metric.
-* ``KSTestExtended``: Multi Single Table metric based on the Single Table KSTestExtended metric.
+* ``KSComplement``: Multi Single Table metric based on the Single Table KSComplement metric.
 * ``LogisticDetection``: Multi Single Table metric based on the Single Table LogisticDetection metric.
 * ``SVCDetection``: Multi Single Table metric based on the Single Table SVCDetection metric.
 * ``BNLikelihood``: Multi Single Table metric based on the Single Table BNLikelihood metric.
 * ``BNLogLikelihood``: Multi Single Table metric based on the Single Table BNLogLikelihood metric.
 
-Let's try to use the ``KSTestExtended`` metric:
+Let's try to use the ``KSComplement`` metric:
 
 .. ipython::
     :verbatim:
 
-    In [6]: from sdv.metrics.relational import KSTestExtended
+    In [6]: from sdv.metrics.relational import KSComplement
 
-    In [7]: KSTestExtended.compute(real_data, synthetic_data)
+    In [7]: KSComplement.compute(real_data, synthetic_data)
     Out[7]: 0.8194444444444443
 
 Parent Child Detection Metrics

docs/user_guides/evaluation/single_table_metrics.rst

+6 -6
@@ -136,7 +136,7 @@ outcome from the test.
 
 Such metrics are:
 
-* ``sdv.metrics.tabular.KSTest``: This metric uses the two-sample Kolmogorov–Smirnov test
+* ``sdv.metrics.tabular.KSComplement``: This metric uses the two-sample Kolmogorov–Smirnov test
   to compare the distributions of continuous columns using the empirical CDF.
   The output for each column is 1 minus the KS Test D statistic, which indicates the maximum
   distance between the expected CDF and the observed CDF values.
@@ -150,16 +150,16 @@ Let us execute these two metrics on the loaded data:
 .. ipython::
     :verbatim:
 
-    In [6]: from sdv.metrics.tabular import CSTest, KSTest
+    In [6]: from sdv.metrics.tabular import CSTest, KSComplement
 
     In [7]: CSTest.compute(real_data, synthetic_data)
     Out[7]: 0.8078084931103922
 
-    In [8]: KSTest.compute(real_data, synthetic_data)
+    In [8]: KSComplement.compute(real_data, synthetic_data)
     Out[8]: 0.6372093023255814
 
 In each case, the statistical test will be executed on all the compatible column (so, categorical
-or boolean columns for ``CSTest`` and numerical columns for ``KSTest``), and report the average
+or boolean columns for ``CSTest`` and numerical columns for ``KSComplement``), and report the average
 score obtained.
 
 .. note:: If your table does not contain any column of the compatible type, the output of
@@ -173,11 +173,11 @@ metric classes or their names:
 
     In [9]: from sdv.evaluation import evaluate
 
-    In [10]: evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSTest'], aggregate=False)
+    In [10]: evaluate(synthetic_data, real_data, metrics=['CSTest', 'KSComplement'], aggregate=False)
     Out[10]:
        metric                                     name  raw_score  normalized_score  min_value  max_value      goal
     0  CSTest                              Chi-Squared   0.807808          0.807808        0.0        1.0  MAXIMIZE
-    1  KSTest  Inverted Kolmogorov-Smirnov D statistic   0.637209          0.637209        0.0        1.0  MAXIMIZE
+    1  KSComplement  Inverted Kolmogorov-Smirnov D statistic   0.637209          0.637209        0.0        1.0  MAXIMIZE
 
 
 Likelihood Metrics
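
The hunk above describes ``KSComplement`` as 1 minus the two-sample Kolmogorov–Smirnov D statistic, where D is the largest gap between the two empirical CDFs. The following is a minimal standard-library sketch of that computation for a single numerical column. It is an illustration of the formula only, not the SDV or SDMetrics implementation, and the function name ``ks_complement`` is an assumption made for this example.

```python
from bisect import bisect_right


def ks_complement(real, synthetic):
    """Return 1 - D, where D is the two-sample Kolmogorov-Smirnov statistic."""
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    if not real_sorted or not synth_sorted:
        raise ValueError("both samples must be non-empty")

    # Empirical CDFs are step functions that only change at observed values,
    # so the maximum gap between them occurs at one of those values.
    d_statistic = 0.0
    for value in set(real_sorted) | set(synth_sorted):
        cdf_real = bisect_right(real_sorted, value) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, value) / len(synth_sorted)
        d_statistic = max(d_statistic, abs(cdf_real - cdf_synth))

    return 1.0 - d_statistic


print(ks_complement([1, 2, 3, 4], [3, 4, 5, 6]))
```

Identical samples score 1.0 and samples with no overlap score 0.0, which matches the ``min_value``/``max_value`` range and the ``MAXIMIZE`` goal shown in the ``evaluate`` output above.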

docs/user_guides/relational/hma1.rst

+1 -1
@@ -125,7 +125,7 @@ method passing the name of the file in which you want to save the model.
 Note that the extension of the filename is not relevant, but we will be
 using the ``.pkl`` extension to highlight that the serialization
 protocol used is
-`pickle <https://docs.python.org/3/library/pickle.html>`__.
+`cloudpickle <https://github.com/cloudpipe/cloudpickle>`__.
 
 .. ipython:: python
     :okwarning:
