Commit 28e751a

make release-tag: Merge branch 'master' into stable
2 parents: b7392bb + f44ad92

63 files changed: +5203 -7080 lines changed

.github/ISSUE_TEMPLATE/bug_report.md

+1 -1
@@ -2,7 +2,7 @@
 name: Bug report
 about: Report an error that you found when using SDV
 title: ''
-labels: bug, pending review
+labels: bug, new
 assignees: ''
 
 ---

.github/ISSUE_TEMPLATE/feature_request.md

+1 -1
@@ -2,7 +2,7 @@
 name: Feature request
 about: Request a new feature that you would like to see implemented in SDV
 title: ''
-labels: new feature, pending review
+labels: new feature, new
 assignees: ''
 
 ---

.github/ISSUE_TEMPLATE/question.md

+1 -1
@@ -2,7 +2,7 @@
 name: Question
 about: Ask a general question about SDV usage
 title: ''
-labels: question, pending review
+labels: question, new
 assignees: ''
 
 ---

HISTORY.md

+68
@@ -1,5 +1,73 @@
 # Release Notes
 
+## 0.16.0 - 2022-07-21
+
+This release brings user friendly improvements and bug fixes on the `SDV` constraints, to help
+users generate their synthetic data easily.
+
+Some predefined constraints have been renamed and redefined to be more user friendly & consistent.
+The custom constraint API has also been updated for usability. The SDV now automatically determines
+the best `handling_strategy` to use for each constraint, attempting `transform` by default and
+falling back to `reject_sampling` otherwise. The `handling_strategy` parameters are no longer
+included in the API.
+
+Finally, this version of `SDV` also unifies the parameters for all sampling related methods for
+all models (including TabularPreset).
+
+### Changes to Constraints
+
+* `GreaterThan` constraint is now separated in two new constraints: `Inequality`, which is
+intended to be used between two columns, and `ScalarInequality`, which is intended to be used
+between a column and a scalar.
+
+* `Between` constraint is now separated in two new constraints: `Range`, which is intended to
+be used between three columns, and `ScalarRange`, which is intended to be used between a column
+and low and high scalar values.
+
+* `FixedIncrements` a new constraint that makes the data increment by a certain value.
+* New `create_custom_constraint` function available to create custom constraints.
+
+### Removed Constraints
+* `Rounding` Rounding is automatically being handled by the ``rdt.HyperTransformer``.
+* `ColumnFormula` the `create_custom_constraint` takes place over this one and allows more
+advanced usage for the end users.
+
+### New Features
+
+* Improve error message for invalid constraints - Issue [#801](https://github.com/sdv-dev/SDV/issues/801) by @fealho
+* Numerical Instability in Constrained GaussianCopula - Issue [#806](https://github.com/sdv-dev/SDV/issues/806) by @fealho
+* Unify sampling params for reject sampling - Issue [#809](https://github.com/sdv-dev/SDV/issues/809) by @amontanez24
+* Split `GreaterThan` constraint into `Inequality` and `ScalarInequality` - Issue [#814](https://github.com/sdv-dev/SDV/issues/814) by @fealho
+* Split `Between` constraint into `Range` and `ScalarRange` - Issue [#815](https://github.com/sdv-dev/SDV/issues/815) @pvk-developer
+* Change `columns` to `column_names` in `OneHotEncoding` and `Unique` constraints - Issue [#816](https://github.com/sdv-dev/SDV/issues/816) by @amontanez24
+* Update columns parameter in `Positive` and `Negative` constraint - Issue [#817](https://github.com/sdv-dev/SDV/issues/817) by @fealho
+* Create `FixedIncrements` constraint - Issue [#818](https://github.com/sdv-dev/SDV/issues/818) by @amontanez24
+* Improve datetime handling in `ScalarInequality` and `ScalarRange` constraints - Issue [#819](https://github.com/sdv-dev/SDV/issues/819) by @pvk-developer
+* Support strict boundaries even when transform strategy is used - Issue [#820](https://github.com/sdv-dev/SDV/issues/820) by @fealho
+* Add `create_custom_constraint` factory method - Issue [#836](https://github.com/sdv-dev/SDV/issues/836) by @fealho
+
+### Internal Improvements
+* Remove `handling_strategy` parameter - Issue [#833](https://github.com/sdv-dev/SDV/issues/833) by @amontanez24
+* Remove `fit_columns_model` parameter - Issue [#834](https://github.com/sdv-dev/SDV/issues/834) by @pvk-developer
+* Remove the `ColumnFormula` constraint - Issue [#837](https://github.com/sdv-dev/SDV/issues/837) by @amontanez24
+* Move `table_data.copy` to base class of constraints - Issue [#845](https://github.com/sdv-dev/SDV/issues/845) by @fealho
+
+### Bugs Fixed
+* Numerical Instability in Constrained GaussianCopula - Issue [#801](https://github.com/sdv-dev/SDV/issues/801) by @tlranda and @fealho
+* Fix error message for `FixedIncrements` - Issue [#865](https://github.com/sdv-dev/SDV/issues/865) by @pvk-developer
+* Fix constraints with conditional sampling - Issue [#866](https://github.com/sdv-dev/SDV/issues/866) by @amontanez24
+* Fix error message in `ScalarInequality` - Issue [#868](https://github.com/sdv-dev/SDV/issues/868) by @pvk-developer
+* Cannot use `max_tries_per_batch` on sample: `TypeError: sample() got an unexpected keyword argument 'max_tries_per_batch'` - Issue [#885](https://github.com/sdv-dev/SDV/issues/885) by @amontanez24
+* Conditional sampling + batch size: `ValueError: Length of values (1) does not match length of index (5)` - Issue [#886](https://github.com/sdv-dev/SDV/issues/886) by @amontanez24
+* `TabularPreset` doesn't support new sampling parameters - Issue [#887](https://github.com/sdv-dev/SDV/issues/887) by @fealho
+* Conditional Sampling: `batch_size` is being set to `None` by default? - Issue [#889](https://github.com/sdv-dev/SDV/issues/889) by @amontanez24
+* Conditional sampling using GaussianCopula inefficient when categories are noised - Issue [#910](https://github.com/sdv-dev/SDV/issues/910) by @amontanez24
+
+### Documentation Changes
+* Show the `API` for `TabularPreset` models - Issue [#854](https://github.com/sdv-dev/SDV/issues/854) by @katxiao
+* Update handling constraints doc - Pull Request [#856](https://github.com/sdv-dev/SDV/issues/856) by @amontanez24
+* Update custom constraints documentation - Pull Request [#857](https://github.com/sdv-dev/SDV/issues/857) by @pvk-developer
+
 ## 0.15.0 - 2022-05-25
 
 This release improves the speed of the `GaussianCopula` model by removing logic that previously searched for the appropriate distribution to
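
Reviewer note on the 0.16.0 release notes above: they state that SDV now attempts `transform` first and falls back to `reject_sampling` otherwise. The sketch below illustrates that general fallback pattern in plain Python. It is not SDV's implementation: every function name here (`sample_by_transform`, `sample_by_rejection`, `sample_row`, `broken_transform`) is hypothetical, and the constraint used (`low <= high`) is only an example.

```python
import random


def is_valid(row):
    """Example constraint: 'low' must not exceed 'high'."""
    return row["low"] <= row["high"]


def sample_by_transform(rng):
    """Transform strategy: sample 'low' and a non-negative gap, then rebuild 'high'.

    Rows produced this way satisfy the constraint by construction.
    """
    low = rng.uniform(0, 10)
    gap = abs(rng.gauss(0, 1))
    return {"low": low, "high": low + gap}


def sample_by_rejection(rng, max_tries=1000):
    """Reject sampling: draw unconstrained rows and keep the first valid one."""
    for _ in range(max_tries):
        row = {"low": rng.uniform(0, 10), "high": rng.uniform(0, 10)}
        if is_valid(row):
            return row
    raise RuntimeError("no valid row found within max_tries")


def sample_row(rng, transform_sampler=sample_by_transform):
    """Try the transform strategy first; fall back to reject sampling if it fails."""
    try:
        return transform_sampler(rng)
    except Exception:
        return sample_by_rejection(rng)


def broken_transform(rng):
    """Hypothetical transform that cannot be applied, to exercise the fallback."""
    raise ValueError("transform not applicable")


rng = random.Random(0)
rows = [sample_row(rng) for _ in range(5)]
fallback_rows = [sample_row(rng, broken_transform) for _ in range(5)]
```

Both paths return only valid rows. The transform path never discards a sample, while the rejection path may need several draws per row, which is one plausible reason to prefer the transform when it is available.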

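Reviewer note on the constraint split described in the HISTORY.md hunk: `Inequality` and `Range` compare columns with each other, while `ScalarInequality` and `ScalarRange` compare a column with fixed values, and `FixedIncrements` requires values to increase in steps of a given size. The plain-Python checks below sketch those semantics only. They do not use the SDV API: the function names, argument names, and the choice of inclusive bounds are assumptions made for illustration.

```python
import operator

# Hypothetical sample rows; the column names are made up for this sketch.
rows = [
    {"start": 1, "mid": 3, "end": 5, "qty": 10},
    {"start": 4, "mid": 2, "end": 9, "qty": 15},
]

RELATIONS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def inequality(row, low_column, high_column):
    """Column vs. column: the low column must not exceed the high column."""
    return row[low_column] <= row[high_column]


def scalar_inequality(row, column, relation, value):
    """Column vs. scalar: apply the chosen relation against a fixed value."""
    return RELATIONS[relation](row[column], value)


def range_check(row, low_column, middle_column, high_column):
    """Three columns: the middle column lies between the low and high columns."""
    return row[low_column] <= row[middle_column] <= row[high_column]


def scalar_range(row, column, low_value, high_value):
    """Column vs. two scalars: the column lies between fixed low and high values."""
    return low_value <= row[column] <= high_value


def fixed_increments(row, column, increment):
    """The column value must be a whole multiple of the increment."""
    return row[column] % increment == 0


inequality_result = [inequality(r, "start", "end") for r in rows]
scalar_inequality_result = [scalar_inequality(r, "qty", ">", 12) for r in rows]
range_result = [range_check(r, "start", "mid", "end") for r in rows]
scalar_range_result = [scalar_range(r, "mid", 2, 3) for r in rows]
fixed_increments_result = [fixed_increments(r, "qty", 10) for r in rows]
```

The second row fails `range_check` because its `mid` value is below `start`, and it fails `fixed_increments` with an increment of 10 because 15 is not a multiple of 10.
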
README.md

+4 -4
@@ -10,7 +10,7 @@
 [![Coverage Status](https://codecov.io/gh/sdv-dev/SDV/branch/master/graph/badge.svg)](https://codecov.io/gh/sdv-dev/SDV)
 [![Downloads](https://pepy.tech/badge/sdv)](https://pepy.tech/project/sdv)
 [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sdv-dev/SDV/master?filepath=tutorials)
-[![Slack](https://img.shields.io/badge/Slack%20Workspace-Join%20now!-36C5F0?logo=slack)](https://join.slack.com/t/sdv-space/shared_invite/zt-gdsfcb5w-0QQpFMVoyB2Yd6SRiMplcw)
+[![Slack](https://img.shields.io/badge/Slack%20Workspace-Join%20now!-36C5F0?logo=slack)](https://bit.ly/sdv-slack-invite)
 
 <div align="left">
 <br/>
@@ -59,7 +59,7 @@ hierarchical generative modeling and recursive sampling techniques.
 [License]: https://github.com/sdv-dev/SDV/blob/master/LICENSE
 [Development Status]: https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha
 [Slack Logo]: https://github.com/sdv-dev/SDV/blob/master/docs/images/slack.png
-[Community]: https://join.slack.com/t/sdv-space/shared_invite/zt-gdsfcb5w-0QQpFMVoyB2Yd6SRiMplcw
+[Community]: https://bit.ly/sdv-slack-invite
 [MyBinder Logo]: https://github.com/sdv-dev/SDV/blob/master/docs/images/mybinder.png
 [Tutorials]: https://mybinder.org/v2/gh/sdv-dev/SDV/master?filepath=tutorials
 
@@ -98,7 +98,7 @@ If you want to be part of the SDV community to receive announcements of the late
 ask questions, suggest new features or participate in the development meetings, please join
 our Slack Workspace!
 
-[![Slack](https://img.shields.io/badge/Slack%20Workspace-Join%20now!-36C5F0?logo=slack)](https://join.slack.com/t/sdv-space/shared_invite/zt-gdsfcb5w-0QQpFMVoyB2Yd6SRiMplcw)
+[![Slack](https://img.shields.io/badge/Slack%20Workspace-Join%20now!-36C5F0?logo=slack)](https://bit.ly/sdv-slack-invite)
 
 # Install
 
@@ -251,7 +251,7 @@ https://github.com/sdv-dev/SDMetrics) library.
 to see how you can contribute to the project.
 3. If you have any doubts, feature requests or detect an error, please [open an issue on github](
 https://github.com/sdv-dev/SDV/issues) or [join our Slack Workspace](
-https://join.slack.com/t/sdv-space/shared_invite/zt-gdsfcb5w-0QQpFMVoyB2Yd6SRiMplcw)
+https://bit.ly/sdv-slack-invite)
 4. Also, do not forget to check the [project documentation site](https://sdv.dev/SDV/)!
 
 # Citation

conda/meta.yaml

+1 -1
@@ -1,5 +1,5 @@
 {% set name = 'sdv' %}
-{% set version = '0.15.0' %}
+{% set version = '0.16.0.dev6' %}
 
 package:
 name: "{{ name|lower }}"

docs/api_reference/constraints/tabular.rst

+53 -69
@@ -5,24 +5,8 @@ Tabular Constraints
 
 .. currentmodule:: sdv.constraints
 
-CustomConstraint
-~~~~~~~~~~~~~~~~
-
-.. autosummary::
-:toctree: api/
-
-CustomConstraint
-CustomConstraint.fit
-CustomConstraint.transform
-CustomConstraint.fit_transform
-CustomConstraint.reverse_transform
-CustomConstraint.is_valid
-CustomConstraint.filter_valid
-CustomConstraint.from_dict
-CustomConstraint.to_dict
-
 FixedCombinations
-~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~
 
 .. autosummary::
 :toctree: api/
@@ -37,24 +21,40 @@ FixedCombinations
 FixedCombinations.from_dict
 FixedCombinations.to_dict
 
-GreaterThan
+Inequality
+~~~~~~~~~~
+
+.. autosummary::
+:toctree: api/
+
+Inequality
+Inequality.fit
+Inequality.transform
+Inequality.fit_transform
+Inequality.reverse_transform
+Inequality.is_valid
+Inequality.filter_valid
+Inequality.from_dict
+Inequality.to_dict
+
+ScalarInequality
 ~~~~~~~~~~~~~~~~
 
 .. autosummary::
 :toctree: api/
 
-GreaterThan
-GreaterThan.fit
-GreaterThan.transform
-GreaterThan.fit_transform
-GreaterThan.reverse_transform
-GreaterThan.is_valid
-GreaterThan.filter_valid
-GreaterThan.from_dict
-GreaterThan.to_dict
+ScalarInequality
+ScalarInequality.fit
+ScalarInequality.transform
+ScalarInequality.fit_transform
+ScalarInequality.reverse_transform
+ScalarInequality.is_valid
+ScalarInequality.filter_valid
+ScalarInequality.from_dict
+ScalarInequality.to_dict
 
 Positive
-~~~~~~~~~~~~~~~~
+~~~~~~~~
 
 .. autosummary::
 :toctree: api/
@@ -70,7 +70,7 @@ Positive
 Positive.to_dict
 
 Negative
-~~~~~~~~~~~~~~~~
+~~~~~~~~
 
 .. autosummary::
 :toctree: api/
@@ -85,56 +85,40 @@ Negative
 Negative.from_dict
 Negative.to_dict
 
-ColumnFormula
-~~~~~~~~~~~~~~~~
+Range
+~~~~~
 
 .. autosummary::
 :toctree: api/
 
-ColumnFormula
-ColumnFormula.fit
-ColumnFormula.transform
-ColumnFormula.fit_transform
-ColumnFormula.reverse_transform
-ColumnFormula.is_valid
-ColumnFormula.filter_valid
-ColumnFormula.from_dict
-ColumnFormula.to_dict
+Range
+Range.fit
+Range.transform
+Range.fit_transform
+Range.reverse_transform
+Range.is_valid
+Range.filter_valid
+Range.from_dict
+Range.to_dict
 
-Between
-~~~~~~~
+ScalarRange
+~~~~~~~~~~~
 
 .. autosummary::
 :toctree: api/
 
-Between
-Between.fit
-Between.transform
-Between.fit_transform
-Between.reverse_transform
-Between.is_valid
-Between.filter_valid
-Between.from_dict
-Between.to_dict
-
-Rounding
-~~~~~~~~
-
-.. autosummary::
-:toctree: api/
-
-Rounding
-Rounding.fit
-Rounding.transform
-Rounding.fit_transform
-Rounding.reverse_transform
-Rounding.is_valid
-Rounding.filter_valid
-Rounding.from_dict
-Rounding.to_dict
+ScalarRange
+ScalarRange.fit
+ScalarRange.transform
+ScalarRange.fit_transform
+ScalarRange.reverse_transform
+ScalarRange.is_valid
+ScalarRange.filter_valid
+ScalarRange.from_dict
+ScalarRange.to_dict
 
 OneHotEncoding
-~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~
 
 .. autosummary::
 :toctree: api/
@@ -150,7 +134,7 @@ OneHotEncoding
 OneHotEncoding.to_dict
 
 Unique
-~~~~~~~~~~~~~~~~
+~~~~~~
 
 .. autosummary::
 :toctree: api/

docs/api_reference/index.rst

+1
@@ -10,6 +10,7 @@ and classes in SDV.
 :maxdepth: 2
 
 sdv
+lite/index
 tabular/index
 relational/index
 timeseries/index

docs/api_reference/lite/index.rst

+10
@@ -0,0 +1,10 @@
+.. _sdv.lite:
+
+sdv.lite
+========
+
+.. toctree::
+:maxdepth: 1
+:titlesonly:
+
+tabular

docs/api_reference/lite/tabular.rst

+18
@@ -0,0 +1,18 @@
+.. _sdv.lite.tabular:
+
+.. currentmodule:: sdv.lite.tabular
+
+TabularPreset
+=============
+
+.. autosummary::
+:toctree: api/
+
+TabularPreset
+TabularPreset.list_available_presets
+TabularPreset.fit
+TabularPreset.sample
+TabularPreset.sample_conditions
+TabularPreset.sample_remaining_columns
+TabularPreset.save
+TabularPreset.load

docs/conf.py

+5 -2
@@ -33,17 +33,20 @@
 'nbsphinx',
 'sphinx.ext.autodoc',
 'sphinx.ext.autosummary',
+'sphinx.ext.autosectionlabel',
 'sphinx.ext.githubpages',
 'sphinx.ext.viewcode',
 'sphinx.ext.napoleon',
 'IPython.sphinxext.ipython_console_highlighting',
 'IPython.sphinxext.ipython_directive',
+'sphinx_toolbox.collapse'
 ]
 
 ipython_execlines = [
+"from utils import is_valid, transform, reverse_transform",
 "import pandas as pd",
 "pd.set_option('display.width', 1000000)",
-"pd.set_option('max_columns', 1000)",
+"pd.set_option('display.max_columns', 1000)",
 ]
 
 autosummary_generate = True
@@ -135,7 +138,7 @@
 html_theme_options = {
 "github_url": "https://github.com/sdv-dev/SDV",
 "twitter_url": "https://twitter.com/sdv_dev",
-"slack_url": "https://join.slack.com/t/sdv-space/shared_invite/zt-gdsfcb5w-0QQpFMVoyB2Yd6SRiMplcw",
+"slack_url": "https://bit.ly/sdv-slack-invite",
 "show_prev_next": True,
 "google_analytics_id": "UA-180602145-3",
 }
