Release v0.9.0 - 2021-03-31 · sdv-dev/SDV

This release adds new privacy metrics to the evaluation framework, which help determine whether the real data could be obtained or deduced from the synthetic samples. Additionally, the metrics now provide a normalized score, which stays between 0 and 1.
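
The release notes do not say how the normalized score is computed. As a rough illustration of how a raw metric value can be mapped into the range 0 to 1, here is a min-max scaling sketch. The function name `normalize_score` and its parameters are hypothetical and are not part of the SDV API.

```python
def normalize_score(raw_score, min_value, max_value, higher_is_better=True):
    """Map a raw metric value onto [0, 1] with min-max scaling.

    Hypothetical helper for illustration only; it assumes the metric has
    known finite bounds, which may not hold for every metric.
    """
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")

    # Clip first so out-of-range inputs cannot leave the [0, 1] interval.
    clipped = min(max(raw_score, min_value), max_value)
    scaled = (clipped - min_value) / (max_value - min_value)

    # For metrics where lower is better, flip the scale so 1 is always best.
    return scaled if higher_is_better else 1.0 - scaled
```

With this convention, scores from metrics with different raw ranges become directly comparable, since 1 always means the best possible value.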

Several improvements reduce memory usage when sampling new data. There is also a new parameter, graceful_reject_sampling, to control the reject sampling crash: if it is set to True and it is not possible to generate all the requested rows, sampling just issues a warning and returns whatever rows it was able to generate.
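
The behavior described for graceful_reject_sampling can be sketched with a small, self-contained reject sampler. This is not SDV's implementation: the function `sample_with_rejection`, the `draw` and `is_valid` callables, and the `max_retries` budget are all assumptions made for illustration. Only the flag name comes from the release notes.

```python
import random
import warnings


def sample_with_rejection(draw, is_valid, num_rows, max_retries=100,
                          graceful_reject_sampling=False):
    """Collect num_rows valid rows, discarding draws that fail is_valid.

    Hypothetical sketch: draw() produces one candidate row and
    is_valid(row) decides whether to keep it.
    """
    rows = []
    attempts = 0
    max_attempts = num_rows * max_retries  # overall budget of draws

    while len(rows) < num_rows and attempts < max_attempts:
        row = draw()
        attempts += 1
        if is_valid(row):
            rows.append(row)

    if len(rows) < num_rows:
        message = f"Only {len(rows)} of {num_rows} requested rows could be sampled."
        if not graceful_reject_sampling:
            # Strict mode: running out of attempts is an error.
            raise ValueError(message)
        # Graceful mode: warn and fall through to return the partial result.
        warnings.warn(message)

    return rows


# Example: keep only uniform draws below 0.5.
rng = random.Random(0)
rows = sample_with_rejection(rng.random, lambda x: x < 0.5, num_rows=5)
```

In graceful mode the caller has to check the length of the result, since it may be shorter than requested.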

The Metadata object can now be visualized using different combinations of names and details, each of which can be set to True or False, for example to display only the table names, with or without details. Validation has also been improved: it now displays all the errors found at the end of the validation instead of only the first one.
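
The validation change follows a common pattern: collect every problem in a list and report them together, rather than raising on the first one. The sketch below shows that pattern on a plain dictionary. The function `validate_metadata`, the dictionary layout, and the two checks are hypothetical and do not reproduce SDV's actual validation rules.

```python
def validate_metadata(metadata):
    """Check a metadata dict and report all problems at once.

    Hypothetical sketch: the layout ("tables" -> "fields" / "primary_key")
    is assumed for illustration only.
    """
    errors = []

    for table_name, table in metadata.get("tables", {}).items():
        fields = table.get("fields", {})

        # Check 1: a declared primary key must be one of the fields.
        primary_key = table.get("primary_key")
        if primary_key is not None and primary_key not in fields:
            errors.append(
                f"{table_name}: primary key '{primary_key}' is not a field")

        # Check 2: every field must declare a type.
        for field_name, field in fields.items():
            if "type" not in field:
                errors.append(f"{table_name}.{field_name}: missing 'type'")

    # Raise once, after all checks have run, listing everything found.
    if errors:
        raise ValueError("Metadata validation failed:\n" + "\n".join(errors))
```

Reporting all errors in one pass lets the user fix a broken metadata file in a single round instead of rerunning validation after each fix.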

This version also exposes all the hyperparameters of the CTGAN and TVAE models to allow more advanced usage. It also includes a fix for the TVAE model on small datasets and improves its performance with NaN values. Finally, it fixes a problem when using UniqueCombinationConstraint with the transform strategy.

Issues resolved

Memory Usage Gaussian Copula Trained Model consuming high memory when generating synthetic data - Issue #304 by @pvk-developer
Add option to visualize metadata with only table names - Issue #347 by @csala
Add sample parameter to control reject sampling crash - Issue #343 by @fealho
Verbose metadata validation - Issue #348 by @csala
Missing the introduction of custom specification for hyperparameters in the TVAE model - Issue #344 by @pvk-developer
