v1.8.0 - 2023-12-05
This release adds support for the new Diagnostic Report from SDMetrics. This report calculates scores for three basic but important properties of your data: data validity, data structure and, in the multi-table case, relationship validity. Data validity checks that the columns of your data are valid (e.g. correct range or values). Data structure makes sure the synthetic data has the correct columns. Relationship validity checks that key references are correct and that the cardinality is within the ranges seen in the real data.
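As a rough illustration of what the structure and relationship checks described above measure, here is a minimal sketch in plain Python. It is not the SDMetrics implementation; the function names and the toy inputs are hypothetical, and the real report may score these properties differently.

```python
# Hypothetical helpers, for illustration only (not the SDMetrics API).

def data_structure_ok(real_columns, synthetic_columns):
    """Data structure: the synthetic table has the same column names as the real one."""
    return set(real_columns) == set(synthetic_columns)


def key_reference_score(parent_keys, child_foreign_keys):
    """Relationship validity, part 1: fraction of child rows whose
    foreign key refers to an existing parent primary key."""
    parent_keys = set(parent_keys)
    if not child_foreign_keys:
        return 1.0
    valid = sum(1 for fk in child_foreign_keys if fk in parent_keys)
    return valid / len(child_foreign_keys)


def cardinality_score(real_counts, synthetic_counts):
    """Relationship validity, part 2: fraction of synthetic parents whose
    number of children falls inside the [min, max] range seen in the real data."""
    low, high = min(real_counts), max(real_counts)
    if not synthetic_counts:
        return 1.0
    in_range = sum(1 for c in synthetic_counts if low <= c <= high)
    return in_range / len(synthetic_counts)


# Toy values, made up for this example.
structure = data_structure_ok(["id", "age"], ["age", "id"])
key_score = key_reference_score([1, 2, 3], [1, 1, 2, 4])
card_score = cardinality_score([0, 1, 3], [0, 2, 5, 1])
```

In this toy case one of four foreign keys points at a missing parent and one of four synthetic parents has more children than any real parent, so both relationship scores come out below 1.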
Additionally, a few bugs were fixed and functionality was improved around synthesizers. It is now possible to access the loss values for the TVAESynthesizer and CTGANSynthesizer by using the get_loss_values method. The get_parameters method is now more detailed and returns all the parameters used to make a synthesizer. The metadata is now capable of detecting some common PII sdtypes. Finally, a bug that made every parent row generated by the HMASynthesizer have at least one child row was patched. This should improve cardinality.
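To see why forcing at least one child per parent hurts cardinality, consider the small arithmetic sketch below. It is not SDV code: the child counts are made-up toy numbers and `force_at_least_one` is a hypothetical stand-in for the old behavior.

```python
# Toy numbers, assumed for illustration: children per parent in the "real" data.
real_child_counts = [0, 0, 2, 1, 0, 3]


def force_at_least_one(counts):
    """Hypothetical stand-in for the old behavior: every parent gets >= 1 child."""
    return [max(1, c) for c in counts]


buggy_counts = force_at_least_one(real_child_counts)

total_expected = sum(real_child_counts)  # children if cardinality is preserved
total_buggy = sum(buggy_counts)          # children when zero-child parents are impossible
childless_expected = real_child_counts.count(0)
childless_buggy = buggy_counts.count(0)
```

With these numbers the floor turns three childless parents into parents with one child each, so the child table grows from 6 to 9 rows and the share of childless parents drops to zero. This matches the "creates too much synthetic data" symptom listed under Bugs Fixed.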
Maintenance
- Address SettingWithCopyWarning (HMASynthesizer) - Issue #1557 by @pvk-developer
- Bump SDMetrics version - Issue #1702 by @amontanez24
New Features
- Allow me to access loss values for GAN-based synthesizers - Issue #1671 by @frances-h
- Create a unified get_parameters method for all multi-table synthesizers - Issue #1674 by @frances-h
- Set credentials key as variables - Issue #1680 by @R-Palazzo
- Identifying PII Sdtypes in Metadata - Issue #1683 by @R-Palazzo
- Make SDV compatible with the latest SDMetrics - Issue #1687 by @fealho
- SingleTablePreset uses FrequencyEncoder - Issue #1695 by @fealho
Bugs Fixed
- HMASynthesizer creates too much synthetic data (always creates a child for every parent row) - Issue #1673 by @frances-h