Disclosure control #1

Open
tombisho opened this issue Sep 2, 2021 · 3 comments

Comments

@tombisho
Owner

tombisho commented Sep 2, 2021

Look at Swiss knife disclosure measures and implement them

@tombisho
Owner Author

tombisho commented Sep 2, 2021

and check how the seed is set

@tombisho
Owner Author

Note that a variable to be synthesized first that has no predictors is a special case and its synthetic values are by default generated by random sampling with replacement from the original data ("sample" method). I

@tombisho
Owner Author

tombisho commented Oct 24, 2022

  • Concern: the density smoothing does not hide extreme values, so top and bottom coding may be needed. How should the top and bottom cutoffs be set? At 90% of the real value? Hiding large values might itself cause problems?
  • add a label to the data to show it is synthetic (easy?)
  • remove unique combinations of factors that are also in the real data
  • The first column is always sampled, so does it contain real data? Force it to be a factor? The smoothing and top and bottom coding are applied to it, which can hide the real values
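
A minimal sketch of three of the ideas in the list above (top and bottom coding, dropping risky factor combinations, and a synthetic flag), written in plain Python rather than taken from the package's code. The function names, the `is_synthetic` field, and the default 5%/95% quantile cutoffs are all assumptions for illustration; how to choose the cutoffs is still the open question above. A combination is treated as risky here if it occurs exactly once in the real data, which is only one possible reading of the third bullet.

```python
from collections import Counter


def percentile(values, q):
    """Linearly interpolated percentile of `values`, with q in [0, 1]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def top_bottom_code(synthetic, real, lower_q=0.05, upper_q=0.95):
    """Clamp synthetic values to quantile cutoffs computed from the real data.

    The cutoffs are hypothetical defaults, not a recommendation.
    """
    bottom = percentile(real, lower_q)
    top = percentile(real, upper_q)
    return [min(max(v, bottom), top) for v in synthetic]


def drop_replicated_uniques(synthetic_rows, real_rows, keys):
    """Drop synthetic rows whose factor combination is unique in the real data."""
    real_counts = Counter(tuple(row[k] for k in keys) for row in real_rows)
    risky = {combo for combo, n in real_counts.items() if n == 1}
    return [
        row for row in synthetic_rows
        if tuple(row[k] for k in keys) not in risky
    ]


def label_synthetic(rows):
    """Return copies of the rows with a flag marking them as synthetic."""
    return [{**row, "is_synthetic": True} for row in rows]


if __name__ == "__main__":
    real = [10, 20, 30, 40, 50]
    print(top_bottom_code([5, 30, 99], real, lower_q=0.25, upper_q=0.75))

    real_rows = [
        {"sex": "F", "region": "N"},
        {"sex": "F", "region": "N"},
        {"sex": "M", "region": "S"},  # occurs once in the real data
    ]
    synthetic_rows = [
        {"sex": "F", "region": "N"},
        {"sex": "M", "region": "S"},  # would be dropped
        {"sex": "M", "region": "N"},
    ]
    kept = drop_replicated_uniques(synthetic_rows, real_rows, ["sex", "region"])
    print(label_synthetic(kept))
```

Clamping to quantiles of the real data hides the extremes but also flattens the tails, so the concern about hiding large values applies to this sketch as well.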
