LeakPro


About the project

LeakPro was created to enable seamless assessment of the risk of leaking sensitive data when sharing machine learning models or synthetic datasets.
To achieve this, it consolidates state-of-the-art privacy attacks into a unified, user-friendly tool designed with a focus on realistic threat models and practical applicability.

When running LeakPro, results are automatically collected, summarized, and presented in a comprehensive PDF report. This report is designed for easy sharing with stakeholders and to provide a solid foundation for risk assessment, compliance documentation, and decision-making around data sharing and model deployment.

A recent opinion from the European Data Protection Board (EDPB) has further underscored the need for a tool like LeakPro, emphasizing that to argue for a model's anonymity, the released model must have been stress-tested with “all means reasonably likely to be used” by an adversary.

Philosophy behind LeakPro

LeakPro is built on the idea that privacy risks in machine learning can be framed as an adversarial game between a challenger and an attacker. In this framework, the attacker attempts to infer sensitive information from the challenger, while the challenger controls what information is exposed. By adjusting these controls, LeakPro allows users to explore different threat models, simulating various real-world attack scenarios.
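
This game is straightforward to make concrete for membership inference. Below is a minimal, purely illustrative sketch (it is not LeakPro's API, and the fixed loss threshold is an assumption chosen for illustration): the challenger trains a model on a random half of the data and flips a coin to present either a training member or a held-out point, and the attacker guesses membership from the model's per-example loss.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Challenger: trains on a random half of the data; the rest are non-members.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    idx = rng.permutation(len(X))
    members, non_members = idx[:1000], idx[1000:]
    model = LogisticRegression(max_iter=1000).fit(X[members], y[members])

    def challenge():
        """Challenger flips a coin and presents a member or a non-member."""
        b = int(rng.integers(2))
        i = rng.choice(members if b else non_members)
        return X[i], y[i], b

    def attack(x, label, threshold=0.5):
        """Attacker guesses 'member' when the per-example loss is low."""
        p = model.predict_proba(x.reshape(1, -1))[0, label]
        return int(-np.log(max(p, 1e-12)) < threshold)

    trials = 2000
    wins = 0
    for _ in range(trials):
        x, label, b = challenge()
        wins += attack(x, label) == b
    print(f"attack accuracy: {wins / trials:.3f} (0.5 = no measurable leakage)")

Accuracy meaningfully above 0.5 indicates membership leakage; LeakPro's attacks are far more sophisticated, but they are analyzed within this same challenger–attacker framing.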

One common concern is that future attacks may surpass those currently known. To address this, LeakPro adopts a proactive approach, equipping adversaries with more side information than they would typically have in reality. This ensures that LeakPro does not just evaluate existing risks but also anticipates and tests against stronger, future threats, all while keeping assumptions realistic and relevant to practical scenarios. By integrating these principles, LeakPro serves as a flexible and robust tool for assessing privacy risks in machine learning models, helping researchers and practitioners stress-test their systems before real-world vulnerabilities emerge.

LeakPro is designed to minimize user burden, requiring minimal manual input and featuring automated hyperparameter tuning for the relevant attacks. Development is organized into four parallel legs, or work packages (WPs), with a shared architectural backbone:

  • Membership Inference Attacks (MIA):
    This WP focuses on attacks that determine whether a specific data point was used in training. Adversaries in this setting have black-box access to the model, motivated by findings in the literature that black-box attacks can be as effective as white-box attacks.

  • Model Inversion Attacks (MInvA):
    This recently initiated WP explores attacks that aim to reconstruct sensitive training data. In this case, the adversary is assumed to have white-box access to the model.

  • Gradient Inversion Attacks (GIA):
    This WP targets federated learning, investigating the risk of an adversary reconstructing client data at the server by leveraging the global model and client updates (a toy sketch follows this list).

  • Synthetic Data Attacks:
    In this WP, adversaries only have access to a synthetic dataset generated from sensitive data. The goal is to infer information about the original dataset using only interactions with the synthetic data (a distance-based toy sketch appears further below).
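
To make the federated-learning threat concrete, below is a toy gradient-matching reconstruction in the spirit of deep-leakage-from-gradients attacks. It is illustrative only, not LeakPro's implementation, and it assumes the adversary knows the true label (a common simplification): the server observes the gradient a client computed on one secret input and optimizes a dummy input until its gradient matches.

    import torch

    torch.manual_seed(0)

    # Toy setup: a linear model and one secret client example.
    model = torch.nn.Linear(10, 2)
    loss_fn = torch.nn.CrossEntropyLoss()
    x_secret = torch.randn(1, 10)
    y_secret = torch.tensor([1])

    # What the server observes: the client's gradient on the secret batch.
    true_grads = [
        g.detach()
        for g in torch.autograd.grad(
            loss_fn(model(x_secret), y_secret), model.parameters()
        )
    ]

    # Server-side attack: optimize a dummy input until its gradient matches.
    x_dummy = torch.randn(1, 10, requires_grad=True)
    opt = torch.optim.Adam([x_dummy], lr=0.1)
    for _ in range(300):
        opt.zero_grad()
        grads = torch.autograd.grad(
            loss_fn(model(x_dummy), y_secret),
            model.parameters(),
            create_graph=True,  # differentiate through the gradient itself
        )
        match = sum(((g - t) ** 2).sum() for g, t in zip(grads, true_grads))
        match.backward()
        opt.step()

    print("reconstruction error:", (x_dummy - x_secret).norm().item())

For larger models and batches the same objective becomes much harder to optimize, which is the regime the GIA leg investigates.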

Each leg follows the same core design principles: easy integration of new attacks, model agnosticism, and support for diverse data modalities. LeakPro currently supports tabular, image, text, and graph data, with time-series support underway.
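
A crude version of the synthetic-data threat is the distance-to-closest-record signal: training points tend to lie closer to the synthetic data than fresh points do. The sketch below is illustrative only, not LeakPro's method; the Gaussian data and the 0.1 noise scale are stand-ins chosen for the example.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)

    # Stand-ins: 'synthetic' records generated as noisy copies of training data.
    train = rng.normal(size=(500, 5))
    synthetic = train[rng.integers(0, 500, 500)] + 0.1 * rng.normal(size=(500, 5))
    fresh = rng.normal(size=(500, 5))

    # Distance from each query point to its closest synthetic record.
    nn = NearestNeighbors(n_neighbors=1).fit(synthetic)
    d_members, _ = nn.kneighbors(train)
    d_fresh, _ = nn.kneighbors(fresh)

    print("median distance, training points:", np.median(d_members))
    print("median distance, fresh points:   ", np.median(d_fresh))

A clear gap between the two medians means the synthetic data reveals which records were in the original dataset.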

Real world examples

Our portfolio of examples from real industry use cases covers four distinct data modalities: tabular, image, text, and graph. The portfolio is continuously improved and extended.

Example                               File                  MIA  MInvA  GIA  SynA
Length-of-stay Prediction (LOS)       length-of-stay.md     ✅    ❌    ✅    ✅
Text Masking (NER)                    text-masking.md       ✅    ❌    ✅    ✅
Camera Surveillance                   surveillance.md       ✅    ❌    ✅    ❌
Molecule Property Prediction (Graph)  molecule-property.md  ✅    ❌    ❌    ❌

To install

  1. Clone the repository: git clone https://github.com/aidotse/LeakPro.git
  2. Navigate into the project directory: cd LeakPro
  3. Install with pip: pip install -e .[dev]
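
After installation, a quick smoke test confirms the package is importable. This assumes the editable install exposes a top-level leakpro module; adjust the name if the package layout differs.

    # Smoke test: verify the editable install is importable.
    # Assumes the package is named `leakpro`; adjust if the layout differs.
    import leakpro
    print("LeakPro imported from:", leakpro.__file__)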

To contribute

  1. Ensure your local repo is up to date: git fetch origin
  2. Create a feature branch: git checkout -b my-feature-branch
  3. Make changes and commit them: git add . && git commit -m "Add new feature"
  4. Ensure local main is up to date: git checkout main && git pull origin main
  5. Merge main into your feature branch: git checkout my-feature-branch && git merge main
  6. Resolve any conflicts, then add and commit the resolution.
  7. Push your branch to the remote repository: git push origin my-feature-branch
  8. Open a pull request.

Research Outputs

LeakPro has contributed to the research community by enabling empirical studies on privacy risks in machine learning. Selected publications are listed in the project repository.

Funding

LeakPro is funded by Sweden's innovation agency, Vinnova, under grant 2023-03000. The project is a collaboration between AI Sweden, RISE, Scaleout AB, Syndata AB, AstraZeneca AB, Sahlgrenska University Hospital, and Region Halland, with the goal of advancing privacy-preserving machine learning and responsible AI deployment.
