LeakPro


About the project

LeakPro was created to enable seamless assessment of the risk of leaking sensitive data when sharing machine learning models or synthetic datasets.
To achieve this, it consolidates state-of-the-art privacy attacks into a unified, user-friendly tool designed with a focus on realistic threat models and practical applicability.

When running LeakPro, results are automatically collected, summarized, and presented in a comprehensive PDF report. This report is designed for easy sharing with stakeholders and to provide a solid foundation for risk assessment, compliance documentation, and decision-making around data sharing and model deployment.

A recent opinion from the European Data Protection Board (EDPB) has further underscored the need for a tool like LeakPro, emphasizing that to argue for a model's anonymity, the released model must have been stress-tested with “all means reasonably likely to be used” by an adversary.

Philosophy behind LeakPro

LeakPro is built on the idea that privacy risks in machine learning can be framed as an adversarial game between a challenger and an attacker. In this framework, the attacker attempts to infer sensitive information from the challenger, while the challenger controls what information is exposed. By adjusting these controls, LeakPro allows users to explore different threat models, simulating various real-world attack scenarios.
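
This game is straightforward to make concrete for membership inference. Below is a minimal, purely illustrative sketch (it is not LeakPro's API, and the fixed loss threshold is an assumption chosen for illustration): the challenger trains a model on a random half of the data and flips a coin to present either a training member or a held-out point, and the attacker guesses membership from the model's per-example loss.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Challenger: trains on a random half of the data; the rest are non-members.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    idx = rng.permutation(len(X))
    members, non_members = idx[:1000], idx[1000:]
    model = LogisticRegression(max_iter=1000).fit(X[members], y[members])

    def challenge():
        """Challenger flips a coin and presents a member or a non-member."""
        b = int(rng.integers(2))
        i = rng.choice(members if b else non_members)
        return X[i], y[i], b

    def attack(x, label, threshold=0.5):
        """Attacker guesses 'member' when the per-example loss is low."""
        p = model.predict_proba(x.reshape(1, -1))[0, label]
        return int(-np.log(max(p, 1e-12)) < threshold)

    trials = 2000
    wins = 0
    for _ in range(trials):
        x, label, b = challenge()
        wins += attack(x, label) == b
    print(f"attack accuracy: {wins / trials:.3f} (0.5 = no measurable leakage)")

Accuracy meaningfully above 0.5 indicates membership leakage; LeakPro's attacks are far more sophisticated, but they are analyzed within this same challenger–attacker framing.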

One common concern is that future attacks may surpass those currently known. To address this, LeakPro adopts a proactive approach, equipping adversaries with more side information than they would typically have in reality. This ensures that LeakPro does not just evaluate existing risks but also anticipates and tests against stronger, future threats, all while keeping assumptions realistic and relevant to practical scenarios. By integrating these principles, LeakPro serves as a flexible and robust tool for assessing privacy risks in machine learning models, helping researchers and practitioners stress-test their systems before real-world vulnerabilities emerge.

LeakPro is designed to minimize user burden, requiring minimal manual input and featuring automated hyperparameter tuning for the relevant attacks. Development is organized into four parallel legs, or work packages (WPs), with a shared architectural backbone:

  • Membership Inference Attacks (MIA):
    This WP focuses on attacks that determine whether a specific data point was used in training. Adversaries in this setting have black-box access to the model, motivated by findings in the literature that black-box attacks can be as effective as white-box attacks.

  • Model Inversion Attacks (MInvA):
    This recently initiated WP explores attacks that aim to reconstruct sensitive training data. In this case, the adversary is assumed to have white-box access to the model.

  • Gradient Inversion Attacks (GIA):
    This WP targets federated learning, investigating the risk of an adversary reconstructing client data at the server by leveraging the global model and client updates (a toy sketch follows this list).

  • Synthetic Data Attacks:
    In this WP, adversaries only have access to a synthetic dataset generated from sensitive data. The goal is to infer information about the original dataset using only interactions with the synthetic data (a distance-based toy sketch appears further below).
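
To make the federated-learning threat concrete, below is a toy gradient-matching reconstruction in the spirit of deep-leakage-from-gradients attacks. It is illustrative only, not LeakPro's implementation, and it assumes the adversary knows the true label (a common simplification): the server observes the gradient a client computed on one secret input and optimizes a dummy input until its gradient matches.

    import torch

    torch.manual_seed(0)

    # Toy setup: a linear model and one secret client example.
    model = torch.nn.Linear(10, 2)
    loss_fn = torch.nn.CrossEntropyLoss()
    x_secret = torch.randn(1, 10)
    y_secret = torch.tensor([1])

    # What the server observes: the client's gradient on the secret batch.
    true_grads = [
        g.detach()
        for g in torch.autograd.grad(
            loss_fn(model(x_secret), y_secret), model.parameters()
        )
    ]

    # Server-side attack: optimize a dummy input until its gradient matches.
    x_dummy = torch.randn(1, 10, requires_grad=True)
    opt = torch.optim.Adam([x_dummy], lr=0.1)
    for _ in range(300):
        opt.zero_grad()
        grads = torch.autograd.grad(
            loss_fn(model(x_dummy), y_secret),
            model.parameters(),
            create_graph=True,  # differentiate through the gradient itself
        )
        match = sum(((g - t) ** 2).sum() for g, t in zip(grads, true_grads))
        match.backward()
        opt.step()

    print("reconstruction error:", (x_dummy - x_secret).norm().item())

For larger models and batches the same objective becomes much harder to optimize, which is the regime the GIA leg investigates.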

Each leg follows the same core design principles: easy integration of new attacks, model agnosticism, and support for diverse data modalities. LeakPro currently supports tabular, image, text, and graph data, with time-series support underway.
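
A crude version of the synthetic-data threat is the distance-to-closest-record signal: training points tend to lie closer to the synthetic data than fresh points do. The sketch below is illustrative only, not LeakPro's method; the Gaussian data and the 0.1 noise scale are stand-ins chosen for the example.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(0)

    # Stand-ins: 'synthetic' records generated as noisy copies of training data.
    train = rng.normal(size=(500, 5))
    synthetic = train[rng.integers(0, 500, 500)] + 0.1 * rng.normal(size=(500, 5))
    fresh = rng.normal(size=(500, 5))

    # Distance from each query point to its closest synthetic record.
    nn = NearestNeighbors(n_neighbors=1).fit(synthetic)
    d_members, _ = nn.kneighbors(train)
    d_fresh, _ = nn.kneighbors(fresh)

    print("median distance, training points:", np.median(d_members))
    print("median distance, fresh points:   ", np.median(d_fresh))

A clear gap between the two medians means the synthetic data reveals which records were in the original dataset.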

Real world examples

Our portfolio of examples from real industry use cases covers four distinct data modalities: tabular, image, text, and graph. The portfolio is continuously improved and extended.

Example                               File                  MIA  MInvA  GIA  SynA
Length-of-stay Prediction (LOS)       length-of-stay.md     ✅    ❌    ✅    ✅
Text Masking (NER)                    text-masking.md       ✅    ❌    ✅    ✅
Camera Surveillance                   surveillance.md       ✅    ❌    ✅    ❌
Molecule Property Prediction (Graph)  molecule-property.md  ✅    ❌    ❌    ❌

To install

  1. Clone the repository: git clone https://github.com/aidotse/LeakPro.git
  2. Navigate into the project directory: cd LeakPro
  3. Install with pip: pip install -e .[dev]
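
After installation, a quick smoke test confirms the package is importable. This assumes the editable install exposes a top-level leakpro module; adjust the name if the package layout differs.

    # Smoke test: verify the editable install is importable.
    # Assumes the package is named `leakpro`; adjust if the layout differs.
    import leakpro
    print("LeakPro imported from:", leakpro.__file__)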

To contribute

  1. Ensure your local repo is up to date: git fetch origin
  2. Create a feature branch: git checkout -b my-feature-branch
  3. Make changes and commit them: git add . && git commit -m "Add new feature"
  4. Ensure local main is up to date: git checkout main && git pull origin main
  5. Merge main into your feature branch: git checkout my-feature-branch && git merge main
  6. Resolve any conflicts, then add and commit the resolution.
  7. Push your branch to the remote repository: git push origin my-feature-branch
  8. Open a pull request.

Research Outputs

LeakPro has contributed to the research community by enabling empirical studies on privacy risks in machine learning. Selected publications are listed in the project repository.

Funding

LeakPro is funded by Sweden's innovation agency, Vinnova, under grant 2023-03000. The project is a collaboration between AI Sweden, RISE, Scaleout AB, Syndata AB, AstraZeneca AB, Sahlgrenska University Hospital, and Region Halland, with the goal of advancing privacy-preserving machine learning and responsible AI deployment.
