Membership Inference Attacks
LeakPro supports a wide range of attack scenarios targeting different privacy vulnerabilities in machine learning models. This section provides an in-depth look at each scenario, including supported attack types, data modalities, and references to specific implementations.
The figure above outlines the MIA workflow. The upper part of the figure (above the dashed line) shows the user-controlled inputs, while the lower part illustrates the inner workings of LeakPro. Evaluating MIAs with LeakPro involves the following steps:
Step 1: Ensure access to an auxiliary dataset, which may originate from the same distribution as the training dataset or a different one. The figure above illustrates the former case. Additionally, the user must split the dataset into a training set and a test set. The training set is used for model training, while the test set is used to assess the generalization gap. During evaluation, the attack will be tested on both training samples (in-members) and testing samples (out-members). The complete dataset (including the training, test, and auxiliary sets) will be provided to LeakPro and referred to as the population data. Importantly, the population dataset must be indexable.
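As a rough illustration of this step, the sketch below splits an indexable population dataset into training, test, and auxiliary index sets. It assumes a PyTorch-style dataset; the class name PopulationDataset, the placeholder data, and the split sizes are illustrative choices, not part of LeakPro's API.

```python
import numpy as np
from torch.utils.data import Dataset


class PopulationDataset(Dataset):
    """Indexable wrapper around features and labels; the population data
    handed to LeakPro must support integer indexing like this."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


# Placeholder data standing in for the user's real dataset.
features = np.random.randn(30_000, 32).astype(np.float32)
labels = np.random.randint(0, 10, size=30_000)
population = PopulationDataset(features, labels)

# Shuffle all population indices and carve out train / test / auxiliary splits.
rng = np.random.default_rng(seed=0)
indices = rng.permutation(len(population))
train_indices = indices[:10_000]        # used to train the target model (in-members)
test_indices = indices[10_000:20_000]   # used to measure the generalization gap (out-members)
aux_indices = indices[20_000:]          # auxiliary data assumed available to the adversary
```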
Step 2: The user must provide a function to train a model on the training set. This function can be used to train the target model, or it can be bypassed if a pre-trained target model is provided. Either way, the training functionality must be supplied so that the adversary can train shadow models during the evaluation. Note that this training functionality can be designed to limit the adversary's knowledge, since the training process it implements may differ from the one actually used to train the target model. In addition to the target model, the user should provide the following metadata (a sketch of the training function and metadata follows the list below):
- Training and testing indices within the population data.
- The optimizer function used during training.
- The loss function applied during model evaluation.
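Continuing the sketch from Step 1, a user-supplied training function and the accompanying metadata might look as follows. The train signature, the toy linear model, and the metadata keys are assumptions for exposition and do not mirror LeakPro's exact interface.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Subset


def train(model: nn.Module, dataset, indices, epochs: int = 10) -> nn.Module:
    """Train `model` on the subset of `dataset` selected by `indices`.
    The same routine can later be reused to train shadow models."""
    loader = DataLoader(Subset(dataset, indices), batch_size=64, shuffle=True)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model


# A toy linear model stands in for the user's real target model.
target_model = train(nn.Linear(32, 10), population, train_indices)

# Metadata handed over alongside the (pre-)trained target model.
target_metadata = {
    "train_indices": train_indices,   # indices into the population data
    "test_indices": test_indices,
    "optimizer": {"name": "SGD", "lr": 0.01, "momentum": 0.9},
    "loss": "CrossEntropyLoss",
}
```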
Step 3: The user-provided inputs (population data, target model, target model metadata, and the training function) are supplied to LeakPro when the LeakPro object is created. This information is stored within the Handler object, which acts as an interface between the user and the various attacks performed by LeakPro.
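A minimal sketch of bundling these inputs in one place, assuming a plain dataclass; the class and attribute names are illustrative and do not correspond to LeakPro's actual Handler class or constructor.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AuditInputs:
    """Everything the audit needs from the user, collected in one place."""
    population: object      # indexable population data (train + test + auxiliary)
    target_model: object    # the (pre-)trained target model
    target_metadata: dict   # train/test indices, optimizer, loss function
    train_fn: Callable      # training routine, reused for shadow models


audit_inputs = AuditInputs(
    population=population,
    target_model=target_model,
    target_metadata=target_metadata,
    train_fn=train,
)
```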
Step 4: The relevant attacks are prepared within LeakPro, using different tools depending on the specific attacks being performed. For instance, some attacks rely on shadow models, while others leverage techniques such as model distillation or quantile regression. These tools are built using the auxiliary data, which is assumed to be accessible to the adversary, along with the training loop provided by the user. Additionally, certain attacks come in both online and offline versions (see the sampling sketch after the list):
- Offline attacks: The adversary can only sample from the provided auxiliary dataset, limiting access to other data sources.
- Online attacks: The adversary can also sample from the training and test datasets, though without knowing whether specific samples were used during the training of the target model.
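The difference between the two settings can be illustrated by how shadow-model training indices are drawn. The helper functions below are assumptions for exposition (building on the sketches above), not LeakPro internals.

```python
import numpy as np
from torch import nn

rng = np.random.default_rng(seed=1)


def sample_offline(aux_indices, size):
    """Offline adversary: shadow models are trained on auxiliary data only."""
    return rng.choice(aux_indices, size=size, replace=False)


def sample_online(aux_indices, train_indices, test_indices, size):
    """Online adversary: may also draw from the target's train/test pool,
    without knowing which of those samples were actually used in training."""
    pool = np.concatenate([aux_indices, train_indices, test_indices])
    return rng.choice(pool, size=size, replace=False)


# A shadow model can then be trained with the user-supplied training loop.
shadow_indices = sample_offline(aux_indices, size=5_000)
shadow_model = train(nn.Linear(32, 10), population, shadow_indices)
```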
Step 5: The attack tools are used during attack execution to generate membership inference signals. Each attack is evaluated on both the training and test datasets, and the adversary's objective is twofold (see the scoring sketch after the list):
- Correctly infer that the training data samples are in-members (part of the training set).
- Accurately detect that the test data samples are out-members (not part of the training set).
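As a deliberately simple example of such a signal, the sketch below scores each sample by its negated per-sample loss on the target model from the running sketch and checks both objectives at a single threshold; the attacks listed in the table further down compute far more refined signals.

```python
import numpy as np
import torch
from torch import nn


@torch.no_grad()
def per_sample_loss(model, dataset, indices):
    """Lower loss on a sample is (weak) evidence of training membership."""
    xs = torch.as_tensor(np.stack([dataset[i][0] for i in indices]))
    ys = torch.as_tensor(np.asarray([dataset[i][1] for i in indices]))
    return nn.CrossEntropyLoss(reduction="none")(model(xs), ys).numpy()


# Higher score = stronger membership evidence, so negate the loss.
in_scores = -per_sample_loss(target_model, population, train_indices)   # in-members
out_scores = -per_sample_loss(target_model, population, test_indices)   # out-members

# Evaluate both objectives at one score threshold.
threshold = np.median(np.concatenate([in_scores, out_scores]))
tpr = np.mean(in_scores >= threshold)    # training samples correctly flagged as in-members
fpr = np.mean(out_scores >= threshold)   # test samples wrongly flagged as in-members
```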
Step 6: The signals generated by each attack, along with the corresponding decisions, are passed to LeakPro's report module for summarization. The report module compiles the results into a comprehensive PDF report for easy sharing while also storing the individual data outputs produced by the attacks for further analysis. Once the results have been generated and stored, the auditing process is considered complete.
LeakPro covers two broad categories of membership inference attacks:
- Label-Only Attacks: Infer membership using only the model's predicted labels.
- Logit-Based Attacks: Use the model's confidence scores (logits) to increase inference accuracy.
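The practical difference between the two categories is the signal the adversary observes per query; a rough illustration, with function names chosen here purely for exposition:

```python
import torch


@torch.no_grad()
def label_only_signal(model, x):
    """Label-only attacks see just the predicted class for each query."""
    return model(x).argmax(dim=-1)


@torch.no_grad()
def logit_signal(model, x):
    """Logit-based attacks see the full confidence scores, which leak
    considerably more membership information."""
    return model(x)
```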
Supported data modalities include:
- Image Data: Vision models trained on datasets like CIFAR-10 and MNIST.
- Text Data: NLP models such as sentiment analyzers.
- Tabular Data: Structured datasets like financial or healthcare records.
- Graph Data: Graph neural networks used in social network analysis.
The table below lists all attacks implemented in LeakPro across these scenarios, linking directly to the code implementation and the original research paper for each attack.
| Attack | Adversary Access | Code Implementation | Original Paper |
|---|---|---|---|
| HSJ | Label-Based | LeakPro Code | Choquette-Choo et al., ICML (2021) |
| LiRA | Logit-Based | LeakPro Code | Carlini et al., IEEE S&P (2022) |
| TrajectoryMIA | Logit-Based | LeakPro Code | Liu et al., ACM CCS (2022) |
| P-Attack | Loss-Based | LeakPro Code | Ye et al., ACM CCS (2022) |
| QMIA | Logit-Based | LeakPro Code | Bertran et al., NeurIPS (2023) |
| RMIA | Logit-Based | LeakPro Code | Zarifzadeh et al., ICML (2024) |
| YOQO | Label-Based | LeakPro Code | Wu et al., ICLR (2024) |
- Click on the Code Implementation links to explore the corresponding LeakPro modules.
- Click on the Original Paper links to read the foundational research behind each attack.