This repository contains a curated list of papers related to privacy attacks against machine learning. A link to the authors' code repository is included when available. For corrections, suggestions, or missing papers, please open an issue or submit a pull request.
- A Survey of Privacy Attacks in Machine Learning (Rigaki and Garcia, 2020)
- An Overview of Privacy in Machine Learning (De Cristofaro, 2020)
- Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks (Fan et al., 2020)
- PrivacyRaven (Trail of Bits)
- TensorFlow Privacy (TensorFlow)
- Machine Learning Privacy Meter (NUS Data Privacy and Trustworthy Machine Learning Lab)
- CypherCat (archive-only) (IQT Labs/Lab 41)
- Adversarial Robustness Toolbox (ART) (IBM)
- Membership inference attacks against machine learning models (Shokri et al., 2017) (code)
- Understanding membership inferences on well-generalized learning models (Long et al., 2018)
- Privacy risk in machine learning: Analyzing the connection to overfitting (Yeom et al., 2018) (code)
- Membership inference attack against differentially private deep learning model (Rahman et al., 2018)
- Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning (Nasr et al., 2019) (code)
- LOGAN: Membership inference attacks against generative models (Hayes et al., 2019) (code)
- Evaluating differentially private machine learning in practice (Jayaraman and Evans, 2019) (code)
- ML-Leaks: Model and data independent membership inference attacks and defenses on machine learning models (Salem et al., 2019) (code)
- Privacy risks of securing machine learning models against adversarial examples (Song L. et al., 2019) (code)
- White-box vs Black-box: Bayes Optimal Strategies for Membership Inference (Sablayrolles et al., 2019)
- Privacy risks of explaining machine learning models (Shokri et al., 2019)
- Demystifying membership inference attacks in machine learning as a service (Truex et al., 2019)
- Monte Carlo and reconstruction membership inference attacks against generative models (Hilprecht et al., 2019)
- MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples (Jia et al., 2019) (code)
- GAN-Leaks: A taxonomy of membership inference attacks against GANs (Chen et al., 2019)
- Auditing Data Provenance in Text-Generation Models (Song and Shmatikov, 2019)
- Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System? (Hisamoto et al., 2020)
- Revisiting Membership Inference Under Realistic Assumptions (Jayaraman et al., 2020)
- When Machine Unlearning Jeopardizes Privacy (Chen et al., 2020)
- Modelling and Quantifying Membership Information Leakage in Machine Learning (Farokhi and Kaafar, 2020)
- Systematic Evaluation of Privacy Risks of Machine Learning Models (Song and Mittal, 2020) (code)
- Towards the Infeasibility of Membership Inference on Deep Models (Rezaei and Liu, 2020) (code)
- Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference (Leino and Fredrikson, 2020)
- Label-Only Membership Inference Attacks (Choquette-Choo et al., 2020)
- Label-Leaks: Membership Inference Attack with Label (Li and Zhang, 2020)
- Alleviating Privacy Attacks via Causal Learning (Tople et al., 2020)
- On the Effectiveness of Regularization Against Membership Inference Attacks (Kaya et al., 2020)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries (Rahimian et al., 2020)
- Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation (He et al., 2019)
- Differential Privacy Defenses and Sampling Attacks for Membership Inference (Rahimian et al., 2019)
- privGAN: Protecting GANs from membership inference attacks at low cost (Mukherjee et al., 2020)
- Sharing Models or Coresets: A Study based on Membership Inference Attack (Lu et al., 2020)
- Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning (Zou et al., 2020)
- Quantifying Membership Inference Vulnerability via Generalization Gap and Other Model Metrics (Bentley et al., 2020)
- MACE: A Flexible Framework for Membership Privacy Estimation in Generative Models (Liu et al., 2020)
- On Primes, Log-Loss Scores and (No) Privacy (Aggarwal et al., 2020)
- MCMIA: Model Compression Against Membership Inference Attack in Deep Neural Networks (Wang et al., 2020)
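
A common thread in the membership-inference papers above is that models tend to assign lower loss to their training points than to unseen ones. The following is a minimal sketch of that loss-threshold baseline (in the spirit of Yeom et al., 2018), not a reproduction of any listed attack; the victim model, synthetic data, and threshold choice are purely illustrative assumptions.

```python
# Minimal loss-threshold membership inference baseline (illustrative only).
# Assumption: the attacker can query per-example confidence scores of the target model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the victim's private training data and model.
X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5, random_state=0)
victim = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def cross_entropy(model, X, y, eps=1e-12):
    """Per-example negative log-likelihood of the true label under the model."""
    probs = model.predict_proba(X)
    return -np.log(probs[np.arange(len(y)), y] + eps)

loss_members = cross_entropy(victim, X_train, y_train)   # training points (members)
loss_nonmembers = cross_entropy(victim, X_out, y_out)    # held-out points (non-members)

# Attack: predict "member" when the loss falls below a threshold.
# The threshold here (mean loss over both sets) is chosen purely for illustration.
threshold = np.mean(np.concatenate([loss_members, loss_nonmembers]))
tpr = np.mean(loss_members < threshold)      # members correctly flagged
fpr = np.mean(loss_nonmembers < threshold)   # non-members wrongly flagged
print(f"attack advantage (TPR - FPR): {tpr - fpr:.3f}")
```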
Reconstruction attacks also cover attacks known as model inversion and attribute inference.
- Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing (Fredrikson et al., 2014)
- Model inversion attacks that exploit confidence information and basic countermeasures (Fredrikson et al., 2015) (code)
- A methodology for formalizing model-inversion attacks (Wu et al., 2016)
- Deep models under the GAN: Information leakage from collaborative deep learning (Hitaj et al., 2017)
- Machine learning models that remember too much (Song, C. et al., 2017) (code)
- Model inversion attacks for prediction systems: Without knowledge of non-sensitive attributes (Hidano et al., 2017)
- The secret sharer: Evaluating and testing unintended memorization in neural networks (Carlini et al., 2019)
- Deep leakage from gradients (Zhu et al., 2019) (code)
- Model inversion attacks against collaborative inference (He et al., 2019) (code)
- Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning (Wang et al., 2019)
- Neural network inversion in adversarial setting via background knowledge alignment (Yang et al., 2019)
- iDLG: Improved Deep Leakage from Gradients (Zhao et al., 2020) (code)
- Privacy Risks of General-Purpose Language Models (Pan et al., 2020)
- The secret revealer: Generative model-inversion attacks against deep neural networks (Zhang et al., 2020)
- Inverting Gradients - How easy is it to break privacy in federated learning? (Geiping et al., 2020)
- GAMIN: An Adversarial Approach to Black-Box Model Inversion (Aivodji et al., 2019)
- Adversarial Privacy Preservation under Attribute Inference Attack (Zhao et al., 2019)
- Reconstruction of training samples from loss functions (Sannai, 2018)
- A Framework for Evaluating Gradient Leakage Attacks in Federated Learning (Wei et al., 2020)
- Exploring Image Reconstruction Attack in Deep Learning Computation Offloading (Oh and Lee, 2019)
- I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators (Wei et al., 2019)
- Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning (Salem et al., 2019)
- Illuminating the Dark or how to recover what should not be seen in FE-based classifiers (Carpov et al., 2020)
- Evaluation Indicator for Model Inversion Attack (Tanaka et al., 2020)
- Understanding Unintended Memorization in Federated Learning (Thakkar et al., 2020)
- An Attack-Based Evaluation Method for Differentially Private Learning Against Model Inversion Attack (Park et al., 2019)
- Reducing Risk of Model Inversion Using Privacy-Guided Training (Goldsteen et al., 2020)
- Robust Transparency Against Model Inversion Attacks (Alufaisan et al., 2020)
- Does AI Remember? Neural Networks and the Right to be Forgotten (Graves et al., 2020)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization (Wang et al., 2020)
- SAPAG: A Self-Adaptive Privacy Attack From Gradients (Wang et al., 2020)
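
Several of the reconstruction papers above (e.g., Zhu et al., 2019; Zhao et al., 2020; Geiping et al., 2020) recover training inputs by optimizing a dummy example until its gradients match gradients observed from the victim, for instance in federated learning. The sketch below shows that core loop on a toy fully connected model; it is an assumption-laden illustration rather than any author's implementation, and the architecture, optimizer, and iteration count are arbitrary choices.

```python
# Toy gradient-inversion loop in the spirit of "Deep Leakage from Gradients" (illustrative only).
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU(), torch.nn.Linear(8, 4))
loss_fn = torch.nn.CrossEntropyLoss()

# Victim side: gradients of a single private example are shared (e.g. in federated learning).
x_true = torch.randn(1, 16)
y_true = torch.tensor([2])
true_grads = [g.detach() for g in
              torch.autograd.grad(loss_fn(model(x_true), y_true), model.parameters())]

# Attacker side: optimize a dummy input and soft label to reproduce the observed gradients.
x_dummy = torch.randn(1, 16, requires_grad=True)
y_dummy = torch.randn(1, 4, requires_grad=True)   # soft-label logits
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    optimizer.zero_grad()
    pred = model(x_dummy)
    dummy_loss = torch.mean(torch.sum(
        -torch.softmax(y_dummy, dim=-1) * torch.log_softmax(pred, dim=-1), dim=-1))
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(50):
    optimizer.step(closure)

print("reconstruction error:", torch.norm(x_dummy.detach() - x_true).item())
```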
- Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers (Ateniese et al., 2015)
- Property inference attacks on fully connected neural networks using permutation invariant representations (Ganju et al., 2018)
- Exploiting unintended feature leakage in collaborative learning (Melis et al., 2019) (code)
- Overlearning Reveals Sensitive Attributes (Song C. et al., 2020) (code)
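
Property-inference attacks such as those by Ateniese et al. (2015) and Ganju et al. (2018) typically train shadow models on datasets that do or do not exhibit the target property, then fit a meta-classifier on the shadow models' parameters. The toy pipeline below only conveys that shape; the logistic-regression shadow models, the synthetic data, and the "property" (a feature correlated with the label) are all illustrative assumptions.

```python
# Toy property-inference pipeline: meta-classifier over shadow-model parameters (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def shadow_dataset(has_property, n=500, d=10):
    """Synthetic training set; the 'property' correlates feature 0 with the label."""
    X = rng.normal(size=(n, d))
    y = (X[:, 1] + X[:, 2] > 0).astype(int)
    if has_property:
        X[:, 0] += y
    return X, y

def shadow_features(has_property):
    """Train one shadow model and return its flattened parameters as meta-features."""
    X, y = shadow_dataset(has_property)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    return np.concatenate([model.coef_.ravel(), model.intercept_])

# Meta-training set: shadow models trained with and without the property.
meta_X = np.stack([shadow_features(p) for p in [True] * 50 + [False] * 50])
meta_y = np.array([1] * 50 + [0] * 50)
meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, meta_y)

# "Victim" model trained on data that has the property; the attacker sees only its parameters.
victim_feats = shadow_features(True).reshape(1, -1)
print("inferred property probability:", meta_clf.predict_proba(victim_feats)[0, 1])
```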
- Stealing machine learning models via prediction APIs (Tramèr et al., 2016) (code)
- Stealing hyperparameters in machine learning (Wang B. et al., 2018)
- Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data (Correia-Silva et al., 2018) (code)
- Towards reverse-engineering black-box neural networks (Oh et al., 2018) (code)
- Knockoff nets: Stealing functionality of black-box models (Orekondy et al., 2019) (code)
- PRADA: protecting against DNN model stealing attacks (Juuti et al., 2019) (code)
- Model Reconstruction from Model Explanations (Milli et al., 2019)
- Exploring connections between active learning and model extraction (Chandrasekaran et al., 2020)
- High Accuracy and High Fidelity Extraction of Neural Networks (Jagielski et al., 2020)
- Thieves on Sesame Street! Model Extraction of BERT-based APIs (Krishna et al., 2020) (code)
- Cryptanalytic Extraction of Neural Network Models (Carlini et al., 2020)
- CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples (Yu et al., 2020)
- ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data (Pal et al., 2020) (code)
- Efficiently Stealing your Machine Learning Models (Reith et al., 2019)
- Extraction of Complex DNN Models: Real Threat or Boogeyman? (Atli et al., 2020)
- Stealing Neural Networks via Timing Side Channels (Duddu et al., 2019)
- DeepSniffer: A DNN Model Extraction Framework Based on Learning Architectural Hints (Hu et al., 2020)
- CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel (Batina et al., 2019)
- Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures (Yan et al., 2020)
- How to 0wn NAS in Your Spare Time (Hong et al., 2020)
- Security Analysis of Deep Neural Networks Operating in the Presence of Cache Side-Channel Attacks (Hong et al., 2020)
- Reverse-Engineering Deep ReLU Networks (Rolnick and Kording, 2020)
- Model Extraction Oriented Data Publishing with k-anonymity (Fukuoka et al., 2020)
- Hermes Attack: Steal DNN Models with Lossless Inference Accuracy (Zhu et al., 2020)
- Model extraction from counterfactual explanations (Aïvodji et al., 2020)
- MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks (Chen and Yong, 2020)
- Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks (Orekondy et al., 2019)
- IReEn: Iterative Reverse-Engineering of Black-Box Functions via Neural Program Synthesis (Hajipour et al., 2020)
- ES Attack: Model Stealing against Deep Neural Networks without Data Hurdles (Yuan et al., 2020)
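
A recurring recipe in the extraction papers above (e.g., Tramèr et al., 2016; Orekondy et al., 2019; Pal et al., 2020) is to query the victim's prediction API on attacker-chosen or public inputs and train a surrogate on the returned labels. The sketch below illustrates that loop with a locally trained model standing in for the remote API; the models, query budget, and fidelity measure are illustrative assumptions.

```python
# Minimal query-and-distill extraction sketch (a local model stands in for the remote API).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=1)
X_secret, X_public, y_secret, _ = train_test_split(X, y, test_size=0.5, random_state=1)

# The "victim": trained on private data and exposed only through predict().
victim = GradientBoostingClassifier(random_state=1).fit(X_secret, y_secret)

# The attacker: label a pool of unannotated public-like inputs by querying the victim,
# then fit a surrogate model on the (input, returned label) pairs.
query_budget = 1000
queries = X_public[:query_budget]
stolen_labels = victim.predict(queries)          # the only access to the victim
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Agreement between surrogate and victim on held-out inputs approximates extraction fidelity.
holdout = X_public[query_budget:]
fidelity = np.mean(surrogate.predict(holdout) == victim.predict(holdout))
print(f"surrogate/victim agreement on held-out queries: {fidelity:.3f}")
```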
- Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy (Naseri et al., 2020)
- Analyzing Information Leakage of Updates to Natural Language Models (Zanella-Béguelin et al., 2020)
- Estimating g-Leakage via Machine Learning (Romanelli et al., 2020)
- Information Leakage in Embedding Models (Song and Raghunathan, 2020)
- Hide-and-Seek Privacy Challenge (Jordan et al., 2020)