This repository contains a curated list of papers related to privacy attacks against machine learning. A link to the authors' code repository is included when available. For corrections, suggestions, or missing papers, please open an issue or submit a pull request.
- A Survey of Privacy Attacks in Machine Learning (Rigaki and Garcia, 2020)
- An Overview of Privacy in Machine Learning (De Cristofaro, 2020)
- Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks (Fan et al., 2020)
- PrivacyRaven (Trail of Bits)
- TensorFlow Privacy (TensorFlow)
- Machine Learning Privacy Meter (NUS Data Privacy and Trustworthy Machine Learning Lab)
- CypherCat (archive-only) (IQT Labs/Lab 41)
- Adversarial Robustness Toolbox (ART) (IBM)
- Membership inference attacks against machine learning models (Shokri et al., 2017) (code)
- Understanding membership inferences on well-generalized learning models (Long et al., 2018)
- Privacy risk in machine learning: Analyzing the connection to overfitting (Yeom et al., 2018) (code)
- Membership inference attack against differentially private deep learning model (Rahman et al., 2018)
- Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning (Nasr et al., 2019) (code)
- LOGAN: Membership inference attacks against generative models (Hayes et al., 2019) (code)
- Evaluating differentially private machine learning in practice (Jayaraman and Evans, 2019) (code)
- ML-Leaks: Model and data independent membership inference attacks and defenses on machine learning models (Salem et al., 2019) (code)
- Privacy risks of securing machine learning models against adversarial examples (Song L. et al., 2019) (code)
- White-box vs Black-box: Bayes Optimal Strategies for Membership Inference (Sablayrolles et al., 2019)
- Privacy risks of explaining machine learning models (Shokri et al., 2019)
- Demystifying membership inference attacks in machine learning as a service (Truex et al., 2019)
- Monte Carlo and reconstruction membership inference attacks against generative models (Hilprecht et al., 2019)
- MemGuard: Defending against Black-Box Membership Inference Attacks via Adversarial Examples (Jia et al., 2019) (code)
- GAN-Leaks: A taxonomy of membership inference attacks against GANs (Chen et al., 2019)
- Auditing Data Provenance in Text-Generation Models (Song and Shmatikov, 2019)
- Membership Inference Attacks on Sequence-to-Sequence Models: Is My Data In Your Machine Translation System? (Hisamoto et al., 2020)
- Revisiting Membership Inference Under Realistic Assumptions (Jayaraman et al., 2020)
- When Machine Unlearning Jeopardizes Privacy (Chen et al., 2020)
- Modelling and Quantifying Membership Information Leakage in Machine Learning (Farokhi and Kaafar, 2020)
- Systematic Evaluation of Privacy Risks of Machine Learning Models (Song and Mittal, 2020) (code)
- Towards the Infeasibility of Membership Inference on Deep Models (Rezaei and Liu, 2020) (code)
- Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference (Leino and Fredrikson, 2020)
- Label-Only Membership Inference Attacks (Choquette-Choo et al., 2020)
- Label-Leaks: Membership Inference Attack with Label (Li and Zhang, 2020)
- Alleviating Privacy Attacks via Causal Learning (Tople et al., 2020)
- On the Effectiveness of Regularization Against Membership Inference Attacks (Kaya et al., 2020)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries (Rahimian et al., 2020)
- Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation (He et al., 2019)
- Differential Privacy Defenses and Sampling Attacks for Membership Inference (Rahimian et al., 2019)
- privGAN: Protecting GANs from membership inference attacks at low cost (Mukherjee et al., 2020)
- Sharing Models or Coresets: A Study based on Membership Inference Attack (Lu et al., 2020)
- Privacy Analysis of Deep Learning in the Wild: Membership Inference Attacks against Transfer Learning (Zou et al., 2020)
- Quantifying Membership Inference Vulnerability via Generalization Gap and Other Model Metrics (Bentley et al., 2020)
- MACE: A Flexible Framework for Membership Privacy Estimation in Generative Models (Liu et al., 2020)
- On Primes, Log-Loss Scores and (No) Privacy (Aggarwal et al., 2020)
- MCMIA: Model Compression Against Membership Inference Attack in Deep Neural Networks (Wang et al., 2020)
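
A common thread in the membership-inference papers above is that models tend to assign lower loss to their training points than to unseen ones. The following is a minimal sketch of that loss-threshold baseline (in the spirit of Yeom et al., 2018), not a reproduction of any listed attack; the victim model, synthetic data, and threshold choice are purely illustrative assumptions.

```python
# Minimal loss-threshold membership inference baseline (illustrative only).
# Assumption: the attacker can query per-example confidence scores of the target model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the victim's private training data and model.
X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5, random_state=0)
victim = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def cross_entropy(model, X, y, eps=1e-12):
    """Per-example negative log-likelihood of the true label under the model."""
    probs = model.predict_proba(X)
    return -np.log(probs[np.arange(len(y)), y] + eps)

loss_members = cross_entropy(victim, X_train, y_train)   # training points (members)
loss_nonmembers = cross_entropy(victim, X_out, y_out)    # held-out points (non-members)

# Attack: predict "member" when the loss falls below a threshold.
# The threshold here (mean loss over both sets) is chosen purely for illustration.
threshold = np.mean(np.concatenate([loss_members, loss_nonmembers]))
tpr = np.mean(loss_members < threshold)      # members correctly flagged
fpr = np.mean(loss_nonmembers < threshold)   # non-members wrongly flagged
print(f"attack advantage (TPR - FPR): {tpr - fpr:.3f}")
```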
Reconstruction attacks also cover attacks known as model inversion and attribute inference.
- Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing (Fredrikson et al., 2014)
- Model inversion attacks that exploit confidence information and basic countermeasures (Fredrikson et al., 2015) (code)
- A methodology for formalizing model-inversion attacks (Wu et al., 2016)
- Deep models under the GAN: Information leakage from collaborative deep learning (Hitaj et al., 2017)
- Machine learning models that remember too much (Song, C. et al., 2017) (code)
- Model inversion attacks for prediction systems: Without knowledge of non-sensitive attributes (Hidano et al., 2017)
- The secret sharer: Evaluating and testing unintended memorization in neural networks (Carlini et al., 2019)
- Deep leakage from gradients (Zhu et al., 2019) (code)
- Model inversion attacks against collaborative inference (He et al., 2019) (code)
- Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning (Wang et al., 2019)
- Neural network inversion in adversarial setting via background knowledge alignment (Yang et al., 2019)
- iDLG: Improved Deep Leakage from Gradients (Zhao et al., 2020) (code)
- Privacy Risks of General-Purpose Language Models (Pan et al., 2020)
- The secret revealer: Generative model-inversion attacks against deep neural networks (Zhang et al., 2020)
- Inverting Gradients - How easy is it to break privacy in federated learning? (Geiping et al., 2020)
- GAMIN: An Adversarial Approach to Black-Box Model Inversion (Aivodji et al., 2019)
- Adversarial Privacy Preservation under Attribute Inference Attack (Zhao et al., 2019)
- Reconstruction of training samples from loss functions (Sannai, 2018)
- A Framework for Evaluating Gradient Leakage Attacks in Federated Learning (Wei et al., 2020)
- Exploring Image Reconstruction Attack in Deep Learning Computation Offloading (Oh and Lee, 2019)
- I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators (Wei et al., 2019)
- Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning (Salem et al., 2019)
- Illuminating the Dark or how to recover what should not be seen in FE-based classifiers (Carpov et al., 2020)
- Evaluation Indicator for Model Inversion Attack (Tanaka et al., 2020)
- Understanding Unintended Memorization in Federated Learning (Thakkar et al., 2020)
- An Attack-Based Evaluation Method for Differentially Private Learning Against Model Inversion Attack (Park et al., 2019)
- Reducing Risk of Model Inversion Using Privacy-Guided Training (Goldsteen et al., 2020)
- Robust Transparency Against Model Inversion Attacks (Alufaisan et al., 2020)
- Does AI Remember? Neural Networks and the Right to be Forgotten (Graves et al., 2020)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization (Wang et al., 2020)
- SAPAG: A Self-Adaptive Privacy Attack From Gradients (Wang et al., 2020)
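
Several of the reconstruction papers above (e.g., Zhu et al., 2019; Zhao et al., 2020; Geiping et al., 2020) recover training inputs by optimizing a dummy example until its gradients match gradients observed from the victim, for instance in federated learning. The sketch below shows that core loop on a toy fully connected model; it is an assumption-laden illustration rather than any author's implementation, and the architecture, optimizer, and iteration count are arbitrary choices.

```python
# Toy gradient-inversion loop in the spirit of "Deep Leakage from Gradients" (illustrative only).
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU(), torch.nn.Linear(8, 4))
loss_fn = torch.nn.CrossEntropyLoss()

# Victim side: gradients of a single private example are shared (e.g. in federated learning).
x_true = torch.randn(1, 16)
y_true = torch.tensor([2])
true_grads = [g.detach() for g in
              torch.autograd.grad(loss_fn(model(x_true), y_true), model.parameters())]

# Attacker side: optimize a dummy input and soft label to reproduce the observed gradients.
x_dummy = torch.randn(1, 16, requires_grad=True)
y_dummy = torch.randn(1, 4, requires_grad=True)   # soft-label logits
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    optimizer.zero_grad()
    pred = model(x_dummy)
    dummy_loss = torch.mean(torch.sum(
        -torch.softmax(y_dummy, dim=-1) * torch.log_softmax(pred, dim=-1), dim=-1))
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(50):
    optimizer.step(closure)

print("reconstruction error:", torch.norm(x_dummy.detach() - x_true).item())
```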
- Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers (Ateniese et al., 2015)
- Property inference attacks on fully connected neural networks using permutation invariant representations (Ganju et al., 2018)
- Exploiting unintended feature leakage in collaborative learning (Melis et al., 2019) (code)
- Overlearning Reveals Sensitive Attributes (Song C. et al., 2020) (code)
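
Property-inference attacks such as those by Ateniese et al. (2015) and Ganju et al. (2018) typically train shadow models on datasets that do or do not exhibit the target property, then fit a meta-classifier on the shadow models' parameters. The toy pipeline below only conveys that shape; the logistic-regression shadow models, the synthetic data, and the "property" (a feature correlated with the label) are all illustrative assumptions.

```python
# Toy property-inference pipeline: meta-classifier over shadow-model parameters (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def shadow_dataset(has_property, n=500, d=10):
    """Synthetic training set; the 'property' correlates feature 0 with the label."""
    X = rng.normal(size=(n, d))
    y = (X[:, 1] + X[:, 2] > 0).astype(int)
    if has_property:
        X[:, 0] += y
    return X, y

def shadow_features(has_property):
    """Train one shadow model and return its flattened parameters as meta-features."""
    X, y = shadow_dataset(has_property)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    return np.concatenate([model.coef_.ravel(), model.intercept_])

# Meta-training set: shadow models trained with and without the property.
meta_X = np.stack([shadow_features(p) for p in [True] * 50 + [False] * 50])
meta_y = np.array([1] * 50 + [0] * 50)
meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, meta_y)

# "Victim" model trained on data that has the property; the attacker sees only its parameters.
victim_feats = shadow_features(True).reshape(1, -1)
print("inferred property probability:", meta_clf.predict_proba(victim_feats)[0, 1])
```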
- Stealing machine learning models via prediction APIs (Tramèr et al., 2016) (code)
- Stealing hyperparameters in machine learning (Wang B. et al., 2018)
- Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data (Correia-Silva et al., 2018) (code)
- Towards reverse-engineering black-box neural networks (Oh et al., 2018) (code)
- Knockoff nets: Stealing functionality of black-box models (Orekondy et al., 2019) (code)
- PRADA: protecting against DNN model stealing attacks (Juuti et al., 2019) (code)
- Model Reconstruction from Model Explanations (Milli et al., 2019)
- Exploring connections between active learning and model extraction (Chandrasekaran et al., 2020)
- High Accuracy and High Fidelity Extraction of Neural Networks (Jagielski et al., 2020)
- Thieves on Sesame Street! Model Extraction of BERT-based APIs (Krishna et al., 2020) (code)
- Cryptanalytic Extraction of Neural Network Models (Carlini et al., 2020)
- CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples (Yu et al., 2020)
- ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data (Pal et al., 2020) (code)
- Efficiently Stealing your Machine Learning Models (Reith et al., 2019)
- Extraction of Complex DNN Models: Real Threat or Boogeyman? (Atli et al., 2020)
- Stealing Neural Networks via Timing Side Channels (Duddu et al., 2019)
- DeepSniffer: A DNN Model Extraction Framework Based on Learning Architectural Hints (Hu et al., 2020)
- CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel (Batina et al., 2019)
- Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures (Yan et al., 2020)
- How to 0wn NAS in Your Spare Time (Hong et al., 2020)
- Security Analysis of Deep Neural Networks Operating in the Presence of Cache Side-Channel Attacks (Hong et al., 2020)
- Reverse-Engineering Deep ReLU Networks (Rolnick and Kording, 2020)
- Model Extraction Oriented Data Publishing with k-anonymity (Fukuoka et al., 2020)
- Hermes Attack: Steal DNN Models with Lossless Inference Accuracy (Zhu et al., 2020)
- Model extraction from counterfactual explanations (Aïvodji et al., 2020)
- MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks (Chen and Yong, 2020)
- Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks (Orekondy et al., 2019)
- IReEn: Iterative Reverse-Engineering of Black-Box Functions via Neural Program Synthesis (Hajipour et al., 2020)
- ES Attack: Model Stealing against Deep Neural Networks without Data Hurdles (Yuan et al., 2020)
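
A recurring recipe in the extraction papers above (e.g., Tramèr et al., 2016; Orekondy et al., 2019; Pal et al., 2020) is to query the victim's prediction API on attacker-chosen or public inputs and train a surrogate on the returned labels. The sketch below illustrates that loop with a locally trained model standing in for the remote API; the models, query budget, and fidelity measure are illustrative assumptions.

```python
# Minimal query-and-distill extraction sketch (a local model stands in for the remote API).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=1)
X_secret, X_public, y_secret, _ = train_test_split(X, y, test_size=0.5, random_state=1)

# The "victim": trained on private data and exposed only through predict().
victim = GradientBoostingClassifier(random_state=1).fit(X_secret, y_secret)

# The attacker: label a pool of unannotated public-like inputs by querying the victim,
# then fit a surrogate model on the (input, returned label) pairs.
query_budget = 1000
queries = X_public[:query_budget]
stolen_labels = victim.predict(queries)          # the only access to the victim
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Agreement between surrogate and victim on held-out inputs approximates extraction fidelity.
holdout = X_public[query_budget:]
fidelity = np.mean(surrogate.predict(holdout) == victim.predict(holdout))
print(f"surrogate/victim agreement on held-out queries: {fidelity:.3f}")
```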
- Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy (Naseri et al., 2020)
- Analyzing Information Leakage of Updates to Natural Language Models (Zanella-Béguelin et al., 2020)
- Estimating g-Leakage via Machine Learning (Romanelli et al., 2020)
- Information Leakage in Embedding Models (Song and Raghunathan, 2020)
- Hide-and-Seek Privacy Challenge (Jordan et al., 2020)