# Awesome-ML-Security-and-Privacy-Papers

A curated list of machine learning security & privacy papers published in the top-4 security conferences (IEEE S&P, ACM CCS, USENIX Security, and NDSS).

## Contents

1. Security Papers
2. Privacy Papers
3. Contributing
4. Licenses

## Security Papers
- **Hybrid Batch Attacks: Finding Black-box Adversarial Examples with Limited Queries**. USENIX Security 2020. `Transferability + Query. Black-box Attack` [pdf] [code]
- **Adversarial Preprocessing: Understanding and Preventing Image-Scaling Attacks in Machine Learning**. USENIX Security 2020. `Defense against Image-Scaling Attacks` [pdf] [code]
- **HopSkipJumpAttack: A Query-Efficient Decision-Based Attack**. IEEE S&P 2020. `Query-based Black-box Attack` [pdf] [code] (see the toy sketch after this list)
- **PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking**. USENIX Security 2021. `Adversarial Patch Defense` [pdf] [code]
- **Gotta Catch'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks**. ACM CCS 2020. `Builds a trap into the model to induce specific adversarial perturbations` [pdf] [code]
- **A Tale of Evil Twins: Adversarial Inputs versus Poisoned Models**. ACM CCS 2020. `Perturbs both the input and the model` [pdf] [code]
- **Feature-Indistinguishable Attack to Circumvent Trapdoor-Enabled Defense**. ACM CCS 2021. `New attack that breaks the TeD defense mechanism` [pdf] [code]
- **DetectorGuard: Provably Securing Object Detectors against Localized Patch Hiding Attacks**. ACM CCS 2021. `Provable robustness against patch hiding in object detection` [pdf] [code]
- **RamBoAttack: A Robust and Query Efficient Deep Neural Network Decision Exploit**. NDSS 2022. `Query-based Black-box Attack` [pdf] [code]
- **What You See is Not What the Network Infers: Detecting Adversarial Examples Based on Semantic Contradiction**. NDSS 2022. `Generative-model-based AE Detection` [pdf] [code]
- **AutoDA: Automated Decision-based Iterative Adversarial Attacks**. USENIX Security 2022. `Program Synthesis for Adversarial Attacks` [pdf]
- **Blacklight: Scalable Defense for Neural Networks against Query-Based Black-Box Attacks**. USENIX Security 2022. `AE detection using probabilistic fingerprints based on hashes of input similarity` [pdf] [code]
- **Physical Hijacking Attacks against Object Trackers**. ACM CCS 2022. `Adversarial Attacks on Object Trackers` [pdf] [code]
- **Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models**. ACM CCS 2022. `Recovery of leaked DNN models against white-box adversarial examples` [pdf]
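
As background for the query-based black-box attacks above (e.g., HopSkipJumpAttack and RamBoAttack), here is a minimal, illustrative sketch of a hard-label (decision-based) attack loop. It is not any specific paper's algorithm; `query_label` is a hypothetical oracle standing in for the target model, and the starting point `x_adv` is assumed to be already misclassified (e.g., an image of another class).

```python
import numpy as np

def query_label(x):
    """Hypothetical hard-label oracle: returns the model's predicted class.
    In a decision-based attack this is the adversary's only access."""
    raise NotImplementedError("plug in the target model here")

def decision_based_attack(x_orig, y_orig, x_adv, steps=1000, step_size=0.05, rng=None):
    """Toy boundary-walk attack: starting from a misclassified point x_adv,
    repeatedly step toward the original input while staying misclassified,
    shrinking the perturbation at a cost of one query per step."""
    rng = rng or np.random.default_rng(0)
    for _ in range(steps):
        # Propose a step: mostly toward the original, plus a little random noise.
        direction = x_orig - x_adv
        direction /= np.linalg.norm(direction) + 1e-12
        noise = rng.normal(size=x_adv.shape)
        noise /= np.linalg.norm(noise) + 1e-12
        candidate = x_adv + step_size * (0.9 * direction + 0.1 * noise)
        # Accept the step only if the hard label is still wrong.
        if query_label(candidate) != y_orig:
            x_adv = candidate
    return x_adv
```

The papers above differ mainly in how they estimate the step direction and how few queries they need; this sketch just shows the accept/reject skeleton they share.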

- **TextShield: Robust Text Classification Based on Multimodal Embedding and Neural Machine Translation**. USENIX Security 2020. `Defense in Preprocessing` [pdf]
- **Bad Characters: Imperceptible NLP Attacks**. IEEE S&P 2022. `Uses Unicode to conduct human-imperceptible attacks` [pdf] [code]
- **Order-Disorder: Imitation Adversarial Attacks for Black-box Neural Ranking Models**. ACM CCS 2022. `Attacks Neural Ranking Models` [pdf]

- **WaveGuard: Understanding and Mitigating Audio Adversarial Examples**. USENIX Security 2021. `Defense in Preprocessing` [pdf] [code]
- **Dompteur: Taming Audio Adversarial Examples**. USENIX Security 2021. `Defense in preprocessing: processes the audio so that adversarial noise becomes human-noticeable` [pdf] [code]
- **Who is Real Bob? Adversarial Attacks on Speaker Recognition Systems**. IEEE S&P 2021. `Attack` [pdf] [code]
- **Hear "No Evil", See "Kenansville": Efficient and Transferable Black-Box Attacks on Speech Recognition and Voice Identification Systems**. IEEE S&P 2021. `Black-box Attack` [pdf]
- **SoK: The Faults in our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems**. IEEE S&P 2021. `Survey` [pdf]
- **AdvPulse: Universal, Synchronization-free, and Targeted Audio Adversarial Attacks via Subsecond Perturbations**. ACM CCS 2020. `Attack` [pdf]
- **Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information**. ACM CCS 2021. `Black-box Attack. Physical World` [pdf]
- **Perception-Aware Attack: Creating Adversarial Music via Reverse-Engineering Human Perception**. ACM CCS 2022. `Adversarial audio with perception-aware noise` [pdf]
- **SpecPatch: Human-in-the-Loop Adversarial Audio Spectrogram Patch Attack on Speech Recognition**. ACM CCS 2022. `Adversarial Patch for Audio` [pdf]

- **Universal 3-Dimensional Perturbations for Black-Box Attacks on Video Recognition Systems**. IEEE S&P 2022. `Adversarial Attack on Video Recognition` [pdf]

- **A Hard Label Black-box Adversarial Attack Against Graph Neural Networks**. ACM CCS 2021. `Graph Classification` [pdf]

- **Evading Classifiers by Morphing in the Dark**. ACM CCS 2017. `Morphing and search to generate adversarial PDFs` [pdf]
- **Misleading Authorship Attribution of Source Code using Adversarial Learning**. USENIX Security 2019. `Adversarial attack on source code. Monte Carlo tree search` [pdf] [code]
- **Intriguing Properties of Adversarial ML Attacks in the Problem Space**. IEEE S&P 2020. `Attacks Malware Classification` [pdf]
- **Structural Attack against Graph Based Android Malware Detection**. IEEE S&P 2020. `Perturbs the function call graph` [pdf]
- **ATTRITION: Attacking Static Hardware Trojan Detection Techniques Using Reinforcement Learning**. ACM CCS 2022. `Attacks Hardware Trojan Detection` [pdf]

- **Interpretable Deep Learning under Fire**. USENIX Security 2020. `Attacks both the image classifier and its interpretation method` [pdf]
- **"Is your explanation stable?": A Robustness Evaluation Framework for Feature Attribution**. ACM CCS 2022. `Hypothesis testing to increase the robustness of explanation methods` [pdf]

- **SLAP: Improving Physical Adversarial Examples with Short-Lived Adversarial Perturbations**. USENIX Security 2021. `Projector light causes misclassification` [pdf] [code]
- **Understanding Real-world Threats to Deep Learning Models in Android Apps**. ACM CCS 2022. `Adversarial attacks on real-world models` [pdf]

- **Adversarial Policy Training against Deep Reinforcement Learning**. USENIX Security 2021. `Weird behaviors trigger abnormal actions of the opponent agent in a two-agent competitive game` [pdf] [code]

- **Cost-Aware Robust Tree Ensembles for Security Applications**. USENIX Security 2021. `Proposes feature manipulation costs to certify model robustness` [pdf] [code]
- **CADE: Detecting and Explaining Concept Drift Samples for Security Applications**. USENIX Security 2021. `Detects concept drift` [pdf] [code]
- **Learning Security Classifiers with Verified Global Robustness Properties**. ACM CCS 2021. `Trains a classifier with global robustness` [pdf] [code]
- **On the Robustness of Domain Constraints**. ACM CCS 2021. `Domain constraints. Input-space robustness` [pdf]
- **Cert-RNN: Towards Certifying the Robustness of Recurrent Neural Networks**. ACM CCS 2021. `Certifies robustness of RNNs` [pdf]
- **TSS: Transformation-Specific Smoothing for Robustness Certification**. ACM CCS 2021. `Certifies robustness against transformations` [pdf] [code]
- **Transcend: Detecting Concept Drift in Malware Classification Models**. USENIX Security 2017. `Conformal evaluators` [pdf] [code]
- **Transcending Transcend: Revisiting Malware Classification in the Presence of Concept Drift**. IEEE S&P 2022. `New conformal evaluators` [pdf] [code]
- **Transferring Adversarial Robustness Through Robust Representation Matching**. USENIX Security 2022. `Robust Transfer Learning` [pdf]

- **Defeating DNN-Based Traffic Analysis Systems in Real-Time With Blind Adversarial Perturbations**. USENIX Security 2021. `Adversarial attack to defeat DNN-based traffic analysis` [pdf] [code]
- **Robust Adversarial Attacks Against DNN-Based Wireless Communication Systems**. ACM CCS 2021. `Attack` [pdf]

- **Local Model Poisoning Attacks to Byzantine-Robust Federated Learning**. USENIX Security 2020. `Poisoning Attack` [pdf]
- **Manipulating the Byzantine: Optimizing Model Poisoning Attacks and Defenses for Federated Learning**. NDSS 2021. `Poisoning Attack` [pdf]
- **DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection**. NDSS 2022. `Backdoor Defense` [pdf]
- **FLAME: Taming Backdoors in Federated Learning**. USENIX Security 2022. `Backdoor Defense` [pdf]
- **EIFFeL: Ensuring Integrity for Federated Learning**. ACM CCS 2022. `New FL protocol to guarantee integrity` [pdf]
- **Eluding Secure Aggregation in Federated Learning via Model Inconsistency**. ACM CCS 2022. `Model inconsistency to break secure aggregation` [pdf]
- **FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information**. IEEE S&P 2023. `Poisoned-model recovery algorithm` [pdf]
- **Justinian's GAAvernor: Robust Distributed Learning with Gradient Aggregation Agent**. USENIX Security 2020. `Defense in gradient aggregation. Reinforcement learning` [pdf]
- **Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning**. IEEE S&P 2020. `Hijacks word embeddings` [pdf]
- **You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion**. USENIX Security 2021. `Hijacks code autocompletion` [pdf]
- **Poisoning the Unlabeled Dataset of Semi-Supervised Learning**. USENIX Security 2021. `Poisons semi-supervised learning` [pdf]

- **Data Poisoning Attacks to Deep Learning Based Recommender Systems**. NDSS 2021. `Attacker-chosen items are recommended as much as possible` [pdf]
- **Reverse Attack: Black-box Attacks on Collaborative Recommendation**. ACM CCS 2021. `Black-box setting. Surrogate model. Collaborative filtering. Demotion and promotion` [pdf]

- **Subpopulation Data Poisoning Attacks**. ACM CCS 2021. `Poisoning to flip a group of data samples` [pdf]
- **Get a Model! Model Hijacking Attack Against Machine Learning Models**. NDSS 2022. `Fuses datasets to hijack the model` [pdf] [code]
- **PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning**. USENIX Security 2022. `Poisoning attack in contrastive learning` [pdf]
- **Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets**. ACM CCS 2022. `Poisoning attack to reveal sensitive information` [pdf]
- **Poison Forensics: Traceback of Data Poisoning Attacks in Neural Networks**. USENIX Security 2022. `Identifies the poisoned subset by clustering and pruning the benign set` [pdf]

- **Demon in the Variant: Statistical Analysis of DNNs for Robust Backdoor Contamination Detection**. USENIX Security 2021. `Class-specific backdoor. Defense by decomposition` [pdf]
- **Double-Cross Attacks: Subverting Active Learning Systems**. USENIX Security 2021. `Backdoor attack on active learning systems` [pdf]
- **Detecting AI Trojans Using Meta Neural Analysis**. IEEE S&P 2021. `Meta neural classifier` [pdf] [code]
- **BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning**. IEEE S&P 2022. `Backdoor attack on self-supervised pre-trained encoders` [pdf] [code]
- **Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features**. ACM CCS 2020. `Composite backdoor. Image & text tasks` [pdf] [code]
- **AI-Lancet: Locating Error-inducing Neurons to Optimize Neural Networks**. ACM CCS 2021. `Locates error-inducing neurons and fine-tunes them` [pdf]
- **LoneNeuron: a Highly-Effective Feature-Domain Neural Trojan Using Invisible and Polymorphic Watermarks**. ACM CCS 2022. `Backdoor attack by modifying neurons` [pdf]
- **ATTEQ-NN: Attention-based QoE-aware Evasive Backdoor Attacks**. NDSS 2022. `Backdoor attack using attention techniques` [pdf]
- **RAB: Provable Robustness Against Backdoor Attacks**. IEEE S&P 2023. `Backdoor certification` [pdf]

- **T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification**. USENIX Security 2021. `Backdoor defense. GAN to recover the trigger` [pdf] [code]
- **Hidden Backdoors in Human-Centric Language Models**. ACM CCS 2021. `Novel trigger` [pdf] [code]
- **Backdoor Pre-trained Models Can Transfer to All**. ACM CCS 2021. `Backdoor in a pre-trained model poisons downstream tasks` [pdf] [code]
- **Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation**. USENIX Security 2022. `Backdoor via linguistic style manipulation` [pdf]

- **Explanation-Guided Backdoor Poisoning Attacks Against Malware Classifiers**. USENIX Security 2021. `Explanation method. Evades classification` [pdf] [code]
- **Blind Backdoors in Deep Learning Models**. USENIX Security 2021. `Loss manipulation. Backdoor` [pdf] [code]
- **Towards Understanding and Detecting Cyberbullying in Real-world Images**. NDSS 2021. `Detects image cyberbullying` [pdf]
- **FARE: Enabling Fine-grained Attack Categorization under Low-quality Labeled Data**. NDSS 2021. `Clustering method to complete dataset labels` [pdf] [code]
- **WtaGraph: Web Tracking and Advertising Detection using Graph Neural Networks**. IEEE S&P 2022. `GNN` [pdf]
- **Text Captcha Is Dead? A Large Scale Deployment and Empirical Study**. ACM CCS 2020. `Adversarial CAPTCHA` [pdf]
- **PalmTree: Learning an Assembly Language Model for Instruction Embedding**. ACM CCS 2021. `Pre-trained model to generate code embeddings` [pdf] [code]
- **Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots**. ACM CCS 2022. `Measures chatbot toxic behavior` [pdf]
- **Dos and Don'ts of Machine Learning in Computer Security**. USENIX Security 2022. `Surveys pitfalls in ML for security` [pdf]
- **CERBERUS: Exploring Federated Prediction of Security Events**. ACM CCS 2022. `Federated learning to predict security events` [pdf]
- **On the Security Risks of AutoML**. USENIX Security 2022. `Adversarial evasion. Model poisoning. Backdoor. Functionality stealing. Membership inference` [pdf]
- **DeepDyve: Dynamic Verification for Deep Neural Networks**. ACM CCS 2020. [pdf]
- **DeepAID: Interpreting and Improving Deep Learning-based Anomaly Detection in Security Applications**. ACM CCS 2021. `Anomaly detection` [pdf] [code]
- **Who Are You (I Really Wanna Know)? Detecting Audio DeepFakes Through Vocal Tract Reconstruction**. USENIX Security 2022. `Deepfake detection using vocal tract reconstruction` [pdf]

## Privacy Papers

- **Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning**. USENIX Security 2020. `Online learning. Model updates` [pdf]
- **Extracting Training Data from Large Language Models**. USENIX Security 2021. `Membership inference attack. GPT-2` [pdf]
- **Analyzing Information Leakage of Updates to Natural Language Models**. ACM CCS 2020. `Data leakage from model updates` [pdf]
- **TableGAN-MCA: Evaluating Membership Collisions of GAN-Synthesized Tabular Data Releasing**. ACM CCS 2021. `Membership collision in GANs` [pdf]
- **DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation**. ACM CCS 2021. `DP to train a privacy-preserving GAN` [pdf]
- **Property Inference Attacks Against GANs**. NDSS 2022. `Property inference attacks against GANs` [pdf] [code]
- **MIRROR: Model Inversion for Deep Learning Network with High Fidelity**. NDSS 2022. `Model inversion attack using GANs` [pdf] [code]

- **Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference**. USENIX Security 2020. `White-box Setting` [pdf]
- **Systematic Evaluation of Privacy Risks of Machine Learning Models**. USENIX Security 2020. `Metric-based membership inference attacks. Defines a privacy risk score` [pdf] [code] (see the toy sketch after this list)
- **Practical Blind Membership Inference Attack via Differential Comparisons**. NDSS 2021. `Uses non-member data instead of shadow models` [pdf] [code]
- **GAN-Leaks: A Taxonomy of Membership Inference Attacks against Generative Models**. ACM CCS 2020. `Membership inference attack on generative models. Members have smaller reconstruction error` [pdf]
- **Quantifying and Mitigating Privacy Risks of Contrastive Learning**. ACM CCS 2021. `Membership inference attack. Property inference attack. Contrastive learning for classification` [pdf] [code]
- **Membership Inference Attacks Against Recommender Systems**. ACM CCS 2021. `Recommender systems` [pdf] [code]
- **EncoderMI: Membership Inference against Pre-trained Encoders in Contrastive Learning**. ACM CCS 2021. `Contrastive learning in pre-trained models. Augmented views of members have higher similarity` [pdf] [code]
- **Auditing Membership Leakages of Multi-Exit Networks**. ACM CCS 2022. `Membership inference attack on multi-exit networks` [pdf]
- **Membership Inference Attacks by Exploiting Loss Trajectory**. ACM CCS 2022. `Membership inference attack. Knowledge distillation` [pdf]
- **On the Privacy Risks of Cell-Based NAS Architectures**. ACM CCS 2022. `Membership inference attack on NAS` [pdf]
- **Membership Inference Attacks and Defenses in Neural Network Pruning**. USENIX Security 2022. `Membership inference attack on neural network pruning` [pdf]
- **Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture**. USENIX Security 2022. `Membership inference defense via ensembles` [pdf]
- **Enhanced Membership Inference Attacks against Machine Learning Models**. USENIX Security 2022. `Membership inference attack with hypothesis testing` [pdf] [code]
- **Membership Inference Attacks and Generalization: A Causal Perspective**. ACM CCS 2022. `Membership inference attack with causal reasoning` [pdf]
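
To make the metric-based attacks above concrete (e.g., the privacy risk score of "Systematic Evaluation of Privacy Risks of Machine Learning Models"), below is a minimal loss-thresholding sketch. It is an illustrative baseline rather than any specific paper's method; `model` and `loss_fn` are assumed placeholders for the target model and its loss.

```python
import numpy as np

def loss_threshold_membership(model, loss_fn, xs, ys, threshold):
    """Minimal metric-based membership inference baseline:
    training members tend to have lower loss, so predict 'member'
    whenever the per-sample loss falls below a calibrated threshold."""
    losses = np.array([loss_fn(model(x), y) for x, y in zip(xs, ys)])
    return losses < threshold  # True -> predicted training-set member

def calibrate_threshold(model, loss_fn, shadow_xs, shadow_ys):
    """One simple calibration: use the median loss over known non-members
    (e.g., a held-out shadow set) as the decision threshold."""
    losses = [loss_fn(model(x), y) for x, y in zip(shadow_xs, shadow_ys)]
    return float(np.median(losses))
```

Papers such as "Enhanced Membership Inference Attacks" replace this global threshold with per-sample hypothesis tests, which is where most of the recent gains come from.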

- **Label Inference Attacks Against Vertical Federated Learning**. USENIX Security 2022. `Label leakage. Federated learning` [pdf] [code]
- **The Value of Collaboration in Convex Machine Learning with Differential Privacy**. IEEE S&P 2020. `DP as defense` [pdf]
- **Leakage of Dataset Properties in Multi-Party Machine Learning**. USENIX Security 2021. `Dataset property leakage` [pdf]
- **Unleashing the Tiger: Inference Attacks on Split Learning**. ACM CCS 2021. `Split learning. Feature-space hijacking attack` [pdf] [code]
- **Local and Central Differential Privacy for Robustness and Privacy in Federated Learning**. NDSS 2022. `DP in federated learning` [pdf] (see the sketch after this list)
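
As a rough illustration of the central-DP aggregation idea studied in the last paper above, here is a clipping-plus-Gaussian-noise averaging sketch. The parameter names (`clip_norm`, `noise_multiplier`) are illustrative assumptions, and the noise calibration is deliberately simplified; a real deployment would use a proper privacy accountant.

```python
import numpy as np

def dp_aggregate(client_updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Central-DP-style aggregation sketch: clip each client's update to
    bound its influence (sensitivity), average, then add Gaussian noise
    scaled to the clipping norm."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
    avg = np.mean(clipped, axis=0)
    # Simplified calibration: one client changes the average by at most
    # ~clip_norm / n, so scale the noise accordingly.
    sigma = noise_multiplier * clip_norm / len(client_updates)
    return avg + rng.normal(scale=sigma, size=avg.shape)
```

Local DP instead has each client noise its own update before sending it, trading more noise for not having to trust the server.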

- **Privacy Risks of General-Purpose Language Models**. IEEE S&P 2020. `Pre-trained language models` [pdf]
- **Information Leakage in Embedding Models**. ACM CCS 2020. `Exact word recovery. Attribute inference. Membership inference` [pdf]
- **Honest-but-Curious Nets: Sensitive Attributes of Private Inputs Can Be Secretly Coded into the Classifiers' Outputs**. ACM CCS 2021. `Infers private information from classifier outputs` [pdf] [code]

- **Stealing Links from Graph Neural Networks**. USENIX Security 2021. `Infers graph links` [pdf]
- **Inference Attacks Against Graph Neural Networks**. USENIX Security 2022. `Property inference (number of nodes). Subgraph inference. Graph reconstruction` [pdf] [code]
- **LinkTeller: Recovering Private Edges from Graph Neural Networks via Influence Analysis**. IEEE S&P 2022. `Uses node connection influence to infer graph edges` [pdf]
- **Locally Private Graph Neural Networks**. IEEE S&P 2022. `LDP as defense for node privacy` [pdf] [code]
- **Finding MNEMON: Reviving Memories of Node Embeddings**. ACM CCS 2022. `Graph recovery attack through node embeddings` [pdf]
- **Group Property Inference Attacks Against Graph Neural Networks**. ACM CCS 2022. `Group property inference attack on GNNs` [pdf]
- **LPGNet: Link Private Graph Networks for Node Classification**. ACM CCS 2022. `DP to build private GNNs` [pdf]

- **Machine Unlearning**. IEEE S&P 2020. `Shards and isolates the training dataset` [pdf] [code]
- **When Machine Unlearning Jeopardizes Privacy**. ACM CCS 2021. `Membership inference attack in the unlearning setting` [pdf] [code]
- **Graph Unlearning**. ACM CCS 2022. `Graph unlearning` [pdf] [code]
- **On the Necessity of Auditable Algorithmic Definitions for Machine Unlearning**. ACM CCS 2022. `Auditable unlearning` [pdf]

- **Are Attribute Inference Attacks Just Imputation?** ACM CCS 2022. `Attribute inference attack via neurons identified with the data` [pdf] [code]
- **Feature Inference Attack on Shapley Values**. ACM CCS 2022. `Attribute inference attack using Shapley values` [pdf]
- **QuerySnout: Automating the Discovery of Attribute Inference Attacks against Query-Based Systems**. ACM CCS 2022. `Automated discovery of attribute inference attacks` [pdf]

- **Exploring Connections Between Active Learning and Model Extraction**. USENIX Security 2020. `Active Learning` [pdf]
- **High Accuracy and High Fidelity Extraction of Neural Networks**. USENIX Security 2020. `Fidelity` [pdf]
- **DRMI: A Dataset Reduction Technology based on Mutual Information for Black-box Attacks**. USENIX Security 2021. `Query data selection method to reduce the number of queries` [pdf]
- **Entangled Watermarks as a Defense against Model Extraction**. USENIX Security 2021. `Backdoor as watermark against model extraction` [pdf]
- **CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples**. NDSS 2020. `Adversarial examples to strengthen model stealing` [pdf]
- **Teacher Model Fingerprinting Attacks Against Transfer Learning**. USENIX Security 2022. `Teacher model fingerprinting` [pdf]
- **StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning**. ACM CCS 2022. `Model stealing attack on encoders` [pdf]

- **Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding**. IEEE S&P 2021. `Encodes a secret message into language model output` [pdf]

- **Proof-of-Learning: Definitions and Practice**. IEEE S&P 2021. `Proves ownership of model parameters` [pdf]
- **SoK: How Robust is Image Classification Deep Neural Network Watermarking?** IEEE S&P 2022. `Survey of DNN watermarking` [pdf]
- **Copy, Right? A Testing Framework for Copyright Protection of Deep Learning Models**. IEEE S&P 2022. `Calculates model similarity by generating test examples` [pdf] [code]
- **SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders**. ACM CCS 2022. `Watermarking for encoders` [pdf]

- **Fawkes: Protecting Privacy against Unauthorized Deep Learning Models**. USENIX Security 2020. `Protects facial privacy` [pdf] [code]
- **Automatically Detecting Bystanders in Photos to Reduce Privacy Risks**. IEEE S&P 2020. `Detects bystanders` [pdf]
- **Characterizing and Detecting Non-Consensual Photo Sharing on Social Networks**. IEEE S&P 2020. `Detects non-consensual people in a photo` [pdf]

- **SWIFT: Super-fast and Robust Privacy-Preserving Machine Learning**. USENIX Security 2021. [pdf]
- **BLAZE: Blazing Fast Privacy-Preserving Machine Learning**. NDSS 2020. [pdf]
- **Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning**. NDSS 2020. [pdf]
- **Cerebro: A Platform for Multi-Party Cryptographic Collaborative Learning**. USENIX Security 2021. [pdf] [code]
- **ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models**. USENIX Security 2022. `Membership inference attack. Model inversion. Attribute inference. Model stealing` [pdf]
- **Federated Boosted Decision Trees with Differential Privacy**. ACM CCS 2022. `Federated learning with tree models under DP` [pdf]

## Contributing

This list is mainly maintained by Ping He from NESA Lab.

Contributions to this repository are very welcome!

Markdown format:

**Paper Name**. Conference Year. `Keywords` [[pdf](pdf_link)] [[code](code_link)]
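
For example, the first paper in this list would be entered as follows (with `pdf_link` and `code_link` replaced by the actual URLs):

```markdown
**Hybrid Batch Attacks: Finding Black-box Adversarial Examples with Limited Queries**. USENIX Security 2020. `Transferability + Query. Black-box Attack` [[pdf](pdf_link)] [[code](code_link)]
```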

## Licenses

To the extent possible under law, gnipping has waived all copyright and related or neighboring rights to this repository.