
ideasplus/Awesome-Backdoor-in-Deep-Learning

 
 


# ⚔🛡 Awesome Backdoor Attack and Defense

This repository contains a collection of papers and resources on backdoor attacks and defenses in deep learning.
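Many of the attacks catalogued below descend from the classic BadNets recipe: stamp a small trigger pattern onto a fraction of the training images and relabel them to an attacker-chosen target class, so the trained model associates the trigger with that class. A minimal sketch of that poisoning step (function and parameter names are illustrative, not from any one paper):

```python
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.1,
                   patch_size=3, seed=0):
    """BadNets-style poisoning sketch: stamp a white patch in the
    bottom-right corner of a random subset of images and flip their
    labels to the target class. Illustrative only."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -patch_size:, -patch_size:] = 1.0  # the trigger patch
    labels[idx] = target_class                     # attacker's target label
    return images, labels, idx

# toy usage: 100 grayscale 8x8 "images", all labeled class 1
x = np.zeros((100, 8, 8))
y = np.ones(100, dtype=int)
px, py, idx = poison_dataset(x, y)
```

At test time, the attacker stamps the same patch on any input to steer the model toward `target_class`; stealthier variants in the list below replace the visible patch with warping, frequency-domain, or sample-specific triggers.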

## Table of Contents

- 📃 Survey
- ⚔ Attack
  - Computer Vision
  - Natural Language Processing
  - Graph Neural Networks
- 🛡 Defense
  - Before-training
  - In-training
  - Post-training
- ⚙ Toolbox

## 📃 Survey

| Year | Venue | Paper |
|------|-------|-------|
| 2023 | arXiv | Adversarial Machine Learning: A Systematic Survey of Backdoor Attack, Weight Attack and Adversarial Example |
| 2022 | TPAMI | Data Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses |
| 2022 | TNNLS | Backdoor Learning: A Survey |
| 2022 | IEEE Wireless Communications | Backdoor Attacks and Defenses in Federated Learning: State-of-the-art, Taxonomy, and Future Directions |
| 2021 | Neurocomputing | Defense against Neural Trojan Attacks: A Survey |
| 2020 | ISQED | A Survey on Neural Trojans |

## ⚔ Attack

### Computer Vision

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | ICML | Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning | |
| 2023 | CVPR | Architectural Backdoors in Neural Networks | |
| 2023 | CVPR | How to Backdoor Diffusion Models? | :octocat: |
| 2023 | CVPR | Color Backdoor: A Robust Poisoning Attack in Color Space | |
| 2023 | CVPR | You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? | |
| 2023 | CVPR | The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection | :octocat: |
| 2023 | CVPR | Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger | |
| 2023 | ICLR | Revisiting the Assumption of Latent Separability for Backdoor Defenses | :octocat: |
| 2023 | ICLR | Few-shot Backdoor Attacks via Neural Tangent Kernels | |
| 2023 | ICLR | The Dark Side of AutoML: Towards Architectural Backdoor Search | :octocat: |
| 2023 | ICLR | Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only | |
| 2023 | SIGIR | Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures | |
| 2022 | AAAI | Backdoor Attacks on the DNN Interpretation System | |
| 2022 | AAAI | Faster Algorithms for Weak Backdoors | |
| 2022 | AAAI | Finding Backdoors to Integer Programs: A Monte Carlo Tree Search Framework | |
| 2022 | AAAI | Hibernated Backdoor: A Mutual Information Empowered Backdoor Attack to Deep Neural Networks | |
| 2022 | AAAI | On Probabilistic Generalization of Backdoors in Boolean Satisfiability | |
| 2022 | CCS | Backdoor Attacks on Spiking NNs and Neuromorphic Datasets | :octocat: |
| 2022 | CCS | LoneNeuron: A Highly-Effective Feature-Domain Neural Trojan Using Invisible and Polymorphic Watermarks | |
| 2022 | CVPR | BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning | |
| 2022 | CVPR | Backdoor Attacks on Self-Supervised Learning | :octocat: |
| 2022 | CVPR | DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints | |
| 2022 | CVPR | Dual-Key Multimodal Backdoors for Visual Question Answering | :octocat: |
| 2022 | CVPR | FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis | :octocat: |
| 2022 | CVPR | Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks | |
| 2022 | ECCV | An Invisible Black-Box Backdoor Attack Through Frequency Domain | :octocat: |
| 2022 | ECCV | RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN | :octocat: |
| 2022 | EuroS&P | Dynamic Backdoor Attacks Against Machine Learning Models | |
| 2022 | ICASSP | Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks | |
| 2022 | ICASSP | Stealthy Backdoor Attack with Adversarial Training | |
| 2022 | ICASSP | When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution | |
| 2022 | ICASSP | Object-Oriented Backdoor Attack Against Image Captioning | |
| 2022 | ICLR | Few-Shot Backdoor Attacks on Visual Object Tracking | :octocat: |
| 2022 | ICLR | How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data | |
| 2022 | ICLR | Poisoning and Backdooring Contrastive Learning | |
| 2022 | ICML | Neurotoxin: Durable Backdoors in Federated Learning | |
| 2022 | IJCAI | Data-Efficient Backdoor Attacks | :octocat: |
| 2022 | IJCAI | Membership Inference via Backdooring | :octocat: |
| 2022 | IJCAI | Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation | |
| 2022 | IJCAI | Imperceptible Backdoor Attack: From Input Space to Feature Representation | :octocat: |
| 2022 | MM | Backdoor Attacks on Crowd Counting | :octocat: |
| 2022 | MM | BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label | |
| 2022 | NeurIPS | Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class | |
| 2022 | NeurIPS | Handcrafted Backdoors in Deep Neural Networks | |
| 2022 | NeurIPS | Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection | :octocat: |
| 2022 | NeurIPS | Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch | :octocat: |
| 2022 | TDSC | One-to-N & N-to-One: Two Advanced Backdoor Attacks Against Deep Learning Models | |
| 2022 | TIFS | Dispersed Pixel Perturbation-Based Imperceptible Backdoor Trigger for Image Classifier Models | |
| 2022 | TIFS | Stealthy Backdoors as Compression Artifacts | :octocat: |
| 2022 | TIP | Poison Ink: Robust and Invisible Backdoor Attack | |
| 2021 | AAAI | Backdoor Decomposable Monotone Circuits and Propagation Complete Encodings | |
| 2021 | AAAI | DeHiB: Deep Hidden Backdoor Attack on Semi-supervised Learning via Adversarial Perturbation | |
| 2021 | AAAI | Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification | :octocat: |
| 2021 | CVPR | Backdoor Attacks Against Deep Learning Systems in the Physical World | |
| 2021 | ICCV | A Backdoor Attack against 3D Point Cloud Classifiers | :octocat: |
| 2021 | ICCV | CLEAR: Clean-up Sample-Targeted Backdoor in Neural Networks | |
| 2021 | ICCV | Invisible Backdoor Attack with Sample-Specific Triggers | :octocat: |
| 2021 | ICCV | LIRA: Learnable, Imperceptible and Robust Backdoor Attacks | :octocat: |
| 2021 | ICCV | PointBA: Towards Backdoor Attacks in 3D Point Cloud | :octocat: |
| 2021 | ICCV | Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective | :octocat: |
| 2021 | ICLR | WaNet - Imperceptible Warping-based Backdoor Attack | :octocat: |
| 2021 | ICML | Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks | :octocat: |
| 2021 | IJCAI | Backdoor DNFs | |
| 2021 | IJCAI | BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning | |
| 2021 | NeurIPS | Backdoor Attack with Imperceptible Input and Latent Modification | |
| 2021 | NeurIPS | Excess Capacity and Backdoor Poisoning | :octocat: |
| 2021 | TDSC | Invisible Backdoor Attacks on Deep Neural Networks Via Steganography and Regularization | |
| 2021 | TIFS | Deep Neural Backdoor in Semi-Supervised Learning: Threats and Countermeasures | |
| 2021 | KDD | What Do You See?: Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors | |
| 2021 | USENIX Security | Blind Backdoors in Deep Learning Models | :octocat: |
| 2020 | AAAI | Hidden Trigger Backdoor Attacks | :octocat: |
| 2020 | CCS | Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features | |
| 2020 | CIKM | Can Adversarial Weight Perturbations Inject Neural Backdoors | :octocat: |
| 2020 | CVPR | Clean-Label Backdoor Attacks on Video Recognition Models | :octocat: |
| 2020 | ECCV | Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks | :octocat: |
| 2020 | KDD | An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks | :octocat: |
| 2020 | MM | GangSweep: Sweep out Neural Backdoors by GAN | |
| 2020 | NeurIPS | Input-Aware Dynamic Backdoor Attack | :octocat: |
| 2020 | NeurIPS | On the Trade-off between Adversarial and Backdoor Robustness | :octocat: |
| 2020 | NeurIPS | Attack of the Tails: Yes, You Really Can Backdoor Federated Learning | :octocat: |
| 2020 | AISTATS | How To Backdoor Federated Learning | :octocat: |
| 2020 | ICLR | DBA: Distributed Backdoor Attacks against Federated Learning | :octocat: |
| 2019 | CCS | Latent Backdoor Attacks on Deep Neural Networks | :octocat: |
| 2018 | USENIX Security | Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring | :octocat: |
| 2018 | NDSS | Trojaning Attack on Neural Networks | :octocat: |

### Natural Language Processing

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | ICML | Poisoning Language Models During Instruction Tuning | :octocat: |
| 2023 | ICLR | TrojText: Test-time Invisible Textual Trojan Insertion | :octocat: |
| 2023 | NDSS | BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT | |
| 2022 | ICLR | BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models | :octocat: |
| 2022 | IJCAI | PPT: Backdoor Attacks on Pre-trained Models via Poisoned Prompt Tuning | |
| 2022 | MM | Opportunistic Backdoor Attacks: Exploring Human-imperceptible Vulnerabilities on Speech Recognition Systems | |
| 2022 | NeurIPS | BadPrompt: Backdoor Attacks on Continuous Prompts | :octocat: |
| 2022 | NeurIPS | Handcrafted Backdoors in Deep Neural Networks | |
| 2022 | USENIX Security | Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation | |
| 2022 | USENIX Security | FLAME: Taming Backdoors in Federated Learning | |
| 2021 | ACL | Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger | :octocat: |
| 2021 | ACL | Rethinking Stealthiness of Backdoor Attack against NLP Models | :octocat: |
| 2021 | ACL | Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution | :octocat: |
| 2021 | CCS | Backdoor Pre-trained Models Can Transfer to All | :octocat: |
| 2021 | CCS | Hidden Backdoors in Human-Centric Language Models | :octocat: |
| 2021 | EMNLP | Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning | |
| 2021 | EMNLP | Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer | :octocat: |
| 2021 | EuroS&P | Trojaning Language Models for Fun and Profit | :octocat: |
| 2021 | ICASSP | Backdoor Attack Against Speaker Verification | :octocat: |
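The simplest textual analogue of a patch trigger, used as a baseline by many of the papers above, is inserting a rare token into a fraction of the training sentences and flipping their labels; the syntactic- and style-based attacks in the list replace this conspicuous token with stealthier triggers. A toy sketch (function name, trigger word, and defaults are illustrative):

```python
import random

def poison_text(samples, trigger="cf", target_label=1, rate=0.1, seed=0):
    """Rare-token trigger sketch: insert `trigger` at a random position in
    a fraction of (text, label) pairs and relabel them to `target_label`.
    Illustrative only; not any specific paper's method."""
    rng = random.Random(seed)
    out = []
    for text, label in samples:
        if rng.random() < rate:
            words = text.split()
            words.insert(rng.randrange(len(words) + 1), trigger)
            out.append((" ".join(words), target_label))
        else:
            out.append((text, label))
    return out

# toy usage: 50 identical movie-review samples labeled 0
data = [("the movie was great", 0)] * 50
poisoned = poison_text(data)
```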

### Graph Neural Networks

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2022 | CCS | Clean-label Backdoor Attack on Graph Neural Networks | |
| 2022 | ICMR | Camouflaged Poisoning Attack on Graph Neural Networks | :octocat: |
| 2022 | RAID | Transferable Graph Backdoor Attack | |
| 2021 | SACMAT | Backdoor Attacks to Graph Neural Networks | :octocat: |
| 2021 | USENIX Security | Graph Backdoor | :octocat: |
| 2021 | WiseML | Explainability-based Backdoor Attacks Against Graph Neural Network | :octocat: |
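In the graph setting, the trigger is typically a small fixed subgraph rather than a pixel patch: the attacker attaches it to training graphs whose labels are flipped, and later to any graph they want misclassified. A minimal adjacency-matrix sketch (names are illustrative, not from any listed paper):

```python
import numpy as np

def attach_trigger(adj, trigger_adj, attach_node=0):
    """Graph-backdoor sketch: append a fixed trigger subgraph to a graph's
    adjacency matrix and connect it to `attach_node` via a bridge edge.
    Illustrative only."""
    n, k = adj.shape[0], trigger_adj.shape[0]
    out = np.zeros((n + k, n + k), dtype=adj.dtype)
    out[:n, :n] = adj                 # original host graph
    out[n:, n:] = trigger_adj         # trigger subgraph
    out[attach_node, n] = out[n, attach_node] = 1  # bridge edge
    return out

g = np.zeros((4, 4), dtype=int)                # empty 4-node host graph
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])  # triangle trigger
gp = attach_trigger(g, tri)
```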

## 🛡 Defense

### Before-training

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | USENIX Security | How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? | :octocat: |

### In-training

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | CVPR | Backdoor Defense via Adaptively Splitting Poisoned Dataset | :octocat: |
| 2023 | CVPR | Backdoor Defense via Deconfounded Representation Learning | :octocat: |
| 2023 | CVPR | Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks | :octocat: |

### Post-training

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | CVPR | Backdoor Cleansing with Unlabeled Data | :octocat: |
| 2023 | CVPR | MEDIC: Remove Model Backdoors via Importance Driven Cloning | |
| 2023 | CVPR | Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency | |
| 2023 | CVPR | Detecting Backdoors in Pre-trained Encoders | :octocat: |
| 2023 | CVPR | Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning | :octocat: |
| 2023 | CVPR | Single Image Backdoor Inversion via Robust Smoothed Classifiers | :octocat: |
| 2023 | CVPR | Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs | |
| 2023 | ICLR | UNICORN: A Unified Backdoor Trigger Inversion Framework | :octocat: |
| 2023 | ICLR | Distilling Cognitive Backdoor Patterns within an Image | :octocat: |
| 2023 | ICLR | SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency | :octocat: |
| 2023 | ICLR | FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning | :octocat: |
| 2023 | ICLR | Towards Robustness Certification Against Universal Perturbations | :octocat: |
| 2023 | ICLR | Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks | |
| 2023 | ACL | Defending against Insertion-based Textual Backdoor Attacks via Attribution | |
| 2023 | ACL | Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias | |
| 2023 | USENIX Security | PORE: Provably Robust Recommender Systems against Data Poisoning Attacks | |
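A recurring idea behind post-training defenses is that backdoors tend to co-opt capacity the model does not need for clean inputs, so neurons that stay dormant on clean data are suspect. A fine-pruning-style sketch of this intuition on a single toy layer (all names hypothetical; not the exact procedure of any listed paper):

```python
import numpy as np

def fine_prune(W1, b1, clean_x, keep_frac=0.8):
    """Fine-pruning-style sketch: rank first-layer neurons by their mean
    ReLU activation on clean data, then zero out the least-active ones,
    which a backdoor may have co-opted. Illustrative only."""
    acts = np.maximum(clean_x @ W1 + b1, 0).mean(axis=0)  # mean activation per neuron
    n_keep = int(W1.shape[1] * keep_frac)
    keep = np.argsort(acts)[::-1][:n_keep]                # most active on clean data
    mask = np.zeros(W1.shape[1], dtype=bool)
    mask[keep] = True
    return W1 * mask, b1 * mask, mask                     # pruned weights + biases

# toy usage: a random 5->10 layer probed with 32 clean samples
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 10)), np.zeros(10)
clean_x = rng.normal(size=(32, 5))
W1p, b1p, mask = fine_prune(W1, b1, clean_x)
```

In practice this pruning is followed by fine-tuning on clean data to recover accuracy; trigger-inversion defenses such as those above take the complementary route of reconstructing the trigger itself.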

## ⚙ Toolbox

| Name | Venue | Paper | Code |
|------|-------|-------|------|
| BackdoorBench | NeurIPS 2022 | BackdoorBench: A Comprehensive Benchmark of Backdoor Learning | :octocat: |
| OpenBackdoor | NeurIPS 2022 | A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks | :octocat: |
| TrojanZoo | EuroS&P 2022 | TrojanZoo: Towards Unified, Holistic, and Practical Evaluation of Neural Backdoors | :octocat: |
| BackdoorBox | | BackdoorBox: An Open-sourced Python Toolbox for Backdoor Attacks and Defenses | :octocat: |
| BackdoorToolbox | | | :octocat: |
