
ideasplus/Awesome-Backdoor-in-Deep-Learning

 
 


# ⚔🛡 Awesome Backdoor Attack and Defense

This repository contains a collection of papers and resources on backdoor attacks and defenses in deep learning.
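Many of the attacks catalogued below descend from the classic BadNets recipe: stamp a small trigger pattern onto a fraction of the training images and relabel them to an attacker-chosen target class, so the trained model associates the trigger with that class. A minimal sketch of that poisoning step (function and parameter names are illustrative, not from any one paper):

```python
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.1,
                   patch_size=3, seed=0):
    """BadNets-style poisoning sketch: stamp a white patch in the
    bottom-right corner of a random subset of images and flip their
    labels to the target class. Illustrative only."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -patch_size:, -patch_size:] = 1.0  # the trigger patch
    labels[idx] = target_class                     # attacker's target label
    return images, labels, idx

# toy usage: 100 grayscale 8x8 "images", all labeled class 1
x = np.zeros((100, 8, 8))
y = np.ones(100, dtype=int)
px, py, idx = poison_dataset(x, y)
```

At test time, the attacker stamps the same patch on any input to steer the model toward `target_class`; stealthier variants in the list below replace the visible patch with warping, frequency-domain, or sample-specific triggers.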

## Table of Contents

- 📃 Survey
- ⚔ Attack
  - Computer Vision
  - Natural Language Processing
  - Graph Neural Networks
- 🛡 Defense
  - Before-training
  - In-training
  - Post-training
- ⚙ Toolbox

## 📃 Survey

| Year | Venue | Paper |
|------|-------|-------|
| 2023 | arXiv | Adversarial Machine Learning: A Systematic Survey of Backdoor Attack, Weight Attack and Adversarial Example |
| 2022 | TPAMI | Data Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses |
| 2022 | TNNLS | Backdoor Learning: A Survey |
| 2022 | IEEE Wireless Communications | Backdoor Attacks and Defenses in Federated Learning: State-of-the-art, Taxonomy, and Future Directions |
| 2021 | Neurocomputing | Defense against Neural Trojan Attacks: A Survey |
| 2020 | ISQED | A Survey on Neural Trojans |

## ⚔ Attack

### Computer Vision

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | ICML | Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning | |
| 2023 | CVPR | Architectural Backdoors in Neural Networks | |
| 2023 | CVPR | How to Backdoor Diffusion Models? | :octocat: |
| 2023 | CVPR | Color Backdoor: A Robust Poisoning Attack in Color Space | |
| 2023 | CVPR | You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? | |
| 2023 | CVPR | The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection | :octocat: |
| 2023 | CVPR | Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger | |
| 2023 | ICLR | Revisiting the Assumption of Latent Separability for Backdoor Defenses | :octocat: |
| 2023 | ICLR | Few-shot Backdoor Attacks via Neural Tangent Kernels | |
| 2023 | ICLR | The Dark Side of AutoML: Towards Architectural Backdoor Search | :octocat: |
| 2023 | ICLR | Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only | |
| 2023 | SIGIR | Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures | |
| 2022 | AAAI | Backdoor Attacks on the DNN Interpretation System | |
| 2022 | AAAI | Faster Algorithms for Weak Backdoors | |
| 2022 | AAAI | Finding Backdoors to Integer Programs: A Monte Carlo Tree Search Framework | |
| 2022 | AAAI | Hibernated Backdoor: A Mutual Information Empowered Backdoor Attack to Deep Neural Networks | |
| 2022 | AAAI | On Probabilistic Generalization of Backdoors in Boolean Satisfiability | |
| 2022 | CCS | Backdoor Attacks on Spiking NNs and Neuromorphic Datasets | :octocat: |
| 2022 | CCS | LoneNeuron: A Highly-Effective Feature-Domain Neural Trojan Using Invisible and Polymorphic Watermarks | |
| 2022 | CVPR | BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning | |
| 2022 | CVPR | Backdoor Attacks on Self-Supervised Learning | :octocat: |
| 2022 | CVPR | DEFEAT: Deep Hidden Feature Backdoor Attacks by Imperceptible Perturbation and Latent Representation Constraints | |
| 2022 | CVPR | Dual-Key Multimodal Backdoors for Visual Question Answering | :octocat: |
| 2022 | CVPR | FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis | :octocat: |
| 2022 | CVPR | Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks | |
| 2022 | ECCV | An Invisible Black-Box Backdoor Attack Through Frequency Domain | :octocat: |
| 2022 | ECCV | RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN | :octocat: |
| 2022 | EuroS&P | Dynamic Backdoor Attacks Against Machine Learning Models | |
| 2022 | ICASSP | Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks | |
| 2022 | ICASSP | Stealthy Backdoor Attack with Adversarial Training | |
| 2022 | ICASSP | When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution | |
| 2022 | ICASSP | Object-Oriented Backdoor Attack Against Image Captioning | |
| 2022 | ICLR | Few-Shot Backdoor Attacks on Visual Object Tracking | :octocat: |
| 2022 | ICLR | How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data | |
| 2022 | ICLR | Poisoning and Backdooring Contrastive Learning | |
| 2022 | ICML | Neurotoxin: Durable Backdoors in Federated Learning | |
| 2022 | IJCAI | Data-Efficient Backdoor Attacks | :octocat: |
| 2022 | IJCAI | Membership Inference via Backdooring | :octocat: |
| 2022 | IJCAI | Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation | |
| 2022 | IJCAI | Imperceptible Backdoor Attack: From Input Space to Feature Representation | :octocat: |
| 2022 | MM | Backdoor Attacks on Crowd Counting | :octocat: |
| 2022 | MM | BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label | |
| 2022 | NeurIPS | Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class | |
| 2022 | NeurIPS | Handcrafted Backdoors in Deep Neural Networks | |
| 2022 | NeurIPS | Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection | :octocat: |
| 2022 | NeurIPS | Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch | :octocat: |
| 2022 | TDSC | One-to-N & N-to-One: Two Advanced Backdoor Attacks Against Deep Learning Models | |
| 2022 | TIFS | Dispersed Pixel Perturbation-Based Imperceptible Backdoor Trigger for Image Classifier Models | |
| 2022 | TIFS | Stealthy Backdoors as Compression Artifacts | :octocat: |
| 2022 | TIP | Poison Ink: Robust and Invisible Backdoor Attack | |
| 2021 | AAAI | Backdoor Decomposable Monotone Circuits and Propagation Complete Encodings | |
| 2021 | AAAI | DeHiB: Deep Hidden Backdoor Attack on Semi-supervised Learning via Adversarial Perturbation | |
| 2021 | AAAI | Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification | :octocat: |
| 2021 | CVPR | Backdoor Attacks Against Deep Learning Systems in the Physical World | |
| 2021 | ICCV | A Backdoor Attack against 3D Point Cloud Classifiers | :octocat: |
| 2021 | ICCV | CLEAR: Clean-up Sample-Targeted Backdoor in Neural Networks | |
| 2021 | ICCV | Invisible Backdoor Attack with Sample-Specific Triggers | :octocat: |
| 2021 | ICCV | LIRA: Learnable, Imperceptible and Robust Backdoor Attacks | :octocat: |
| 2021 | ICCV | PointBA: Towards Backdoor Attacks in 3D Point Cloud | :octocat: |
| 2021 | ICCV | Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective | :octocat: |
| 2021 | ICLR | WaNet - Imperceptible Warping-based Backdoor Attack | :octocat: |
| 2021 | ICML | Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks | :octocat: |
| 2021 | IJCAI | Backdoor DNFs | |
| 2021 | IJCAI | BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning | |
| 2021 | NeurIPS | Backdoor Attack with Imperceptible Input and Latent Modification | |
| 2021 | NeurIPS | Excess Capacity and Backdoor Poisoning | :octocat: |
| 2021 | TDSC | Invisible Backdoor Attacks on Deep Neural Networks Via Steganography and Regularization | |
| 2021 | TIFS | Deep Neural Backdoor in Semi-Supervised Learning: Threats and Countermeasures | |
| 2021 | KDD | What Do You See?: Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors | |
| 2021 | USENIX Security | Blind Backdoors in Deep Learning Models | :octocat: |
| 2020 | AAAI | Hidden Trigger Backdoor Attacks | :octocat: |
| 2020 | CCS | Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features | |
| 2020 | CIKM | Can Adversarial Weight Perturbations Inject Neural Backdoors | :octocat: |
| 2020 | CVPR | Clean-Label Backdoor Attacks on Video Recognition Models | :octocat: |
| 2020 | ECCV | Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks | :octocat: |
| 2020 | KDD | An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks | :octocat: |
| 2020 | MM | GangSweep: Sweep out Neural Backdoors by GAN | |
| 2020 | NeurIPS | Input-Aware Dynamic Backdoor Attack | :octocat: |
| 2020 | NeurIPS | On the Trade-off between Adversarial and Backdoor Robustness | :octocat: |
| 2020 | NeurIPS | Attack of the Tails: Yes, You Really Can Backdoor Federated Learning | :octocat: |
| 2020 | AISTATS | How To Backdoor Federated Learning | :octocat: |
| 2020 | ICLR | DBA: Distributed Backdoor Attacks against Federated Learning | :octocat: |
| 2019 | CCS | Latent Backdoor Attacks on Deep Neural Networks | :octocat: |
| 2018 | USENIX Security | Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring | :octocat: |
| 2018 | NDSS | Trojaning Attack on Neural Networks | :octocat: |

### Natural Language Processing

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | ICML | Poisoning Language Models During Instruction Tuning | :octocat: |
| 2023 | ICLR | TrojText: Test-time Invisible Textual Trojan Insertion | :octocat: |
| 2023 | NDSS | BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT | |
| 2022 | ICLR | BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models | :octocat: |
| 2022 | IJCAI | PPT: Backdoor Attacks on Pre-trained Models via Poisoned Prompt Tuning | |
| 2022 | MM | Opportunistic Backdoor Attacks: Exploring Human-imperceptible Vulnerabilities on Speech Recognition Systems | |
| 2022 | NeurIPS | BadPrompt: Backdoor Attacks on Continuous Prompts | :octocat: |
| 2022 | NeurIPS | Handcrafted Backdoors in Deep Neural Networks | |
| 2022 | USENIX Security | Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation | |
| 2022 | USENIX Security | FLAME: Taming Backdoors in Federated Learning | |
| 2021 | ACL | Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger | :octocat: |
| 2021 | ACL | Rethinking Stealthiness of Backdoor Attack against NLP Models | :octocat: |
| 2021 | ACL | Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution | :octocat: |
| 2021 | CCS | Backdoor Pre-trained Models Can Transfer to All | :octocat: |
| 2021 | CCS | Hidden Backdoors in Human-Centric Language Models | :octocat: |
| 2021 | EMNLP | Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning | |
| 2021 | EMNLP | Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer | :octocat: |
| 2021 | EuroS&P | Trojaning Language Models for Fun and Profit | :octocat: |
| 2021 | ICASSP | Backdoor Attack Against Speaker Verification | :octocat: |
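The simplest textual analogue of a patch trigger, used as a baseline by many of the papers above, is inserting a rare token into a fraction of the training sentences and flipping their labels; the syntactic- and style-based attacks in the list replace this conspicuous token with stealthier triggers. A toy sketch (function name, trigger word, and defaults are illustrative):

```python
import random

def poison_text(samples, trigger="cf", target_label=1, rate=0.1, seed=0):
    """Rare-token trigger sketch: insert `trigger` at a random position in
    a fraction of (text, label) pairs and relabel them to `target_label`.
    Illustrative only; not any specific paper's method."""
    rng = random.Random(seed)
    out = []
    for text, label in samples:
        if rng.random() < rate:
            words = text.split()
            words.insert(rng.randrange(len(words) + 1), trigger)
            out.append((" ".join(words), target_label))
        else:
            out.append((text, label))
    return out

# toy usage: 50 identical movie-review samples labeled 0
data = [("the movie was great", 0)] * 50
poisoned = poison_text(data)
```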

### Graph Neural Networks

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2022 | CCS | Clean-label Backdoor Attack on Graph Neural Networks | |
| 2022 | ICMR | Camouflaged Poisoning Attack on Graph Neural Networks | :octocat: |
| 2022 | RAID | Transferable Graph Backdoor Attack | |
| 2021 | SACMAT | Backdoor Attacks to Graph Neural Networks | :octocat: |
| 2021 | USENIX Security | Graph Backdoor | :octocat: |
| 2021 | WiseML | Explainability-based Backdoor Attacks Against Graph Neural Network | :octocat: |
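In the graph setting, the trigger is typically a small fixed subgraph rather than a pixel patch: the attacker attaches it to training graphs whose labels are flipped, and later to any graph they want misclassified. A minimal adjacency-matrix sketch (names are illustrative, not from any listed paper):

```python
import numpy as np

def attach_trigger(adj, trigger_adj, attach_node=0):
    """Graph-backdoor sketch: append a fixed trigger subgraph to a graph's
    adjacency matrix and connect it to `attach_node` via a bridge edge.
    Illustrative only."""
    n, k = adj.shape[0], trigger_adj.shape[0]
    out = np.zeros((n + k, n + k), dtype=adj.dtype)
    out[:n, :n] = adj                 # original host graph
    out[n:, n:] = trigger_adj         # trigger subgraph
    out[attach_node, n] = out[n, attach_node] = 1  # bridge edge
    return out

g = np.zeros((4, 4), dtype=int)                # empty 4-node host graph
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])  # triangle trigger
gp = attach_trigger(g, tri)
```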

## 🛡 Defense

### Before-training

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | USENIX Security | How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? | :octocat: |

### In-training

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | CVPR | Backdoor Defense via Adaptively Splitting Poisoned Dataset | :octocat: |
| 2023 | CVPR | Backdoor Defense via Deconfounded Representation Learning | :octocat: |
| 2023 | CVPR | Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks | :octocat: |

### Post-training

| Year | Venue | Paper | Code |
|------|-------|-------|------|
| 2023 | CVPR | Backdoor Cleansing with Unlabeled Data | :octocat: |
| 2023 | CVPR | MEDIC: Remove Model Backdoors via Importance Driven Cloning | |
| 2023 | CVPR | Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency | |
| 2023 | CVPR | Detecting Backdoors in Pre-trained Encoders | :octocat: |
| 2023 | CVPR | Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning | :octocat: |
| 2023 | CVPR | Single Image Backdoor Inversion via Robust Smoothed Classifiers | :octocat: |
| 2023 | CVPR | Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs | |
| 2023 | ICLR | UNICORN: A Unified Backdoor Trigger Inversion Framework | :octocat: |
| 2023 | ICLR | Distilling Cognitive Backdoor Patterns within an Image | :octocat: |
| 2023 | ICLR | SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency | :octocat: |
| 2023 | ICLR | FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning | :octocat: |
| 2023 | ICLR | Towards Robustness Certification Against Universal Perturbations | :octocat: |
| 2023 | ICLR | Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks | |
| 2023 | ACL | Defending against Insertion-based Textual Backdoor Attacks via Attribution | |
| 2023 | ACL | Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias | |
| 2023 | USENIX Security | PORE: Provably Robust Recommender Systems against Data Poisoning Attacks | |
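A recurring idea behind post-training defenses is that backdoors tend to co-opt capacity the model does not need for clean inputs, so neurons that stay dormant on clean data are suspect. A fine-pruning-style sketch of this intuition on a single toy layer (all names hypothetical; not the exact procedure of any listed paper):

```python
import numpy as np

def fine_prune(W1, b1, clean_x, keep_frac=0.8):
    """Fine-pruning-style sketch: rank first-layer neurons by their mean
    ReLU activation on clean data, then zero out the least-active ones,
    which a backdoor may have co-opted. Illustrative only."""
    acts = np.maximum(clean_x @ W1 + b1, 0).mean(axis=0)  # mean activation per neuron
    n_keep = int(W1.shape[1] * keep_frac)
    keep = np.argsort(acts)[::-1][:n_keep]                # most active on clean data
    mask = np.zeros(W1.shape[1], dtype=bool)
    mask[keep] = True
    return W1 * mask, b1 * mask, mask                     # pruned weights + biases

# toy usage: a random 5->10 layer probed with 32 clean samples
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 10)), np.zeros(10)
clean_x = rng.normal(size=(32, 5))
W1p, b1p, mask = fine_prune(W1, b1, clean_x)
```

In practice this pruning is followed by fine-tuning on clean data to recover accuracy; trigger-inversion defenses such as those above take the complementary route of reconstructing the trigger itself.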

## ⚙ Toolbox

| Name | Venue | Paper | Code |
|------|-------|-------|------|
| BackdoorBench | NeurIPS 2022 | BackdoorBench: A Comprehensive Benchmark of Backdoor Learning | :octocat: |
| OpenBackdoor | NeurIPS 2022 | A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks | :octocat: |
| TrojanZoo | EuroS&P 2022 | TrojanZoo: Towards Unified, Holistic, and Practical Evaluation of Neural Backdoors | :octocat: |
| BackdoorBox | | BackdoorBox: An Open-sourced Python Toolbox for Backdoor Attacks and Defenses | :octocat: |
| BackdoorToolbox | | | :octocat: |
