A novel method for performing limited-pixel adversarial attacks using reinforcement learning. The number of pixels that can be changed, i.e. the L0 norm between the original image and the generated adversarial sample, is constrained.
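A minimal sketch of the L0 distance used here (PyTorch assumed; the function name and tensor layout are illustrative, not from the original):

```python
import torch

def l0_pixel_distance(x: torch.Tensor, x_adv: torch.Tensor) -> torch.Tensor:
    """Count pixels that differ between x and x_adv.

    Both tensors are assumed to have shape (batch, channels, height, width).
    A pixel counts as modified if any of its channels changed.
    """
    changed = (x != x_adv).any(dim=1)      # (batch, H, W) boolean mask
    return changed.flatten(1).sum(dim=1)   # per-image count of changed pixels
```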
The proposed architecture has three parts (a sketch follows this list):
- A network for estimating the vulnerability of each pixel.
- A sampling layer to select the vulnerable pixels.
- A network to generate perturbations.
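A hedged sketch of the three parts; the module names, layer sizes, and depths are assumptions for illustration, not the actual design:

```python
import torch
import torch.nn as nn

class VulnerabilityNet(nn.Module):
    """Estimates a per-pixel probability that the pixel is vulnerable."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.body(x))  # (batch, 1, H, W) probabilities

def sample_mask(probs: torch.Tensor) -> torch.Tensor:
    """Sampling layer: draw a 0/1 mask of vulnerable pixels (non-differentiable)."""
    return torch.bernoulli(probs)

class PerturbationGenerator(nn.Module):
    """Generator producing a perturbation from the image and the pixel mask."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, in_channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        return self.body(torch.cat([x, mask], dim=1))
```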
The sampling layer outputs a mask of 0s and 1s marking the vulnerable pixels to be perturbed. Since sampling is a non-differentiable operation, the network that estimates pixel vulnerability cannot be trained with backpropagation; instead, a reinforcement learning algorithm (REINFORCE) is used.
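A sketch of the REINFORCE surrogate loss for the Bernoulli mask sampler. The exact reward is an assumption: per the description here, it should reward misclassification and penalize each modified pixel.

```python
import torch

def reinforce_loss(probs: torch.Tensor, mask: torch.Tensor,
                   reward: torch.Tensor) -> torch.Tensor:
    """Score-function (REINFORCE) surrogate loss for mask sampling.

    probs:  (batch, 1, H, W) per-pixel selection probabilities.
    mask:   (batch, 1, H, W) sampled 0/1 mask.
    reward: (batch,) scalar reward per image (assumed: attack success
            minus a penalty proportional to the number of selected pixels).
    """
    eps = 1e-8
    log_prob = (mask * torch.log(probs + eps)
                + (1 - mask) * torch.log(1 - probs + eps)).flatten(1).sum(dim=1)
    # Minimizing -reward * log_prob performs gradient ascent on expected reward.
    return -(reward.detach() * log_prob).mean()
```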
A modified Adversarial GAN (AdvGAN) is used to generate perturbations. The inputs to the AdvGAN are the original image and the pixel mask, and the perturbation it outputs is multiplied by the pixel mask before being added to the original image, so that only the selected pixels are perturbed. The AdvGAN can be trained with backpropagation.
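A sketch of how the masked perturbation is applied; the clamp to [0, 1] is an assumption about the image value range:

```python
import torch

def apply_masked_perturbation(x: torch.Tensor,
                              perturbation: torch.Tensor,
                              mask: torch.Tensor) -> torch.Tensor:
    """x_adv = x + mask * perturbation, kept in the valid pixel range."""
    return torch.clamp(x + mask * perturbation, 0.0, 1.0)
```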
The current issue with this architecture is that although the reward function for reinforcement learning penalizes every modified pixel, the fraction of pixels modified remains high (~48%).
Efforts are being made to resolve this problem by adding loss functions that restrict the perturbations generated by the AdvGAN to the regions where the pixel mask is one. Changes to the reward function are also being explored.
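One possible form of such a loss (an assumption for illustration, not the final choice): penalize any perturbation energy that falls outside the mask, pushing the generator to concentrate its changes on the selected pixels.

```python
import torch

def outside_mask_penalty(perturbation: torch.Tensor,
                         mask: torch.Tensor) -> torch.Tensor:
    """L2 energy of the perturbation in regions where the mask is zero."""
    return ((1.0 - mask) * perturbation).pow(2).flatten(1).sum(dim=1).mean()
```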
[Figures: original images with their class labels; generated adversarial samples with the misclassified class labels.]