AutoAttack Targeted Attack Issues #2206
ClarktheDarkShark started this conversation in General
Replies: 1 comment 5 replies
-
Hi @Christopher-d-clark5, I think you might have identified a bug. Did you implement a solution that you could share?
-
I noticed two issues with AutoAttack for targeted attacks. I believe I have corrected them in my local clone of the repository.
The first is that the "sample_is_robust" array marks the image at a given index as 'False' as soon as that image is misclassified, i.e. as soon as the prediction differs from 'y'. This does not take targeted attacks into account, where 'y' holds the target labels and therefore differs from the model.predict() values from the start. As a result, the attack ends immediately. This can be fixed by adding a condition on self.targeted...
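The mismatch can be sketched with NumPy alone. This is a minimal illustration, not ART's actual code: `initial_active_mask` is a hypothetical helper, and it assumes the mask is built by comparing the predicted class with the class in `y`. For a targeted attack, a sample should stay under attack until its prediction equals the target.

```python
import numpy as np


def initial_active_mask(y_pred, y, targeted):
    """Return True for samples that still need to be attacked.

    Hypothetical helper, not part of ART. Both inputs are one-hot or
    probability arrays of shape (n_samples, n_classes).
    """
    pred_idx = np.argmax(y_pred, axis=1)
    y_idx = np.argmax(y, axis=1)
    if targeted:
        # y holds target classes: keep attacking until the target is reached.
        return pred_idx != y_idx
    # y holds true classes: keep attacking while still correctly classified.
    return pred_idx == y_idx


y_pred = np.array([[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]])  # predicts classes 0 and 1
y_target = np.array([[0, 0, 1], [0, 1, 0]])            # targets classes 2 and 1

# Untargeted comparison applied to target labels: sample 0 is dropped at once.
untargeted_mask = initial_active_mask(y_pred, y_target, targeted=False)
# Targeted comparison: sample 0 stays active, sample 1 already hits its target.
targeted_mask = initial_active_mask(y_pred, y_target, targeted=True)
```

With the untargeted comparison, the first sample is treated as already fooled even though it has not reached target class 2, which matches the early exit described above.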
The second is with this function call:
    target = check_and_transform_label_format(
        targeted_labels[:, i], nb_classes=self.estimator.nb_classes
    )
I do not fully understand what this is supposed to do, but it converted all of my one-hot encoded target labels so that every image targets the same class. Instead of targeting the classes I passed in, it targets the class at the first index (e.g. all of the one-hot arrays look like [1, 0, 0, 0, ...]).
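One possible explanation, sketched with NumPy only: if `targeted_labels` holds one-hot rows rather than class indices, then `targeted_labels[:, i]` is a column of 0s and 1s, and re-encoding that column as class indices collapses the targets. Here `to_one_hot` is a hypothetical stand-in for an index-to-one-hot conversion, which is only assumed to resemble what `check_and_transform_label_format` does for 1-D integer labels.

```python
import numpy as np


def to_one_hot(indices, nb_classes):
    """Hypothetical stand-in: convert 1-D class indices to one-hot rows."""
    return np.eye(nb_classes, dtype=int)[np.asarray(indices, dtype=int)]


nb_classes = 4
# One-hot targets as a user might pass them in: classes 2, 3 and 1.
targeted_labels = to_one_hot([2, 3, 1], nb_classes)

# Slicing column 0 of a one-hot matrix yields 0/1 flags, not class indices.
column = targeted_labels[:, 0]
# Re-encoding those flags treats them as class indices 0 or 1.
collapsed = to_one_hot(column, nb_classes)

# Recovering the class indices first preserves the per-sample targets.
intended = np.argmax(targeted_labels, axis=1)
restored = to_one_hot(intended, nb_classes)
```

In this sketch `collapsed` has `[1, 0, 0, 0]` in every row, while `restored` equals the original targets, which is consistent with the behaviour reported above but does not confirm that it is the actual cause.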
It is certainly possible that I misunderstand some functionality, but I just wanted to share what I found.