Autoattack and APGDT #52

Open
Yeez-lee opened this issue May 4, 2021 · 3 comments

Comments

@Yeez-lee

Yeez-lee commented May 4, 2021

Hi, I am a beginner and still feel a little bit confused about AutoAttack and the targeted APGD-T. (1) For standard AutoAttack, are all images attacked by each of the 4 attacks respectively, with the average robust accuracy over the 4 attacks computed at the end? And in AutoAttack, are the adversarial images for each attack saved separately? (2) For the targeted APGD-T, how is the target label found? I see that the number of target labels equals the total number of classes minus one for CIFAR-10. If I want to use it for ImageNet, should I set n_target_classes = 999, or any number in 1-999? What is the principle behind the target setting?

Looking forward to your help! Thanks!

@fra31
Owner

fra31 commented May 4, 2021

Hi,

there are a few options, described here. In particular, run_standard_evaluation uses an attack on a given point only if none of the previous methods has been successful, which is equivalent to taking the worst case over the four attacks. If you instead use run_standard_evaluation_individual, all attacks are run on the full input batch and the results are returned separately (this is more time-consuming).
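For reference, a minimal usage sketch of the two entry points (model, x_test, and y_test are placeholders; the eps and bs values are arbitrary):

```python
from autoattack import AutoAttack

adversary = AutoAttack(model, norm='Linf', eps=8/255, version='standard')

# Worst case over the four attacks: each attack runs only on the points that
# all previous attacks failed to perturb; returns a single tensor of images.
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=256)

# Every attack runs on the full batch; results are returned per attack
# (more time-consuming).
adv_per_attack = adversary.run_standard_evaluation_individual(x_test, y_test, bs=256)
```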

The target classes for APGD-T are chosen, for each point, as the most likely (with highest logits) ones except for the correct one. We use 9 as standard value since the commonly used datasets have at least 10 classes. We use the same also for ImageNet to keep a constant computational budget and since we observed it to be effective, but in principle any value in [1, 999] can be used. Also, we use targeted losses, in the context of untargeted attacks, since these provide more diverse and, when multiple restarts are available, stronger attacks.
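As a rough sketch of that selection (not the repository's exact code; logits, y, and k are placeholders), for each point the classes are sorted by logit and the k most likely ones other than the true label are used as targets:

```python
import torch

def pick_target_classes(logits, y, k=9):
    # sort classes by decreasing logit, per point
    order = logits.sort(dim=1, descending=True).indices
    # drop the true label, keep the k most likely remaining classes
    keep = order != y.unsqueeze(1)
    others = order[keep].view(logits.size(0), -1)
    return others[:, :k]
```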

Hope this helps, and let me know if you have further questions!

@Yeez-lee
Author

Yeez-lee commented May 4, 2021

Thanks for the response! Things are clearer than before, but I still have some questions.
(1) I know that when I use run_standard_evaluation_individual, the results of the 4 attacks are saved separately. But for run_standard_evaluation, do you mean that AutoAttack counts a point as successfully attacked only if all 4 attacks perturb it? For example, given 100 clean images, suppose untargeted APGD-CE (no restarts) successfully perturbs 80 of them, targeted APGD-DLR (9 target classes) successfully perturbs 70, targeted FAB (9 target classes) successfully perturbs 65, and Square Attack (5000 queries) successfully perturbs 60. If 45 images are successfully perturbed by all 4 attacks together and 90 images are successfully perturbed by at least one of the 4 attacks, does AutoAttack regard the 45 images as the final perturbed ones, or the 90? Or is it the other case: given 100 clean images, each one is first attacked by untargeted APGD-CE (no restarts); if that succeeds, the remaining 3 attacks are skipped, but if it fails, targeted APGD-DLR (9 target classes) attacks the image, and so on. If all 4 attacks fail in turn, then the image is robust under the model, and the robust accuracy is the number of such failures (images robust under the model) divided by the total number.
(2) I think APGD-T uses only the top-k (k = 9) labels to generate adversarial examples, and I see this code:

```python
for target_class in range(2, self.n_target_classes + 2):
```
So why do we need the for loop instead of directly picking the best target among the top-9 labels? And at the end, is only the best case saved, or are all 9 cases from the for loop saved?
(3) In the code,

```python
def attack_single_run(self, x_in, y_in):
```

why do we need to check the norm twice, at lines 56 and 113? Normally, I use the norm only once, to constrain the generated examples in the last few steps, like the second occurrence below.

```python
if self.norm == 'Linf':  # line 56

if self.norm == 'Linf':  # line 113
```

Thanks for your help again!

@fra31
Owner

fra31 commented May 6, 2021

We consider a point successfully misclassified if any of the four attacks finds an adversarial perturbation, so in your example it would be 90 points. And it works as you described: the attacks are run sequentially, each one only on the points which haven't been successfully attacked by a previous attack, and the robust accuracy is given as the percentage of points robust to all attacks.
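In code form, with the (hypothetical) counts from your example:

```python
n_total = 100
n_broken_by_any = 90  # points perturbed by at least one of the four attacks
robust_accuracy = (n_total - n_broken_by_any) / n_total  # 0.10, i.e. 10%
```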

About APGD-T, we use different target classes so that we have different losses as objective functions of the maximization scheme. Also in this case, at most one adversarial image is saved for each input, if one is found.
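A rough sketch of that bookkeeping (simplified; run_towards is a hypothetical stand-in for one targeted APGD run, not the repository's actual code):

```python
import torch

# x, y, model, and n_target_classes are assumed to be defined as above.
x_adv = x.clone()                                     # best adversarial image per input
still_robust = torch.ones(x.size(0), dtype=torch.bool)

for target_class in range(2, n_target_classes + 2):
    x_try = run_towards(x, y, target_class)           # hypothetical: one APGD-T run
    fooled = model(x_try).argmax(dim=1) != y          # points now misclassified
    newly = fooled & still_robust
    x_adv[newly] = x_try[newly]                       # keep at most one image per input
    still_robust &= ~fooled
```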

The first line you mentioned is for the generation of the random starting point, which should be in the feasible set.
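A minimal sketch of why the Linf check appears in both places (simplified, not the repository's exact code): the first occurrence draws a random starting point that already lies in the feasible set, while the second projects the iterates back into it after each update step:

```python
import torch

def random_start_linf(x, eps):
    # uniform sample in the Linf eps-ball around x, clipped to the pixel range
    t = 2 * torch.rand_like(x) - 1            # uniform in [-1, 1]
    return torch.clamp(x + eps * t, 0.0, 1.0)

def project_linf(x_adv, x, eps):
    # after each gradient step: back into the eps-ball, then into the pixel range
    x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
    return torch.clamp(x_adv, 0.0, 1.0)
```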

Hope this helps!
