
Structured adversarial attack: Towards general implementation and better interpretability

  • Kaidi Xu
  • Sijia Liu
  • Pu Zhao
  • Pin-Yu Chen
  • Huan Zhang
  • Quanfu Fan
  • Deniz Erdogmus
  • Yanzhi Wang
  • Xue Lin

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

When generating adversarial examples to attack deep neural networks (DNNs), the ℓ_p norm of the added perturbation is usually used to measure the similarity between the original image and the adversarial example. However, such adversarial attacks, which perturb the raw input space, may fail to capture structural information hidden in the input. This work develops a more general attack model, i.e., the structured attack (StrAttack), which explores group sparsity in adversarial perturbations by sliding a mask through images to extract key spatial structures. An ADMM (alternating direction method of multipliers)-based framework is proposed that can split the original problem into a sequence of analytically solvable subproblems and can be generalized to implement other attacking methods. Strong group sparsity is achieved in adversarial perturbations even with the same level of ℓ_p-norm distortion (p ∈ {1, 2, ∞}) as the state-of-the-art attacks. We demonstrate the effectiveness of StrAttack by extensive experimental results on MNIST, CIFAR-10 and ImageNet. We also show that StrAttack provides better interpretability (i.e., better correspondence with discriminative image regions) through adversarial saliency map (Papernot et al., 2016b) and class activation map (Zhou et al., 2016). Our code is available at https://github.com/KaidiXu/StrAttack. © 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved.
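
To make the group-sparsity idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation (which is at the repository linked above). It assumes a single-channel perturbation stored as a list of rows and a square mask whose stride equals its size, so the groups do not overlap. Under that assumption, the proximal operator of a group-sparsity penalty (sum of per-group ℓ_2 norms) has a closed form, block soft-thresholding, which is one example of the kind of analytically solvable subproblem an ADMM splitting can produce. The function names and the parameter `tau` are hypothetical and chosen only for illustration.

```python
import math


def sliding_groups(height, width, mask, stride):
    """List the pixel coordinates covered by each position of a mask x mask window."""
    groups = []
    for top in range(0, height - mask + 1, stride):
        for left in range(0, width - mask + 1, stride):
            groups.append([(r, c)
                           for r in range(top, top + mask)
                           for c in range(left, left + mask)])
    return groups


def group_norms(delta, groups):
    """l2 norm of the perturbation restricted to each group."""
    return [math.sqrt(sum(delta[r][c] ** 2 for r, c in g)) for g in groups]


def group_soft_threshold(delta, groups, tau):
    """Block soft-thresholding for NON-overlapping groups (assumption).

    Each group is scaled by max(0, 1 - tau / ||delta_g||_2), so groups whose
    norm is at most tau are set exactly to zero.
    """
    out = [row[:] for row in delta]
    for g, norm in zip(groups, group_norms(delta, groups)):
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        for r, c in g:
            out[r][c] = delta[r][c] * scale
    return out


# Toy 4x4 perturbation: one strong 2x2 block and two weak ones.
delta = [
    [3.0, 4.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0],
    [0.2, 0.0, 0.0, 0.0],
]
groups = sliding_groups(4, 4, mask=2, stride=2)
shrunk = group_soft_threshold(delta, groups, tau=1.0)

print(sum(n > 0 for n in group_norms(delta, groups)))   # 3 active groups before
print(sum(n > 0 for n in group_norms(shrunk, groups)))  # 1 active group after
```

In this toy case only the block with norm 5 survives the threshold, which illustrates how a group penalty concentrates a perturbation on a few spatial regions. Overlapping masks (stride smaller than the mask size) do not admit this simple per-group formula and are outside the scope of the sketch.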
Original language: English
Title of host publication: 7th International Conference on Learning Representations, ICLR 2019
Publisher: International Conference on Learning Representations, ICLR
Publication status: Published - May 2019
Externally published: Yes
Event: 7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States
Duration: 6 May 2019 to 9 May 2019
https://dblp.org/db/conf/iclr/iclr2019.html

Publication series

Name: 7th International Conference on Learning Representations, ICLR 2019

Conference

Conference: 7th International Conference on Learning Representations, ICLR 2019
Place: United States
City: New Orleans
Period: 6/05/19 to 9/05/19
Internet address


Funding

This work is supported by the Air Force Research Laboratory (FA8750-18-2-0058) and the U.S. Office of Naval Research. Sijia Liu, Pin-Yu Chen, Huan Zhang and Quanfu Fan were supported by the MIT-IBM Watson AI Lab, IBM Research.
