Options
A retrieval-based approach for diverse and image-specific adversary selection
ISSN
21926611
Date Issued
2020-06-01
DOI
10.1007/s13735-019-00177-8
Abstract
While deep neural network-based models have demonstrated compelling performance on various tasks in computer vision and other fields, they have been found to be vulnerable to adversarial attacks. Particularly, deep convolutional neural network (CNN)-based models can be easily fooled by adding a small quasi-imperceptible perturbation to the input, thus resulting in significant drop in prediction accuracies. While most of the previous works have focused on generating one adversary/perturbation per model, it was recently shown that it is possible to learn a continuous distribution over adversarial perturbations for a model. Building upon this work, in this paper, we propose a new technique for image-specific adversary selection and treat it as a retrieval task. The proposed technique utilizes a learned model that ranks the perturbations in a given set of perturbations based on their ability to fool with respect to a given sample. This model is a conditional determinantal point process model that also explicitly induces diversity among the retrieved perturbations. We conduct experiments on the ImageNet dataset using four popular deep CNN image classification models, and demonstrate that the proposed method consistently achieves state-of-the-art fooling rates.