Evaluating Concept Explanations for CNNs Under Adversarial Image Transformations
Keywords:
Convolutional neural networks, adversarial attacks, concept explanations, fidelity, image transformation

Abstract
Concept-based explainers for convolutional neural networks (CNNs) provide human-understandable explanations by revealing what the CNN sees, rather than merely indicating where it looked. However, their performance is limited by the reducer at their core and by adversarial attacks. Although mild image transformations can sometimes enhance CNN classification performance, intense transformations cause noticeable variation in CNN predictions, and it is unclear how explainers perform under such conditions. This paper investigates, for the first time, the performance of state-of-the-art concept-based explainers under different levels of adversarial attack. We achieve this by applying different image transformations as adversarial attacks, including Gaussian noise, elastic transform, rotation, and contrast adjustment, to the ILSVRC2012 dataset. Our study shows that transformation techniques altering only image coordinates have little impact on classifier and explainer performance, whereas methods modifying image pixels, such as elastic transform and contrast, significantly affect performance, akin to introducing Gaussian noise. Our work underscores the importance of scrutinizing explainers during their development and adoption for CNNs.
https://doi.org/10.59200/ICONIC.2024.006
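
For illustration only, the following is a minimal sketch, not the paper's code, of how the four transformation families named in the abstract (Gaussian noise, elastic transform, rotation, contrast) could be applied to an image tensor using torchvision; the function names, severity scaling, and parameter values are assumptions chosen for readability.

```python
# Illustrative sketch of the transformation families discussed in the abstract.
# Parameter values and the notion of "severity" are assumptions, not the paper's settings.
import torch
import torchvision.transforms.functional as TF
from torchvision import transforms


def gaussian_noise(img: torch.Tensor, std: float = 0.1) -> torch.Tensor:
    """Add zero-mean Gaussian noise to a [0, 1] image tensor and re-clamp."""
    return (img + std * torch.randn_like(img)).clamp(0.0, 1.0)


def perturb(img: torch.Tensor, severity: float = 1.0) -> dict[str, torch.Tensor]:
    """Apply each transformation at an illustrative severity level."""
    return {
        "gaussian_noise": gaussian_noise(img, std=0.1 * severity),
        "elastic": transforms.ElasticTransform(alpha=50.0 * severity)(img),
        "rotation": TF.rotate(img, angle=15.0 * severity),
        "contrast": TF.adjust_contrast(img, contrast_factor=1.0 + severity),
    }


if __name__ == "__main__":
    # Example: a random (C, H, W) image in [0, 1]; in practice this would be an ILSVRC2012 sample.
    image = torch.rand(3, 224, 224)
    variants = perturb(image, severity=2.0)
    for name, out in variants.items():
        print(name, tuple(out.shape))
```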