Supplementary Materials
PuVAE: A Variational Autoencoder to Purify Adversarial Examples
Anonymous Authors
A Purified Images from PuVAE
Figures 1 and 2 visualize the experimental results on the MNIST and Fashion-MNIST datasets, respectively. In
each image pair, the left image is the adversarial example generated with FGSM(0.3), and the right image is the result of
purifying it with PuVAE. On both datasets, the adversarial noise is selectively removed while the original shape of the image
is preserved. In particular, PuVAE maintains the angle of the digit 1 in the first row of Figure 1 while purifying the adversarial example.
Figure 1: Examples of purified images from PuVAE on the MNIST dataset.
Figure 2: Examples of purified images from PuVAE on the Fashion-MNIST dataset.
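For readers unfamiliar with the attack used to produce the left-hand images, the following is a minimal sketch of the fast gradient sign method, assuming that FGSM(0.3) denotes a perturbation budget of 0.3 on pixels scaled to [0, 1]. The logistic model, its weights, and all function names are hypothetical and chosen only to keep the example self-contained; the experiments themselves attack the convolutional classifiers of Section B.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_input_grad(x, y, w, b):
    """Binary cross-entropy of a toy logistic model and its gradient w.r.t. the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]  # d(loss)/d(x_i) for a logistic model
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps=0.3):
    """Take one step of size eps along the sign of the input gradient, then clip to [0, 1]."""
    _, grad = loss_and_input_grad(x, y, w, b)
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]


# Toy usage: hypothetical weights and a 4-"pixel" input with true label 1.
w, b = [2.0, -1.0, 0.5, -3.0], 0.1
x, y = [0.9, 0.1, 0.8, 0.2], 1
x_adv = fgsm(x, y, w, b, eps=0.3)
clean_loss, _ = loss_and_input_grad(x, y, w, b)
adv_loss, _ = loss_and_input_grad(x_adv, y, w, b)
```

Each pixel moves by at most eps, so the perturbation is bounded in the L-infinity norm, yet the loss on the true label increases. This is the kind of input that PuVAE is then asked to purify.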
B Architectures of Target Classifiers
The target classifiers are convolutional neural networks; their architectures are detailed in Table 1. We used architectures A and B for
MNIST and Fashion-MNIST, and architectures C and D for CIFAR-10.
Table 1: Neural network architectures used for target classifiers
A                  B                  C                  D
Conv(64, 5×5, 1)   Conv(128, 3×3, 1)  Conv(64, 5×5, 1)   Conv(128, 3×3, 1)
ReLU               ReLU               ReLU               ReLU
Conv(64, 5×5, 2)   Conv(64, 3×3, 2)   Conv(128, 5×5, 1)  Conv(256, 3×3, 1)
ReLU               ReLU               ReLU               ReLU
Dropout(0.25)      Dropout(0.25)      Conv(256, 5×5, 2)  Conv(512, 3×3, 2)
FC(128)            FC(128)            ReLU               ReLU
ReLU               ReLU               Dropout(0.25)      Dropout(0.25)
Dropout(0.5)       Dropout(0.5)       FC(512)            FC(512)
FC(10) + Softmax   FC(10) + Softmax   ReLU               ReLU
-                  -                  Dropout(0.5)       Dropout(0.5)
-                  -                  FC(256)            FC(10) + Softmax
-                  -                  ReLU               -
-                  -                  FC(10) + Softmax   -
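The columns of Table 1 can also be written down as plain layer lists, which makes it easy to check layer order and estimate model size. The sketch below is illustrative only. It assumes that Conv(n, k×k, s) means n filters, a k×k kernel, and stride s, and that convolutions use "same" padding; neither is stated in the table. The names ARCHITECTURES and count_parameters are hypothetical.

```python
import math

# Layer lists transcribed from Table 1.
# Assumed reading: ("conv", filters, kernel_size, stride), ("fc", units), ("dropout", rate).
ARCHITECTURES = {
    "A": [("conv", 64, 5, 1), ("relu",), ("conv", 64, 5, 2), ("relu",),
          ("dropout", 0.25), ("fc", 128), ("relu",), ("dropout", 0.5),
          ("fc", 10), ("softmax",)],
    "B": [("conv", 128, 3, 1), ("relu",), ("conv", 64, 3, 2), ("relu",),
          ("dropout", 0.25), ("fc", 128), ("relu",), ("dropout", 0.5),
          ("fc", 10), ("softmax",)],
    "C": [("conv", 64, 5, 1), ("relu",), ("conv", 128, 5, 1), ("relu",),
          ("conv", 256, 5, 2), ("relu",), ("dropout", 0.25),
          ("fc", 512), ("relu",), ("dropout", 0.5),
          ("fc", 256), ("relu",), ("fc", 10), ("softmax",)],
    "D": [("conv", 128, 3, 1), ("relu",), ("conv", 256, 3, 1), ("relu",),
          ("conv", 512, 3, 2), ("relu",), ("dropout", 0.25),
          ("fc", 512), ("relu",), ("dropout", 0.5),
          ("fc", 10), ("softmax",)],
}


def count_parameters(layers, height, width, channels, same_padding=True):
    """Count trainable weights and biases for an input of the given size.

    The padding mode is an assumption: "same" padding if same_padding is True,
    otherwise no padding ("valid").
    """
    total = 0
    flat = None  # feature count once the first fully connected layer is reached
    for kind, *args in layers:
        if kind == "conv":
            filters, k, stride = args
            total += k * k * channels * filters + filters
            if same_padding:
                height = math.ceil(height / stride)
                width = math.ceil(width / stride)
            else:
                height = (height - k) // stride + 1
                width = (width - k) // stride + 1
            channels = filters
        elif kind == "fc":
            (units,) = args
            if flat is None:  # flatten the last feature map
                flat = height * width * channels
            total += flat * units + units
            flat = units
        # relu, dropout, and softmax have no trainable parameters
    return total


# MNIST and Fashion-MNIST images are 28x28 with one channel.
params_a = count_parameters(ARCHITECTURES["A"], 28, 28, 1)
params_b = count_parameters(ARCHITECTURES["B"], 28, 28, 1)
```

Under these assumptions, the first fully connected layer dominates the parameter count in architectures A and B, because it is applied to the flattened 14×14×64 feature map. A different padding choice would change the totals but not the layer order given in Table 1.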