Adversarial Defense through Bayesian Neural Network (Feb. 25, 2019)
Outline • Adversarial training • Bayesian neural network • Different approaches • Adv-BNN (2019) • Bayesian Adversarial Learning (2018)
Adversarial Training • Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. ICLR 2018.
Adversarial Training • Define the threat model: the $\ell_\infty$ ball $P = \{\delta : \|\delta\|_\infty \le \epsilon\}$ • Objective function: • Original: $\min_\theta \mathbb{E}_{(x,y)\sim \mathcal{D}}\big[L(\theta, x, y)\big]$ • Adversarially robust: $\min_\theta \mathbb{E}_{(x,y)\sim \mathcal{D}}\big[\max_{\delta \in P} L(\theta, x+\delta, y)\big]$ -> So if we can find a classifier with small robust loss, we can be sure that the classifier is robust to any perturbation in the set P. • How can we learn such a classifier?
Adversarial Training • Robust optimization (adversarial training): $\min_\theta \rho(\theta)$, where $\rho(\theta) = \mathbb{E}_{(x,y)\sim \mathcal{D}}\big[\max_{\delta \in P} L(\theta, x+\delta, y)\big]$ • A saddle-point problem: an outer minimization and an inner maximization. In addition, the inner maximization is non-concave, and the outer minimization is non-convex. • Outer minimization: find model parameters that minimize the robust loss • Inner maximization: find the worst-case perturbation for the current model
Adversarial Training • Inner maximization (attacking) • A constrained optimization problem -> Projected Gradient Descent (PGD) • FGSM: $x^{adv} = x + \epsilon \cdot \mathrm{sign}\big(\nabla_x L(\theta, x, y)\big)$ • PGD: $x^{t+1} = \Pi_{x+P}\big(x^t + \alpha \cdot \mathrm{sign}(\nabla_x L(\theta, x^t, y))\big)$ • $\alpha$: step size, which is crucial for PGD • $\Pi_{x+P}$: project a point onto the set $x + P$ (a sketch follows)
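A minimal PyTorch sketch of the $\ell_\infty$ PGD attack described above (the hyperparameter values and random-start choice are illustrative, not taken from the slides):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=20):
    """l_inf PGD: repeated signed-gradient ascent steps on the loss,
    each followed by projection back into the eps-ball around x."""
    # Random start inside the eps-ball.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascent step, then projection onto the l_inf ball (and valid pixel range).
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv
```

The `grad.sign()` step is the steepest-ascent direction under the $\ell_\infty$ norm, and the min/max pair implements the projection $\Pi_{x+P}$.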
Adversarial Training • Cross-entropy loss values while creating adversarial examples • randomize the starting point of the attack
Adversarial Training • The landscape of adversarial examples • randomize the starting point of the attack
Adversarial Training • Outer minimization: $\min_\theta \rho(\theta)$ • SGD • Danskin's Theorem -> $\nabla_\theta \max_{\delta \in P} L(\theta, x+\delta, y) = \nabla_\theta L(\theta, x+\delta^*, y)$, where $\delta^*$ is a constrained maximizer of the inner maximization problem -> the gradient at the inner maximizer is a valid descent direction for the robust loss (a training-loop sketch follows)
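By Danskin's theorem, running SGD on the robust loss amounts to training on the inner maximizers. A minimal training-loop sketch, assuming the `pgd_attack` helper above and standard PyTorch `model`, `loader`, and `optimizer` objects:

```python
import torch.nn.functional as F

def adversarial_train_epoch(model, loader, optimizer):
    """One epoch of Madry-style adversarial training."""
    model.train()
    for x, y in loader:
        # Inner maximization: approximate the worst-case perturbation with PGD.
        x_adv = pgd_attack(model, x, y)
        # Outer minimization: by Danskin's theorem, the gradient of the
        # robust loss is the ordinary gradient evaluated at the maximizer.
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```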
Adversarial Training • Consistently reduce the loss during training by applying SGD
Adversarial Training • Experimental results
Outline • Adversarial training • Bayesian neural network • Different approaches • Adv-BNN (2019) • Bayesian Adversarial Learning (2018)
Bayesian Neural Network • Bayesian Neural Networks (BNNs) are neural networks whose weights or parameters are expressed as a distribution rather than a deterministic value, and are learned using Bayesian inference -> they can learn complex non-linear functions from data and also express uncertainty.
Bayesian Neural Network • Deterministic NN • evaluate the maximum likelihood point estimate (MLE) -> avoid overfitting by adding regularization • evaluate the maximum a posteriori point estimate (MAP) • Bayesian NN • estimate the full posterior distribution of the parameters (the three strategies are contrasted below)
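The three estimation strategies written out (a standard formulation, not taken from the slides; $\mathcal{D}$ denotes the training data, $w$ the weights):

```latex
% Deterministic NN, maximum likelihood (MLE):
w_{\mathrm{MLE}} = \arg\max_{w} \log p(\mathcal{D} \mid w)

% Maximum a posteriori (MAP): the prior acts as a regularizer
w_{\mathrm{MAP}} = \arg\max_{w} \big[ \log p(\mathcal{D} \mid w) + \log p(w) \big]

% Bayesian NN: keep the full posterior and marginalize at prediction time
p(w \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})},
\qquad
p(y \mid x, \mathcal{D}) = \int p(y \mid x, w)\, p(w \mid \mathcal{D})\, dw
```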
Outline • Adversarial training • Bayesian neural network • Different approaches • Adv-BNN (2019) • Bayesian Adversarial Learning (2018)
Adv-BNN • Xuanqing Liu, Yao Li, Chongruo Wu, Cho-Jui Hsieh. Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network. ICLR 2019.
Adv-BNN • Motivation: • adversarial training can improve the robustness of neural networks • incorporating randomness can improve the robustness of neural networks • Approach: an adversarially trained Bayesian neural network
Adv-BNN • Goal: the posterior over the weights, $p(w \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})}$ • Problem: the denominator $p(\mathcal{D}) = \int p(\mathcal{D} \mid w)\, p(w)\, dw$ involves a high-dimensional integral • Solution: approximate the true posterior by a parametric distribution $q_\theta(w)$, and find the setting of the parameters $\theta$ that makes $q_\theta$ close to the posterior.
Adv-BNN • Target: find a parametric distribution $q_\theta(w)$ (variational posterior) that approximates the true posterior $p(w \mid \mathcal{D})$. • Objective function: minimize the KL divergence
$\mathrm{KL}\big(q_\theta(w)\,\|\,p(w \mid \mathcal{D})\big) = \mathbb{E}_{q_\theta}\big[\log q_\theta(w) - \log p(w \mid \mathcal{D})\big]$
$= \mathbb{E}_{q_\theta}\big[\log q_\theta(w) - \log p(\mathcal{D} \mid w) - \log p(w)\big] + \log p(\mathcal{D})$
$= -\underbrace{\big(\mathbb{E}_{q_\theta}[\log p(\mathcal{D} \mid w)] - \mathrm{KL}(q_\theta(w)\,\|\,p(w))\big)}_{\text{ELBO}} + \underbrace{\log p(\mathcal{D})}_{\text{does not depend on } q_\theta(w)}$
-> minimizing the KL divergence is equivalent to maximizing the ELBO
Adv-BNN • Maximize the ELBO: $\max_\theta\; \mathbb{E}_{q_\theta(w)}\big[\log p(\mathcal{D} \mid w)\big] - \mathrm{KL}\big(q_\theta(w)\,\|\,p(w)\big)$ -> difficult to compute the gradient: we cannot swap the gradient and the expectation, since the expectation is taken with respect to the distribution that we are trying to differentiate! • Take Monte Carlo estimates of the gradient by sampling from $q_\theta$.
Adv-BNN • Maximize the ELBO: • Assume the variational posterior is fully factorizable, and each factor is a normal distribution with mean $\mu_i$ and standard deviation $\sigma_i$: $q_\theta(w) = \prod_i \mathcal{N}(w_i;\, \mu_i, \sigma_i^2)$. • Assume the prior distribution is an isotropic Gaussian, $p(w) = \mathcal{N}(0, s^2 I)$. -> closed-form KL divergence: $\mathrm{KL}\big(q_\theta(w)\,\|\,p(w)\big) = \sum_i \Big(\log \frac{s}{\sigma_i} + \frac{\sigma_i^2 + \mu_i^2}{2 s^2} - \frac{1}{2}\Big)$
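A small sketch of this closed-form KL term (the prior scale `s` and the summation over all weights are assumptions consistent with the slide):

```python
import torch

def gaussian_kl(mu, sigma, s=1.0):
    """KL( N(mu, sigma^2) || N(0, s^2) ), summed over all weights."""
    return (torch.log(s / sigma) + (sigma ** 2 + mu ** 2) / (2 * s ** 2) - 0.5).sum()
```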
Adv-BNN • ELBO on the original dataset: $\mathbb{E}_{q_\theta}\big[\sum_i \log p(y_i \mid x_i, w)\big] - \mathrm{KL}\big(q_\theta(w)\,\|\,p(w)\big)$ • ELBO with adversarial training: evaluate the log-likelihood on adversarial examples, $\mathbb{E}_{q_\theta}\big[\sum_i \min_{\|x_i^{adv} - x_i\|_\infty \le \gamma} \log p(y_i \mid x_i^{adv}, w)\big] - \mathrm{KL}\big(q_\theta(w)\,\|\,p(w)\big)$
Adv-BNN • ELBO with adversarial training • Inner minimization: generate adversarial examples that minimize the log-likelihood (e.g. via PGD) • Outer maximization: update the variational parameters with SGD • 1st term (the KL divergence): closed form under the Gaussian assumptions above • 2nd term (the expected log-likelihood): Bayes by Backprop algorithm (Blundell et al., 2015).
Adv-BNN • ELBO with adversarial training • Outer maximization • 2nd term: Bayes by Backprop algorithm (Blundell et al., 2015) • a closed form is not possible -> take Monte Carlo estimates of the gradient by sampling from $q_\theta$ • calculate derivatives using the reparameterization trick (Kingma et al., 2015) -> reparameterize $w = \mu + \sigma \odot \epsilon$, where $\epsilon \sim \mathcal{N}(0, I)$ (a sketch follows)
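A minimal Bayesian linear layer using the reparameterization trick (the softplus parameterization of $\sigma$ and the initialization values are common choices, not specified in the slides):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Linear layer with a factorized Gaussian posterior over the weights."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(d_out, d_in) * 0.01)
        # rho parameterizes sigma = softplus(rho) > 0.
        self.rho = nn.Parameter(torch.full((d_out, d_in), -5.0))

    def forward(self, x):
        sigma = F.softplus(self.rho)
        eps = torch.randn_like(sigma)   # eps ~ N(0, I)
        w = self.mu + sigma * eps       # reparameterization: w = mu + sigma * eps
        return F.linear(x, w)
```

Because $w$ is a differentiable function of $\mu$ and $\rho$, gradients flow through the sampled weights back to the variational parameters.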
Adv-BNN • For SGD iterations, rewrite the objective as a sum of per-minibatch losses, with the KL term scaled down • a scaled-down KL term gives a weaker regularization, appropriate for a small dataset or a large model (see the sketch below)
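A sketch of the resulting per-minibatch loss under this reading of the slide (the names `alpha` and `n_data` are hypothetical):

```python
import torch.nn.functional as F

def minibatch_loss(model, x_adv, y, kl, n_data, alpha=1.0):
    """Per-minibatch negative ELBO: adversarial NLL plus a scaled KL term.
    alpha < 1 weakens the KL regularization (small dataset / large model)."""
    nll = F.cross_entropy(model(x_adv), y)
    return nll + alpha * kl / n_data
```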
Adv-BNN • Will it be beneficial to have randomness in adversarial training? • Both randomized networks and adversarial training can be viewed as different ways of controlling the local Lipschitz constant of the loss surface around the image manifold • The robustness of a deep model is closely related to the gradient of the loss over the input, i.e. $\|\nabla_x L(\theta, x, y)\|$ • Adversarial training directly controls the local Lipschitz value on the training set: to first order, $\max_{\|\delta\| \le \gamma} L(\theta, x+\delta, y) \approx \underbrace{L(\theta, x, y)}_{\text{original loss}} + \gamma \underbrace{\|\nabla_x L(\theta, x, y)\|}_{\text{Lipschitz regularization}}$ • Randomness also helps control the Lipschitz constant: the expected loss under random perturbations likewise decomposes into the original loss plus a smoothing term that penalizes large local gradients (a sketch for estimating this gradient norm follows)
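A small sketch for measuring this quantity empirically, using the input-gradient norm as a proxy for the local Lipschitz value (illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def input_grad_norm(model, x, y):
    """Norm of the loss gradient w.r.t. the input: a proxy for the
    local Lipschitz value of the loss around x."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return grad.flatten(start_dim=1).norm(dim=1)  # one value per example
```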
Adv-BNN • Experimental results: • Task: classification • Datasets: CIFAR-10, STL-10, and ImageNet-143 • Measurement: classification accuracy under $\ell_\infty$-PGD attack
Adv-BNN • Accuracy under $\ell_\infty$-PGD attack. • Robust classification relies on a controlled local Lipschitz value, but adversarial training alone does not generalize this property well to the test set; if we train the BNN with adversarial examples, robustness increases by a large margin.
Adv-BNN • Comparing the testing accuracy under different levels of PGD attacks.
Adv-BNN • Left: prediction accuracy when averaging over different numbers of forward passes per prediction. • Right: testing accuracy stabilizes quickly once the number of PGD steps exceeds 20.