Adversarial Machine Learning: Towards More Robust and Secure Neural Networks

Presentation by Thomas Philippon, master's student in electrical engineering under the supervision of Christian Gagné, on adversarial machine learning and the potential for more robust and secure neural networks.

Date

December 14, 2021

Time

3:00 p.m. to 4:00 p.m.

Location

Online

Cost

Free

Abstract

Deep learning models are used to solve a variety of complex problems such as image classification, natural language processing and speech recognition. Over the years, research has produced increasingly capable and accurate models. However, higher accuracy does not necessarily mean more secure and robust models. For instance, it has been shown that state-of-the-art neural network architectures for image classification can be fooled by carefully crafted, imperceptible perturbations added to the test data. A real-life attack was also performed on a self-driving car's traffic-sign detection system, in which researchers managed to fool the car into classifying a stop sign as a speed limit sign. These adversarial attacks raise concerns about the use of deep learning in security- and safety-critical systems, where performance guarantees are required. They also show that efforts should be made to improve the robustness of deep learning models. Adversarial machine learning is a research field that focuses on the robustness of deep learning models; it includes both designing adversarial attacks and developing defenses against them.
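The abstract does not name a specific attack, but the imperceptible perturbations described above are typically produced with gradient-based methods. The sketch below, assuming a PyTorch image classifier `model` with inputs scaled to [0, 1], illustrates the fast gradient sign method (FGSM), one of the simplest such attacks; the model, the epsilon value and the input range are illustrative assumptions rather than details from the presentation.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=8 / 255):
    """Return an adversarially perturbed copy of x using the fast gradient
    sign method: one step of size epsilon in the direction of the loss
    gradient, clipped back to the valid image range [0, 1]."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Feeding `fgsm_perturb(model, images, labels)` back into the classifier at test time is the standard way to measure how much such a small perturbation degrades accuracy.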

This project focuses primarily on adversarial defenses, and more specifically on improving the robustness of neural networks for image classification using ensemble methods. In the first part, we assess the robustness of an existing ensemble architecture named Error-Correcting Output Codes (ECOC). This architecture differs from traditional ensembles by encoding the outputs as codewords associated with the true class labels instead of using one-hot encodings. In the second part, we propose a new method for improving diversity in a neural network ensemble, which we call Adversarial Ensemble Feature Regularization (AERL). This method uses parallel Siamese networks and contrastive losses to force the features learned by the members of an ensemble to differ, promoting diversity among the ensemble members. This work is in progress, but we hope that it will help improve the robustness of neural network architectures such as ECOC ensembles.
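To make the two ideas concrete, here are two minimal sketches in PyTorch. The abstract does not specify the codeword design, the distance measure or the exact contrastive formulation used in this work, so the function names, the margin value and the distance choices below are assumptions for illustration only.

The first sketch shows ECOC-style decoding: each class is assigned a fixed codeword, and the predicted class is the one whose codeword is closest to the ensemble's output bits, rather than taking an argmax over one-hot scores.

```python
import torch

def ecoc_predict(output_bits, codewords):
    """Decode ECOC outputs.
    output_bits: (batch, n_bits) ensemble outputs, e.g. in [-1, 1].
    codewords:   (n_classes, n_bits) fixed codeword per class.
    The predicted class is the one with the nearest codeword."""
    scores = -torch.cdist(output_bits, codewords)  # (batch, n_classes)
    return scores.argmax(dim=1)
```

The second sketch shows one simple way a contrastive penalty can push the feature embeddings of two ensemble members apart on the same inputs, which is the kind of diversity-promoting regularization described above.

```python
def feature_diversity_loss(feats_a, feats_b, margin=1.0):
    """Hinge-style contrastive penalty between two members' features for the
    same batch: zero once their pairwise distance exceeds the margin."""
    dist = torch.norm(feats_a - feats_b, dim=1)
    return torch.clamp(margin - dist, min=0.0).mean()
```

In training, such a term would typically be added to each member's classification loss with a small weight, so that members stay accurate while learning dissimilar representations.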
