Deep neural networks are vulnerable to so-called adversarial examples: inputs which are intentionally constructed to cause the model to make incorrect predictions or classifications. Adversarial examples are often visually indistinguishable from natural data samples, making them hard to detect. As such, they pose significant threats to the reliability of deep learning systems. In this work, we study an adversarial defense based on the robust width property (RWP), which was recently introduced for compressed sensing. We show that a specific input purification scheme based on the RWP gives theoretical robustness guarantees for images that are approximately sparse. The defense is easy to implement and can be applied to any existing model without additional training or fine-tuning. We empirically validate the defense on ImageNet against L^∞ perturbations at perturbation budgets ranging from 4/255 to 32/255. In the black-box setting, our method significantly outperforms the state of the art, especially for large perturbations. In the white-box setting, depending on the choice of base classifier, we closely match the state of the art in robust ImageNet classification while avoiding the need for additional data, larger models or expensive adversarial training routines. Our code is available at https://github.com/peck94/robust-width-defense.
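Since the purified input is simply handed to an off-the-shelf classifier, the idea can be pictured as transform-domain sparsification before inference. The sketch below is only illustrative and is not the paper's RWP reconstruction: it keeps the largest-magnitude DCT coefficients of the input (the purify helper and the keep_ratio parameter are hypothetical), assuming a float image in [0, 1].

```python
# Illustrative purification sketch, NOT the paper's exact RWP scheme:
# project the input onto its largest DCT coefficients before classification.
import numpy as np
from scipy.fft import dctn, idctn

def purify(image: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Keep only the `keep_ratio` largest-magnitude DCT coefficients (hypothetical helper)."""
    coeffs = dctn(image, norm="ortho", axes=(0, 1))      # transform spatial axes only
    k = max(1, int(keep_ratio * coeffs.size))
    threshold = np.sort(np.abs(coeffs), axis=None)[-k]   # k-th largest magnitude
    sparse = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
    return np.clip(idctn(sparse, norm="ortho", axes=(0, 1)), 0.0, 1.0)

# Usage: feed classifier(purify(x)) instead of classifier(x); no retraining needed.
```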
An Introduction to Adversarially Robust Deep Learning
Jonathan Peck, Bart Goossens, and Yvan Saeys
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
The widespread success of deep learning in solving machine learning problems has fueled its adoption in many fields, from speech recognition to drug discovery and medical imaging. However, deep learning systems are extremely fragile: imperceptibly small modifications to their input data can cause the models to produce erroneous output. It is very easy to generate such adversarial perturbations even for state-of-the-art models, yet immunization against them has proven exceptionally challenging. Despite over a decade of research on this problem, our solutions are still far from satisfactory and many open problems remain. In this work, we survey some of the most important contributions in the field of adversarial robustness. We pay particular attention to the reasons why past attempts at improving robustness have been insufficient, and we identify several promising areas for future research.
2023
Improving the robustness of deep neural networks to adversarial perturbations
Over the past decade, artificial neural networks have ushered in a revolution in science and society. Nowadays, neural networks are applied to various problems such as speech recognition on smartphones, self-driving cars, malware detection and even assisting doctors in making medical diagnoses. Often, neural networks achieve accuracy scores that rival or even surpass human domain experts. It is all the more surprising, then, that these same networks can be misled by minor manipulations that are invisible to the human eye. A neural network trained to identify lung cancer from MRI images, for example, can come to an incorrect diagnosis when a single pixel in the image is deliberately manipulated in a very specific way. Despite the fact that these perturbations are of no consequence to the underlying task and often would go unnoticed by human experts, neural networks tend to be incredibly sensitive to them. Developing defense methods which make our models resilient to such attacks therefore becomes paramount. In this work, I propose four methods that can be employed under different circumstances to protect systems based on artificial intelligence against adversarial attacks.
2022
Distilling Deep RL Models Into Interpretable Neuro-Fuzzy Systems
Deep Reinforcement Learning uses a deep neural network to encode a policy, which achieves very good performance in a wide range of applications but is widely regarded as a black box model. A more interpretable alternative to deep networks is given by neuro-fuzzy controllers. Unfortunately, neuro-fuzzy controllers often need a large number of rules to solve relatively simple tasks, making them difficult to interpret. In this work, we present an algorithm to distill the policy from a deep Q-network into a compact neuro-fuzzy controller. This allows us to train compact neuro-fuzzy controllers through distillation to solve tasks that they are unable to solve directly, combining the flexibility of deep reinforcement learning and the interpretability of compact rule bases. We demonstrate the algorithm on three well-known environments from OpenAI Gym, where we nearly match the performance of a DQN agent using only 2 to 6 fuzzy rules.
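As a rough picture of the distillation step (the paper's own algorithm and the neuro-fuzzy student architecture are not reproduced here), the sketch below trains a generic differentiable student to imitate the softened action values of a trained DQN teacher; teacher_dqn, student, and the batch of collected states are assumed to exist.

```python
# Hedged policy-distillation sketch: a differentiable student imitates the
# teacher DQN's softened action preferences on a batch of collected states.
import torch
import torch.nn.functional as F

def distill(teacher_dqn, student, states, epochs=200, lr=1e-3, tau=1.0):
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    with torch.no_grad():
        targets = F.softmax(teacher_dqn(states) / tau, dim=-1)      # soft teacher policy
    for _ in range(epochs):
        opt.zero_grad()
        log_probs = F.log_softmax(student(states) / tau, dim=-1)
        loss = F.kl_div(log_probs, targets, reduction="batchmean")  # KL to teacher
        loss.backward()
        opt.step()
    return student
```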
2020
Inline Detection of DGA Domains Using Side Information
Raaghavi Sivaguru, Jonathan Peck, Femi Olumofin, and
2 more authors
Malware applications typically use a command and control (C&C) server to manage bots to perform malicious activities. Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names that can be used to establish communication between an infected bot and the C&C server. In recent years, machine learning-based systems have been widely used to detect DGAs. There are several well-known state-of-the-art classifiers in the literature that can detect DGA domain names in real-time applications with high predictive performance. However, these DGA classifiers are highly vulnerable to adversarial attacks in which adversaries purposely craft domain names to evade DGA detection classifiers. In our work, we focus on hardening DGA classifiers against adversarial attacks. To this end, we train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself. Additionally, the side information features are selected such that they are easily obtainable in practice to perform inline DGA detection. The performance and robustness of these models are assessed by exposing them to one day of real-traffic data as well as domains generated by adversarial attack algorithms. We found that the DGA classifiers that rely on both the domain name and side information have high performance and are more robust against adversaries.
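A minimal sketch of the general recipe, with hypothetical feature names and without the paper's actual side-information set: concatenate simple domain-name features with whatever side information is available inline and fit a random forest.

```python
# Hedged sketch: combine toy domain-name features with side-information
# features and train a random forest classifier for DGA detection.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def domain_features(domain: str) -> list:
    name = domain.split(".")[0]
    digits = sum(c.isdigit() for c in name)
    return [len(name), digits / max(len(name), 1)]   # length, digit ratio

def train_dga_classifier(domains, X_side, y):
    # X_side: per-domain side-information matrix; y: 0 = benign, 1 = DGA
    X_name = np.array([domain_features(d) for d in domains])
    X = np.hstack([X_name, X_side])
    return RandomForestClassifier(n_estimators=100).fit(X, y)
```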
Calibrated Multi-probabilistic Prediction as a Defense Against Adversarial Attacks
Jonathan Peck, Bart Goossens, and Yvan Saeys
In Artificial Intelligence and Machine Learning, 2020
Machine learning (ML) classifiers—in particular deep neural networks—are surprisingly vulnerable to so-called adversarial examples. These are small modifications of natural inputs which drastically alter the output of the model even though no relevant features appear to have been modified. One explanation that has been offered for this phenomenon is the calibration hypothesis, which states that the probabilistic predictions of typical ML models are miscalibrated. As a result, classifiers can often be very confident in completely erroneous predictions. Based on this idea, we propose the MultIVAP algorithm for defending arbitrary ML models against adversarial examples. Our method is inspired by the inductive Venn-ABERS predictor (IVAP) technique from the field of conformal prediction. The IVAP enjoys the theoretical guarantee that its predictions will be perfectly calibrated, thus addressing the problem of miscalibration. Experimental results on five image classification tasks demonstrate empirically that the MultIVAP has a reasonably small computational overhead and provides significantly higher adversarial robustness without sacrificing accuracy on clean data. This increase in robustness is observed both against defense-oblivious attacks as well as a defense-aware white-box attack specifically designed for the MultIVAP.
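The calibration building block can be sketched as follows. This is a simplified inductive Venn-ABERS step for a single binary score, not the full MultIVAP, and the helper name is hypothetical.

```python
# Simplified IVAP-style interval: refit isotonic regression on the calibration
# scores with the test score appended under each hypothetical label.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def venn_abers_interval(cal_scores, cal_labels, test_score):
    probs = []
    for label in (0, 1):
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(np.append(cal_scores, test_score), np.append(cal_labels, label))
        probs.append(float(iso.predict([test_score])[0]))
    return probs[0], probs[1]   # lower/upper calibrated probability of class 1
```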
Regional image perturbation reduces L_p norms of adversarial examples while maintaining model-to-model transferability
Utku Özbulak, Jonathan Peck, Wesley De Neve, and
3 more authors
In the 37th International Conference on Machine Learning (ICML 2020) Proceedings, 2020
Regional adversarial attacks often rely on complicated methods for generating adversarial perturbations, making it hard to compare their efficacy against well-known attacks. In this study, we show that effective regional perturbations can be generated without resorting to complex methods. We develop a very simple regional adversarial perturbation attack method based on the sign of the gradient of the cross-entropy loss, one of the most commonly used losses in adversarial machine learning. Our experiments on ImageNet with multiple models reveal that, on average, 76% of the generated adversarial examples maintain model-to-model transferability when the perturbation is applied to local image regions. Depending on the selected region, these localized adversarial examples require significantly less L_p norm distortion (for p ∈ {0, 2, ∞}) compared to their non-local counterparts. These localized attacks therefore have the potential to undermine defenses that claim robustness under the aforementioned norms.
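In spirit, the attack restricts a sign-of-gradient step to a chosen image region. The sketch below shows that idea for a PyTorch model with a binary region mask; it is illustrative and not the paper's exact procedure.

```python
# Localized sign-of-gradient perturbation (illustrative, single step):
# the perturbation is zeroed outside the chosen region via `region_mask`.
import torch
import torch.nn.functional as F

def regional_sign_attack(model, x, y, region_mask, eps=8 / 255):
    x_adv = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)          # cross-entropy loss
    loss.backward()
    delta = eps * x_adv.grad.sign() * region_mask    # restrict to the region
    return (x_adv + delta).clamp(0.0, 1.0).detach()
```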
2019
CharBot: A Simple and Effective Method for Evading DGA Classifiers
Jonathan Peck, Claire Nie, Raaghavi Sivaguru, and
5 more authors
Domain generation algorithms (DGAs) are commonly leveraged by malware to create lists of domain names which can be used for command and control (C&C) purposes. Approaches based on machine learning have recently been developed to automatically detect generated domain names in real time. In this work, we present a novel DGA called CharBot which is capable of producing large numbers of unregistered domain names that are not detected by state-of-the-art classifiers for real-time detection of DGAs, including the recently published methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep learning approach). CharBot is very simple and effective, and it requires no knowledge of the targeted DGA classifiers. We show that retraining the classifiers on CharBot samples is not a viable defense strategy. We believe these findings show that DGA classifiers are inherently vulnerable to adversarial attacks if they rely only on the domain name string to make a decision. Designing a robust DGA classifier may, therefore, necessitate the use of additional information besides the domain name alone. To the best of our knowledge, CharBot is the simplest and most efficient black-box adversarial attack against DGA classifiers proposed to date.
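To give a flavor of how simple such an attack can be (the published CharBot has additional details that are not reproduced here), a toy variant might perturb a benign domain by swapping a couple of its characters for random ones:

```python
# Toy CharBot-like perturbation of a benign domain name (simplified).
import random
import string

def perturb_domain(domain: str, n_changes: int = 2) -> str:
    name, tld = domain.rsplit(".", 1)
    chars = list(name)
    for i in random.sample(range(len(chars)), k=min(n_changes, len(chars))):
        chars[i] = random.choice(string.ascii_lowercase + string.digits)
    return "".join(chars) + "." + tld

# e.g. perturb_domain("wikipedia.org") -> "wikipev7a.org" (output varies)
```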
Hardening DGA Classifiers Utilizing IVAP
Charles Grumer, Jonathan Peck, Femi Olumofin, and
2 more authors
In 2019 IEEE International Conference on Big Data (Big Data), 2019
Domain Generation Algorithms (DGAs) are used by malware to generate a deterministic set of domains, usually by utilizing a pseudo-random seed. A malicious botmaster can establish connections between their command-and-control center (C&C) and any malware-infected machines by registering domains that will be DGA-generated given a specific seed, rendering traditional domain blacklisting ineffective. Given the nature of this threat, the real-time detection of DGA domains based on incoming DNS traffic is highly important. The use of neural network machine learning (ML) models for this task has been well-studied, but there is still substantial room for improvement. In this paper, we propose to use Inductive Venn-Abers predictors (IVAPs) to calibrate the output of existing ML models for DGA classification. The IVAP is a computationally efficient procedure which consistently improves the predictive accuracy of classifiers at the expense of not offering predictions for a small subset of inputs and consuming an additional amount of training data.
Detecting adversarial examples with inductive Venn-ABERS predictors
Jonathan Peck, Bart Goossens, and Yvan Saeys
In Proceedings of the 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2019), 2019
Inductive Venn-ABERS predictors (IVAPs) are a type of probabilistic predictor with the theoretical guarantee that their predictions are perfectly calibrated. We propose to exploit this calibration property for the detection of adversarial examples in binary classification tasks. By rejecting predictions when the uncertainty of the IVAP is too high, we obtain an algorithm that is both accurate on the original test set and significantly more robust to adversarial examples. The method appears to be competitive with the state of the art in adversarial defense, both in terms of robustness and scalability.
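The rejection rule can be pictured as follows; the threshold and helper name are illustrative, not the paper's exact settings. Abstain whenever the Venn-ABERS interval is too wide, otherwise merge it into a single probability.

```python
# Illustrative rejection rule on a Venn-ABERS probability interval [p0, p1].
def predict_or_reject(p0: float, p1: float, max_width: float = 0.2):
    if p1 - p0 > max_width:
        return None                      # reject: prediction too uncertain
    p = p1 / (1.0 - p0 + p1)             # standard merge of the interval
    return int(p >= 0.5)
```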
2017
Lower bounds on the robustness to adversarial perturbations
Jonathan Peck, Joris Roels, Bart Goossens, and
1 more author
In Advances in Neural Information Processing Systems, 2017
The input-output mappings learned by state-of-the-art neural networks are significantly discontinuous. It is possible to cause a neural network used for image recognition to misclassify its input by applying very specific, hardly perceptible perturbations to the input, called adversarial perturbations. Many hypotheses have been proposed to explain the existence of these peculiar samples as well as several methods to mitigate them. A proven explanation remains elusive, however. In this work, we take steps towards a formal characterization of adversarial perturbations by deriving lower bounds on the magnitudes of perturbations necessary to change the classification of neural networks. The bounds are experimentally verified on the MNIST and CIFAR-10 data sets.
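To illustrate the flavor of such results (this is a generic Lipschitz-based bound, not necessarily the exact bound derived in the paper): if every class score f_j is Lipschitz continuous with constant L, then any perturbation δ that changes the predicted class c of an input x must satisfy

$$\|\delta\| \;\geq\; \frac{f_c(x) - \max_{j \neq c} f_j(x)}{2L},$$

since each score gap f_c - f_j is 2L-Lipschitz and must be driven to zero before the decision can flip.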