Abstract
Training certifiably robust neural networks enables us to obtain models with robustness guarantees against adversarial attacks. In this work, we introduce a framework that obtains a provably adversarial-free region in the neighborhood of the input data, characterized by a polyhedral envelope, which yields more fine-grained certified robustness than existing methods. We further introduce polyhedral envelope regularization (PER) to encourage larger adversarial-free regions and thus improve the provable robustness of the models. We demonstrate the flexibility and effectiveness of our framework on standard benchmarks; it applies to networks of different architectures and with general activation functions. Compared with the state of the art, PER has negligible computational overhead; it achieves better robustness guarantees and accuracy on clean data in various settings.
| Original language | English |
|---|---|
| Pages (from-to) | 3146-3160 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 34 |
| Issue number | 6 |
| Online published | 26 Oct 2021 |
| DOIs | |
| Publication status | Published - Jun 2023 |
| Externally published | Yes |
Research Keywords
- Adversarial training
- Computational modeling
- Predictive models
- Provable robustness
- Recurrent neural networks
- Robustness
- Smoothing methods
- Standards
- Training