The researchers evaluated PubDef against 264 different transfer attacks on the CIFAR-10, CIFAR-100, and ImageNet datasets, and found that it significantly outperformed prior defenses such as adversarial training, with almost no drop in accuracy on clean inputs. However, PubDef does have limitations, including its reliance on keeping the defended model secret and its vulnerability to attackers who train their own private surrogate models. Despite these limitations, the article concludes that PubDef represents a promising step toward practical defenses against adversarial attacks on machine learning systems.
Key takeaways:
- Adversarial attacks are a significant threat to machine learning systems, and defending against them is a major area of research. A new defense called PubDef, introduced by UC Berkeley researchers, shows promise in increasing robustness against a realistic class of attacks while maintaining accuracy on clean inputs.
- PubDef is designed to resist transfer attacks crafted on publicly available models. It frames the problem as a game: the attacker's strategy is to pick a public source model and an attack algorithm, while the defender's strategy is to choose the defended model's parameters, training against these transfer attacks so as to maximize accuracy under the attacker's best choice.
- PubDef significantly outperforms prior defenses like adversarial training, achieving higher robust accuracy on the CIFAR-10, CIFAR-100, and ImageNet datasets. It also maintains almost the same accuracy on clean inputs, demonstrating stronger robustness with a smaller cost on unperturbed data.
- While PubDef represents a promising step towards developing practical defenses, it does have limitations. It specifically focuses on transfer attacks from public models and does not address other threats like white-box attacks. Further work is needed to handle other threats and reduce reliance on model secrecy.
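The game-theoretic framing above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the source-model names, attack names, and accuracy numbers below are all made up for the example. The key idea it shows is that the defender's robustness is measured at the attacker's best response, i.e. the (source model, attack) pair that minimizes the defended model's accuracy.

```python
# Illustrative sketch of the transfer-attack game (made-up values,
# not results from the PubDef paper).
# Attacker strategies: pairs of (public source model, attack algorithm).
# The table maps each attacker strategy to the defended model's accuracy
# under that transfer attack.
robust_acc = {
    ("resnet_public", "pgd"): 0.89,
    ("resnet_public", "mi_fgsm"): 0.86,
    ("vit_public", "pgd"): 0.91,
    ("vit_public", "mi_fgsm"): 0.88,
}

def worst_case(acc_table):
    """Return the attacker's best response: the strategy that
    minimizes the defender's accuracy, and that accuracy."""
    strategy = min(acc_table, key=acc_table.get)
    return strategy, acc_table[strategy]

strategy, acc = worst_case(robust_acc)
print(strategy, acc)  # → ('resnet_public', 'mi_fgsm') 0.86
```

Under this framing, the defender trains so that this worst-case number is as high as possible, rather than optimizing average-case accuracy across attacks.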