
PhD@ETH Zürich,
Natural Language Processing

Research


Adversarial example generation becomes a viable method for evaluating the robustness of a machine learning model. In this work, we consider hard-label black-box attacks (a.k.a. decision-based attacks), which is a challenging setting that generates adversarial examples based on only a series of black-box hard-label queries. This type of attacks can be used to attack discrete and complex models, such as Gradient Boosting Decision Tree (GBDT) and detection-based defense models. Existing decision-based attacks based on iterative local updates often get stuck in a local minimum and fail to generate the optimal adversarial example with the smallest distortion.
To remedy this issue, we propose an efficient meta-algorithm called BOSH-attack, which tremendously improves existing algorithms through Bayesian Optimization (BO) and Successive Halving (SH). In particular, instead of traversing a single solution path when searching an adversarial example, we maintain a pool of solution paths to explore important regions. We show empirically that the proposed algorithm converges to a better solution than existing approaches, while the query count is smaller than applying multiple random initializations by a factor of 10.
BOSH: An Efficient Meta Algorithm for Decision-based Attacks
Zhenxin Xiao, Puyudi Yang, Yuchen Jiang, Kai-Wei Chang, Cho-Jui Hsieh