Transfer-based adversarial attacks are an important type of black-box attack, in which the adversary attacks an unseen target model without knowing its internal information. Existing approaches usually rely only on a pretrained model to generate adversarial examples. However, the inconsistency between different models induces poor adversarial transferability, especially across different model architectures. To solve this problem, a research team led by Yuanfang Guo published their new research on 15 October 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed a common knowledge learning (CKL) method that lets the substitute (source) model learn better network weights, under a fixed network architecture, so that the adversarial examples it generates transfer better. Specifically, to reduce model-specific features and obtain better output distributions, a multi-teacher approach is proposed in which knowledge is distilled from different teacher architectures into one student network. Compared with existing methods, the proposed method significantly improves the attack success rate (ASR).
In the research, they analyze the relation between adversarial transferability and the output consistency of different models, and observe that higher output inconsistency tends to induce lower transferability and vice versa. Therefore, they introduce a multi-teacher approach to learn the common knowledge of the teacher models.
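As a rough illustration of what "output inconsistency" between two models could mean, the sketch below compares the predicted class distributions of two models on the same input. The release does not state which measure the authors use; the symmetric KL divergence, the function names, and the example logits here are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def output_inconsistency(logits_a, logits_b):
    """Hypothetical measure: symmetric KL between two models' outputs on one input."""
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Made-up logits: two models that agree, and two that disagree on the top class.
agreeing = output_inconsistency([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
disagreeing = output_inconsistency([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

Under this assumed measure, models that agree score zero and models that disagree score higher, which matches the direction of the observation above: the larger the gap, the less likely an adversarial example crafted on one model is to fool the other.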
First, a multi-teacher approach is proposed, in which the student model learns from the teacher outputs to reduce model-specific features and obtain common (model-agnostic) features, alleviating the output inconsistency problem. In addition, since input gradients are used in typical adversarial attack processes, they design a constraint on the input gradients between the teacher models and the student model to further promote the transferability of the generated adversarial examples. Extensive experiments on CIFAR-10, CIFAR-100, and TinyImageNet demonstrate the superiority of the proposed work.
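The two ingredients described above, distillation from several teachers and a constraint on input gradients, can be sketched as a single training objective. This is only a minimal sketch under stated assumptions, not the authors' formulation: linear softmax classifiers stand in for deep networks so the input gradient has a closed form, the gradient constraint is assumed to be a cosine distance, and the name `common_knowledge_loss` and the weight `lam` are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_of(W, x):
    """Logits of a linear classifier with weight rows W (one row per class)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def input_gradient(W, x, label):
    """Gradient of cross-entropy w.r.t. x for a linear model: W^T (p - onehot)."""
    p = softmax(logits_of(W, x))
    err = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    return [sum(W[k][j] * err[k] for k in range(len(W))) for j in range(len(x))]

def kl_divergence(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cosine_distance(u, v, eps=1e-12):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + eps)

def common_knowledge_loss(student_W, teacher_Ws, x, label, lam=1.0):
    """Assumed objective: mean output KL to the teachers plus a weighted
    mean cosine distance between student and teacher input gradients."""
    s_probs = softmax(logits_of(student_W, x))
    s_grad = input_gradient(student_W, x, label)
    distill, grad_term = 0.0, 0.0
    for W in teacher_Ws:
        distill += kl_divergence(softmax(logits_of(W, x)), s_probs)
        grad_term += cosine_distance(s_grad, input_gradient(W, x, label))
    n = len(teacher_Ws)
    return distill / n + lam * grad_term / n

# Toy data: one 2-D input, two classes, two made-up "teachers".
x, label = [1.0, -0.5], 0
teachers = [[[1.0, 0.0], [0.0, 1.0]], [[0.8, 0.1], [0.1, 0.9]]]
loss_aligned = common_knowledge_loss([[1.0, 0.0], [0.0, 1.0]], teachers, x, label)
loss_opposed = common_knowledge_loss([[0.0, 1.0], [1.0, 0.0]], teachers, x, label)
```

In this toy setting, a student whose outputs and input gradients point the same way as the teachers' receives a lower loss than one that contradicts them. In practice the student would be a deep network trained with automatic differentiation, and the exact loss terms should be taken from the paper itself.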
Future work can focus on finding more suitable and efficient teacher models and student models to further improve the attack transferability.
DOI:10.1007/s11704-024-40533-4