Simple and Easy: Transfer Learning-Based Attacks to Text CAPTCHA
2020; Institute of Electrical and Electronics Engineers; Volume: 8; Language: English
DOI: 10.1109/access.2020.2982945
ISSN: 2169-3536
Authors: Ping Wang, Haichang Gao, Ziyu Shi, Zhongni Yuan, Jiangping Hu,
Topic(s): Spam and Phishing Detection
Abstract: CAPTCHA, or Completely Automated Public Turing test to tell Computers and Humans Apart, is a common mechanism used to protect commercial accounts from malicious computer bots, and the most widely used scheme is the text-based CAPTCHA. In recent years, deep learning techniques have achieved high accuracy and speed in attacking text-based CAPTCHAs. However, most existing attacks have drawbacks: either the attack process is highly complex, or manually collecting and labeling the large number of samples needed to train a deep learning recognition model is time-consuming and expensive. In this paper, we propose a transfer learning-based approach that greatly reduces both the attack complexity and the cost of labeling samples. Specifically, we pre-train the model with randomly generated samples and then fine-tune the pre-trained model with a small number of real-world samples. To evaluate our attack, we tested it on 25 online CAPTCHAs, achieving success rates ranging from 36.3% to 96.9%. To further explore the effect of the training sample characteristics on the attack accuracy, we elaborately imitate some samples and apply a generative adversarial network to refine the samples; we then use these two kinds of generated samples to pre-train the models, respectively. The experimental results demonstrate that whether the pre-training samples are randomly generated or elaborately imitated has a negligible impact on the attack accuracy. Instead, transfer learning is the key factor: it reduces the cost of data preparation while preserving the model's attack accuracy.
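The two-stage recipe described in the abstract (pre-train on plentiful generated samples, then fine-tune on a few real ones) can be illustrated with a minimal sketch. This is not the authors' model or data: a toy logistic classifier on 2-D points stands in for a deep recognition network, and the function names, cluster centers, sample counts, and learning rates are all assumptions chosen for illustration.

```python
import math
import random


def make_samples(n, centers, noise, rng):
    """Draw n labeled 2-D points; label k is sampled around centers[k]."""
    samples = []
    for _ in range(n):
        label = rng.randrange(len(centers))
        cx, cy = centers[label]
        samples.append(((rng.gauss(cx, noise), rng.gauss(cy, noise)), label))
    return samples


def predict_prob(params, x):
    """Probability of class 1 under a logistic model (w0, w1, b)."""
    z = params[0] * x[0] + params[1] * x[1] + params[2]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(params, samples, lr, epochs):
    """Plain SGD on the logistic loss; returns the updated parameters."""
    w0, w1, b = params
    for _ in range(epochs):
        for (x0, x1), y in samples:
            err = predict_prob((w0, w1, b), (x0, x1)) - y
            w0 -= lr * err * x0
            w1 -= lr * err * x1
            b -= lr * err
    return (w0, w1, b)


def accuracy(params, samples):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((predict_prob(params, x) >= 0.5) == (y == 1) for x, y in samples)
    return hits / len(samples)


rng = random.Random(0)
# Stage 1 data: plentiful, cheaply generated samples (hypothetical distribution).
synthetic = make_samples(1000, [(-2.0, -2.0), (2.0, 2.0)], 0.5, rng)
# Stage 2 data: a small labeled "real" set from a slightly shifted distribution.
real_train = make_samples(20, [(-1.5, -2.5), (2.5, 1.5)], 0.5, rng)
real_test = make_samples(200, [(-1.5, -2.5), (2.5, 1.5)], 0.5, rng)

# Pre-train from scratch, then continue training from those weights
# on the small real set with a lower learning rate (an assumed choice).
pretrained = train((0.0, 0.0, 0.0), synthetic, lr=0.1, epochs=3)
finetuned = train(pretrained, real_train, lr=0.01, epochs=5)

print("pre-trained only:", accuracy(pretrained, real_test))
print("after fine-tuning:", accuracy(finetuned, real_test))
```

The point of the sketch is the control flow: fine-tuning starts from the pre-trained parameters instead of a fresh initialization, so only a small labeled real set is needed. In a deep model the same pattern would typically load pre-trained weights and continue training on the real samples.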