Artigo Acesso aberto Revisado por pares

Simple and Easy: Transfer Learning-Based Attacks to Text CAPTCHA

2020; Institute of Electrical and Electronics Engineers; Volume: 8; Linguagem: Inglês

10.1109/access.2020.2982945

ISSN

2169-3536

Autores

Ping Wang, Haichang Gao, Ziyu Shi, Zhongni Yuan, Jiangping Hu,

Tópico(s)

Spam and Phishing Detection

Resumo

CAPTCHA, or Completely Automated Public Turing Tests to Tell Computers and Humans Apart, is a common mechanism used to protect commercial accounts from malicious computer bots, and the most widely used scheme is text-based CAPTCHA. In recent years, newly emerged deep learning techniques have achieved high accuracy and speed in attacking text-based CAPTCHAs. However, most of the existing attacks have various disadvantages, the attack process made high complexity or manually collecting and labeling a large number of samples to train a deep learning recognition model is time-consuming and expensive. In this paper, we propose a transfer learning-based approach that greatly reduces the attack complexity and the cost of labeling samples, specifically, by pre-training the model with randomly generated samples and fine-tuning the pre-trained model with a small number of real-world samples. To evaluate our attack, we tested 25 online CAPTCHAs achieving success rates ranging from 36.3% to 96.9%. To further explore the effect of the training sample characteristics on the attack accuracy, we elaborately imitate some samples and apply a generative adversarial network to refine the samples, sequentially we use these two kinds of generated samples to pre-train the models, respectively. The experimental results demonstrate that the similarity between randomly generated samples and elaborately imitated samples has a negligible impact on the attack accuracy. Instead, transfer learning is the key factor; it reduces the cost of data preparation while preserving the model's attack accuracy.

Referência(s)
Altmetric
PlumX