Image-to-Markup Generation via Paired Adversarial Learning
2019; Springer Science+Business Media; Language: English
DOI: 10.1007/978-3-030-10925-7_2
ISSN: 1611-3349
Authors: Jin-Wen Wu, Fei Yin, Yanming Zhang, Xu-Yao Zhang, Cheng-Lin Liu
Topic(s): Multimodal Machine Learning Applications
Abstract: Motivated by the fact that humans can grasp semantic-invariant features shared by the same category while attention-based models focus mainly on discriminative features of each object, we propose a scalable paired adversarial learning (PAL) method for image-to-markup generation. PAL can incorporate the prior knowledge of standard templates to guide the attention-based model for discovering semantic-invariant features when the model pays attention to regions of interest. Furthermore, we also extend the convolutional attention mechanism to speed up the image-to-markup parsing process while achieving competitive performance compared with recurrent attention models. We evaluate the proposed method in the scenario of handwritten-image-to-LaTeX generation, i.e., converting handwritten mathematical expressions to LaTeX. Experimental results show that our method can significantly improve the generalization performance over standard attention-based encoder-decoder models.