Image-to-Markup Generation via Paired Adversarial Learning

Book chapter, peer-reviewed

2019; Springer Science+Business Media; Language: English

DOI

10.1007/978-3-030-10925-7_2

ISSN

1611-3349

Authors

Jin-Wen Wu, Fei Yin, Yanming Zhang, Xu-Yao Zhang, Cheng-Lin Liu

Topic(s)

Multimodal Machine Learning Applications

Abstract

Motivated by the fact that humans can grasp semantic-invariant features shared by the same category while attention-based models focus mainly on discriminative features of each object, we propose a scalable paired adversarial learning (PAL) method for image-to-markup generation. PAL can incorporate the prior knowledge of standard templates to guide the attention-based model for discovering semantic-invariant features when the model pays attention to regions of interest. Furthermore, we also extend the convolutional attention mechanism to speed up the image-to-markup parsing process while achieving competitive performance compared with recurrent attention models. We evaluate the proposed method in the scenario of handwritten-image-to-LaTeX generation, i.e., converting handwritten mathematical expressions to LaTeX. Experimental results show that our method can significantly improve the generalization performance over standard attention-based encoder-decoder models.
