Representation of Semantic Word Embeddings Based on SLDA and Word2vec Model
2023; Institution of Engineering and Technology; Volume: 32; Issue: 3; Language: English
DOI: 10.23919/cje.2021.00.113
ISSN: 2075-5597
Authors: Tang Huanling, Hui Zhu, Wei Hongmin, Han Zheng, Mao Xueli, Mingyu Lu, Jin Guo
Topic(s): Text and Document Classification Technologies
Abstract: To address the loss of semantic information in text representation, this paper proposes a new method for embedding words in a semantic space, called wt2svec, based on supervised latent Dirichlet allocation (SLDA) and Word2vec. It generates a global topic embedding for each word using SLDA, which discovers global semantic information through latent topics over the whole document set, and obtains a local semantic embedding for each word using Word2vec. The new semantic word vector is obtained by combining the global semantic information with the local semantic information. Additionally, a document semantic vector, named doc2svec, is generated. Experimental results on different datasets show that the wt2svec model noticeably improves the accuracy of word semantic similarity and the performance of text categorization compared with Word2vec.
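The abstract does not specify how the global and local embeddings are combined, nor how doc2svec aggregates word vectors. The following minimal Python sketch illustrates one plausible reading: SLDA's per-word topic distributions serve as the global vector, Word2vec supplies the local vector, and the two are concatenated. Everything beyond the Word2vec call is an assumption for illustration: the topic distributions are random placeholders (gensim does not ship an SLDA implementation), concatenation and mean pooling are assumed operators, and the names wt2svec/doc2svec merely mirror the paper's terminology.

import numpy as np
from gensim.models import Word2Vec

# Toy corpus; in the paper, SLDA is trained on a labeled document set.
corpus = [
    ["machine", "learning", "improves", "text", "classification"],
    ["topic", "models", "discover", "latent", "semantic", "structure"],
    ["word", "embeddings", "capture", "local", "context"],
]

# Local semantic word vectors via Word2vec, as in the paper.
w2v = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, seed=1)

# Hypothetical stand-in for SLDA output: a per-word distribution over
# K latent topics. A real run would come from a supervised LDA model.
K = 10
rng = np.random.default_rng(0)
vocab = list(w2v.wv.key_to_index)
topic_dist = {w: rng.dirichlet(np.ones(K)) for w in vocab}

def wt2svec(word):
    # Combine global (topic) and local (Word2vec) semantics by
    # concatenation -- one plausible operator; the abstract does not
    # state which combination the authors actually use.
    return np.concatenate([topic_dist[word], w2v.wv[word]])

def doc2svec(doc):
    # Document vector as the mean of its words' wt2svec vectors, a
    # common aggregation choice (an assumption, not from the abstract).
    return np.mean([wt2svec(w) for w in doc if w in topic_dist], axis=0)

print(wt2svec("text").shape)      # (60,) = K topic dims + 50 Word2vec dims
print(doc2svec(corpus[0]).shape)  # (60,)

Under this reading, word similarity computed on the concatenated vectors reflects both corpus-level topical relatedness and local context co-occurrence, which is the intuition behind the reported gains over plain Word2vec.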