Peer-reviewed article

Person name disambiguation on the web in a multilingual context

2018; Elsevier BV; Volume: 465; Language: English

10.1016/j.ins.2018.07.024

ISSN

1872-6291

Authors

Agustín Delgado, Raquel Martínez, Soto Montalvo, Víctor Fresno

Topic(s)

Semantic Web and Ontologies

Abstract

Person Name Disambiguation on the Web is the problem of grouping the web pages that a search engine retrieves for a person-name query according to the individual they refer to. This problem has mostly been addressed in a monolingual scenario, where all the search results are written in the same language. However, search engines can also return links to web pages written in different languages. We study how to address multilingualism for this problem using the MC4WePS data set, a recent gold standard that includes real search results written in different languages. For this purpose, we first analyze the suitability of using a translation tool to handle multilingualism with two state-of-the-art clustering algorithms. Since the use of such tools increases the processing time of the disambiguation process, we propose an approach to multilingualism that generalizes the monolingual scenario and does not require any translation resources. Our approach obtains better results than the translation approaches on the gold standard, making it a competitive choice in a real scenario.
