Book chapter, peer-reviewed

Entity Resolution in Texts Using Statistical Learning and Ontologies

2009; Springer Science+Business Media; Language: English

10.1007/978-3-642-10871-6_7

ISSN

1611-3349

Authors

Tadej Štajner, Dunja Mladenić

Topic(s)

Semantic Web and Ontologies

Abstract

Ambiguities, which are inherently present in natural languages, make it challenging to determine the actual identities of entities mentioned in a document (e.g., Paris can refer to a city in France, to a small city in Texas, USA, or to the 1984 film Paris, Texas, directed by Wim Wenders). Entity resolution methods can solve this disambiguation problem successfully. This paper studies various methods for estimating relatedness between entities, as used in collective entity resolution. We define a unified entity resolution approach that can use implicit as well as explicit relatedness to collectively identify in-text entities. As a relatedness measure, we propose a method that expresses relatedness through the heterogeneous relations of a domain ontology. We also experiment with other relatedness measures, such as statistical learning of co-occurrences of two entities and content similarity between them. Evaluation on real data shows that the new methods for relatedness estimation give good results.
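The idea of collective entity resolution with an explicit, ontology-based relatedness measure can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' method: the candidate lists, the toy ontology links, the `relatedness` scoring rule (a direct link scores 1.0, otherwise neighbour overlap), and the exhaustive search in `resolve` are all hypothetical.

```python
from itertools import combinations, product

# Hypothetical candidate entities for each in-text mention.
CANDIDATES = {
    "Paris": ["Paris_France", "Paris_Texas", "Paris_Texas_film"],
    "Wim Wenders": ["Wim_Wenders"],
}

# Hypothetical ontology: each entity maps to the set of entities it is linked to.
ONTOLOGY_LINKS = {
    "Paris_France": {"France", "Seine"},
    "Paris_Texas": {"Texas", "USA"},
    "Paris_Texas_film": {"Wim_Wenders", "Texas"},
    "Wim_Wenders": {"Paris_Texas_film", "Germany"},
}


def relatedness(a, b):
    """Toy explicit relatedness: 1.0 for a direct link, else Jaccard overlap of neighbours."""
    links_a = ONTOLOGY_LINKS.get(a, set())
    links_b = ONTOLOGY_LINKS.get(b, set())
    if b in links_a or a in links_b:
        return 1.0
    union = links_a | links_b
    return len(links_a & links_b) / len(union) if union else 0.0


def resolve(mentions):
    """Pick the joint assignment with the highest total pairwise relatedness."""
    best_score, best = -1.0, None
    # Exhaustive search over all candidate combinations; fine only for tiny inputs.
    for assignment in product(*(CANDIDATES[m] for m in mentions)):
        score = sum(relatedness(a, b) for a, b in combinations(assignment, 2))
        if score > best_score:
            best_score, best = score, assignment
    return dict(zip(mentions, best))


result = resolve(["Paris", "Wim Wenders"])
print(result)
```

In this toy setting, the mention "Paris" resolves to the film because that candidate is the only one linked to the co-mentioned director. Swapping `relatedness` for a co-occurrence or content-similarity score would correspond to the implicit measures the abstract mentions.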

Reference(s)