Article, National Production, Peer-reviewed

Learning to expand queries using entities

2014; Wiley; Volume: 65; Issue: 9; Language: English

10.1002/asi.23084

ISSN

2330-1643

Authors

Wladmir Cardoso Brandão, Rodrygo L. T. Santos, Nívio Ziviani, Edleno Silva de Moura, Altigran S. da Silva

Topic(s)

Web Data Mining and Analysis

Abstract

A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. Recently, methods that exploit named entities have been shown to be more effective for query expansion than traditional pseudorelevance feedback methods. In this article, we introduce a supervised learning approach that exploits named entities for query expansion using Wikipedia as a repository of high‐quality feedback documents. In contrast with existing entity‐oriented pseudorelevance feedback approaches, we tackle query expansion as a learning‐to‐rank problem. As a result, not only do we select effective expansion terms but we also weigh these terms according to their predicted effectiveness. To this end, we exploit the rich structure of Wikipedia articles to devise discriminative term features, including each candidate term's proximity to the original query terms, as well as its frequency across multiple article fields and in category and infobox descriptors. Experiments on three Text REtrieval Conference web test collections attest to the effectiveness of our approach, with gains of up to 23.32% in terms of mean average precision, 19.49% in terms of precision at 10, and 7.86% in terms of normalized discounted cumulative gain compared with a state‐of‐the‐art approach for entity‐oriented query expansion.

Reference(s)