Peer-reviewed article

A system for the retrieval of Italian broadcast news

2000; Elsevier BV; Volume: 32; Issue: 1-2; Language: English

10.1016/s0167-6393(00)00022-4

ISSN

1872-7182

Authors

Marcello Federico

Topic(s)

Topic Modeling

Abstract

This paper presents a prototype system for the retrieval of Italian broadcast news, developed at ITC-irst. The architecture uses a speech recognition engine to transcribe audio news automatically. Documents are indexed using part-of-speech tagging combined with morphological analysis, and queries are expanded using the Italian WordNet thesaurus. Query-document matching is based on a statistical term weighting scheme. The system was tested on a collection of 203 audio news stories, augmented with 9500 newspaper articles. The evaluation used a “known item” retrieval task and measured the impact of speech recognition errors and query expansion on retrieval performance.
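
The evaluation setup summarized above can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract names neither the term weighting scheme nor the evaluation metric, so plain TF-IDF and mean reciprocal rank are assumptions here, and the `synonyms` dictionary is a hypothetical stand-in for an Italian WordNet lookup. The toy documents and the query are invented for illustration only.

```python
import math
from collections import Counter


def expand_query(terms, synonyms):
    """Append synonyms of each query term (hypothetical stand-in for a thesaurus lookup)."""
    expanded = list(terms)
    for term in terms:
        for syn in synonyms.get(term, []):
            if syn not in expanded:
                expanded.append(syn)
    return expanded


def rank(query_terms, docs):
    """Rank document ids by a plain TF-IDF score (assumed weighting, not the paper's)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))
    scores = {}
    for doc_id, words in docs.items():
        term_freq = Counter(words)
        scores[doc_id] = sum(
            term_freq[t] * math.log(n_docs / doc_freq[t])
            for t in query_terms
            if doc_freq[t] > 0
        )
    # Higher score first; ties are broken by document id for determinism.
    return sorted(docs, key=lambda d: (-scores[d], d))


def mean_reciprocal_rank(queries, docs, synonyms=None):
    """Known-item evaluation: each query has exactly one target document."""
    total = 0.0
    for terms, target in queries:
        if synonyms:
            terms = expand_query(terms, synonyms)
        ranking = rank(terms, docs)
        total += 1.0 / (ranking.index(target) + 1)
    return total / len(queries)


# Toy data (invented): tokenized stories and one known-item query.
docs = {
    "d1": ["governo", "approva", "legge"],
    "d2": ["squadra", "vince", "partita"],
    "d3": ["esecutivo", "vara", "decreto"],
}
synonyms = {"governo": ["esecutivo"]}
queries = [(["governo"], "d3")]

mrr_plain = mean_reciprocal_rank(queries, docs)
mrr_expanded = mean_reciprocal_rank(queries, docs, synonyms)
```

In this toy case the target story does not contain the query word, so it ranks higher only once the query is expanded with a synonym. The same comparison, run once on reference transcripts and once on recognizer output, would be one way to measure the effect of speech recognition errors.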
