Book chapter, National production, Peer-reviewed

STEMBR: A Stemming Algorithm for the Brazilian Portuguese Language

2005; Springer Science+Business Media; Language: English

10.1007/11595014_67

ISSN

1611-3349

Authors

Reinaldo Viana Alvares, Ana Cristina Bicharra García, Inhaúma Neves Ferraz

Topic(s)

Topic Modeling

Abstract

Stemming algorithms have traditionally been utilized in information retrieval systems as they generate a more concise word representation. However, the efficiency of these algorithms varies according to the language they are used with. This paper presents STEMBR, a stemmer for Brazilian Portuguese whereby the suffix treatment is based on a statistical study of the frequency of the last letter for words found in Brazilian web pages. The proposed stemmer is compared with another algorithm specifically developed for Portuguese. The results show the efficiency of our stemmer.
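The statistic the abstract mentions, the frequency of the last letter of each word, can be illustrated with a minimal sketch. This is not the STEMBR algorithm itself, whose suffix rules are not given in this record; the function name `last_letter_frequencies` and the sample word list are assumptions made only for illustration.

```python
from collections import Counter


def last_letter_frequencies(words):
    """Return the relative frequency of each final letter in `words`.

    Hypothetical helper: it only illustrates the kind of last-letter
    statistic the abstract refers to, not the authors' actual procedure.
    """
    # Count the final character of every non-empty word, case-insensitively.
    counts = Counter(w[-1].lower() for w in words if w)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {letter: n / total for letter, n in counts.items()}


# Tiny made-up sample; a real study would use a large corpus of web pages.
sample = ["casa", "casas", "menino", "meninos", "falar", "falou"]
freqs = last_letter_frequencies(sample)
```

A stemmer built on such a statistic could, for example, prioritise suffix rules for the most frequent final letters, but that design choice is an assumption here rather than something the abstract states.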

Reference(s)