Article Open access

Creating and Weighting Hunspell Dictionaries as Finite-State Automata

2010; Adam Mickiewicz University in Poznań; Volume: 21; Language: English

10.14746/il.2010.21.1

ISSN

1426-188X

Authors

Tommi A. Pirinen, Krister Lindén

Topic(s)

Software Engineering Research

Abstract

There are numerous formats for writing spell-checkers for open-source systems and there are many lexical descriptions for natural languages written in these formats. In this paper, we demonstrate a method for converting Hunspell and related spell-checking lexicons into finite-state automata. We also present a simple way to apply unigram corpus training in order to improve the spell-checking suggestion mechanism using weighted finite-state technology. What we propose is a generic and efficient language-independent framework of weighted finite-state automata for spell checking in typical open-source software, e.g. Mozilla Firefox, OpenOffice and the Gnome desktop.
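
The unigram corpus training mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each word in the lexicon receives a weight equal to the negative log of its relative corpus frequency, so that lower weights mean more likely suggestions. The function names `unigram_weights` and `rank_suggestions`, and the add-one smoothing for unseen words, are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def unigram_weights(corpus_tokens, lexicon):
    """Map each lexicon word to -log(P(word)) estimated from a corpus.

    Add-one smoothing (an assumption of this sketch) gives words that
    never occur in the corpus a finite, comparatively high weight.
    """
    # Count only tokens that the lexicon accepts.
    counts = Counter(t for t in corpus_tokens if t in lexicon)
    # One extra pseudo-count per lexicon word keeps every probability > 0.
    total = sum(counts.values()) + len(lexicon)
    return {w: -math.log((counts[w] + 1) / total) for w in lexicon}


def rank_suggestions(candidates, weights):
    """Order correction candidates from lowest weight (most frequent) up."""
    return sorted(candidates, key=lambda w: weights[w])


# Toy data, purely illustrative.
lexicon = {"the", "then", "they"}
corpus = "the the the then".split()
weights = unigram_weights(corpus, lexicon)
ranked = rank_suggestions(["they", "then", "the"], weights)
```

In a weighted finite-state setting, weights of this kind would be attached to the paths of the lexicon automaton, so that the suggestion mechanism can prefer frequent words among candidates at the same edit distance.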
