Creating and Weighting Hunspell Dictionaries as Finite-State Automata
2010; Adam Mickiewicz University in Poznań; Volume: 21; Language: English
DOI: 10.14746/il.2010.21.1
ISSN: 1426-188X
Authors: Tommi A. Pirinen, Krister Lindén
Topic(s): Software Engineering Research
Abstract: There are numerous formats for writing spell-checkers for open-source systems, and there are many lexical descriptions for natural languages written in these formats. In this paper, we demonstrate a method for converting Hunspell and related spell-checking lexicons into finite-state automata. We also present a simple way to apply unigram corpus training in order to improve the spell-checking suggestion mechanism using weighted finite-state technology. What we propose is a generic and efficient language-independent framework of weighted finite-state automata for spell checking in typical open-source software, e.g. Mozilla Firefox, OpenOffice and the Gnome desktop.
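To make the idea of unigram corpus training concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each word in the lexicon receives a weight equal to its negative log probability in a training corpus, so that lower weights mean more likely suggestions, as is usual for weighted automata over the tropical semiring. The function names, the toy lexicon and corpus, and the add-one smoothing are all illustrative assumptions; a plain dictionary stands in for the weighted automaton.

```python
import math
from collections import Counter


def unigram_weights(tokens, lexicon):
    """Map each lexicon word to a penalty weight -log P(word).

    Add-one smoothing (an assumption of this sketch) keeps words that are
    in the lexicon but absent from the corpus at a finite weight.
    """
    counts = Counter(t for t in tokens if t in lexicon)
    total = sum(counts.values()) + len(lexicon)
    return {w: -math.log((counts[w] + 1) / total) for w in lexicon}


def rank_suggestions(candidates, weights):
    """Order candidate corrections from lowest weight (most probable) up."""
    known = [c for c in candidates if c in weights]
    return sorted(known, key=lambda c: weights[c])


# Hypothetical toy data, for illustration only.
lexicon = {"cat", "cart", "care", "car"}
corpus = "the cat sat in the car near the cat and the cart".split()

weights = unigram_weights(corpus, lexicon)
suggestions = rank_suggestions(["cart", "care", "cat", "car"], weights)
```

In a finite-state setting these weights would be attached to the paths of the lexicon automaton rather than stored in a dictionary, so that the cheapest path through the composed error model and lexicon yields the top suggestion.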