Peer-reviewed article

Construction of web search engine supporting intelligent Chinese word segmentation

2006; China Aerospace Science and Industry Group; Language: English

ISSN

1000-7024

Authors

Chao Liu,

Topic(s)

Advanced Computational Techniques and Applications

Abstract

Chinese word segmentation has a vital effect on the precision and recall of a web search engine for Chinese. Based on an analysis of the open-source web search engine Nutch, a scalable lexical analyzer is implemented with JavaCC. By integrating this analyzer with Nutch, a web search engine called NutchEnhanced, which supports intelligent Chinese word segmentation, is constructed and used as a platform for testing the effect of various Chinese word segmentation algorithms in a search engine. The experimental results show that, for Chinese queries, NutchEnhanced outperforms Nutch on precision. With a recall of 0.74 and a precision of 0.86 over the top 30 results, its Chinese search quality is in general as good as that of Google and Baidu.
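
The abstract reports recall and precision over the top 30 results without spelling out how they are computed. As a minimal sketch, assuming the standard information-retrieval definitions (the abstract does not confirm the exact evaluation procedure), the two measures could be calculated as follows. The function names and document identifiers are hypothetical and are not taken from the paper.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked results that are relevant.

    Assumption: the standard precision-at-k definition, dividing by k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


def recall(retrieved_ids, relevant_ids):
    """Fraction of all relevant documents that were retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids) & relevant) / len(relevant)


# Hypothetical example data, not results from the paper.
ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6", "d7"}

p_at_5 = precision_at_k(ranked, relevant, 5)  # 2 relevant hits among 5 results
r = recall(ranked, relevant)                  # 2 of 4 relevant documents found
```

Under these definitions, a precision of 0.86 over the top 30 results would mean that roughly 26 of the first 30 results were judged relevant.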
