Tree view self-organisation of web content
2004; Elsevier BV; Volume: 63; Language: English
DOI: 10.1016/j.neucom.2004.07.005
ISSN: 1872-8286
Authors: Richard T. Freeman, Hujun Yin
Topic(s): Advanced Clustering Algorithms Research
Abstract: When browsing a large set of unstructured documents, it is advantageous if the documents have been organised and presented in a way that makes navigation efficient, underlying concepts easy to understand and related information quick to locate. This paper proposes a new method, termed Treeview self-organising maps (Treeview SOMs), for clustering and organising text documents by means of a series of independently and automatically created, hierarchical one-dimensional SOMs. For presentation and visualisation, the method generates a topological taxonomy tree for a set of unstructured text documents. The documents are organised in a hierarchy of dynamically generated and automatically validated topics extracted from the document corpus. The results, presented in a labelled tree view, clearly show the underlying contents of the documents and can help users browse the document set more efficiently than the results of previous work using SOMs or hierarchical clustering methods. A brief overview of general document clustering and a review of SOM-based document analysis methods are also provided, together with a comparison among them.
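To make the idea of a hierarchy of one-dimensional SOMs concrete, the following is a minimal illustrative sketch, not the authors' Treeview SOM algorithm. It assumes documents are already represented as numeric term vectors, uses a fixed number of units per map, and stops on a simple size or depth limit. The dynamic map sizing, topic validation and labelling described in the abstract are not implemented, and all function and parameter names (`train_som_1d`, `build_tree`, `n_units`, `min_size`) are hypothetical.

```python
import math
import random


def best_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def train_som_1d(vectors, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a one-dimensional SOM (a chain of units) on the given vectors."""
    rng = random.Random(seed)
    # Assumption: units are initialised from randomly chosen input vectors.
    weights = [list(rng.choice(vectors)) for _ in range(n_units)]
    sigma0 = max(n_units / 2.0, 0.5)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.01     # shrinking neighbourhood width
        for x in rng.sample(vectors, len(vectors)):
            b = best_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood along the 1-D chain of units
                h = math.exp(-((i - b) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def build_tree(vectors, ids, n_units=2, max_depth=3, min_size=2, seed=0):
    """Recursively split document ids with 1-D SOMs; returns nested lists of ids."""
    if max_depth == 0 or len(ids) <= min_size:
        return sorted(ids)
    weights = train_som_1d([vectors[i] for i in ids], n_units, seed=seed)
    clusters = [[] for _ in range(n_units)]
    for i in ids:
        clusters[best_unit(weights, vectors[i])].append(i)
    non_empty = [c for c in clusters if c]
    if len(non_empty) < 2:       # the map did not split this node: keep it as a leaf
        return sorted(ids)
    return [build_tree(vectors, c, n_units, max_depth - 1, min_size, seed)
            for c in non_empty]


def leaves(tree):
    """Flatten the nested tree into its leaf clusters (lists of document ids)."""
    if all(isinstance(x, int) for x in tree):
        return [tree]
    return [leaf for sub in tree for leaf in leaves(sub)]


# Toy term-count vectors (made-up data for illustration only).
toy_docs = [[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 0], [1, 0, 3], [0, 1, 4]]
print(leaves(build_tree(toy_docs, list(range(len(toy_docs))))))
```

Each node of the tree trains its own map on only the documents routed to it, which mirrors the "independently created" maps mentioned in the abstract. A real system would also need a term-weighting step and a rule for choosing the map size at each level.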