Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data
2008; Springer Science+Business Media; Volume: 25; Issue: 2; Language: English
10.1016/s1016-8478(23)17582-0
ISSN 0219-1032
Topic(s): Machine Learning in Bioinformatics
Abstract: Analysis of microarray data is essential for making sense of large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is that many coexpressed genes are co-regulated, so identifying coexpressed genes could aid in the functional annotation of novel genes, the de novo identification of transcription factor binding sites, and the elucidation of complex biological pathways. Coexpressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same dataset may vary considerably depending on the algorithm and the dissimilarity metric used, as well as on user-selectable parameters such as the desired number of clusters and the initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means, and self-organizing maps to more complex clustering algorithms such as fuzzy clustering.
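The abstract's point that results depend on the dissimilarity metric can be illustrated with a minimal sketch. It is not taken from the review. The three expression profiles, the gene names, and the helper functions `euclidean`, `pearson_distance`, and `nearest` are hypothetical and chosen only for illustration. The sketch shows that the profile closest to a given gene can change when Euclidean distance is replaced by a correlation-based distance, which is one reason clusterings of the same dataset can disagree.

```python
import math

# Hypothetical toy expression profiles (four conditions each), for illustration only.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [10.0, 20.0, 30.0, 40.0],  # same trend as geneA, much larger magnitude
    "geneC": [4.0, 3.0, 2.0, 1.0],      # similar magnitude to geneA, opposite trend
}

def euclidean(x, y):
    """Euclidean distance: sensitive to the absolute expression level."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """1 - Pearson correlation: sensitive to the shape of the profile only.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def nearest(gene, metric):
    """Return the name of the other gene closest to `gene` under `metric`."""
    others = [g for g in profiles if g != gene]
    return min(others, key=lambda g: metric(profiles[gene], profiles[g]))

print(nearest("geneA", euclidean))         # geneC
print(nearest("geneA", pearson_distance))  # geneB
```

Under Euclidean distance geneA pairs with geneC because their magnitudes are close. Under the correlation-based distance it pairs with geneB because their trends match. Any clustering algorithm built on these distances would inherit that difference.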