Combining Clustering and Classification for Software Quality Evaluation
2014; Springer Science+Business Media; Language: English
DOI: 10.1007/978-3-319-07064-3_22
ISSN: 1611-3349
Authors: Diomidis Papas, Christos Tjortjis
Topic(s): Advanced Malware Detection Techniques
Abstract: Source code and metric mining have been used successfully to assist with software quality evaluation. This paper presents a data mining approach that clusters Java classes and then classifies the extracted clusters in order to assess internal software quality. We use Java classes as entities and static metrics as attributes for data mining. We identify outliers and apply K-means clustering to establish clusters of classes. Outliers indicate potentially fault-prone classes, whilst clusters are examined to establish their common characteristics. Subsequently, we apply C4.5 to build classification trees that identify the metrics which determine cluster membership. We evaluate the proposed approach on two well-known open-source software systems, jEdit and Apache Geronimo. The results consolidate key findings from previous work and indicate that combining clustering with classification produces better results than standalone clustering.
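The three steps named in the abstract (outlier identification, K-means clustering, and a C4.5-style search for the metric that determines cluster membership) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not say how outliers are detected, so a z-score rule is assumed here. The metric names `loc` and `methods`, the sample values, and all function names are hypothetical. The last step is also simplified: it picks a single split by gain ratio, the criterion C4.5 uses, and does not build or prune a full tree.

```python
import math
import random
from collections import Counter

def find_outliers(rows, threshold=2.0):
    """Indices of rows where any metric lies more than `threshold`
    standard deviations from its column mean (assumed z-score rule)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [i for i, row in enumerate(rows)
            if any(s > 0 and abs(v - m) / s > threshold
                   for v, m, s in zip(row, means, stds))]

def kmeans(rows, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(row, centroids[c]))
                  for row in rows]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels, names):
    """Return (metric, threshold, gain_ratio) for the best binary split."""
    base = entropy(labels)
    best = (None, None, 0.0)
    for j, name in enumerate(names):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [lab for r, lab in zip(rows, labels) if r[j] <= t]
            right = [lab for r, lab in zip(rows, labels) if r[j] > t]
            p = len(left) / len(labels)
            gain = base - p * entropy(left) - (1 - p) * entropy(right)
            split_info = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
            ratio = gain / split_info
            if ratio > best[2]:
                best = (name, t, ratio)
    return best

# Hypothetical per-class metrics: (lines of code, number of methods).
names = ["loc", "methods"]
classes = [
    (100, 5), (110, 6), (95, 5), (105, 7), (102, 4), (98, 6),
    (300, 6), (310, 5), (295, 7), (305, 4), (290, 5), (315, 6),
    (5000, 6),  # unusually large class, expected to be flagged
]

outliers = find_outliers(classes)
clean = [row for i, row in enumerate(classes) if i not in outliers]
labels = kmeans(clean, k=2)
metric, threshold, ratio = best_split(clean, labels, names)
print("outlier indices:", outliers)
print("cluster labels:", labels)
print("most discriminating metric:", metric, "at threshold", threshold)
```

In this sketch the metrics are left unscaled, so the metric with the largest range dominates the distance computation. A real analysis would likely normalise the metrics first and use a complete C4.5 implementation rather than a single split.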