A novel multi-objective grammar-based framework for the generation of Convolutional Neural Networks
2022; Elsevier BV; Volume: 212; Language: English
DOI: 10.1016/j.eswa.2022.118670
ISSN: 1873-6793
Authors: Cleber A.C.F. da Silva, Daniel Carneiro Rosa, Péricles Miranda, Filipe R. Cordeiro, Tapas Si, André Nascimento, Rafael Ferreira Mello, Paulo S. G. de Mattos Neto
Topic(s): Advanced Multi-Objective Optimization Algorithms
Abstract: In recent years, deep Convolutional Neural Networks (CNNs) have stood out in solving computer vision tasks such as image classification. Researchers have proposed several architectures with varying sizes, complexities, and an increasing number of trainable parameters. As a result, finding a configuration and architecture that combines reduced complexity with high performance has become very difficult, since suitable configurations depend on the target classification problem and are mostly tuned by hand by a specialist in the area. To assist in the search for such configurations, this work proposes a multi-objective grammatical evolution framework composed of a multi-objective search engine, a new context-free grammar that defines the search space of the problem, and a mapping process for individuals. The framework automatically generates and optimizes CNNs for a given image classification problem, without intervention from a human expert. It navigates the search space using two objective functions that seek to maximize two metrics: accuracy and F1-score. The proposal was validated on the CIFAR-10, CIFAR-100, MNIST, KMNIST and EuroSAT datasets, and the results show that the proposed method is able to generate networks that are simpler than state-of-the-art CNNs yet statistically outperform them on both metrics considered in the study.
Highlights:
• An evolutionary framework that optimizes CNNs without the need for an expert is introduced.
• A context-free grammar that models the architecture of CNNs is proposed.
• The framework is tested on 5 datasets, guided by two metrics (accuracy and F1-score).
• The results obtained are analyzed and compared with other known architectures.
• The framework is capable of generating models that surpass other known architectures.
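The abstract names three components: a context-free grammar, a mapping from individuals to networks, and a search guided by two maximized objectives. The sketch below illustrates how such pieces commonly fit together in grammatical evolution. It is only an illustration under stated assumptions: the toy grammar, the names `GRAMMAR`, `map_genotype`, `dominates` and `non_dominated`, and the score values are all hypothetical and are not taken from the paper, which does not describe its grammar or search engine in this record. The mapping uses the standard "codon modulo number of productions" rule with genome wrapping.

```python
# Toy context-free grammar (hypothetical, NOT the grammar proposed in the paper).
# Keys are non-terminals; each value lists the alternative productions.
GRAMMAR = {
    "<cnn>": [["<conv_block>", "<classifier>"],
              ["<conv_block>", "<cnn>"]],
    "<conv_block>": [["conv", "<filters>", "<kernel>", "<pool>"]],
    "<filters>": [["16"], ["32"], ["64"]],
    "<kernel>": [["3"], ["5"]],
    "<pool>": [["maxpool"], ["avgpool"], ["nopool"]],
    "<classifier>": [["dense", "<units>", "softmax"]],
    "<units>": [["64"], ["128"], ["256"]],
}


def map_genotype(genome, start="<cnn>", max_wraps=2):
    """Map a list of integer codons to a list of terminal symbols.

    Leftmost derivation: each non-terminal with several productions consumes
    one codon and picks production `codon % number_of_productions`.
    Returns None if the codon budget (genome length times max_wraps + 1)
    runs out before the derivation finishes.
    """
    pending = [start]          # symbols still to expand, leftmost first
    phenotype = []             # terminal symbols produced so far
    used = 0                   # number of codons consumed
    limit = len(genome) * (max_wraps + 1)
    while pending:
        symbol = pending.pop(0)
        if symbol not in GRAMMAR:
            phenotype.append(symbol)   # terminal: emit as-is
            continue
        options = GRAMMAR[symbol]
        if len(options) == 1:
            choice = options[0]        # single production: no codon needed
        else:
            if used >= limit:
                return None            # invalid individual
            codon = genome[used % len(genome)]
            choice = options[codon % len(options)]
            used += 1
        pending = list(choice) + pending
    return phenotype


def dominates(a, b):
    """True if score tuple a is no worse than b everywhere and better somewhere
    (both objectives are maximized, e.g. accuracy and F1-score)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(scores):
    """Keep only the score tuples that no other tuple dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Illustrative usage with made-up (accuracy, F1-score) pairs.
network = map_genotype([0, 1, 0, 2, 1])
front = non_dominated([(0.90, 0.88), (0.92, 0.85), (0.89, 0.87)])
```

In a full system each decoded symbol list would be turned into a trainable network and scored on a validation set; the dominance filter is a generic building block of multi-objective search, not a statement about which algorithm the authors used.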