Peer-reviewed article

Discovering Agent Behaviors Through Code Reuse: Examples From Half-Field Offense and Ms. Pac-Man

2017; Institute of Electrical and Electronics Engineers; Volume: 10; Issue: 2; Language: English

10.1109/tciaig.2017.2766980

ISSN

2475-1510

Authors

Stephen Kelly, Malcolm I. Heywood

Topic(s)

Metaheuristic Optimization Algorithms Research

Abstract

This paper demonstrates how code reuse allows genetic programming (GP) to discover strategies for difficult gaming scenarios while maintaining relatively low model complexity. Critical factors in the proposed approach are illustrated through an in-depth study in two challenging task domains: RoboCup soccer and Ms. Pac-Man. In RoboCup, we demonstrate how policies initially evolved for simple subtasks can be reused, with no additional training or transfer function, in order to improve learning in the complex half-field offense (HFO) task. We then show how the same approach to code reuse can be applied directly in Ms. Pac-Man. In the latter case, the use of task-agnostic diversity maintenance removes the need to explicitly identify suitable subtasks a priori. The resulting GP policies achieve state-of-the-art levels of play in HFO and surpass scores previously reported in the Ms. Pac-Man literature, while employing less domain knowledge during training. Moreover, the highly modular policies discovered by GP are shown to be significantly less complex than state-of-the-art solutions in both domains. Throughout this paper, we pay special attention to a pair of task-agnostic diversity maintenance techniques, and empirically demonstrate their importance to the development of strong policies.
