Open Access Article

Semicentralized Deep Deterministic Policy Gradient in Cooperative StarCraft Games

2020; Institute of Electrical and Electronics Engineers; Volume: 33; Issue: 4; Language: English

DOI

10.1109/tnnls.2020.3042943

ISSN

2162-2388

Authors

Dong Xie, X. Zhong

Topic(s)

Digital Games and Media

Abstract

In this article, we propose a novel semicentralized deep deterministic policy gradient (SCDDPG) algorithm for cooperative multiagent games. Specifically, we design a two-level actor-critic structure to help the agents interact and cooperate in StarCraft combat. A local actor-critic structure is established for each type of agent using the partially observable information it receives from the environment. A global actor-critic structure is then built to give the local structures an overall view of the combat based on limited centralized information, such as the health values. These two structures work together to generate the optimal control action for each agent and to achieve better cooperation in the games. Compared with fully centralized methods, this design reduces the communication burden by sending only limited information to the global level during the learning process. Furthermore, reward functions are designed for both the local and global structures based on the agents' attributes to further improve learning performance in the stochastic environment. The developed method is demonstrated on several scenarios in a real-time strategy game, StarCraft. The simulation results show that the agents can effectively cooperate with their teammates and defeat the enemies in various StarCraft scenarios.
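The two-level structure described in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical, simplified rendering of the idea (local actors acting on partial observations, a global level that sees only limited shared information such as health values and modulates the local actions); the class names, linear policies, and modulation scheme are illustrative assumptions, not the authors' actual networks.

```python
import numpy as np

# Illustrative sketch only: all names and the linear-policy form are
# assumptions, not the SCDDPG implementation from the paper.
rng = np.random.default_rng(0)

class LocalActorCritic:
    """Per-agent-type actor-critic acting on partial observations."""
    def __init__(self, obs_dim, act_dim):
        self.W_actor = rng.normal(scale=0.1, size=(act_dim, obs_dim))
        self.W_critic = rng.normal(scale=0.1, size=(obs_dim + act_dim,))

    def act(self, obs):
        # Deterministic policy: linear map squashed to (-1, 1).
        return np.tanh(self.W_actor @ obs)

    def q_value(self, obs, action):
        # Local critic scores the (observation, action) pair.
        return float(self.W_critic @ np.concatenate([obs, action]))

class GlobalActorCritic:
    """Global level that sees only limited centralized info (e.g. health)."""
    def __init__(self, info_dim, n_agents):
        self.W = rng.normal(scale=0.1, size=(n_agents, info_dim))

    def coordination_signal(self, shared_info):
        # One scalar per agent derived from the small shared summary,
        # so only `shared_info` (not full observations) is communicated.
        return np.tanh(self.W @ shared_info)

def semicentralized_step(local_acs, global_ac, observations, shared_info):
    weights = global_ac.coordination_signal(shared_info)
    # Global signal modulates each locally computed action.
    return [w * lac.act(obs)
            for lac, obs, w in zip(local_acs, observations, weights)]

# Toy usage: two agents, 4-dim partial observations, 2-dim actions;
# the shared centralized info is just each agent's health value.
agents = [LocalActorCritic(4, 2) for _ in range(2)]
glob = GlobalActorCritic(info_dim=2, n_agents=2)
obs = [rng.normal(size=4) for _ in range(2)]
health = np.array([1.0, 0.5])
actions = semicentralized_step(agents, glob, obs, health)
```

The design point this sketch mirrors is the communication saving: the global level consumes only the small `shared_info` vector rather than every agent's full observation, which is what distinguishes the semicentralized scheme from a fully centralized critic.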
