By: Molinghen, Yannick; Avalos, Raphaël; Van Achter, Mark; Nowé, Ann; Lenaerts, Tom
Scientific editors: Oliehoek, Frans F.A.; Kok, Manon; Verwer, Sicco
Reference: Benelux Conference on Artificial Intelligence, BNAIC (35th: 8-10 November 2023: TU Delft), Artificial Intelligence and Machine Learning, Revised Selected Papers, Springer Science and Business Media Deutschland GmbH
Publication: Published, 2024-11-02
Published in conference proceedings
Abstract: We introduce the Laser Learning Environment (LLE), a collaborative multi-agent reinforcement learning environment where coordination is key. In LLE, agents depend on each other to make progress (interdependence), must jointly take specific sequences of actions to succeed (perfect coordination), and accomplishing those joint actions does not yield any intermediate reward (zero-incentive dynamics). The challenge of such problems lies in the difficulty of escaping state space bottlenecks caused by interdependence steps, since escaping those bottlenecks is not rewarded. We test multiple state-of-the-art value-based MARL algorithms on LLE and show that they consistently fail at the collaborative task because of their inability to escape state space bottlenecks, even though they successfully achieve perfect coordination. We show that Q-learning extensions such as prioritised experience replay and n-step returns hinder exploration in environments with zero-incentive dynamics, and find that intrinsic curiosity with random network distillation is not sufficient to escape those bottlenecks. We demonstrate the need for novel methods to solve this problem and the relevance of LLE as a cooperative MARL benchmark.