Monte Carlo Tree Search Guided by Symbolic Advice for MDPs

Busatto-Gaston, Damien; Chakraborty, Debraj; Raskin, Jean-François

doi:10.4230/LIPIcs.CONCUR.2020.40

Reference: 31st International Conference on Concurrency Theory (CONCUR 2020), Vol. 171, pages 40:1-40:24
Publication: Published, 2020-08-31

Publication in conference proceedings

Abstract:

In this paper, we consider the online computation of a strategy that aims at optimizing the expected average reward in a Markov decision process. The strategy is computed with a receding horizon and using Monte Carlo tree search (MCTS). We augment the MCTS algorithm with the notion of symbolic advice, and show that its classical theoretical guarantees are maintained. Symbolic advice is used to bias the selection and simulation strategies of MCTS. We describe how to use QBF and SAT solvers to implement symbolic advice in an efficient way. We illustrate our new algorithm using the popular game Pac-Man and show that the performance of our algorithm exceeds that of plain MCTS as well as the performance of human players.
