Limit synchronization in Markov decision processes

Doyen, Laurent; Shirmohammadi, Mahsa; Massart, Thierry

doi:10.1007/978-3-642-54830-7_4

Reference: Lecture Notes in Computer Science, 8412 LNCS, pages 58-72
Publication: Published, 2014

Peer-reviewed article

Abstract:

Markov decision processes (MDPs) are finite-state systems with both strategic and probabilistic choices. After fixing a strategy, an MDP produces a sequence of probability distributions over states. The sequence is eventually synchronizing if the probability mass accumulates in a single state, possibly in the limit. Precisely, for 0 ≤ p ≤ 1 the sequence is p-synchronizing if a probability distribution in the sequence assigns probability at least p to some state, and we distinguish three synchronization modes: (i) sure winning if there exists a strategy that produces a 1-synchronizing sequence; (ii) almost-sure winning if there exists a strategy that produces a sequence that is, for all ε > 0, a (1-ε)-synchronizing sequence; (iii) limit-sure winning if for all ε > 0, there exists a strategy that produces a (1-ε)-synchronizing sequence. We consider the problem of deciding whether an MDP is sure, almost-sure, or limit-sure winning, and we establish the decidability and optimal complexity for all modes, as well as the memory requirements for winning strategies. Our main contributions are as follows: (a) for each winning mode we present characterizations that give a PSPACE complexity for the decision problems, and we establish matching PSPACE lower bounds; (b) we show that for sure winning strategies, exponential memory is sufficient and may be necessary, and that in general infinite memory is necessary for almost-sure winning, and unbounded memory is necessary for limit-sure winning; (c) along with our results, we establish new complexity results for alternating finite automata over a one-letter alphabet. © 2014 Springer-Verlag.
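
The definition of a p-synchronizing sequence can be illustrated with a small sketch. The code below is not taken from the paper: the two-state MDP, the names `TOY_MDP`, `TOY_STRATEGY`, `TOY_INIT`, `step`, and `first_p_synchronizing_step`, and the restriction to memoryless strategies and a finite horizon are all assumptions made for illustration. It only checks the definition along a finite prefix of the sequence; it is not one of the decision procedures the abstract refers to, and the abstract notes that winning strategies may in general need memory.

```python
from fractions import Fraction

# Hypothetical toy MDP (not from the paper): state -> action -> successor -> probability.
# From q0, action "a" stays in q0 or moves to q1 with probability 1/2 each; q1 is absorbing.
TOY_MDP = {
    "q0": {"a": {"q0": Fraction(1, 2), "q1": Fraction(1, 2)}},
    "q1": {"a": {"q1": Fraction(1)}},
}

# Assumption: a memoryless strategy that maps each state to one action.
TOY_STRATEGY = {"q0": "a", "q1": "a"}

# Uniform initial distribution, so the sequence is not trivially synchronizing at step 0.
TOY_INIT = {"q0": Fraction(1, 2), "q1": Fraction(1, 2)}


def step(dist, mdp, strategy):
    """Push a distribution over states one step forward under the strategy."""
    nxt = {}
    for state, mass in dist.items():
        for succ, prob in mdp[state][strategy[state]].items():
            nxt[succ] = nxt.get(succ, 0) + mass * prob
    return nxt


def first_p_synchronizing_step(mdp, strategy, init, p, horizon):
    """Return the first n <= horizon whose distribution gives some state
    probability at least p, or None if no such n exists within the horizon."""
    dist = dict(init)
    for n in range(horizon + 1):
        if max(dist.values()) >= p:
            return n
        dist = step(dist, mdp, strategy)
    return None


if __name__ == "__main__":
    # The mass in q1 after n steps is 1 - 2^-(n+1).
    print(first_p_synchronizing_step(TOY_MDP, TOY_STRATEGY, TOY_INIT, Fraction(7, 8), 50))
    print(first_p_synchronizing_step(TOY_MDP, TOY_STRATEGY, TOY_INIT, Fraction(1), 50))
```

In this toy example the only available strategy yields a sequence in which the mass in `q1` is 1 - 2^-(n+1) after n steps. The sequence is therefore (1-ε)-synchronizing for every ε > 0 but never 1-synchronizing, which matches the gap between modes (i) and (ii) in the abstract. Exact `Fraction` arithmetic is used so that the comparison with p is not affected by floating-point rounding.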
