by Le Roux, Stephane; Perez, Guillermo A.
Reference: Lecture Notes in Computer Science, 10803 LNCS, pages 367-383
Publication: Published, 2018
Peer-reviewed article
Abstract: We study the never-worse relation (NWR) for Markov decision processes with an infinite-horizon reachability objective. A state q is never worse than a state p if the maximal probability of reaching the target set of states from p is at most the maximal probability of reaching it from q, regardless of the probabilities labelling the transitions. Extremal-probability states, end components, and essential states are all special cases of the equivalence relation induced by the NWR. Using the NWR, states in the same equivalence class can be collapsed, and actions leading to sub-optimal states can then be removed. We show that the natural decision problem associated with computing the NWR is coNP-complete. Finally, we extend a previously known incomplete polynomial-time iterative algorithm to under-approximate the NWR.
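
The definition in the abstract can be written symbolically as follows. This is only a sketch: the notation is assumed for illustration and is not taken from the paper. Here T is the target set, \delta ranges over the probability labellings of the transitions of a fixed underlying graph, \sigma ranges over strategies, and \mathrm{Val}_\delta, \trianglelefteq and \sim are hypothetical symbol choices.

```latex
% Assumed notation (not from the source): T, \delta, \sigma, \mathrm{Val}_\delta, \trianglelefteq, \sim
\[
  \mathrm{Val}_\delta(s) \;=\; \max_{\sigma} \, \Pr\nolimits_{\delta}^{\sigma}\bigl[\, s \text{ reaches } T \,\bigr]
\]
% q is never worse than p: the inequality must hold for every labelling \delta
\[
  p \trianglelefteq q
  \;\iff\;
  \forall \delta :\ \mathrm{Val}_\delta(p) \,\le\, \mathrm{Val}_\delta(q)
\]
% Induced equivalence: states in the same class can be collapsed
\[
  p \sim q
  \;\iff\;
  p \trianglelefteq q \ \text{ and } \ q \trianglelefteq p
\]
```

Under this reading, two states are equivalent exactly when their maximal reachability probabilities coincide for every labelling, which is why states in the same class can be collapsed.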