Safe Learning for Near-Optimal Scheduling

Busatto-Gaston, Damien; Chakraborty, Debraj; Guha, Shibashis; Perez, Guillermo A.; Raskin, Jean-François

doi:10.1007/978-3-030-85172-9_13

Reference: International Conference on Quantitative Evaluation of Systems
Publication: Published, n.d.

Publication in conference proceedings

Abstract:

In this paper, we investigate the combination of synthesis, model-based learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem. Our algorithms can handle Markov decision processes (MDPs) that have 10^20 states and beyond, which cannot be handled with state-of-the-art probabilistic model checkers. We provide probably approximately correct (PAC) guarantees for learning the model. Additionally, we extend Monte-Carlo tree search with advice, computed using safety games or obtained using the earliest-deadline-first scheduler, to safely explore the learned model online. Finally, we implemented our algorithms and compared them empirically against shielded deep Q-learning on large task systems.
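
The abstract names the earliest-deadline-first (EDF) scheduler as one source of advice. As a rough illustration of that policy only, the following minimal sketch runs EDF over unit time steps. It assumes a simplified deterministic setting in which all jobs are released at time 0 with known execution times; the names `Job`, `edf_pick`, and `simulate_edf` are hypothetical and are not taken from the paper, whose task model is an MDP and is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    remaining: int  # units of execution time still needed
    deadline: int   # absolute deadline, in time steps


def edf_pick(jobs: List[Job]) -> Optional[Job]:
    """Return the unfinished job with the earliest deadline, or None if idle."""
    pending = [j for j in jobs if j.remaining > 0]
    return min(pending, key=lambda j: j.deadline) if pending else None


def simulate_edf(jobs: List[Job], horizon: int) -> List[Optional[str]]:
    """Run EDF for `horizon` unit time steps and return the job run at each step.

    The choice is re-made at every step, so a job can be preempted as soon as
    another job has an earlier deadline (illustrative assumption: no new
    arrivals, so this never actually happens in this simplified setting).
    """
    schedule: List[Optional[str]] = []
    for _ in range(horizon):
        job = edf_pick(jobs)
        if job is None:
            schedule.append(None)  # processor idles
        else:
            job.remaining -= 1     # execute the chosen job for one step
            schedule.append(job.name)
    return schedule
```

For example, calling `simulate_edf([Job("a", 2, 5), Job("b", 1, 2)], 4)` runs job `b` first because its deadline is earlier, then job `a` for two steps, and idles in the last step.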
