Formal Methods for Control of Markov Decision Processes

Wu, Bo

doi:10.7274/w9504x54r55

File(s) under permanent embargo

Formal Methods for Control of Markov Decision Processes

thesis

posted on 2018-04-20, 00:00 authored by Bo Wu

Sequential decision making under uncertainties in a complex and changing environment is a crucial research area with many practical applications, such as autonomous driving, human-robot interaction, multi-robot tasking and so on. Most of the existing results focus on the strategy that assigns a single action at each decision time that (approximately) optimizes some performance metric. However, in many applications, especially the ones that are safety critical, achieving the given task in a provably correct manner is preferred or even required, where optimizing a utility function may not be adequate.

My Ph.D. dissertation is motivated by such a need to build a correct-by-design framework that considers permissive controllers, unknown models and privacy preservation in the model of Markov Decision Process (MDP). The work in this dissertation is multidisciplinary by nature and integrates automata theory, supervisory control, machine learning, and robotics. The goal is to establish a fundamental understanding of design principles in systems modeled as MDPs and to develop adaptive, scalable and provably correct solutions for diverse applications. In particular, my research utilizes control theory, formal methods, optimization, machine learning, and stochasticity.

Starting with counterexample-guided permissive supervisor synthesis given a formal specification, an iterative learning framework is established for both single and multiple agents. To reduce the verification complexity, abstraction based assume-guarantee framework is applied. Then we move on to consider the controller synthesis problem when the MDP model is not known with human-robot collaboration as the motivating example. Through mathematical derivation, the procedure to learn the model while guarantee that the optimal controller based on the learned model is approximately optimal with a high confidence level is summarized in an algorithm. Then we go on to consider the privacy problem in MDP control, which first makes use of the existing notion of opacity and later we propose our own notion defined in the belief state. Numerical and abstraction based methods are proposed. The synthesized controllers are guaranteed to satisfy or optimally satisfy the given specification while preserving the secret from being detected.

History

Date Created

2018-04-20

Date Modified

2018-11-01

Defense Date

2018-04-19

Research Director(s)

Hai Lin

Committee Members

Taeho Jung Panos Antsaklis Michael Lemmon

Degree

Doctor of Philosophy

Degree Level

Doctoral Dissertation

Program Name

Electrical Engineering

Usage metrics

Keywords

Not Assigned

Licence

Exports

RefWorks

BibTeX

Ref. manager

Endnote

DataCite

NLM

DC

File(s) under permanent embargo

Formal Methods for Control of Markov Decision Processes

History

Date Created

Date Modified

Defense Date

Research Director(s)

Committee Members

Degree

Degree Level

Program Name

Usage metrics

Categories

Keywords

Licence

Exports