Reachability and expected cost properties for uncertain Markov decision processes

Abstract

We study uncertain Markov decision processes with interval uncertainty. The problem is to Find a scheduler that satisfies both a reachability property and an expected cost property. We propose a robust geometric program (robust GP) to find this scheduler. Solutions to this robust geometric program are proven to be correct but not complete: if the solution to the robust GP defines a valid scheduler, it is a solution to the problem, but getting a valid scheduler is not guaranteed. Because robust geometric programming is PSPACE-hard, we employ an approximation method that results in a robust linear program (robust LP). A solution to this robust LP is also a solution to the robust GP. Robust LPs can be solved efficiently depending on the uncertainty set that is used. What specific kind the uncertainty set in the robust GP (and as a consequence in the robust LP) is, remains open.

Full bachelor thesis

You can download the thesis (pdf) with the button below. The presentation slides I used to give a quick overview of what I did are available here (pdf). If you have any questions, feel free to send me an email. Mistakes are easily made, so there's also an erratum (pdf) in which I keep track of errors and typos.

Accompanying Python Files

There are two Python scripts accompanying the thesis. One to (re)produce the results from the counter example, and one for the calculation of a r-term piecewise-linear approximation of a 2-term lse-function. Both can be found on Github.

Counter example
The results from the counter example used to show that the robust GP is not complete can be calculated with the code in counter_example.py. Running this script requires Python 2.7 and the GPKit package.

Piecewise-linear approximation algorithm of the 2-term lse-function
A 2-term lse-function can be approximated by two piecewise-linear functions: one for the upper approximation, and one for the lower approximation. The Python script lseapprox.py can calculate both by using the upperapprox(r) or lowerapprox(r) functions respectively. This Python implementation is a direct translation of the Matlab function accompanying the paper Tractable Approximate Robust Geometric Programming. The Python script requires Python 3.6 and the Numpy package.

Future work

As noted in the thesis, there are two main concerns: the uncertainty set of the robust LP and the incompleteness. Future work will focus on finding ways to make the robust LP complete, and on finding a known uncertainty set for which robust LPs can be solved and that accurately describes the 'for all instantiations' uncertainty we want.