Partially Observed Markov Decision Processes: From Filtering to Control
Credits
6 hp / ECTS
Objectives
The course is aimed at PhD-level students in signal processing, control and telecommunications.
The goal is to learn useful tools in the estimation and optimization of stochastic dynamical systems. In particular, we will study discrete-time Markov decision processes.
The course will primarily deal with synthesis of algorithms and structural results. Examples will be drawn primarily from signal processing, telecommunications and sensor networks. For more details, see the lecture plan.
Examiner and course responsible
Vikram Krishnamurthy, vikramk@ece.ubc.ca.
Prerequisites
Engineering-level probability and random processes are adequate background. No measure theory is required.
Literature
Much of the material on POMDPs has appeared recently in papers in IEEE Transactions on Signal Processing, IEEE Transactions on Information Theory, IEEE Transactions on Automatic Control, and Mathematics of Operations Research.
Dates (at 15:15 - 17:00, in Q2, Osquldas väg 10, floor 2, KTH)
- Tuesday April 9
- Friday April 12
- Tuesday April 16
- Friday April 19
- Monday April 22 (new time)
- Tuesday April 23
- Friday April 26
- Monday April 29
- Friday May 3
- Monday May 6 (new time)
- Tuesday May 14
- Friday May 17
Outline
The course comprises 18 hours in total.
1. Stochastic State Space models and Stochastic Simulation [3 hours], intro slides, part1 slides, homework1
* Stochastic Dynamical Systems
* Markov Models, Perron Frobenius Theorem, Geometric ergodicity
* Linear Gaussian Models
* Jump Markov Linear Systems and Target Tracking
* Stochastic Simulation: Acceptance Rejection, Composition method. Simulation-based optimal predictors
2. Optimal Filtering [4 hours], filter slides, homework2, ML parameter estimation
* What is Optimal State Estimation? (Conditional mean minimizes Bregman Loss Functions)
* Optimal Filtering: Bayes’ Recursion
* Optimal Predictors and Smoothers
* Kalman Filter
* Hidden Markov Model (HMM) Filter
* Geometric Forgetting of Initial Condition in Optimal Filter
* Particle Filter
* Data Augmentation Algorithm
* Reference Probability Method for Filtering
3. Filtering with Non-standard Information Patterns [2 hours]
* Non-universal Filters for the State
* Filtering with social learning
* Filtering of Reciprocal Markov Random Fields
4. Fully Observed Markov Decision Processes [2 hours], fullmdp slides
* Problem Formulation
* Stochastic Dynamic Programming
* Supermodularity and Structural Results
* Constrained Markov Decision Processes
5. Partially Observed Markov Decision Processes [3 hours], partialmdp slides, homework3
* Problem Formulation
* Information State
* Stochastic Dynamic Programming and algorithms
6. Structural Results for POMDPs [4 hours]
* Stochastic Orders
* Stochastic Dominance of Filters
* Lattice Programming
* Example 1: Quickest Detection with Optimal Sampling
* Example 2: Optimized Social Learning
* Example 3: Global Games
* Multi-armed bandits
Evaluation (pass/fail)
Homework assignments and a final exam.