17th International Symposium on
Mathematical Theory of Networks and Systems
Kyoto International Conference Hall, Kyoto, Japan, July 24-28, 2006

MTNS 2006 Paper Abstract


Paper TuP13.6

Chang, Hyeong Soo (Sogang Univ.)

On Policy Iteration for Finite Horizon Markov Decision Processes with Target-Level Risk Sensitive Objectives

Scheduled for presentation during the Regular Session "Linear Systems I" (TuP13), Tuesday, July 25, 2006, 17:25−17:50, Room 101



Keywords: Dynamic programming, Iterative methods, Stochastic systems

Abstract

In this paper, we present a multi-policy improvement method for a risk-sensitive Markov decision process (MDP) model in which the objective is to minimize the probability that the total discounted reward over a finite horizon does not exceed a specified target. The method can be implemented via simulation in an on-line control setting, alleviating the curse of dimensionality in solving the MDP. We relate the multi-policy improvement method to a policy-iteration algorithm for solving the model and discuss issues concerning the algorithm.
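
As a concrete illustration of the target-level objective described above, the following Python sketch estimates by Monte Carlo simulation the probability that a fixed policy's total discounted reward over a finite horizon fails to exceed a given target. It is only a minimal sketch of the evaluation step under this criterion; the function names, signatures, and sampling interface (policy, step) are hypothetical and do not reproduce the paper's multi-policy improvement or policy-iteration algorithm.

    def estimate_target_risk(policy, step, s0, horizon, gamma, target, n_samples=10000):
        """Estimate P( sum_{t<horizon} gamma^t * r_t <= target ) for a fixed policy.

        policy(s, t) -> action        (hypothetical policy interface)
        step(s, a)   -> (s_next, r)   (hypothetical one-step simulator of the MDP)
        """
        failures = 0
        for _ in range(n_samples):
            s, total, discount = s0, 0.0, 1.0
            for t in range(horizon):
                a = policy(s, t)
                s, r = step(s, a)
                total += discount * r
                discount *= gamma
            # The trajectory "fails" if the discounted reward does not exceed the target.
            if total <= target:
                failures += 1
        return failures / n_samples

Such a simulation-based risk estimate is the kind of quantity a policy-improvement step under this criterion would compare across candidate policies in an on-line control setting.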