In this paper, we address the problem of generating efficient state sample points for the solution of continuous-state finite-horizon Markovian decision problems through approximate dynamic programming. It is known that the selection of sam- pling points at which the value function is observed is a key factor when such function is approximated by a model based on a finite number of evaluations. A standard approach consists in generating these points through a random or deterministic procedure, aiming at a balanced covering of the state space. Yet, this solution may not be efficient if the state trajectories are not uniformly distributed. Here, we propose to exploit F-discrepancy, a quantity that measures how closely a set of random points rep- resents a probability distribution, and introduce an example of an algorithm based on such concept to automatically select point sets that are efficient with respect to the underlying Markovian process. An error analysis of the approximate solution is pro- vided, showing how the proposed algorithm enables convergence under suitable regularity hypotheses. Then, simulation results are provided concerning an inventory forecasting test problem. The tests confirm in general the important role of F-discrepancy, and show how the proposed algorithm is able to yield bet- ter results than uniform sampling, using sets even 50 times smaller.

F-Discrepancy for Efficient Sampling in Approximate Dynamic Programming

Cervellera C;
2016

Abstract

In this paper, we address the problem of generating efficient state sample points for the solution of continuous-state finite-horizon Markovian decision problems through approximate dynamic programming. It is known that the selection of sam- pling points at which the value function is observed is a key factor when such function is approximated by a model based on a finite number of evaluations. A standard approach consists in generating these points through a random or deterministic procedure, aiming at a balanced covering of the state space. Yet, this solution may not be efficient if the state trajectories are not uniformly distributed. Here, we propose to exploit F-discrepancy, a quantity that measures how closely a set of random points rep- resents a probability distribution, and introduce an example of an algorithm based on such concept to automatically select point sets that are efficient with respect to the underlying Markovian process. An error analysis of the approximate solution is pro- vided, showing how the proposed algorithm enables convergence under suitable regularity hypotheses. Then, simulation results are provided concerning an inventory forecasting test problem. The tests confirm in general the important role of F-discrepancy, and show how the proposed algorithm is able to yield bet- ter results than uniform sampling, using sets even 50 times smaller.
2016
Istituto di Studi sui Sistemi Intelligenti per l'Automazione - ISSIA - Sede Bari
Istituto di iNgegneria del Mare - INM (ex INSEAN)
Approximate dynamic pro
F-discrepancy
Markovian decision problem
state sam- pling
value function approximation
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/316812
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 9
  • ???jsp.display-item.citation.isi??? 8
social impact