Value Function Discovery in Markov Decision Processes with Evolutionary Algorithms

Onderwater, Martijn; Bhulai, Sandjai; van der Mei, Rob

doi:10.1109/TSMC.2015.2475716

M. Onderwater (Martijn), S. Bhulai (Sandjai) and R.D. van der Mei (Rob)

2015

IEEE Transactions on Systems, Man, and Cybernetics: Systems

In this paper we introduce a novel method for discovery of value functions for Markov Decision Processes (MDPs). This method, which we call Value Function Discovery (VFD), is based on ideas from the Evolutionary Algorithm field. VFD’s key feature is that it discovers descriptions of value functions that are algebraic in nature. This feature is unique, because the descriptions include the model parameters of the MDP. The algebraic expression of the value function discovered by VFD can be used in several scenarios, e.g., conversion to a policy (with one-step policy improvement) or control of systems with time-varying parameters. The work in this paper is a first step towards exploring potential usage scenarios of discovered value functions. We give a detailed description of VFD and illustrate its application on an example MDP. For this MDP we let VFD discover an algebraic description of a value function that closely resembles the optimal value function. The discovered value function is then used to obtain a policy, which we compare numerically to the optimal policy of the MDP. The resulting policy shows near-optimal performance on a wide range of model parameters. Finally, we identify and discuss future application scenarios of discovered value functions.
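The conversion of an algebraic value function into a policy by one-step policy improvement can be sketched as follows. This is a minimal illustration under stated assumptions, not the MDP or the value function from the paper: the admission-control queue, the function `v_hat`, and the parameters `lam`, `mu`, `reject_cost`, and `capacity` are all hypothetical and chosen only to show the mechanics.

```python
REJECT, ADMIT = 0, 1


def v_hat(x, lam, mu):
    """Hypothetical algebraic value function of the state x and the model
    parameters lam and mu (an illustrative stand-in, requires mu > lam)."""
    return x * (x + 1) / (2.0 * (mu - lam))


def q_value(x, action, lam, mu, reject_cost, capacity):
    """Cost of taking `action` in state x for one step and then following v_hat.

    Assumed model: a uniformized single-server queue with lam + mu <= 1,
    holding cost x per step, and a fixed cost for every rejected arrival.
    """
    if action == ADMIT and x < capacity:
        on_arrival = v_hat(x + 1, lam, mu)
    else:
        # Rejection, or a full queue that cannot admit the arrival.
        on_arrival = reject_cost + v_hat(x, lam, mu)
    on_departure = v_hat(max(x - 1, 0), lam, mu)
    no_event = v_hat(x, lam, mu)
    return (x + lam * on_arrival + mu * on_departure
            + (1.0 - lam - mu) * no_event)


def improved_policy(lam, mu, reject_cost, capacity):
    """One-step policy improvement: pick the cost-minimizing action per state."""
    policy = []
    for x in range(capacity + 1):
        costs = {a: q_value(x, a, lam, mu, reject_cost, capacity)
                 for a in (REJECT, ADMIT)}
        # min() returns the first minimizer, so ties resolve to REJECT.
        policy.append(min((REJECT, ADMIT), key=costs.get))
    return policy
```

Because `v_hat` contains `lam` and `mu` explicitly, calling `improved_policy` with different parameter values yields a new policy without re-solving the MDP, which is the kind of usage the abstract points to for systems with time-varying parameters.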

Additional Metadata
Keywords	Markov Decision Processes, Evolutionary Algorithms, Value Function, Genetic Programming.
Theme	Logistics (theme 3)
Publisher	IEEE
Persistent URL	doi.org/10.1109/TSMC.2015.2475716
Journal	IEEE Transactions on Systems, Man, and Cybernetics: Systems
Project	Realisation of Reliable and Secure Residential Sensor Platforms
Note	[accepted for publication]
Organisation	Stochastics
Citation	Onderwater, M., Bhulai, S., & van der Mei, R. (2015). Value Function Discovery in Markov Decision Processes with Evolutionary Algorithms. IEEE Transactions on Systems, Man, and Cybernetics: Systems. doi:10.1109/TSMC.2015.2475716
