In this paper, we introduce a novel approach to optimizing the control of systems that can be modeled as Markov decision processes (MDPs) with a threshold-based optimal policy. Our method is based on a specific type of genetic programming known as symbolic regression (SR). We show how the performance of SR can be greatly improved by taking into account the MDP framework in which it is applied. The proposed method has two main advantages: (1) it results in near-optimal decision policies, and (2) in contrast to other algorithms, it generates closed-form approximations. Obtaining an explicit expression for the decision policy makes it possible to conduct sensitivity analysis and to recalculate the threshold function instantly for any change in the parameters. We emphasize that the introduced technique is highly general and applicable to any MDP that has a threshold-based optimal policy. Extensive experimentation demonstrates the usefulness of the method.
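To make the notion of a threshold-based optimal policy concrete, the following is a minimal sketch in Python, not the method of the paper. It assumes a hypothetical machine-replacement MDP: the state is a wear level, keeping the machine costs the current wear level per step, and replacing it incurs a fixed cost and resets the wear to zero. All names and parameter values (`solve_replacement_mdp`, `replace_cost`, `p_wear`, `gamma`) are illustrative assumptions. The sketch solves the MDP by plain value iteration and reads off the threshold; it does not perform symbolic regression.

```python
def solve_replacement_mdp(n_states=20, replace_cost=10.0, p_wear=0.5,
                          gamma=0.95, tol=1e-9, max_iter=10_000):
    """Value iteration for a toy machine-replacement MDP (hypothetical example).

    Returns the value function and a policy list where policy[s] is True
    when replacing is strictly cheaper than keeping in wear state s.
    """
    values = [0.0] * n_states

    def q_values(vals):
        # Keep: pay operating cost s, then wear increases by one with
        # probability p_wear (capped at the last state), else stays the same.
        q_keep = []
        for s in range(n_states):
            nxt = min(s + 1, n_states - 1)
            expected = p_wear * vals[nxt] + (1.0 - p_wear) * vals[s]
            q_keep.append(float(s) + gamma * expected)
        # Replace: pay replace_cost, then behave like a fresh machine (state 0).
        q_replace = replace_cost + q_keep[0]
        return q_keep, q_replace

    for _ in range(max_iter):
        q_keep, q_replace = q_values(values)
        new_values = [min(qk, q_replace) for qk in q_keep]
        diff = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if diff < tol:
            break

    q_keep, q_replace = q_values(values)
    policy = [q_replace < qk for qk in q_keep]
    return values, policy


def threshold_of(policy):
    """Smallest wear state in which replacing is optimal, or None if never."""
    for s, replace in enumerate(policy):
        if replace:
            return s
    return None


if __name__ == "__main__":
    # Each parameter change requires re-solving the MDP from scratch.
    for cost in (5.0, 10.0, 20.0):
        _, policy = solve_replacement_mdp(replace_cost=cost)
        print(f"replace_cost={cost}: replace from wear level {threshold_of(policy)}")
```

In this sketch, every change in `replace_cost` triggers a full value-iteration run. A closed-form threshold function of the kind described above would replace that loop with a single expression evaluation, which is also what makes sensitivity analysis direct.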

doi.org/10.1145/3388831.3388840
Advising on and conducting research into the functional and non-functional properties of part of ING's IT infrastructure
International Conference on Performance Evaluation Methodologies and Tools
Stochastics

Hristov, P., Bosman, J., Bhulai, S., & van der Mei, R. (2020). Deriving explicit control policies for Markov Decision Processes using Symbolic Regression. In EAI International Conference on Performance Evaluation Methodologies and Tools (pp. 41–47). doi:10.1145/3388831.3388840