2021-03-29
Deep reinforcement learning in linear discrete action spaces
Publication
Problems in operations research are typically combinatorial and high-dimensional. To a degree, linear programs may efficiently solve such large decision problems. For stochastic multi-period problems, decomposition into a sequence of one-stage decisions with approximated downstream effects is often necessary, e.g., by deploying reinforcement learning to obtain value function approximations (VFAs). When embedding such VFAs into one-stage linear programs, VFA design is restricted by linearity. This paper presents an integrated simulation approach for such complex optimization problems, developing a deep reinforcement learning algorithm that combines linear programming and neural network VFAs. Our proposed method embeds neural network VFAs into one-stage linear decision problems, combining the nonlinear expressive power of neural networks with the efficiency of solving linear programs. As a proof of concept, we perform numerical experiments on a transportation problem. The neural network VFAs consistently outperform polynomial VFAs as well as other benchmarks, with limited design and tuning effort.
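Since the abstract only outlines the approach, here is a minimal, hypothetical sketch of the general idea: a small, already-trained single-hidden-layer ReLU network acting as a VFA is embedded into a one-stage decision problem as a mixed-integer linear program, using the standard big-M encoding of ReLU units. All weights, dimensions, constraints, and the use of PuLP with CBC are illustrative assumptions, not the paper's actual formulation or experimental setup.

```python
# Hypothetical sketch (not the paper's formulation): embed a small trained
# single-hidden-layer ReLU VFA into a one-stage decision problem as a
# mixed-integer linear program, via the standard big-M encoding of ReLU units.
import pulp

# Assumed pre-trained network V(x) ~ w2 . relu(W1 x + b1) + b2 (toy weights).
W1 = [[0.6, -0.3], [0.2, 0.8]]   # hidden-layer weights: 2 hidden units, 2 state features
b1 = [0.1, -0.2]
w2 = [1.0, 0.5]                  # output-layer weights
b2 = 0.0
M = 50.0                         # big-M bound on pre-activations (problem-specific)

prob = pulp.LpProblem("one_stage_with_nn_vfa", pulp.LpMaximize)

# Discrete (binary) action variables and the post-decision state they induce.
a = [pulp.LpVariable(f"a_{i}", cat="Binary") for i in range(2)]
x = [pulp.LpVariable(f"x_{j}", lowBound=0) for j in range(2)]

# Toy linear constraints linking the action to the post-decision state.
prob += x[0] == 1.0 * a[0] + 0.5 * a[1]
prob += x[1] == 0.5 * a[0] + 1.0 * a[1]
prob += a[0] + a[1] <= 1         # e.g. choose at most one action

# Big-M encoding of each hidden ReLU unit h_k = max(0, W1[k] . x + b1[k]).
h = []
for k in range(2):
    h_k = pulp.LpVariable(f"h_{k}", lowBound=0)
    z_k = pulp.LpVariable(f"z_{k}", cat="Binary")
    pre = pulp.lpSum(W1[k][j] * x[j] for j in range(2)) + b1[k]
    prob += h_k >= pre                     # at least the pre-activation
    prob += h_k <= pre + M * (1 - z_k)     # tight when the unit is active (z_k = 1)
    prob += h_k <= M * z_k                 # forced to 0 when inactive (z_k = 0)
    h.append(h_k)

# Objective: immediate linear reward plus the network's downstream-value estimate.
immediate_reward = 2.0 * a[0] + 1.5 * a[1]
prob += immediate_reward + pulp.lpSum(w2[k] * h[k] for k in range(2)) + b2

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("action:", [pulp.value(v) for v in a], "objective:", pulp.value(prob.objective))
```

The big-M constraints are what keeps the overall model linear despite the nonlinear VFA, which is the combination of expressive power and solver efficiency the abstract refers to; the binary indicator per hidden unit is the price paid for the nonlinearity.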
Additional Metadata | |
---|---|
DOI | doi.org/10.1109/WSC48552.2020.9384078 |
Conference | Winter Simulation Conference 2020 |
Organisation | Intelligent and autonomous systems |
van Heeswijk, W., & La Poutré, H. (2021). Deep reinforcement learning in linear discrete action spaces. In Proceedings of the Winter Simulation Conference (pp. 1063–1074). doi:10.1109/WSC48552.2020.9384078