Algorithm configuration in sequential decision-making

Begnardi, Luca; von Meijenfeldt, Bart; Zhang, Yingqian (Jennie); van Jaarsveld, Willem; Baier, Hendrik

doi:10.1007/978-3-031-95973-8_6

L. Begnardi (Luca), B. von Meijenfeldt (Bart), Y.Q. Zhang (Yingqian (Jennie)), W. van Jaarsveld (Willem) and H.J.S. Baier (Hendrik)

2025-06-28

Presented at the 22nd International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR 2025) (November 2025), Melbourne, Australia

Proper parameter configuration of algorithms is essential, but often time-consuming and complex, as many parameters need to be tuned simultaneously and evaluation can be expensive. In this paper, we focus on sequential decision-making (SDM) algorithms, which are applied to problems that require a series of decisions to be taken sequentially, aiming for an optimal cumulative outcome for the agent. To do this, every time the agent needs to make a decision, SDM algorithms take the current state of the environment as input and provide a decision as output. We propose a taxonomy of algorithm configuration approaches for SDM and introduce the concept of Per-State Algorithm Configuration (PSAC). To perform PSAC automatically, we present a framework based on Reinforcement Learning (RL). We demonstrate how PSAC by RL works in practice by applying it to two SDM algorithms on two SDM problems: Monte Carlo Tree Search, to solve a collaborative order picking problem in warehouses, and AlphaZero, to play a classic board game called Connect Four. Our experiments show that, in both use cases, PSAC achieves significant performance improvements compared to fixed parameter configurations. In general, our work expands the field of automated algorithm configuration and opens new possibilities for further research on SDM algorithms and their applications. Code is available at: https://github.com/ai-for-decision-making-tue/Per-State_Algorithm_Configuration.
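
To make the per-state idea concrete, the following is a minimal sketch of the interface the abstract describes: at every decision point a configuration is chosen from the current state and passed to the SDM algorithm, which then returns a decision. It is not taken from the paper or its repository. The toy line-world environment, the depth-limited search standing in for the SDM algorithm, the single `depth` parameter, and the hand-written rule `per_state_config` are all assumptions made for illustration; in the paper the configuration is selected by a policy trained with RL, not by a fixed rule.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]
ACTIONS = (-1, 1)   # hypothetical action set: step left or right
GOAL = 6            # hypothetical goal position on the line


def step(state: int, action: int) -> int:
    """Toy deterministic environment: move on a line, clamped at 0."""
    return max(0, state + action)


def lookahead_value(state: int, depth: int) -> int:
    """Score a state by searching `depth` further moves.

    Reaching GOAL scores the remaining depth (earlier is better);
    otherwise the leaf score is the negative distance to GOAL.
    """
    if state == GOAL:
        return depth
    if depth == 0:
        return -abs(GOAL - state)
    return max(lookahead_value(step(state, a), depth - 1) for a in ACTIONS)


def sdm_algorithm(state: int, config: Config) -> int:
    """Stand-in SDM algorithm: depth-limited search with one parameter."""
    depth = config["depth"]
    return max(ACTIONS, key=lambda a: lookahead_value(step(state, a), depth - 1))


def fixed_config(state: int) -> Config:
    """Baseline: the same configuration in every state."""
    return {"depth": 2}


def per_state_config(state: int) -> Config:
    """Hand-written stand-in for a learned policy: search deeper near GOAL."""
    return {"depth": 1 if abs(GOAL - state) > 3 else 3}


def run_episode(
    config_policy: Callable[[int], Config], start: int = 0, max_steps: int = 20
) -> List[Tuple[int, int, int]]:
    """Run one episode, re-configuring the algorithm before every decision."""
    state, trace = start, []
    for _ in range(max_steps):
        if state == GOAL:
            break
        config = config_policy(state)          # configuration chosen per state
        action = sdm_algorithm(state, config)  # state in, decision out
        trace.append((state, config["depth"], action))
        state = step(state, action)
    return trace


if __name__ == "__main__":
    for state, depth, action in run_episode(per_state_config):
        print(f"state={state} depth={depth} action={action}")
```

The sketch only shows where the per-state configuration hook sits in the decision loop; it makes no claim about performance. Swapping `per_state_config` for `fixed_config` in `run_episode` gives the fixed-configuration baseline under the same loop.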

Additional Metadata
Keywords	Algorithm configuration, Sequential decision-making, Reinforcement learning
Persistent URL	doi.org/10.1007/978-3-031-95973-8_6
Conference	22nd International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR 2025)
Citation	Begnardi, L., von Meijenfeldt, B., Zhang, Y. (Jennie), van Jaarsveld, W., & Baier, H. (2025). Algorithm configuration in sequential decision-making. In Proceedings of the 22nd International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR 2025) (pp. 86–102). doi:10.1007/978-3-031-95973-8_6
