Decision making in non-stationary environments with policy-augmented search

Pettet, Ava; Zhang, Yunuo; Luo, Baiting; Wray, Kyle; Baier, Hendrik; Laszka, Aron; Dubey, Abhishek; Mukhopadhyay, Ayan

A. Pettet (Ava), Y. Zhang (Yunuo), B. Luo (Baiting), K. Wray (Kyle), H.J.S. Baier (Hendrik), A. Laszka (Aron), A. Dubey (Abhishek) and A. Mukhopadhyay (Ayan)

2024-05-06

Presented at AAMAS '24: the 23rd International Conference on Autonomous Agents and Multiagent Systems (May 2024), Auckland, New Zealand

Sequential decision-making is challenging in non-stationary environments, where the environment in which an agent operates can change over time. Policies learned before execution become stale when the environment changes, and relearning takes time and computational effort. Online search, on the other hand, can return sub-optimal actions when the allowed runtime is limited. In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment. We prove several theoretical results about PA-MCTS. We also compare our approach with AlphaZero (another hybrid planning approach) and deep Q-learning on several OpenAI Gym environments and show that PA-MCTS outperforms these baselines.
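
The abstract does not spell out how the two sources of action-value estimates are combined. As a minimal sketch, assuming a convex combination with a hypothetical weight `alpha` in [0, 1], action selection at the root could look as follows. The function name `pa_select_action`, the dictionaries `q_stale` and `g_search`, and the weighting rule itself are illustrative assumptions, not details taken from this record.

```python
from typing import Dict


def pa_select_action(
    q_stale: Dict[str, float],
    g_search: Dict[str, float],
    alpha: float,
) -> str:
    """Pick the action maximizing a weighted mix of two value estimates.

    q_stale:  action values from the out-of-date policy (assumed given).
    g_search: action values estimated by online search on the
              up-to-date environment model (assumed given).
    alpha:    hypothetical trust weight; 1.0 uses only the stale policy,
              0.0 uses only the search estimates.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Only actions scored by both sources can be compared.
    actions = sorted(q_stale.keys() & g_search.keys())
    if not actions:
        raise ValueError("no action has both estimates")

    def combined(action: str) -> float:
        # Assumed convex combination of the two estimates.
        return alpha * q_stale[action] + (1.0 - alpha) * g_search[action]

    return max(actions, key=combined)


# Illustrative numbers only: the stale policy prefers "left",
# while the search on the current model prefers "right".
q_stale = {"left": 1.0, "right": 0.2}
g_search = {"left": 0.1, "right": 0.8}

choice_trust_policy = pa_select_action(q_stale, g_search, alpha=1.0)
choice_trust_search = pa_select_action(q_stale, g_search, alpha=0.0)
choice_mixed = pa_select_action(q_stale, g_search, alpha=0.3)
```

In this sketch, a larger `alpha` expresses more trust in the stale policy, which would be reasonable when the environment has changed little, while a smaller `alpha` leans on the search, which would be reasonable when the change is large and the search budget is adequate.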

Additional Metadata
Keywords	MCTS, Non-stationary environments, Sequential decision-making
Conference	AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems
Citation	Pettet, A., Zhang, Y., Luo, B., Wray, K., Baier, H., Laszka, A., … Mukhopadhyay, A. (2024). Decision making in non-stationary environments with policy-augmented search. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (pp. 2417–2419).
