Detecting switches against non-stationary opponents
Interactions in multiagent systems are generally more complicated than single-agent ones. Game theory provides solutions on how to act in multiple-agent scenarios; however, it assumes that all agents will act rationally. Moreover, some works also assume the opponent will use a stationary strategy. These assumptions usually do not hold in real-world scenarios, where agents have limited capacities and may deviate from a perfectly rational response. Our goal is still to act optimally in these cases by learning the appropriate response, without any prior policies on how to act. Thus, we focus on the problem where another agent in the environment uses different stationary strategies over time. This paper introduces DriftER, an algorithm that 1) learns a model of the opponent, 2) uses that model to obtain an optimal policy and then 3) determines when it must re-learn due to an opponent strategy change. We provide theoretical results showing that DriftER is guaranteed to detect switches with high probability. We also provide empirical results in normal-form games and then in a more realistic scenario, the Power TAC simulator.
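The learn-then-monitor loop described above can be illustrated with a minimal sketch. It is not the DriftER implementation: the class name `SwitchDetector`, the frequency-count opponent model, the sliding window, and the fixed error-rate threshold are all assumptions chosen for illustration, and the abstract does not specify the actual detection test. The sketch covers only model learning (step 1) and switch detection (step 3); computing the policy against the learned model (step 2) is omitted.

```python
from collections import Counter, deque


class SwitchDetector:
    """Hypothetical helper: frequency model of the opponent plus an error-rate check."""

    def __init__(self, window=20, threshold=0.5):
        self.counts = Counter()             # opponent action frequencies (the learned model)
        self.errors = deque(maxlen=window)  # 1 if a recent prediction was wrong, else 0
        self.threshold = threshold          # assumed cutoff, not taken from the paper

    def predict(self):
        # Most frequent opponent action so far; None before any observation.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def observe(self, action):
        """Record the opponent's action; return True if a switch is flagged."""
        predicted = self.predict()
        if predicted is not None:
            self.errors.append(int(predicted != action))
        self.counts[action] += 1
        window_full = len(self.errors) == self.errors.maxlen
        if window_full and sum(self.errors) / len(self.errors) > self.threshold:
            # Discard the stale model so learning restarts from scratch.
            self.counts.clear()
            self.errors.clear()
            return True
        return False


detector = SwitchDetector(window=5, threshold=0.5)
history = ["C"] * 10 + ["D"] * 3  # opponent changes strategy after 10 rounds
flags = [detector.observe(a) for a in history]
```

With these illustrative settings, the detector stays quiet while the opponent keeps playing `"C"` and raises a flag once most of the recent predictions have failed. A shorter window reacts faster but is more prone to false alarms against a noisy opponent.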
|Keywords||Non-stationary opponents, Repeated games, Markov decision processes|
|Stakeholder||PROWLER.io Ltd. (Cambridge)|
|Project||Demand response for grid-friendly quasi-autarkic energy cooperatives|
|Conference||International Joint Conference on Autonomous Agents and Multiagent Systems|
|Grant||This work was funded by the Netherlands Organisation for Scientific Research (NWO); grant id nwo/651.001.003 - Demand response for grid-friendly quasi-autarkic energy cooperatives|
Hernandez-Leal, P, Zhan, Y, Taylor, M.E, Sucar, L.E, & Munoz de Cote, E. (2017). Detecting switches against non-stationary opponents. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (pp. 920–921).