Efficiently detecting switches against non-stationary opponents

Hernandez-Leal, Pablo; Zhan, Yusen; Taylor, Matthew; Sucar, Enrique; Munoz de Cote, Enrique

doi:10.1007/s10458-016-9352-6

P. Hernandez-Leal (Pablo), Y. Zhan (Yusen), M.E. Taylor (Matthew), L.E. Sucar (Enrique) and E. Munoz de Cote (Enrique)

2017-07-01

Autonomous Agents and Multi-Agent Systems, Volume 31, p. 767-789

Interactions in multiagent systems are generally more complicated than single-agent ones. Game theory provides solutions for how to act in multiagent scenarios; however, it assumes that all agents act rationally. Moreover, some works also assume that the opponent uses a stationary strategy. These assumptions usually do not hold in real-world scenarios, where agents have limited capacities and may deviate from a perfectly rational response. Our goal is still to act optimally in these cases by learning the appropriate response, without any prior policies on how to act. Thus, we focus on the problem in which another agent in the environment uses different stationary strategies over time. This turns the problem into learning in a non-stationary environment, which poses a problem for most learning algorithms. This paper introduces DriftER, an algorithm that (1) learns a model of the opponent, (2) uses that model to obtain an optimal policy, and then (3) determines when it must re-learn due to an opponent strategy change. We provide theoretical results showing that DriftER is guaranteed to detect switches with high probability. We also provide empirical results showing that our approach outperforms state-of-the-art algorithms, first in normal-form games such as the prisoner’s dilemma and then in a more realistic scenario, the Power TAC simulator.
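The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the frequency-count opponent model, the myopic best response standing in for the policy step, the sliding-window error test, the class name SwitchDetector, the payoff values and all parameter defaults are assumptions chosen for illustration only.

```python
from collections import Counter, deque

# Illustrative prisoner's dilemma payoffs for the learner (assumed values).
# Key: (my_action, opponent_action).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class SwitchDetector:
    """Hypothetical sketch: learn a model, respond to it, watch its error rate."""

    def __init__(self, learn_steps=20, window=10, margin=0.3):
        self.learn_steps = learn_steps  # rounds used to fit the opponent model
        self.window = window            # rounds in the sliding error window
        self.margin = margin            # tolerated rise in prediction error
        self._reset()

    def _reset(self):
        self.counts = Counter()         # opponent action frequencies
        self.baseline = 0.0             # model error rate on the learning data
        self.recent = deque(maxlen=self.window)

    def predict(self):
        """Step (1): most frequent action seen while learning (None if no data)."""
        return self.counts.most_common(1)[0][0] if self.counts else None

    def best_response(self):
        """Step (2), simplified: myopic best response to the predicted action."""
        guess = self.predict()
        if guess is None:
            return "C"  # arbitrary default before any data (assumption)
        return max("CD", key=lambda a: PAYOFF[(a, guess)])

    def observe(self, opp_action):
        """Step (3): feed one opponent action; return True if a switch is flagged."""
        seen = sum(self.counts.values())
        if seen < self.learn_steps:
            # Learning phase: update the model, fix the baseline at the end.
            self.counts[opp_action] += 1
            if seen + 1 == self.learn_steps:
                top = self.counts.most_common(1)[0][1]
                self.baseline = 1 - top / self.learn_steps
            return False
        # Monitoring phase: record whether the model mispredicted this round.
        self.recent.append(int(self.predict() != opp_action))
        if len(self.recent) == self.window:
            rate = sum(self.recent) / self.window
            if rate > self.baseline + self.margin:
                self._reset()  # discard the stale model and re-learn
                return True
        return False


# The opponent always cooperates for 30 rounds, then always defects.
history = ["C"] * 30 + ["D"] * 30
detector = SwitchDetector()
flags = [t for t, action in enumerate(history) if detector.observe(action)]
print("switch flagged at rounds:", flags)
```

The sketch flags a switch only after the recent error rate exceeds the learning-phase error rate by a fixed margin, so a few isolated mispredictions do not trigger re-learning. A smaller window or margin reacts faster but raises the chance of false alarms.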

Additional Metadata
Keywords	Learning, Non-stationary environments, Repeated games, Switching strategies
Stakeholder	PROWLER.io Ltd. (Cambridge)
Persistent URL	doi.org/10.1007/s10458-016-9352-6
Journal	Autonomous Agents and Multi-Agent Systems
Project	Demand response for grid-friendly quasi-autarkic energy cooperatives
Grant	This work was funded by The Netherlands Organisation for Scientific Research (NWO); grant id nwo/651.001.003 - Demand response for grid-friendly quasi-autarkic energy cooperatives
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Hernandez-Leal, P., Zhan, Y., Taylor, M., Sucar, E., & Munoz de Cote, E. (2017). Efficiently detecting switches against non-stationary opponents. Autonomous Agents and Multi-Agent Systems, 31, 767–789. doi:10.1007/s10458-016-9352-6

