RLBOA: A modular reinforcement learning framework for autonomous negotiating agents
Negotiation is a complex problem in which the variety of settings and opponents that may be encountered prohibits the use of a single predefined negotiation strategy. Hence, an agent should be able to learn such a strategy autonomously. To this end we propose RLBOA, a modular framework that facilitates the creation of autonomous negotiation agents using reinforcement learning. The framework allows for the creation of agents that are capable of negotiating effectively in many different scenarios. To cope with the large state and action spaces and the diversity of settings, we leverage the modular BOA framework, which decouples the negotiation strategy into a Bidding strategy, an Opponent model and an Acceptance condition. Furthermore, we map the multidimensional contract space onto the utility axis, which enables a compact and generic state and action description. We demonstrate the value of the RLBOA framework by implementing an agent that uses tabular Q-learning on the compressed state and action space to learn a bidding strategy. We show that the resulting agent is able to learn well-performing bidding strategies in a range of negotiation settings and is able to generalize across opponents and domains.
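The core idea of learning a bidding strategy over the utility axis can be sketched with tabular Q-learning. The sketch below is illustrative only and not the paper's implementation: the linear-conceder opponent, the acceptance rule, the reward, and the 10-bin discretization are all invented assumptions. States are the discretized utility (to our agent) of the opponent's last offer, and actions are discretized target utilities for the counter-offer.

```python
import random

N_BINS = 10  # discretization of the [0, 1] utility axis (assumed granularity)

def to_bin(u):
    """Map a utility in [0, 1] to a discrete bin index."""
    return min(int(u * N_BINS), N_BINS - 1)

def opponent_offer(t, horizon):
    """Hypothetical time-dependent conceder: the utility it offers
    to our agent rises linearly as the deadline approaches."""
    return 0.2 + 0.6 * (t / horizon)

def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.95, eps=0.1):
    # Q[state][action]: state = bin of the opponent's last offer to us,
    # action = bin of the utility we demand in our counter-offer.
    Q = [[0.0] * N_BINS for _ in range(N_BINS)]
    rng = random.Random(0)
    for _ in range(episodes):
        for t in range(horizon):
            offered = opponent_offer(t, horizon)
            s = to_bin(offered)
            # epsilon-greedy action selection over the compressed action space
            if rng.random() < eps:
                a = rng.randrange(N_BINS)
            else:
                a = max(range(N_BINS), key=lambda x: Q[s][x])
            demand = (a + 0.5) / N_BINS
            # Simplified acceptance rule (an assumption, not the BOA one):
            # the opponent accepts any demand not exceeding its current offer.
            if demand <= offered:
                reward, done = demand, True
            else:
                reward, done = 0.0, t == horizon - 1  # breakdown at deadline
            s2 = to_bin(opponent_offer(min(t + 1, horizon - 1), horizon))
            target = reward if done else reward + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])  # standard Q-learning update
            if done:
                break
    return Q
```

Because both the state and the action are single utility bins rather than full multi-issue bids, the Q-table stays small (here 10 x 10) regardless of the size of the underlying contract space, which is the compression the abstract refers to.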
Keywords: Bargaining and negotiation; Learning agent-to-agent interactions (negotiation, trust, coordination); Reinforcement learning
Project: Representing Users in a Negotiation (RUN): An Autonomous Negotiator Under Preference Uncertainty
Conference: International Conference on Autonomous Agents and Multi-Agent Systems
Funding: This work was funded by the Netherlands Organisation for Scientific Research (NWO); grant id nwo/639.021.751 - Representing Users in a Negotiation (RUN): An Autonomous Negotiator Under Preference Uncertainty
Organisation: Intelligent and autonomous systems
Bakker, J., Hammond, A., Bloembergen, D., & Baarslag, T. (2019). RLBOA: A modular reinforcement learning framework for autonomous negotiating agents. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (pp. 260–268).