One of our primary characteristics as living organisms is the ability to rapidly process and react to unknown and unexpected events. To this end, we are able to recognize an event or a sequence of events and learn to respond properly. Despite advances in machine learning, current cognitive robotic systems are not able to respond rapidly and efficiently in the real world: the challenge is to learn both what is important to recognize and when to act. Reinforcement Learning (RL) is typically used to solve complex tasks: to learn the how. To respond quickly - to learn when - the environment has to be sampled often enough. To define “enough”, a programmer has to decide on a step-size as the representation of time, choosing between a fine-grained representation of time (many state transitions; difficult to learn with RL) and a coarse temporal resolution (easier to learn with RL but lacking precise timing). Here, we derive a continuous-time version of on-policy SARSA-learning in a working-memory neural network model, AuGMEnT. While the neural working-memory network resolves the what problem, our when solution builds on the notion that, in the real world, instantaneous actions of duration dt are impossible. We demonstrate how action duration can be decoupled from the internal time-steps of the neural RL model using an action selection system. The resultant CT-AuGMEnT successfully learns to react to the events of a continuous-time task, without any pre-imposed specifications about the duration of the events or the delays between them.
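To make the timing issue concrete, the sketch below (plain Python, not the authors' CT-AuGMEnT network; all names, the toy environment, and the parameter values are illustrative assumptions) shows how an on-policy SARSA update can run at a fine internal time-step dt, with reward treated as a rate and discounting applied per unit of elapsed time, while a selected action is held for its own duration rather than for a single internal step.

    import numpy as np

    def ct_sarsa_update(Q, s, a, r, s_next, a_next, dt, alpha=0.1, gamma=0.9):
        # One on-policy update at internal resolution dt: reward is a rate
        # (r * dt) and discounting scales with elapsed time (gamma ** dt).
        td_error = r * dt + (gamma ** dt) * Q[s_next, a_next] - Q[s, a]
        Q[s, a] += alpha * td_error
        return td_error

    def select_action(Q, s, rng, epsilon=0.1):
        # Epsilon-greedy selection; the chosen action is then held for its
        # own duration, independent of the internal time-step dt.
        if rng.random() < epsilon:
            return int(rng.integers(Q.shape[1]))
        return int(np.argmax(Q[s]))

    rng = np.random.default_rng(0)
    Q = np.zeros((5, 2))     # toy problem: 5 states, 2 actions
    dt = 0.05                # fine internal time-step
    hold_steps = 5           # each action spans 5 internal steps (duration 0.25)

    s = 0
    a = select_action(Q, s, rng)
    for step in range(hold_steps):
        r, s_next = 0.1, (s + 1) % 5   # stand-in environment transition
        a_next = a if step < hold_steps - 1 else select_action(Q, s_next, rng)
        ct_sarsa_update(Q, s, a, r, s_next, a_next, dt)
        s, a = s_next, a_next

The point of the sketch is only the separation of concerns: the learning rule sees every internal step, while action duration is governed by the selection mechanism, so shrinking dt need not change how long actions last.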

Zambrano, D., Roelfsema, P., & Bohte, S. (2015). Continuous-time on-policy neural reinforcement learning of working memory tasks. In Proceedings of the International Joint Conference on Neural Networks (IJCNN). IEEE. doi:10.1109/IJCNN.2015.7280636