2011
Adaptive Hedge
Publication
Presented at the 25th Annual Conference on Neural Information Processing Systems, NIPS 2011 (December 2011), Granada, Spain
Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.
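To illustrate the role of the learning rate discussed in the abstract, the following is a minimal sketch of the standard (non-adaptive) Hedge update, where each action's weight is proportional to the exponentiated negative cumulative loss. This is not the paper's adaptive tuning procedure, only the baseline algorithm it builds on; the function name and example losses are illustrative.

```python
import math

def hedge_weights(cumulative_losses, eta):
    """Exponential-weights distribution of the Hedge algorithm:
    weight_i is proportional to exp(-eta * L_i), where L_i is the
    cumulative loss of action i and eta > 0 is the learning rate."""
    exps = [math.exp(-eta * L) for L in cumulative_losses]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative losses: action 0 is clearly best so far. A larger
# learning rate eta concentrates the weights on it more aggressively,
# which helps on easy instances but can hurt in the worst case.
losses = [2.0, 5.0, 5.5]
print(hedge_weights(losses, eta=0.1))
print(hedge_weights(losses, eta=2.0))
```

The trade-off visible here is exactly what the paper addresses: rather than fixing `eta` for worst-case guarantees, their method adapts it to the observed difficulty of the instance.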
| Additional Metadata | |
|---|---|
| Publisher | The MIT Press |
| Series | Advances in Neural Information Processing Systems |
| Project | Learning when all models are wrong |
| Conference | 25th Annual Conference on Neural Information Processing Systems, NIPS 2011 |
| Organisation | Algorithms and Complexity |
| Citation | van Erven, T., Grünwald, P., Koolen-Wijkstra, W., & de Rooij, S. (2011). Adaptive Hedge. In Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011. The MIT Press. |