Bayesian model averaging, model selection and their approximations such as BIC are generally statistically consistent, but sometimes achieve slower rates of convergence than other methods such as AIC and leave-one-out cross-validation. On the other hand, these other methods can be inconsistent. We identify the catch-up phenomenon as a novel explanation for the slow convergence of Bayesian methods. Based on this analysis we define the switch-distribution, a modification of the Bayesian model averaging distribution. We prove that in many situations model selection and prediction based on the switch-distribution are both consistent and achieve optimal convergence rates, thereby resolving the AIC-BIC dilemma. The method is practical; we give an efficient algorithm.
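
The AIC-BIC dilemma mentioned in the abstract can be illustrated with the standard textbook definitions of the two criteria. The sketch below is not the switch-distribution or the paper's algorithm; it only shows how the differing complexity penalties (2k for AIC, k ln n for BIC) can lead the two criteria to prefer different models on the same data. The model names, log-likelihood values, and sample size are hypothetical and chosen purely for illustration.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


def select(scores: dict) -> str:
    """Return the name of the model with the smallest criterion value."""
    return min(scores, key=scores.get)


# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
# These numbers are assumptions for illustration, not results from the paper.
fits = {"simple": (-520.0, 2), "complex": (-514.0, 6)}
n = 1000  # hypothetical sample size

aic_choice = select({name: aic(ll, k) for name, (ll, k) in fits.items()})
bic_choice = select({name: bic(ll, k, n) for name, (ll, k) in fits.items()})

print("AIC selects:", aic_choice)
print("BIC selects:", bic_choice)
```

With these made-up numbers the modest likelihood gain of the larger model outweighs the AIC penalty but not the heavier BIC penalty, so the two criteria disagree.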

Learning when all models are wrong
21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Quantum Computing and Advanced System Research

van Erven, T., Grünwald, P., & de Rooij, S. (2008). Catching up faster in Bayesian model selection and model averaging. In Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference (pp. 417–424).