Consider the set of all sequences of n outcomes, each taking one of m values, whose frequency vectors satisfy a set of linear constraints. If m is fixed while n increases, most sequences that satisfy the constraints result in frequency vectors whose entropy approaches that of the maximum entropy vector satisfying the constraints. This well-known entropy concentration phenomenon underlies the maximum entropy method. Existing proofs of the concentration phenomenon are based on limits or asymptotics and unrealistically assume that constraints hold precisely, supporting maximum entropy inference more in principle than in practice. We present, for the first time, non-asymptotic, explicit lower bounds on n for a number of variants of the concentration result to hold to any prescribed accuracies, with the constraints holding up to any specified tolerance, considering the fact that allocations of discrete units can satisfy constraints only approximately. Again unlike earlier results, we measure concentration not by deviation from the maximum entropy value, but by the ℓ1 and ℓ2 distances from the maximum entropy-achieving frequency vector. One of our results holds independently of the alphabet size m and is based on a novel proof technique using the multi-dimensional Berry-Esseen theorem. We illustrate and compare our results using various detailed examples.
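As a rough numerical illustration of the concentration phenomenon described above, the following sketch enumerates by brute force all count vectors for m = 3 outcome values under a single mean constraint, weights each by the number of sequences that realize it, and reports the fraction of sequences whose frequency vector lies within a given ℓ1 distance of the maximum entropy vector. It does not reproduce the paper's bounds or proof technique. The outcome values (1, 2, 3), the target mean 2.5, the tolerance `tol`, the radius `delta`, and both function names are assumptions chosen purely for illustration.

```python
from math import comb, exp


def maxent_with_mean(values, target, iters=200):
    """Maximum entropy distribution on `values` with mean `target`.

    Uses the standard exponential form p_i proportional to exp(lam * v_i)
    and finds lam by bisection (the mean is increasing in lam).
    """
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        lam = (lo + hi) / 2
        weights = [exp(lam * v) for v in values]
        mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)
        if mean < target:
            lo = lam
        else:
            hi = lam
    total = sum(weights)
    return [w / total for w in weights]


def concentration_fraction(n, target=2.5, tol=0.01, delta=0.1):
    """Fraction of length-n sequences over {1, 2, 3} that satisfy
    |mean - target| <= tol and whose frequency vector is within
    l1 distance `delta` of the maximum entropy vector (illustrative values).
    """
    values = (1, 2, 3)
    p_star = maxent_with_mean(values, target)
    feasible = 0  # number of sequences satisfying the constraint
    near = 0      # those whose frequency vector is close to p_star
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            counts = (n1, n2, n - n1 - n2)
            mean = sum(v * c for v, c in zip(values, counts)) / n
            if abs(mean - target) > tol:
                continue
            # Number of sequences with exactly these counts (multinomial).
            ways = comb(n, n1) * comb(n - n1, n2)
            feasible += ways
            l1 = sum(abs(c / n - p) for c, p in zip(counts, p_star))
            if l1 <= delta:
                near += ways
    if feasible == 0:
        raise ValueError("no frequency vector satisfies the constraint for this n")
    return near / feasible


if __name__ == "__main__":
    for n in (20, 50, 100, 200):
        print(n, round(concentration_fraction(n), 4))
```

Under these assumed settings the reported fraction should grow with n. The enumeration is quadratic in n and specific to m = 3, so it is only a toy check. Note that the tolerance is kept small relative to `delta`, since a looser constraint moves the entropy maximizer away from the exact-constraint vector used as the reference here.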

IEEE Transactions on Information Theory
Machine Learning

Oikonomou, K., & Grünwald, P. (2016). Explicit bounds for entropy concentration under linear constraints. IEEE Transactions on Information Theory, 62(3), 1206–1230. doi:10.1109/TIT.2015.2458951