We develop and compare e-variables for testing whether k samples of data are drawn from the same distribution, the alternative being that they come from different elements of an exponential family. We consider the GRO (growth-rate optimal) e-variables for (1) a ‘small’ null inside the same exponential family, and (2) a ‘large’ nonparametric null, as well as (3) an e-variable arrived at by conditioning on the sum of the sufficient statistics. (2) and (3) are efficiently computable, and extend ideas from Turner et al. (2021) and Wald (1947) respectively from Bernoulli to general exponential families. We provide theoretical and simulation-based comparisons of these e-variables in terms of their logarithmic growth rate, and find that for small effects all four e-variables behave surprisingly similarly; for the Gaussian location and Poisson families, e-variables (1) and (3) coincide; for Bernoulli, (1) and (2) coincide; but in general, whether (2) or (3) grows faster under the alternative is family-dependent. We furthermore discuss algorithms for numerically approximating (1).
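To illustrate the conditioning idea behind e-variable (3), the following minimal sketch treats the simplest case we could think of: two groups, one Poisson count each. Given the sum of the two counts, the first count is binomial, with success probability 1/2 under any null with equal rates, so the conditional likelihood ratio has expectation 1 under the null. The function names and the alternative rates `lam1`, `lam2` are hypothetical choices made for this example and are not taken from the paper.

```python
from math import comb


def binom_pmf(y, n, p):
    """Binomial probability mass function."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def conditional_e_value(y1, y2, lam1, lam2):
    """Conditional likelihood ratio for two Poisson counts given their sum.

    Assumes one observation per group. lam1 and lam2 are the (hypothetical)
    Poisson rates chosen to represent the alternative.
    """
    n = y1 + y2
    # Given y1 + y2 = n, y1 is Binomial(n, lam1 / (lam1 + lam2)) under the
    # alternative and Binomial(n, 1/2) under any null with equal rates.
    p_alt = lam1 / (lam1 + lam2)
    return binom_pmf(y1, n, p_alt) / binom_pmf(y1, n, 0.5)


# Check the e-variable property conditionally on the sum n = 10:
# the expectation under the null should equal 1 up to rounding error.
n = 10
expected = sum(
    binom_pmf(y, n, 0.5) * conditional_e_value(y, n - y, 2.0, 1.0)
    for y in range(n + 1)
)
print(expected)
```

Because the null expectation equals 1 for every value of the sum, it also equals 1 unconditionally, whatever the common rate is. This is what makes the conditional construction a valid e-variable against the whole composite null in this toy case.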

doi.org/10.1007/s13171-024-00339-9
Sankhya A
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Hao, Y., Grünwald, P., Lardy, T., Long, L., & Armann, M. (2024). E-values for k-sample tests with exponential families. Sankhya A. doi:10.1007/s13171-024-00339-9