We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes. Tests based on e-values are safe, i.e., they preserve Type-I error guarantees, under such optional continuation. We define growth-rate optimality (GRO) as an analogue of power in an optional continuation context, and we show how to construct GRO e-variables for general testing problems with composite null and alternative, emphasizing models with nuisance parameters. GRO e-values take the form of Bayes factors with special priors. We illustrate the theory using several classic examples including a one-sample safe t-test and the 2×2 contingency table. Sharing Fisherian, Neymanian and Jeffreys-Bayesian interpretations, e-values may provide a methodology acceptable to adherents of all three schools.
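
As a rough illustration of optional continuation, the sketch below uses the simplest possible setting rather than the GRO constructions of the paper: a simple null N(0, 1) tested against a simple alternative N(mu_alt, 1), where the likelihood ratio has expectation 1 under the null and is therefore an e-value. The function names, the value of `mu_alt`, and the sample data are assumptions made for this example only. E-values from successive studies are multiplied, and the null is rejected once the product reaches 1/alpha.

```python
import math


def likelihood_ratio_e_value(xs, mu_alt=0.5):
    """E-value for H0: X ~ N(0, 1) against the simple alternative N(mu_alt, 1).

    The log likelihood ratio of one observation x is mu_alt * x - mu_alt**2 / 2.
    Under H0 the likelihood ratio has expectation 1, so it is an e-value.
    mu_alt is an illustrative choice, not a value taken from the paper.
    """
    log_e = sum(mu_alt * x - 0.5 * mu_alt ** 2 for x in xs)
    return math.exp(log_e)


def combine(e_values):
    """Multiply the e-values of successive studies (optional continuation)."""
    product = 1.0
    for e in e_values:
        product *= e
    return product


def reject(e_value, alpha=0.05):
    """Reject H0 when the (combined) e-value is at least 1 / alpha."""
    return e_value >= 1.0 / alpha


if __name__ == "__main__":
    # Hypothetical data: the second study is run only after seeing the first.
    study_1 = [0.9, 1.4, 0.3, 1.1]
    study_2 = [1.2, 0.8, 1.6]
    e_1 = likelihood_ratio_e_value(study_1)
    e_2 = likelihood_ratio_e_value(study_2)
    total = combine([e_1, e_2])
    print(e_1, e_2, total, reject(total))
```

The Type-I error bound follows from Markov's inequality: an e-value has expectation at most 1 under the null, so the probability that it reaches 1/alpha is at most alpha. The product remains an e-value as long as each factor is an e-value conditionally on the earlier studies, which is why the decision to run `study_2` may depend on `e_1`.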

doi.org/10.1093/jrsssb/qkae011
Journal of the Royal Statistical Society Series B: Statistical Methodology
Machine Learning

Grünwald, P., de Heide, R., & Koolen-Wijkstra, W. (2024). Safe testing. Journal of the Royal Statistical Society Series B: Statistical Methodology, 86(5), 1091–1128. doi:10.1093/jrsssb/qkae011