Safe Testing

Grünwald, Peter; de Heide, Rianne; Koolen-Wijkstra, Wouter

We present a new theory of hypothesis testing. The main concept is the S-value, a notion of evidence which, unlike p-values, allows for effortlessly combining evidence from several tests, even in the common scenario where the decision to perform a new test depends on the previous test outcome: safe tests based on S-values generally preserve Type-I error guarantees under such "optional continuation". S-values exist for completely general testing problems with composite null and alternative hypotheses. Their prime interpretation is in terms of gambling or investing, each S-value corresponding to a particular investment. Surprisingly, optimal "GROW" S-values, which lead to the fastest capital growth, are fully characterized by the joint information projection (JIPr) between the set of all Bayes marginal distributions on H0 and H1. Thus, optimal S-values also have an interpretation as Bayes factors, with priors given by the JIPr. We illustrate the theory using two classical testing scenarios: the one-sample t-test and the 2x2 contingency table. In the t-test setting, GROW S-values correspond to adopting the right Haar prior on the variance, as in Jeffreys' Bayesian t-test. However, unlike Jeffreys' test, the "default" safe t-test puts a discrete 2-point prior on the effect size, leading to better behavior in terms of statistical power. Sharing Fisherian, Neymanian and Jeffreys-Bayesian interpretations, S-values and safe tests may provide a methodology acceptable to adherents of all three schools.
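The gambling interpretation and the optional-continuation guarantee can be illustrated with a minimal sketch. It assumes the simplest possible setting, a point null N(0, 1) against a point alternative N(mu1, 1), where the likelihood ratio is an S-value because its expectation under the null equals 1. It does not implement the GROW construction or the JIPr for composite hypotheses, and the function names, batch size, and the value mu1 = 0.5 are hypothetical choices made only for illustration.

```python
import math
import random


def s_value(xs, mu1=0.5):
    """Likelihood ratio of N(mu1, 1) against the point null N(0, 1) for one batch.

    Under the null its expectation is 1, which is what makes it an S-value.
    """
    # log p1(x) - log p0(x) = mu1 * x - mu1^2 / 2 for unit-variance Gaussians
    log_lr = sum(mu1 * x - 0.5 * mu1 ** 2 for x in xs)
    return math.exp(log_lr)


def run_with_optional_continuation(rng, true_mean, alpha=0.05,
                                   batch_size=10, max_batches=20):
    """Multiply batch S-values and stop as soon as the product reaches 1/alpha.

    Whether a new batch is collected depends on all previous outcomes,
    which is the "optional continuation" scenario from the abstract.
    """
    capital = 1.0  # start with one unit of capital
    for _ in range(max_batches):
        batch = [rng.gauss(true_mean, 1.0) for _ in range(batch_size)]
        capital *= s_value(batch)  # reinvest everything in the next test
        if capital >= 1.0 / alpha:
            return True  # reject the null
    return False


def rejection_rate(true_mean, trials=2000, seed=0):
    """Fraction of simulated studies in which the null is rejected."""
    rng = random.Random(seed)
    rejections = sum(run_with_optional_continuation(rng, true_mean)
                     for _ in range(trials))
    return rejections / trials
```

Calling `rejection_rate(0.0)` simulates data from the null, where the rejection frequency should stay at or below roughly alpha despite the data-dependent stopping; calling `rejection_rate(0.5)` simulates data from the alternative, where the accumulated capital typically crosses 1/alpha within a few batches.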

Additional Metadata
Series	arXiv.org e-Print archive
Project	Safe Bayesian Inference: A Theory of Misspecification based on Statistical Learning
Grant	This work was funded by The Netherlands Organisation for Scientific Research (NWO); grant id nwo/617.001.651 - Safe Bayesian Inference: A Theory of Misspecification based on Statistical Learning
Organisation	Machine Learning
Citation	Grünwald, P., de Heide, R., & Koolen-Wijkstra, W. (2019). Safe Testing. arXiv.org e-Print archive.

See Also
inProceedings Safe Statistics for Means and Proportions R.J. Turner (Rosanne)
presentation Reducing Waste with Meta-Analysis/Replications: Why We Must and Can Do Better than All-or-Nothing Statistics J.A. ter Schure (Judith)
software safestats: Safe Anytime-Valid Inference R.J. Turner (Rosanne), A. Ly (Alexander), M.F. Pérez (Muriel), J.A. ter Schure (Judith) and P.D. Grünwald (Peter)
software|data safestats: Safe Anytime-Valid Inference R.J. Turner (Rosanne), A. Ly (Alexander), M.F. Pérez (Muriel), J.A. ter Schure (Judith) and P.D. Grünwald (Peter)
software Safestats A. Ly (Alexander)
