Accurate discovery of somatic variants is of central importance in cancer research. However, count statistics on discovered somatic insertions and deletions (indels) indicate that many discoveries are missed because uncertainties related to gap and alignment ambiguities, twilight zone indels, cancer heterogeneity, sample purity, sampling, and strand bias are difficult to quantify. We provide a unifying statistical model whose dependency structures enable accurate quantification of all inherent uncertainties in a short time. Consequently, the false discovery rate (FDR) in somatic indel discovery can now be controlled with utmost accuracy, increasing the number of true discoveries while safely suppressing the FDR.
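
To make the idea of FDR control concrete, the following is a minimal sketch of one common Bayesian approach: rank candidate calls by their posterior probability of being true somatic indels, then accept the largest prefix whose mean posterior error probability stays at or below the target rate. The function name `control_fdr`, the call identifiers, and the probabilities are hypothetical and chosen for illustration; this is not taken from the Varlociraptor code base and is not necessarily the exact procedure the authors use.

```python
from typing import List, Tuple


def control_fdr(calls: List[Tuple[str, float]], alpha: float) -> List[str]:
    """Return IDs of the largest set of calls whose expected FDR is <= alpha.

    Each call is (call_id, posterior probability that the call is a true
    somatic indel). Both the input layout and the name are illustrative
    assumptions, not an existing API.
    """
    # Rank calls from most to least confident.
    ranked = sorted(calls, key=lambda call: call[1], reverse=True)

    error_sum = 0.0  # running sum of posterior error probabilities
    n_accepted = 0
    for i, (_, prob) in enumerate(ranked, start=1):
        error_sum += 1.0 - prob
        # Expected FDR of the top-i calls is their mean error probability.
        if error_sum / i > alpha:
            # Errors only grow down the ranking, so the mean cannot recover.
            break
        n_accepted = i

    return [call_id for call_id, _ in ranked[:n_accepted]]


# Hypothetical example: four candidate indels with made-up posteriors.
candidates = [("indel_c", 0.60), ("indel_a", 0.99),
              ("indel_d", 0.20), ("indel_b", 0.95)]
accepted = control_fdr(candidates, alpha=0.05)
```

With these made-up values, the two most confident calls have a mean error probability of about 0.03, while adding the third raises it to about 0.15, so only the first two are accepted at a 5% target. The quality of such a threshold depends entirely on how well calibrated the posterior probabilities are, which is why accurate quantification of the underlying uncertainties matters.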

Genome Biology
Statistical Models for Structural Genetic Variants in the Genome of the Netherlands
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Köster, J., Dijkstra, L. J., Marschall, T., & Schönhuth, A. (2020). Varlociraptor: Enhancing sensitivity and controlling false discovery rate in somatic indel discovery. Genome Biology, 21(1). doi:10.1186/s13059-020-01993-6