This chapter focuses on techniques for next-generation resequencing studies, which make it possible to study the differences between a donor genome (the genome under investigation) and a reference genome. It reviews why the discovery and genotyping of mid-sized deletions has been difficult and explains the techniques that have made it possible. The chapter gives a formal definition of "twilight zone" deletions, revisits the different approaches suitable for deletion discovery in resequencing studies, and outlines their pitfalls when it comes to discovering these mid-sized deletions. It presents a novel maximum likelihood approach for genotyping deletions, which achieves highly favorable performance rates on twilight zone indels. The chapter evaluates a comprehensive selection of state-of-the-art tools on next-generation sequencing (NGS) reads from a genome containing real variants, where the reads are simulated by means of the Assemblathon read simulator and current NGS technology. In discovery, CLEVER significantly outperforms MoDIL in terms of both recall and precision.

Additional Metadata
Keywords: Assemblathon read simulator, CLEVER, Genotyping, MoDIL, Next-generation resequencing, Reference genome, Twilight zone deletions
Persistent URL: dx.doi.org/10.1002/9781119272182.ch7
Citation
Marschall, T., & Schönhuth, A. (2016). Discovering and Genotyping Twilight Zone Deletions. In Computational Methods for Next Generation Sequencing Data Analysis (pp. 149–173). doi:10.1002/9781119272182.ch7