WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads

Patterson, Murray; Marschall, Tobias; Pisanti, Nadia; van Iersel, Leo; Stougie, Leen; Klau, Gunnar; Schönhuth, Alexander

doi:10.1007/978-3-319-05269-4_19

M.D. Patterson (Murray), T. Marschall (Tobias), N. Pisanti (Nadia), L.J.J. van Iersel (Leo), L. Stougie (Leen), G.W. Klau (Gunnar) and A. Schönhuth (Alexander)

2014


Presented at the Annual International Conference on Computational Molecular Biology, Pittsburgh, PA, USA

The human genome is diploid, that is, each of its chromosomes comes in two copies. This requires phasing the single nucleotide polymorphisms (SNPs), that is, assigning them to the two copies, beyond just detecting them. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for downstream analyses in population genetics. Currently, statistical approaches, which avoid making use of direct read information, constitute the state of the art. Haplotype assembly, which addresses phasing directly from sequencing reads, suffers from the fact that sequencing reads of the current generation are too short to serve the purposes of genome-wide phasing. Future sequencing technologies, however, promise to generate reads of lengths and error rates that make it possible to bridge all SNP positions in the genome with a sufficient number of SNPs per read. Existing haplotype assembly approaches, however, benefit computationally from precisely the limited length of current-generation reads, because their runtime is usually exponential in the number of SNPs per sequencing read. This implies that such approaches will not be able to exploit the benefits of sufficiently long, future-generation reads. Here, we suggest WhatsHap, a novel dynamic programming approach to haplotype assembly. It is the first approach that yields provably optimal solutions to the weighted minimum error correction (wMEC) problem in runtime linear in the number of SNPs per sequencing read, making it suitable for future-generation reads. WhatsHap is a fixed-parameter tractable (FPT) approach with coverage as the parameter. We demonstrate that WhatsHap can handle datasets of coverage up to 20x, processing chromosomes on standard workstations in only 1-2 hours. Our simulation study shows that the quality of haplotypes assembled by WhatsHap significantly improves with increasing read length, both in terms of genome coverage and in terms of switch errors. The switch error rates we achieve in our simulations are superior to those obtained by state-of-the-art statistical phasers.
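
To make the wMEC objective named in the abstract concrete, the following minimal Python sketch scores every bipartition of a toy read set and returns the cheapest one. It is an exhaustive search written for illustration only and is not WhatsHap's dynamic program: it is exponential in the number of reads, whereas the abstract describes an algorithm that is linear in the number of SNPs and parameterized by coverage. The read encoding (a dict from SNP index to an allele and a weight), the function names, and the toy data are all assumptions of this sketch, as is the common formulation in which each part picks its consensus allele independently at every SNP.

```python
from itertools import product


def partition_cost(reads, assignment, n_snps):
    """Total weight of the cheapest corrections that make both parts conflict-free."""
    total = 0
    for snp in range(n_snps):
        for part in (0, 1):
            # Accumulated weight of allele 0 and of allele 1 among reads in this part.
            weight = [0, 0]
            for read, side in zip(reads, assignment):
                if side == part and snp in read:
                    allele, w = read[snp]
                    weight[allele] += w
            # Correct whichever allele is cheaper to flip at this SNP.
            total += min(weight)
    return total


def brute_force_wmec(reads, n_snps):
    """Try every bipartition of the reads (hypothetical helper, toy sizes only)."""
    best_cost, best_assignment = None, None
    # Pin read 0 to part 0 so mirror-image bipartitions are not scored twice.
    for rest in product((0, 1), repeat=len(reads) - 1):
        assignment = (0,) + rest
        cost = partition_cost(reads, assignment, n_snps)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


# Toy data (invented): each read maps SNP index -> (allele, weight).
# SNPs missing from a read are simply not covered by it.
reads = [
    {0: (0, 10), 1: (0, 10), 2: (1, 10)},
    {0: (1, 10), 1: (1, 10), 2: (0, 10)},
    {1: (0, 10), 2: (1, 10), 3: (1, 10)},
    {1: (1, 10), 2: (1, 3), 3: (0, 10)},  # low-weight disagreement at SNP 2
]
cost, assignment = brute_force_wmec(reads, n_snps=4)
print(cost, assignment)
```

In this toy instance the only disagreement inside the best bipartition is the low-weight call at SNP 2, so correcting it is cheaper than any regrouping of the reads. The weights stand in for per-base confidence, which is what distinguishes wMEC from the unweighted MEC problem.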

Additional Metadata
THEME	Life Sciences (theme 5)
Publisher	Springer
Persistent URL	doi.org/10.1007/978-3-319-05269-4_19
Series	Lecture Notes in Computer Science
Project	Bringing Phylogenetic Networks to Life
Conference	Annual International Conference on Computational Molecular Biology
Organisation	Evolutionary Intelligence
Citation	Patterson, M., Marschall, T., Pisanti, N., van Iersel, L., Stougie, L., Klau, G., & Schönhuth, A. (2014). WhatsHap: Haplotype Assembly for Future-Generation Sequencing Reads. In Research in Computational Molecular Biology 2014 (RECOMB 0) (pp. 237–249). Springer. doi:10.1007/978-3-319-05269-4_19


See Also
Software: WhatsHap. A. Schönhuth (Alexander), M.D. Patterson (Murray), T. Marschall (Tobias) and M. Martin (Marcel)
