Haplotype-aware diploid genome assembly is crucial in genomics, precision medicine, and many other disciplines. Long-read sequencing technologies have greatly improved genome assembly. However, current long-read assemblers are either reference-based, and therefore introduce biases, or fail to capture the haplotype diversity of diploid genomes. We present phasebook, a de novo approach for reconstructing the haplotypes of diploid genomes from long reads. phasebook outperforms other approaches in haplotype coverage by large margins, while achieving competitive performance in assembly errors and assembly contiguity.

doi.org/10.1186/s13059-021-02512-x
Genome Biology
Statistical Models for Structural Genetic Variants in the Genome of the Netherlands, Algorithms for PAngenome Computational Analysis, Pan-genome Graph Algorithms and Data Integration

Luo, V., Kang, X., & Schönhuth, A. (2021). phasebook: haplotype-aware de novo assembly of diploid genomes from long reads. Genome Biology, 22(1). doi:10.1186/s13059-021-02512-x