SAVAGE is a computational tool for reconstructing the individual haplotypes of intra-host virus strains (a viral quasispecies) without requiring a high-quality reference genome. To construct overlap graphs from patient sample data, SAVAGE uses either FM-index-based data structures or an ad hoc consensus reference sequence. In such an overlap graph, nodes represent reads and/or contigs, and an edge indicates that two reads or contigs, according to sound statistical considerations, represent the same haplotypic sequence. Following an iterative scheme, a new overlap assembly algorithm, based on enumerating statistically well-calibrated groups of reads/contigs, then efficiently reconstructs the individual haplotypes from this overlap graph.
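To make the overlap graph idea concrete, here is a minimal toy sketch in Python. It is not SAVAGE's implementation: the function names `suffix_prefix_overlap` and `build_overlap_graph`, the `min_len` parameter, and the example reads are all hypothetical. It also assumes exact suffix-prefix matching as a stand-in for SAVAGE's statistical edge criteria, and uses brute-force pairwise comparison where SAVAGE relies on FM-index-based structures or a consensus reference.

```python
from itertools import permutations


def suffix_prefix_overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    Exact matching is an assumption made for this toy example only.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Build a directed overlap graph as a dict of dicts.

    Nodes are read names; an edge u -> v with weight k means the last
    k bases of read u match the first k bases of read v.
    """
    graph = {name: {} for name in reads}
    # Brute-force comparison of all ordered pairs (quadratic in the number of reads).
    for u, v in permutations(reads, 2):
        k = suffix_prefix_overlap(reads[u], reads[v], min_len)
        if k:
            graph[u][v] = k
    return graph


# Hypothetical toy reads, not real sequencing data.
reads = {
    "r1": "ACGTAC",
    "r2": "GTACGG",
    "r3": "ACGGTT",
}
graph = build_overlap_graph(reads)
print(graph)
```

In this toy input, `r1` overlaps `r2` and `r2` overlaps `r3`, so the three reads form a chain that could be merged into one contig. A real tool would additionally have to tolerate sequencing errors and decide, on statistical grounds, whether overlapping reads belong to the same strain.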
Note: If you are using SAVAGE, please cite our paper: De novo viral quasispecies assembly using overlap graphs, J. Baaijens, A. Zine El Aabidine, E. Rivals, and A. Schoenhuth, Genome Res. 2017. 27: 835-848, doi:10.1101/gr.215038.116.
Baaijens, J.A., & Schönhuth, A. (2016). SAVAGE.