Gene assembly is an important step in the functional analysis of shotgun metagenomic data. Nonetheless, strain-aware assembly remains challenging, as current assembly tools often fail to distinguish among strain variants or require closely related reference genomes of the studied species. We have developed Snowball, a novel strain-aware and reference-free gene assembler for shotgun metagenomic data. It uses profile hidden Markov models (HMMs) of gene domains of interest to guide the assembly. Snowball assembles individual gene domains based on read overlaps while simultaneously correcting errors using read quality scores, which results in very low per-base error rates. The software runs in parallel on a user-defined number of processor cores, works on a standard laptop, and is freely available for installation under Linux or OS X at: https://github.com/algbioi/snowball/wiki
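To make the idea of combining read overlaps with quality-based error correction concrete, the following is a minimal Python sketch. It is not Snowball's implementation: the function name `merge_overlap`, the known `offset` between the two reads, the quality cap of 60, and the rule of keeping the higher-quality base on a mismatch are all assumptions chosen for illustration.

```python
def merge_overlap(read_a, qual_a, read_b, qual_b, offset, max_qual=60):
    """Merge two overlapping reads into one sequence with consensus qualities.

    Illustrative only (hypothetical helper, not part of Snowball).
    read_b is assumed to start at position `offset` of read_a.
    qual_a and qual_b are lists of Phred quality scores, one per base.
    """
    if len(read_a) != len(qual_a) or len(read_b) != len(qual_b):
        raise ValueError("each read needs one quality score per base")
    if not 0 <= offset < len(read_a):
        raise ValueError("reads must overlap: 0 <= offset < len(read_a)")

    merged_seq = []
    merged_qual = []
    total_len = max(len(read_a), offset + len(read_b))

    for i in range(total_len):
        j = i - offset  # corresponding position in read_b
        in_a = i < len(read_a)
        in_b = 0 <= j < len(read_b)

        if in_a and in_b:
            base_a, q_a = read_a[i], qual_a[i]
            base_b, q_b = read_b[j], qual_b[j]
            if base_a == base_b:
                # Agreement: both reads support the base, so confidence grows.
                merged_seq.append(base_a)
                merged_qual.append(min(q_a + q_b, max_qual))
            elif q_a >= q_b:
                # Mismatch: keep the more reliable base, lower its confidence.
                merged_seq.append(base_a)
                merged_qual.append(q_a - q_b)
            else:
                merged_seq.append(base_b)
                merged_qual.append(q_b - q_a)
        elif in_a:
            # Position covered by read_a only.
            merged_seq.append(read_a[i])
            merged_qual.append(qual_a[i])
        else:
            # Position covered by read_b only.
            merged_seq.append(read_b[j])
            merged_qual.append(qual_b[j])

    return "".join(merged_seq), merged_qual
```

For instance, merging `ACGTAC` (qualities 30, 30, 30, 30, 10, 30) with `TGCGG` (qualities 30, 35, 30, 30, 30) at offset 3 resolves the low-quality `A` in the first read in favour of the higher-quality `G` in the second, giving `ACGTGCGG`.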
Additional Metadata
THEME Life Sciences (theme 5)
Publisher Cornell University Library
Persistent URL dx.doi.org/10.1093/bioinformatics/btw426
Series arXiv.org e-Print archive
Citation
Gregor, I, Schönhuth, A, & McHardy, A.C. (2015). Snowball: Strain aware gene assembly of Metagenomes. arXiv.org e-Print archive. Cornell University Library. doi:10.1093/bioinformatics/btw426