An extension of the Velvet assembler to de novo metagenomic assembly


An important step in metagenomic analysis is the assembly of multiple genomes from the mixed sequence reads of multiple species in a microbial community. Most conventional pipelines use a single-genome assembler with carefully optimized parameters. A limitation of single-genome assemblers for de novo metagenome assembly is that sequences from highly abundant species are likely to be misidentified as repeats within a single genome, resulting in many small, fragmented scaffolds. We extended Velvet, a single-genome assembler for short reads, to metagenomic assembly of mixed short reads from multiple species.

MetaVelvet: An extension of the Velvet assembler to de novo metagenome assembly from short sequence reads

We modified and extended Velvet, a single-genome assembler based on de Bruijn graphs, for de novo metagenomic assembly. The approach rests on two ideas: first, decompose the de Bruijn graph constructed from mixed short reads into individual sub-graphs; second, build scaffolds from each decomposed sub-graph as if it were the genome of an isolated species. MetaVelvet has been shown to generate assemblies with longer N50 and higher quality than Velvet when applied to metagenomic sequences, and it is recognized as one of the frequently used metagenomic assemblers in the research community.
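The two ideas above can be illustrated with a toy sketch. This is not MetaVelvet's implementation: the function names `build_de_bruijn` and `connected_components` are hypothetical, and plain connected components stand in for the real decomposition, which also has to separate sub-graphs that share nodes. The sketch only shows what "one de Bruijn graph, several sub-graphs" means for reads from two unrelated sequences.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a toy de Bruijn graph: (k-1)-mer -> set of successor (k-1)-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            prefix, suffix = kmer[:-1], kmer[1:]
            graph[prefix].add(suffix)
            _ = graph[suffix]  # make sure nodes without successors exist too
    return graph


def connected_components(graph):
    """Split the graph into sub-graphs, ignoring edge direction.

    Assumption: connected components are used here only as a simple
    stand-in for sub-graph decomposition.
    """
    undirected = defaultdict(set)
    for node, successors in graph.items():
        _ = undirected[node]
        for succ in successors:
            undirected[node].add(succ)
            undirected[succ].add(node)

    seen = set()
    components = []
    for start in list(undirected):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in undirected[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Two made-up reads that share no (k-1)-mer, standing in for two species.
reads = ["ACGTAC", "TTGGCC"]
subgraphs = connected_components(build_de_bruijn(reads, k=3))
print(len(subgraphs))
```

In real metagenomic data the sub-graphs of different species are usually not cleanly separated like this, which is why shared nodes have to be identified and split before scaffolding.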

MetaVelvet-SL: An extension of the Velvet assembler to a de novo metagenomic assembler utilizing supervised learning

For the graph disconnection task, MetaVelvet identifies nodes shared between two sub-graphs (called chimeric nodes) and disconnects the two sub-graphs by splitting those shared nodes. To identify chimeric nodes, MetaVelvet uses a simple heuristic based on coverage difference and paired-end information. An important remaining problem in MetaVelvet is its low sensitivity and low accuracy in detecting chimeric nodes, which prevents it from generating longer contigs and scaffolds. We tackled this problem by using supervised machine learning. The idea of the new tool, called MetaVelvet-SL, is to train a model that classifies each candidate node as chimeric or not.
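As a rough illustration of a coverage-difference rule, the sketch below flags a node as a chimeric candidate when it has exactly two incoming and two outgoing neighbours and each pair shows a clear coverage gap. Everything here is an assumption made for illustration: the function name `is_chimeric_candidate`, the two-in/two-out condition, and the `min_ratio` threshold are hypothetical, and the paired-end evidence that MetaVelvet also uses is left out. It is not the rule implemented in MetaVelvet or the classifier trained in MetaVelvet-SL.

```python
def is_chimeric_candidate(in_coverages, out_coverages, min_ratio=2.0):
    """Toy coverage-difference test for a possibly shared (chimeric) node.

    in_coverages / out_coverages: coverage values of the neighbouring nodes
    on the incoming and outgoing side. min_ratio is a hypothetical threshold.
    """
    # Assumption: only nodes joining two paths on each side are candidates.
    if len(in_coverages) != 2 or len(out_coverages) != 2:
        return False

    def coverage_ratio(pair):
        low, high = sorted(pair)
        return float("inf") if low == 0 else high / low

    # Both sides must show two clearly different abundance levels.
    return (coverage_ratio(in_coverages) >= min_ratio
            and coverage_ratio(out_coverages) >= min_ratio)


# Made-up coverage values: one abundant and one rare path through the node.
print(is_chimeric_candidate([50, 10], [48, 12]))
# Made-up coverage values: both paths at similar depth.
print(is_chimeric_candidate([20, 18], [21, 19]))
```

A hand-set threshold like `min_ratio` is the kind of fixed rule that a supervised approach replaces with a decision learned from labelled examples of candidate nodes.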

Publication

Contact us

  • For simple questions and bug reports, please feel free to e-mail the following address:

  • For in-depth questions and discussion, please subscribe to and post on the Velvet mailing list:
    http://listserver.ebi.ac.uk/mailman/listinfo/velvet-users

  • For requests for future releases, we would like to continue developing MetaVelvet with input from users: