Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs

Limasset, Antoine; Flot, Jean-François; Peterlongo, Pierre

doi:10.1093/bioinformatics/btz102

Reference: Bioinformatics, 36, 5, pages 1374-1381
Publication: Published, 2020-05-01

Peer-reviewed article

Abstract:

Motivation: Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well on large data sets or consider reads as mere suites of k-mers, without taking into account their full-length read information.

Results: We propose a new method to correct short reads using de Bruijn graphs, and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered on the basis of k-mer abundance, then of unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference on which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.

Availability and Implementation: The implementation is open source, available at http://github.com/Malfoy/BCOOL under the Affero GPL license and as a Bioconda package.

Contact: Antoine Limasset (antoine.limasset@gmail.com), Jean-François Flot (jflot@ulb.ac.be) and Pierre Peterlongo (pierre.peterlongo@inria.fr).
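The pipeline outlined in the abstract (build a graph from the reads, drop low-abundance k-mers, then map each read back onto the cleaned graph) can be illustrated with a toy sketch. This is not Bcool's implementation. It makes several simplifying assumptions: the graph is an implicit set of k-mers rather than a compacted graph of unitigs, the unitig-abundance filter is omitted, reverse complements are ignored, and only substitution errors are handled. All function names, the example sequence and the thresholds are hypothetical.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def solid_kmers(counts, min_abundance):
    """Keep k-mers seen at least min_abundance times: the 'cleaned' graph nodes."""
    return {kmer for kmer, n in counts.items() if n >= min_abundance}

def pick_base(observed, candidates):
    """Follow the read if the graph allows it, else take a unique alternative."""
    if observed in candidates:
        return observed
    if len(candidates) == 1:
        return candidates[0]
    return None  # dead end or ambiguous branch

def correct_read(read, solid, k, max_corrections=2):
    """Map a read onto the k-mer graph from its first solid k-mer and
    rewrite the bases that disagree with the path followed."""
    starts = range(len(read) - k + 1)
    anchor = next((i for i in starts if read[i:i + k] in solid), None)
    if anchor is None:
        return read  # nothing to anchor on: leave the read unchanged
    corrected = list(read)
    corrections = 0

    # Extend to the right of the anchor, one base at a time.
    node = read[anchor:anchor + k]
    for pos in range(anchor + k, len(read)):
        successors = [b for b in "ACGT" if node[1:] + b in solid]
        base = pick_base(read[pos], successors)
        if base is None:
            return read
        corrections += base != read[pos]
        corrected[pos] = base
        node = node[1:] + base

    # Extend to the left of the anchor.
    node = read[anchor:anchor + k]
    for pos in range(anchor - 1, -1, -1):
        predecessors = [b for b in "ACGT" if b + node[:-1] in solid]
        base = pick_base(read[pos], predecessors)
        if base is None:
            return read
        corrections += base != read[pos]
        corrected[pos] = base
        node = base + node[:-1]

    return "".join(corrected) if corrections <= max_corrections else read

# Toy data: a made-up 20-base sequence sampled at 3x, plus one noisy read.
genome = "ACGTTGCATGCCAGTAGGCT"
k = 5
clean = [genome[i:i + 10] for i in range(len(genome) - 9)] * 3
noisy = "TTGCATGACA"  # genome[3:13] with one substitution (C -> A)
solid = solid_kmers(count_kmers(clean + [noisy], k), min_abundance=2)
print(correct_read(noisy, solid, k))
```

In this toy run the three k-mers that overlap the substituted base occur only once, so the abundance filter drops them. The read is then anchored on its first solid k-mer and follows the only remaining path in the graph, which restores `TTGCATGCCA`. Unlike a k-mer-spectrum corrector that fixes k-mers in isolation, the read is treated as a whole path and is left untouched if that path is ambiguous or needs too many changes.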
