
Effects of sample re-sequencing and trimming on the quality and size of assembled consensus sequences

F. Prosdocimi1, D.A.O. Lopes1, F.C. Peixoto2, M.M. Mourão3, L.G.G. Pacífico4, R.A. Ribeiro5 and J.M. Ortega1
1Laboratório de Biodados, Departamento de Bioquímica e Imunologia, ICB-UFMG, Belo Horizonte, MG, Brasil
2Laboratório de Computação Científica, UFMG, Belo Horizonte, MG, Brasil
3Laboratório de Genética-Bioquímica, Departamento de Bioquímica e Imunologia, UFMG, Belo Horizonte, MG, Brasil
4Laboratório de Imunologia de Doenças Infecciosas, Departamento de Bioquímica e Imunologia, UFMG, Belo Horizonte, MG, Brasil
5Laboratório de Biodiversidade e Evolução Molecular, Departamento de Biologia Geral, UFMG, Belo Horizonte, MG, Brasil
Corresponding author: J.M. Ortega
E-mail: [email protected]

Genet. Mol. Res. 6 (4): 756-765 (2007)
Received August 03, 2007
Accepted September 25, 2007
Published October 05, 2007

ABSTRACT. Nucleic acid sequences produced by automated DNA sequencers always contain some base-calling errors. To obtain a high-quality DNA sequence from a molecule of interest, researchers normally sequence the same sample many times. Because base-calling errors are considered rare events, re-sequencing the same molecule and assembling the resulting reads is frequently thought to be a good way to generate reliable sequences. A relevant question, however, is how many times the sample needs to be re-sequenced to minimize costs while still achieving a high-fidelity sequence. We examined how both the number of re-sequenced reads and the PHRED trimming parameters affect the accuracy and size of the final consensus sequences. Hundreds of single-pool reaction pUC18 reads were generated and assembled into consensus sequences with the CAP3 software. Using local alignment against the published pUC18 cloning vector sequence, the position and number of errors in each consensus were identified and stored in MySQL databases. Stringent PHRED trimming parameters proved efficient at reducing errors; however, this procedure also decreased consensus size. Moreover, re-sequencing had no clear effect on the removal of consensus errors, although it slightly increased consensus size.

Key words: Sequencing reads, Trimming, Assembling, Consensus, Codifying sequences, PHRED, CAP3
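
The trade-off reported in the abstract, where more stringent trimming gives fewer errors but shorter sequences, can be illustrated with a minimal Python sketch. It is not PHRED's own code and not the procedure used in this study. It assumes the standard relation between a Phred quality score Q and the error probability, p = 10^(-Q/10), and keeps the stretch of bases that maximizes the sum of (cutoff - p). The function names, the example quality values, and the cutoffs 0.05 and 0.01 are hypothetical and chosen for illustration only.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score into an estimated base-calling error probability."""
    return 10 ** (-q / 10)


def trim_read(qualities, cutoff=0.05):
    """Return (start, end) of the segment to keep, with end exclusive.

    Each base scores (cutoff - p_error). The contiguous segment with the
    largest total score is kept, found with a single maximum-subarray scan.
    Returns (0, 0) when no base is better than the cutoff.
    """
    best_sum, best_start, best_end = 0.0, 0, 0
    run_sum, run_start = 0.0, 0
    for i, q in enumerate(qualities):
        run_sum += cutoff - phred_to_error_prob(q)
        if run_sum <= 0:
            # The current run no longer pays off; restart after this base.
            run_sum, run_start = 0.0, i + 1
        elif run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_start, best_end


# Hypothetical per-base quality scores for one short read.
example_quals = [12, 18, 30, 30, 18, 12]
lenient = trim_read(example_quals, cutoff=0.05)
strict = trim_read(example_quals, cutoff=0.01)
print("lenient keeps bases", lenient, "strict keeps bases", strict)
```

With these made-up values, the stricter cutoff discards the flanking bases of intermediate quality, so the retained segment is shorter but has a lower expected error rate.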

Copyright © 2007 by FUNPEC-RP