
Evaluation of window cohabitation of DNA sequencing errors and lowest PHRED quality values
Francisco Prosdocimi1, Fabiano Cruz Peixoto2 and José Miguel Ortega3
1Laboratório de Biodiversidade e Evolução Molecular, Departamento de Biologia Geral,
ICB-UFMG, Belo Horizonte, MG, Brasil
2Laboratório de Computação Científica, UFMG, Belo Horizonte, MG, Brasil
3Laboratório de Biodados, Departamento de Bioquímica e Imunologia,
ICB-UFMG, Belo Horizonte, MG, Brasil
Corresponding author: J.M. Ortega
E-mail: [email protected]
Genet. Mol. Res. 3 (4): 483-492 (2004)
Received October 4, 2004
Accepted December 6, 2004
Published December 30, 2004

ABSTRACT. When analyzing sequencing reads, it is important to distinguish between putatively correct and miscalled bases. An open question is how well a PHRED quality value identifies miscalled bases and whether there is a quality cutoff that allows most errors to be mapped. Since a low quality value does not necessarily indicate a miscalled position, we investigated whether window-based analyses of quality values might better predict errors. There are many reasons to look for a perfect window in DNA sequences, such as when using the SAGE technique, seeding BLAST searches, and clustering sequences. Thus, we set out to find a quality cutoff value that would distinguish non-perfect windows from perfect ones. We produced 846 reads of pUC18 and compared them with the published pUC consensus by local alignment. We then generated a database containing all mismatches, insertions and gaps in order to map real perfect windows. We investigated how well perfect windows can be predicted by requiring all bases in the window to show quality values over a given cutoff. We conclude that, in window-based applications, a PHRED quality value cutoff of 7 masks most of the errors without masking real correct windows. We suggest that the putative wrong bases be indicated in lower case, increasing the information in the sequence databases without increasing the size of the files.
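
As an illustration of the window criterion and the lower-case convention described in the abstract, the following is a minimal Python sketch, not the authors' implementation. The function names, the example read and its quality values are hypothetical, and the comparison `>= cutoff` is an assumption: the abstract says only that quality values must be "over a given cutoff", so a strict `>` may be what was actually used.

```python
def predicted_perfect_windows(quals, window, cutoff=7):
    """Return start positions of windows in which every base meets the cutoff.

    Assumption: a base "passes" when its quality is >= cutoff.
    """
    return [
        start
        for start in range(len(quals) - window + 1)
        if min(quals[start:start + window]) >= cutoff
    ]


def mask_low_quality(seq, quals, cutoff=7):
    """Write bases below the cutoff in lower case, all others in upper case."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have the same length")
    return "".join(
        base.upper() if q >= cutoff else base.lower()
        for base, q in zip(seq, quals)
    )


# Hypothetical read with two low-quality positions (indices 2 and 7).
example_seq = "ACGTACGT"
example_quals = [30, 30, 5, 30, 30, 30, 30, 6]

window_starts = predicted_perfect_windows(example_quals, window=3)
masked = mask_low_quality(example_seq, example_quals)
```

With these example values, only the windows starting at positions 3 and 4 contain no base below the cutoff, and the masked read is `ACgTACGt`. The masked string has the same length as the original, which is the sense in which the lower-case convention adds information without enlarging the file.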

Key words: DNA sequence quality, PHRED, Quality window, SAGE, BLAST

 

Copyright © 2004 by FUNPEC