DISCUSSION
The objective of our analysis was to identify conservation between human and bovine orthologous noncoding sequences. Our human/bovine comparison is based on phylogenetic footprinting, the principle that important regulatory modules are retained via selective pressure during evolution, and that comparison of at least two divergent genomes can reveal conserved sequences that are most likely to be biologically relevant (Weitzman, 2003).
Most of the 3’-UTRs possessed CNSs, while only 36 of 95 introns shared CNSs. There were many more CNSs in 3’-UTRs than in introns, which is in accordance with a previous study that identified highly conserved regions in a set of genes from mammals, birds, amphibians, and bony fishes (Duret and Bucher, 1997). That study found that conserved sequences were three times more frequent in 3’-noncoding regions than in introns.
The percentage identity that we observed was higher in 3’-UTRs (average identity = 77.2%) than in introns (average identity = 73.2%). This result agrees with the expectation that untranslated regions are more conserved than introns because of their crucial role in post-transcriptional and post-translational processes. Similar results were obtained by Jareborg et al. (1999), who compared 77 orthologous mouse and human gene pairs and found almost 10 times more regions of 90% identity in 3’- and 5’-UTRs than in introns. Only 1% of the introns showed this high identity. In our study, the CNSs with the highest identity were those found in the FBN1 3’-UTR, which showed 91.7% identity over 303 bp, and in the PPP1R8 3’-UTR, which showed 81.2% identity over 1116 bp.
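To make the identity figures above concrete, the following is a minimal sketch of how percentage identity over an aligned region could be computed. It is illustrative only and is not the procedure used in this study: the function name `percent_identity`, the choice to count gapped columns as mismatches, and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumptions (hypothetical, not taken from the study):
    - both rows have the same length and use '-' for gaps;
    - columns where both rows are gaps are skipped;
    - a column with a gap in only one row counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1  # x cannot be a gap here, since y would be one too

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy rows, not real FBN1 or PPP1R8 sequence.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```

Other conventions exist, for example excluding all gapped columns from the denominator, and they yield different percentages for the same alignment, so the convention should be stated alongside any reported identity.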
One of the six 5’-UTRs possessed a conserved sequence, which showed lower identity than the 3’-UTR and intron CNSs. The low identity of the 5’-UTR conserved sequence is a surprising result because we expected a strong selective pressure on 5’-UTRs, as these sequences contain elements involved in the regulation of transcription (Duret and Bucher, 1997). However, since only one 5’-UTR sequence was analyzed, we cannot generalize about 5’-UTR conservation across the genome.
With presumably fewer functional constraints, noncoding sequences can accumulate neutral mutations and can evolve more rapidly than the coding portions of genomes. When a noncoding sequence encodes some regulatory function, selective pressure may drive sequence conservation within the region (Jareborg et al., 1999; Nobrega and Pennacchio, 2004). It is possible that some of the 73 noncoding sequences examined were conserved because of the short divergence time, given the intermediate evolutionary distance (70-100 Myrs) between humans and bovines. In this case, the lack of divergence would interfere with the detection of functional noncoding elements. Thus, it was necessary to compare more than two evolutionarily related species. We further examined the human/bovine sequence alignments, comparing them to orthologous mouse sequences to check the CNS regions. Through this multi-species analysis, we were able to identify sequences that shared similarities because of functional constraints.
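The three-species reasoning above can be sketched as a simple filter: a region is kept only if the human sequence is similar to both the bovine and the mouse sequence. The code below is a hedged illustration, not the pipeline used in this study. It assumes the three sequences are rows of one multiple alignment of equal length, and the function names, the 100-column window, and the 75% threshold are hypothetical defaults chosen for the example.

```python
def window_identity(row_a: str, row_b: str) -> float:
    """Percent of columns in which both rows carry the same base (gaps never match)."""
    matches = sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")
    return 100.0 * matches / len(row_a)


def actively_conserved_windows(human, bovine, mouse, window=100, min_identity=75.0):
    """Start columns of windows conserved in human/bovine AND human/mouse.

    All three arguments are rows of one multiple alignment (same length,
    '-' for gaps). The window size and threshold are illustrative assumptions.
    """
    if not (len(human) == len(bovine) == len(mouse)):
        raise ValueError("alignment rows must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")

    starts = []
    for start in range(len(human) - window + 1):
        end = start + window
        h = human[start:end]
        hb = window_identity(h, bovine[start:end])
        hm = window_identity(h, mouse[start:end])
        # Keep the window only if similarity persists in the more distant species,
        # which argues against conservation due to short divergence time alone.
        if hb >= min_identity and hm >= min_identity:
            starts.append(start)
    return starts


if __name__ == "__main__":
    # Toy alignment rows, not real orthologous sequence.
    human = "ACGTACGTAC"
    bovine = "ACGTACGTTT"
    mouse = "ACGTTTTTTT"
    print(actively_conserved_windows(human, bovine, mouse, window=4, min_identity=75.0))
```

In the toy example, only the first two windows pass, because the later windows are similar between human and bovine but not between human and mouse. This mirrors the case discussed above, in which two-species similarity alone cannot separate functional constraint from insufficient divergence.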
Based on human/bovine/mouse alignments, active conservation was detected in six 3’-UTRs and seven introns, totaling 13 noncoding sequences from 11 genes. Assuming that noncoding sequences that have remained similar during evolution might play roles in gene regulation, the high levels of conservation identified in the assessed sequences may indicate functional elements. Significant similarity appears to exist between human and bovine noncoding gene sequences, and comparative analysis of noncoding sequences between these two genomes provides putative regulatory sequences that can be tested experimentally to confirm whether they participate in gene regulation. These results will help elucidate how these genes are regulated in the human and bovine genomes.
ACKNOWLEDGMENTS
We are grateful to CNPq - Brazil, fellowship number 130541/2002-8, which has supported M.N. Miziara, and to FAPESP, Grant 97/13403-1 to M.E.J. Amaral. We also thank Dr. J.E. Womack and Elaine Owens for many suggestions and for critical reading of the manuscript.
REFERENCES
Batzoglou, S., Pachter, L., Mesirov, J.P., Berger, B. and Lander, E.S. (2000). Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res. 10: 950-958.
Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J. and Wheeler, D.L. (2003). GenBank. Nucleic Acids Res. 31: 23-27.
Blanchette, M., Schwikowski, B. and Tompa, M. (2002). Algorithms for phylogenetic footprinting. J. Comput. Biol. 9: 211-223.
Bray, N., Dubchak, I. and Pachter, L. (2003). AVID: A global alignment program. Genome Res. 13: 97-102.
Chapman, M.A., Charchar, F.J., Kinston, S., Bird, C.P., Grafham, D., Rogers, J., Grützner, F., Graves, J.A.M., Green, A.R. and Göttgens, B. (2003). Comparative and functional analyses of LYL1 loci establish marsupial sequences as a model for phylogenetic footprinting. Genomics 81: 249-259.
Cooper, G.M., Brudno, M., NISC Comparative Program, Green, E.D., Batzoglou, S. and Sidow, A. (2003). Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes. Genome Res. 13: 813-820.
Dubchak, I., Brudno, M., Loots, G.G., Pachter, L., Mayor, C., Rubin, E.M. and Frazer, K.A. (2000). Active conservation of noncoding sequences revealed by three-way species comparisons. Genome Res. 10: 1304-1306.
Duret, L. and Bucher, P. (1997). Searching for regulatory elements in human noncoding sequences. Curr. Opin. Struct. Biol. 7: 399-406.
Frazer, K.A., Sheehan, J.B., Stokowski, R.P., Chen, X., Hosseini, R., Cheng, J.F., Fodor, S.P., Cox, D.R. and Patil, N. (2001). Evolutionarily conserved sequences on human chromosome 21. Genome Res. 11: 1651-1659.
Frazer, K.A., Elnitski, L., Church, D.M., Dubchak, I. and Hardison, R.C. (2003). Cross-species sequence comparisons: a review of methods and available resources. Genome Res. 13: 1-12.
Hering, T.M., Kollar, J., Huynh, T.D. and Sandell, L.J. (1995). Bovine chondrocyte link protein cDNA sequence: interspecies conservation of primary structure and mRNA untranslated regions. Comp. Biochem. Physiol. B Biochem. Mol. Biol. 112: 197-203.
Higgins, D., Thompson, J., Gibson, T., Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673-4680.
Hu, Z., Frith, M., Niu, T. and Weng, Z. (2003). SeqVISTA: a graphical tool for sequence feature visualization and comparison. BMC Bioinformatics 4: 1-8.
Huang, M.T.F. and Gorman, C.M. (1990). Intervening sequences increase efficiency of RNA 3' processing and accumulation of cytoplasmic RNA. Nucleic Acids Res. 18: 937-947.
Jareborg, N., Birney, E. and Durbin, R. (1999). Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs. Genome Res. 9: 815-824.
Koonin, E.V., Mushegian, A.R. and Bork, P. (1996). Non-orthologous gene displacement. Trends Genet. 12: 334-336.
Larizza, A., Makalowski, W., Pesole, G. and Saccone, C. (2002). Evolutionary dynamics of mammalian mRNA untranslated regions by comparative analysis of orthologous human, artiodactyls and rodent gene pairs. Comput. Chem. 26: 479-490.
Lenhard, B., Sandelin, A., Mendoza, L., Engström, P., Jareborg, N. and Wasserman, W.W. (2003). Identification of conserved regulatory elements by comparative genome analysis. J. Biol. 2: 13-13.11.
Liu, S.-Y. and Redmond, M. (1998). Role of the 3’-untranslated region of RPE65 mRNA in the translational regulation of the RPE65 gene: identification of a specific translation inhibitory element. Arch. Biochem. Biophys. 357: 37-44.
Loots, G.G., Locksley, R.M., Blankespoor, C.M., Wang, Z.E., Miller, W., Rubin, E.M. and Frazer, K.A. (2000). Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science 288: 136-140.
Mayor, C., Brudno, M., Schwartz, J.R., Poliakov, A., Rubin, E.M., Frazer, K.A., Pachter, L.S. and Dubchak, I. (2000). VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 16: 1046-1047.
Mazumder, B., Seshadri, V. and Fox, P.L. (2003). Translational control by the 3’-UTR: the ends specify the means. Trends Biochem. Sci. 28: 91-98.
Nicholas, K.B., Nicholas Jr., H.B. and Deerfield II, D.W. (1997). GeneDoc: analysis and visualization of genetic variation. EMBNEW. News 4: 14.
Nobrega, M.A. and Pennacchio, L.A. (2004). Comparative genomic analysis as a tool for biological discovery. J. Physiol. 554: 31-39.
Nobrega, M.A., Ovcharenko, I., Afzal, V. and Rubin, E.M. (2003). Scanning human gene deserts for long-range enhancers. Science 302: 413.
O’Brien, S.J., Seuánez, H.N. and Womack, J.E. (1988). Mammalian genome organization: An evolutionary overview. Annu. Rev. Genet. 22: 323-351.
Peirce, J.L. (2004). Following phylogenetic footprints: Researchers apply computational power to their hunt for noncoding regulatory sequences. Scientist 18: 34-37.
Pruitt, K.D. and Maglott, D.R. (2001). RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res. 29: 137-140.
Solinas-Toldo, S., Lengauer, C. and Fries, R. (1995). Comparative genome map of human and cattle. Genomics 27: 489-496.
Tagle, D.A., Koop, B.F., Goodman, M., Slightom, J.L., Hess, D.L. and Jones, R.T. (1988). Embryonic epsilon and gamma globin genes of a prosimian primate (Galago crassicaudatus). Nucleotide and amino acid sequences, developmental regulation and phylogenetic footprints. J. Mol. Biol. 203: 439-455.
Thomas, J.W. and Touchman, J.W. (2002). Vertebrate genome sequencing: building a backbone for comparative genomics. Trends Genet. 18: 104-108.
Thomas, J.W., Touchman, J.W., Blakesley, R.W. et al. (2003). Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424: 788-793.
Wedemeyer, N., Schmitt-John, T., Evers, D., Thiel, C., Eberhard, D. and Jockusch, H.C. (2000). Conservation of the 3’-untranslated region of the Rab1a gene in amniote vertebrates: exceptional structure in marsupials and possible role for posttranscriptional regulation. FEBS Lett. 477: 49-54.
Weitzman, J.B. (2003). Tracking evolution’s footprints in the genome. J. Biol. 2: 9-9.4.
Williams, S.H., Mouchel, N. and Harris, A. (2003). A comparative genomic analysis of the cow, pig, and human CFTR genes identifies potential intronic regulatory elements. Genomics 81: 628-639.
Womack, J.E. (1987). Genetic engineering in agriculture: Animal genetics and development. Trends Genet. 3: 65-68.