EST and SAGE Analysis
A) Expressed Sequence Tag (EST) Analysis

There are a large number of genes in our genome, but only a fraction of them are expressed in any given cell as mRNAs that encode different proteins. These mRNAs are collectively called the transcriptome. mRNA can be reverse transcribed into cDNA, which serves as evidence of the mRNA transcripts present in a sample. Therefore, mRNA and cDNA are crucial for gene expression profiling and transcriptome studies.

Expressed sequence tags (ESTs) are short, unverified nucleotide fragments, typically 200 to 800 bases long. They are generated by single-pass sequencing of the 5' or 3' end of randomly selected cDNA clones, taken from cDNA libraries constructed from the mRNA of a specific cell line. EST datasets have been called the “poor man's genome” because EST data are widely used as a substitute for genome sequencing.

EST generation involves several steps. First, mRNAs isolated from a specific cell line are reverse transcribed into double-stranded cDNA using the enzyme reverse transcriptase. The cDNAs are then ligated into a plasmid vector and cloned to obtain multiple copies of each cDNA for library construction. After that, ESTs are generated by sequencing randomly chosen cDNA clones in a single pass from the 5' or 3' end, without reading through the full insert. Redundancy among the ESTs can be reduced by normalization. EST data can be retrieved from several online resources, such as NCBI's UniGene, TIGR, the Cancer Genome Anatomy Project, NCBI's ESTree and dbEST.

EST sequence analysis involves five steps, as described below.

Step 1: EST Preprocessing
• Reduce overall noise in EST data and improve the efficiency of downstream analysis.
• Identify and remove contaminants from the vector sequence

...... middle of paper ......

Next-generation sequencing is certainly a more efficient approach to gene sequencing.
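Returning to Step 1 of EST analysis, the preprocessing goals listed above (reducing noise and removing vector contamination) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the vector fragment `VECTOR`, the length cutoff `MIN_LENGTH`, the 10% limit on ambiguous bases and the function names are hypothetical and do not come from the essay or from any particular tool. Real pipelines usually align reads against a database of known vector sequences rather than rely on exact end matches.

```python
# Hypothetical vector fragment and length cutoff, chosen only for illustration.
VECTOR = "GGATCCGAATTC"
MIN_LENGTH = 20


def trim_vector(est, vector=VECTOR):
    """Remove an exact copy of the vector fragment from either end of an EST."""
    est = est.upper()
    if vector and est.startswith(vector):
        est = est[len(vector):]
    if vector and est.endswith(vector):
        est = est[:-len(vector)]
    return est


def preprocess(ests, min_length=MIN_LENGTH):
    """Trim vector sequence, then drop reads that are too short or too ambiguous."""
    cleaned = {}
    for name, seq in ests.items():
        seq = trim_vector(seq)
        too_short = len(seq) < min_length
        # Assumed noise rule: reject reads with more than 10% ambiguous bases (N).
        too_ambiguous = seq.count("N") > 0.1 * len(seq)
        if not too_short and not too_ambiguous:
            cleaned[name] = seq
    return cleaned
```

For example, calling `preprocess` on a dictionary of reads keyed by name returns only the reads that survive trimming and filtering, so the later analysis steps work on cleaner input.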
EST and SAGE are similar in that neither requires prior knowledge of the sequences to be analyzed, since both are sequencing-based approaches to gene expression profiling (Patino et al. cited in Yamamoto et al., 2001).

Works Cited

Nagaraj, S. H., Gasser, R. B., & Ranganathan, S. (2006). A hitchhiker's guide to expressed sequence tag (EST) analysis. Briefings in Bioinformatics, 8(1), 6-21.

Patino, W. D., Mian, O. Y., & Hwang, P. M. (2002). Serial analysis of gene expression: technical considerations and applications to cardiovascular biology. Circulation Research, 91, 565-569.

Yamamoto, M., Wakatsuki, T., Hada, A., & Ryo, A. (2001). Use of serial analysis of gene expression (SAGE) technology. Journal of Immunological Methods, 250, 45-66.