ENCODE Pseudogenes and Transcription
Deyou Zheng, Yale University
7-05-05, ENCODE-GT
Pseudogenes in ENCODE Regions
• 211 pseudogenes were identified using an updated computational pipeline (Zhang et al. 2003) and manual curation.
• The Yale pseudogenes were compared with pseudogenes from the VEGA group and the ENSEMBL group.
[Figure: Venn diagram of the Yale, VEGA and ENSEMBL pseudogene sets; counts as extracted from the slide: Yale 75, 113, VEGA 40, 23, ENSEMBL 2, 9]
Breakdown of Yale Pseudogenes
[Figure: No. of Pseudogenes plotted against No. of Genes for manually picked (ENm*) and randomly picked (ENr*) regions; r² = 0.31]
• More pseudogenes are found in the manually picked regions.
• The 211 pseudogenes can be separated into 104 processed, 19 duplicated and 88 others. "Others" are those that cannot be clearly binned as processed or duplicated, e.g., fragments.
• Numbers of genes and pseudogenes are weakly correlated in ENCODE regions.
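The breakdown and the correlation statistic above can be reproduced with a few lines of Python. This is a minimal sketch: the three category counts come from the slide, while the `r_squared` helper is an illustrative way to compute a squared Pearson correlation and is not necessarily how the r² = 0.31 on the slide was obtained. The per-region gene and pseudogene counts are not given in the slides, so none are included here.

```python
# Category counts as stated on the slide.
counts = {"processed": 104, "duplicated": 19, "others": 88}
total = sum(counts.values())

# Share of each category among all Yale pseudogenes.
shares = {kind: n / total for kind, n in counts.items()}
for kind, n in counts.items():
    print(f"{kind:>10}: {n:3d} ({100 * shares[kind]:.1f}%)")


def r_squared(xs, ys):
    """Squared Pearson correlation of two equal-length sequences.

    Illustrative helper (an assumption, not the authors' code).
    Both sequences must have non-zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)
```

Calling `r_squared(genes_per_region, pseudogenes_per_region)` on the per-region counts would give the statistic reported in the figure, if it was computed as a plain Pearson correlation.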
Intersection of Pseudogenes with Transcription Data
[Figure: tracks shown for region ENm004: Genes; Pseudogenes; Yale TARs using oligo-microarray; Affymetrix TARs using oligo-microarray; EST; GIS-PET; CAGE; transcription factor binding sites from ChIP-chip; sequence conservation in rat, mouse and chimp]
Example of a Pseudogene with Various Transcription Evidence
[Figure: Yale_Pgene_58]
Intersection of Pseudogenes with Transcription Data
• By random chance, 20-30 Yale pseudogenes would be expected to intersect with TARs.
• ~40% of ENCODE pseudogenes intersect with TARs. Why is the percentage so high?
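The slides do not say how the chance expectation was estimated. One common approach, sketched here as an assumption, is to place each pseudogene at a random position in its region many times and count how often it overlaps a TAR. The function names and the half-open coordinate convention are hypothetical choices for illustration.

```python
import random


def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) share a base."""
    return a_start < b_end and b_start < a_end


def count_hits(pseudogenes, tars):
    """Number of pseudogene intervals that overlap at least one TAR interval."""
    return sum(
        any(overlaps(ps, pe, ts, te) for ts, te in tars)
        for ps, pe in pseudogenes
    )


def expected_hits_by_chance(pseudogenes, tars, region_len, trials=200, seed=0):
    """Average hit count after randomly repositioning every pseudogene.

    Assumes all intervals lie in one region spanning [0, region_len)
    and that each pseudogene is no longer than the region.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        shuffled = []
        for ps, pe in pseudogenes:
            length = pe - ps
            start = rng.randrange(0, region_len - length + 1)
            shuffled.append((start, start + length))
        total += count_hits(shuffled, tars)
    return total / trials
```

Comparing `count_hits(pseudogenes, tars)` on the real coordinates with `expected_hits_by_chance(...)` gives the kind of observed-versus-random contrast the slide reports (~40% observed against 20-30 of 211 expected).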
Intersection of TARs with Pseudogenes
[Figure: No. of TARs and No. of TARs Overlapping a Pseudogene for four categories: Affy-Unique-TAR, Yale-Unique-TAR, Affy-not-Unique-TAR, Yale-not-Unique-TAR]
• Not-"unique" TAR: one with a 60 bp sequence (~3 probes) that maps to more than one genomic location (≥ 95% identity).
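The uniqueness rule above can be written as a small predicate. This is a sketch under stated assumptions: the alignment step that finds where a TAR's 60 bp window maps is not described in the slides, so the input here is simply a list of identity values (one per genomic location hit, including the TAR's own locus), and the function name is hypothetical.

```python
def is_non_unique(hit_identities, min_identity=0.95):
    """Return True if a 60 bp TAR window maps to more than one genomic location.

    hit_identities: fractional identity (0-1) of each genomic location the
    window aligns to, including its own locus. Producing these hits is
    outside this sketch.
    """
    qualifying = sum(1 for identity in hit_identities if identity >= min_identity)
    return qualifying > 1


# A window matching only its own locus is unique; a second location
# at >= 95% identity makes the TAR non-unique.
print(is_non_unique([1.0]))
print(is_non_unique([1.0, 0.97]))
```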
Summary
• 211 pseudogenes (253, Yale + VEGA) in ENCODE regions.
• Some pseudogenes (< 7%) might be transcribed based on GIS-PET, CAGE or EST data.
• About one half of pseudogenes overlap with TARs.
• Non-unique TARs intersect with pseudogenes 5 times more often than unique TARs, probably due to cross-hybridization.
• Comparison with previous analysis:
  • A more detailed survey found that 12-16% of chr22 pseudogenes intersected with TARs from tiling microarrays (Zheng et al., 2005).
  • Both a chr22 and a whole-genome analysis showed that ~5% of human pseudogenes are likely transcribed (Zheng et al., 2005; Harrison et al., 2005).
  • Cheng et al. (2005) also reported that pseudogene-overlapping TARs are usually not unique. We repeated their analysis using ENCODE pseudogenes and found the same.
• Refs:
  • Cheng et al., 2005. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 308(5725): 1149-54.
  • Harrison et al., 2005. Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability. Nucleic Acids Res. 33(8): 2374-83.
  • Zheng et al., 2005. Integrated pseudogene annotation for human chromosome 22: evidence for transcription. J Mol Biol. 349(1): 27-45.