
Informatics approaches to the genomic-level assessment of non-sequenced species

Informatics approaches to the genomic-level assessment of non-sequenced species. Andrew Cossins Lab for Environmental Gene Regulation University of Liverpool <http://legr.liv.ac.uk>. Dominance of genetic models. Genomic approaches to integrative, comparative


Presentation Transcript


  1. Informatics approaches to the genomic-level assessment of non-sequenced species Andrew Cossins Lab for Environmental Gene Regulation University of Liverpool <http://legr.liv.ac.uk>

  2. Dominance of genetic models
     Genomic approaches to integrative, comparative and evolutionary physiology…?
     • Large research communities
     • Extensive genomic resources
     • Genome sequence
     • EST clones
     • Databases
     • Mutants
     • Deletions and K/Os
     Is there any hope for ‘non-model’ species? ‘Bespoke’ or customised approaches

  3. Common carp (Cyprinus carpio): environmental stress responses
     [Figure: cooling regime after >3 months pre-conditioning. Temperature (°C; levels marked at 30, 23, 17 and 10) against days after onset of cooling (0, 1, 2, 3, 4, 5, 8, 12, 22).]
     A. Y. Gracey et al., PNAS, 2004, 101(48):16970-5

  4. PCA comparison of tissue responses to cold
     [Figure: 3D plot on Components 1, 2 and 3, with points for intestine, muscle, liver, gill, kidney, brain and heart.]
     87% of variation explained by just 3 PCA components
     A. Y. Gracey et al., PNAS, 2004, 101(48):16970-5

  5. Common Response Genes (184/260 BLAST identified)
     [Figure: expression heat map across intestine, kidney, muscle, brain, heart, liver and gill; colour scale from 2x fold repressed through 1:1 to 2x fold induced. Greek-letter subunit labels did not survive extraction.]
     a. Transcription & translation (40): RNA splicing / protein initiation / chromosomal structure
        • snRNP core protein D3
        • Nucleolar RNA helicase II
        • hnRNP G
        • HMG proteins 1, 4, T & NHPX
        • RNA-binding protein
        • Translation initiation factors 1A, 2A & 2B
        • Protein translation factor SUI1
     b. Transport (37): ATP provision, ion gradients
        • ADP,ATP translocases 2 & 3
        • Phosphate carrier protein
        • ATP synthase chains
        • Electron transport flavoproteins
        • Na/K ATPase 1 subunit
     c. Protein catabolism (35): protein turnover
        • Proteasome subunits 1, 2, 3, 5, 6 & 7
        • Proteasome subunits 1, 3, 5, 7
        • Ornithine decarboxylase antizyme
        • Ubiquitin conjugating enzyme E2-17
        • F-box only protein 2
     d. Cell stress & chaperones (21): stress response
        • Heat shock protein 10
        • T-complex protein 1 subunits
        • Glutathione S-transferase 3
        • Superoxide-dismutase
     e. Metabolism (18): metabolism
        • Isocitrate dehydrogenase
        • Pyruvate dehydrogenase
        • Malate dehydrogenase
        • Transaldolase
        • Stearoyl-CoA desaturase (cds2)
     f. Signaling (13): cell regulation
        • GTP-binding nuclear protein RAN
        • RAN specific GTPase-binding protein 2
        • MAPKK1
     g. Structural (12): microtubule stability
        • Tubulin subunits
        • Cofilin
        • Profilin
     h. Repressed (8)
        • RNA binding protein 5
        • Transcription factor AP-1

  6. Top genes by fold change…
     • ∆9-Acyl-CoA desaturase
     • 92 kDa type IV collagenase precursor (92 kDa gelatinase)
     • ATP synthase delta chain, mitochondrial precursor
     • ADP,ATP carrier protein (ADP/ATP translocase 1)
     • ATP-binding cassette, sub-family F, member 2 (Iron inhibited ABC)
     • Apolipoproteins A-I, A-IV, B-100 Precursor (APO-AI, Apo-AIV, Apo B-100)
     • 28kDa-1e Apolipoprotein
     • RNA-binding protein (Glycine-rich)
     • NADP-dependent malic enzyme (NADP-ME)
     • Mitochondrial uncoupling protein 3 (UCP 3)
     • Calmodulin
     • Cofilin, muscle isoform 2
     • Granulins 1 and 3
     • Tubulins a1, a8, b1, b2, b4 chains
     • High mobility group proteins 1, 2, 4
     • Hypothetical 31.8 kDa protein in chromosome II
     A. Y. Gracey et al., PNAS, 2004, 101(48):16970-5

  7. Up-regulated “transport” genes in Cluster 5 (intestinal/liver)
     Solute transport
     • Sodium/dicarboxylic acid cotransporter
     • Na/glucose cotransporter
     • Na/K-transporting ATPase alpha-1 chain precursor
     • Plasmalipin
     • Galectin-9
     • Aquaporin-9
     • UDP N-acetylamine transporter
     • Ferritin, heavy subunit
     • Na- and Cl-dependent neurotransmitter transporter
     • Phosphate carrier protein, mitochondrial precursor (PTP)
     Electron transport
     • ATP synthase A chain
     • Vacuolar ATP synthase 16 kDa proteolipid subunit
     • V-ATPase, E and S1 subunits
     • Cytochrome c oxidase polypeptide sVB and VIC-2
     • Mitochondrial phosphate carrier
     • UCP3
     • Dihydrolipoamide dehydrogenase
     • Protein 1-4 (ATP binding protein)
     Lipid transport
     • Microsomal triglyceride transfer protein, large subunit precursor
     • Apolipoprotein B-100 precursor (Apo B-100)
     • Apolipoprotein E precursor (Apo-E1)
     Protein transport
     • GTP-binding nuclear protein RAN
     • ARF-related protein (ARP)

  8. EST-Ferret (Weizhong Li)
     [Figure: pipeline diagram with components labelled Coding and renaming, Reports and Databases, Cleanup, Assembly, Annotation.]

  9. carpBASE annotation summary
     a. 6033 sub-groups
        • 48.9%, 2952 sub-groups having gene names (BLASTx, e-15 cut-off)
        • 51.1%, 3081 sub-groups unclassifiable
     b. Sub-groups having gene names
        • 5.4%, 323 homolog of
        • 15.4%, 932 similar to
        • 28.1%, 1679 weakly similar to
     c. GO and EC matches
        • 25.3%, 1838 with GO match
        • 9.9%, 287 with GO & EC matches
        • 13.7%, 827 no GO & EC match

  10. GO Profile in carpBASE
     GO-matrix for carp gene expression groups
     Further: A. Y. Gracey et al., PNAS, 2004, 101(48):16970-5

  11. Enzymes from carpBASE in Fatty Acid Metabolism in KEGG

  12. carpBASE annotation summary
     a. 6033 sub-groups
        • 48.9%, 2952 sub-groups having gene names (BLAST)
        • 51.1%, 3081 sub-groups unclassifiable
     b. Sub-groups having gene names
        • 5.4%, 323 homolog of
        • 15.4%, 932 similar to
        • 28.1%, 1679 weakly similar to
     c. GO and EC matches
        • 25.3%, 1838 with GO match
        • 9.9%, 287 with GO & EC matches
        • 13.7%, 827 no GO & EC match
     d. Unclassifiable sub-groups
        • 12.8%, 775 with UTR
        • 3.8%, 232 with 2ndary structure domains
        • 5.3%, 321 with repeat elements
        • 29.1%, 1753 no annotation

  13. Using transcript expression profiles to identify probes
     • Co-expression relationships: Pearson's correlation coefficient between Probe A (identified) and Probe B (unidentified)
     • Matrix of coefficients (13440 x 13440): 180 million calculations, 386 data points
     • Identify informative co-expression relationships
     • Selection of cut-off criterion using ROC procedures (0.87): 3287 clones, 2050 unknown
     • Generate 2D coexpression network
     • 3D visualisation package
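A minimal sketch of the co-expression step on slide 13, assuming each probe has an expression profile of equal length and that the 0.87 cut-off is applied to the Pearson coefficient itself rather than to its absolute value (the slide does not say which). The function names, probe names and profile values are illustrative, not taken from the presentation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def coexpression_edges(profiles, cutoff=0.87):
    """Return (probe_a, probe_b, r) for every pair with r >= cutoff."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= cutoff:
            edges.append((a, b, r))
    return edges


# Toy profiles (hypothetical values); the real data set has 386 points per probe.
profiles = {
    "probe_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "probe_B": [2.1, 3.9, 6.2, 8.0, 9.8],
    "probe_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(profiles)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.3f}")
```

The sketch visits each unordered pair once. A full 13440 x 13440 matrix has 13440² = 180,633,600 entries, which matches the "180 million calculations" on the slide; at that scale a vectorised numerical library would be the practical choice.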

  14. Global landscape…
     [Figure: landscape with regions labelled M1 to M24, S1 to S8, S10, B1 and B2. Gene labels shown: Apolipoprotein; Apolipoprotein A-I-2 precursor; Parvalbumin a; Parvalbumin b; Fructose-bisphosphate aldolase A; Fructose-bisphosphate aldolase B; ADP,ATP carrier protein; Alpha-actin 1; Glyceraldehyde 3-phosphate dehydrogenase; 60S ribosomal protein L30; Ribosomal protein S9; Mitochondrial uncoupling protein 2; Elongation factor 1-alpha.]

  15. Clone identification by ExprAlign…
     • M13: 34 clones; 25 clones identified; 23 clones Fructose bisphosphate aldolase A (90%); 11 unknown (relatable)
     • M14: 33 clones; 27 clones identified; 25 clones Fructose bisphosphate aldolase A (93%); 6 clones unknown (relatable)
     Total of 366 clones/2050 unidentified clones
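The idea on slide 15, relating unknown clones to the dominant annotation of the module they fall in, can be sketched as follows. This is an illustration only, not the ExprAlign implementation: the function name, the 0.9 agreement threshold, the clone identifiers and the "other annotation" label are all assumptions. The counts mirror module M14 (33 clones, 27 identified, 25 of them Fructose bisphosphate aldolase A, 6 unknown).

```python
from collections import Counter


def propose_annotations(module, min_fraction=0.9):
    """Suggest the dominant annotation for unidentified clones in a module.

    `module` maps clone id -> annotation string, or None if unidentified.
    `min_fraction` is a hypothetical agreement threshold, not from the slides.
    """
    identified = [name for name in module.values() if name is not None]
    if not identified:
        return None, 0.0, {}
    dominant, count = Counter(identified).most_common(1)[0]
    fraction = count / len(identified)
    proposals = {}
    if fraction >= min_fraction:
        proposals = {clone: dominant
                     for clone, name in module.items() if name is None}
    return dominant, fraction, proposals


aldolase = "Fructose bisphosphate aldolase A"
module_m14 = {}
for i in range(25):            # 25 clones identified as aldolase A
    module_m14[f"clone_{i:02d}"] = aldolase
for i in range(25, 27):        # 2 clones with some other identification
    module_m14[f"clone_{i:02d}"] = "other annotation"
for i in range(27, 33):        # 6 unidentified clones
    module_m14[f"clone_{i:02d}"] = None

dominant, fraction, proposals = propose_annotations(module_m14)
print(dominant, round(100 * fraction), len(proposals))
```

With these counts the dominant annotation covers 25 of 27 identified clones (about 93%), so all 6 unknown clones receive a proposed identity.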

  16. carpBASE annotation summary
     a. 6033 sub-groups
        • 48.9%, 2952 sub-groups having gene names (BLAST)
        • 51.1%, 3081 sub-groups unclassifiable
     b. Sub-groups having gene names
        • 5.4%, 323 homolog of
        • 15.4%, 932 similar to
        • 28.1%, 1679 weakly similar to
     c. GO and EC matches
        • 25.3%, 1838 with GO match
        • 9.9%, 287 with GO & EC matches
        • 13.7%, 827 no GO & EC match
     d. Unclassifiable sub-groups
        • 12.8%, 775 with UTR
        • 3.8%, 232 with 2ndary structure domains
        • 5.3%, 321 with repeat elements
        • 4.5%, 274 ZF cDNA
        • 6.1%, 366 ExprAlign
        • 17.5%, no annotation

  17. Acknowledgements…
     • Carp Genomics: Andrew Gracey, Jane Fraser, Margaret Hughes
     • Sequence Bioinformatics: Weizhong Li, Luciane de Mello
     • Statistical Analysis: Yongxiang Fang, Faraaz Yusufi, Andy Brass
     • Funding: NERC (UK), BBSRC (UK), EC Framework 5

  18. Maximising EST identifications…
     • Total clones - 13,440
     • Total HQ ESTs - 9,202
     • Assembled - 6,033
     • BLAST I.d. - 2,952 (49%)
     • Gene ontology - 2125 (72.0%)
     • EC numbers - 287 (9.7%)
     • KEGG -
     • Unidentified - 3,081 (51%)
     • UTR search - 699 (11.6%)
     • Zebrafish cDNA - 274 (4.5%)
     • 2ndary structure - 232 (3.8%)
     • Repeat element -
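The percentages on slide 18 appear to use two different denominators: the GO and EC figures are consistent with a share of the 2,952 BLAST-identified sub-groups, while the remaining figures are consistent with a share of the 6,033 assembled sub-groups. The short check below reproduces the slide's figures under that reading; the choice of denominators is an inference from the arithmetic, not something the slide states, and the helper name `pct` is illustrative.

```python
assembled = 6033
blast_identified = 2952
unidentified = 3081


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    "BLAST identified / assembled": pct(blast_identified, assembled),
    "Unidentified / assembled": pct(unidentified, assembled),
    "Gene ontology / BLAST identified": pct(2125, blast_identified),
    "EC numbers / BLAST identified": pct(287, blast_identified),
    "UTR search / assembled": pct(699, assembled),
    "Zebrafish cDNA / assembled": pct(274, assembled),
    "2ndary structure / assembled": pct(232, assembled),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```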

  19. Gene expression landscapes….. C. elegans whole transcriptome analysis Stuart Kim - Stanford
