Transcriptomics • Patrick Kemmeren • European Bioinformatics Institute • Genomics Lab, UMC Utrecht
What are microarrays? Transcriptomics? • mRNA → cDNA → hybridise to microarray
Microarray [Figure: from samples to a gene expression data matrix. For each sample: Sample, RNA extract, labelled nucleic acid, hybridisation to an array, with a protocol recorded at each step and an array design for the arrays. The arrays of an experiment are combined by normalization and integration into a gene expression data matrix (genes × samples).]
Microarray data and annotation • Gene expression matrix (genes × samples) holding the gene expression levels • Sample annotation • Gene annotation
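The three pieces on this slide (expression matrix, sample annotation, gene annotation) can be sketched as a small data structure. This is only an illustrative sketch: the gene names, sample names, values and annotation fields below are invented, and the helper functions are hypothetical, not part of ArrayExpress or Expression Profiler.

```python
# Hypothetical toy data: all names and values are invented for illustration.
genes = ["geneA", "geneB", "geneC"]
samples = ["s1", "s2"]

# Gene expression matrix: one row per gene, one column per sample.
expression = [
    [2.1, 0.4],    # geneA
    [0.0, 1.3],    # geneB
    [-1.2, -0.9],  # geneC
]

# Annotation is kept beside the matrix, keyed by the same row/column labels.
gene_annotation = {
    "geneA": {"description": "hypothetical gene A"},
    "geneB": {"description": "hypothetical gene B"},
    "geneC": {"description": "hypothetical gene C"},
}
sample_annotation = {
    "s1": {"treatment": "control"},
    "s2": {"treatment": "treated"},
}


def expression_level(gene, sample):
    """Look up one cell of the matrix by gene and sample label."""
    return expression[genes.index(gene)][samples.index(sample)]


def gene_profile(gene):
    """Return the expression profile of one gene as {sample: level}."""
    return dict(zip(samples, expression[genes.index(gene)]))


def samples_with(key, value):
    """Select samples whose annotation has the given key/value pair."""
    return [s for s in samples if sample_annotation[s].get(key) == value]
```

Keeping the annotation keyed by the same labels as the matrix rows and columns is what makes gene-centric and sample-centric queries possible later on.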
Traditions of data sharing in Life Sciences • Data used in publications should be made available so that the experiments can be reproduced and the conclusions verified, and so that others can build on the results • In genome sequencing this has evolved into submissions to the public sequence databases DDBJ/EMBL/GenBank – most journals require such submissions
Sharing microarray data – which data? [Figure: four levels of data, labelled A to D; panel labels include array scans, quantitations, spots, samples and genes.]
MGED – MIAME: 6 parts of a microarray experiment • Sample: sample source, sample treatments, extraction protocol, labeling protocol • Hybridisation: hybridization protocol • Array: array design information, location of each element, description of each element • Image, scanning protocol, software specifications • Quantification matrix, analysis protocol, software specifications
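The items on this slide can be read as a checklist that a submission either satisfies or not. The sketch below shows one way to encode that. The item names are taken from the slide; the group names "image" and "quantification", the dictionary layout and the `missing_items` helper are assumptions made for illustration, not the MIAME specification or the MIAMExpress implementation.

```python
# Checklist items as listed on the slide. The group names "image" and
# "quantification" are assumed here; the slide leaves those groups unnamed.
MIAME_CHECKLIST = {
    "sample": ["sample source", "sample treatments",
               "extraction protocol", "labeling protocol"],
    "hybridisation": ["hybridization protocol"],
    "array": ["array design information", "location of each element",
              "description of each element"],
    "image": ["image", "scanning protocol", "software specifications"],
    "quantification": ["quantification matrix", "analysis protocol",
                       "software specifications"],
}


def missing_items(submission):
    """Return {part: [items not provided]} for a submission dictionary.

    `submission` maps a part name to a dict of item name -> value.
    An item counts as missing if it is absent or has an empty value.
    """
    missing = {}
    for part, items in MIAME_CHECKLIST.items():
        provided = submission.get(part, {})
        absent = [item for item in items if not provided.get(item)]
        if absent:
            missing[part] = absent
    return missing


# Usage: a hypothetical submission that only describes the hybridisation.
partial = {"hybridisation": {"hybridization protocol": "protocol text here"}}
report = missing_items(partial)
```

In this example `report` lists every part except "hybridisation", which is the kind of feedback a curator or a submission tool could give back to the submitter.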
Microarray experiment [Figure: experiment graph for Rustici et al., S. pombe cell-cycle mutant data (2004). Nodes: experiment name, samples, extracts, labelled extracts, hybridizations. Colours relate to labels; shapes relate to array designs.]
Architecture [Figure: ArrayExpress architecture. Components: MIAMExpress (submissions, database), ArrayExpress Repository (database), AE Data Warehouse (database), external applications exchanging MAGE-ML (XML). Functions: submission support, curation, data upload, data download, visualisation.] • User functionality: retrieval of raw & processed data for analysis; gene, sample, and experiment centric queries
MIAMExpress • Submission and annotation tool • Potential local data annotation tool • Based on MIAME concepts • Accepts protocol, array and experiment submissions • User accounts allow re-use of protocols and arrays • Works with your own or commercial arrays
ArrayExpress http://www.ebi.ac.uk/arrayexpress • A public repository for microarray data at the EBI
Submissions by pipelines • Online (MIAMExpress) submissions
ArrayExpress data by organism • Total: ~7000 hybridisations
New! Gene-centric Query Prototype http://www.ebi.ac.uk/aedw/ArrayExpress_main.html
New! Gene-centric Query Prototype - Driven by a BioMart backend
Expression Profiler http://www.ebi.ac.uk/expressionprofiler • An online microarray data analysis platform
What can you do with the data? ...view as a heatmap... • Expression Profiler Data Viewer Component
What can you do with the data? ...cluster the data... • Expression Profiler Hierarchical Clustering Component
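To make "cluster the data" concrete, here is a minimal sketch of agglomerative hierarchical clustering on expression profiles. It assumes Euclidean distance and average linkage, which are common choices but not necessarily the settings used by the Expression Profiler component; the function names and the toy profiles are invented for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def hierarchical_clustering(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a list of row indices into `profiles`.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Usage: four hypothetical gene profiles measured in two samples.
toy_profiles = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
toy_clusters = hierarchical_clustering(toy_profiles, 2)
```

This naive version recomputes all pairwise cluster distances at every merge, which is fine for a handful of genes but far too slow for a full array; it is meant to show the idea only.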
What can you do with the data? ...look at Gene Ontology enrichment of a selected cluster... • Expression Profiler GO Annotation Component
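Enrichment of a GO term in a cluster is often scored with a one-sided hypergeometric test. The sketch below assumes that test; the slides do not say which statistic the GO Annotation Component uses, and the function name and parameter names are invented for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, annotated_in_cluster):
    """One-sided hypergeometric p-value for a GO term in a cluster.

    population            -- number of genes on the array
    annotated             -- genes on the array annotated with the GO term
    cluster_size          -- number of genes in the selected cluster
    annotated_in_cluster  -- genes in the cluster annotated with the term

    Returns the probability of seeing at least `annotated_in_cluster`
    annotated genes when `cluster_size` genes are drawn at random.
    """
    total = comb(population, cluster_size)
    upper = min(annotated, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(annotated_in_cluster, upper + 1)
    )
    return tail / total


# Usage: 10 genes on the array, 3 carry the term, and a cluster of 3 genes
# contains all 3 of them. Only 1 of the 120 possible clusters does that.
p = enrichment_p_value(10, 3, 3, 3)
```

In practice many GO terms are tested at once, so the resulting p-values would also need a multiple-testing correction.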
What can you do with the data? ...check out how clusterings compare... • Expression Profiler Clustering Comparison Component
What can you do with the data? ...integrate several data types together... • Expression Profiler Three-way Similarity Analysis
Available Components • Data Selection • Data Transformation • Missing Value Imputation • Hierarchical Clustering & K-groups Clustering • Clustering Comparison • Signature Algorithm • Sequence Homology • SPEXS: Promoter Discovery • Visual Pattern Matching • Ordination (COA, PCA) • Between Group Analysis • Three-way Similarity Analysis • GO Annotation
Uses: • ArrayExpress suite of tools • Standalone tool • Locally installed (UJI, UMC Utrecht) • Teaching tool • Pipelines, workflows, high-throughput analysis
Acknowledgements
Original EP Development: • Jaak Vilo (Tartu) • Patrick Kemmeren (Utrecht) • Misha Kapushesky
EP:NG Framework Development: • Patrick Kemmeren (Utrecht) • Misha Kapushesky • Caroline Johnston (UCL)
Visualization Components: • Misha Kapushesky • Steffen Durinck (Leuven) • Phil Hyoun Lee
Clustering Comparison: • Aurora Torrente • Christine Körner (Leipzig)
PCA/COA/BGA: • Aedín Culhane (Cork)
Signature Algorithm: • Jan Ihmels (Tel-Aviv)
Gene Ordering: • Karlis Freivalds (Riga)
Normalisation: • Caroline Johnston (UCL)
Web Services: • Antonio Estruch (UJI)
EBI Microarray Informatics Team: • Alvis Brazma, Head of Microarray Informatics Group • Ahmet Oezcimen, Scientist (Oracle DBA) • Anastasia Samsonova, PhD Student • Anjan Sharma, Scientist (Software Developer) • Anna Farne, Scientist (Curation) • Aurora Torrente, PhD Student • Bhuwan Tiwari, Trainee • Catherine Leroy, Summer Student • Ele Holloway, Scientist (Curation) • Gabriella Rustici, Scientist (Postdoc) • Gaurab Mukherjee, Scientist (Curation) • Gonzalo Garcia Lara, Scientist (Web Designer/Programmer) • Helen Parkinson, Scientist (Curation Coordinator) • Jaak Vilo, Consultant • Lev Soinov, Scientist (Postdoc Wellcome Trust) • Misha Kapushesky, Scientist (Scientific Application Programmer) • Mohammadreza Shojatalab, Scientist (Database Programmer) • Niran Abeygunawardena, Scientist (Web Designer/Programmer) • Patrick Kemmeren, Consultant • Per Lilja, Scientist (Database Programmer) • Philippe Rocca-Serra, Scientist (Nutrigenomics Proj. Coordinator) • Pierre Marguerite, Summer Student • Richard Coulson, Scientist (Biosapiens Project) • Sergio Contrino, Scientist (Database Programmer) • Steffen Durinck, Student • Susanna-Assunta Sansone, Scientist (Toxicogenomics Proj. Coordinator) • Tim Rayner, Scientist (Curation) • Ugis Sarkans, Scientist (Database Development Coordinator)