1 / 38

SeRC Swedish e-Science Research Centre Juni Palmgren EGEE User Forum Uppsala April 2010

SeRC Swedish e-Science Research Centre Juni Palmgren EGEE User Forum Uppsala April 2010. e-Science in UK. Belfast e-Science Centre (BeSC) Cambridge e-Science Centre (CeSC) STFC e-Science Centre (STFCeSC) e-Science North West (eSNW) National Grid Service (NGS) OMII-UK

shanon
Download Presentation

SeRC Swedish e-Science Research Centre Juni Palmgren EGEE User Forum Uppsala April 2010

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SeRC Swedish e-Science Research Centre Juni Palmgren EGEE User Forum Uppsala April 2010 J Palmgren

  2. e-Science in UK • Belfast e-Science Centre (BeSC) • Cambridge e-Science Centre (CeSC) • STFC e-Science Centre (STFCeSC) • e-Science North West (eSNW) • National Grid Service (NGS) • OMII-UK • Lancaster University Centre for e-Science • London e-Science Centre (LeSC) • North East Regional e-Science Centre (NEReSC) • Oxford e-Science Centre (OeSC) • Southampton e-Science Centre (SeSC) • Welsh e-Science Centre (WeSC) J Palmgren

  3. Outline • Background • e-Science as a Strategic Research Area • SeRC application 2009 • Funding from 2010 • What will SeRC do? • Example: LifeSciences • Example: Complex Disease e-Science Community J Palmgren

  4. Research InfrastructureSweden and Europe J Palmgren

  5. Key e-Infrastructure components Well functioning network SUNET – Swedish University NETwork SNIC – Swedish National Infrastructur for Computing DISC – Database InfraStructure Committee HPC/ compute and storage Well documented databases J Palmgren

  6. Strategic Research Areas 2008Call 2009 • 20 research areas in technology, climate and environment, energy and health – including e-Science • e-Science proposals to involve • Methods and tools • Application areas • Infrastructure • Resourses available for e-Science (research/infrastructure) • 2010 – 25 (20/5) MSEK • 2011 – 40 (32/8) MSEK • 2012 – 70 (56/14) MSEK etc for 2013 and beyond • Applications evaluated by international panel set up by the • Swedish Research Council Spring 2009 J Palmgren

  7. SeRC applicatione-Science • Consortium KTH – LiU – SU – KI • KTH main applicant • 30 Milj/yr research: 38%, 27%, 27%, 8% • 20 Milj/yr infrastructure: 50% KTH, 50% LiU • Consortium application includes • Core of methods development • Strong application centers and research groups • Majority of Swedish e-infrastructure (PDC och NSC) J Palmgren

  8. SeRC Main applicants • Dan Henningson, Mechanics, KTH • Hans Ågren, Theoretical chemistry, KTH • Anna Delin, Material science, KTH • Anna-Karin Tornberg, Numerical Analysis, KTH • Anders Ynnerman, NVIS, LiU • Bengt Persson, Bioinformatics, LiU/KI • Igor Abrikosov, Material science, LiU • Juni Palmgren, Mathematical statistics, SU/KI • Anders Lansner, Computational neuroscience, SU • Erik Lindahl, Bioinformatics, SU J Palmgren

  9. SeRC Application granted • Highest ranked application, full research budget granted • Additional research funding to UU-Lund-Umeå consortia • Infrastructure (national) part treated separately J Palmgren

  10. Strategic funding for e-Science J Palmgren

  11. Strengths of the SeRC consortium • Ten centers of excellence (5 Linné, 4 SSF, 1 Berzelii) • Seven additional research centers • 54 key researchers named • Six members of KVA, Five ERC grants, Five “rådsforskare” • PDC + NSC 80% of total computer capacity for academic research • Software development • DALTON, EMTO, GROMACS, PENCIL, SIMSON, SUPERFLUID, FEniCS • Participation in EU/National projects • SweGrid, NDGF, EGEE, DEISA, PRACE, ELIXIR, BBMRI, INCF, IS-ENES • Industrial support • SAAB, SCANIA, SKF, BOMBARDIER, AstraZeneca, SECTRA, LightLab, Portendo, SMHI J Palmgren

  12. SeRC Research showcases • FLUID MECHANICS: massively parallel turbulence simulations provide new understanding • ENVIRONMENT: simulation of future climat • MATERIAL: simulations give structure of iron in earths core • GENOME MEDICINE: Identification of genome sequence variation associated with cancer risk J Palmgren

  13. SeRC Research showcases • MOLECULAR MODELING: distributed computations enable the largest ion-channel simulations in membranes • e-EPIDEMIOLOGY: ”Twin-Net data federation” – distributed database model for European infrastructure • VISUALIZATION: Interactive full-body scan possible through smart handeling of data • BRAIN MODEL: New method enables largest simulation of memory network J Palmgren

  14. What will SeRC do? • Form ”e-Science communities”; collaboration between • e-Science applications • core e-Science • computer experts at PDC/NSC • Research in core e-Science areas of importance for e-Science applications • Strategic collaboration between PDC and NSC • national responsibility for large computer infrastructure • strong European partner • Cooperation with industry and society J Palmgren

  15. e-Science communities • Computational fluid dynamics • Climate and the Environment • Bioinformatics: mapping and analyzing DNA and protein sequences • Complex Diseases: modeling molecular and medical data • Particle Simulations and Molecular Dynamics • Electron structures: DFT and Hartee-Fock methods • Waves: aero-acoustics and electromagnetics J Palmgren

  16. Core e-Science areas • Numerical Analysis • Visualization and Image Science • Mathematical Modeling • Distributed resources • Database technology • Parallel Algorithms and Performance Optimization J Palmgren

  17. Interface to industry and society • External advisory committee with international experts and representatives from industry and society • Representatives from industry and society in e-Science communities; Establish e-Science industry forum with dedicated coordinator • e-Science software and spin-offs • Open Source software of industrial interest • Spin-off companies in electromagnetics, material science, medical visualization, … • Establish software curation unit • Production of PhDs of interest to industry J Palmgren

  18. Management structure • Director: Dan Henningson • Co-director: Anders Ynnerman J Palmgren

  19. Swedish e-Science Research Centre (SeRC) J Palmgren

  20. Life SciencesNew technology! Max IV, Lund Synchrotrones Deep sequencing MRI-PET Advanced light microscopy PANDA

  21. Genome sequencing • 2000: Human genome working drafts • All data freely released • Project took about 10 years and cost about $3 billion • Jan 2008: Every 4 days major genome centers can sequence the same number of base pairs as the HGP • New technology has now reduced this to 10 hours J Palmgren

  22. Explosion of new biological data Cataloging sequence variation highest priority! • 2008 genome sequencing produced 1.5 petabyte data at hundreds of labs in dozens of countries with two Tier 0 centra (EMBL-EBI and NCBI) • New sequencing technology produces 5-10 times more data • Bioinformatics needs are gigantic! Notice • LHC produces 15 petabyte data annually • LHC grid has 140 computer centra in 33 countries centered around CERN asTier 0. From Iain Mattai EMBL 2008

  23. Increasing computational cost Values are for BLAST searches on Amazon EC2 (from Wilkening et al, IEEE Cluster09) From Folke Meyer J Palmgren

  24. Chemical entities ChEBI; PubChem Protein interactions IntAct with Imex, PRIDE Pathways Kegg; Reactome Systems BioModels Bioinformatics Databases: From molecules to systems Genomes (Sanger, EBI + NCBI) Nucleotide sequence EMBL-Bank+Genbank+DDBj Protein sequence UniProt(EBI/SIB/PIR) Gene expression ArrayExpress, GEO Protein structure wwPDB(RCSB,EBI,PDBj) Protein families, motifs and domains InterPro (12 collaborators) From Janet Thornton, EBI

  25. Integration with other dataFrom molecules to medicine & agriculture Genomes Proteins Metabolites

  26. SeRC Complex Disease eScience Community - Focus on human complex disease - Secure transfer and storage of large scale individual specific data - mathematical and statistical modeling of molecular, clinical, epidemiological data - Early diagnosis, prevention and cure

  27. LIFE SPAN REGISTRIES Hospital Discharge Cancer LATE LIFE EARLY LIFE Birth Specific Disease Registries Clinical Quality Registries Congenital Malformations Cause of Death Register of prescribed drug sales L I F E GENETICALLY INFORMATIVE Population Multiple Generation Twin Migration Bulding on:(i) Swedish population based registries

  28. (ii) Swedish Research Biobanks Biobank deposit Storage Biobank use • Input • Ensure optimally valuable biological samples stored • Ethics • Quality control • Sample handling and processing • Data gathering • Links to registers • Output • Ensure biological samples are optimally used • Ethics • Technology platforms for high throughput sample analysis • Data handling • Biostatistics IT database

  29. (iii) Cutting edge technologyStockholm-UppsalaScience for Life Laboratory; SciLifeLab HTP technology platforms: • Genomics • Proteomics and protein profiling • Bioimaging and functional biology • Bioinformatics and systems biology J Palmgren

  30. Data-analyses pipeline

  31. SeRC Complex Disease eScience Community Initially three thematic areas • Cancer • Cancer Risk Prediction (CRiSP) • Cardiovascular disease/Inflammation • Center of Excellence for Research on Inflammation and Cardio-vascular disease (CERIC) • Neuroscience • Stockholm Brain Institute (SBI) J Palmgren

  32. … with reference to C www.crispcenter.org2008-2018 Namn Efternamn

  33. J Palmgren

  34. J Palmgren

  35. Summary The Swedish e-Science Research Centre (SeRC) Builds on • Transition from HPC-centres to e-Science enablers  Recognizes that frontier science today often • Is complex, multi-scale, and multi-science in nature • Requires multi-disciplinary, multi-investigator, multi-institutional approaches (often international in scope). • Involves high data intensity and heterogeneity from observations, digital instruments, sensor nets, and simulations • demands semantic federation, active curation and long-term preservation of access to data • Places new demands on education J Palmgren

  36. J Palmgren

  37. Example:Risk prediction in CRiSP • Model risk and prognosis • Traditional risk factors • Mammographic density • Genetic variants • Search for new biomarkers • Statistical learning methods for discovery and validation • Account for disease heterogeneity • ER+, ER- tumors • - omics tumor profiles • Other.. • Assess model performance • Internal and external validation

More Related