
EDG Application


Presentation Transcript


  1. EDG Application. The European DataGrid Project Team, http://www.eu-datagrid.org

  2. EDG Application Areas: High Energy Physics; Earth Observation Science Applications; Biomedical Applications

  3. High Energy Physics: 4 experiments on the LHC, including CMS, ATLAS and LHCb. ~6-8 PetaBytes / year, ~10^8 events/year, ~10^3 batch and interactive users

  4. CERN’s Network in the World. Europe: 267 institutes, 4603 users. Elsewhere: 208 institutes, 1632 users

  5. Data Flow in LHC

  6. Example: CMS Monte Carlo Production

  7. CMS jobs description • CMKIN: MC generation of the proton-proton interaction for a physics channel (dataset) • CMSIM: detailed simulation of the CMS detector, processing the data produced during the CMKIN step. Diagram: the CMKIN job writes its output data to a Grid Storage Element; the CMSIM job reads from the Grid Storage Element and writes its output data to a Grid Storage Element. * PIII 1GHz 512MB: 46.8 SI95

  8. CMS production components interfaced to EDG middleware • Production is managed from the EDG User Interface with IMPALA/BOSS • CMS Virtual Organization server at NIKHEF (Amsterdam). Diagram labels, CMS side: RefDB, BOSS DB, UI with IMPALA/BOSS. EDG side: Workload Management System, CEs with CMS software, SEs. Flows: JDL, parameters, push data or info, pull info
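
To make the JDL flow in the diagram more concrete, the sketch below renders a job description in the ClassAd-style `Name = value;` syntax used by JDL. The attribute names (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox) are common JDL attributes; the wrapper script, its arguments, and the file names are hypothetical and are not taken from IMPALA/BOSS or the CMS production tools.

```python
def jdl_value(value):
    """Render a Python value as a JDL literal: quoted string or {list}."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    return '"' + str(value) + '"'


def make_jdl(attributes):
    """Render a dict of attributes as one 'Name = value;' line each."""
    return "\n".join(f"{name} = {jdl_value(val)};" for name, val in attributes.items())


# Hypothetical CMKIN job: script name, arguments and file names are invented.
cmkin_job = {
    "Executable": "cmkin_wrapper.sh",
    "Arguments": "--dataset example_channel --events 125",
    "StdOutput": "cmkin.out",
    "StdError": "cmkin.err",
    "InputSandbox": ["cmkin_wrapper.sh"],
    "OutputSandbox": ["cmkin.out", "cmkin.err"],
}

print(make_jdl(cmkin_job))
```

In a real production the generated file would be handed to the workload management system from the User Interface; that submission step is outside the scope of this sketch.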

  9. CMS production components interfaced to EDG middleware • CMKIN jobs running on all EDG Testbed sites with CMS software installed • CMSIM jobs running on CE close to the input data • produced data: scripts for batch replication to a dedicated SE. Diagram labels, CMS side: RefDB, BOSS DB, UI with IMPALA/BOSS. EDG side: Workload Management System, Replica Manager, CEs with CMS software, WN, SEs. Flows: JDL, parameters, input data location, data registration, push data or info, pull info
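
The placement rule on this slide (CMKIN anywhere the software is installed, CMSIM only near its input data) can be illustrated with a small sketch. In EDG this matching is presumably done by the workload management system using replica information; the function, data structures, and host names below are all hypothetical and only show the rule itself.

```python
def pick_ce(job_input, replicas, close_se, installed):
    """Return CEs that have CMS software and a close SE holding the input.

    replicas:  logical file name -> set of SEs holding a copy
    close_se:  CE name -> set of SEs considered close to that CE
    installed: set of CE names with the CMS software installed
    """
    holders = replicas.get(job_input, set())
    return sorted(ce for ce in installed if close_se.get(ce, set()) & holders)


# Hypothetical testbed layout; all names are invented.
replicas = {"lfn:cmkin_run42.ntpl": {"se.site-a.example"}}
close_se = {
    "ce.site-a.example": {"se.site-a.example"},
    "ce.site-b.example": {"se.site-b.example"},
}
installed = {"ce.site-a.example", "ce.site-b.example"}

# A CMKIN job needs no input data, so any CE with the software qualifies.
cmkin_candidates = sorted(installed)
# A CMSIM job is restricted to CEs close to the CMKIN output it reads.
cmsim_candidates = pick_ce("lfn:cmkin_run42.ntpl", replicas, close_se, installed)
print(cmkin_candidates, cmsim_candidates)
```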

  10. CMS production components interfaced to EDG middleware • Job monitoring and bookkeeping: BOSS DBs, EDG Logging & Bookkeeping service. Diagram labels, CMS side: RefDB, BOSS DB, UI with IMPALA/BOSS. EDG side: Workload Management System, Replica Manager, CEs with CMS software, WN, SEs. Flows: JDL, parameters, input data location, data registration, job output filtering, runtime monitoring, push data or info, pull info

  11. CMS use of the system (Statistics). Events Production within EDG is part of the Official CMS production http://cmsdoc.cern.ch/cms/production/www/html/general/index.html Plot labels: Nb. of evts vs. time; SEs; CEs

  12. Summary of CMS work and the planning for use of EDG middleware • RESULTS • We can distribute and run CMS s/w in the EDG environment • We have generated ~250K events for physics with ~10000 jobs in a 3-week period • OBSERVATIONS and PLANNING for the future • We were able to quickly add new sites to provide extra resources • There was a fast turnaround in bug fixing and installing new software • The stress test was labor intensive (since software was developing and th • Release EDG 2.0 should fix the major problems and allow for enhanced scalability, and we look forward to evaluating it and using it in our Data Challenge work
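
The headline numbers on this slide imply some average rates, worked out below. The inputs are the slide's approximate figures; the derived values are plain averages and assume the job count and event count refer to the same production, which the slide does not spell out.

```python
events = 250_000   # "~250K events" from the slide
jobs = 10_000      # "~10000 jobs"
days = 3 * 7       # "3 week period"

events_per_job = events / jobs
jobs_per_day = jobs / days
events_per_day = events / days

print(f"{events_per_job:.0f} events/job")
print(f"{jobs_per_day:.0f} jobs/day")
print(f"{events_per_day:.0f} events/day")
```

Because CMKIN and CMSIM jobs are counted together here, the events-per-job figure is an average over both job types rather than the size of any one job.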

  13. EDG EO challenge: processing / validation of 1 year of GOME data, within the DataGrid environment. Level 1: raw satellite data from the GOME instrument (~75 GB, ~5000 orbits/y); LIDAR data (7 stations, 2.5 MB per month). ESA (IT) – KNMI (NL): processing of raw GOME data to ozone profiles, with 2 alternative algorithms, ~28000 profiles/day (example of 1 day total O3). Level 2: IPSL (FR) validates some of the GOME ozone profiles (~10^6/y) that are coincident in space and time with ground-based measurements. Visualization & analysis

  14. EO WebMap Portal

  15. Processing Sequence. Components: Web Portal, EO Product Catalogue, EO Product Archive, EO Grid Engine, EDG User Interface, EO Replica Catalogue, EDG Resource Broker, EDG Storage Element, EDG Computing Element
     1. Search Level-1 catalogue
     2. Retrieve Level-2 products
     3. Level-2 products already registered in RC?
     4. Yes: return available Level-2 products
     5. No: perform GRID processing on-the-fly
     6. Transfer Level-1 data from Archive to the Grid
     7. Register Level-1 data
     8. Submit jobs to process Level-1 data
     9. Process Level-1 data
     10. Transfer Level-2 data to SE
     11. Register Level-2 data
     12. Return new Level-2 products
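
The branch at step 3 is essentially a cache lookup against the replica catalogue. The sketch below follows the twelve steps with plain dictionaries standing in for the EO Replica Catalogue and the EO Product Archive; `get_level2`, its arguments, and the `process` callable are hypothetical names, and job submission through the Resource Broker is collapsed into a single function call.

```python
def get_level2(orbit, replica_catalogue, archive, process):
    """Return (product, steps) for one orbit, following the branch at step 3."""
    steps = ["1 search Level-1 catalogue", "2 retrieve Level-2 products"]
    key = ("L2", orbit)
    if key in replica_catalogue:                    # step 3: already registered?
        steps.append("4 return available Level-2 products")
        return replica_catalogue[key], steps
    steps.append("5 perform grid processing on the fly")
    level1 = archive[orbit]                         # step 6: archive -> grid
    replica_catalogue[("L1", orbit)] = level1       # step 7: register Level-1
    level2 = process(level1)                        # steps 8-10: submit, process, store
    replica_catalogue[key] = level2                 # step 11: register Level-2
    steps.append("12 return new Level-2 products")
    return level2, steps


# Toy stand-ins: the "processing" just upper-cases a string.
catalogue = {}
archive = {"orbit-001": "raw level-1 data"}
first, first_steps = get_level2("orbit-001", catalogue, archive, str.upper)
second, second_steps = get_level2("orbit-001", catalogue, archive, str.upper)
print(first_steps[-1])
print(second_steps[-1])
```

The second request for the same orbit takes the "yes" branch, since the first request registered the Level-2 product.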

  16. GOME Ozone Profile Validation • Goals of the DataGrid application: validate satellite data with all available ground-based data in an easy way: • Comparison of ozone profiles provided by satellite with lidar data in different locations and times (see the web portal) • Statistical comparison and analysis in order to improve algorithms. Figure labels: ERS/GOME satellite; ozone layer (10 km, 50 km); lidar at the Haute Provence Observatory

  17. Validation Processing Sequence (satellite data validation)
     1. Queries and data information retrieval from the Lidar metadata catalogue
     2. Queries and data information retrieval from the Gome Level 2 orbit or pixel metadata catalogues
     3. Submission of the job in the GRID
     4. When completed, comparison between lidar and satellite ozone profiles
     Diagram labels: GRID Portal, Lidar site, Lidar data catalogue, Level 2 Catalogue, Computing Element, Storage Elements with Lidar data, Storage Elements with Gome L2 data

  18. Validation Output. Figure 1: Estimation of the bias between Gome and Lidar using one month of data. Figure 2: Example of 2 profiles: comparison between Gome profile and lidar profile for the 2nd October 2000.
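
A bias estimate of the kind shown in Figure 1 could be computed along the following lines. This is a minimal sketch and not the validation code used in the project: the coincidence thresholds (300 km, 12 hours), the record layout, the relative-difference definition of bias, and all data values are assumptions chosen for illustration.

```python
from math import radians, sin, cos, asin, sqrt
from statistics import mean


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the haversine formula (Earth radius 6371 km)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi, dlmb = p2 - p1, radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def mean_bias(gome, lidar, max_km=300.0, max_hours=12.0):
    """Mean (GOME - lidar) / lidar per altitude level over coincident pairs."""
    diffs = {}
    for g in gome:
        for ld in lidar:
            close = distance_km(g["lat"], g["lon"], ld["lat"], ld["lon"]) <= max_km
            simultaneous = abs(g["hour"] - ld["hour"]) <= max_hours
            if close and simultaneous:
                for alt, value in g["o3"].items():
                    if alt in ld["o3"]:
                        ref = ld["o3"][alt]
                        diffs.setdefault(alt, []).append((value - ref) / ref)
    return {alt: mean(v) for alt, v in sorted(diffs.items())}


# Invented profiles: altitude in km -> ozone value in arbitrary units.
gome = [
    {"lat": 44.0, "lon": 6.0, "hour": 10.0, "o3": {20: 4.4, 30: 2.0}},
    {"lat": 10.0, "lon": 100.0, "hour": 10.0, "o3": {20: 9.9}},  # too far away
]
lidar = [{"lat": 43.9, "lon": 5.7, "hour": 20.0, "o3": {20: 4.0, 30: 2.0}}]

print(mean_bias(gome, lidar))
```

Only the first satellite profile is coincident with the lidar measurement, so the second one does not contribute to the estimate.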

  19. Perspectives for Biomedical Applications • Grids open new perspectives in large scale genomics analysis • Complete genome annotation • Cross-genomes analysis • Data mining on distributed databases • Pipelining of huge automatic bio-informatics analysis • Medical image processing • Large databases processing • Anatomy and physiology modeling • Epidemiological studies

  20. Biomedical Applications (legend: applications deployed; applications tested on EDG; applications under preparation) • Bio-informatics • Phylogenetics: BBE Lyon (T. Sylvestre) • Search for primers: Centrale Paris (K. Kurata) • Statistical genetics: CNG Evry (N. Margetic) • Bio-informatics web portal: IBCP (C. Blanchet) • Parasitology: LBP Clermont, Univ B. Pascal (N. Jacq) • Data-mining on DNA chips: Karolinska (R. Médina, R. Martinez) • Geometrical protein comparison: Univ. Padova (C. Ferrari) • Medical imaging • MR image simulation: CREATIS (H. Benoit-Cattin) • Medical data and metadata management: CREATIS (J. Montagnat) • Mammographies analysis: ERIC/Lyon 2 (S. Miguet, T. Tweed) • Simulation platform for PET/SPECT based on Geant4: GATE collaboration (L. Maigne)

  21. Medical Imaging. Steps: 1. query; 2. visualisation; 3. similarity search; 4. scores; 5. best results visualisation. Diagram labels: medical images; metadata (LFN, image, patient, hospital, ...)

  22. Graphic layer Grid File Browsing Job Monitoring File registration and retrieval

  23. Graphical Interfaces Image registration Local files Grid files Metadata Image retrieval Query over metadata Query result

  24. Image Registration. Diagram labels: Imager; SE; metadata (LFN, image, patient, hospital, ...)

  25. Similarity search Similarity computation Job monitoring Ranked list of images Results visualization Most similar images Low score images Source image
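
The similarity search produces a ranked list of images from per-image scores. The slide does not say which similarity measure is used, so the sketch below assumes a simple one (the inverse of one plus the mean squared pixel difference) purely for illustration; the function names, logical file names, and pixel values are invented.

```python
def similarity(a, b):
    """Score in (0, 1]: 1 / (1 + mean squared difference) of equal-sized pixel lists."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mse)


def rank(source, candidates):
    """Return (score, name) pairs, most similar first."""
    scored = ((similarity(source, pixels), name) for name, pixels in candidates.items())
    return sorted(scored, reverse=True)


# Toy 4-pixel "images" keyed by hypothetical logical file names.
source = [0, 1, 2, 3]
candidates = {
    "lfn:img-a": [0, 1, 2, 3],
    "lfn:img-b": [0, 1, 2, 5],
    "lfn:img-c": [9, 9, 9, 9],
}

for score, name in rank(source, candidates):
    print(f"{name}: {score:.3f}")
```

Each call to `similarity` is independent of the others, which is what would allow the comparisons to be distributed as separate grid jobs and monitored individually.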

  26. Future: Interfacing medical data with the Grid. Diagram labels, Grid middleware: Replica Catalog (RC interface), Replication Service (RS interface), Replica Master File, Storage Element (Storage Element core, MSS, Client 1 interface, Client 2 interface), Metadata interface. File metadata: ACL, size, checksum, ... Application metadata: ACL, encryption key, sensitive metadata, ... Medical (trusted) site: Imager, Medical server, grid-server interface with header blanking and encryption

  27. Parallel Processing • Magnetic Resonance Images simulation using the grid • 3 levels of parallelism: • Parallel isochromat computations • Multi-slice MRI computation • Parallel magnetization kernel. Diagram labels: virtual object, MRI sequence, magnetisation computation kernel, reconstruction algorithm, MRI image
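
Of the three levels, multi-slice parallelism is the easiest to sketch, because each slice can be computed independently. The example below uses a thread pool only to show the decomposition; `simulate_slice` is a toy stand-in and not the MRI kernel, and on a grid each slice would more plausibly be submitted as its own job.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_slice(slice_index, n_isochromats=1000):
    """Toy stand-in for one slice: sums an invented per-isochromat signal."""
    return sum((slice_index + 1) / (k + 1) for k in range(n_isochromats))


def simulate_volume(n_slices, workers=4):
    """Compute all slices independently and return results in slice order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_slice, range(n_slices)))


volume = simulate_volume(8)
print(len(volume), "slices computed")
```

The inner loop over isochromats is the second place where the work could be split, which corresponds to the "parallel isochromat computations" level on the slide.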

  28. Summary • Use Cases • High Energy Physics • Earth Observation • Biomedical Applications

  29. Further Information • High Energy Physics http://datagrid-wp8.web.cern.ch/DataGrid-WP8/ • Bio-Informatics http://marianne.in2p3.fr/datagrid/wp10/index.html • Earth Observation http://styx.esrin.esa.it/grid/
