
Worldwide Data Processing with SAMGrid

SAMGrid enables the reanalysis of large datasets using the latest techniques, providing a platform for global collaboration and efficient processing of massive amounts of data.


Presentation Transcript


  1. Worldwide Data Processing with SAMGrid

  2. As experiments refine their understanding of raw data, a point is reached where it becomes desirable to reanalyze the entire dataset with the latest techniques. For the D0 experiment, the datasets involved are large: ~250 TB, equivalent to a stack of CDs nearly as tall as the Eiffel Tower.

  3. Processing such large datasets in a timely manner requires large-scale compute resources. A single pass over the full dataset will involve reading ~250 TB of input, writing ~70 TB of output, and processing ~1 billion events. To complete such a pass within 6 months requires ~3.5 THz of PIII-equivalent compute capacity.

  4. SAMGrid provides an ideal platform for mustering the large-scale resources needed for the D0 data reprocessing, with over 20 production sites located across North America, Europe, Asia, and South America.

  5. More than a dozen sites worldwide were able to participate in the D0 reprocessing effort by providing a peak compute capacity of over 3.5 THz in PIII-equivalent units:
     - CCIN2P3 (Lyon)
     - CMS (at FNAL)
     - Fermilab
     - FZU (Prague)
     - GridKa (Karlsruhe)
     - Imperial (London)
     - Manchester
     - OSCER (Oklahoma)
     - SPRACE (São Paulo)
     - D0SAR (Texas, Arlington)
     - WestGrid (Vancouver BC)
     - Wisconsin

  6. Essential services provided by SAMGrid:
     - Complete meta-computing environment, including Grid-level job management based on Condor and Globus
     - Delivery of executables to sites in an encapsulated compute environment suitable for operation on diverse Linux installations
     - Delivery of raw data over WAN to remote installations
     - Transport of output back to FNAL and storage in MSS
     - Bookkeeping of processing history, job success/failure, and job recovery
     - Monitoring facilities for job status, site availability, and error logging

  7. Submission screen dump… Site screen dump… Job screen dump… Data flow (FNAL->remote->merge->FNAL) Conclusion, time spent, data processed etc.
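
The figures quoted on slide 3 (~250 TB in, ~70 TB out, ~1 billion events, 6 months, ~3.5 THz) can be turned into sustained rates with some back-of-the-envelope arithmetic. The sketch below is only an illustration, not part of the original slides. It assumes that "6 months" means half of a 365-day year, that a terabyte is 10^12 bytes, and that the full 3.5 THz is busy for the entire pass; the function name `pass_requirements` is hypothetical.

```python
# Rough arithmetic on the slide-3 figures. All inputs are the approximate
# values quoted in the slides; the unit conventions below are assumptions.

SECONDS_PER_DAY = 86_400
PASS_DAYS = 182.5      # assumption: "6 months" = half of a 365-day year
TB = 1e12              # assumption: decimal terabytes


def pass_requirements(input_tb=250.0, output_tb=70.0,
                      events=1e9, capacity_thz=3.5):
    """Return average sustained rates implied by one full reprocessing pass."""
    pass_seconds = PASS_DAYS * SECONDS_PER_DAY
    event_rate = events / pass_seconds  # events per second
    return {
        "events_per_s": event_rate,
        "input_mb_per_s": input_tb * TB / pass_seconds / 1e6,
        "output_mb_per_s": output_tb * TB / pass_seconds / 1e6,
        # CPU budget per event if the whole capacity is used continuously
        "ghz_s_per_event": capacity_thz * 1e12 / event_rate / 1e9,
        "input_kb_per_event": input_tb * TB / events / 1e3,
    }


req = pass_requirements()
for name, value in req.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the pass works out to roughly 63 events per second, about 16 MB/s of input and 4.4 MB/s of output averaged over the six months, and around 55 GHz-seconds of PIII-equivalent CPU per event. Real jobs would not achieve perfect utilization, so these are lower bounds on the capacity and bandwidth actually needed.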
