Caitriana Nicholson, University of Glasgow • Dynamic Data Replication in LCG 2008
Outline • Introduction • Grid Replica Optimisation • The OptorSim grid simulator • OptorSim architecture • Experimental setup • Results • Conclusions
Introduction • Large Hadron Collider (LHC) at CERN will have raw data rate of ~15 PB/year • LHC Computing Grid (LCG) for data storage and computing infrastructure • 2008 will be first full year of LHC running • Actual analysis behaviour still unknown → use simulation to investigate behaviour → investigate dynamic data replication
Grid Replica Optimisation • Many variables determine overall grid performance • Impossible to reach one optimal solution! • Possible to optimise variables which are part of grid middleware • Job scheduling, data management etc • This talk considers data management only… …and dynamic replica optimisation in particular
Dynamic Replica Optimisation = optimisation of the placement of file replicas on grid sites… …in a dynamic, automated fashion
Design of a Replica Optimisation Service • Centralised, hierarchical or distributed? • Pull or push? • Choosing a replication trigger • On file request? • On file popularity? • Aim to achieve global optimisation as a result of local optimisation
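To make the "replication trigger" choices above concrete, here is a minimal sketch of a popularity-based trigger: a file is replicated locally once its request count within a sliding time window crosses a threshold. The class name, threshold, and window length are illustrative assumptions, not OptorSim's API.

```python
from collections import defaultdict, deque
import time

class PopularityTrigger:
    """Illustrative popularity-based replication trigger (hypothetical parameters)."""
    def __init__(self, threshold=5, window_seconds=3600):
        self.threshold = threshold          # requests needed before replicating
        self.window = window_seconds        # sliding window length in seconds
        self.requests = defaultdict(deque)  # file name -> timestamps of recent requests

    def record_request(self, filename):
        """Record a local access and report whether replication should fire."""
        now = time.time()
        q = self.requests[filename]
        q.append(now)
        # Drop requests that fell outside the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) >= self.threshold

# Usage: replicate on the request that pushes a file over the threshold.
trigger = PopularityTrigger(threshold=3)
for _ in range(3):
    should_replicate = trigger.record_request("lfn:aod_0001")
print(should_replicate)  # True once the file has been requested 3 times
```

A trigger like this acts purely on local information, which is the point of the last bullet: each site optimises for itself, and global optimisation is hoped to emerge from those local decisions.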
OptorSim • OptorSim is a grid simulator with a focus on data management • Developed as part of European DataGrid Work Package 2 • Based on EDG architecture • Used to examine automated decisions about replica placement and deletion • http://edg-wp2.web.cern.ch/edg-wp2/optimization/optorsim.html
Architecture • Sites with Computing Element (CE) and/or Storage Element (SE) • Replica Optimiser decides replications for its site • Resource Broker schedules jobs • Replica Catalogue maps logical to physical filenames • Replica Manager controls and registers replications
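As a hedged illustration of one piece of this architecture, the sketch below shows a toy Replica Catalogue mapping logical filenames to physical replicas, with register/unregister calls of the kind a Replica Manager would issue. Names and the "srm://" paths are illustrative only, not the EDG interfaces.

```python
from collections import defaultdict

class ReplicaCatalogue:
    """Toy replica catalogue: logical filename (LFN) -> physical replicas (illustrative)."""
    def __init__(self):
        self.replicas = defaultdict(set)   # LFN -> set of physical filenames

    def register(self, lfn, pfn):
        """Called after a replication to a Storage Element completes."""
        self.replicas[lfn].add(pfn)

    def unregister(self, lfn, pfn):
        """Called when a replica is deleted from a Storage Element."""
        self.replicas[lfn].discard(pfn)

    def lookup(self, lfn):
        """Return all known physical copies of a logical file."""
        return sorted(self.replicas[lfn])

rc = ReplicaCatalogue()
rc.register("lfn:aod_0001", "srm://cern.ch/data/aod_0001")
rc.register("lfn:aod_0001", "srm://ral.ac.uk/data/aod_0001")
print(rc.lookup("lfn:aod_0001"))  # both physical copies of the logical file
```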
Algorithms • Job scheduling • Details not covered in this talk • “QueueAccessCost” scheduler used in these results • Data replication • No replication • Simple replication: “always replicate, delete existing files if necessary” • Least Recently Used (LRU) • Least Frequently Used (LFU) • Economic model: “replicate only if profitable” • Sites “buy” and “sell” files using auction mechanism • Files deleted if less valuable than new file
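A minimal sketch of the deletion side of the LRU and LFU strategies listed above: when a new replica arrives at a full Storage Element, the least recently used (or least frequently used) file is evicted first. The class and method names are illustrative, not OptorSim's actual code.

```python
class StorageElement:
    """Toy SE that evicts files by LRU or LFU when space runs out (illustrative only)."""
    def __init__(self, capacity_gb, policy="LRU"):
        self.capacity = capacity_gb
        self.used = 0.0
        self.policy = policy
        self.files = {}   # name -> {"size": GB, "last_access": tick, "hits": count}
        self.clock = 0

    def access(self, name):
        """Record a read of a file already stored here."""
        self.clock += 1
        if name in self.files:
            self.files[name]["last_access"] = self.clock
            self.files[name]["hits"] += 1

    def replicate(self, name, size_gb):
        """Store a new replica, evicting victims according to the policy."""
        while self.used + size_gb > self.capacity and self.files:
            if self.policy == "LRU":
                victim = min(self.files, key=lambda f: self.files[f]["last_access"])
            else:  # LFU
                victim = min(self.files, key=lambda f: self.files[f]["hits"])
            self.used -= self.files.pop(victim)["size"]
        self.clock += 1
        self.files[name] = {"size": size_gb, "last_access": self.clock, "hits": 1}
        self.used += size_gb
```

The economic model replaces the simple eviction rule with a valuation step: a candidate file is only accepted (and an existing file only deleted) if the predicted value of the new file exceeds that of the victim, with prices set through the auction mechanism sketched after the backup slide at the end.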
Experimental Setup - Jobs & Files • Job types based on computing models • “Dataset” for each experiment ~1 year’s AOD (analysis data) • 2 GB files • Placed at CERN and Tier-1s at start
Experimental Setup - Storage Resources • CERN & Tier 1 site capacities from LCG Technical Design Report • “Canonical” Tier 2 capacity of 197 TB each (18.8 PB / 95 sites) • Define storage metric D = (average SE size) / (total dataset size) • Memory limitations → scale down Tier 2 SE sizes to 500 GB • Allows file deletion to start quickly • Disadvantage of small D
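A quick illustrative calculation of the storage metric D defined above; the dataset size used here is a placeholder, not the value from the study, and the point is only that the scaled-down 500 GB SEs push D well below 1, so deletion decisions start early.

```python
# Illustrative only: the 10 TB total dataset size is a placeholder assumption.
avg_se_size_gb = 500            # scaled-down Tier-2 SE size quoted on this slide
total_dataset_gb = 10_000       # hypothetical ~10 TB dataset
D = avg_se_size_gb / total_dataset_gb
print(f"D = {D:.3f}")           # small D -> SEs cannot hold the dataset, so eviction matters
```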
Experimental Setup - Computing & Network • Most (chaotic) analysis jobs run at Tier 2s • Tier 1s not given CE, except those running LHCb jobs • CERN Analysis Facility with CE of 7840 kSI2k • Tier 2s with average CE of 645 kSI2k each (61.3 MSI2k / 95 sites) • Network based on NREN topologies • Sites connected to closest router • Default of 155 Mbps if published value not available
Parameters • Job scheduler “QueueAccessCost” • Combines data location and queue information • Sequential access pattern • 1000 jobs per simulation • Site policies set according to LCG Memorandum of Understanding
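A hedged sketch of the idea behind a “QueueAccessCost”-style scheduler: each candidate site is scored by the estimated cost of staging the job's missing files plus the access cost of jobs already queued there, and the job goes to the cheapest site. The cost terms, units, and data layout here are illustrative assumptions, not the exact OptorSim formula.

```python
def access_cost(site, job_files, bandwidth_mbps=155):
    """Estimated time (s) to fetch the files a job needs that are not already at the site.
    job_files maps filename -> size in GB; site['stored_files'] is the set of local files."""
    missing_gb = sum(size for name, size in job_files.items()
                     if name not in site["stored_files"])
    return missing_gb * 8000 / bandwidth_mbps   # GB -> megabits, divided by Mbps -> seconds

def queue_access_cost(site, job_files, queued_jobs):
    """This job's data-access cost plus the access cost of jobs already queued at the site."""
    return access_cost(site, job_files) + sum(access_cost(site, j) for j in queued_jobs)

def schedule(job_files, sites):
    """Pick the site with the lowest combined queue + access cost."""
    return min(sites, key=lambda s: queue_access_cost(s, job_files, s["queue"]))

# Usage with toy data (file sizes in GB):
sites = [
    {"name": "T2_A", "stored_files": {"f1"}, "queue": []},
    {"name": "T2_B", "stored_files": set(), "queue": [{"f1": 2, "f2": 2}]},
]
best = schedule({"f1": 2, "f2": 2}, sites)
print(best["name"])   # T2_A: one file already local and an empty queue
```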
Evaluation Metrics • Different grid users will have different criteria of evaluation • Used in these summary results are: • Mean job time • Average time taken for a job to run, from scheduling to completion • Effective Network Usage (ENU) • (File requests which use network resources) / (Total number of file requests)
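A minimal sketch of how these two metrics can be computed from a simulation log; the record layout (a per-request network flag, per-job scheduling and completion times) is assumed for illustration.

```python
def effective_network_usage(requests):
    """ENU: fraction of file requests that used network resources.
    Each request is a dict with a 'used_network' flag (assumed log format)."""
    total = len(requests)
    remote = sum(1 for r in requests if r["used_network"])
    return remote / total if total else 0.0

def mean_job_time(jobs):
    """Average time from scheduling to completion (times in seconds)."""
    return sum(j["completed"] - j["scheduled"] for j in jobs) / len(jobs)

# Toy usage:
reqs = [{"used_network": True}, {"used_network": False}, {"used_network": True}]
jobs = [{"scheduled": 0, "completed": 420}, {"scheduled": 10, "completed": 370}]
print(effective_network_usage(reqs))  # 0.666...
print(mean_job_time(jobs))            # 390.0
```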
Results: Data Replication • Performance of algorithms measured with varying D • D varied by reducing dataset size • 20-25% gain in mean job time as D approaches realistic value
Results: Data Replication • ENU shows similar gain • Allows clearer distinction between strategies
Results: Data Replication • Number of jobs increased to 4000 • Mean job time increases linearly • Relative improvement as D increases will hold for higher numbers of jobs • Realistic number of jobs is >O(10000)
Results: Site Policies • Vary site policies: • All Job Types (sites accept jobs from any VO) • One Job Type (sites accept jobs from one VO) • Mixed (default) • All Job Types is ~60% faster than One Job Type
Results: Site Policies • All Job Types also gives ~25% lower ENU than other policies • Egalitarian approach benefits all grid users
Results: Access Patterns • Sequential access likely for many physics applications • Zipf-like access will also occur • Some files accessed frequently, many infrequently • Replication gives performance gain of ~75% when Zipf access pattern used
Results: Access Patterns • ENU also ~75% lower with Zipf access • Any Zipf-like element makes replication highly desirable • Size of efficiency gain depends on streaming model, etc
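A short sketch of what a Zipf-like access pattern means in practice: file popularity falls off as rank^(-alpha), so a few files receive most of the requests and replicating them pays off. The exponent and file count below are illustrative assumptions.

```python
import random

def zipf_requests(filenames, n_requests, alpha=0.85, seed=42):
    """Sample file requests with Zipf-like popularity: weight of rank r is r**(-alpha)."""
    rng = random.Random(seed)
    weights = [(r + 1) ** -alpha for r in range(len(filenames))]
    return rng.choices(filenames, weights=weights, k=n_requests)

files = [f"aod_{i:04d}" for i in range(1000)]
requests = zipf_requests(files, 10_000)
top = max(set(requests), key=requests.count)
print(top, requests.count(top))   # the highest-ranked file dominates the request stream
```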
Conclusions • OptorSim used to simulate LCG in 2008 • Dynamic data replication reduces running time of simulated grid jobs: • 20% reduction with sequential access • 75% reduction with Zipf-like access • Similar reductions in network usage • Little difference between replication strategies • Simpler LRU, LFU 20-30% faster than economic model • Site policy which allows all experiments to share resources gives most effective grid use
Replica optimiser architecture • Access Mediator (AM) - contacts replica optimisers to locate the cheapest copies of files and makes them available locally • Storage Broker (SB) - manages files stored in SE, trying to maximise profit for the finite amount of storage space available • P2P Mediator (P2PM) - establishes and maintains P2P communication between grid sites
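A hedged sketch of the auction idea behind the economic model on this backup slide: the Access Mediator asks Storage Brokers at other sites for bids on a file and takes the cheapest offer. The bid formula and class interfaces are illustrative only, not the actual EDG WP2 P2P protocol.

```python
class StorageBroker:
    """Toy Storage Broker that bids a price for delivering a file it holds (illustrative)."""
    def __init__(self, site, stored_files, network_cost):
        self.site = site
        self.stored_files = stored_files
        self.network_cost = network_cost   # assumed cost per GB to ship a file from this site

    def bid(self, filename, size_gb):
        """Return a price if the file is held locally, else decline (None)."""
        if filename in self.stored_files:
            return self.network_cost * size_gb
        return None

def access_mediator(filename, size_gb, brokers):
    """Collect bids from all brokers and pick the cheapest copy, if any."""
    bids = [(b.bid(filename, size_gb), b.site) for b in brokers]
    valid = [(price, site) for price, site in bids if price is not None]
    return min(valid) if valid else None

brokers = [
    StorageBroker("CERN", {"aod_0001"}, network_cost=1.0),
    StorageBroker("T1_UK", {"aod_0001"}, network_cost=0.4),
]
print(access_mediator("aod_0001", 2, brokers))   # (0.8, 'T1_UK') -> cheapest copy wins
```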