Changes in PD2P replication strategy
S. Campana (CERN IT/ES) on behalf of ADC
Current PD2P algorithm
• For T1s: distributes proportionally to the T1 MoU share
  • This is OK, we do not discuss this today
• For T2s: same algorithm as job brokering
  • Send a copy to the site with the highest chance of running a job once the dataset gets there
  • Weight ~ #CPUs * #Running / #Waiting
• The current algorithm is optimized for re-brokering
  • And for offloading the T1s
  • It is not necessarily optimal for a balanced data distribution
• The main purpose of PD2P is to replicate popular data for reuse
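The T2 weight above can be sketched in a few lines. This is a minimal illustration and not the PanDA brokerage code: the function names, the site names, and the `(cpus, running, waiting)` tuple layout are hypothetical. Two further assumptions are made: an empty queue is treated as one waiting job to avoid dividing by zero, and the site with the highest weight is picked, whereas the production system may instead sample sites in proportion to the weight.

```python
def brokerage_weight(n_cpus, n_running, n_waiting):
    """Weight ~ #CPUs * #Running / #Waiting, as stated on the slide."""
    # Assumption: clamp the denominator so an empty queue does not divide by zero.
    return n_cpus * n_running / max(n_waiting, 1)


def pick_express_site(sites):
    """Return the name of the site with the highest weight.

    `sites` maps a (hypothetical) site name to a tuple
    (n_cpus, n_running, n_waiting).
    """
    return max(sites, key=lambda name: brokerage_weight(*sites[name]))


if __name__ == "__main__":
    example_sites = {
        "T2_A": (1000, 800, 400),  # large site, long queue
        "T2_B": (500, 450, 50),    # smaller site, short queue
    }
    print(pick_express_site(example_sites))
```

With these made-up numbers the smaller site wins because its queue is short, which shows why this weight favours prompt job execution over balanced data placement.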
New PD2P algorithm
• PD2P will use two different algorithms, one for each kind of replica
  • Express replica (for quick reuse)
  • Long Term replica (aimed at a balanced data distribution)
• Express replica
  • Same algorithm as today
  • Quick data delivery (use ClosedSites only)
  • Possibility to promptly run a job on the new replica
• Long Term replica(s)
  • Based on the size (disk) of the site
  • Based on performance in Analysis Functional Tests (last month)
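One way the Long Term selection could work is sketched below. The slide only says that the choice is based on disk size and on last month's Analysis Functional Test performance. It does not say how the two are combined, so multiplying them into a single weight and drawing sites at random in proportion to that weight are both assumptions made for illustration. All names (`long_term_weight`, `pick_long_term_sites`, the site names) are hypothetical.

```python
import random


def long_term_weight(disk_tb, aft_efficiency):
    """Assumed combination: disk size scaled by AFT efficiency (0.0 to 1.0)."""
    return disk_tb * aft_efficiency


def pick_long_term_sites(sites, n_replicas=2, rng=None):
    """Draw up to n_replicas distinct sites, weighted by long_term_weight.

    `sites` maps a (hypothetical) site name to (disk_tb, aft_efficiency).
    Sites with zero weight are never selected.
    """
    rng = rng or random.Random()
    candidates = {
        name: long_term_weight(disk_tb, eff)
        for name, (disk_tb, eff) in sites.items()
    }
    candidates = {name: w for name, w in candidates.items() if w > 0}

    chosen = []
    while candidates and len(chosen) < n_replicas:
        names = list(candidates)
        weights = [candidates[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del candidates[pick]  # sample without replacement
    return chosen


if __name__ == "__main__":
    example_sites = {
        "T2_A": (1000, 0.95),
        "T2_B": (500, 0.90),
        "T2_C": (800, 0.00),  # failed every test last month
    }
    print(pick_long_term_sites(example_sites, n_replicas=2))
```

Random weighted sampling spreads replicas across sites over many datasets, which matches the stated goal of a balanced distribution better than always choosing the single largest site.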
PD2P and Pre-Placement
• PD2P and Pre-Placement of data are (almost) orthogonal
• PD2P replicates only what has been used, with a minimal delay
• Pre-Placement replicates everything
  • The delay can be very short but also very long (congestion after reprocessing, for example)
• ADC would prefer to have only one mechanism for data replication: PD2P
Proposal
• We leave the situation unchanged for T1s and CERN
• We keep going with no pre-placement for T2s
• We increase the number of replicas that PD2P creates on first use (and subsequent uses):
  • One Express replica at T2s
  • Two Long Term replicas at T2s
• PD2P is applied to AODs, (D)ESDs, NTUP
  • Both Data and MC
  • Including what is produced by Group Production
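The proposed policy can be written down as a small lookup, shown here only to make the replica counts concrete. The type names `AOD`, `ESD`, `DESD` and `NTUP` are assumed spellings of the formats listed above, and the function and constant names are hypothetical. Real dataset types may carry suffixes that this sketch does not handle.

```python
# Assumed spellings of the eligible formats; "(D)ESD" is read as ESD and DESD.
ELIGIBLE_TYPES = ("AOD", "ESD", "DESD", "NTUP")

# Replica counts at T2s taken from the proposal.
EXPRESS_REPLICAS = 1
LONG_TERM_REPLICAS = 2


def replica_plan(dataset_type):
    """Return how many T2 replicas PD2P would create when a dataset is used."""
    if dataset_type not in ELIGIBLE_TYPES:
        return {"express": 0, "long_term": 0}
    return {"express": EXPRESS_REPLICAS, "long_term": LONG_TERM_REPLICAS}


if __name__ == "__main__":
    for dataset_type in ("AOD", "DESD", "RAW"):
        print(dataset_type, replica_plan(dataset_type))
```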
Monitoring and Docs
• The PD2P twiki
  • https://twiki.cern.ch/twiki/bin/viewauth/Atlas/PandaDynamicDataPlacement
• PD2P replication (tables) from Mikhail
  • http://panda.cern.ch/server/pandamon/query?mode=pd2p
• PD2P logs from Tadashi
  • http://panda.cern.ch/server/pandamon/query?mode=mon&hours=240&name=panda.mon.prod&type=pd2p
• Plots from Jarka (will be moved soon, check the twiki above)
  • http://hpv2.farm.particle.cz/~schovan/pd2p/pd2p_index.html