CMS computing: CCRC08 review. C. Charlot / LLR. LCGFR, 3 March 2008
CCRC08: objectives
• Test the readiness of the computing infrastructure before data taking
• Combined exercise with the other experiments
• Phase I: Feb. 2008
  • Series of functional tests
  • Processing at the T0 and archiving, Cessy->CERN transfers, T0->T1->T2 transfers, T1 staging and processing, CAF tests
• Phase II: May 2008
  • Complete workflow, run simultaneously at all sites
  • Scale = 100%
  • 1 week of ramp-up followed by 4 weeks of testing
• This presentation
  • Transfer tests
  • Staging test at the T1
  • Simultaneous processing test with ATLAS at the T1 (PIC, CC-IN2P3)
LCG-France meeting, 03/03/2008, C. Charlot
CCRC08: T0->T1 transfers
• Objectives:
  • Performance
    • T0 (staged data) -> T1 (disk buffer): minimum = 25% of the 2008 rate, target = 40% of the 2008 rate, optimal = 50% of the 2008 rate
    • T1 (disk) -> T1 (tape): 25% of the 2008 rate
    • The target must be met on 3 consecutive days
  • Stability
    • T0 (staged) -> T1 (disk) -> T1 (tape)
    • Stable transfer, receiving a volume equivalent to 3 days at the above rate (10 TB for CC-IN2P3)
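The two criteria on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, decimal units (1 TB = 10^6 MB) are assumed, and the slide does not say which of the three thresholds the 10 TB figure corresponds to. Under those assumptions, 10 TB received over 3 days corresponds to an average of roughly 38.6 MB/s.

```python
SECONDS_PER_DAY = 86_400


def average_rate_mb_s(volume_tb: float, days: float) -> float:
    """Average rate (MB/s) needed to move volume_tb in the given number of days.

    Assumes decimal units: 1 TB = 1e6 MB.
    """
    return volume_tb * 1e6 / (days * SECONDS_PER_DAY)


def met_on_consecutive_days(daily_rates, target, days_required=3):
    """Return True if the target rate is reached on days_required days in a row."""
    streak = 0
    for rate in daily_rates:
        # Extend the streak on a good day, reset it on a bad one.
        streak = streak + 1 if rate >= target else 0
        if streak >= days_required:
            return True
    return False


if __name__ == "__main__":
    # 10 TB in 3 days is the CC-IN2P3 figure quoted on the slide.
    print(f"{average_rate_mb_s(10, 3):.1f} MB/s")
```

The daily rates passed to `met_on_consecutive_days` would come from the transfer monitoring; none are given in the slides, so none are shown here.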
CCRC08: T0->T1 transfers
• T0-T1
CCRC08: T0->T1 transfers
• T0-T1-CCIN2P3
  - Problems with the srmv2 configuration
  - Problems with dCache
CCRC08: T0->T1 transfers
• T0-T1s
CCRC08: T1->T1 transfers
• Objectives:
  • Performance
    • Aggregate rate: export at 50% of the 2008 rate to at least 3 T1s
    • Aggregate rate: import at 50% of the 2008 rate from at least 3 T1s
    • At least 1 T1 on another continent
CCRC08: T1->T1 transfers
• Summary of the 3 weeks
CCRC08: T1->T1 transfers
• T1-CCIN2P3->otherT1s
CCRC08: T1->T1 transfers
• T1<->T1 summary
CCRC08: T1->T2 transfers
CCRC08: T1-CCIN2P3->T2s
CCRC08: T1s->region T2s
CCRC08: T1s->region T2s
CCRC08: reprocessing tests
• Reprocessing tests for CCRC08 in February:
  • A) Migration from Tape to Buffer: pre-stage test.
  • B) Reprocessing exercise: use all available CMS slots at T1s.
    • Not done, since this was already achieved at the CC-IN2P3 T1, where ~1000 slots were used for processing production data
  • C) Reprocessing exercise: test ATLAS and CMS reprocessing jobs on the same WN
CCRC08: pre-staging tests
• Goal: Measure latency, throughput and success rate for Tape to Buffer staging, for files which are only kept on Tape (not on disk).
• Plan:
  + select one (or more) dataset(s) of 10 TB size existing at T1.
  + remove all the files from disk (aka T1 Buffer).
  + fire the staging from Tape to Disk of all files.
  + measure some variables (detailed in the twiki).
• Schedule: To be done at sites (with help of site admins) during the 1st quarter of February. Done at all T1 sites.
CCRC08: pre-staging tests
• Results obtained: staging time for 10 TB: ~24 h (except RAL, IN2P3 and CNAF)
CCIN2P3: pre-staging tests
• dCache HPSS interface: HPSS -> HPSS_Disk -> dCache_Disk (farm access).
• A 1 GB file needs ~140'' to complete the HPSS_Tape -> HPSS_Disk -> dCache_Disk process.
• The last step (HPSS_Disk -> dCache_Disk) is achieved in ~45 secs (22 MB/s), while HPSS_Tape -> HPSS_Disk takes the majority of the time, as expected (mounts, tape seek…).
• 140'' for file staging -> 7.1 MB/s for file recovery, on average, per drive.
• The test launched 3 parallel staging processes -> at most 3 tapes were mounted at any time to recover files from the system. 7.1 MB/s/drive was achieved -> 23 MB/s, averaged.
• A final test recalled 100 files from the same tape.
• HPSS_Tape -> dCache_Disk took 19' -> 12 secs/file -> 88 MB/s. x10 better.
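The throughput figures on this slide follow from simple size/time arithmetic, which the sketch below reproduces. It assumes decimal units (1 GB = 1000 MB) and, for the last test, that each of the 100 recalled files is also 1 GB; the slide does not state that file size, but the quoted 88 MB/s matches it. The variable names are illustrative only.

```python
def throughput_mb_s(size_mb: float, seconds: float) -> float:
    """Average throughput in MB/s for size_mb transferred in the given time."""
    return size_mb / seconds


FILE_MB = 1000  # 1 GB file, decimal units (assumption)

# Full chain HPSS_Tape -> HPSS_Disk -> dCache_Disk: ~140 s per file.
tape_to_dcache = throughput_mb_s(FILE_MB, 140)

# Last step only, HPSS_Disk -> dCache_Disk: ~45 s per file.
disk_to_dcache = throughput_mb_s(FILE_MB, 45)

# Three drives in parallel, each at the per-drive rate above.
three_drives = 3 * tape_to_dcache

# 100 files recalled from a single tape in 19 minutes
# (assumes 1 GB per file, which the slide does not state).
per_file_s = 19 * 60 / 100
same_tape = throughput_mb_s(FILE_MB, per_file_s)

if __name__ == "__main__":
    print(f"per drive, full chain : {tape_to_dcache:.1f} MB/s")
    print(f"disk -> dCache only   : {disk_to_dcache:.1f} MB/s")
    print(f"3 drives in parallel  : {three_drives:.1f} MB/s")
    print(f"same-tape recall      : {per_file_s:.1f} s/file, {same_tape:.1f} MB/s")
```

With these inputs the per-drive rate comes out near 7.1 MB/s and the same-tape recall near 88 MB/s, in line with the slide. Three drives at the per-drive rate give about 21 MB/s, slightly below the 23 MB/s average the slide reports.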
ATLAS+CMS processing test
• Goal: run ATLAS and CMS reprocessing jobs on the same WNs
  • Investigate performance and memory issues
  • Set up a new CE with updated middleware and dedicated queues
• Results:
  • ATLAS and CMS jobs were run on a dedicated CE + WNs
    • 10 8-core worker nodes
  • It allowed the grid team to discover quirks in the LCG-CE glite-3.1
    • Discovered that at CC the jobs were submitted to all queues and that GlueCEStateStatus == "Production" was not taken into account.
  • The maximum memory requirement was relaxed to allow for a memory study, but this was not looked at
  • Too few WNs to see any interference effects
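The queue-status issue can be illustrated with a minimal sketch of the filtering that was expected: only queues whose GlueCEStateStatus is "Production" should receive jobs. The queue names and the dictionary layout below are hypothetical. This is not the LCG-CE or resource broker code, only an illustration of the intended selection.

```python
def schedulable_queues(queues):
    """Return the names of queues whose GlueCEStateStatus is "Production"."""
    return [
        q["name"]
        for q in queues
        if q.get("GlueCEStateStatus") == "Production"
    ]


# Hypothetical queue records, for illustration only.
example_queues = [
    {"name": "queue_a", "GlueCEStateStatus": "Production"},
    {"name": "queue_b", "GlueCEStateStatus": "Closed"},
    {"name": "queue_c", "GlueCEStateStatus": "Draining"},
]

if __name__ == "__main__":
    print(schedulable_queues(example_queues))
```

Submitting to every entry of `example_queues` regardless of status corresponds to the behaviour reported on the slide.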
Conclusions
• Good participation of the CC in the CCRC08 tests
  • Thanks to everyone for their efforts
• The re-processing tests proved useful
• Staging: throughput limited by the dCache->HPSS interface
  • Optimisation of request handling, or ordering of files by tape on the user side, should be planned
• Difficulties with the transfers
  • srmv2 configuration problem, numerous dCache problems
  • CCRC08 (May) target of 100 MB/s from CERN
• Stabilising dCache appears urgent
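The suggestion to order files by tape can be sketched as follows. The sketch assumes the user can learn, for each file, which tape holds it and at which position; the slides do not say how that information would be obtained, and the file names, tape labels and function names are hypothetical. The idea is simply to group requests so that each tape is mounted once and read in order.

```python
from collections import defaultdict


def order_by_tape(requests):
    """Order stage requests so files on the same tape are contiguous.

    requests: iterable of (file_name, tape_id, position_on_tape).
    Within a tape, files are sorted by their position.
    """
    by_tape = defaultdict(list)
    for name, tape, position in requests:
        by_tape[tape].append((position, name))
    ordered = []
    for tape in sorted(by_tape):
        ordered.extend(name for _, name in sorted(by_tape[tape]))
    return ordered


def count_mounts(tape_sequence):
    """Count tape mounts for a single drive: one mount per change of tape."""
    mounts = 0
    previous = None
    for tape in tape_sequence:
        if tape != previous:
            mounts += 1
            previous = tape
    return mounts


# Hypothetical requests, for illustration only.
example_requests = [
    ("f1", "T2", 5),
    ("f2", "T1", 9),
    ("f3", "T2", 1),
    ("f4", "T1", 2),
]

if __name__ == "__main__":
    print(order_by_tape(example_requests))
    print(count_mounts([tape for _, tape, _ in example_requests]))
    print(count_mounts(sorted(tape for _, tape, _ in example_requests)))
```

In this toy case the unordered request list costs four mounts on a single drive, while the grouped order needs two.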