Data Transfer Efficiency - leave no byte unchurned. Jens Jensen, Rutherford Appleton Laboratory. GridPP26, U Sussex, March 2011.
Background • GridPP’s data grid • Distributed Storage Elements • Data movers (FTS, PhEDEx et al.) • Catalogues (usually replica) • e-Infrastructure (aka cyberinfrastructure) • (Presentation at ISGC)
The Data Grid • WLCG is primarily a data grid • Computation can (in principle) be redone • Jobs go to where data is • Moving a job is quicker than moving data
Postmature non-optimisation is the root of some evil • The role of infrastructure code • Scientist as a programmer • “Bad” code moves up the stack? • “Bad” code improves over time? • Doofers stay in prod’n
Efficiencaciousness Goals • Service: Availability; Performance; Grows as needed; Robust (no SPoF?) • People: (Effective) support; Training; Expertise; Availability of…
Approaches • Philosophy • Get it done – WLCG • Get it done right – EGI? • Do It Perfectly The First Time… • Evolutionary (control system) vs revolutionary • Proactive vs reactive
Efficiencaciousness Issues • Failures • Sites – BDII, network • Elements – storage • Components – disk servers • Timeouts • DDoS
Efficiencaciousness Issues • Overall effort • Funded, contributed, external • Availability of expertise • Single Point of Knowledge • Decoherence • 2nd Law of Thermodynamics • Learning from incidents
Efficiencaciousness Issues • Primary communication • Sites • Users: large VOs, small VOs, single users • PMB • Secondary • WLCG • NGS
Efficiencaciousness Issues • Sites • There Is Always A Bottleneck Somewhere • Site dependent • Usage dependent • Information • Freshness • Accuracy (“spped is substutefoaccurcy”)
Efficiencaciousness Issues • Usage patterns • Cf. Wahid’s talk yesterday • WAN vs LAN (WN) traffic • Technology • In the narrow sense (drives, controllers) • And the wider sense: distributed filesystems • Support: Upstream (EGI), Fabric
Efficiencaciousness Issues • Overheads • Complexity of use of stack (see next) • Infrastructure is complex • But Complexity Has To Go Somewhere • Time-to-production • Testing, troubleshooting, monitoring, tweaking, tuning
Particular Pain Point Principle Progress
Progressing Forward • What is progress? • How to measure progress?
The Good News • We’ve come a long way • Don’t think there is a skills gap • But some SPoKs
Graeme’s talk • “Get the best out of what we can afford to buy” • Proactive sites better • Standards are good
E[GM]I involvement • EMI data roadmap • Support for dCache, DPM, StoRM • Support for standards (NFS4, CDMI) • But then • StoRM=INFN, dCache=DESY, DPM=CERN
The Cloud View • Supplement resources with on-demand • Agile • CDMI is a superset of SRM • But using REST+JSON, not SOAP
(Open) Standards • Standards promote interoperation and stability • Interoperation • Multiple (independent) implementations • Both Java and (C or C++)
The Case for Non-HEP Data • Benefit from non-HEP data • Outreachy stuff • Benefit to society (e.g. saving lives) • NGI interop (at compute) • Others…
Efficiencaciousness Goals • Service: Availability; Performance; Grows as needed; Robust (no SPoF?) • People: (Effective) support; Training; Expertise; Availability of…