Xrootd Proxy Service
Andrew Hanushevsky, Heinz Stockinger
Stanford Linear Accelerator Center
SAG 2004, 20-September-04
http://xrootd.slac.stanford.edu
The BaBar Experiment
• Use big-bang energies to create B meson particles
• Look at collision decay products
• Answer the question "where did all the anti-matter go?"
• 500 physicists collaborating from >70 sites in 10 countries
  • USA, Canada, China, France, Germany, Italy, Norway, Russia, UK, Taiwan
• The experiment produces large quantities of data
  • 300 TBytes/year for 10 years
  • Most data stored as objects using the ROOT persistency framework
  • Some data stored in an Objectivity/DB database
  • Expected to double every year as detector luminosity increases
• Heavy computational load
  • 5,000 1-2 GHz CPUs spread over 35 sites world-wide
  • Work is distributed across the collaboration
BaBar is the Forerunner
• LHC at CERN
  • The Large Hadron Collider
  • Due to start in 2007
  • Will generate several orders of magnitude more data
  • Will require even more compute cycles
• Example: ATLAS
  • Probe the Higgs boson energy range
  • Explore the more exotic reaches of physics
The Data Access Need
• Scalable, high-performance access to data
  • Must scale to 100s if not 1000s of data servers
• Most data is read-only
  • Data is written only once
  • Versioned
• Secondary access to distributed data
  • As a backup strategy
Solution Fundamentals
• Extensible base server architecture
  • Allows for a high-performance implementation
• Rich but efficient server protocol
  • Combines file serving with P2P elements
  • Allows client hints for improved performance
• Administrative security
  • Implies a structured peer-to-peer framework
The Implementation
• High-performance file-based access
• Fluidly scalable
  • Works well in single-server environments
  • Scales beyond 32,000 cooperative data servers
• Natively extensible
  • A requirement for this level of scaling
  • Servers can be added at any time without disruption
• Fully fault-tolerant
  • Servers can be removed at any time without disruption
• Flexible security
  • Allows use of almost any protocol
Entities & Relationships
[Diagram: clients talk to redirectors, which steer them to data servers. xrootd daemons form the data network (redirectors steer clients to data; data servers provide the data). olbd daemons form the control network of managers and servers, carrying resource information and file locations.]
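The redirection step in this picture is the heart of the design: a client always opens a file through a redirector, which answers with the address of a data server rather than with the data itself. Below is a minimal sketch of that client-side loop; the OpenResult type and tryOpen helper are purely illustrative stand-ins, not the actual xrootd client API.

```cpp
#include <iostream>
#include <string>

// Illustrative result of one open attempt (not the xrootd protocol).
struct OpenResult {
    enum Kind { Data, Redirect, Error } kind;
    std::string target;   // host:port when kind == Redirect
};

// Stand-in for one open attempt against a server. A redirector
// consults the olbd control network for the file's location and
// answers Redirect; a data server answers Data.
OpenResult tryOpen(const std::string& host, const std::string& path) {
    if (host == "redirector") return {OpenResult::Redirect, "data03:1094"};
    return {OpenResult::Data, ""};
}

int main() {
    std::string host = "redirector";            // client always starts here
    const std::string path = "/babar/run1.root";
    for (int hops = 0; hops < 8; ++hops) {      // bound the redirect chain
        OpenResult r = tryOpen(host, path);
        if (r.kind == OpenResult::Data) {
            std::cout << "reading " << path << " from " << host << "\n";
            return 0;
        }
        if (r.kind == OpenResult::Redirect) { host = r.target; continue; }
        break;                                   // error: give up
    }
    std::cerr << "open failed\n";
    return 1;
}
```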
Example: SLAC Configuration
[Diagram: client machines contact the redirectors bbr-rdr-a, bbr-rdr03, and bbr-rdr04, which steer them to the data servers kan01, kan02, kan03, kan04, ... kanxx.]
Data Growth & More Fault Tolerance
• BaBar data is replicated
  • Backup strategy
  • Processing strategy
• Some data is only available at one site
  • Use grid techniques to make the data accessible
  • But when things go wrong, we would still like access
• The proxy solution
The 10,000 Foot View
[Diagram: SLAC (US), INFN (IT), FZK (DE), RAL (UK), and IN2P3 (FR) exchange data over the Internet.]
The Reality
• Sites fear hosting…
  • Distributed denial-of-service attacks
  • Massive illegal file sharing
• Only certain hosts are allowed to reach the outside
  • Rarely the batch worker machines
  • The ones that need remote data most
• The firewall issue
A Closer Look
[Diagram: SLAC, RAL, and IN2P3 each sit behind their own firewall; RALproxy and IN2P3proxy xrootd servers bridge the firewalls. Firewalls require proxy servers.]
Proxy Service
• Attempts to address competing goals
• Security
  • Deals with firewalls
• Scalability
  • Administrative
  • Configuration
• Performance
  • Ad hoc forwarding for near-zero wait time
  • Intelligent caching in the local domain
Proxy Implementation
• Uses the capabilities of olbd and xrootd
  • Simply an extension of local load balancing
• Implemented as a special file system type (sketched below)
  • Interfaces in the logical file system layer (ofs)
  • Functions in the physical file system layer (oss)
• Primary developer is Heinz Stockinger
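In other words, the proxy plugs in beneath the same logical (ofs) layer every xrootd server uses; only its "physical" layer forwards requests to a remote server instead of touching local disk. The sketch below illustrates that split with hypothetical class names (Oss, LocalOss, ProxyOss); the real ofs/oss interfaces differ.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>

// "Physical" file system interface (the oss role): where bytes come from.
struct Oss {
    virtual ~Oss() = default;
    virtual int read(const std::string& path, char* buf,
                     uint64_t off, int len) = 0;
};

// Ordinary data server: satisfies reads from local disk.
struct LocalOss : Oss {
    int read(const std::string& path, char* buf,
             uint64_t off, int len) override {
        std::FILE* f = std::fopen(path.c_str(), "rb");
        if (!f) return -1;
        std::fseek(f, static_cast<long>(off), SEEK_SET);
        int n = static_cast<int>(std::fread(buf, 1, len, f));
        std::fclose(f);
        return n;
    }
};

// Proxy server: same interface, but the "physical" layer forwards the
// request to a remote xrootd server on the other side of the firewall.
struct ProxyOss : Oss {
    std::string origin;  // remote server, e.g. "ral-redirector:1094"
    explicit ProxyOss(std::string o) : origin(std::move(o)) {}
    int read(const std::string& path, char* buf,
             uint64_t off, int len) override {
        (void)buf;  // a real proxy would issue an xrootd client read here
        std::printf("forward read %s@%llu(%d) to %s\n", path.c_str(),
                    static_cast<unsigned long long>(off), len,
                    origin.c_str());
        return 0;   // stub: this sketch moves no bytes
    }
};

int main() {
    char buf[64];
    ProxyOss proxy("ral-redirector:1094");
    proxy.read("/babar/run1.root", buf, 0, sizeof(buf));
    return 0;
}
```

The point of the layering is that the upper ofs layer never needs to know whether it is running on a data server or a proxy.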
Proxy Interactions
[Diagram: (1) a client machine at SLAC contacts the redirector red01; (2) the local olb queries the local servers data02, data03, and proxy01; (3) the client is redirected to proxy01; (4) proxy01's proxy olb contacts the olb at RAL across the firewall; (5) RAL's data servers data01-data04 serve the file back through the proxy.]
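A toy version of that lookup order, with illustrative names rather than the olbd protocol: the local query is tried first, and only a miss falls through to the proxy, which repeats the question at the remote site.

```cpp
#include <iostream>
#include <map>
#include <string>

// Toy location lookup mirroring the numbered flow above.
// Returns the host that should serve the file.
std::string locate(const std::string& path,
                   const std::map<std::string, std::string>& localFiles,
                   const std::string& proxyHost) {
    auto it = localFiles.find(path);   // steps 1-2: query the local site
    if (it != localFiles.end()) return it->second;
    return proxyHost;                  // step 3: fall back to the proxy,
                                       // which repeats the query remotely
                                       // (steps 4-5)
}

int main() {
    std::map<std::string, std::string> local = {
        {"/babar/run1.root", "data02"},
    };
    std::cout << locate("/babar/run1.root", local, "proxy01") << "\n"; // data02
    std::cout << locate("/babar/run9.root", local, "proxy01") << "\n"; // proxy01
}
```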
Why This Arrangement?
• Minimizes cross-domain knowledge
• Necessary for scalability in all areas
  • Security
  • Configuration
  • Fault tolerance & recovery
Scalable Proxy Security
[Diagram: the SLAC proxy olbd and the RAL proxy olbd face each other across the firewall, each with its own set of data servers behind it.]
1. Authenticate & develop a session key
2. Distribute the session key to authenticated subscribers
3. Data servers can log into each other using the session key
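The idea can be illustrated with a deliberately toy sketch: one authentication between the two proxy olbd's yields a shared key, and every subsequent server-to-server login only has to prove knowledge of that key, for example by answering a nonce challenge. The mac function below is a stand-in, not real cryptography, and none of this is the actual xrootd security code.

```cpp
#include <cstdint>
#include <iostream>

// Deliberately trivial keyed hash, standing in for a real MAC.
uint64_t mac(uint64_t key, uint64_t nonce) {
    uint64_t h = key ^ 0x9e3779b97f4a7c15ULL;
    h = (h ^ nonce) * 0xbf58476d1ce4e5b9ULL;
    return h ^ (h >> 31);
}

int main() {
    // Step 1: the two proxy olbd's authenticate and agree on a key.
    uint64_t sessionKey = 0x5eedf00dcafe1234ULL;

    // Step 2: each side hands the key to its authenticated subscribers
    // (the data servers behind its firewall).
    uint64_t slacServerKey = sessionKey;
    uint64_t ralServerKey  = sessionKey;

    // Step 3: a data server logs into a remote one by answering a
    // nonce challenge with the shared key.
    uint64_t challenge = 0x0123456789abcdefULL;
    bool ok = mac(slacServerKey, challenge) == mac(ralServerKey, challenge);
    std::cout << (ok ? "login accepted\n" : "login refused\n");
}
```

The payoff is that only one cross-firewall authentication is needed per site pair, no matter how many data servers sit behind each firewall.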
Proxy Performance
• Introduces minimal latency overhead
  • Virtually undetectable from US to Europe
  • Negligible on faster links
  • 2% slower on fast US/US links
  • 10% slower on a LAN
• Can be further improved (see the sketch below)
  • Parallel streams
  • Better window size calculation
  • Asynchronous I/O
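Of the listed improvements, window size is the most mechanical: on POSIX systems the socket buffer size bounds the usable TCP window, and raising it toward the bandwidth-delay product helps long fat links like US/Europe. A minimal sketch follows; the 1 MB figure is an illustrative choice, not a value from the talk.

```cpp
#include <cstdio>
#include <iostream>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    int bufBytes = 1 << 20;  // 1 MB; tune to the bandwidth-delay product
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &bufBytes, sizeof(bufBytes)) < 0)
        perror("setsockopt(SO_RCVBUF)");
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &bufBytes, sizeof(bufBytes)) < 0)
        perror("setsockopt(SO_SNDBUF)");

    int got = 0; socklen_t len = sizeof(got);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
    std::cout << "receive buffer now " << got << " bytes\n";
    close(fd);
    return 0;
}
```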
Proxy Study Conclusion
• The proxy service integrates easily into xrootd
  • Largely due to the peer-to-peer architecture
• Provides an enhanced service at minimal cost
  • Allows access to additional data sources
  • Increases fault tolerance
  • Covers for grid transfer mistakes
• Scalable in all aspects
  • Security, number of servers, administration
Overall Conclusion
• xrootd provides high-performance file access
  • Improves over afs, ams, nfs, etc.
  • Unique performance, usability, scalability, security, compatibility, and recoverability characteristics
• Should scale to tens of thousands of clients
• Will be distributed as part of CERN's ROOT package
• Open software, supported by
  • SLAC (server)
  • INFN-Padova (client)
  • CERN (security, packaging)