CHEP 2003 Summary: Grid Architecture, Infrastructure, & Middleware; Monitoring & Security — Andrew Hanushevsky, Stanford Linear Accelerator Center
Legal Disclaimer • This summary is from one perspective • It is not representative of any particular view • Other than the presenter’s • This summary is not warranted for any purpose whatsoever • Participants assume all direct and indirect (consequential or inconsequential) damages • Do you want to stay?
Grid Deployment • Track I talks referenced grid “deployment” • Deployment has many meanings • Minimally, if you have it working, it had better be usable • Is it production ready?
Production Grids • LCG experience suggests it is difficult • Packaging, installation, configuration, & validation issues • “These issues (and more) make the difference between the research project ending with a demo and the product to be used for production.” -- Zdenek Sekera • Assume the LCG (T#184) interpretation of production • Harsh, but we need a benchmark
What is “production quality”? • From LCG T#184, it is all of the following, in no particular order: • availability 24 x 7 • performance • stability, robustness • user friendliness • maintainability • user support
So Where Are We? • Let’s take a look at the presented “grid” projects in alphabetical order • From Grid to Grid-Like • Disclaimer! • This is not representative of all such projects
AliEn (M#253) • Distributed environment with a Grid interface • SASL (includes GSI), EDG-compatible authentication • Distributed RDBMS-based file catalog • Condor-like job scheduling • Attempts to unify grid infrastructures • Adopted by MammoGrid (M#66)
Amanda (M#110) • Ostensibly production ready • Condor + Bypasses + local tools (Grid Navigator) • Uses central s/w and data repositories • Runs a specific application software suite • Plans to integrate Globus middleware as it matures
DIRAC (M#253) • Distributed environment • Essentially a roll-your-own grid-like solution • Interface to EDG now in test • EDG stability considered problematic • Successfully deployed at 17 sites
EDG • Workload Management WP1 (M#132 & 137) • Deployed for 18 months • Still at the pre-production stage • Various problems in reliability & scalability • Numerous improvements planned • DAGMan integration • Grid accounting • Resource reservation & co-allocation • Globus GARA approach
EDG (continued) • Data Management WP2 (T#249 & 490) • Basic use cases satisfied • Not proven in a “real user environment” • Pre-production • Numerous additions planned • Logical collections • Enhanced security • Authorization and delegation • OGSA direction with future compliance
NorduGrid (M#109) • Modified/extended Globus + EDG RLS • Pre-production stage • Additional EDG integration as stability improves • Web Services (OGSA) plans
SAM (T#335) • Successful for D0 and CDF • Work under way to integrate with grid middleware • Production D0 release of SAMGrid (JIM + Condor-G) scheduled for April • One of the arguably successful grid-like projects • Largely dealing with data management issues
STAR (T#442) • Distributed environment • Essentially a roll-your-own grid-like solution • Interface to Condor-G • Uses LBL HRM/DRM • Successful (but limited) deployment • NERSC & BNL
Storage Resource Broker (T#211) • Successful deployment across multiple fields • Work under way to integrate with Globus data management • One of the arguably successful grid-like projects • Limited to data management
The Successes • Few projects have achieved “production” status • Those which have are focused and grid-like • SAM & SRB; AliEn, DIRAC, & STAR soon to follow • It is not clear why this is so • Historical timeline? • Immediate need for results? • Funding model? • Grid protocols in flux (e.g., Globus 2 vs Globus 3)? • Open software/collaboration issues? • Sociological phenomena? • Fortunately, many plan to integrate with the “standard” grid • Time will tell….
The Fast Trackers • These projects have only incorporated some grid middleware • Amanda & NorduGrid • Many difficult issues have been avoided, but…. • Are we entering the OSI model of development? • Pick and choose from a bag of protocols & tools • This does not bode well for interoperability
The Simmering • These projects have embraced the grid • EDG (parallels and derivatives) • Problems are not being avoided • Adopted the long-range view (2 or more years) • Will this be to the benefit of the HEP community? • Depends on your view of next-generation computing • It seems that all projects are hedging their bets • You wonder where we would be if all the hundreds of current FTEs were focused on making this model really work
State of Security • Three dominant themes • Private key management • KCA (T#422), VSC etc. (T#81) • Virtual organization management • VOMS (T#317) & GUMS (T#363) • Authorization (a.k.a. access control) • GACL (T#190), SAZ (T#423), Akenti (T#426), CAS (T#441, 518)
Security Convergence • Other than X.509 there is little common ground • But does there need to be any common ground? • Key management is a matter of trust policy • VO administration is a site or multi-lateral prerogative • Authorization is largely a local issue • It seems that if you can agree on the credentials (i.e., X.509 + endorsements), the rest is relegated to collaboration policy, irrespective of implementation (see the sketch below) • This appears to be the direction • Even if it’s not obvious at the moment
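To make that division of labor concrete, here is a minimal, purely hypothetical Python sketch of the split: authentication yields an X.509 subject name, a VO service endorses roles (in the spirit of VOMS/GUMS), and the final decision is local site policy (in the spirit of GACL/SAZ). The tables, names, and actions are invented for illustration; no cited project works exactly this way.

```python
# Hypothetical sketch: X.509 subject + VO endorsements in, local policy decision out.
# VO_ENDORSEMENTS and SITE_POLICY are illustrative stand-ins, not any project's API.

# VO membership service output: subject DN -> VO -> endorsed roles (cf. VOMS/GUMS)
VO_ENDORSEMENTS = {
    "/DC=org/DC=grid/CN=Jane Physicist": {"atlas": ["production"]},
}

# Purely local authorization policy: (VO, role) -> permitted actions (cf. GACL/SAZ)
SITE_POLICY = {
    ("atlas", "production"): {"submit-job", "read-data"},
}

def authorize(subject_dn: str, vo: str, action: str) -> bool:
    """Grant if any VO-endorsed role for this subject permits the action locally."""
    roles = VO_ENDORSEMENTS.get(subject_dn, {}).get(vo, [])
    return any(action in SITE_POLICY.get((vo, role), set()) for role in roles)

if __name__ == "__main__":
    dn = "/DC=org/DC=grid/CN=Jane Physicist"
    print(authorize(dn, "atlas", "submit-job"))   # True: endorsed role permits it
    print(authorize(dn, "atlas", "reboot-node"))  # False: no local policy grants it
```

The point of the sketch is that the only shared artifact is the credential (subject DN plus endorsements); both tables can differ per VO and per site without breaking interoperability.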
Grid Monitoring • There is much activity • Much of it overlapping • BOSS (M#84), GMA (M#403), GridMonitor (M#321), MonALISA (M#103), PerfMC (M#522), & R-GMA (M#407) • Some convergence • Minimum set of events • Format (XML, yet no “lingua franca” agreement; see the sketch below) • This is an area to watch! • GGF is likely the stomping ground for agreement
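The “XML but no lingua franca” problem can be shown in a few lines. The two dialects below are invented for illustration (they do not reproduce any of the cited systems): both are well-formed XML describing the same monitoring event, yet a consumer still needs per-dialect extraction code because the tag names and structure differ.

```python
# Two hypothetical XML encodings of the same monitoring event: shared syntax,
# but no shared language, so each dialect needs its own parsing rule.
import xml.etree.ElementTree as ET

DIALECT_A = '<event type="job-start" host="wn01.site.org" t="1049000000"/>'
DIALECT_B = """<monitorRecord>
  <what>job-start</what><where>wn01.site.org</where><when>1049000000</when>
</monitorRecord>"""

def host_of(xml_text: str) -> str:
    """Extract the host name -- note the dialect-specific branches."""
    root = ET.fromstring(xml_text)
    if root.tag == "event":
        return root.attrib["host"]      # dialect A: attribute
    return root.findtext("where")       # dialect B: child element

print(host_of(DIALECT_A), host_of(DIALECT_B))  # same answer, two code paths
```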
The Ultimate Highlights • Virtual Data • XML • Distributed File Systems • Job Scheduling • Peer to Peer Computing • “The” Award
The Innovation Most At Risk • Virtual Data (T#106 & 114) • Great concept at technological mercy • The OptIPuter is the menace • Consider…. • Unlimited bandwidth • Ever-decreasing storage costs • Constant software changes • Sociological problems of capturing the processing path • Together these may make VD untenable (a back-of-the-envelope sketch follows)
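A back-of-the-envelope sketch of the tension: virtual data pays off only while re-deriving a dataset is cheaper than keeping or moving the materialized copy. Every number and parameter below is an illustrative assumption, not a measured cost.

```python
# Toy cost comparison behind the "VD untenable" argument. All figures invented.

def cheaper_to_rederive(size_gb: float, cpu_hours: float,
                        storage_cost_per_gb: float = 0.01,   # falling over time
                        network_gb_per_hour: float = 500.0,  # rising over time
                        cpu_cost_per_hour: float = 0.05) -> bool:
    """True if recomputing the data beats storing/fetching the materialized copy."""
    rederive_cost = cpu_hours * cpu_cost_per_hour
    keep_cost = size_gb * storage_cost_per_gb + size_gb / network_gb_per_hour
    return rederive_cost < keep_cost

# As storage_cost_per_gb -> 0 and network_gb_per_hour -> infinity (the OptIPuter
# scenario), keep_cost vanishes and recomputation almost never wins.
print(cheaper_to_rederive(size_gb=100, cpu_hours=2))  # True with today's toy numbers
```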
Things to Watch For I • XML • This is rapidly becoming the common syntax • Yet little effort in developing a common language • Assumption, perhaps misguided, that WSDL repositories will address the problem • Diamonds (iKnow) architecture (Java RMI + JINI) • Distributed Grid File Systems • Minimal data movement with global access • AliEnFS (R#254) • There are many others that were not presented
Things to Watch For II • Job to Data Scheduling • Algorithms to place a job near the data • Minimize data movement (see the sketch below) • Peer To Peer Computing • Marxist scheduling aiming for 100% utilization • Not yet addressed by current grid architectures • Ad hoc protocols • Subversive in that this may be the “real” next thing • Augernome (R#293)
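Here is a minimal sketch of what job-to-data scheduling means in practice: rank candidate sites by how much of the job’s input is already resident there, so the job moves instead of the data. The replica catalog, site names, and file names are all hypothetical.

```python
# Hypothetical job-to-data placement: send the job to the site that already
# holds the largest share of its input, minimizing data movement.

REPLICA_CATALOG = {            # logical file -> {site: replica size in GB}
    "run42.raw": {"cern": 120, "fnal": 120},
    "calib.db":  {"fnal": 2},
}

def best_site(inputs, sites):
    """Pick the candidate site holding the most input bytes for this job."""
    def resident_gb(site):
        return sum(REPLICA_CATALOG.get(f, {}).get(site, 0) for f in inputs)
    return max(sites, key=resident_gb)

print(best_site(["run42.raw", "calib.db"], ["cern", "fnal", "ral"]))  # fnal
```

A real scheduler would weigh queue depth, CPU availability, and transfer bandwidth alongside data locality; the sketch shows only the locality term.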
Summarizer’s Award • The project that makes innovative yet practical use of existing grid protocols • Grid Brick (R#493) • Parallel ROOT-based query using Globus scheduling • Uncomplicated and practical needs-based approach • It’s so obvious you wonder why you didn’t do it first • It works within a standard grid environment! • Load balancing and fault tolerance to be explored (a scatter-gather sketch follows)
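A hedged sketch of the idea as summarized above: fan a query out over independent data “bricks,” each scanning its local slice, then merge the partial results. The brick contents and the cut query are invented here; Grid Brick itself dispatches ROOT queries via Globus scheduling rather than local threads.

```python
# Scatter-gather query over data "bricks": each brick answers locally, the
# caller merges. Event data and the pt cut are illustrative only.
from concurrent.futures import ThreadPoolExecutor

BRICKS = [  # each brick holds a disjoint slice of the event data
    [{"pt": 12.0}, {"pt": 48.5}],
    [{"pt": 33.1}, {"pt": 7.2}],
    [{"pt": 91.4}],
]

def scan(brick, cut):
    """Local query: count events on one brick passing a pt cut."""
    return sum(1 for ev in brick if ev["pt"] > cut)

with ThreadPoolExecutor() as pool:          # stand-in for remote scheduling
    total = sum(pool.map(lambda b: scan(b, cut=30.0), BRICKS))
print(total)  # merged result: 3 events pass the cut
```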
Conclusions • Grid efforts are still meandering • Great for innovation • Dismal for standardization • Security is a bright spot • Rapid convergence on authentication issues • Authorization is more fuss than fury • There is a light at the end of the tunnel • Monitoring situation is disappointing • The need is recognized but no agreement on how to proceed • Cross-grid monitoring is in serious jeopardy