
CHEP 2003 Summary Grid Architecture, Infrastructure, & Middleware Monitoring & Security

  1. CHEP 2003 Summary: Grid Architecture, Infrastructure, & Middleware; Monitoring & Security. Andrew Hanushevsky, Stanford Linear Accelerator Center

  2. Legal Disclaimer • This summary is from one perspective • It is not representative of any particular view • Other than the presenter’s • This summary is not warranted for any purpose whatsoever • Participants assume all direct and indirect (consequential or inconsequential) damages • Do you want to stay?

  3. Grid Deployment • Track I talks referenced grid “deployment” • Deployment has many meanings • Minimally, if you have it working it had better be usable • Is it production ready?

  4. Production Grids • LCG Experience Suggests It Is Difficult • Packaging, Installation, Configuration, & Validation Issues • “These issues (and more) make the difference between the research project ending with a demo and the product to be used for production.” -- Zdenek Sekera • Assume LCG (T#184) interpretation of production • Harsh, but we need a benchmark

  5. What is “production quality”? • It is all of the following, in no particular order: availability 24 x 7, performance, stability, robustness, user friendliness, maintainability, user support • From LCG T#184

  6. So Where Are We? • Let’s take a look at presented “grid” projects in alphabetical order • From Grid to Grid-Like • Disclaimer! • This is not representative of all such projects

  7. AliEn (M#253) • Distributed environment with Grid interface • SASL (includes GSI) EDG-compatible authentication • Distributed RDBMS-based file catalog • Condor-like job scheduling • Attempts to unify grid infrastructures • Adopted by MammoGrid (M#66)
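
The slide above mentions a distributed RDBMS-based file catalog. As a rough illustration of what such a catalog provides, here is a minimal Python sketch mapping logical file names to physical replicas; the schema, table name, and example LFNs and storage elements are invented for illustration and are not AliEn's actual design.

```python
# Minimal sketch of an RDBMS-backed file catalog: logical file names (LFNs)
# map to one or more physical replicas on storage elements (SEs).
# Schema and example entries are hypothetical, not AliEn's real layout.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE replicas (
                    lfn TEXT,   -- logical file name
                    se  TEXT,   -- storage element holding a copy
                    pfn TEXT    -- physical file name (URL) at that SE
                )""")

def register(lfn, se, pfn):
    """Record one physical replica of a logical file."""
    conn.execute("INSERT INTO replicas VALUES (?, ?, ?)", (lfn, se, pfn))

def lookup(lfn):
    """Return all known (se, pfn) copies of a logical file."""
    return conn.execute("SELECT se, pfn FROM replicas WHERE lfn = ?",
                        (lfn,)).fetchall()

register("/demo/sim/run42/hits.root", "SITE-A-SE",
         "srm://se.site-a.example/demo/run42/hits.root")
register("/demo/sim/run42/hits.root", "SITE-B-SE",
         "srm://se.site-b.example/demo/run42/hits.root")
print(lookup("/demo/sim/run42/hits.root"))
```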

  8. Amanda (M#110) • Ostensibly production ready • Condor + Bypasses + Local Tools (Grid Navigator) • Uses central s/w and data repositories • Runs a specific application software suite • Plan to integrate Globus middleware as it matures

  9. DIRAC (M#253) • Distributed environment • Essentially a roll-your-own grid-like solution • Interface to EDG now in test • EDG stability considered problematic • Successfully deployed at 17 sites

  10. EDG • Workload Management WP1 (M#132 & 137) • Deployed for 18 months • Still pre-production stage • Various problems in reliability & scalability • Numerous improvements planned • DAGMan integration • Grid Accounting • Resource reservation & co-allocation • Globus GARA Approach

  11. EDG (continued) • Data Management WP2 (T#249 & 490) • Basic use cases satisfied • Not proven in a “real user environment” • Pre-production • Numerous additions planned • Logical collection • Enhanced security • Authorization and delegation • OGSA direction with future compliance

  12. NorduGrid (M#109) • Modified/Extended Globus + EDG RLS • Pre-production stage • Additional EDG integration as stability improves • Web Services (OGSA) plans

  13. SAM (T#335) • Successful for D0 and CDF • Work under way to integrate with grid middleware • Production D0 release of SAMGrid (JIM + Condor-G) scheduled for April • One of the arguably successful grid-like projects • Largely dealing with data management issues

  14. STAR (T#442) • Distributed environment • Essentially a roll-your-own grid-like solution • Interface to Condor-G • Uses LBL HRM/DRM • Successful (but limited) deployment • NERSC & BNL

  15. Storage Resource Broker (T#211) • Successful deployment across multiple fields • Work underway to integrate with Globus data management • One of the arguably successful grid-like projects • Limited to data management

  16. The Successes • Few projects have achieved “production” status • Those which have are focused and grid-like • SAM & SRB; soon to follow: AliEn, DIRAC, & STAR • It is not clear why this is so • Historical timeline? • Immediate need for results? • Funding model? • Grid protocols in flux (e.g., Globus 2 vs Globus 3)? • Open software/collaboration issues? • Sociological phenomena? • Fortunately many plan to integrate with the “standard” grid • Time will tell…

  17. The Fast Trackers • These projects have only incorporated some grid middleware • Amanda & NorduGrid • Many difficult issues have been avoided, but… • Are we entering the OSI model of development? • Pick and choose from a bag of protocols & tools • This does not bode well for interoperability

  18. The Simmering • These projects have embraced the grid • EDG (parallels and derivatives) • Problems not being avoided • Adopted the long-range view (2 or more years) • Will this be to the benefit of the HEP community? • Depends on your point of view of next-generation computing • It seems that all projects are hedging their bets • You wonder where we would be if all the hundreds of current FTEs were focused on making this model really work

  19. State of Security • Three dominant themes • Private Key Management • KCA (T#422), VSC etc. (T#81) • Virtual Organization Management • VOMS (T#317) & GUMS (T#363) • Authorization (a.k.a. Access Control) • GACL (T#190), SAZ (T#423), Akenti (T#426), CAS (T#441, 518)

  20. Security Convergence • Other than X.509 there is little common ground • But does there need to be any common ground? • Key management is a matter of trust policy • VO administration is a site or multilateral prerogative • Authorization is largely a local issue • It seems that if you can agree on the credentials (i.e., X.509 + endorsements), the rest is relegated to collaboration policy irrespective of implementation • This appears to be the direction • Even if it’s not obvious at the moment
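
To make the slide's point concrete: if the only grid-wide agreement is on the credential (an X.509 subject plus VO endorsements), then authorization reduces to each site applying its own policy to those fields. The toy sketch below illustrates that split; the DNs, VO names, and policy table are hypothetical stand-ins for what a grid-mapfile or a VOMS/GUMS-style service actually encodes.

```python
# Toy illustration of "agree on credentials, keep authorization local":
# the grid-wide inputs are just an X.509 subject DN and VO endorsements;
# the mapping to a local account is purely site policy. All DNs, VO names,
# and rules below are made up.

def authorize(subject_dn, vo_attributes, site_policy):
    """Return the local account to map the user to, or None to deny."""
    for rule in site_policy:
        if rule.get("dn") == subject_dn or rule.get("vo") in vo_attributes:
            return rule["local_account"]
    return None  # no local rule matched

# Local policy, e.g. what a grid-mapfile or a VO-mapping service would hold.
site_policy = [
    {"dn": "/DC=org/DC=example/CN=Site Admin", "local_account": "gridadm"},
    {"vo": "alice", "local_account": "aliprod"},
    {"vo": "cms",   "local_account": "cmsusr"},
]

print(authorize("/DC=org/DC=example/CN=Some User", ["cms"], site_policy))    # cmsusr
print(authorize("/DC=org/DC=example/CN=Some User", ["atlas"], site_policy))  # None
```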

  21. Grid Monitoring • There is much activity • Much of it overlapping • BOSS (M#84), GMA (M#403), GridMonitor (M#321), MonALISA (M#103), PerfMC (M#522), & R-GMA (M#407) • Some convergence • Minimum set of events • Format (XML, yet no “lingua franca” agreement) • This is an area to watch! • GGF is likely the stomping ground for agreement
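
Since the slide notes convergence on XML as the format but no agreed "lingua franca" of events, here is a minimal, hypothetical monitoring event (not taken from any of the cited tools) and how a consumer might parse it with the Python standard library, just to show the kind of minimum event set under discussion.

```python
# A hypothetical minimal monitoring event in XML, parsed with the standard
# library. The element and attribute names are invented; no cited tool
# (BOSS, MonALISA, R-GMA, ...) is being quoted here.
import xml.etree.ElementTree as ET

event_xml = """
<event type="job.state" timestamp="2003-03-28T12:00:00Z">
  <source site="EXAMPLE-SITE" component="batch"/>
  <job id="job-0042" state="running"/>
  <metric name="cpu_load" value="0.87"/>
</event>
"""

root = ET.fromstring(event_xml)
print(root.get("type"), root.get("timestamp"))
print(root.find("job").get("id"), "->", root.find("job").get("state"))
print(root.find("metric").get("name"), "=", float(root.find("metric").get("value")))
```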

  22. The Ultimate Highlights • Virtual Data • XML • Distributed File Systems • Job Scheduling • Peer to Peer Computing • “The” Award

  23. The Innovation Most At Risk • Virtual Data (T#106 & 114) • Great concept at the mercy of technology • The OptIPuter is the menace • Consider… • Unlimited bandwidth • Ever-decreasing storage costs • Constant software changes • Sociological problems of capturing the processing path • Together these may make VD untenable
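
The "processing path" the slide worries about is essentially a provenance record: enough about the transformation, its arguments, and its inputs to re-derive an output instead of storing it. A toy record of that shape is sketched below; it is only illustrative and is not the representation used by the cited virtual data talks.

```python
# Toy provenance record for virtual data: capture how an output was derived
# so it could be re-materialized on demand. Names and values are invented.
from dataclasses import dataclass, field

@dataclass
class Derivation:
    output: str                          # logical name of the derived file
    transformation: str                  # program (and version) that produced it
    arguments: list = field(default_factory=list)
    inputs: list = field(default_factory=list)

    def recipe(self) -> str:
        """Human-readable summary of how to reproduce the output."""
        return " ".join([self.transformation, *self.arguments, *self.inputs])

d = Derivation(output="lfn:/demo/aod/run42.aod",
               transformation="reco-v3.1.4",
               arguments=["--calib", "2003-02"],
               inputs=["lfn:/demo/raw/run42.raw"])
print(d.output, "<=", d.recipe())
```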

  24. Things to Watch For I • XML • This is rapidly becoming the common syntax • Yet little effort in developing a common language • Assumption, perhaps misguided, that WSDL repositories will address the problem • Diamonds (iKnow) architecture (Java RMI + JINI) • Distributed Grid File Systems • Minimal data movement with global access • AliEnFS (R#254) • There are many others that were not presented

  25. Things to Watch For II • Job to Data Scheduling • Algorithms to place a job near the data • Minimize data movement • Peer to Peer Computing • Marxist scheduling aiming for 100% utilization • Not yet addressed by current grid architectures • Ad hoc protocols • Subversive in that this may be the “real” next thing • Augernome (R#293)
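
As a concrete reading of "place a job near the data", the sketch below ranks candidate sites by how many bytes of the job's input would have to be staged in and picks the cheapest. The sites, file names, and sizes are invented; real schedulers weigh many more factors (queue depth, CPU availability, policy).

```python
# Data-locality-aware placement in miniature: prefer the site that already
# holds the largest share (by size) of the job's input files.

def best_site(input_files, replica_map, file_sizes):
    """Pick the site that minimizes the bytes that would need to be staged in."""
    sites = {s for held_at in replica_map.values() for s in held_at}
    def bytes_to_move(site):
        return sum(file_sizes[f] for f in input_files
                   if site not in replica_map.get(f, ()))
    return min(sites, key=bytes_to_move)

# Hypothetical replica locations and file sizes (bytes).
replica_map = {"run42.raw": {"SITE-A", "SITE-B"},
               "calib.db":  {"SITE-A"},
               "geom.xml":  {"SITE-B", "SITE-C"}}
file_sizes  = {"run42.raw": 2_000_000_000, "calib.db": 50_000_000, "geom.xml": 1_000_000}

print(best_site(["run42.raw", "calib.db", "geom.xml"], replica_map, file_sizes))  # SITE-A
```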

  26. Summarizer’s Award • The project that makes innovative yet practical use of existing grid protocols • Grid Brick (R#493) • Parallel ROOT-based query using Globus scheduling • Uncomplicated and practical needs-based approach • It’s so obvious you wonder why you didn’t do it first • It works within a standard grid environment! • Load balancing and fault tolerance to be explored
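
As summarized above, the Grid Brick approach runs the query on every "brick" holding a slice of the data and merges the partial results. The scatter/gather sketch below shows only that shape; the brick names, the stand-in query_brick function, and the hard-coded counts are placeholders, not the actual Grid Brick or ROOT machinery.

```python
# Scatter/gather sketch: run the same selection on every "brick" in parallel
# and merge the partial results. query_brick is a stand-in that returns
# canned counts; a real system would dispatch jobs to each brick.
from concurrent.futures import ThreadPoolExecutor

BRICKS = ["brick01", "brick02", "brick03"]  # hypothetical data servers

def query_brick(brick: str, selection: str) -> int:
    """Stand-in for running the selection on one brick and counting hits."""
    canned = {"brick01": 120, "brick02": 98, "brick03": 143}
    return canned[brick]

def parallel_query(selection: str) -> int:
    with ThreadPoolExecutor(max_workers=len(BRICKS)) as pool:
        partials = pool.map(lambda b: query_brick(b, selection), BRICKS)
        return sum(partials)

print(parallel_query("pt > 2.0"))  # 361
```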

  27. Conclusions • Grid efforts are still meandering • Great for innovation • Dismal for standardization • Security is a bright spot • Rapid convergence on authentication issues • Authorization is more fuss than fury • There is a light at the end of the tunnel • Monitoring situation is disappointing • The need is recognized but no agreement on how to proceed • Cross-grid monitoring is in serious jeopardy
