SAM-Grid Middleware
Rod Walker, ICL
• SAM
• JIM
• RunJob
• Conclusions
http://d0db.fnal.gov/sam
SAM stands for “Sequential Access to Data via Metadata”.
Access is sequential within files; the order of the files themselves isn’t important, e.g. HEP data.
• History of SAM
• Project started in 1997 by the FNAL Computing Division (not just physicists).
• Meant for FNAL experiments, and recently taken up by CDF.
• So far ~20 FTE years: a lot of effort.
State of the art in data management
• No one else has tried to deliver TBs of user-selected data on demand.
Global file routing
• Many remote stations want files.
• SAM allowed a free-for-all to the GridFTP server.
• MSS access only from the FNAL site, cache on a private network, ...
• Needed control and routing.
• Solution: all sites can route files, e.g.
• Get fnal files from fnal-router.
• route=fnal.gov::nijmegen, and the nijmegen station has route=fnal.gov::fnal-router
• Janet – Geant – Esnet – FNAL: 155 Mbit bottleneck.
• Janet – Geant – Surfnet – FNAL: Gbit(?)
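The chained route settings above can be illustrated with a minimal sketch. It is not SAM's code: it only assumes that a setting of the form `domain::station` means "fetch files from that domain via that station", and that a station with no route for the domain fetches directly. The station name `imperial` and the functions `parse_route` and `resolve_route` are hypothetical.

```python
def parse_route(setting):
    """Split a 'domain::next_station' setting into its two parts."""
    domain, separator, next_station = setting.partition("::")
    if not separator or not domain or not next_station:
        raise ValueError(f"malformed route setting: {setting!r}")
    return domain, next_station


def resolve_route(routes, station, source_domain):
    """Follow per-station routes until a station has no route for the domain.

    `routes` maps station name -> {source domain -> next station}.
    Returns the list of stations from the requester to the last hop.
    """
    path = [station]
    current = station
    while source_domain in routes.get(current, {}):
        next_station = routes[current][source_domain]
        if next_station in path:
            raise ValueError(f"routing loop at {next_station!r}")
        path.append(next_station)
        current = next_station
    return path


# Hypothetical configuration mirroring the slide's example.
routes = {
    "imperial": dict([parse_route("fnal.gov::nijmegen")]),
    "nijmegen": dict([parse_route("fnal.gov::fnal-router")]),
}
print(resolve_route(routes, "imperial", "fnal.gov"))
```

Files would travel along the returned path in reverse, from the last hop back to the requesting station.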
SAM Status
• Middleware development
• Global routing.
• Diverse deployments, e.g. private network, firewall, shared vs local disk cache.
• CDF deployment – GridPP
• Bug fixes.
• GridFTP and authentication – GridPP
• Outlook
• Decreasing development. FNAL CD support for RunII.
JIM history
• Purpose: to build on SAM’s data handling, to create a real grid.
• Job definition & management
• Information & monitoring
• Novel concepts
• Already have a DH system.
• ups/upd packaging and deployment.
• rpm functionality plus multi-platform support and tailoring.
• Little dependence on the native installation, e.g. python v2.1f
• Hugely simplified deployment.
• Use Condor as resource broker.
JIM components
• User Interface
• Job definition language based on ClassAds.
• RB reduced to making the MMS ranking function.
• Static & dynamic constraints: OS, code version, free CPU, …
• Plus an external function to query the DH system.
• Collaboration with Wisconsin.
• Choose gatekeeper, use external function, separate submission server from negotiator.
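The brokering idea on this slide (filter on constraints, then rank with an external data-handling query) can be sketched in plain Python. This does not use Condor or ClassAd syntax; the attribute names, the site records and the `cached_fraction` callback are all assumptions made for illustration.

```python
def match_sites(job, sites, cached_fraction):
    """Return the sites that satisfy the job's constraints, best-ranked first.

    `cached_fraction(site_name, dataset)` stands in for the external
    function that queries the data handling system; here it is assumed
    to return the fraction of the dataset already cached at the site.
    """
    eligible = [
        site for site in sites
        if site["os"] == job["os"]                        # static constraint
        and job["code_version"] in site["code_versions"]  # static constraint
        and site["free_cpus"] >= job["min_cpus"]          # dynamic constraint
    ]
    return sorted(
        eligible,
        key=lambda site: cached_fraction(site["name"], job["dataset"]),
        reverse=True,
    )


# Hypothetical sites and job, for illustration only.
sites = [
    {"name": "site-a", "os": "linux", "code_versions": {"v1"}, "free_cpus": 10},
    {"name": "site-b", "os": "linux", "code_versions": {"v1", "v2"}, "free_cpus": 4},
    {"name": "site-c", "os": "irix", "code_versions": {"v1"}, "free_cpus": 50},
]
job = {"os": "linux", "code_version": "v1", "min_cpus": 2, "dataset": "my-dataset"}
cache = {"site-a": 0.2, "site-b": 0.9, "site-c": 1.0}

ranked = match_sites(job, sites, lambda name, dataset: cache[name])
print([site["name"] for site in ranked])
```

In this sketch site-c is excluded by the OS constraint even though it caches the whole dataset, which is the point of applying constraints before ranking.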
JIM components
• Information & Monitoring
• Currently: grid sensors > ldap > MDS > PHP
• Developing: grid sensors > xml > native Db > PHP, other.
• Reliability, flexibility, persistency.
• Same model works for grid system book-keeping and user-level monitoring.
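As a rough illustration of the "grid sensors > xml" step, a sensor reading could be serialized to XML and read back as follows. The element and attribute names are invented for this sketch and are not JIM's schema; the database and PHP stages are not shown.

```python
import xml.etree.ElementTree as ET


def reading_to_xml(site, readings):
    """Serialize one sensor reading (a dict of attributes) to an XML string."""
    root = ET.Element("site", name=site)
    for key, value in sorted(readings.items()):
        ET.SubElement(root, "attribute", name=key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_reading(document):
    """Parse the XML back into (site name, attribute dict); values are strings."""
    root = ET.fromstring(document)
    readings = {node.get("name"): node.text for node in root.findall("attribute")}
    return root.get("name"), readings


document = reading_to_xml("site-a", {"free_cpus": 12, "os": "linux"})
print(document)
print(xml_to_reading(document))
```

Because the same document can be stored and re-read by any consumer, one format can serve both system book-keeping and user-level monitoring, as the slide suggests.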
[Diagram. Labels shown: JDL, ClassAd, Condor-G, GRAM, Information and Monitoring, Information Flow, Cin, Cout, User Interface, Parser, Condor Schedd, Condor Negotiator, Condor Collector, External Code, Condor Grid Manager, Gatekeeper, Batch System, Grid Sensors, Compute Resource, Execution Site.]
RunJob
• Vital tool for d0 MC productions on farms.
• Chains, steers and parallelizes d0 executables. Creates metadata. Uses SAM to store to MSS.
• Now interfaced to SAM for input, and can handle real data and any d0 executables.
• Will be used for skimming, re-processing datasets, and user analysis.
• Fully automate monitoring, checking and storage.
• Work underway by UK.
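The "chains executables and creates metadata" idea can be sketched generically. This is not RunJob's interface: the stages below are plain Python functions standing in for executables, and the stage names and file-name suffixes are hypothetical.

```python
def run_chain(stages, first_input):
    """Run stages in order, feeding each output to the next stage.

    `stages` is a list of (name, callable) pairs. Returns the final output
    and one metadata record per stage, describing what it consumed and produced.
    """
    metadata = []
    current = first_input
    for name, stage in stages:
        output = stage(current)
        metadata.append({"stage": name, "input": current, "output": output})
        current = output
    return current, metadata


# Hypothetical three-step production chain operating on file names.
stages = [
    ("generate", lambda name: name + ".gen"),
    ("simulate", lambda name: name + ".sim"),
    ("reconstruct", lambda name: name + ".reco"),
]
final_output, records = run_chain(stages, "run1")
print(final_output)
for record in records:
    print(record)
```

The per-stage records are the kind of information that would be declared as metadata when the final output is stored.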
RunJob status
• Maintenance & development of RunJob, and of its interface to SAM-Grid, entirely by UK.
• CMS using a branch of RunJob for production.
• Dave Evans and Greg Graham collaborating on merging the branches.
• Goal: single package with EDG and SAM-Grid interfaces.
• RunJob “server” or job-manager.
[Diagram: SAM-Grid Logistics. Labels shown: User Interface, Submission, Global Job Queue, Resource Selector, Grid Client, Match Making, Info Gatherer, Info Manager, Info Collector, Global DH Services, SAM Naming Server, Site Data Handling, Local Job Handling, Cluster, XML DB server, SAM Log Server, Site Conf., Grid Gateway, SAM Station (+other servs), Glob/Loc JID map, Resource Optimizer, Local Job Handler (CAF, RunJob, Vanilla, ...), SAM DB Server, Web Serv, SAM Stager(s), MDS, JIM Advertise, RC MetaData Catalog, Grid Monitoring, Info Providers, Worker Nodes, Bookkeeping Service, Cache, MSS, User Tools, Dist.FS, AAA, Site.]
Conclusions
• Core SAM supported by FNAL CD.
• Operational support via software shifts.
• UK currently contributes 2 experts on shift.
• JIM post-development support:
• bug fixing, deployment issues (like SAM).
• will need software support shifts.
• RunJob is and will be UK supported.
• Expanding functionality – analysis, reprocessing.
• Increasing deployment – d0 sites, CMS.
• On target for end-March deliverable, and production Grid in April.
JIM V1: Package dependencies
[Diagram. Packages shown: samgrid, jim_broker_client, jim_client, sam_common, xml_meta_configurator, sam_config, server_run, jim_broker, jim_info_providers, jim_advertise, galax, orbacus, jim_www, globus, jim_jobmanagers, jim_sandbox.]