SAM Resource Management
Lee Lueking
CHEP 2001, September 3-8, 2001, Beijing, China
Intro to SAM
• SAM is Sequential Access to data via Meta-data.
• The project was started in 1997 to handle D0's needs for the Run II data system.
• The current SAM team includes: Lauri Loebel-Carpenter, Lee Lueking*, Carmenita Moore, Igor Terekhov, Julie Trumbo, Sinisa Veseli, Matthew Vranicar, Stephen P. White, Victoria White* (*project leaders).
• http://d0db.fnal.gov/sam
Overview
• Goals of Resource Management
• Users, Groups and Access Modes
• Resources and Resource Management Strategies
• Implementation
• System Configuration
• Rules and Policies
• Disk Cache Management
• Fair Share Scheduling
• Resource Co-allocation
• Plans and Conclusion
Goals of Resource Management
• Implement experiment policies on prioritization and fair sharing of resource usage, by user category (access mode, research group, etc.).
• Maximize throughput in terms of real work done (i.e., user jobs, not internal system jobs such as data transfers).
Groups
• Users whose datasets, processing styles, and goals are largely shared.
• Defined by:
  • physics topics, such as Higgs, Top, W/Z, B, QCD, and New Phenomena;
  • detector elements, such as calorimeter, silicon tracking, muon, and so on;
  • particle identification, such as jets, electrons, muons, and taus.
• Users must be registered, and each individual may belong to many groups.
Access Modes
• Storage
  • Data acquisition storage
  • Monte Carlo data storage
  • General user data storage
• Delivery
  • Frequently accessed data
  • Cooperative access and processing
  • Data file delivery on demand
  • Random access event selection
Resources
• Tape mounts
• Tape volume access
• Tape drive usage
• Network throughput
• Disk cache
• Processing CPU
• Memory cache
Management Strategies
• Divide the problem into a 3-tier hierarchy: Local (station), Site, Global.
• Hardware configuration: Mass Storage System (ATL) access, network, disk assignments.
• Establish rules: group allocations, access mode priorities, data routing paths, type of processing, etc.
• Algorithms to combine the rules.
The Hierarchy of Resource Managers
[Diagram] Three tiers: the Global RM applies experiment policies, fair-share allocations, and cost metrics across sites connected by WAN; a Site RM manages the stations and MSS's connected by LANs within a site; each station runs a Local RM that manages its batch queues and disks.
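To make the tiering concrete, here is a minimal sketch assuming a simple delegation model in which each tier applies its own policy and hands the job down. The class names, methods, and first-fit placement rule are hypothetical illustrations, not SAM's actual interfaces.

    class StationRM:                        # local tier: one station's resources
        def __init__(self, name, cache_gb):
            self.name, self.cache_gb = name, cache_gb

        def can_run(self, job):
            return job["cache_need_gb"] <= self.cache_gb

    class SiteRM:                           # site tier: stations on one LAN
        def __init__(self, stations):
            self.stations = stations

        def place(self, job):
            for s in self.stations:         # first-fit for brevity; a real
                if s.can_run(job):          # policy would weigh queues, cost, etc.
                    return s
            return None

    class GlobalRM:                         # global tier: sites on the WAN
        def __init__(self, sites):
            self.sites = sites

        def place(self, job):
            for site in self.sites:         # experiment-wide policy would go here
                station = site.place(job)
                if station:
                    return station
            return None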
Implementation
Overview of SAM
[Diagram] Globally shared components: name server, database server(s) connected to the central DB, site or global resource manager(s), and a log server. Locally shared components: the servers of stations 1 through n and the Mass Storage System(s). Arrows in the diagram indicate control and data flow.
The SAM Station
• Responsibilities:
  • Cache management
  • Project (job) management
  • Movement of data files to/from the MSS or other stations
• Consists of a set of inter-communicating servers:
  • Station Master Server
  • File Storage Server
  • File Stager(s)
  • Project (Job) Manager(s)
Components of a SAM Station
[Diagram] Producers/consumers and the Project Managers interact with the Station & Cache Manager; the File Storage Server, File Storage Clients, File Stager(s), and eworkers move files between the temp disk, the cache disk, and the MSS or other stations. Arrows distinguish data flow from control.
Station Configuration
• Disks assigned to the cache
• Batch system used
• Batch queues available
• Batch queue depth
• Processing capacity: CPU and physical memory
• Mass Storage Systems available
• Inter-station transfer mechanism: bbftp, rcp
• Disk accessibility for a distributed cluster
• Network connection, bandwidth, and subnet for each machine
• Security issues, access to Kerberos tickets, etc.
• Waits, timeouts, and retries on failure conditions
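For concreteness, a station's configuration might be captured as below. This is only a sketch with assumed key names and values, not SAM's actual configuration schema.

    # Hypothetical station configuration; every key name here is an assumption.
    station_config = {
        "cache_disks":     ["/sam/cache1", "/sam/cache2"],
        "batch_system":    "lsf",
        "batch_queues":    {"short": 50, "long": 20},   # queue -> depth
        "cpu_count":       64,
        "memory_gb":       32,
        "mass_storage":    ["enstore"],
        "transfer_method": "bbftp",                     # or "rcp"
        "kerberos":        {"ticket_cache": "/tmp/krb5cc_sam"},
        "retry_policy":    {"timeout_s": 300, "max_retries": 3},
    }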
Rules and Policies
• Disk cache allocated to each group
• Disk cache refresh algorithm for each group: LRU, FIFO, etc.
• Minimum amount of data to deliver at a time from each tape for a project
• Order in which files are brought into the cache
• Through which station files are routed when retrieving from a particular Mass Storage System
• Which data access activities have the highest priority
• Which data storing activities have the highest priority
• To which MSS's files are stored, and to which tapes
• Sharing of a station's resources among groups
• Which users belong to which groups
• How many projects per group are allowed
• What processing activities are allowed on each station*
• To which stations data access and processing activities should be sent*
• How the resources of a local cluster of stations should be shared among groups*
(* currently done by administrators)
Station Management
• Caches
  • Allocations established for groups on each station.
  • Resources are allocated by group:
    • Total size
    • Lock (pin) size
    • Refresh algorithm: LRU, FIFO, ...
  • No rigid assignment to particular physical disks.
• Projects
  • Number of concurrent projects for each group, on each station.
• Administration is by authorized users only:
  • Station admins
  • Group admins
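The per-group quotas, lock quotas, and refresh algorithm can be illustrated with a small sketch. The class below is invented for this example (LRU only) and does not reproduce the station's actual cache manager.

    from collections import OrderedDict

    class GroupCache:
        # Per-group cache accounting: LRU refresh, pinned files never evicted.
        def __init__(self, quota_bytes, lock_quota_bytes):
            self.quota, self.lock_quota = quota_bytes, lock_quota_bytes
            self.files = OrderedDict()      # file -> (size, locked), LRU order
            self.used = self.locked = 0

        def touch(self, name):
            self.files.move_to_end(name)    # mark as most recently used

        def add(self, name, size, locked=False):
            if locked and self.locked + size > self.lock_quota:
                raise RuntimeError("lock quota exceeded")
            while self.used + size > self.quota:
                self._evict_lru()
            self.files[name] = (size, locked)
            self.used += size
            self.locked += size if locked else 0

        def _evict_lru(self):
            for name, (size, locked) in self.files.items():
                if not locked:              # skip pinned files
                    del self.files[name]
                    self.used -= size
                    return
            raise RuntimeError("cache is full of locked files")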
Station Administration: Dump (1)
    lueking@d0mino:~ % sam dump station --groups
    *** BEGIN DUMP STATION central-analysis, id=21
    running at d0mino 5 days 22 hours 24 minutes 20 seconds, admins: lueking
    Known batch systems: lsf
    Default batch system: lsf
    No source location is preferred
    There are 1 authorized transfer groups
    Full delivery unit is enforced; external deliveries are unconstrained
Station Administration: Dump (2)
    AUTHORIZED GROUPS:
    group algo: admins: cope lueking melanson terekhov veseli white, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 5/50, disk: 72838247KB/100000000KB, locks: 0B/30000000KB
    group cal: admins: lueking terekhov veseli white, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 1/10, disk: 11856085KB/78125MB, locks: 0B/78125MB
    group demo: admins: lueking terekhov veseli white, swap policy: LRU, fair share: 0.608163, quotas (cur/max): projects = 2/50, disk: 4867877KB/5000000KB, locks: 0B/0KB
    group dzero: admins: lueking melanson terekhov veseli white, swap policy: LRU, fair share: 0.142857, quotas (cur/max): projects = 10/100, disk: 499860527KB/500000000KB, locks: 0B/100000000KB
    group emid: admins: lueking terekhov veseli white, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/10, disk: 6396015KB/10000000KB, locks: 0B/10000000KB
    group test: admins: lueking terekhov veseli white, swap policy: LRU, fair share: 0.11512, quotas (cur/max): projects = 1/20, disk: 21381359KB/26000000KB, locks: 237179KB/20000000KB
    group thumbnail: admins: lueking melanson schellma, swap policy: LRU, fair share: 0.13386, quotas (cur/max): projects = 0/5, disk: 20687259KB/50000000KB, locks: 0B/0KB
    *** END OF STATION DUMP ***
Adding Data to the System
• Metadata descriptions for:
  • Detector data
  • Monte Carlo data
  • Processing details
• Mapping to storage locations (which we call auto-destinations)
• Station forwarding specification
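As an illustration, the metadata for one file might look like the record below. The field names and values are hypothetical, not the actual SAM schema.

    # Hypothetical metadata record declared when a file enters SAM.
    file_metadata = {
        "file_name":        "reco_run123456_001",
        "data_source":      "detector",            # or "monte_carlo"
        "run_number":       123456,
        "processing":       {"application": "d0reco", "version": "p10.15"},
        "auto_destination": "enstore:/pnfs/sam/dzero/reco",   # storage mapping
        "forward_via":      "central-analysis",   # station forwarding spec
    }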
Forwarding + Caching = Global Replication
[Diagram] A user (producer) stores data through a station; forwarding plus caching replicates the files between the Sara/NIKHEF site (Amsterdam) and the Fermilab D0robot Mass Storage System over a 155 Mbps WAN link, leaving site replicas in the station caches. Arrows indicate data flow.
Routing + Caching = Global Replication
[Diagram] The same pattern with routing: data from a user (producer) is routed via intermediate stations to a Mass Storage System over the WAN, and the station caches it passes through become site replicas. Arrows indicate data flow.
Resource Management Approaches
• Fair sharing (policies)
  • Allocation of resources and scheduling of jobs.
  • The goal is to ensure that, in a busy environment, each abstract user gets a fixed share of resources, or gets a fixed share of work done.
• Co-allocation and reservation (optimization)
Fair Share and Computational Economy
• Jobs, when executed, incur costs (through resource utilization) and realize benefits (through getting work done).
• Maintain a tuple (vector) of cumulative costs/benefits for each abstract user, and compare it to the user's allocated fair share to raise or lower priority.
• Incorporates all known resource types and benefit metrics; totally flexible.
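A minimal sketch of how such a computational economy could set priorities, assuming a weighted scalarization of the cost vector. The weights, fields, and selection rule are assumptions for illustration only.

    # Collapse a per-resource cost tuple into one number via assumed weights.
    WEIGHTS = {"tape_mounts": 5.0, "bytes_moved": 1e-9, "cpu_seconds": 0.01}

    def scalar_cost(cost_vector):
        return sum(WEIGHTS[k] * v for k, v in cost_vector.items())

    def priority(user, total_cost):
        # Users furthest below their fair share get the highest priority.
        if total_cost == 0:
            return user["fair_share"]
        consumed = scalar_cost(user["costs"]) / total_cost
        return user["fair_share"] - consumed

    def pick_next(users):
        total = sum(scalar_cost(u["costs"]) for u in users)
        return max(users, key=lambda u: priority(u, total))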
Job Control: Station Integration with the Abstract Batch System
[Diagram] A client issues "sam submit"; the Local RM (Station Master) invokes the Job Manager (Project Master), which submits jobs to the batch system; the batch system dispatches the Process Manager (a SAM wrapper script), which invokes the user task; on jobEnd, the Job Manager resubmits (setJobCount/stop) until the SAM condition is satisfied. This integration provides:
• Fair-share job scheduling
• Resource co-allocation
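The resubmission loop implied by the diagram can be sketched as below. The batch-system interface (submit/finished/kill) and the polling interval are invented for illustration and stand in for whatever the real wrapper script uses.

    import time

    def process_manager(batch, project, job_count):
        # Keep job_count batch jobs running until the SAM condition is met.
        running = set()
        while not project.condition_satisfied():
            while len(running) < job_count:           # setJobCount: top up
                running.add(batch.submit(project.user_task))
            running = {j for j in running if not batch.finished(j)}
            time.sleep(30)                            # poll the batch system
        for j in running:                             # stop: drain what's left
            batch.kill(j)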
Future Plans
• Tape mounts were a critical resource in the past, but inter-station movement of data is expected to be the future constraint as more stations are deployed with large disk caches.
• In addition to moving data to computing resources, the system will evolve to move processing to the data.
• A job control language will specify each task at a level that allows the system to decide when and where it can optimally be processed.
• Incorporate standard Grid components as availability and need dictate: GridFTP, GSI, Condor, DAGMan, etc.
Conclusion
• The SAM system used for D0 data management and access represents a large step toward a global data grid.
• Resources are managed at the station, site, and global levels.
• The system is governed by station configuration and by rules/policies.
• Fair-share resource allocation and scheduling controls the amount of work done by each group, access mode, etc.
• Co-allocation coordinates data and processing to utilize the overall system most effectively.