FBSNG Overview
Jim Fromm
Farms and Clustered Systems Group, Computing Division, Fermilab
People
• FCS Group:
  • Jim Fromm (Fermilab)
  • Tanya Levshina (Fermilab)
  • Igor Mandrichenko (Fermilab)
  • Krzysztof Genser (Fermilab)
• Former FCS Group Members:
  • Mark Breitung
  • Marilyn Schweitzer
• FBSNG users/testers who provided significant feedback:
  • Antonio Wong Chan (Academia Sinica, Taiwan, CDF)
  • Yen-Chu Chen (Academia Sinica, Taiwan, CDF)
  • Miroslav Siket (Academia Sinica, Taiwan, CDF)
  • Heidi Schellman (Northwestern University, D0)
  • Steve Wolbers (Fermilab, CDF)
  • Ping Yeh (Academia Sinica, Taiwan, CDF)
  • Thomas Las (Minooka Junior High, Minooka, IL)
http://www-isd.fnal.gov/fbsng
History and Status
• FBS project goals
  • Replace CPS batch in PC farms environment
  • Develop a farm batch system for the file-based parallel data processing style adopted for RunII
  • Do not preclude event-based parallelism
• Milestones
  • Spring 1998 - initial FBS design, first working prototypes
  • Fall 1998 - first production users (E871)
  • Spring 1999 - review by CDF, D0
  • Fall 1999 - FBS v2.2
  • Fall 1999 - beginning of FBSNG project
  • July 2000 - FBSNG v1.0 released
  • Oct 2000 - FBSNG v1.1 released
• Currently FBSNG is installed on:
  • Fixed target farm
  • CDF, D0 farms
  • NIKHEF (D0 collaborators)
FBSNG Immediate Project Goals
• Stop using LSF as scheduler and job storage, as was done with previous versions of FBS.
  • Reduce maintenance and support costs
  • Avoid possible scalability problems
  • Simplify maintenance
  • Allow for the addition of new features (such as resource management)
• Decision made to release FBSNG as soon as a core set of features was implemented. The first requirement was "Do not break anything!"
• Preserve as many features of FBS as reasonably possible
• Add a few fundamental features such as:
  • Abstract resources
  • Customizable scheduler
Long Term Goals
• Dynamic re-configuration (implemented in v1.1)
• Further development of resource management (resource pools implemented in v1.0)
• Integration with FIPC, the Farms Interprocess Communications toolkit developed at Fermilab
• Make FBSNG an "open" system, accomplished through the API
FBSNG Redesign
• Whenever possible, features were carried forward from FBS to FBSNG.
• The look and feel of FBSNG is very much like that of FBS, but FBS and FBSNG are not compatible. Users report that converting to FBSNG was relatively easy.
FBSNG Design (Big Picture)
• BMGR functions:
  • Scheduling
  • Resource management
  • Job storage
  • Communication with API clients
FBSNG Concepts: Farm Model
FBSNG Concepts: Job Consists of Sections
FBSNG Resources
• FBSNG allows for several new ideas in resource management
• Global resources
  • Visible to the entire farm
  • Examples:
    • Disk space on NFS server
    • Network bandwidth
• Local resources
  • Visible on individual nodes
  • Examples:
    • CPU
    • Local disk
• Attributes
  • Attributes are local to a particular node.
  • Attributes are simply present on a node; they are not "used up".
  • Examples:
    • Special software installed (Fortran compiler)
    • Version of OS
• FBSNG assumes that users know how a job will use resources and that they will give this information to FBSNG.
• To FBSNG, resources are just counters; it does not know anything about what they represent.
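The "resources are just counters" idea can be illustrated with a minimal sketch. The `Node` class, its field names, and the sample numbers are hypothetical and are not taken from FBSNG; the sketch only shows counted local resources being consumed while attributes are merely checked for presence.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of one farm node; not FBSNG's own data structure."""
    name: str
    capacity: dict                                   # local resource name -> total units
    attributes: set = field(default_factory=set)     # present or absent, never consumed
    in_use: dict = field(default_factory=dict)       # local resource name -> units taken

    def free(self, resource):
        return self.capacity.get(resource, 0) - self.in_use.get(resource, 0)

    def can_run(self, resources, attributes=()):
        # Attributes only need to be present; resources need enough free units.
        if not set(attributes) <= self.attributes:
            return False
        return all(self.free(r) >= n for r, n in resources.items())

    def allocate(self, resources, attributes=()):
        if not self.can_run(resources, attributes):
            return False
        for r, n in resources.items():
            self.in_use[r] = self.in_use.get(r, 0) + n
        return True

    def release(self, resources):
        for r, n in resources.items():
            self.in_use[r] = self.in_use.get(r, 0) - n


node = Node("worker01", capacity={"cpu": 2, "disk": 10}, attributes={"Linux", "Blue"})
first = node.allocate({"cpu": 1, "disk": 5}, attributes=["Linux"])
second = node.allocate({"cpu": 1, "disk": 6}, attributes=["Linux"])  # only 5 disk units left
```

Here the first request succeeds and the second is refused because the disk counter would go negative, while the `Linux` attribute remains available to any number of processes.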
Resource Pools
• Pools are collections of similar resources.
• The actual resources in a resource pool are referred to as underlying resources.
• Examples:
  • Multiple scratch disks on a given host. Users could specify 2GB of scratch disk, not caring which specific disk has 2GB free.
  • A user could have a job that needs to run on any version of Linux. A resource pool named Linux could be created with underlying resources (attributes) of Linux52 and Linux61.
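A rough sketch of how a pool request might be resolved to an underlying resource follows. The pool table, function names, and the "first fit" choice are assumptions made for illustration; the slides do not say how FBSNG selects among underlying resources.

```python
# Hypothetical pool definitions: pool name -> underlying resources or attributes.
POOLS = {
    "scratch": ["scratch1", "scratch2"],
    "Linux": ["Linux52", "Linux61"],
}


def pick_underlying(pool, amount, free, pools=POOLS):
    """Return the first underlying resource with at least `amount` free, else None."""
    for resource in pools[pool]:
        if free.get(resource, 0) >= amount:
            return resource
    return None


def has_pool_attribute(pool, node_attributes, pools=POOLS):
    """True if the node carries any attribute that belongs to the pool."""
    return any(attr in node_attributes for attr in pools[pool])


free_gb = {"scratch1": 1, "scratch2": 4}
chosen = pick_underlying("scratch", 2, free_gb)               # user asks for 2GB of "scratch"
linux_ok = has_pool_attribute("Linux", {"Linux61", "Blue"})   # any Linux version will do
```

The user names only the pool (`scratch` or `Linux`); which disk or OS version satisfies the request is decided at allocation time.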
Resource Management
• Process type per task or project
• Soft association between queue and process type (user can override)
• User can request additional resources
• Queue is more of a scheduling entity than a resource management entity
• More flexibility for users
• Resource quotas are per process type only
User Interface
• Job Submission: Users issue the submit command and provide the name of a Job Description File (JDF). The JDF contains control information needed by FBSNG, such as:
  • Section name
  • Executable name
  • Queue
  • Number of processes to spawn
• Job Control:
  • Monitor status
  • Kill/cancel
  • Hold/release
  • History
  • Farm node status
  • Resource utilization statistics
  • Scheduler status
FBSNG: Example of Job Description File (JDF)

  SECTION Stage                          First section: pre-stage data
  QUEUE=IO_Q                             Queue to submit to
  EXEC=stage.sh VSN123 /stage01          Command to execute
  NUMPROC=1                              1 process

  SECTION Reconstructor                  Second section: reconstruction
  QUEUE=Long_Q
  EXEC=reco123.sh /stage01 /stage02
  NUMPROC=10                             10 processes
  DEPEND=done(Stage)                     Only if pre-staging succeeds
  PROC_RESOURCES=disk:5 Linux Blue       5 (GB) of local disk, Linux, Blue node

  SECTION Clean-up                       Emergency clean-up section
  QUEUE=Short_Q
  PROC_TYPE = Light                      Override default process type
  EXEC=clean.sh /stage02 VSN123
  NUMPROC=1
  PRIO_INC = 5                           Run at higher priority
  DEPEND=failed(Reconstructor)           Run only if reconstructor fails
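To make the section structure of the example JDF concrete, here is a small reader for it. It is only a sketch: it assumes one `SECTION name` header or one `KEY=VALUE` pair per line, which is inferred from the slide and not from the FBSNG documentation, and `parse_jdf` is a hypothetical helper, not part of FBSNG.

```python
def parse_jdf(text):
    """Parse a JDF-like text into {section name: {key: value}} (assumed syntax)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("SECTION "):
            current = {}
            sections[line.split(None, 1)[1]] = current
        elif "=" in line and current is not None:
            # Split on the first "=" only, so values such as "disk:5 Linux Blue" survive.
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return sections


EXAMPLE = """
SECTION Stage
QUEUE=IO_Q
EXEC=stage.sh VSN123 /stage01
NUMPROC=1

SECTION Reconstructor
QUEUE=Long_Q
EXEC=reco123.sh /stage01 /stage02
NUMPROC=10
DEPEND=done(Stage)
PROC_RESOURCES=disk:5 Linux Blue

SECTION Clean-up
QUEUE=Short_Q
PROC_TYPE = Light
EXEC=clean.sh /stage02 VSN123
NUMPROC=1
PRIO_INC = 5
DEPEND=failed(Reconstructor)
"""

jdf = parse_jdf(EXAMPLE)
dependencies = {name: body.get("DEPEND") for name, body in jdf.items()}
```

The `dependencies` mapping shows the chain in the example: `Reconstructor` waits for `Stage` to finish, and `Clean-up` runs only if `Reconstructor` fails.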
Batch Process Environment
• Environment variables
  • FBS_JOB_ID
  • FBS_SECTION_NAME - name of the section
  • FBS_JOB_SIZE - number of processes in the section
  • FBS_SCRATCH - scratch disk area for user processes
  • FBS_PROC_NO - logical process id (1…FBS_JOB_SIZE)
  • FBS_HOSTS - list of nodes assigned to this job
  • FBS_PROC_STDOUT - path of the process's stdout
  • FBS_PROC_STDERR - path of the process's stderr
  • HOME - home directory
  • Others…
• Current working directory is HOME
• Stderr, stdout as specified in JDF
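As an illustration of how a user process might use these variables for file-based parallelism, the sketch below splits a file list among the processes of a section. Only `FBS_PROC_NO` and `FBS_JOB_SIZE` come from the slide; the function `my_share`, the round-robin split, and the file names are assumptions, not an FBSNG facility.

```python
import os


def my_share(files, environ=os.environ):
    """Return the files this process should handle, split round-robin."""
    proc_no = int(environ["FBS_PROC_NO"])     # logical process id, 1..FBS_JOB_SIZE
    job_size = int(environ["FBS_JOB_SIZE"])   # number of processes in the section
    return files[proc_no - 1::job_size]


# Outside a batch job the variables are not set, so a fake environment is used here.
fake_env = {"FBS_PROC_NO": "2", "FBS_JOB_SIZE": "3"}
share = my_share(["f1", "f2", "f3", "f4", "f5"], fake_env)
```

Inside a real batch process the second argument would be omitted so that the values are read from the actual environment.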
Scheduler
• Algorithms are based on the idea of dynamic priorities
• Controllable fair-share scheduling
  • Projects are assigned relative shares of farm resources
• Guaranteed scheduling. Avoids starvation.
  • No infinite delays for big jobs
  • Will hold small jobs if necessary
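The slide does not give the scheduling formula, so the following is only a generic sketch of fair-share scheduling with dynamic priorities, not FBSNG's algorithm. The priority formula, the `aging_rate` parameter, and the job dictionaries are all assumptions: a job's priority rises when its project is below its share and rises further the longer the job waits, which is one common way to avoid starvation.

```python
def dynamic_priority(share, used_fraction, waited, aging_rate=0.01):
    """Higher when the project is under its share and when the job has waited long."""
    return (share - used_fraction) + aging_rate * waited


def pick_next(jobs, shares, usage):
    """Choose the waiting job with the highest dynamic priority."""
    total = sum(usage.values()) or 1

    def priority(job):
        project = job["project"]
        return dynamic_priority(shares[project], usage.get(project, 0) / total, job["waited"])

    return max(jobs, key=priority)


shares = {"cdf": 0.5, "d0": 0.5}     # relative shares of the farm
usage = {"cdf": 80, "d0": 20}        # recent usage, arbitrary units

jobs = [
    {"id": 1, "project": "cdf", "waited": 10},
    {"id": 2, "project": "d0", "waited": 10},
]
chosen = pick_next(jobs, shares, usage)        # d0 is under its share

aged_jobs = [
    {"id": 1, "project": "cdf", "waited": 100},
    {"id": 2, "project": "d0", "waited": 10},
]
chosen_aged = pick_next(aged_jobs, shares, usage)   # long wait outweighs the share deficit
```

With equal waiting times the under-served project wins; once a job has waited long enough, aging lets it run even though its project is over its share.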
API
• FBSNG provides a Python API that allows:
  • Job submission
  • Job monitoring and control
  • Resource management and monitoring
• UI, GUI are layered on top of the API
FBSNG Requirements
• On control node:
  • bmgr daemon
  • logd daemon (optional)
• On each worker node:
  • Launcher (root)
  • Rstatd (optional)
• Software/hardware requirements:
  • Python (most of the FBSNG sources are Python)
  • Tcl/Tk, Tkinter (for GUI)
  • FCSLIB package, available from Fermitools. FCSLIB contains some Python modules used by FBSNG.
  • Configuration files synchronized on all worker nodes (NFS works well)
FBSNG Project Status and Plans for Future
• FBSNG v1.0 (IRIX, Linux) released July 2000.
• FBSNG v1.1 (IRIX, Linux) released October 2000.
• Available in Fermitools
• v1.1 is in production. Feedback on v1.0 and v1.1 thus far has been positive.
• See http://www-isd.fnal.gov/fbsng for project details.