Introduction to the BOINC software
David P. Anderson
Space Sciences Laboratory
University of California, Berkeley
Outline
• Abstractions
• The BOINC server software
• The BOINC client software and runtime system
Design goals
• A few applications, lots of jobs
• High performance
  • millions of jobs per day
• Scalability
• Fault tolerance
Abstractions
• Platform
• App version
  • a collection of files, one of which is an executable main program
  • associated with a platform
• App
  • a set of app versions that all perform roughly the same computation
  • may have versions for different platforms
  • may have different versions for one platform (GPU, non-GPU)
Abstractions
• Workunit (job)
  • a collection of input files
  • associated with an app (not an app version!)
  • attributes
    • resource estimates and bounds
    • latency bound
• Result (job instance)
  • a collection of output files
  • associated with a workunit
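The relationships between these abstractions can be sketched as a small data model. This is only an illustration of the bullets above, not BOINC's actual schema: every class, field name, and sample value (including the `variant` label used to tell GPU from non-GPU versions) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Platform:
    name: str  # illustrative value only, e.g. "linux_x86_64"


@dataclass
class AppVersion:
    platform: Platform        # each app version targets one platform
    files: List[str]          # a collection of files...
    main_program: str         # ...one of which is the executable main program
    variant: str = ""         # hypothetical label, e.g. "gpu" vs ""


@dataclass
class App:
    name: str
    versions: List[AppVersion] = field(default_factory=list)

    def versions_for(self, platform: Platform) -> List[AppVersion]:
        """All versions of this app that can run on the given platform."""
        return [v for v in self.versions if v.platform == platform]


@dataclass
class Workunit:
    app: App                  # tied to an app, not to an app version
    input_files: List[str]
    fpops_est: float          # resource estimate (hypothetical field name)
    fpops_bound: float        # resource bound (hypothetical field name)
    latency_bound_s: float    # latency bound in seconds


@dataclass
class Result:
    workunit: Workunit        # a result is one instance of a workunit
    output_files: List[str] = field(default_factory=list)


linux = Platform("linux_x86_64")
app = App("example_app", [
    AppVersion(linux, ["main", "data.bin"], "main"),
    AppVersion(linux, ["main_gpu"], "main_gpu", variant="gpu"),
])
wu = Workunit(app, ["in0"], fpops_est=1e12, fpops_bound=1e14,
              latency_bound_s=86400)
replicas = [Result(wu), Result(wu)]  # two instances of the same job
```

Because a workunit points at the app rather than at one version, the same job can be run by whichever app version suits the host that receives it.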
Anatomy of a BOINC project
[Diagram: MySQL database, daemons and periodic tasks, servers, clients, and the project directory tree]
project root/
  bin/
  cgi-bin/
  download/
    00/ .. 3ff/
  html/
  log_*/
  templates/
  upload/
    00/ .. 3ff/
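The `00/ .. 3ff/` entries under `download/` and `upload/` indicate that files are spread over 0x400 = 1024 subdirectories. The slides do not say how a file is assigned to a subdirectory; the sketch below assumes a hash of the file name taken modulo the fan-out. The use of MD5, the function names, and the unpadded hexadecimal directory names are all assumptions for illustration.

```python
import hashlib

FANOUT = 0x400  # 1024 buckets, matching the 00 .. 3ff range on the slide


def hier_dir(filename: str, fanout: int = FANOUT) -> str:
    """Pick a subdirectory name for a file (assumed scheme: hash mod fan-out)."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % fanout
    return format(bucket, "x")  # lowercase hex, no zero padding (assumption)


def hier_path(root: str, filename: str) -> str:
    """Full path of a file inside a fanned-out directory such as download/."""
    return f"{root}/{hier_dir(filename)}/{filename}"


example = hier_path("download", "input_0001.dat")
```

Spreading files this way keeps any single directory from holding millions of entries, which matters at the job rates listed under the design goals.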
Work generator
• Creates input files
• Creates workunits
• One per app
• Flow control
  • disk space
  • DB size
[Diagram: work generator, MySQL database, project directory tree]
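The flow-control bullets can be read as: stop creating jobs when the database already holds enough unsent work or when disk space runs low. A minimal sketch under that reading follows. The function, its parameters, and the thresholds are hypothetical, and the callbacks stand in for real database and disk queries.

```python
from typing import Callable, Iterable


def make_jobs(inputs: Iterable,
              unsent_count: Callable[[], int],
              free_disk_bytes: Callable[[], int],
              create_job: Callable[[object], None],
              max_unsent: int = 100,
              min_free_bytes: int = 1_000_000_000) -> int:
    """Create jobs until a flow-control limit is hit; return how many were made."""
    created = 0
    for item in inputs:
        if unsent_count() >= max_unsent:        # DB size limit reached
            break
        if free_disk_bytes() < min_free_bytes:  # disk space limit reached
            break
        create_job(item)                        # write input file + workunit
        created += 1
    return created


# Demo with an in-memory list standing in for the database.
queue = []
n_created = make_jobs(range(10),
                      unsent_count=lambda: len(queue),
                      free_disk_bytes=lambda: 5_000_000_000,
                      create_job=queue.append,
                      max_unsent=4)
```

In a long-running work generator this pass would typically be repeated after a pause, so that new jobs are created as clients drain the unsent queue.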
Specifying a job
• Workunit template
  • XML document describing
    • input files (logical, physical names)
    • job attributes
• Result template
  • XML document describing output files
• create_work()
  • specifies templates, app, input files
Validator
• Check result validity
• Compare replicas
• May be app-specific
[Diagram: validator, MySQL database, project directory tree]
Validation
• Clients may
  • return bad results
  • exaggerate claimed credit
• Strategies
  • app-specific consistency checking
  • replication
  • fuzzy comparison
  • homogeneous redundancy
  • adaptive replication
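Replication and fuzzy comparison can be combined: run the same job on several hosts and accept a result once enough replicas agree within a tolerance. The sketch below illustrates that idea only. The function names, the quorum of 2, and the tolerance values are assumptions, and a real validator would compare app-specific output files rather than lists of floats.

```python
import math
from typing import List, Optional, Sequence


def outputs_agree(a: Sequence[float], b: Sequence[float],
                  rel_tol: float = 1e-6) -> bool:
    """Fuzzy comparison: same length and element-wise close."""
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
               for x, y in zip(a, b))


def find_canonical(replicas: List[Sequence[float]], quorum: int = 2,
                   rel_tol: float = 1e-6) -> Optional[int]:
    """Index of the first replica that a quorum agrees with, else None."""
    for i, candidate in enumerate(replicas):
        # A replica always agrees with itself, so it counts toward the quorum.
        matches = sum(1 for other in replicas
                      if outputs_agree(candidate, other, rel_tol))
        if matches >= quorum:
            return i
    return None


replicas = [[1.0, 2.0], [1.0000001, 2.0], [1.5, 2.0]]
canonical = find_canonical(replicas)
valid = [outputs_agree(replicas[canonical], r) for r in replicas]
```

Exact comparison would reject results that differ only by floating-point rounding on different hardware, which is the situation fuzzy comparison and homogeneous redundancy address in different ways.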
Assimilator
• Processes completed results
• App-specific
[Diagram: assimilator, MySQL database, project directory tree]
Summary
• Create app, app versions for different platforms
• Develop work generator
• Develop validator
• Develop assimilator
Isn’t there a simpler way?
Single-job submission
• Assemble your input files and executable, then run:
    boinc_submit --input foo --output blah program
• How this works:
  • uses “wrapper” app
  • executable is part of workunit
  • templates are created automatically
• What it doesn’t do:
  • multi-platform
  • validation
Job dispatch
[Diagram: MySQL database, feeder, transitioner, shared-memory job cache, scheduler (CGI or FastCGI), clients, project directory tree]
File transfer
[Diagram: clients, Apache, file upload handler, project directory tree (download/, upload/)]
Janitorial daemons
[Diagram: MySQL database, DB purger, file deleter, project directory tree]
Ways to deploy a BOINC server
• Linux server
• Server VM for VMware
• Server VM for Amazon EC2
The BOINC runtime system
[Diagram: the application, linked with the BOINC runtime, exchanges messages with the BOINC client by shared-memory message passing]
• Application to client: fraction done, CPU time
• Client to application: suspend, resume, quit
Directory structure:
BOINC/
  projects/
    lhcathome/
      physical_name0
      physical_name1
    setiathome/
  slots/
    0/
      logical_name0 (link file)
      logical_name1
    1/
Basic API
• boinc_init()
  • creates a thread that handles messages
• boinc_finish()
  • creates a “finish file”
• boinc_resolve_filename()
  • maps logical to physical file names
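The logical-to-physical mapping can be illustrated with the link files from the directory-structure slide: a file in the slot directory either is the data itself or is a small link file pointing into the project directory. The sketch below is a Python simulation of that idea, not the BOINC API. The `<soft_link>` tag, the relative-path convention, and the function names are assumptions for the example.

```python
import os
import re
import tempfile

_LINK = re.compile(r"<soft_link>(.*?)</soft_link>")  # assumed link-file format


def resolve_filename(logical_path: str) -> str:
    """Return the physical path behind a logical name, or the name itself."""
    try:
        with open(logical_path, "r", errors="replace") as f:
            head = f.read(4096)
    except FileNotFoundError:
        return logical_path          # e.g. an output file not yet created
    match = _LINK.search(head)
    if match is None:
        return logical_path          # ordinary file, not a link file
    slot_dir = os.path.dirname(logical_path)
    return os.path.normpath(os.path.join(slot_dir, match.group(1)))


def demo():
    """Build a tiny projects/ + slots/ tree and resolve two logical names."""
    with tempfile.TemporaryDirectory() as root:
        project = os.path.join(root, "projects", "setiathome")
        slot = os.path.join(root, "slots", "0")
        os.makedirs(project)
        os.makedirs(slot)
        with open(os.path.join(project, "physical_name0"), "w") as f:
            f.write("input data")
        link = os.path.join(slot, "logical_name0")
        with open(link, "w") as f:
            f.write("<soft_link>../../projects/setiathome/physical_name0"
                    "</soft_link>")
        plain = os.path.join(slot, "plain.txt")
        with open(plain, "w") as f:
            f.write("hello")
        with open(resolve_filename(link)) as f:
            content = f.read()
        return content, resolve_filename(plain) == plain
```

An application would call the resolver once per file and then open the returned path, so the same code works whether or not a link file is present.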
Checkpointing
• boinc_time_to_checkpoint()
  • call at points where you can checkpoint
• boinc_checkpoint_done()
  • call when you’re finished checkpointing
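The two calls bracket the application's own state-saving code: ask whether now is a good time, write the state, then report that the write is complete. The sketch below simulates that pattern in Python. `FakeRuntime` is a hypothetical stand-in for the BOINC runtime, and the JSON state file and atomic rename are design choices made for the example rather than anything the slides prescribe.

```python
import json
import os
import tempfile


class FakeRuntime:
    """Hypothetical stand-in: says 'checkpoint now' on every Nth query."""

    def __init__(self, every_n_calls: int):
        self.every, self.calls, self.done = every_n_calls, 0, 0

    def time_to_checkpoint(self) -> bool:
        self.calls += 1
        return self.calls % self.every == 0

    def checkpoint_done(self) -> None:
        self.done += 1


def load_state(path):
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"i": 0, "total": 0}   # fresh start


def save_state(path, state):
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)             # atomic swap: never a half-written file


def run(path, runtime, n_steps, stop_after=None):
    """Sum 0..n_steps-1, resuming from the last checkpoint if one exists."""
    state = load_state(path)
    while state["i"] < n_steps:
        if stop_after is not None and state["i"] >= stop_after:
            return None               # simulate the job being interrupted
        state["total"] += state["i"]
        state["i"] += 1
        if runtime.time_to_checkpoint():   # a point where we can checkpoint
            save_state(path, state)
            runtime.checkpoint_done()      # tell the runtime we finished
    return state["total"]


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "state.json")
        first = run(path, FakeRuntime(3), 10, stop_after=7)  # interrupted
        second = run(path, FakeRuntime(3), 10)               # resumes
        return first, second
```

The interrupted run loses only the work done since the last checkpoint; the second run repeats that step and still arrives at the same total.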
Compound applications
• Examples:
  • coordinator program runs several worker programs in sequence
  • “switcher” program probes CPU architecture, selects which executable to run
• Variants of boinc_init() let you specify which app is the main program, and how messages are handled
• Each message type must be handled by exactly one process
Long-running applications
• Trickle-up messages
• Trickle-down messages
• Intermediate file transfers
Legacy applications
• The BOINC wrapper
  • takes XML “job file”
  • handles all messages
GPU and multithreaded apps
• Server
  • you supply a function that takes an app version and a host, and returns resource usage and estimated FLOPS
  • the BOINC scheduler chooses the best version
• Client
  • senses and reports coprocessors (e.g. NVIDIA GPUs)
  • coprocessor-aware scheduling and work fetch
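The server-side bullets describe a project-supplied function plus a selection step. A minimal sketch of both follows, assuming that "best" means highest estimated FLOPS among the versions the host can run. All class and function names are hypothetical, and the linear speed-up for threads and the 0.1 CPU charged to a GPU job are simplifications chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Host:
    ncpus: int
    cpu_flops: float          # per-core speed
    ngpus: int = 0
    gpu_flops: float = 0.0    # per-GPU speed


@dataclass
class AppVersion:
    name: str
    uses_gpu: bool = False
    max_threads: int = 1


@dataclass
class Usage:
    ncpus: float
    ngpus: float
    est_flops: float


def plan(av: AppVersion, host: Host) -> Optional[Usage]:
    """Resource usage and estimated FLOPS, or None if the host can't run it."""
    if av.uses_gpu:
        if host.ngpus < 1:
            return None
        return Usage(ncpus=0.1, ngpus=1, est_flops=host.gpu_flops)
    threads = min(av.max_threads, host.ncpus)
    return Usage(ncpus=threads, ngpus=0, est_flops=threads * host.cpu_flops)


def choose_version(versions: List[AppVersion],
                   host: Host) -> Optional[Tuple[AppVersion, Usage]]:
    """Pick the usable version with the highest estimated FLOPS."""
    best = None
    for av in versions:
        usage = plan(av, host)
        if usage is None:
            continue
        if best is None or usage.est_flops > best[1].est_flops:
            best = (av, usage)
    return best


versions = [AppVersion("cpu"), AppVersion("mt", max_threads=4),
            AppVersion("gpu", uses_gpu=True)]
cpu_host = Host(ncpus=4, cpu_flops=1e9)
gpu_host = Host(ncpus=2, cpu_flops=1e9, ngpus=1, gpu_flops=1e11)
```

With these sample hosts, the four-core machine is given the multithreaded version and the machine with a GPU is given the GPU version.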