Desktop Grids Ashok Adiga Texas Advanced Computing Center {adiga@tacc.utexas.edu}
Topics • What makes Desktop Grids different? • What applications are suitable? • Three Solutions: • Condor • United Devices Grid MP • BOINC
Compute Resources on the Grid • Traditional: SMPs, MPPs, clusters, … • High speed, Reliable, Homogeneous, Dedicated, Expensive (but getting cheaper) • High-speed interconnects • Up to 1000s of CPUs • Desktop PCs and Workstations • Low speed (but improving!), Heterogeneous, Unreliable, Non-dedicated, Inexpensive • Generic network connections (e.g. Ethernet) • 1000s-10,000s of CPUs • Grid compute power increases as desktops are upgraded
Desktop Grid Challenges • Unobtrusiveness • Harness underutilized computing resources without impacting the primary desktop user • Added security requirements • Desktop machines are typically not in a secure environment • Must protect the desktop & program from each other (sandboxing) • Must ensure secure communications between grid nodes • Connectivity characteristics • Not always connected to the network (e.g. laptops) • Might not have a fixed identifier (e.g. dynamic IP addresses) • Limited network bandwidth • Ideal applications have a high compute-to-communication ratio • Data management is critical to performance
Desktop Grid Challenges (cont’d) • Job scheduling on heterogeneous, non-dedicated resources is complex • Must match application requirements to resource characteristics • Meeting QoS is difficult since the program might have to share the CPU with other desktop activity • Desktops are typically unreliable • System must detect & recover from node failures • Scalability issues • Software has to manage thousands of resources • Conventional application licensing is not set up for desktop grids
Application Feasibility • Only some applications map well to Desktop Grids • Coarse-grain data parallelism • Parallel chunks are relatively independent • High computation-to-data-communication ratios • Non-intrusive behavior on the client device • Small memory footprint on the client • Limited I/O activity • Executable and data sizes are dependent on available bandwidth
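A quick back-of-the-envelope feasibility check, using round numbers taken from the Think application described later in this deck (~3 MB executable, ~2000 CPU minutes per task): 2000 CPU minutes ≈ 33 CPU-hours for roughly 3 MB transferred, i.e. about 11 CPU-hours of work per megabyte moved, which is exactly the very high compute-to-communication profile a desktop grid wants.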
Typical Applications • Desktop Grids naturally support data parallel applications • Monte Carlo methods • Large Database searches • Genetic Algorithms • Exhaustive search techniques • Parametric Design • Asynchronous Iterative algorithms
Condor • Condor manages pools of workstations and dedicated clusters to create a distributed high-throughput computing (HTC) facility • Created at the University of Wisconsin • Project established in 1985 • Initially targeted at scheduling clusters, providing functions such as: • Queuing • Scheduling • Priority scheme • Resource classifications • Later extended to manage non-dedicated resources: • Sandboxing • Job preemption
Why use Condor? • Condor has several unique mechanisms such as: • ClassAd Matchmaking • Process checkpoint / restart / migration • Remote System Calls • Grid Awareness • Glideins • Support for multiple “Universes” • Vanilla, Java, MPI, PVM, Globus, … • Very simple to install, manage, and use (see the basic commands below) • Natural environment for application developers • Free!
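For a sense of day-to-day use, the standard Condor command-line tools cover the whole cycle; the submit file name here is a placeholder:

# Show the machines (and their ClassAds) in the pool
condor_status
# Submit a job described by a submit description file
condor_submit myjob.sub
# Monitor the job queue
condor_q

condor_status, condor_submit, and condor_q are the actual Condor tools; a complete submit description file appears in the POVray example later in this deck.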
Typical Condor Pool (diagram): the Central Manager runs the negotiator and collector daemons; a Submit-Only node runs a schedd; Execute-Only nodes run a startd; Regular Nodes run both a schedd and a startd; every machine runs a master daemon that spawns the others. The original figure distinguishes spawned processes from ClassAd communication pathways.
Condor ClassAds • ClassAds are at the heart of Condor • ClassAds • are a set of uniquely named expressions; each expression is called an attribute • combine query and data • semi-structured: no fixed schema • extensible
Sample ClassAd

MyType = "Machine"
TargetType = "Job"
Machine = "froth.cs.wisc.edu"
Arch = "INTEL"
OpSys = "SOLARIS251"
Disk = 35882
Memory = 128
KeyboardIdle = 173
LoadAvg = 0.1000
Requirements = TARGET.Owner=="smith" || LoadAvg<=0.3 && KeyboardIdle>15*60
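For comparison, a job ClassAd that the matchmaker could pair with the machine ad above might look like the following; the attribute values are illustrative, not taken from the original deck:

MyType = "Job"
TargetType = "Machine"
Owner = "smith"
Cmd = "/usr/local/bin/sim"
Requirements = TARGET.Arch=="INTEL" && TARGET.OpSys=="SOLARIS251" && TARGET.Disk >= 10000

Matchmaking succeeds when each ad's Requirements expression evaluates to true against the other ad: here the job's Owner is "smith", so the machine's Requirements are satisfied even if the machine is loaded, and the machine's Arch, OpSys, and Disk satisfy the job.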
Condor Flocking • Central managers can allow schedds from other pools to submit to them. (Diagram: the schedd on a submit machine first contacts the collector/negotiator on its own central manager (CONDOR_HOST), and can then flock to the central managers of Pool-Foo and Pool-Bar.)
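Flocking is configured in condor_config; a minimal sketch with placeholder host names:

# On the submitting pool: which pools the schedd may flock to, in order
FLOCK_TO = cm.pool-foo.example.edu, cm.pool-bar.example.edu
# On each receiving pool's central manager: who may flock in
FLOCK_FROM = submit.home-pool.example.edu

FLOCK_TO and FLOCK_FROM are the actual Condor configuration macros; the schedd tries the pools listed in FLOCK_TO, in order, when its jobs cannot be matched locally.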
Example: POVray on UT Grid Condor • Rendering the full image on one machine took 2h 17 min • Split into slices of 5-8 min each and rendered in parallel on the Condor pool, total turnaround is now ~15 min
Parallel POVray on Condor • Submitting POVray to the Condor pool via a Perl script • Automated creation of image “slices” • Automated creation of Condor submit files • Automated creation of the DAG file • Using DAGMan for job flow control • Multiple architecture support • Executable = povray.$$(OpSys).$$(Arch) • Post-processing with a C executable • “Stitching” image slices back together into one image file • Using “xv” to display the image back on the user’s desktop • Alternatively, transferring the image file back to the user’s desktop
POVray Submit Description File

Universe = vanilla
Executable = povray.$$(OpSys).$$(Arch)
Requirements = (Arch == "INTEL" && OpSys == "LINUX") || \
  (Arch == "INTEL" && OpSys == "WINNT51") || \
  (Arch == "INTEL" && OpSys == "WINNT52")
transfer_files = ONEXIT
Input = glasschess_0.ini
Error = Errfile_0.err
Output = glasschess_0.ppm
transfer_input_files = glasschess.pov,chesspiece1.inc
arguments = glasschess_0.ini
log = glasschess_0_condor.log
notification = NEVER
queue
DAGMan Job Flow (diagram): parallel rendering jobs A0, A1, A2, …, An are all PARENTs of the barrier job B (CHILD B); a PRE script performs the pre-processing that runs prior to executing Job B.
DAGMan Submission Script

$ condor_submit_dag povray.dag

# Filename: povray.dag
Job A0 ./submit/povray_submit_0.cmd
Job A1 ./submit/povray_submit_1.cmd
Job A2 ./submit/povray_submit_2.cmd
Job A3 ./submit/povray_submit_3.cmd
Job A4 ./submit/povray_submit_4.cmd
Job A5 ./submit/povray_submit_5.cmd
Job A6 ./submit/povray_submit_6.cmd
Job A7 ./submit/povray_submit_7.cmd
Job A8 ./submit/povray_submit_8.cmd
Job A9 ./submit/povray_submit_9.cmd
Job A10 ./submit/povray_submit_10.cmd
Job A11 ./submit/povray_submit_11.cmd
Job A12 ./submit/povray_submit_12.cmd
Job B barrier_job_submit.cmd
PARENT A0 CHILD B
PARENT A1 CHILD B
PARENT A2 CHILD B
PARENT A3 CHILD B
PARENT A4 CHILD B
PARENT A5 CHILD B
PARENT A6 CHILD B
PARENT A7 CHILD B
PARENT A8 CHILD B
PARENT A9 CHILD B
PARENT A10 CHILD B
PARENT A11 CHILD B
PARENT A12 CHILD B
Script PRE B postprocessing.sh glasschess

# Executable run by the barrier job B (a no-op; the real work happens in the PRE script):
#!/bin/sh
/bin/sleep 1

# postprocessing.sh (PRE script for Job B): stitch the slices, clean up, display the result
#!/bin/sh
./stitchppms glasschess > glasschess.ppm 2> /dev/null
rm *_*.ppm *.ini Err* *.log povray.dag.*
/usr/X11R6/bin/xv $1.ppm
United Devices Grid MP • Commercial product that aggregates unused cycles on desktop machines to provide a computing resource • Originally designed for non-dedicated resources • Security, non-intrusiveness, scheduling, … • Screensaver/GUI on the client desktop • Support for multiple client platforms • Windows, Linux, Mac, AIX, & Solaris clients
How Grid MP™ Works (diagram) • The Grid MP Agent runs on desktops, workstations, servers, and clusters: it advertises capability, launches jobs, provides secure job execution, returns results, and caches data for reuse • Clusters handle low-latency parallel jobs and large sequential jobs; large data-parallel jobs run across desktops and workstations • Grid MP Services authenticate users and devices, dispatch jobs based on priority, monitor and reschedule failed jobs, and collect job results • Users submit jobs, monitor job progress, and process results through a web browser interface, a Command Line Interface, or an XML Web services API; an administrator manages the system through the same interfaces
UD Management Features • Enterprise features make it easier to convince traditional IT organizations and individual desktop users to install the software • Browser-based administration tools allow local management and policy specification for • Devices • Users • Workloads • Single-click install of the client on PCs • Easily customizable to work with software management packages
Grid MP™ Provisioning Example (diagram): the Root Administrator delegates Device Groups X, Y, and Z to Device Group Administrators, who set per-group policies for User Groups A and B, for example: • Device Group X, 8am-5pm: User Groups A = 50%, B = 25%; 2hr cut-off; list of runnable applications • Device Group X, 6pm-8am: User Groups A = 50%, B = 50%; 8hr cut-off; list of runnable applications • Device Group Y: User Group B = 100%; usage 24hrs; 1hr cut-off; list of runnable applications
Application Types Supported • Batch jobs • Use the mpsub command to run a single executable on a single remote desktop • MPI jobs • Use the ud_mpirun command to run an MPI job across a set of desktop machines • Data Parallel jobs • A single job consists of several independent workunits that can be executed in parallel • The application developer must create program modules and write application scripts to create workunits (see the sketch below)
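A minimal sketch of the two command-driven submission styles; the deck does not show the actual Grid MP option syntax, so treat the arguments below as hypothetical and consult the Grid MP documentation for the real flags:

# Batch job: one executable on one remote desktop (hypothetical arguments)
mpsub ./myprog input.dat

# MPI job: an MPI executable spread across 16 desktop machines (hypothetical arguments)
ud_mpirun -np 16 ./my_mpi_prog input.dat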
Hosted Applications • Hosted applications are easier to manage • Provide users with a managed application • Great for applications that are run frequently but rarely updated • Data parallel applications fit best in the hosted scenario • Users do not have to deal with application maintenance; only the developer does • Grid MP is optimized for running hosted applications • Applications and data are cached at client nodes • Affinity scheduling minimizes data movement by re-using cached executables and data • A hosted application can be run across multiple platforms by registering executables for each platform
Example: Reservoir Simulation • Landmark’s VIP product benchmarked on Grid MP • Workload consisted of 240 simulations for 5 wells • Sensitivities investigated include: • 2 PVT cases • 2 fault connectivity cases • 2 aquifer cases • 2 relative permeability cases • 5 combinations of 5 wells • 3 combinations of vertical permeability multipliers • (2 × 2 × 2 × 2 × 5 × 3 = 240 runs) • Each simulation packaged as a separate piece of work • A similar reservoir simulation application has been developed at TACC (with Dr. W. Bangerth, Institute of Geophysics)
Example: Drug Discovery • Think & LigandFit applications • Internet project in partnership with Oxford University • Models interactions between proteins and potential drug molecules • Virtual screening of drug molecules to reduce time-consuming, expensive lab testing by 90% • Drug database of 3.5 billion candidate molecules • Over 350K active computers participating all over the world
Think • Code developed at Oxford University • Application characteristics: • Typical input data file: < 1 KB • Typical output file: < 20 KB • Typical execution time: 1000-5000 minutes • Floating-point intensive • Small memory footprint • Fully resolved executable is ~3 MB in size
BOINC • Berkeley Open Infrastructure for Network Computing (BOINC) • Open-source follow-on to SETI@home • General architecture supports multiple applications • Targets volunteer resources, not enterprise desktops/workstations • More information at http://boinc.berkeley.edu • Currently used by several internet projects
Structure of a BOINC project (diagram): a central BOINC DB (MySQL) surrounded by back-end daemons for work generation, retry generation, result processing, result validation, and garbage collection; a scheduling server (C++); web interfaces (PHP); and multiple data servers (HTTP). Ongoing tasks: monitor server correctness, monitor server performance, develop and maintain applications.
BOINC • No enterprise management tools • Focus on “volunteer grid” • Provides incentives (points, teams, website) • Basic browser interface to set usage preferences on PCs • Support for user community (forums) • Simple interface for job management • Application developer creates scripts to submit jobs and retrieve results (see the sketch below) • Provides a sandbox on the client • No encryption: uses redundant computing to prevent spoofing
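As an illustration of script-driven job submission, a BOINC project generates workunits with the create_work utility, run from the project directory; a minimal sketch, assuming the input file has already been staged into the project’s download hierarchy, with app name, workunit name, and template paths as placeholders:

# Create one workunit for the application "myapp"
bin/create_work \
  -appname myapp \
  -wu_name myapp_wu_001 \
  -wu_template templates/myapp_wu \
  -result_template templates/myapp_result \
  input_001.dat

Validated results are then handed to a project-written assimilator daemon, which is where a retrieval script would collect output files.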
Projects using BOINC • Climateprediction.net: study climate change • Einstein@home: search for gravitational signals emitted by pulsars • LHC@home: improve the design of the CERN LHC particle accelerator • Predictor@home: investigate protein-related diseases • Rosetta@home: help researchers develop cures for human diseases • SETI@home: look for radio evidence of extraterrestrial life • Cell Computing: biomedical research (Japanese; requires nonstandard client software) • World Community Grid: advance our knowledge of human disease (requires BOINC client 5.2.1 or greater)
SETI@home • Analysis of radio telescope data from Arecibo • SETI: search for narrowband signals • Astropulse: search for short broadband signals • 0.3 MB in, ~4 CPU hours, 10 KB out
Climateprediction.net • Climate change study (Oxford University) • Met Office model (FORTRAN, 1M lines) • Input: ~10 MB executable, 1 MB data • Output per workunit: • 10 MB summary (always uploaded) • 1 GB detail file (archived on the client, may be uploaded) • CPU time: 2-3 months (a workunit can’t migrate) • Trickle messages (client-server communication during a long workunit) • Preemptive scheduling
Why use Desktop Grids? • Desktop grid solutions are typically complete & standalone • Easy to set up and manage • A good entry vehicle for trying out grids • Use existing (but underutilized) resources • The number of desktops/workstations on a campus (or in an enterprise) is typically an order of magnitude greater than the number of traditional compute resources • The power of the grid grows over time as new, faster desktops are added • The typically large number of resources on a desktop grid enables new approaches to solving problems