Desktop Grids Ashok Adiga Texas Advanced Computing Center {adiga@tacc.utexas.edu}
Topics • What makes Desktop Grids different? • What applications are suitable? • Three Solutions: • Condor • United Devices Grid MP • BOINC
Compute Resources on the Grid • Traditional: SMPs, MPPs, clusters, … • High speed, reliable, homogeneous, dedicated, expensive (but getting cheaper) • High-speed interconnects • Up to 1000s of CPUs • Desktop PCs and workstations • Low speed (but improving!), heterogeneous, unreliable, non-dedicated, inexpensive • Generic connections (e.g., Ethernet) • 1000s-10,000s of CPUs • Grid compute power increases as desktops are upgraded
Desktop Grid Challenges • Unobtrusiveness • Harness underutilized computing resources without impacting the primary desktop user • Added security requirements • Desktop machines are typically not in a secure environment • Must protect the desktop and the program from each other (sandboxing) • Must ensure secure communications between grid nodes • Connectivity characteristics • Not always connected to the network (e.g., laptops) • Might not have a fixed identifier (e.g., dynamic IP addresses) • Limited network bandwidth • Ideal applications have a high compute-to-communication ratio • Data management is critical to performance
Desktop Grid Challenges (cont’d) • Job scheduling on heterogeneous, non-dedicated resources is complex • Must match application requirements to resource characteristics • Meeting QoS targets is difficult since a program might have to share the CPU with other desktop activity • Desktops are typically unreliable • The system must detect and recover from node failures • Scalability issues • Software has to manage thousands of resources • Conventional application licensing is not set up for desktop grids
Application Feasibility • Only some applications map well to desktop grids • Coarse-grain data parallelism • Parallel chunks are relatively independent • High computation-to-communication ratios (a rough test follows below) • Non-intrusive behavior on the client device • Small memory footprint on the client • Limited I/O activity • Executable and data sizes must fit the available bandwidth
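As a rough screening test (our back-of-the-envelope heuristic, not from the original deck): if each work unit moves D bytes over a link of bandwidth B and then computes for time T, the application is desktop-grid friendly when T >> D/B, i.e., when a chunk computes for hours but transfers in seconds.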
Typical Applications • Desktop grids naturally support data-parallel applications • Monte Carlo methods • Large database searches • Genetic algorithms • Exhaustive search techniques • Parametric design (see the sketch below) • Asynchronous iterative algorithms
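To make the parametric-design case concrete, here is a minimal shell sketch (our illustration; the parameter names, file layout, and counts are all assumptions) that turns a parameter sweep into independent work units, one per parameter combination:

#!/bin/sh
# Hypothetical parametric sweep: emit one self-contained work unit per
# parameter combination; each unit can run on any desktop independently.
mkdir -p workunits
i=0
for pressure in 10 20 30 40; do
  for temp in 300 350 400; do
    dir=workunits/wu_$i
    mkdir -p "$dir"
    # Every work unit carries its own tiny input file, so there is no
    # shared state and no communication between units.
    printf 'pressure=%s\ntemperature=%s\n' "$pressure" "$temp" > "$dir/params.txt"
    i=$((i+1))
  done
done
echo "created $i independent work units"

Each work unit would then be wrapped in a submit file (Condor) or registered as a workunit (Grid MP, BOINC) and scattered across the pool.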
Condor • Condor manages pools of workstations and dedicated clusters to create a distributed high-throughput computing (HTC) facility • Created at the University of Wisconsin; project established in 1985 • Initially targeted at scheduling clusters, providing functions such as: • Queuing • Scheduling • Priority schemes • Resource classifications • Later extended to manage non-dedicated resources • Sandboxing • Job preemption
Why use Condor? • Condor has several unique mechanisms, such as: • ClassAd matchmaking • Process checkpoint / restart / migration • Remote system calls • Grid awareness • Glideins • Support for multiple “Universes” • Vanilla, Java, MPI, PVM, Globus, … • Very simple to install, manage, and use • Natural environment for application developers • Free!
Typical Condor Pool [Diagram: a central manager runs the master, collector, and negotiator daemons; a submit-only node runs master and schedd; execute-only nodes run master and startd; regular nodes run master, schedd, and startd. Arrows mark spawned processes and ClassAd communication pathways.]
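A quick way to see these daemons at work is with the standard Condor command-line tools (the commands are real; output is omitted here since it varies by pool):

condor_status          # ask the collector for the startd (execute-side) ClassAds
condor_status -avail   # only machines currently willing to run jobs
condor_q               # jobs queued at the local schedd
condor_submit job.cmd  # hand a submit description file to the schedd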
Condor ClassAds • ClassAds are at the heart of Condor • ClassAds: • are a set of uniquely named expressions; each expression is called an attribute • combine query and data • are semi-structured: no fixed schema • are extensible
Sample ClassAd
MyType = "Machine"
TargetType = "Job"
Machine = "froth.cs.wisc.edu"
Arch = "INTEL"
OpSys = "SOLARIS251"
Disk = 35882
Memory = 128
KeyboardIdle = 173
LoadAvg = 0.1000
Requirements = TARGET.Owner=="smith" || LoadAvg<=0.3 && KeyboardIdle>15*60
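For contrast, a job ClassAd that this machine ad could match might look like the following (our illustrative example, not from the deck; the negotiator matches a job and a machine when each ad's Requirements evaluates to true against the other ad):

MyType = "Job"
TargetType = "Machine"
Owner = "smith"
Cmd = "/home/smith/sim"
Requirements = TARGET.Arch=="INTEL" && TARGET.OpSys=="SOLARIS251" && TARGET.Disk >= 10000
Rank = TARGET.Memory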
Condor Flocking • Central managers can allow schedds from other pools to submit to them. [Diagram: a submit machine's schedd first contacts the collector/negotiator on its own central manager (CONDOR_HOST), then flocks to the collector/negotiator pairs on the Pool-Foo and Pool-Bar central managers.]
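Flocking is driven by a pair of condor_config macros; a minimal sketch, with hypothetical host names:

# On the submitting pool's machines: pools to flock to, in preference order
FLOCK_TO = cm.pool-foo.edu, cm.pool-bar.edu
# On each remote central manager: schedds allowed to flock in
FLOCK_FROM = submit.my-pool.edu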
Example: POVray on UT Grid Condor • Serial rendering time was 2 h 17 min • Split into slices that each render in 5-8 min • Total time on the pool is now 15 min
Parallel POVray on Condor • Submitting POVray to the Condor pool via a Perl script (sketched below) • Automated creation of image “slices” • Automated creation of Condor submit files • Automated creation of the DAG file • Using DAGMan for job-flow control • Multiple-architecture support • Executable = povray.$$(OpSys).$$(Arch) • Post-processing with a C executable • “Stitching” image slices back together into one image file • Using “xv” to display the image on the user's desktop • Alternatively, transferring the image file back to the user's desktop
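The deck references a Perl script but does not reproduce it; a minimal shell equivalent of the slicing step (the image dimensions and slice count are our assumptions) would write one POV-Ray .ini file per horizontal band:

#!/bin/sh
# Hypothetical stand-in for the slicing script: cut a 1024-row render
# into 13 horizontal slices, one POV-Ray .ini file per slice.
ROWS=1024 SLICES=13
i=0
while [ $i -lt $SLICES ]; do
  start=$(( i * ROWS / SLICES + 1 ))
  end=$(( (i + 1) * ROWS / SLICES ))
  cat > glasschess_$i.ini <<EOF
Input_File_Name=glasschess.pov
Output_File_Name=glasschess_$i.ppm
Width=1280
Height=$ROWS
Start_Row=$start
End_Row=$end
EOF
  i=$(( i + 1 ))
done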
POVray Submit Description File
Universe = vanilla
Executable = povray.$$(OpSys).$$(Arch)
Requirements = (Arch == "INTEL" && OpSys == "LINUX") || \
  (Arch == "INTEL" && OpSys == "WINNT51") || \
  (Arch == "INTEL" && OpSys == "WINNT52")
transfer_files = ONEXIT
Input = glasschess_0.ini
Error = Errfile_0.err
Output = glasschess_0.ppm
transfer_input_files = glasschess.pov,chesspiece1.inc
arguments = glasschess_0.ini
log = glasschess_0_condor.log
notification = NEVER
queue
DAGman Job Flow [Diagram: independent render jobs A0, A1, A2, …, An are each a PARENT of the single CHILD job B; pre-processing runs prior to executing job B.]
DAGman Submission Script
$ condor_submit_dag povray.dag

# Filename: povray.dag
Job A0 ./submit/povray_submit_0.cmd
Job A1 ./submit/povray_submit_1.cmd
Job A2 ./submit/povray_submit_2.cmd
Job A3 ./submit/povray_submit_3.cmd
Job A4 ./submit/povray_submit_4.cmd
Job A5 ./submit/povray_submit_5.cmd
Job A6 ./submit/povray_submit_6.cmd
Job A7 ./submit/povray_submit_7.cmd
Job A8 ./submit/povray_submit_8.cmd
Job A9 ./submit/povray_submit_9.cmd
Job A10 ./submit/povray_submit_10.cmd
Job A11 ./submit/povray_submit_11.cmd
Job A12 ./submit/povray_submit_12.cmd
Job B barrier_job_submit.cmd
PARENT A0 CHILD B
PARENT A1 CHILD B
PARENT A2 CHILD B
PARENT A3 CHILD B
PARENT A4 CHILD B
PARENT A5 CHILD B
PARENT A6 CHILD B
PARENT A7 CHILD B
PARENT A8 CHILD B
PARENT A9 CHILD B
PARENT A10 CHILD B
PARENT A11 CHILD B
PARENT A12 CHILD B
Script PRE B postprocessing.sh glasschess

# Executable run by the barrier job B (a no-op placeholder):
#!/bin/sh
/bin/sleep 1

# postprocessing.sh, the PRE script for B, stitches and displays the image:
#!/bin/sh
./stitchppms glasschess > glasschess.ppm 2> /dev/null
rm *_*.ppm *.ini Err* *.log povray.dag.*
/usr/X11R6/bin/xv $1.ppm
United Devices Grid MP • Commercial product that aggregates unused cycles on desktop machines to provide a computing resource. • Originally designed for non-dedicated resources • Security, non-intrusiveness, scheduling, … • Screensaver/graphical GUI on client desktop • Support for multiple clients • Windows, Linux, Mac, AIX, & Solaris clients
How Grid MP™ Works [Diagram] • Grid MP Services: authenticate users and devices; dispatch jobs based on priority; monitor and reschedule failed jobs; collect job results • Grid MP Agent (runs on desktops, workstations, servers, and clusters): advertises capability; launches jobs; executes jobs securely; returns results; caches data for reuse • User: submits jobs; monitors job progress; processes results • Administrator interfaces: web browser, command line, XML Web services API • Typical placement: low-latency parallel jobs and large sequential jobs on clusters; large data-parallel jobs on servers, workstations, and desktops
UD Management Features • Enterprise features make it easier to convince traditional IT organizations and individual desktop users to install the software • Browser-based administration tools allow local management and policy specification for • Devices • Users • Workloads • Single-click install of the client on PCs • Easily customizable to work with software-management packages
Grid MP™ Provisioning Example [Diagram: a root administrator delegates device groups X, Y, and Z to device-group administrators, each of whom sets per-group policies] • Device Group X (daytime policy): User Group A = 50%, B = 25%; usage 8am-5pm, 2-hour cut-off; list of runnable applications • Device Group X (overnight policy): User Group A = 50%, B = 50%; usage 6pm-8am, 8-hour cut-off; list of runnable applications • Device Group Y: User Group B = 100%; usage 24 hours, 1-hour cut-off; list of runnable applications
Application Types Supported • Batch jobs • Use the mpsub command to run a single executable on a single remote desktop • MPI jobs • Use the ud_mpirun command to run an MPI job across a set of desktop machines • Data-parallel jobs • A single job consists of several independent workunits that can be executed in parallel • The application developer must create program modules and write application scripts to create workunits (see the sketch below)
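Grid MP's application-script interface is not shown in the deck; purely as a hedged illustration of the workunit idea (the file names and chunk size are assumptions, and no real Grid MP API calls appear), the pre-processing step might look like:

#!/bin/sh
# Hypothetical pre-processing for a data-parallel job: split one large
# input into chunks, each of which becomes one independent workunit.
split -l 100000 big_database.dat workunit_
for f in workunit_*; do
  # Registering each chunk with Grid MP would go through the product's
  # own application-script interface, which is not reproduced here.
  echo "would register $f as one workunit"
done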
Hosted Applications • Hosted applications are easier to manage • Provide users with a managed application • Great for applications that are run frequently but rarely updated • Data-parallel applications fit the hosted scenario best • Users do not have to deal with application maintenance; only the developer does • Grid MP is optimized for running hosted applications • Applications and data are cached at client nodes • Affinity scheduling minimizes data movement by re-using cached executables and data • A hosted application can be run across multiple platforms by registering an executable for each platform
Example: Reservoir Simulation • Landmark's VIP product benchmarked on Grid MP • Workload consisted of 240 simulations for 5 wells • Sensitivities investigated: • 2 PVT cases • 2 fault-connectivity cases • 2 aquifer cases • 2 relative-permeability cases • 5 combinations of 5 wells • 3 combinations of vertical permeability multipliers • (2 × 2 × 2 × 2 × 5 × 3 = 240 simulations) • Each simulation was packaged as a separate piece of work • A similar reservoir-simulation application has been developed at TACC (with Dr. W. Bangerth, Institute of Geophysics)
Example: Drug Discovery • Think & LigandFit applications • Internet project in partnership with Oxford University • Models interactions between proteins and potential drug molecules • Virtual screening of drug molecules to reduce time-consuming, expensive lab testing by 90% • Drug database of 3.5 billion candidate molecules • Over 350K active computers participating all over the world
Think • Code developed at Oxford University • Application characteristics: • Typical input data file: < 1 KB • Typical output file: < 20 KB • Typical execution time: 1000-5000 minutes • Floating-point intensive • Small memory footprint • Fully resolved executable is ~3 MB in size
BOINC • Berkeley Open Infrastructure for Network Computing (BOINC) • Open-source follow-on to SETI@home • General architecture supports multiple applications • Targets volunteer resources, not enterprise desktops/workstations • More information at http://boinc.berkeley.edu • Currently used by several internet projects
Structure of a BOINC project [Diagram: a MySQL BOINC DB at the center, fed by back-end daemons for work generation, retry generation, result processing, result validation, and garbage collection; clients talk to a C++ scheduling server and to multiple HTTP data servers; PHP web interfaces sit alongside] • Ongoing tasks: • Monitor server correctness • Monitor server performance • Develop and maintain applications
BOINC (cont’d) • No enterprise management tools • Focus on the “volunteer grid” • Provides incentives (points, teams, website) • Basic browser interface to set usage preferences on PCs • Support for a user community (forums) • Simple interface for job management • The application developer creates scripts to submit jobs and retrieve results • Provides a sandbox on the client • No encryption: uses redundant computing to prevent result spoofing (sketched below)
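Redundant computing means issuing the same workunit to several independent clients and accepting a result only when enough replicas agree. A toy sketch of majority agreement by checksum (the file names are hypothetical; real BOINC validators compare results in application-specific ways, e.g. with numeric tolerances):

#!/bin/sh
# Toy validator: accept a workunit only if a strict majority of the
# replica result files are byte-identical (compared by md5 checksum).
replicas=$(ls result_*.out | wc -l)
votes=$(md5sum result_*.out | awk '{print $1}' | sort | uniq -c | sort -rn | head -1 | awk '{print $1}')
if [ "$votes" -gt $(( replicas / 2 )) ]; then
  echo "$votes of $replicas replicas agree: accept result"
else
  echo "no majority: reissue the workunit to another client"
fi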
Projects using BOINC • Climateprediction.net: study climate change • Einstein@home: search for gravitational signals emitted by pulsars • LHC@home: improve the design of the CERN LHC particle accelerator • Predictor@home: investigate protein-related diseases • Rosetta@home: help researchers develop cures for human diseases • SETI@home: look for radio evidence of extraterrestrial life • Cell Computing: biomedical research (Japanese; requires nonstandard client software) • World Community Grid: advance our knowledge of human disease (requires BOINC 5.2.1 or greater)
SETI@home • Analysis of radio telescope data from Arecibo • SETI: search for narrowband signals • Astropulse: search for short broadband signals • 0.3 MB in, ~4 CPU hours, 10 KB out
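A quick compute-to-communication check on those numbers (our arithmetic, not from the deck): roughly 310 KB moved per ~4 CPU hours is about 310,000 B / 14,400 s ≈ 22 bytes/s of average bandwidth per client, comfortably below even a dial-up link, which is exactly the high ratio the feasibility slide calls for.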
Climateprediction.net • Climate-change study (Oxford University) • Met Office model (FORTRAN, 1M lines) • Input: ~10 MB executable, 1 MB data • Output per workunit: • 10 MB summary (always uploaded) • 1 GB detail file (archived on the client, may be uploaded) • CPU time: 2-3 months (jobs can't migrate) • Trickle messages (periodic client-server updates during the long run) • Preemptive scheduling
Why use Desktop Grids? • Desktop grid solutions are typically complete and standalone • Easy to set up and manage • A good entry vehicle for trying out grids • Use existing (but underutilized) resources • The number of desktops/workstations on a campus (or in an enterprise) is typically an order of magnitude greater than traditional compute resources • The power of the grid grows over time as new, faster desktops are added • The typically large number of resources on desktop grids enables new approaches to solving problems