1 / 27

BOINC Workshop 10

VOLPEX. BOINC Workshop 10. Enabling interprocess communication for BOINC applications. Hien Nguyen, Eshwar Rohit University of Houston Supervisors: Dr. Jaspal Subhlok University of Houston Dr. David P. Anderson SSL – U.C, Berkeley. RESEARCH GOAL.

mikasi
Download Presentation

BOINC Workshop 10

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. VOLPEX BOINC Workshop 10 Enabling interprocess communication for BOINC applications Hien Nguyen, Eshwar Rohit University of Houston Supervisors: Dr. Jaspal Subhlok University of Houston Dr. David P. Anderson SSL – U.C, Berkeley

  2. RESEARCH GOAL • Enable BOINC to efficiently support apps that require interprocess communication. • Goals: • Easier programming for communicating applications • Reduce execution time (not increase throughput) 2 Hien Nguyen University of Houston

  3. Example Applications • REMD Protein Folding application • Each process runs a standard molecular simulation at different temperature P1 P2 P3 P4 P5 P6 P7 P8 270 280 290 300 310 320 330 340 STEP-1 280 270 290 300 320 310 340 330 STEP-2 280 290 270 300 320 340 310 330 STEP-3 3 Hien Nguyen University of Houston

  4. Example Applications • Or many other applications: • Differential equation solvers (grid) (synchronous) • Game playing with alpha/beta pruning (asynchronous) • Search application. • ….. • Suitable applications: moderate amount of communication. 4 Hien Nguyen University of Houston

  5. DIFFICULTIES Job execution Fast host X X Slow host Synchronization point X Slow down overall execution speed Worse as number of host increases X 5 Hien Nguyen University of Houston

  6. OUTLINE • Volpex Dataspace Overview • IPC for volunteer environment • Integration With BOINC • Process management • Host selection • Future and Related Work 6 Hien Nguyen University of Houston

  7. Volpex Dataspace • Dataspace: global shared space that processes can use for information exchange without a temporal or spatial coupling. Volpex Dataspace Server Put(ABC, 800) Get(ABC,?) (800) 7 Hien Nguyen University of Houston

  8. Volpex Dataspace – Fault Tolerance replicated X Put(ABC, 800) Get(ABC,?) (800) Volpex DSS is designed to support redundant Put/Get operations unlike Linda & variants 8 Hien Nguyen University of Houston

  9. Volpex Dataspace • Why centralized? Scale issues • But: firewalls, no incoming connections 9 Hien Nguyen University of Houston

  10. INTEGRATION WITH BOINC • Mechanics: Process management • Simultaneous process starting • Fault tolerance • Checkpoint/restart • Policy: Host selection. 10 Hien Nguyen University of Houston

  11. Job execution scheme BOINC Scheduler X Get work Get checkpoint Put data item Volpex Dataspace Server Put checkpoint Get data item 11 Hien Nguyen University of Houston

  12. PROCESSES MANAGEMENT • Simultaneous process starting: • All processes start computation together: reduce wasted resources because processes will have to wait for eachothers. • Volpex jobs have highest (infinite) priority: uninterruptible by other jobs. • While waiting for all processes of a Volpex job to be ready: host can do other finite priority volunteer jobs. • Use of boinc_temporary_exit() 12 Hien Nguyen University of Houston

  13. PROCESSES MANAGEMENT • Fault tolerance • Dead instance spotted by heartbeat mechanism: process instances regularly send heartbeat to Volpex DSS. • BOINC scheduler recruits a new volunteer host to replace the dead one. 13 Hien Nguyen University of Houston

  14. PROCESSES MANAGEMENT • Hot spare policy: BOINC Scheduler Fast replacement X Hot spare group Volpex Dataspace Server 14 Hien Nguyen University of Houston

  15. PROCESSES MANAGEMENT • Checkpointing: • Process instance commits and uploads checkpoints to Volpex DSS (only stores latest checkpoint for each process). • Restarted process instance requests checkpoint from Volpex DSS. 15 Hien Nguyen University of Houston

  16. HOST SELECTION • Volpex job: consists of processes that form a job, submitted by scientist. • Has requirements on: • Deadline • Number of processes • Has estimates of: • Total flops per process. • Flops between 2 consecutive checkpoints. • Memory usage, disk usage 16 Hien Nguyen University of Houston

  17. HOST SELECTION POLICY • Criteria for selecting volunteer hosts to assign to a Volpex job: Speed and Availability. • Availability: the interval that a host is available w/o interruption (BOINC client allowed to compute). 17 Hien Nguyen University of Houston

  18. HOST SELECTION POLICY • Job’s minimum requirements: • Minimum speed : Fast enough to meet job’s deadline • Min speed = Total flops / Deadline • Minimum expected availability : host is continuously available for x hours to commit at least 1 checkpoint. 18 Hien Nguyen University of Houston

  19. Evaluate Host's Availability • We want to predict host's length of availability interval. • Method based on : Exploiting Non-Dedicated Resources for Cloud Computing Artur Andrzejak, Derrick Kondo, David P. Anderson. (NOMS10) 19 Hien Nguyen University of Houston

  20. Evaluate Host's Availability • Last value predictor: simplistic predictor which uses the availability value in the last hourly interval before prediction as the prediction of availability for the next x hours interval. • Combined with ranking hosts by predictability: number availability changes per week. 20 Hien Nguyen University of Houston

  21. Evaluate Host's Availability In essence: select hosts which change availability very rarely. A process assigned to a host with high predictability does not necessarily need to be replicated. 21 Hien Nguyen University of Houston

  22. IMPLEMENTATION STATUS • Volpex utilities: for scientists to submit, abort or query status of a Volpex job. • Modified BOINC scheduler: includes host selection for Volpex job. • Modified Volpex DSS: handling new type of requests. 22 Hien Nguyen University of Houston

  23. IMPLEMENTATION STATUS BOINC Server BOINC Scheduler submit job specs Scheduler request dynamically create result from WU Create job & WU Scientist X Scheduler reply replacement heartbeat get procID Database checkpoint Volpex Dataspace Server Hot spare group request procID procID of failed instance 23 Hien Nguyen University of Houston

  24. FUTURE WORK • Experiment and evaluate with different degrees of freedom: • Number of processes 10-1M • Communication pattern (local/global, synch/asynch) • Size and frequency of communication • Goal: study (via live experiment or simulation) the performance of Volpex/BOINC over this space • Application: Eratosthenes, REMD Protein Folding • Enhance host selection policy 24 Hien Nguyen University of Houston

  25. OTHER WORK • Volpex MPI: • An MPI library designed for executing parallel applications in volunteer environment. • Direct communication between processes. • Key Features • Controlled redundancy • Receiver based direct communication • Distributed sender based logging • More detail: “VolpexMPI: an MPI Library for Execution of Parallel Applications on Volatile Nodes” by Troy LeBlanc, Rakhi Anand, Edgar Gabriel, and Jaspal Subhlok. 25 Hien Nguyen University of Houston

  26. If you have application? • We would be happy to cooperate with you. • Our team contacts: • Dr. Jaspal Subhlok: jaspal@uh.edu • Dr. David Anderson: davea@ssl.berkeley.edu • Dr. Edgar Gabriel: gabriel@cs.uh.edu • Hien Nguyen: hien.nguyen.nx@gmail.com • Eshwar Rohit: eshwar.rohit@gmail.com • Rakhi Anand: rakhi@cs.uh.edu • Our Website: http://www2.cs.uh.edu/~jsteach/volpex/index.htm 26 Hien Nguyen University of Houston

  27. VOLPEX Thanks for listening!

More Related