
Lightweight Monitoring of the Progress of Remotely Executing Computations



  1. Lightweight Monitoring of the Progress of Remotely Executing Computations
  Shuo Yang, Ali R. Butt, Y. Charlie Hu, Samuel P. Midkiff
  Purdue University

  2. Harvesting Unused Resources
  • Typical workloads are bursty
    • Periods of little or no processing
    • Periods of insufficient CPU resources
  • Idle cycles cannot be saved for future use
  • Exploit the value of otherwise wasted idle resources
    • Gain more processing capability for "free" or at low cost
    • "Smooth out" the workload

  3. The Need for Remote Monitoring
  • Centralized cycle sharing
    • SETI@Home, Genome@Home, IBM (with United Devices), etc.
    • Condor, Microsoft (with GridIron), etc.
  • P2P-based cycle sharing (Butt et al. [VM'04])
    • Individual nodes can utilize the system – more incentive
    • Nodes can be across administrative domains – more available resources
  • Remote execution motivates remote monitoring
    • Unreliable resources
    • Untrusted resources

  4. Review of GridCop – [Yang et al., PPoPP'05]
  [Architecture diagram: on the host machine, the submitted job (H-code) runs in a sandboxed JVM, and its reporting module sends progress information and partial computation to the submitter, where a processing module (S-code) runs in a JVM alongside its own reporting module.]

  5. Our New Contribution: Key Differences from GridCop
  • Uses probabilistic code instrumentation
    • Prevents replay attacks (as GridCop does)
    • No recomputation needed – reduces network traffic and submitter-machine overhead
  • Ties the progress information closely to the program structure
    • Makes spoofing more difficult
    • PC values reflect the internal structure of the program binary

  6. Outline
  • Overview
  • Design of the Lightweight Monitoring Mechanism
  • Experimental Results
  • Related Research and Conclusions

  7. System Overview: Code Generation
  [Diagram: the code-generation system takes the original code and produces two versions – host-code, executed on the host, which emits progress information ("beacons") during computation, and submitter-code, executed on the submitter, which processes the beacons.]

  8. System Overview
  [Diagram: on the host machine, the submitted job (H-code) passes beacons to a reporting module, which sends them across the network to the beacon-processing module (S-code) on the submitter. A sketch of the reporting step follows.]
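
The slides do not show how a beacon actually travels from the host to the submitter. Below is a minimal sketch of the reporting step, assuming the host keeps an open TCP connection to the submitter; the descriptor name beacon_fd and the unframed wire format are illustrative assumptions, not details from the paper.

    /* Hypothetical host-side beacon transmission (sketch). */
    #include <unistd.h>

    extern int beacon_fd;   /* open TCP socket to the submitter's S-code */

    void deposit_beacon(void *pc)
    {
        /* Ship the raw PC value; a real system would add framing,
           buffering, and error handling. */
        write(beacon_fd, &pc, sizeof pc);
    }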

  9. Basic Idea of FSA Tracking
  • Beacons are placed at significant execution points along the control-flow graph (CFG)
    • Beacons can be viewed as states in a finite-state automaton (FSA)
    • They can be placed at any site satisfying the compiler's instrumentation criteria, e.g., MPI call sites in this paper
  • The host emits beacon messages at significant execution points
    • An FSA emitting transition symbols
  • The submitter processes beacon messages (see the sketch below)
    • A mirror FSA recognizing legal transitions
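
To make the mirror FSA concrete, here is a minimal sketch of the submitter-side beacon check, assuming the code generator emits a transition table alongside the submitter code; the table layout and all names are illustrative, not the paper's implementation.

    /* Hypothetical submitter-side mirror FSA (sketch). */
    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_SUCC 4

    struct fsa_state {
        unsigned long pc;       /* BLB address labeling this state */
        int succ[MAX_SUCC];     /* indices of legal successor states */
        int nsucc;
    };

    extern struct fsa_state fsa[];  /* built from the host binary's CFG */
    static int cur = 0;             /* current state */

    /* Called for every beacon value received from the host. */
    void consume_beacon(unsigned long pc)
    {
        for (int i = 0; i < fsa[cur].nsucc; i++) {
            int s = fsa[cur].succ[i];
            if (fsa[s].pc == pc) {
                cur = s;            /* legal transition: progress recorded */
                return;
            }
        }
        /* No legal transition accepts this beacon: flag the host. */
        fprintf(stderr, "illegal beacon %#lx in state %d\n", pc, cur);
        exit(1);
    }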

  10. An FSA Example
  main() {
      ...
      mpi_irecv(...);      // S1
      ...
      if (predicate) {
          mpi_send(...);   // S2
      }
      ...
      mpi_wait();          // S3
      ...
  }
  [FSA diagram: states S1, S2, and S3, following the control flow of the code – S1 to S2 to S3, or S1 directly to S3 when the branch is not taken.]

  11. Binary File Location Beacon (BLB)
  • BLB values are the virtual addresses of instructions in the virtual memory of a process – the states of the FSA
  [Diagram: process address-space layout (stack, heap, bss, initialized data, code segment), with three call sites in the code segment:
      804a641: call mpi_irecv
      804a679: call mpi_send
      804a69b: call mpi_wait]

  12. PC Values – Labels Driving the Transitions in the FSA
  main() {
      ...
      pc = getPC();
      mpi_irecv(...);      // 0x804a641
      deposit_beacon(pc);
      if (predicate) {
          pc = getPC();
          mpi_send(...);   // 0x804a679
          deposit_beacon(pc);
      }
      pc = getPC();
      mpi_wait();          // 0x804a69b
      deposit_beacon(pc);
      ...
  }
  • The compiler inserts a getPC() in front of each BLB
  • getPC() returns the address of the next instruction (a possible implementation is sketched below)
  [FSA diagram: states @804a641, @804a679, and @804a69b; from @804a641, the transition labeled "804a679" leads to @804a679 and the transition labeled "804a69b" leads to @804a69b; from @804a679, the transition labeled "804a69b" leads to @804a69b.]
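
The slides do not give getPC()'s implementation. One plausible sketch for GCC on x86: make getPC() a non-inlined function, so that its own return address is, by construction, the address of the instruction immediately following the call, i.e., the upcoming BLB site. This is an assumption about how it could be written, not the paper's code.

    /* Hypothetical getPC() (sketch, GCC-specific). */
    __attribute__((noinline))   /* a real call frame must exist */
    void *getPC(void)
    {
        /* The return address of this frame is the instruction right
           after "call getPC" -- the BLB site that follows. */
        return __builtin_return_address(0);
    }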

  13. Tracking the Progress of an MPI Program
  [Animation of the same example as slide 12: as the host executes, each deposit_beacon(pc) call drives the submitter's mirror FSA through the legal transitions @804a641 → @804a679 → @804a69b, or directly from @804a641 to @804a69b when the branch is not taken.]

  14. Attacks on the FSA Mechanism
  • Susceptible to a replay attack
    • Record the stream of beacons from a previous run
    • Replay the stream in a future run (cheating to gain undeserved compensation)
  • Reverse-engineering the binary executable
    • To understand the control-flow graph
    • Expensive – NP-hard in the worst case ([Wang, PhD thesis, University of Virginia])

  15. Probabilistic BLB
  • Each MPI function call site is a BLB candidate, but not necessarily a BLB site
  • A candidate is used as a BLB site with probability PB in (0, 1), as sketched below
  • Effect: an individual MPI function call site may be a BLB in the FSA in one code generation but not in the next
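
A minimal sketch of the per-site decision the code generator might make, assuming a uniform pseudo-random draw; the value of PB, the seeding policy, and the function name are illustrative assumptions.

    /* Hypothetical instrumentation decision (sketch). */
    #include <stdlib.h>

    #define PB 0.5   /* BLB probability, chosen in (0, 1) */

    /* Called once per MPI call site during code generation; reseeding
       the generator on every submission makes each generated FSA,
       and hence each legal beacon stream, differ from run to run. */
    int choose_as_blb(void)
    {
        return ((double)rand() / RAND_MAX) < PB;
    }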

  16. Probabilistic BLBs Guard Against Attacks
  • The same job can have a different FSA each time it is submitted to the host
    • This leads to a different legal beacon-value stream
    • Defeats the replay attack by making it detectable
  • Reverse engineering by binary analysis must be repeated by a cheating host on each run
    • "Break once, spoof only once" – too expensive!

  17. One FSA with Probabilistic BLB
  [Same code and FSA as slide 12: in this code generation, all three MPI call sites (0x804a641, 0x804a679, 0x804a69b) happen to be chosen as BLBs.]

  18. Another FSA with Probabilistic BLB
  main() {
      ...
      pc = getPC();
      mpi_irecv(...);      // 0x804a641
      deposit_beacon(pc);
      if (predicate) {
          mpi_send(...);   // 0x804a679  (not chosen as a BLB in this code generation)
      }
      pc = getPC();
      mpi_wait();          // 0x804a69b
      deposit_beacon(pc);
      ...
  }
  [FSA diagram: only two states, @804a641 and @804a69b, connected by the transition labeled "804a69b".]

  19. Outline
  • Overview
  • Design of the Lightweight Monitoring Mechanism
  • Experimental Results
  • Related Research and Conclusions

  20. Experimental Setup
  • Submitter machine @ UIUC (thanks to Josep Torrellas)
    • Intel 3 GHz Xeon with 512 KB cache, 1 GB main memory
    • Running a Linux 2.4.20 kernel
  • Host machines @ Purdue
    • A cluster of 8 Pentium 4 machines (each with 512 KB cache and 512 MB main memory), interconnected by Fast Ethernet
    • Running FreeBSD 4.7, MPICH 1.2.5
  • Network access
    • Both sites connected to their campus networks via Ethernet
    • UIUC–Purdue: representative of a typical cycle-sharing scenario across a WAN

  21. Benchmarks & Evaluation Metrics
  • Used the NAS Parallel Benchmarks (NPB) 3.2
    • A set of benchmarks for evaluating the performance of parallel computational resources
  • Run-time computation overhead
  • Network traffic overhead
    • Network resources are not "free"
  • Beacon distribution over time
    • Capability to track progress incrementally

  22. Host-Side Computation Overhead for Different Numbers of Nodes
  • Overhead = (Tmonitoring – Toriginal) / Toriginal × 100% (see the worked example below)
  • Lower bars are better
  • Overhead does not increase monotonically with the number of processes
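
As a worked example with hypothetical numbers: if the uninstrumented run takes Toriginal = 100 s and the monitored run takes Tmonitoring = 102 s, the overhead is (102 – 100) / 100 × 100% = 2%.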

  23. Host-Side Computation Overhead under Different Input Sizes
  • Overhead = (Tmonitoring – Toriginal) / Toriginal × 100%
  • Lower bars are better
  • Lower overhead for larger problem sizes

  24. Submitter-Side Computation Cost
  • Overhead = time(submitter code) / execution time
  • An imperfect metric – the value depends on the submitter's hardware, the submitter's workload, the host's speed, etc.

  25. Network Traffic Incurred by Monitoring
  • Bytes sent over the network between the host and submitter machines, divided by the total execution time
  • Low bandwidth usage

  26. Beacon Distribution over Time
  • A uniform distribution of beacons enables incremental tracking of progress

  27. Outline
  • Overview
  • Design of the Lightweight Monitoring Mechanism
  • Experimental Results
  • Related Research and Conclusions

  28. Related Research
  • L. F. Sarmenta [CCGrid'01], W. Du et al. [ICDCS'04]
    • A host performs the same computation on different inputs
    • Needs a central manager
  • Yang et al. [PPoPP'05]
    • Partially duplicates the computation
    • Incurs more network traffic due to the recomputation
  • Hofmeyr et al. [J. of Computer Security'98], Chen and Wagner [CCS'02]
    • Use system-call sequences to detect intrusions
    • Approaches for achieving host security

  29. Conclusions
  • Lightweight monitoring over a WAN/the Internet is possible
  • No changes to the host-side system are required
  • Instrumentation can be performed automatically

  30. Host-Side Overhead Details (Slide 22)
  Overhead = (Tmonitoring – Toriginal) / Toriginal
  • Does not increase monotonically with an increase in the number of processes (Nprocess)
  • When Nprocess increases:
    • The denominator, Toriginal, decreases
    • The numerator, the difference between Tmonitoring and Toriginal, also decreases (the number of MPI calls decreases, reducing the overhead of BLB message generation)
    • Synchronization: there is always one extra thread per process, no matter how many processes are running

  31. Host-Side Overhead Details (Slide 23)
  Overhead = (Tmonitoring – Toriginal) / Toriginal
  • Lower overhead for larger problem sizes
  • When the problem size increases:
    • The denominator (Toriginal) increases
    • The numerator (Tmonitoring – Toriginal) stays similar, since the number of MPI calls is similar
