
The Ethernet Approach to Grid Computing

Douglas Thain and Miron Livny, Condor Project, University of Wisconsin. http://www.cs.wisc.edu/condor/ftsh


Presentation Transcript


  1. The Ethernet Approach to Grid Computing
     Douglas Thain and Miron Livny
     Condor Project, University of Wisconsin
     http://www.cs.wisc.edu/condor/ftsh

  2. The UW US-CMS Physics Grid
     [Diagram labels, with implementation language in parentheses:]
     • Wrapper
     • globus-url-copy (C)
     • Gatekeeper (C)
     • MCRunJob (python)
     • Impala (bash)
     • Jobmanager (C)
     • MOP (python)
     • Batch Interface (bash)
     • Submit DAG (perl)
     • Batch System (???)
     • DAGMan (C++)
     • Condor-G (C++)
     • MOP wrapper (bash)
     • Gridmanager (C++)
     • Impala wrapper (bash)
     • Actual Job (Fortran)
     • GAHP Server (C++)

  3. Outline
     • Two problems in real systems:
       • Timing is uncontrollable.
       • Failures lack detail.
     • A solution:
       • The Ethernet Approach.
     • A language and a tool:
       • The Fault Tolerant Shell.
       • Time and failures are explicit.
     • Example Applications:
       • Shared Job Queue.
       • Shared Disk Buffer.
       • Shared Data Servers.
     [Diagram labels: WWW Server (x2), dataset (x2), Black Hole, Client (x5)]
     [Diagram labels: Ethernet: Carrier Sense, Collision Detect, Exponential Backoff, Limited Allocation]
     [Code thumbnail: try for 30 minutes ... end]

  4. 1 - Timing is Uncontrollable
     • Consider a distributed file system.
     • Suppose that the network is down.
       • “soft mounted”: failure after one minute
       • “hard mounted”: failure never exposed
     • Time is an unknown in nearly every operating system activity:
       • Process invocation.
       • Memory access.
       • Network communications.

  5. 2 - Failures Lack Detail
     • Consider this trivial program:
         % cp a b
     • We would like to distinguish:
       • “success.”
       • “file not found.”
       • “nfs server down, still trying.”
       • “couldn’t find library libc.so.25.”

  6. 2 - Failures Lack Detail
     • Consider this trivial program:
         % cp a b
     • Actual results:
       • “success.” (exit code 0)
       • “file not found.” (exit code 1)
       • “nfs server down, still trying.” (exit code 1)
       • “couldn’t find library libc.so.25.” (exit code 1)

  7. Examples Abound!
     • TCP connect -> ECONNREFUSED
       • Wrong port number.
       • A loaded service is rejecting connections.
       • The machine has just rebooted, has initialized TCP/IP, but not yet started the service.
     • FTP RETR -> code 550
       • “550 File or directory not found.”
       • “550 Erlaubnis hat verweigert.”
       • “550 Archiveer systeem offline.”
       • “550 Fuori di memoria.”
       • “550 File staging in from tape.” (NCSA Unitree)

  8. Not enough information or control.
     • How do we design new systems that avoid these problems?
       • “Error Scope” (HPDC 2002)
     • Real systems have these problems. How can we learn to live with them?
       • “Ethernet Approach” (HPDC 2003)

  9. The Ethernet Approach
     • Ethernet Rules:
       • Carrier Sense
       • Collision Detect
       • Exponential Backoff
       • Limited Allocation
     • No Carrier Sense == Aloha Protocol
     • Network or Memory or Disk Space or OS Resources

  10. The Fault Tolerant Shell
      • A tool that encourages the Ethernet approach in system integration.
      • Similar to the Bourne or C-Shells.
      • Process invocation and repetition are simple.
      • Other elements are possible but ugly.
      • Not meant to be general purpose, high performance, or abstractly beautiful.
      • Not OOP, AOP, SOP, GP, etc...
      • Ethernet ideas could be used in such languages.
      • Elements:
        • Brittle property, try/catch, timed try, forany/forall.

  11. The Brittle Property
          wget http://host/file.tar.gz
          gunzip file.tar.gz
          tar xvf file.tar
      Failure of any step causes an immediate halt of the entire group.
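The brittle property can be approximated outside ftsh. The following is a minimal Python sketch, not ftsh's implementation: `run_brittle` and `GroupFailed` are hypothetical names introduced here, and the single untyped exception class mirrors the idea that a failed group simply fails.

```python
import subprocess
import sys


class GroupFailed(Exception):
    """Hypothetical error: some command in a brittle group exited non-zero."""


def run_brittle(commands):
    """Run each argv list in order and halt the whole group at the first failure."""
    completed = []
    for argv in commands:
        result = subprocess.run(argv)
        if result.returncode != 0:
            # Later commands are never started once one step fails.
            raise GroupFailed(f"{argv!r} exited with code {result.returncode}")
        completed.append(argv)
    return completed


if __name__ == "__main__":
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    try:
        run_brittle([ok, bad, ok])
    except GroupFailed as err:
        print("group halted:", err)
```

The demonstration uses the Python interpreter itself as the command so that it does not depend on `wget`, `gunzip`, or `tar` being installed.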

  12. Untyped Exceptions
          try
            wget http://host/file.tar.gz
            gunzip file.tar.gz
            tar xvf file.tar
          catch
            echo "Zoiks!"
          end
      Failure of this group raises an exception.
      Exceptions have no type!

  13. Timed Try Statements
          try for 30 minutes
            wget http://host/file.tar.gz
            gunzip file.tar.gz
            tar xvf file.tar
          end
      • The enclosed statement will be cancelled after 30 minutes.
      • An exception in the enclosed statement causes a retry, for up to 30 minutes, with exponential backoff.
      • Success after n attempts is as good as success after one. Otherwise, the try fails.

  14. Timed Try Statements
      • If group completes within time limit.
        • Try block succeeds.
      • If group fails within time limit.
        • Automatically retried.
        • Exponentially increasing delay.
        • Random factor to avoid collisions.
      • If group runs over time limit.
        • Resources reclaimed, exception thrown.
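The retry behaviour of a timed try can be sketched in Python. This is an illustration under stated assumptions, not how ftsh works internally: `try_for`, `TryTimeout`, the one-second starting delay, and the 1x to 2x jitter range are all choices made here. Unlike ftsh, the sketch does not cancel an attempt that is still running when the limit passes; it only stops retrying.

```python
import random
import time


class TryTimeout(Exception):
    """Hypothetical error: the time budget ran out before any attempt succeeded."""


def try_for(seconds, action, base_delay=1.0, max_delay=3600.0,
            clock=time.monotonic, sleep=time.sleep, rng=random.random):
    """Retry action() until it succeeds or the time budget is exhausted."""
    deadline = clock() + seconds
    delay = base_delay
    while True:
        try:
            return action()  # success after n attempts counts as success
        except Exception as err:
            # Random factor: wait between 1x and 2x the current delay so that
            # competing clients do not all retry at the same instant.
            wait = delay * (1.0 + rng())
            if clock() + wait >= deadline:
                raise TryTimeout(f"gave up after {seconds} seconds") from err
            sleep(wait)
            delay = min(delay * 2, max_delay)  # exponentially increasing delay
```

The `clock`, `sleep`, and `rng` parameters exist only so the backoff schedule can be exercised without real waiting, for example by passing a fake clock in a test.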

  15. forany and forall
          forany host in xxx yyy zzz
            wget http://${host}/file
          end
      Attempt to make this statement succeed for any random branch.
          forall host in xxx yyy zzz
            wget http://${host}/file
          end
      Attempt to make this statement succeed for all branches simultaneously.
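The difference between the two constructs can be illustrated with a short Python sketch. The function names mirror the ftsh keywords, but the code is an approximation written for this transcript: `AllBranchesFailed` is a hypothetical name, and this `forall` waits for every branch to finish before reporting a failure, which may differ from what ftsh does with the remaining branches.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class AllBranchesFailed(Exception):
    """Hypothetical error: no alternative in a forany succeeded."""


def forany(items, action, rng=random):
    """Try the alternatives in random order; return the first that succeeds."""
    order = list(items)
    rng.shuffle(order)
    for item in order:
        try:
            return item, action(item)
        except Exception:
            continue  # this branch failed, move on to another one
    raise AllBranchesFailed(f"none of {order!r} succeeded")


def forall(items, action):
    """Run every branch concurrently; succeed only if all of them succeed."""
    items = list(items)
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        futures = [pool.submit(action, item) for item in items]
        # result() re-raises the first branch exception encountered, in order.
        return [future.result() for future in futures]
```

For example, `forany(["xxx", "yyy", "zzz"], fetch)` returns as soon as one host works, while `forall(["xxx", "yyy", "zzz"], fetch)` raises if any single host fails; `fetch` here stands for whatever per-host operation the caller supplies.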

  16. Example Applications
      [Diagram labels: Ethernet Properties; handled by ftsh; handled by coder]

  17. Shared Job Queue
      Multiple clients connect to a job queue to manipulate jobs. (Submit, query, remove, etc.)
      What’s the bottleneck?
      [Diagram labels: Client (x3), Match Maker, Condor schedd, CPU (x3), Local Filesystem, Activity Log, Job Queue, Job (multiple)]

  18. Aloha Client
          try for 5 minutes
            condor_submit job.file
          end

  19. Ethernet Client
          try for 5 minutes
            if avail_fds() .lt. 1000
              failure
            end
            condor_submit job.file
          end
      • avail_fds(): Measure free file descriptors.
      • failure: Throw an exception and try again.
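The carrier-sense step amounts to measuring the shared resource before touching it and failing early when it looks busy, so that an enclosing retry loop backs off. A minimal Python sketch of that guard follows; `guarded_attempt`, `ResourceBusy`, and the callables passed in are hypothetical names for illustration, and how free file descriptors are actually measured is left to the caller.

```python
class ResourceBusy(Exception):
    """Hypothetical error: the shared resource is too loaded to attempt the action."""


def guarded_attempt(measure_free, minimum_free, action):
    """Carrier sense: check the resource first, then act only if it has headroom."""
    free = measure_free()
    if free < minimum_free:
        # Fail without running the action so the caller can back off and retry.
        raise ResourceBusy(f"only {free} free, need at least {minimum_free}")
    return action()
```

An Aloha-style client would call `action()` directly. The only difference in the Ethernet-style client is the measurement in front of it.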

  20. Shared Disk Buffer
      Multiple batch jobs share an output buffer. Jobs write output files, and a mover pushes them out.
      • Step A: Arbitrate
      • Step B: Write
      • Step C: Commit
      • Step D: Read
      • Step E: Send
      • Step F: Delete
      [Diagram labels: Job 8, Job 9, Job 10; Data Mover; Local File System holding d4.c d5.c d6.c d7.c d8.i d9.i d10.i]

  21. Aloha Client
          try for 30 minutes
            try
              run-job > d$n.i
              mv d$n.i d$n.c
            catch
              rm -f d$n.i
            end
          end
      • run-job > d$n.i: Create the file, marked “incomplete.”
      • mv d$n.i d$n.c: Atomically commit the file.
      • rm -f d$n.i: Remove the file if any failure.

  22. Ethernet Client
          try for 30 minutes
            if overcommitted()
              failure
            end
            try
              run-job > d$n.i
              mv d$n.i d$n.c
            catch
              rm -f d$n.i
            end
          end
      Buffer is overcommitted if estimated needs exceed available space.

  23. Shared Data Servers
      [Diagram labels: WWW Server (x2), dataset (x2), Black Hole, Client (x5)]
      • Black Hole: Accepts all connections and holds them idle indefinitely.
      • A healthy but loaded server might also have a high response time.
      • Each client wants one instance of the data set, but doesn’t care which one.
      • How to deal with delays and failures?

  24. Aloha Client
          try for 15 minutes
            forany host in xxx yyy zzz
              try for 1 minute
                wget http://${host}/data
              end
            end
          end

  25. Ethernet Client
          try for 15 minutes
            forany host in xxx yyy zzz
              try for 5 seconds
                wget http://${host}/tiny
              end
              try for 1 minute
                wget http://${host}/data
              end
            end
          end
      Test the server by fetching a tiny file.
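The probe-then-fetch pattern can be sketched in Python with the standard library. This is an illustration, not the authors' client: `fetch_with_probe` and `http_get` are hypothetical names, hosts are tried in the order given rather than in the random order `forany` uses, and the `timeout` passed to `urllib.request.urlopen` bounds individual blocking socket operations rather than the whole transfer.

```python
import urllib.request


def http_get(url, timeout):
    """Fetch a URL; timeout limits each blocking socket operation, in seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_probe(hosts, get=http_get, probe_timeout=5, data_timeout=60):
    """Fetch /data from the first host that answers a quick /tiny probe."""
    errors = {}
    for host in hosts:
        try:
            # A black hole accepts the connection but never answers, so the
            # short probe fails quickly instead of tying up the long fetch.
            get(f"http://{host}/tiny", probe_timeout)
            return host, get(f"http://{host}/data", data_timeout)
        except Exception as err:
            errors[host] = err  # remember why this host was skipped
    raise RuntimeError(f"no host served the data: {errors!r}")
```

The `get` parameter lets a caller substitute a different transport, or a stub when no network is available.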

  26. All Clients Blocked on Black Hole

  27. Some Thoughts
      • This is a necessary technique for real problems.
      • Timing is uncontrollable; failures lack detail.
      • A simple technique has significant payoff.
      • The Ethernet approach is not always ideal.
      • Carefully chosen errnos are powerful.
      • Designing errnos is tricky.
      • Requires clients of good will.
      • Some scenarios require external coordination.
      • Admission control for admission control?
      • Time and failure are first-class concerns.
      • They should be first-class elements of languages!
      • We get good mileage without complex constructions.
      • More info at:
        • http://www.cs.wisc.edu/condor/ftsh

  28. Computing’s central challenge, “How not to make a mess of it,” has not yet been met. -Edsger Dijkstra
