
Presentation Transcript


    1. BlueGene/L System Software
       Derek Lieber, IBM T. J. Watson Research Center, February 2004

    2. Topics
       - Programming environment: compilation, execution, debugging
       - Programming model: processors, memory, files, communications
       - What happens under the covers

    3. Programming on BG/L
       - A single application program image
       - Running on tens of thousands of compute nodes
       - Communicating via message passing
       - Each image has its own copy of memory and file descriptors
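       In practice this model is ordinary MPI: the same executable runs on every
       compute node, and instances exchange data only through explicit messages.
       A minimal sketch in plain MPI C (nothing BG/L-specific; it assumes the job
       was launched with at least two instances):

        /* One program image, many instances; each instance has its own
         * memory and learns its identity from its MPI rank. */
        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int rank, size;
            MPI_Status status;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which instance am I?      */
            MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many instances total? */

            if (rank == 0) {
                int value = 42;
                /* Instances cannot share memory, pipes, or signals; data moves
                 * only through explicit messages like this one. */
                MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                int value;
                MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
                printf("instance %d of %d received %d\n", rank, size, value);
            }

            MPI_Finalize();
            return 0;   /* all instances exiting 0 ends the job normally */
        }

       The printf output of every instance is merged into a single stdout stream
       on the host side, as described in the next two slides.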

    4. Programming on BG/L
       - A "job" is encapsulated in a single host-side process, which acts as:
         - A merge point for compute-node stdout streams
         - A control point for signaling (ctl-c, kill, etc.), debugging (attach, detach), and termination (exit status collection and summary)

    5. Programming on BG/L
       - Cross-compile the source code
       - Place the executable onto the BG/L machine's shared filesystem
       - Run it: "blrun <job information> <program name> <args>"
       - Stdout of all program instances appears as stdout of blrun
       - Files go to a user-specified directory on the shared filesystem
       - blrun terminates when all program instances terminate
       - Killing blrun kills all program instances

    6. Compiling and Running on BG/L

    7. Programming Models
       - "Coprocessor model": 64k instances of a single application program; each has a 255 MB address space and two threads (main, coprocessor) with non-coherent shared memory
       - "Virtual node model": 128k instances; each has a 127 MB address space and one thread (main)

    8. Programming Model
       - Does a job behave like a group of processes, or like a group of threads?
       - A little bit of each

    9. A process group?
       - Yes: each program instance has its own memory and file descriptors
       - No: instances can't communicate via mmap or shmat, via pipes or sockets, or via signals (kill)

    10. A thread group?
        - Yes: the job terminates when all program instances terminate via exit(0), or when any instance terminates voluntarily (exit with non-zero status) or involuntarily (uncaught signal: kill, abort, segv, etc.)
        - No: each program instance has its own set of file descriptors and its own private memory space
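        In plain MPI terms, these semantics mean an instance that hits a fatal
        error usually brings the whole job down deliberately. A sketch of that
        idiom (load_input is a hypothetical helper and the error code is
        illustrative):

        #include <mpi.h>
        #include <stdio.h>

        static void *load_input(void)       /* hypothetical helper */
        {
            return NULL;                    /* pretend the open failed */
        }

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);

            if (load_input() == NULL) {
                fprintf(stderr, "input missing, aborting whole job\n");
                /* Like an uncaught signal or a non-zero exit, this ends
                 * every instance of the job, not just this one. */
                MPI_Abort(MPI_COMM_WORLD, 1);
            }

            /* ... normal work ... */

            MPI_Finalize();
            return 0;   /* exit(0) from every instance = normal job termination */
        }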

    11. Compilers and libraries
        - The GNU C, C++, and Fortran compilers can be used with BG/L, but they do not exploit the second FPU
        - The IBM xlf/xlc compilers have been ported to BG/L, with code generation and optimization features for the dual FPU
        - Standard glibc library
        - MPI for communications

    12. System calls
        - Traditional ANSI plus "a little" POSIX
        - I/O: open, close, read, write, etc.
        - Time: gettimeofday, etc.
        - Signal catchers: synchronous (sigsegv, sigbus, etc.) and asynchronous (timers and hardware events)
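        These are ordinary ANSI C / POSIX calls. A small sketch showing file
        I/O, gettimeofday, and a synchronous signal catcher (the file name
        input.dat is made up for illustration):

        #include <fcntl.h>
        #include <signal.h>
        #include <stdio.h>
        #include <sys/time.h>
        #include <unistd.h>

        static void on_segv(int sig)
        {
            /* synchronous catcher: runs if the program faults */
            (void)sig;
            write(2, "caught SIGSEGV\n", 15);
            _exit(1);
        }

        int main(void)
        {
            struct timeval t0, t1;
            char buf[64];
            int fd, n;

            signal(SIGSEGV, on_segv);          /* install a signal catcher */

            gettimeofday(&t0, NULL);           /* timestamp before the I/O */

            fd = open("input.dat", O_RDONLY);
            if (fd >= 0) {
                n = (int) read(fd, buf, sizeof buf);
                close(fd);
                printf("read %d bytes\n", n);
            }

            gettimeofday(&t1, NULL);           /* timestamp after the I/O */
            printf("elapsed %ld us\n",
                   (long)(t1.tv_sec - t0.tv_sec) * 1000000L
                   + (long)(t1.tv_usec - t0.tv_usec));
            return 0;
        }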

    13. System calls
        - No "unix stuff": fork, exec, pipe, mount, umount, setuid, setgid
        - No system calls needed to access most hardware: tree and torus FIFOs, global OR, mutexes and barriers, performance counters
        - Mantra: keep the compute nodes simple; the kernel stays out of the way and lets the application program run

    14. Software Stack in the BG/L Compute Node
        - CNK controls all access to hardware and enables bypass for application use
        - User-space libraries and applications can directly access the torus and tree through the bypass
        - As a policy, user-space code should not directly touch hardware, but there is no enforcement of that policy

    15. What happens under the covers?
        - The machine
        - The job allocation, launch, and control system
        - The machine monitoring and control system

    16. The machine
        - Nodes: IO nodes, compute nodes, link nodes
        - Communications networks: Ethernet, tree, torus, global OR, JTAG

    17. The IO nodes
        - 1024 nodes
        - Talk to the outside world via Ethernet
        - Talk to the inside world via the tree network
        - Not connected to the torus
        - Embedded Linux kernel
        - Purpose is to run the network filesystem and the job control daemons

    18. The compute nodes
        - 64k nodes, each with 2 CPUs and 4 FPUs
        - Application programs execute here
        - Custom kernel:
          - Non-preemptive; the application program has full control of all timing issues
          - Kernel and application share the same address space, but the kernel is memory protected
          - Kernel provides program load / start / debug / termination
          - File access: all via message passing to the IO nodes

    19. The link nodes
        - Signal routing, no computation
        - Stitch together cards and racks of IO and compute nodes into "blocks" suitable for running independent jobs
        - Isolate each block's tree, torus, and global OR networks

    20. Machine configuration

    21. Kernel booting and monitoring

    22. Job execution

    23. Blue Gene/L System Software Architecture

    24. Conclusions
        - The BG/L system software stack has:
          - A custom solution (CNK) on compute nodes for high performance
          - A Linux solution on I/O nodes for flexibility and functionality
          - MPI as the default programming model
        - BG/L system software must scale to very large machines:
          - Hierarchical organization for management
          - Flat organization for programming
          - Mixed conventional/special-purpose operating systems
        - Many challenges ahead, particularly in performance, scalability, and reliability
