1 / 17

Programming Models for Multithreaded Architectures: The EARTH Threaded-C Experience

Programming Models for Multithreaded Architectures: The EARTH Threaded-C Experience. EARTH-MANNA Testbed. Local Memory. Local Memory. EU. EU. SU. SU. PE. PE. NETWORK. Features of Threaded Programming. Thread partition Thread length vs useful parallelism Where to “cut”?

saki
Download Presentation

Programming Models for Multithreaded Architectures: The EARTH Threaded-C Experience

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Programming Models forMultithreaded Architectures:The EARTH Threaded-C Experience \course\eleg652-03F\Topic5b-652.ppt

  2. EARTH-MANNA Testbed Local Memory Local Memory EU EU SU SU PE PE NETWORK \course\eleg652-03F\Topic5b-652.ppt

  3. Features of Threaded Programming • Thread partition • Thread length vs useful parallelism • Where to “cut”? • Split-phase synchronization and communication • Parallel threaded function invocation • Dynamic load balancing Latency tolerance and management \course\eleg652-03F\Topic5b-652.ppt

  4. cluster cluster cluster cluster cluster cluster cluster Crossbar- Hierarchies cluster cluster cluster cluster cluster cluster cluster cluster cluster Node CP CP i860XP i860XP 8 Network Interface I/O 8 32 Mbyte Memory The EARTH-MANNA Multiprocessor Testbed Cluster Node Node Node Node Node Crossbar 4 \course\eleg652-03F\Topic5b-652.ppt

  5. i860XP (EU) i860XP (SU) I/O Buf & Ctrl Arbiter 400 MB/s 50 MB/s Link Control MANNA Architecture Model LBE Memory (32MB) 50 MB/s \course\eleg652-03F\Topic5b-652.ppt

  6. Main Features of EARTH Multiprocessor • Fast thread context switching • Efficient parallel function invocation • Good support of fine-grain dynamic load balancing • Separate SU to hide thread sync/scheduling overhead and support of split-phase transaction Using off-the shell microprocessors \course\eleg652-03F\Topic5b-652.ppt

  7. Table 1. EARTH Instruction Set • Basic instructions: Arithmetic, Logic and Branching typical RISC instructions, e.g., those from the i860 • Thread Switching FETCH_NEXT • Synchronization SPAWN fp, ip SYNC fp, ss_off INIT_SYNC ss_off, sync_cnt, reset_cnt, ip INCR_SYNC fp, ss_off, value \course\eleg652-03F\Topic5b-652.ppt

  8. Con’d Table 1. EARTH Instruction Set • Data Transfer & Synchronization DATA_SPAWN value, dest_addr, fp, ip DATA_SYNC value, dest_addr, fp, ss_off BLOCKDATA_SPAWN src_addr, dest_addr, size, fp, ip BLOCKDATA_SYNC src_addr, dest_addr, size, fp, ss_off • Split_phase Data Requests GET_SPAWN src_addr, dest_addr, fp, ip GET_SYNC src_addr, dest_addr, fp, ss_off GET_BLOCK_SPAWN src_addr, dest_addr, size, fp, ip GET_BLOCK_SYNC src_addr, dest_addr, size, fp, ip • Function Invocation INVOKE dest_PE, f_name, no_params, params TOKEN f_name, no_params, params END_FUNCTION \course\eleg652-03F\Topic5b-652.ppt

  9. Threaded-C: A Base-Language • To serve as a target language for high-level language compilers • To serve as a machine language for the EARTH architecture \course\eleg652-03F\Topic5b-652.ppt

  10. The Role of Threaded-C C Fortran High-level Language Translation Users Threaded-C Threaded-C Compiler EARTH Platforms \course\eleg652-03F\Topic5b-652.ppt

  11. Features of Threaded Programming • Thread partition • Thread length vs useful parallelism • Where to “cut”? • Split-phase synchronization and communication (e.g. GET_SYNC) • Parallel threaded function invocation • Dynamic load balancing (e..g. TOKEN) • Other advanced features Latency tolerance and management \course\eleg652-03F\Topic5b-652.ppt

  12. EARTH-MANNA Benchmark Programs • Ray Tracing is a program for rendering 3-D photo-realistic images • Protein Folding is an application which computes all possible folding structures of a given polymer • TSP is an application to find a minimal-length Hamiltonian cycle in a graph with N cities and weighted paths. • Tomcatv is one of the SPEC benchmarks which operates upon a mesh • Paraffins is another application which enumerates the distinct isomers of paraffins • 2D-SLT is a program implementing the 2D-SLT Semi-Lagrangian Advection Model on a Gaussian Grid for numerical weather predication • N-queens is a benchmark program typical of graph searching problem. \course\eleg652-03F\Topic5b-652.ppt

  13. Parallel Function Invocation fib n-2 fib n-3 SYNC slots local vars fib n-1 fib n-2 caller’s <fp,ip> Links between frames fib n Tree of “Activation Frames” \course\eleg652-03F\Topic5b-652.ppt

  14. result n done fib 0 0 If n < 2 DATA_RSYNC (1, result, done) else { TOKEN (fib, n-1, & sum1, slot_1); TOKEN (fib, n-2, & sum2, slot_1); } END_THREAD( ) ; 2 2 THREAD-1; DATA_RSYNC (sum1 + sum2, result, done); END_THREAD ( ) ; END_FUNCTION The Fibonacci Example \course\eleg652-03F\Topic5b-652.ppt

  15. Matrix Multiplication void main ( ) { int i, j, k; float sum; for (i=0; i < N; i++) for (j=0; j < N ; j++) { sum = 0; for (k=0; k < N; k++) sum = sum + a [i] [k] * b [k] [j] c [i] [j] = sum; } } Sequential Version \course\eleg652-03F\Topic5b-652.ppt

  16. result a b done inner 0 0 2 2 THREAD-1; for (i=0; i<N; i++ ); sum = sum + (row_a[i] * column_b[i]); DATA_RSYNC (sum, result, done); END_THREAD ( ) ; BLKMOV_SYNC (a, row_a, N, slot_1); BLKMOV_SYNC (b, column_b, N, slot_1); sum = 0; END_THREAD; END_FUNCTION The Inner Product Example \course\eleg652-03F\Topic5b-652.ppt

  17. main 0 0 for (i=0; i<N; i++) for (j=0; j<N; j++) { row_a = a [i]; column_b = b [j]; TOKEN (inner, &c[I][j], row_a, column_b,slot_1); } END_THREAD; N*N N*N THREAD-1; RETURN ( ); END- THREAD The Matrix Multiplication Example \course\eleg652-03F\Topic5b-652.ppt

More Related