
GRID superscalar: a programming model for the Grid

Doctoral Thesis, Computer Architecture Department, Technical University of Catalonia. Raül Sirvent Pardell. Advisor: Rosa M. Badia Sala.



  1. GRID superscalar: a programming model for the Grid • Doctoral Thesis, Computer Architecture Department, Technical University of Catalonia • Raül Sirvent Pardell • Advisor: Rosa M. Badia Sala

  2. Outline • Introduction • Programming interface • Runtime • Fault tolerance at the programming model level • Conclusions and future work

  3. Outline • Introduction 1.1 Motivation 1.2 Related work 1.3 Thesis objectives and contributions • Programming interface • Runtime • Fault tolerance at the programming model level • Conclusions and future work

  4. 1.1 Motivation • The Grid architecture layers: Applications; Grid Middleware (Job management, Data transfer, Security, Information, QoS, ...); Distributed Resources

  5. 1.1 Motivation • What middleware should I use?

  6. 1.1 Motivation • Programming tools: are they easy? Grid UNAWARE vs. Grid AWARE

  7. 1.1 Motivation • Can I run my programs in parallel? Explicit parallelism vs. implicit parallelism • Explicit: the fork/join graph is drawn by hand • Implicit: parallelism is extracted from sequential code such as:
      for (i = 0; i < MSIZE; i++)
        for (j = 0; j < MSIZE; j++)
          for (k = 0; k < MSIZE; k++)
            matmul(A(i,k), B(k,j), C(i,j));

  8. 1.1 Motivation • The Grid: a massive, dynamic and heterogeneous environment prone to failures • Study different techniques to detect and overcome failures • Checkpoint • Retries • Replication

  9. 1.2 Related work

  10. 1.3 Thesis objectives and contributions • Objective: create a programming model for the Grid • Grid unaware • Implicit parallelism • Sequential programming • Allows the use of well-known imperative languages • Speed up applications • Include fault detection and recovery

  11. 1.3 Thesis objectives and contributions • Contribution: GRID superscalar • Programming interface • Runtime environment • Fault tolerance features

  12. Outline • Introduction • Programming interface 2.1 Design 2.2 User interface 2.3 Programming comparison • Runtime • Fault tolerance at the programming model level • Conclusions and future work

  13. 2.1 Design • Interface objectives • Grid unaware • Implicit parallelism • Sequential programming • Allows the use of well-known imperative languages

  14. 2.1 Design • Target applications • Algorithms which may be easily split into tasks • Branch and bound computations, divide and conquer algorithms, recursive algorithms, … • Coarse-grained tasks • Independent tasks • Scientific workflows, optimization algorithms, parameter sweep • Main parameters: FILES • External simulators, finite element solvers, BLAST, GAMESS

  15. 2.1 Design • Application’s architecture: a master-worker paradigm • Master-worker parallel paradigm fits with our objectives • Main program: the master • Functions: workers • Function = Generic representation of a task • Glue to transform a sequential application into a master-worker application: stubs – skeletons (RMI, RPC, …) • Stub: call to the runtime interface • Skeleton: binary which calls the user function

  16. 2.1 Design • Local scenario • app.c:
      for (i = 0; i < MSIZE; i++)
        for (j = 0; j < MSIZE; j++)
          for (k = 0; k < MSIZE; k++)
            matmul(A(i,k), B(k,j), C(i,j));
      app-functions.c:
      void matmul(char *f1, char *f2, char *f3)
      {
        /* Load the three blocks from their files. */
        getBlocks(f1, f2, f3, A, B, C);
        for (i = 0; i < A->rows; i++) {
          for (j = 0; j < B->cols; j++) {
            for (k = 0; k < A->cols; k++) {
              C->data[i][j] += A->data[i][k] * B->data[k][j];
            }
          }
        }
        /* Write the blocks back once the product is complete. */
        putBlocks(f1, f2, f3, A, B, C);
      }

  17. 2.1 Design • Master-Worker paradigm [figure: app.c (master) dispatches through the middleware to several copies of app-functions.c (workers)]

  18. 2.1 Design • Intermediate language concept: assembler code • In GRIDSs • The Execute generic interface • Instruction set is defined by the user • Single entry point to the runtime • Allows easy building of programming language bindings (Java, Perl, Shell Script) • Easier technology adoption • Analogy: C, C++, … → Assembler → Processor execution; C, C++, … → Workflow → Grid execution

  19. 2.2 User interface • Steps to program an application • Task definition • Identify those functions/programs in the application that are going to be executed in the computational Grid • All parameters must be passed in the header (remote execution) • Interface Definition Language (IDL) • For every task defined, identify which parameters are input/output files and which are input/output scalars • Programming API: master and worker • Write the main program and the tasks using the GRIDSs API

  20. 2.2 User interface • Interface Definition Language (IDL) file • CORBA-IDL like interface: • in/out/inout files • in/out/inout scalar values • The functions listed in this file will be executed in the Grid
      interface MATMUL {
        void matmul(in File f1, in File f2, inout File f3);
      };

  21. 2.2 User interface • Programming API: master and worker • Master side (app.c): GS_On, GS_Off, GS_FOpen/GS_FClose, GS_Open/GS_Close, GS_Barrier, GS_Speculative_End • Worker side (app-functions.c): GS_System, gs_result, GS_Throw

  22. 2.2 User interface • Task’s constraints and cost specification • Constraints: allow specifying the needs of a task (CPU, memory, architecture, software, …) • Build an expression in a constraint function (evaluated for every machine) • Cost: estimated execution time of a task (in seconds) • Useful for scheduling • Calculate it in a cost function • GS_GFlops / GS_Filesize may be used • An external estimator can also be called • Examples: constraint other.Mem == 1024; cost = operations / GS_GFlops();
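The constraint and cost functions described on this slide can be pictured with a minimal sketch. It is not the GRIDSs-generated code: the `machine_t` struct, the function names and the choice of passing the machine speed as a field (instead of calling `GS_GFlops()`) are assumptions made so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical machine description, mirroring attributes such as
   Mem and GFlops that appear in the Grid configuration file. */
typedef struct {
    int mem_mb;     /* memory offered by the machine, in MB */
    double gflops;  /* machine speed in GFlops */
} machine_t;

/* Constraint: the slide's example "other.Mem == 1024" as a predicate
   evaluated once per candidate machine. */
bool matmul_constraints(const machine_t *other)
{
    return other->mem_mb == 1024;
}

/* Cost: estimated run time in seconds. The operation count is given
   in units of 10^9 floating-point operations, so dividing by GFlops
   yields seconds. */
double matmul_cost(double giga_operations, const machine_t *other)
{
    assert(other->gflops > 0.0);
    return giga_operations / other->gflops;
}
```

With these assumed units, a task of 11.97 × 10^9 operations on a 5.985 GFlops machine is estimated at about 2 seconds.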

  23. 2.3 Programming comparison • Globus vs GRIDSs • Globus (Grid-aware, explicit parallelism):
      int main()
      {
        rsl = "&(executable=/home/user/sim)(arguments=input1.txt output1.txt) (file_stage_in=(gsiftp://bscgrid01.bsc.es/path/input1.txt /home/user/input1.txt))(file_stage_out=/home/user/output1.txt gsiftp://bscgrid01.bsc.es/path/output1.txt)(file_clean_up=/home/user/input1.txt /home/user/output1.txt)";
        globus_gram_client_job_request("bscgrid02.bsc.es", rsl, NULL, NULL);
        rsl = "&(executable=/home/user/sim)(arguments=input2.txt output2.txt) (file_stage_in=(gsiftp://bscgrid01.bsc.es/path/input2.txt /home/user/input2.txt))(file_stage_out=/home/user/output2.txt gsiftp://bscgrid01.bsc.es/path/output2.txt)(file_clean_up=/home/user/input2.txt /home/user/output2.txt)";
        globus_gram_client_job_request("bscgrid03.bsc.es", rsl, NULL, NULL);
        rsl = "&(executable=/home/user/sim)(arguments=input3.txt output3.txt) (file_stage_in=(gsiftp://bscgrid01.bsc.es/path/input3.txt /home/user/input3.txt))(file_stage_out=/home/user/output3.txt gsiftp://bscgrid01.bsc.es/path/output3.txt)(file_clean_up=/home/user/input3.txt /home/user/output3.txt)";
        globus_gram_client_job_request("bscgrid04.bsc.es", rsl, NULL, NULL);
      }

  24. 2.3 Programming comparison • Globus vs GRIDSs • GRIDSs:
      void sim(File input, File output)
      {
        char command[500];
        /* Build the command line for the external simulator. */
        snprintf(command, sizeof(command), "/home/user/sim %s %s", input, output);
        gs_result = GS_System(command);
      }

      int main()
      {
        GS_On();
        sim("/path/input1.txt", "/path/output1.txt");
        sim("/path/input2.txt", "/path/output2.txt");
        sim("/path/input3.txt", "/path/output3.txt");
        GS_Off(0);
      }

  25. 2.3 Programming comparison • DAGMan vs GRIDSs • DAGMan (explicit parallelism, no if/while clauses):
      JOB A A.condor
      JOB B B.condor
      JOB C C.condor
      JOB D D.condor
      PARENT A CHILD B C
      PARENT B C CHILD D
      GRIDSs:
      int main()
      {
        GS_On();
        task_A(f1, f2, f3);
        task_B(f2, f4);
        task_C(f3, f5);
        task_D(f4, f5, f6);
        GS_Off(0);
      }

  26. 2.3 Programming comparison • Ninf-G vs GRIDSs • Ninf-G (Grid-aware, explicit parallelism):
      int main()
      {
        grpc_initialize("config_file");
        grpc_object_handle_init_np("A", &A_h, "class");
        grpc_object_handle_init_np("B", &B_h, "class");
        for (i = 0; i < 25; i++) {
          grpc_invoke_async_np(A_h, "foo", &sid, f_in[2*i], f_out[2*i]);
          grpc_invoke_async_np(B_h, "foo", &sid, f_in[2*i+1], f_out[2*i+1]);
          grpc_wait_all();
        }
        grpc_object_handle_destruct_np(&A_h);
        grpc_object_handle_destruct_np(&B_h);
        grpc_finalize();
      }
      GRIDSs:
      int main()
      {
        GS_On();
        for (i = 0; i < 50; i++)
          foo(f_in[i], f_out[i]);
        GS_Off(0);
      }

  27. 2.3 Programming comparison • VDL vs GRIDSs • VDL (no if/while clauses):
      DV trans1( a2=@{output:tmp.0}, a1=@{input:filein.0} );
      DV trans2( a2=@{output:fileout.0}, a1=@{input:tmp.0} );
      DV trans1( a2=@{output:tmp.1}, a1=@{input:filein.1} );
      DV trans2( a2=@{output:fileout.1}, a1=@{input:tmp.1} );
      ...
      DV trans1( a2=@{output:tmp.999}, a1=@{input:filein.999} );
      DV trans2( a2=@{output:fileout.999}, a1=@{input:tmp.999} );
      GRIDSs:
      int main()
      {
        char tmp[32], filein[32], fileout[32];
        GS_On();
        for (i = 0; i < 1000; i++) {
          /* Build the per-iteration file names. */
          snprintf(tmp, sizeof(tmp), "tmp.%d", i);
          snprintf(filein, sizeof(filein), "filein.%d", i);
          snprintf(fileout, sizeof(fileout), "fileout.%d", i);
          trans1(tmp, filein);
          trans2(fileout, tmp);
        }
        GS_Off(0);
      }

  28. Outline • Introduction • Programming interface • Runtime 3.1 Scientific contributions 3.2 Developments 3.3 Evaluation tests • Fault tolerance at the programming model level • Conclusions and future work

  29. 3.1 Scientific contributions • Runtime objectives • Extract implicit parallelism in sequential applications • Speed up execution using the Grid • Main requirement: Grid middleware • Job management • Data transfer • Security

  30. 3.1 Scientific contributions • Apply computer architecture knowledge to the Grid (superscalar processor) • Time scale: ns → seconds/minutes/hours [figure: superscalar processor block diagram (IFU, IDU, ISU, FXU, FPU, LSU, BXU, L2, L3 Directory/Control) next to the Grid]

  31. 3.1 Scientific contributions • Data dependence analysis: allow parallelism • Read after Write: task1(..., f1); task2(f1, ...) • Write after Read: task1(f1, ...); task2(..., f1) • Write after Write: task1(..., f1); task2(..., f1)
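The three dependence kinds on this slide can be detected by comparing the files each task reads and writes. The following is a minimal sketch under that assumption; `task_t`, the `DEP_*` flags and `classify_dependences` are hypothetical names, not part of the GRIDSs runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FILES 8

/* Hypothetical task record: the file names it reads and writes. */
typedef struct {
    const char *reads[MAX_FILES];
    int n_reads;
    const char *writes[MAX_FILES];
    int n_writes;
} task_t;

enum { DEP_RAW = 1, DEP_WAR = 2, DEP_WAW = 4 };

static bool contains(const char *const *set, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(set[i], name) == 0)
            return true;
    return false;
}

/* Returns a bitmask of the dependences that `later` has on `earlier`,
   where `earlier` precedes `later` in sequential program order. */
int classify_dependences(const task_t *earlier, const task_t *later)
{
    int deps = 0;
    assert(earlier->n_reads <= MAX_FILES && earlier->n_writes <= MAX_FILES);
    for (int i = 0; i < earlier->n_writes; i++) {
        if (contains(later->reads, later->n_reads, earlier->writes[i]))
            deps |= DEP_RAW;   /* later reads what earlier wrote */
        if (contains(later->writes, later->n_writes, earlier->writes[i]))
            deps |= DEP_WAW;   /* both write the same file */
    }
    for (int i = 0; i < earlier->n_reads; i++)
        if (contains(later->writes, later->n_writes, earlier->reads[i]))
            deps |= DEP_WAR;   /* later overwrites what earlier read */
    return deps;
}
```

A result of 0 means the two tasks share no conflicting file and could run concurrently.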

  32. 3.1 Scientific contributions • Task chains generated by the loop nest:
      for (i = 0; i < MSIZE; i++)
        for (j = 0; j < MSIZE; j++)
          for (k = 0; k < MSIZE; k++)
            matmul(A(i,k), B(k,j), C(i,j));
      i = 0, j = 0: (k = 0) matmul(A(0,0), B(0,0), C(0,0)) → (k = 1) matmul(A(0,1), B(1,0), C(0,0)) → (k = 2) matmul(A(0,2), B(2,0), C(0,0))
      i = 0, j = 1: (k = 0) matmul(A(0,0), B(0,1), C(0,1)) → (k = 1) matmul(A(0,1), B(1,1), C(0,1)) → (k = 2) matmul(A(0,2), B(2,1), C(0,1))
      ...

  33. 3.1 Scientific contributions • Each (i, j) pair forms its own chain of tasks on C(i,j):
      for (i = 0; i < MSIZE; i++)
        for (j = 0; j < MSIZE; j++)
          for (k = 0; k < MSIZE; k++)
            matmul(A(i,k), B(k,j), C(i,j));
      i = 0, j = 0: (k = 0) matmul(A(0,0), B(0,0), C(0,0)) → (k = 1) matmul(A(0,1), B(1,0), C(0,0)) → (k = 2) matmul(A(0,2), B(2,0), C(0,0))
      i = 0, j = 1: (k = 0) matmul(A(0,0), B(0,1), C(0,1)) → (k = 1) matmul(A(0,1), B(1,1), C(0,1)) → (k = 2) matmul(A(0,2), B(2,1), C(0,1))
      i = 0, j = 2: ...
      i = 1, j = 0: ...
      i = 1, j = 1: ...
      i = 1, j = 2: ...

  34. 3.1 Scientific contributions • File renaming: increase parallelism • Read after Write (unavoidable): task1(..., f1); task2(f1, ...) • Write after Read (avoidable): task1(f1, ...); task2(..., f1) becomes task2(..., f1_NEW) • Write after Write (avoidable): task1(..., f1); task2(..., f1) becomes task2(..., f1_NEW)
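One way to picture the renaming idea is a small version table: every write to a file gets a fresh name, and every read is directed to the most recent version, so only Read after Write dependences remain. This is a sketch under assumptions; the table layout, the `.vN` suffix scheme and the function names are hypothetical and not taken from the GRIDSs runtime.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRACKED 16
#define NAME_LEN 64

/* Hypothetical bookkeeping: the latest version number of each file. */
typedef struct {
    char name[NAME_LEN];
    int version;
} file_entry_t;

typedef struct {
    file_entry_t entries[MAX_TRACKED];
    int count;
} rename_table_t;

/* Finds the entry for `name`, creating it at version 0 if needed. */
static file_entry_t *lookup(rename_table_t *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return &t->entries[i];
    assert(t->count < MAX_TRACKED);
    file_entry_t *e = &t->entries[t->count++];
    snprintf(e->name, NAME_LEN, "%s", name);
    e->version = 0;
    return e;
}

/* A reader uses the current version (version 0 is the original file). */
void name_for_read(rename_table_t *t, const char *name, char *out, size_t len)
{
    file_entry_t *e = lookup(t, name);
    if (e->version == 0)
        snprintf(out, len, "%s", name);
    else
        snprintf(out, len, "%s.v%d", name, e->version);
}

/* A writer always gets a fresh version, so it never has to wait for
   earlier readers (WAR) or earlier writers (WAW) of the same file. */
void name_for_write(rename_table_t *t, const char *name, char *out, size_t len)
{
    file_entry_t *e = lookup(t, name);
    e->version++;
    snprintf(out, len, "%s.v%d", name, e->version);
}
```

The table must start zero-initialized (for example `rename_table_t t = {0};`). Mapping the last version back to the original name at the end corresponds to the "undo renaming" step listed for GS_Off.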

  35. 3.2 Developments • Basic functionality • Job submission (middleware usage) • Select sources for input files • Submit, monitor or cancel jobs • Results collection • API implementation • GS_On: read configuration file and environment • GS_Off: wait for tasks, cleanup remote data, undo renaming • GS_(F)Open: create a local task • GS_(F)Close: notify end of local task • GS_Barrier: wait for all running tasks to finish • GS_System: translate path • GS_Speculative_End: barrier until throw. If throw, discard tasks from throw to GS_Speculative_End • GS_Throw: use gs_result to notify it

  36. 3.2 Developments • Task scheduling: Directed Acyclic Graph [figure: task graph submitted through the middleware]

  37. 3.2 Developments • Task scheduling: resource brokering • A resource broker is needed (but not an objective) • Grid configuration file • Information about hosts (hostname, limit of jobs, queue, working directory, quota, …) • Initial set of machines (can be changed dynamically)
      <?xml version="1.0" encoding="UTF-8"?>
      <project isSimple="yes" masterBandwidth="100000" masterBuildScript="" masterInstallDir="/home/rsirvent/matmul-master" masterName="bscgrid01.bsc.es" masterSourceDir="/datos/GRID-S/GT4/doc/examples/matmul" name="matmul" workerBuildScript="" workerSourceDir="/datos/GRID-S/GT4/doc/examples/matmul">
      ...
      <workers>
      <worker Arch="x86" GFlops="5.985" LimitOfJobs="2" Mem="1024" NCPUs="2" NetKbps="100000" OpSys="Linux" Queue="none" Quota="0" deploymentStatus="deployed" installDir="/home/rsirvent/matmul-worker" name="bscgrid01.bsc.es">

  38. 3.2 Developments • Task scheduling: resource brokering • Scheduling policy • Estimation of total execution time of a single task • FileTransferTime: time to transfer needed files to a resource (calculated with the hosts information and the location of files) • Select fastest source for a file • ExecutionTime: estimation of the task’s run time in a resource. An interface function (can be calculated, or estimated by an external entity) • Select fastest resource for execution • Smallest estimation is selected
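The policy on this slide amounts to picking the resource with the smallest FileTransferTime + ExecutionTime. A minimal sketch of that selection follows; `candidate_t` and `select_resource` are hypothetical names, and both estimates are assumed to be already computed in seconds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-resource estimates for one task, in seconds. */
typedef struct {
    const char *name;
    double file_transfer_time;  /* time to bring the input files over */
    double execution_time;      /* estimated run time on this resource */
} candidate_t;

/* Returns the index of the candidate with the smallest total estimate,
   or -1 when there are no candidates. Ties keep the first candidate. */
int select_resource(const candidate_t *c, size_t n)
{
    int best = -1;
    double best_total = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(c[i].file_transfer_time >= 0.0 && c[i].execution_time >= 0.0);
        double total = c[i].file_transfer_time + c[i].execution_time;
        if (best < 0 || total < best_total) {
            best = (int)i;
            best_total = total;
        }
    }
    return best;
}
```

Note that the fastest machine is not always chosen: a slower host that already holds the input files can win on the total.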

  39. 3.2 Developments • Task scheduling: resource brokering • Match task constraints and machine capabilities • Implemented using the ClassAd library • Machine: offers capabilities (from Grid configuration file: memory, architecture, …) • Task: demands capabilities • Filter candidate machines for a particular task • Example: machines offer SoftwareList = BLAST, GAMESS and SoftwareList = GAMESS; a task demands Software = BLAST
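The filtering step can be illustrated with plain C string matching. This sketch deliberately does not use the ClassAd library mentioned on the slide; `machine_t`, `machine_offers` and `filter_candidates` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SOFTWARE 8

/* Hypothetical machine record: the software it offers. */
typedef struct {
    const char *name;
    const char *software[MAX_SOFTWARE];
    int n_software;
} machine_t;

/* True when the machine's software list contains `required`. */
bool machine_offers(const machine_t *m, const char *required)
{
    for (int i = 0; i < m->n_software; i++)
        if (strcmp(m->software[i], required) == 0)
            return true;
    return false;
}

/* Stores pointers to the machines that satisfy the task's demand in
   `out` (which must have room for n entries) and returns how many. */
int filter_candidates(const machine_t *machines, int n,
                      const char *required, const machine_t **out)
{
    int count = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (machine_offers(&machines[i], required))
            out[count++] = &machines[i];
    return count;
}
```

With one machine offering BLAST and GAMESS and another offering only GAMESS, a task demanding BLAST keeps only the first machine as a candidate.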

  40. 3.2 Developments • Task scheduling: File locality [figure: files f1, f2 and f3 placed on resources behind the middleware]

  41. 3.2 Developments • Other file locality exploitation mechanisms • Shared input disks • NFS or replicated data • Shared working directories • NFS • Erasing unused versions of files (decrease disk usage) • Disk quota control (locality increases disk usage and quota may be lower than expected)

  42. 3.3 Evaluation

  43. 3.3 Evaluation • NAS Grid Benchmarks: HC, ED, MB, VP [figure: benchmark data-flow graphs with Launch and Report nodes and BT, SP, LU and MF tasks]

  44. 3.3 Evaluation • Run with classes S, W, A (2 machines x 4 CPUs) • VP benchmark must be analyzed in detail (does not scale up to 3 CPUs)

  45. 3.3 Evaluation • Performance analysis • GRID superscalar runtime instrumented • Paraver tracefiles from the client side • The lifecycle of all tasks has been studied in detail • Overhead of GRAM Job Manager polling interval

  46. 3.3 Evaluation • VP.S task assignment • 14.7% of the transfers when exploiting locality • VP is parallel, but its last part is sequentially executed [figure: BT, MF, MG, MF, FT task chains assigned to hosts Kadesh8 and Khafre, with remote file transfers marked]

  47. 3.3 Evaluation • Conclusion: workflow and granularity are important to achieve speed up

  48. 3.3 Evaluation Two-dimensional potential energy hypersurface for acetone as a function of the 1, and 2 angles

  49. 3.3 Evaluation • Number of executed tasks: 1120 • Each task between 45 and 65 minutes • Speed up: 26.88 (32 CPUs), 49.17 (64 CPUs) • Long running test, heterogeneous and transatlantic Grid [figure: three sites with 14, 22 and 28 CPUs]
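As a rough reading of the speed-up figures above, parallel efficiency can be derived as speed-up divided by CPU count. The slide does not state efficiency; the definition $E = S/P$ is the usual one and is applied here as an assumption.

```latex
E = \frac{S}{P}, \qquad
E_{32} = \frac{26.88}{32} = 0.84, \qquad
E_{64} = \frac{49.17}{64} \approx 0.77
```

Under this definition the run keeps roughly 84% efficiency on 32 CPUs and roughly 77% on 64 CPUs.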

  50. 3.3 Evaluation • 15 million protein sequences have been compared using BLAST and GRID superscalar [figure: genomes, 15 million proteins compared against 15 million proteins]
