Parallel Programming in MPI, part 1
January 10, 2012

Parallel Processing: Share a Job among Multiple Processors
• Multi-core / multi-CPU PC
    - Within one PC
    - For small problems
• Cluster of PCs / supercomputer
    - Interconnected computers
    - For medium- to large-scale problems
    - Communication is required among the computers

How to Describe Communications in a Program?
• TCP, UDP?
    - Good: portable, since they are available on many networks.
    - Bad: the procedures for connections and data transfer are complicated.
    - Bad: high overhead, since they are designed for wide-area (= unreliable) networks.
• Possible, but not suitable for parallel processing.

MPI (Message Passing Interface)
• A set of communication functions designed for parallel processing
• Can be called from C, C++, and Fortran programs
• "Message passing" = send + receive
    - In practice, many functions other than send and receive are available.
• Let's look at a sample program first.

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, procs, i;
        double myval, val;
        MPI_Status status;
        FILE *fp;

        MPI_Init(&argc, &argv);                 /* set up the MPI environment */
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);   /* get own process ID (= rank) */
        MPI_Comm_size(MPI_COMM_WORLD, &procs);  /* get total number of processes */

        if (myid == 0) {                        /* if my ID is 0 */
            fp = fopen("test.dat", "r");
            if (fp == NULL) {                   /* stop all processes if the file is missing */
                perror("test.dat");
                MPI_Abort(MPI_COMM_WORLD, 1);
            }
            fscanf(fp, "%lf", &myval);          /* input data for this process, keep it in myval */
            for (i = 1; i < procs; i++) {       /* i = 1 .. procs-1 */
                fscanf(fp, "%lf", &val);        /* input data, keep it in val */
                /* use MPI_Send to send the value in val to process i */
                MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
            }
            fclose(fp);
        } else {                                /* processes with ID other than 0 */
            /* use MPI_Recv to receive data from process 0, keep it in myval */
            MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        }

        /* each process prints its own myval */
        printf("PROCS: %d, MYID: %d, MYVAL: %e\n", procs, myid, myval);

        MPI_Finalize();                         /* end of parallel computing */
        return 0;
    }

Flow of the Sample Program
• Rank 0:
    - read data from the file into myval
    - read data from the file into val, send val to rank 1
    - read data from the file into val, send val to rank 2
    - print myval
• Rank 1:
    - receive data from rank 0 (wait for the arrival of the data), keep it in myval
    - print myval
• Rank 2:
    - receive data from rank 0 (wait for the arrival of the data), keep it in myval
    - print myval
• Multiple "processes" execute the program according to their number (= rank).

Sample Result of Execution
• The order of the output can differ from run to run, since each process proceeds independently.

    PROCS: 4 MYID: 1 MYVAL: 20.0000000000000000    (rank 1)
    PROCS: 4 MYID: 2 MYVAL: 30.0000000000000000    (rank 2)
    PROCS: 4 MYID: 0 MYVAL: 10.0000000000000000    (rank 0)
    PROCS: 4 MYID: 3 MYVAL: 40.0000000000000000    (rank 3)

Characteristics of the MPI Interface
• MPI programs are ordinary C programs
    - Not a new language
• Every process executes the same program
    - Each process does its own work according to its rank (= process number)
• A process cannot read or write the variables of another process directly
(Figure: flow of the sample program across ranks 0, 1, and 2. Rank 0 reads the file and sends; ranks 1 and 2 receive; every rank prints its own myval.)

TCP, UDP vs MPI
• MPI: a simple interface dedicated to parallel computing
    - SPMD (Single Program, Multiple Data streams) model
    - All processes execute the same program
• TCP, UDP: generic interfaces for various kinds of communication, such as internet servers
    - Server/client model
    - Each process executes its own program

MPI

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, procs, i;
        double myval, val;
        MPI_Status status;
        FILE *fp;

        /* initialize */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        if (myid == 0) {
            fp = fopen("test.dat", "r");
            fscanf(fp, "%lf", &myval);
            for (i = 1; i < procs; i++) {
                fscanf(fp, "%lf", &val);
                MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
            }
            fclose(fp);
        } else {
            MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        }
        printf("PROCS: %d, MYID: %d, MYVAL: %e\n", procs, myid, myval);
        MPI_Finalize();
        return 0;
    }

TCP client (fragment)

    /* initialize */
    sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    memset(&echoServAddr, 0, sizeof(echoServAddr));
    echoServAddr.sin_family = AF_INET;
    echoServAddr.sin_addr.s_addr = inet_addr(servIP);
    echoServAddr.sin_port = htons(echoServPort);
    connect(sock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr));

    echoStringLen = strlen(echoString);
    send(sock, echoString, echoStringLen, 0);

    totalBytesRcvd = 0;
    printf("Received: ");
    while (totalBytesRcvd < echoStringLen) {
        bytesRcvd = recv(sock, echoBuffer, RCVBUFSIZE - 1, 0);
        if (bytesRcvd <= 0)
            break;                    /* error or connection closed */
        totalBytesRcvd += bytesRcvd;
        echoBuffer[bytesRcvd] = '\0';
        printf("%s", echoBuffer);     /* do not use received data as the format string */
    }
    printf("\n");
    close(sock);

TCP server (fragment)

    /* initialize */
    servSock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    memset(&echoServAddr, 0, sizeof(echoServAddr));
    echoServAddr.sin_family = AF_INET;
    echoServAddr.sin_addr.s_addr = htonl(INADDR_ANY);
    echoServAddr.sin_port = htons(echoServPort);
    bind(servSock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr));
    listen(servSock, MAXPENDING);

    for (;;) {
        clntLen = sizeof(echoClntAddr);
        clntSock = accept(servSock, (struct sockaddr *) &echoClntAddr, &clntLen);
        recvMsgSize = recv(clntSock, echoBuffer, RCVBUFSIZE, 0);
        while (recvMsgSize > 0) {
            send(clntSock, echoBuffer, recvMsgSize, 0);
            recvMsgSize = recv(clntSock, echoBuffer, RCVBUFSIZE, 0);
        }
        close(clntSock);
    }

Layer of MPI
• MPI hides the differences between networks
(Figure: layer diagram. Applications sit on top of MPI. Below MPI are either Sockets / XTI over TCP / UDP over IP with an Ethernet driver and Ethernet card, or a high-speed interconnect such as InfiniBand.)

How to Compile and Execute MPI Programs
• Compile command: mpicc
    Example) mpicc -O3 test.c -o test.exe
    - -O3: optimization option (the letter O, not the digit 0)
    - test.c: source file to compile
    - -o test.exe: executable file to create
• Execution command: mpirun
    Example) mpirun -np 8 ./test.exe
    - -np 8: number of processes
    - ./test.exe: executable file to execute

Ex 0) Execution of an MPI program
Log in to psihexa and run the following commands. If you have time, try changing the number of processes or modifying the source program.

    $ cp /tmp/test-mpi.c .
    $ cp /tmp/test.dat .
    $ cat test-mpi.c
    $ cat test.dat
    $ mpicc test-mpi.c -o test-mpi
    $ mpirun -np 8 ./test-mpi

MPI Library
• The bodies of the MPI functions are stored in the "MPI library".
• mpicc automatically links the MPI library to the program.
(Figure: mpicc compiles the source program, whose main() calls MPI_Init, MPI_Comm_rank, MPI_Send, ..., and links it with the MPI library, which contains MPI_Init, MPI_Comm_rank, ..., to create the executable file.)

Basic Structure of MPI Programs
Crucial lines are marked with comments.

    #include <stdio.h>
    #include "mpi.h"                      /* header file "mpi.h" */

    int main(int argc, char *argv[])
    {
        ...
        MPI_Init(&argc, &argv);           /* function for start-up */
        ...
        /* MPI functions can be called between MPI_Init and MPI_Finalize */
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);
        ...
        MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
        ...
        MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        ...
        MPI_Finalize();                   /* function for finish */
        return 0;
    }

MPI Functions Today
• MPI_Init: initialization
• MPI_Finalize: finalization
• MPI_Comm_size: get the number of processes
• MPI_Comm_rank: get the rank (= process number) of this process
• MPI_Send and MPI_Recv: message passing
• MPI_Bcast and MPI_Gather: collective communication (= group communication)

MPI_Init
Usage: int MPI_Init(int *argc, char ***argv);
• Starts parallel execution in MPI
    - Starts the processes and establishes connections among them.
    - Must be called once before calling any other MPI function.
• Parameters:
    - Specify pointers to both arguments of the 'main' function.
    - They are referenced so that every process shares the name of the executable file and the options given to the mpirun command.

Example

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, procs;
        double myval, val;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);
        ...

MPI_Finalize
Usage: int MPI_Finalize(void);
• Finishes parallel execution
    - MPI functions cannot be called after this function.
    - Every process must call this function before exiting the program.

Example

    int main(int argc, char *argv[])
    {
        ...
        MPI_Finalize();
        return 0;
    }

MPI_Comm_rank
Usage: int MPI_Comm_rank(MPI_Comm comm, int *rank);
• Gets the rank (= process number) of the calling process
    - Returned in the second argument
• First argument = "communicator"
    - An identifier for a group of processes
    - In most cases, just specify MPI_COMM_WORLD here.
    - MPI_COMM_WORLD: the group that consists of all the processes in this execution
    - Processes can also be divided into multiple groups, each assigned a different job.

Example

    ...
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    ...

MPI_Comm_size
Usage: int MPI_Comm_size(MPI_Comm comm, int *size);
• Gets the number of processes
    - Returned in the second argument

Example

    ...
    MPI_Comm_size(MPI_COMM_WORLD, &procs);
    ...

Message Passing (One-to-One Communication)
• Communication between a "sender" process and a "receiver" process
• The send and receive functions must be called in a matching manner:
    - The "from" rank and the "to" rank are correct
    - The specified size of the data to be transferred is the same on both sides
    - The same "tag" is specified on both sides
(Figure: rank 0 sends. To: rank 1, size: 10 integers, tag: 100. Rank 1 receives. From: rank 0, size: 10 integers, tag: 100. Rank 1 waits for the message.)

MPI_Send
Usage: int MPI_Send(void *b, int c, MPI_Datatype d, int dest, int t, MPI_Comm comm);
• Information about the message to send:
    - start address of the data, number of elements, data type, rank of the destination, tag, communicator (= MPI_COMM_WORLD in most cases)
• data types: for example, MPI_INT for int and MPI_DOUBLE for double
• tag: an integer number attached to each message
    - Used in programs that handle messages arriving from unspecified processes.
    - Usually, you can specify 0.

Example

    ...
    MPI_Send(&val, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD);
    ...

Example of MPI_Send
• Send the value of an integer variable 'd' (one integer)

    MPI_Send(&d, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

• Send the first 100 elements of the array 'mat' (of type MPI_DOUBLE)

    MPI_Send(mat, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);

• Send 50 elements of the integer array 'data', from data[10] to data[59]

    MPI_Send(&(data[10]), 50, MPI_INT, 1, 0, MPI_COMM_WORLD);

MPI_Recv
Usage: int MPI_Recv(void *b, int c, MPI_Datatype d, int src, int t, MPI_Comm comm, MPI_Status *st);
• Information about the message to receive:
    - start address for storing the data, number of elements, data type, rank of the source, tag (= 0 in most cases), communicator (= MPI_COMM_WORLD in most cases), status
• status: a structure of type MPI_Status for storing information about the arrived message
    - Contains the source rank and the tag (not used in most cases).

Example

    ...
    MPI_Recv(&myval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
    ...

Collective Communications
• Communications performed by all of the processes in a group
• Examples:
    - MPI_Bcast: copy data to the other processes
        (Figure: one rank holds 3 1 8 2; afterwards ranks 0, 1, and 2 all hold 3 1 8 2.)
    - MPI_Gather: gather data from the other processes into an array
        (Figure: ranks 0, 1, and 2 hold 7, 5, and 9; the gathered array is 7 5 9.)
    - MPI_Reduce: apply a "reduction" operation to the distributed data to produce one array
        (Figure: ranks 0, 1, and 2 hold 1 2 3, 4 5 6, and 7 8 9; the result is 12 15 18.)

MPI_Bcast
Usage: int MPI_Bcast(void *b, int c, MPI_Datatype d, int root, MPI_Comm comm);
• Copies data on one process to all of the processes
• Parameters:
    - start address, number of elements, data type, root rank, communicator
    - root rank: rank of the process that has the original data
• Example: MPI_Bcast(a, 3, MPI_DOUBLE, 0, MPI_COMM_WORLD);
(Figure: the array a on rank 0 is copied to a on ranks 1, 2, and 3.)

MPI_Gather
Usage: int MPI_Gather(void *sb, int sc, MPI_Datatype st, void *rb, int rc, MPI_Datatype rt, int root, MPI_Comm comm);
• Gathers data from all of the processes to construct one array
• Parameters:
    - send data: start address, number of elements, data type
    - receive data: start address, number of elements, data type (meaningful only on the root rank; the number of elements is the count received from each process)
    - root rank, communicator
    - root rank: rank of the process that stores the resulting array
• Example: MPI_Gather(a, 3, MPI_DOUBLE, b, 3, MPI_DOUBLE, 0, MPI_COMM_WORLD);
(Figure: the arrays a on ranks 0, 1, 2, and 3 are gathered, in rank order, into the array b on rank 0.)

Usage of Collective Communications
• Every process must call the same function
    - For example, MPI_Bcast must be called not only by the root rank but also by all of the other ranks.
• For functions that take both send and receive buffers, the address ranges specified for sending and receiving must not overlap.
    - MPI_Gather, MPI_Allgather, MPI_Gatherv, MPI_Allgatherv, MPI_Reduce, MPI_Allreduce, MPI_Alltoall, MPI_Alltoallv, etc.

Summary
• In MPI, multiple processes run the same program
• Jobs are assigned according to the rank (the number) of each process
• Each process runs in its own memory space
• Data on other processes can be accessed only by explicit communication among processes
• MPI functions:
    - MPI_Init, MPI_Finalize, MPI_Comm_rank
    - MPI_Send, MPI_Recv
    - MPI_Bcast, MPI_Gather

Ex 1) A program that displays random numbers
Make a program in which each process displays its own rank together with one random integer.

Sample:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    int main(int argc, char *argv[])
    {
        int r;
        struct timeval tv;

        gettimeofday(&tv, NULL);
        srand(tv.tv_usec);      /* seed with the microsecond part of the current time */
        r = rand();
        printf("%d\n", r);
        return 0;
    }

Ex 1) (cont.)
Example of the result of execution

    1: 520391
    0: 947896500
    3: 1797525940
    2: 565917780
    4: 1618651506
    5: 274032293
    6: 1248787350
    7: 828046128

Ex 1) Sample of the answer

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int r, myid, procs;
        struct timeval tv;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Comm_size(MPI_COMM_WORLD, &procs);

        gettimeofday(&tv, NULL);
        srand(tv.tv_usec);      /* seed with the microsecond part of the current time */
        r = rand();
        printf("%d: %d\n", myid, r);   /* rank: random number */

        MPI_Finalize();
        return 0;
    }

Report: Display in order
Modify the program in Ex 1) so that the random numbers generated by the processes are printed in rank order, starting from rank 0.

Example of the result of execution

    0: 1524394631
    1: 999094501
    2: 941763604
    3: 526956378
    4: 152374643
    5: 1138154117
    6: 1926814754
    7: 156004811