Group and Communicator Management Routines
Research Computing, UNC - Chapel Hill
Instructor: Mark Reed
Email: markreed@unc.edu
Groups A group is an ordered set of processes. Each process in a group is associated with a unique integer rank. Rank values start at 0 and go to N-1, where N is the number of processes in the group. In MPI, a group is represented within system memory as an object accessible to the programmer only by a "handle". A group is always associated with a communicator object.
Communicators
• A communicator encompasses a group of processes that may communicate with each other.
• All messages must specify a communicator.
• Like groups, communicators are represented within system memory as objects, accessible to the programmer only by "handles".
• For example, the handle of the communicator comprising all tasks is MPI_COMM_WORLD.
Groups - Communicators
• Communicators specify a communication domain; that is, they provide a self-contained communication "world" in which to exchange messages.
• Typically they bind process groups and contexts together to form a safe communication space within the group.
• Intracommunicators are used for communication within a group.
• Intercommunicators are used for communication between disjoint groups.
Groups - Communicators From the programmer's perspective, a group and a communicator often appear the same. The group routines are primarily used to specify which processes should be used to construct a communicator
Group and Communicator Objects
Primary purposes:
• Allow you to organize tasks, based upon function, into task groups.
• Enable collective communication operations across a subset of related tasks.
• Provide a basis for implementing user-defined virtual topologies.
• Provide for safe communications.
Communicators Groups/communicators are dynamic - they can be created and destroyed during program execution. Processes may be in more than one group/communicator. They will have a unique rank within each group/communicator. MPI provides over 40 routines related to groups, communicators, and virtual topologies
Typical usage:
• Extract the handle of the global group from MPI_COMM_WORLD using MPI_Comm_group.
• Form a new group as a subset of the global group using MPI_Group_incl or one of the many other group constructors.
• Create a new communicator for the new group using MPI_Comm_create.
• Determine the new rank in the new communicator using MPI_Comm_rank.
Typical usage cont.:
• Conduct communications using any MPI message passing routine.
• When finished, free the new communicator and group (optional) using MPI_Comm_free and MPI_Group_free.
MPI_Group_rank int MPI_Group_rank (MPI_Group group, int *rank) Returns the rank of this process in the given group or MPI_UNDEFINED if the process is not a member.
MPI_Group_size int MPI_Group_size(MPI_Group group, int *size) Returns the size of a group - number of processes in the group.
MPI_Group_compare • int MPI_Group_compare (MPI_Group group1, MPI_Group group2, int *result) • result - returned result of comparison • Compares two groups and returns an integer result which is MPI_IDENT if the order and members of the two groups are the same, MPI_SIMILAR if only the members are the same, and MPI_UNEQUAL otherwise.
MPI_Comm_group int MPI_Comm_group (MPI_Comm comm, MPI_Group *group) group - returned handle of the group associated with comm. Determines the group associated with the given communicator.
MPI_Group_excl int MPI_Group_excl (MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
Produces a group by deleting the listed ranks from an existing group and keeping only the unlisted members.
• n - the size of the ranks array
• ranks - array with the list of ranks to exclude from the new group; each should be valid and distinct
• newgroup - new group derived from above, preserving the order defined by group (handle)
See also MPI_Group_range_excl
MPI_Group_incl int MPI_Group_incl (MPI_Group group, int n, int *ranks, MPI_Group *newgroup)
Produces a group by reordering an existing group and taking only the listed members.
• n - the size of the ranks array
• ranks - array with the list of ranks to include in the new group; each should be valid and distinct
• newgroup - new group derived from above, in the order defined by ranks: the process with rank i in newgroup is the process with rank ranks[i] in group (handle)
See also MPI_Group_range_incl
MPI_Group_intersection int MPI_Group_intersection ( MPI_Group group1, MPI_Group group2, MPI_Group *newgroup) Produces a group as the intersection of two existing groups group1 - handle of first group group2 - handle of second group newgroup - handle of intersection group
MPI_Group_union int MPI_Group_union (MPI_Group group1, MPI_Group group2, MPI_Group *newgroup) Produces a group by combining two groups. group1 - handle of first group group2 - handle of second group newgroup - handle of union group
MPI_Group_difference int MPI_Group_difference ( MPI_Group group1, MPI_Group group2, MPI_Group *newgroup) Creates a group from the difference of two groups. group1 - handle of first group group2 - handle of second group newgroup - handle of difference group
set-like operations:
• union - all elements of the first group, followed by all elements of the second group that are not in the first.
• intersection - all elements of the first group that are also in the second group, ordered as in the first group.
• difference - all elements of the first group that are not in the second group, ordered as in the first group.
set-like operations cont. : Note that for these operations the order of processes in the output group is determined primarily by order in the first group (if possible) and then, if necessary, by order in the second group. Neither union nor intersection are commutative, but both are associative. The new group can be empty, that is, equal to MPI_GROUP_EMPTY.
MPI_Group_free int MPI_Group_free (MPI_Group *group) Frees a group. This operation marks a group object for deallocation. The handle group is set to MPI_GROUP_NULL by the call. Any on-going operation using this group will complete normally.
Manipulating Communicators Accessors, Constructors, Destructors
MPI_Comm_compare int MPI_Comm_compare (MPI_Comm comm1, MPI_Comm comm2, int *result)
Compares two communicators and returns an integer result:
• MPI_IDENT - contexts and groups are the same (both handles refer to the same communicator)
• MPI_CONGRUENT - different contexts but identical groups (same members in the same rank order)
• MPI_SIMILAR - different contexts but similar groups (same members in a different rank order)
• MPI_UNEQUAL - otherwise
MPI_Comm_create
• int MPI_Comm_create (MPI_Comm comm, MPI_Group group, MPI_Comm *newcomm)
• Creates a new communicator from the old communicator and the new group.
• comm - communicator associated with the old group
• group - new group to create a communicator for
• newcomm - returns the new communicator (handle)
• Note: the call is executed by all processes in comm (even if they're not in the new group)
• Returns MPI_COMM_NULL to non-members
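MPI_Comm_split
int MPI_Comm_split (MPI_Comm comm, int color, int key, MPI_Comm *newcomm)
Partitions the group of comm into disjoint subgroups and creates a communicator for each; every process receives the communicator of its own subgroup.
The arguments include 2 control arguments:
• color - nonnegative integer that selects the subgroup the process joins
• key - ranks in the new communicator are ordered by integer key value; the tiebreaker is the original rank
• A new communicator is created for each distinct "color"
• Use MPI_UNDEFINED as the color argument to be excluded from all subgroups (newcomm is then MPI_COMM_NULL)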
MPI_Comm_dup
• int MPI_Comm_dup (MPI_Comm comm, MPI_Comm *newcomm)
• Duplicates an existing communicator with all its associated information.
• This is useful for building safe parallel libraries.
MPI_Comm_free int MPI_Comm_free (MPI_Comm *comm) Marks the communicator object for deallocation. The handle is set to MPI_COMM_NULL. Any pending operations that use this communicator will complete normally; the object is actually deallocated only if there are no other active references to it.
Group and Communicator Routines Example
/* NOTE: This does not work on all systems - buggy! */
/* Create two different process groups for separate collective
   communications exchange. Requires creating new communicators */
#include "mpi.h"
#include <stdio.h>
#define NPROCS 8

int main(int argc, char *argv[])
{
    int rank, new_rank, sendbuf, recvbuf,
        ranks1[4] = {0, 1, 2, 3}, ranks2[4] = {4, 5, 6, 7};
    MPI_Group orig_group, new_group;
    MPI_Comm new_comm;
Group and Communicator Routines Example
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    sendbuf = rank;

    /* Extract the original group handle */
    MPI_Comm_group(MPI_COMM_WORLD, &orig_group);

    /* Divide tasks into two groups based upon rank */
    /* Note new_group has a different value on each PE */
    if (rank < NPROCS/2) {
        MPI_Group_incl(orig_group, NPROCS/2, ranks1, &new_group);
    } else {
        MPI_Group_incl(orig_group, NPROCS/2, ranks2, &new_group);
    }
Group and Communicator Routines Example
    /* Create new communicator and then perform collective communications */
    /* Passing a different group on different processes is the likely source
       of the "buggy" note: it is only permitted from MPI-2.2 onward */
    MPI_Comm_create(MPI_COMM_WORLD, new_group, &new_comm);
    MPI_Allreduce(&sendbuf, &recvbuf, 1, MPI_INT, MPI_SUM, new_comm);

    MPI_Group_rank(new_group, &new_rank);
    printf("rank= %d newrank= %d recvbuf= %d\n", rank, new_rank, recvbuf);

    MPI_Finalize();
    return 0;
}
Sample program output:
rank= 7 newrank= 3 recvbuf= 22
rank= 0 newrank= 0 recvbuf= 6
rank= 1 newrank= 1 recvbuf= 6
rank= 2 newrank= 2 recvbuf= 6
rank= 6 newrank= 2 recvbuf= 22
rank= 3 newrank= 3 recvbuf= 6
rank= 4 newrank= 0 recvbuf= 22
rank= 5 newrank= 1 recvbuf= 22
Previous Example Done with MPI_Comm_split
/* this fixes the buggy Maui code by using MPI_Comm_split */
#include "mpi.h"
#include <stdio.h>
#define NPROCS 8
#define MASTER 0
#define MSGSIZE 7

int main(int argc, char *argv[])
{
    int rank, new_rank, sendbuf, recvbuf, color;
    char msg[MSGSIZE+1] = " ";
    MPI_Comm new_comm;
Split example, Cont.
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    sendbuf = rank;

    /* Divide tasks into two distinct groups. MPI_Comm_split */
    /* creates the new group and its communicator in one call. */
    /* Find new rank in new group and setup for the */
    /* collective communication broadcast if MASTER. */
    /* use integer division to split group into 2 "colors" */
    /* 0 and 1 */
    color = (2*rank)/NPROCS;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &new_comm);
    MPI_Comm_rank(new_comm, &new_rank);
Split Concluded
    if (new_rank == MASTER)
        sprintf(msg, "Group %d", color+1);

    /* msg is already a char pointer here, so no & is needed */
    MPI_Bcast(msg, MSGSIZE, MPI_CHAR, MASTER, new_comm);
    MPI_Allreduce(&sendbuf, &recvbuf, 1, MPI_INT, MPI_SUM, new_comm);

    printf("rank= %d newrank= %d msg= %s sum=%d\n", rank, new_rank, msg, recvbuf);

    MPI_Finalize();
    return 0;
}