Mapping Data to HPC Architectures
Dr. Jim Holten
CS 351 / IT 351 Modeling and Simulation Technologies

Overview
• Data Parallel Partitioning
• Partitions to HW Architecture
• Processor Loads
• Communications Loads

Data Parallel Partitioning
• Partitioning the primary set members
• Projecting to secondary set members
• Identifying shared set members
• Building communications maps

Partitioning Primary Set Members
• Representing the primary set's partitioning
  • $S_p$: the primary set.
  • $S_{par}$: the set of partitions.
  • $R_{par,p}$: the mapping from each partition to the primary set members it contains.
• Generating the partition assignments
  • $S_{p_i} = \mathrm{range}(R_{par,p_i}, i)$: the partition $i$ subset of the primary set.
  • $R_{p_i,p}$: the subset relation for $S_{p_i}$.
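
The slide does not say how partition assignments are generated, so the following is only a minimal sketch under stated assumptions: the primary set is a sequence of member ids, partitions are contiguous blocks of nearly equal size, and $R_{par,p}$ is stored as a dictionary. The names `partition_primary` and `partition_subset` are hypothetical.

```python
def partition_primary(primary, n_parts):
    """Build R_par_p: partition index -> list of primary set members.

    Assumption: contiguous block assignment; any other strategy
    (graph partitioner, space-filling curve) would fit the same shape.
    """
    members = list(primary)
    base, extra = divmod(len(members), n_parts)
    r_par_p = {}
    start = 0
    for i in range(n_parts):
        # The first `extra` partitions take one additional member.
        size = base + (1 if i < extra else 0)
        r_par_p[i] = members[start:start + size]
        start += size
    return r_par_p


def partition_subset(r_par_p, i):
    """S_pi: the members that R_par_p associates with partition i."""
    return set(r_par_p[i])


r_par_p = partition_primary(range(10), 3)
s_p0 = partition_subset(r_par_p, 0)
```

Every primary member lands in exactly one partition, so the subsets $S_{p_i}$ are disjoint and together cover $S_p$.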

Partitioning Secondary Set Members
• Sets
  • $S_{par}$: partitions set
  • $S_p$: primary set
  • $S_k$: secondary set
• Subsets
  • $S_{p_i}$: $i$th subset of the primary set
  • $S_{k_i}$: $i$th subset of the secondary set

Partitioning Secondary Set Members
• Main relations
  • $R_{par,p}$: partitions to primaries
  • $R_{k,p}$: secondary to primary
  • $R_{p_i,p}$: primary's subset
• Desired relations
  • $R_{k_i,k}$: secondary's subset
  • $R_{k_i,p_i}$: partition secondary to primary
• Intermediate relation
  • $R_{p_i,k}$: primary's subset to secondary set

Partitioning Secondary Set Members
• Calculate the intermediate relation
  • $R_{p_i,k} = R_{p_i,p} * R_{k,p}^{-1}$: maps $S_{p_i}$ to $S_k$
• Project to partition assignments
  • $R_{k_i,k} = \mathrm{Range}(R_{p_i,k})$
  • $R_{k_i,p_i} = R_{k_i,k} * R_{p_i,k}^{-1}$
• Repeat for all secondary sets.
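
The projection steps can be sketched with relations stored as sets of pairs. This is an illustration under assumptions, not the SRF implementation: a subset relation is represented as identity pairs, `*` is read as relational composition, and the helper names (`compose`, `inverse`, `range_of`, `project_partition`) are hypothetical. The example uses a 1-D mesh with cells as the primary set and nodes as the secondary set.

```python
def compose(r, s):
    """Relational composition: (a, c) for every (a, b) in r and (b, c) in s."""
    by_first = {}
    for b, c in s:
        by_first.setdefault(b, set()).add(c)
    return {(a, c) for a, b in r for c in by_first.get(b, ())}


def inverse(r):
    """Swap the two sides of every pair."""
    return {(b, a) for a, b in r}


def range_of(r):
    """Members that appear on the right-hand side of the relation."""
    return {b for _, b in r}


def project_partition(s_pi, r_k_p):
    """Project primary subset S_pi onto a secondary set via R_k_p."""
    r_pi_p = {(p, p) for p in s_pi}            # subset relation (assumed identity pairs)
    r_pi_k = compose(r_pi_p, inverse(r_k_p))   # R_pi,k = R_pi,p * R_k,p^-1
    s_ki = range_of(r_pi_k)                    # members selected by Range(R_pi,k)
    r_ki_k = {(k, k) for k in s_ki}            # R_ki,k as a subset relation
    r_ki_pi = compose(r_ki_k, inverse(r_pi_k)) # R_ki,pi = R_ki,k * R_pi,k^-1
    return s_ki, r_ki_pi


# Cells 0..3 are primary; nodes 0..4 are secondary; cell c touches nodes c and c+1.
r_k_p = {(n, c) for c in range(4) for n in (c, c + 1)}
s_k0, r_k0_p0 = project_partition({0, 1}, r_k_p)
s_k1, r_k1_p1 = project_partition({2, 3}, r_k_p)
```

Node 2 appears in both `s_k0` and `s_k1`: secondary subsets may overlap even when the primary subsets are disjoint.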

Partitioning Dependent Sets

Projections of Partitions

Identifying Sharing of Set Members
• Getting shared members (partition subset intersections)
  • $S_{p_{ij}} = S_{p_i} \cap S_{p_j}$ for all $i \neq j$: the primary set members shared between partitions $i$ and $j$
  • $S_{k_{ij}} = S_{k_i} \cap S_{k_j}$ for all $i \neq j$: the members of the $k$th secondary set shared between partitions $i$ and $j$
• Each gives a subset relation for the shared subset ($R_{p_{ij},p_i}$ and $R_{k_{ij},k_i}$) enumerating the shared members.
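
A minimal sketch of the pairwise intersections, assuming each partition's subset is a Python `set` keyed by partition index. The function name `shared_members` is hypothetical. Because intersection is symmetric, only pairs with $i < j$ are computed, and empty intersections are dropped.

```python
from itertools import combinations


def shared_members(subsets):
    """Map each partition pair (i, j) with i < j to its non-empty intersection."""
    shared = {}
    for i, j in combinations(sorted(subsets), 2):
        common = subsets[i] & subsets[j]
        if common:
            shared[(i, j)] = common
    return shared


# Secondary (node) subsets for three partitions of a 1-D mesh.
s_k = {0: {0, 1, 2}, 1: {2, 3, 4}, 2: {4, 5, 6}}
s_kij = shared_members(s_k)
```

Here partitions 0 and 2 have no common members, so no entry is produced for that pair.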

Partition Set Member Subsets
• For each process, each set has three subsets of interest
  • Private subset: not shared with anyone else
  • Shared subset: locally modified, then shared
  • Borrowed subset: used locally but set by another partition
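
The three subsets can be sketched as follows. The slides do not state which partition sets the value of a member held by several partitions, so this sketch assumes a simple ownership rule: the lowest-numbered partition holding a member owns it. Both that rule and the name `classify_members` are assumptions made for illustration.

```python
def classify_members(subsets, i):
    """Split partition i's members into (private, shared, borrowed).

    Assumption: a member held by several partitions is owned (set) by
    the lowest-numbered partition that holds it.
    """
    private, shared, borrowed = set(), set(), set()
    for m in subsets[i]:
        holders = [j for j, s in subsets.items() if m in s]
        if holders == [i]:
            private.add(m)       # nobody else holds it
        elif min(holders) == i:
            shared.add(m)        # set here, sent to the other holders
        else:
            borrowed.add(m)      # set elsewhere, received here
    return private, shared, borrowed


s_k = {0: {0, 1, 2}, 1: {2, 3, 4}, 2: {4, 5, 6}}
private_1, shared_1, borrowed_1 = classify_members(s_k, 1)
```

For partition 1, member 3 is private, member 4 is shared (partition 2 borrows it), and member 2 is borrowed from partition 0.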

Partition Set Member Subsets

Partition Communications Blocks
• For each pair of processes, a collection of subsets may be passed as a single communications block
  • Shared subsets: go out to the other process
  • Borrowed subsets: come in from the other process
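
One way to collect the per-pair subsets into a single block description is sketched below. It assumes several sets (here hypothetical "nodes" and "edges") are each stored as a mapping from partition to members, and it reuses the assumed rule that the lowest-numbered holder owns a member. The name `comm_blocks` is hypothetical.

```python
def comm_blocks(sets_by_name, i, j):
    """Describe what partition i sends to and receives from partition j.

    sets_by_name: set name -> {partition index: set of members}.
    Assumption: the lowest-numbered partition holding a member owns it.
    Returns (outgoing, incoming), each mapping set name -> sorted members.
    """
    outgoing, incoming = {}, {}
    for name, subsets in sets_by_name.items():
        for m in sorted(subsets[i] & subsets[j]):
            owner = min(p for p, s in subsets.items() if m in s)
            if owner == i:
                outgoing.setdefault(name, []).append(m)   # shared by i
            elif owner == j:
                incoming.setdefault(name, []).append(m)   # borrowed by i
    return outgoing, incoming


sets_by_name = {
    "nodes": {0: {0, 1, 2}, 1: {2, 3, 4}},
    "edges": {0: {10, 11}, 1: {11, 12}},
}
out_01, in_01 = comm_blocks(sets_by_name, 0, 1)
out_10, in_10 = comm_blocks(sets_by_name, 1, 0)
```

What partition 0 sends is exactly what partition 1 receives, so both sides can derive matching block layouts independently.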

Partition Communication Blocks

Partition Communications

Which Processes Must Share?
• When must subset field values be shared?
  • The subset is not empty
  • The field data values change
  • Field calculations use the changed data
• Empty subsets and empty communications blocks can be ignored.

Gather/Scatter
• Outgoing data must be “gathered” into the communications block
• Incoming data must be “scattered” back into the local subset data fields.
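
Gather and scatter can be sketched with plain lists. The field name `temperature` and the index lists are hypothetical; the point is that sender and receiver each use their own local indices for the same members, listed in the same order.

```python
def gather(field, indices):
    """Copy the listed local entries of a field into a contiguous block."""
    return [field[i] for i in indices]


def scatter(field, indices, block):
    """Write a received block back into the listed local entries."""
    for i, value in zip(indices, block):
        field[i] = value


# Sending process: local entries 2 and 3 are its shared members.
temperature_a = [10.0, 11.0, 12.0, 13.0]
block = gather(temperature_a, [2, 3])

# Receiving process: the same two members live at local entries 0 and 1.
temperature_b = [0.0, 0.0, 20.0]
scatter(temperature_b, [0, 1], block)
```

In a real run the block would travel through a message-passing layer between the two calls; that transport is outside this sketch.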

Gather/Scatter

Actual Data to Be Passed?
• Fields of values over the “shared” subset members that are changed locally must be sent to the partitions that use them.
• Fields of values over the “borrowed” subset members that are needed in local calculations must be received from the partition that sets them.
• The order and data types of the field values must be standardized for each communications block passed.
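
A standardized block layout might look like the sketch below, which uses Python's standard `struct` module. The field names, their order, and the little-endian encoding are all assumptions chosen for illustration; what matters is that both processes agree on one fixed layout.

```python
import struct

# Assumed agreed layout: field order and type codes ("d" = float64, "i" = int32).
BLOCK_LAYOUT = [("temperature", "d"), ("pressure", "d"), ("material_id", "i")]


def pack_block(fields, indices):
    """Serialize the listed members of every field, field by field."""
    payload = bytearray()
    for name, code in BLOCK_LAYOUT:
        values = [fields[name][i] for i in indices]
        # "<" selects little-endian, standard sizes, no padding.
        payload += struct.pack(f"<{len(values)}{code}", *values)
    return bytes(payload)


def unpack_block(payload, count):
    """Recover the per-field value lists for `count` members."""
    out, offset = {}, 0
    for name, code in BLOCK_LAYOUT:
        fmt = f"<{count}{code}"
        out[name] = list(struct.unpack_from(fmt, payload, offset))
        offset += struct.calcsize(fmt)
    return out


fields = {
    "temperature": [1.5, 2.5, 3.5],
    "pressure": [100.0, 101.0, 102.0],
    "material_id": [7, 8, 9],
}
payload = pack_block(fields, [0, 2])
received = unpack_block(payload, 2)
```

Two members with two 8-byte fields and one 4-byte field give a 40-byte block.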

Partitions to HW Architecture?
• Get a graph of the HW architecture
  • Each CPU is a node
  • Shared comms between nodes are the links
    • Ethernet*
    • Direct P2P
    • Shared memory*
    • Common file system*
  • (*) Comms shared among multiple links require comm nodes.
• Associate the partitions with the CPU graph nodes
  • Identifies processor loading
  • Identifies comm alternatives between partitions
  • Identifies comm link loading
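
The mapping can be sketched as a small graph computation. Everything concrete here is an assumption for illustration: the machine layout, the comm node names `shm` and `eth`, the work and traffic figures, and the choice of routing each exchange along a fewest-hops path. The graph is assumed to be connected.

```python
from collections import defaultdict, deque


def shortest_path(graph, src, dst):
    """Fewest-hops path by breadth-first search (graph assumed connected)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in graph[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def loads(graph, placement, work, traffic):
    """Sum work per CPU and traffic per link for a given placement."""
    cpu_load = defaultdict(float)
    for part, cpu in placement.items():
        cpu_load[cpu] += work[part]
    link_load = defaultdict(float)
    for (a, b), volume in traffic.items():
        path = shortest_path(graph, placement[a], placement[b])
        for u, v in zip(path, path[1:]):
            link_load[frozenset((u, v))] += volume
    return dict(cpu_load), dict(link_load)


# cpu0 and cpu1 share memory; all three CPUs share one Ethernet segment.
graph = {
    "cpu0": ["shm", "eth"], "cpu1": ["shm", "eth"], "cpu2": ["eth"],
    "shm": ["cpu0", "cpu1"], "eth": ["cpu0", "cpu1", "cpu2"],
}
placement = {0: "cpu0", 1: "cpu1", 2: "cpu2", 3: "cpu2"}   # partition -> CPU
work = {0: 4.0, 1: 4.0, 2: 2.0, 3: 2.0}                    # cost per partition
traffic = {(0, 1): 8.0, (1, 2): 3.0, (2, 3): 5.0}          # volume per pair
cpu_load, link_load = loads(graph, placement, work, traffic)
```

Partitions 2 and 3 sit on the same CPU, so their exchange adds no link load. Comparing `cpu_load` and `link_load` across candidate placements is one simple basis for static load balancing.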

Conclusions
• Data parallel partitioning can be easily automated for any number of partitions if the dependencies are explicitly given, as in SRF relations.
• Mapping to an HW architecture can also be automated, including static load balancing.
• The same techniques may be extended for dynamic load balancing.