BigBen @ PSC
BigBen Features
• Compute Nodes
  • 2068 nodes running the Catamount (QK) microkernel
  • Seastar interconnect in a 3-D torus configuration
  • No external connectivity (no TCP)
  • All inter-node communication is over Portals
  • Applications use MPI, which is built on Portals (a minimal sketch follows this slide)
• Service & I/O (SIO) Nodes
  • 22 nodes running SuSE Linux
  • Also on the Seastar interconnect
  • SIO nodes can have PCI-X hardware installed, defining unique roles for each
  • 2 SIO nodes are externally connected to ETF with 10GigE cards (currently)
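Since the compute nodes have no TCP stack, all application data leaves a node over Portals via MPI. The fragment below is a minimal sketch (not from the original talk) of a compute-node application funneling output blocks to a single aggregator rank with plain MPI point-to-point calls; the buffer size and message tag are illustrative values.

```c
/* Minimal sketch (illustrative, not PSC code): each compute rank sends an
 * output block to rank 0 over MPI, which on BigBen rides on Portals.
 * CHUNK_DOUBLES and DATA_TAG are arbitrary example values. */
#include <mpi.h>
#include <stdlib.h>

#define CHUNK_DOUBLES 4096
#define DATA_TAG      42

int main(int argc, char **argv)
{
    int rank, nprocs;
    double *chunk;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    chunk = malloc(CHUNK_DOUBLES * sizeof(double));
    for (int i = 0; i < CHUNK_DOUBLES; i++)
        chunk[i] = rank + i * 1e-6;        /* stand-in for simulation output */

    if (rank != 0) {
        /* Compute ranks push their block toward the aggregator rank. */
        MPI_Send(chunk, CHUNK_DOUBLES, MPI_DOUBLE, 0, DATA_TAG, MPI_COMM_WORLD);
    } else {
        /* Rank 0 collects one block per peer; a real application would
         * forward these toward the SIO nodes (e.g. via PDIO). */
        for (int src = 1; src < nprocs; src++)
            MPI_Recv(chunk, CHUNK_DOUBLES, MPI_DOUBLE, src, DATA_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    free(chunk);
    MPI_Finalize();
    return 0;
}
```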
Portals Direct I/O (PDIO) Details
• Portals-to-TCP routing
  • PDIO daemons aggregate hundreds of portals data streams into a configurable number of outgoing TCP streams (a mapping sketch follows this slide)
  • Heterogeneous portals (both QK and Linux nodes)
• Explicit parallelism
  • Configurable number of Portals receivers (on SIO nodes)
  • Distributed across multiple 10GigE-connected Service & I/O (SIO) nodes
  • Corresponding number of TCP streams (to the WAN), one per PDIO daemon
• A parallel TCP receiver in the Goodhue booth
  • Supports a variable/dynamic number of connections
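How the hundreds of portals data streams map onto a configurable number of outgoing TCP streams is not spelled out on the slide; the sketch below assumes a simple round-robin assignment of compute-node writers to PDIO daemons, which is one plausible way to balance the per-daemon (and per-TCP-stream) load. The names and counts are illustrative.

```c
/* Illustrative sketch (assumption, not the actual PDIO routing code):
 * map N compute-node portals writers onto M PDIO daemons / TCP streams
 * with a round-robin assignment. */
#include <stdio.h>

static int pdio_daemon_for_writer(int writer_rank, int n_daemons)
{
    return writer_rank % n_daemons;   /* round-robin N-to-M mapping */
}

int main(void)
{
    int n_writers = 512;   /* hundreds of portals writers on compute nodes */
    int n_daemons = 8;     /* configurable number of PDIO daemons */

    for (int w = 0; w < n_writers; w += 100)
        printf("writer %3d -> PDIO daemon %d\n",
               w, pdio_daemon_for_writer(w, n_daemons));
    return 0;
}
```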
Portals Direct I/O (PDIO) Details
• Utilizing the ETF network
  • 10GigE end-to-end
  • Benchmarked at >1 Gbps in testing
• Inherent flow-control feedback to the application
  • The aggregation protocol allows TCP transmission, or even remote file system performance, to throttle the data streams coming out of the application (!)
  • Variable message sizes and file metadata supported
• Multi-threaded ring buffer in the PDIO daemon (sketched below)
  • Allows the Portals receiver, TCP sender, and computation to proceed asynchronously
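The ring buffer is the mechanism behind the flow-control bullet: when the TCP sender (or the remote file system behind it) slows down, the buffer fills and the Portals receiver stalls, which in turn throttles the application's writers. Below is a minimal sketch of such a bounded ring buffer using pthreads; it is an assumption about the structure, not the actual PDIO source.

```c
/* Minimal sketch (assumption): a bounded ring buffer shared by the
 * Portals-receiver thread (producer) and the TCP-sender thread
 * (consumer).  Blocking on "full" is what propagates WAN or remote
 * file system backpressure toward the application. */
#include <pthread.h>
#include <stddef.h>

#define RING_SLOTS 64                 /* illustrative ring length */

typedef struct {
    void  *data;
    size_t len;
} pdio_msg_t;

typedef struct {
    pdio_msg_t      slots[RING_SLOTS];
    int             head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  not_full, not_empty;
} pdio_ring_t;

void ring_init(pdio_ring_t *r)
{
    r->head = r->tail = r->count = 0;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->not_full, NULL);
    pthread_cond_init(&r->not_empty, NULL);
}

/* Portals receiver thread: blocks when the ring is full, stalling the
 * incoming portals stream (flow control back toward the application). */
void ring_put(pdio_ring_t *r, pdio_msg_t m)
{
    pthread_mutex_lock(&r->lock);
    while (r->count == RING_SLOTS)
        pthread_cond_wait(&r->not_full, &r->lock);
    r->slots[r->head] = m;
    r->head = (r->head + 1) % RING_SLOTS;
    r->count++;
    pthread_cond_signal(&r->not_empty);
    pthread_mutex_unlock(&r->lock);
}

/* TCP sender thread: blocks when there is nothing to send. */
pdio_msg_t ring_get(pdio_ring_t *r)
{
    pthread_mutex_lock(&r->lock);
    while (r->count == 0)
        pthread_cond_wait(&r->not_empty, &r->lock);
    pdio_msg_t m = r->slots[r->tail];
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count--;
    pthread_cond_signal(&r->not_full);
    pthread_mutex_unlock(&r->lock);
    return m;
}
```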
Portals Direct I/O (PDIO) Config
• User-configurable/tunable parameters (an illustrative config struct follows this slide):
  • Network targets: can be different for each job
  • Number of streams: can be tuned for optimal host/network utilization
  • TCP network buffer size: can be tuned for maximum throughput over the WAN
  • Ring buffer size/length: controls total memory utilization of the PDIO daemons
  • Number of portals writers: can be any subset of the running application's processes
  • Remote filename(s): file metadata are propagated through the full chain, per write
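The slide lists the knobs without showing how they reach the daemons. As a hypothetical illustration, the sketch below gathers them into one struct and reads them from environment variables; every variable name and default here is invented for the example, not a real PDIO option.

```c
/* Hypothetical configuration sketch: the slide's tunables as a struct.
 * The PDIO_* environment variable names and the defaults are invented
 * for illustration. */
#include <stdlib.h>

typedef struct {
    const char *target_host;     /* network target; can differ per job */
    int         target_port;
    int         n_streams;       /* number of outgoing TCP streams */
    long        tcp_buf_bytes;   /* TCP socket buffer size (WAN tuning) */
    int         ring_slots;      /* ring buffer length per daemon */
    int         n_writers;       /* subset of application ranks that write */
    const char *remote_fname;    /* remote filename template */
} pdio_config_t;

static long env_or(const char *name, long dflt)
{
    const char *v = getenv(name);          /* fall back when unset */
    return v ? atol(v) : dflt;
}

pdio_config_t pdio_config_load(void)
{
    pdio_config_t c;
    const char *host  = getenv("PDIO_TARGET");
    const char *fname = getenv("PDIO_REMOTE_FILE");

    c.target_host   = host  ? host  : "receiver.example.org";
    c.target_port   = (int)env_or("PDIO_PORT", 5000);
    c.n_streams     = (int)env_or("PDIO_STREAMS", 4);
    c.tcp_buf_bytes = env_or("PDIO_TCP_BUF", 4L << 20);   /* 4 MiB */
    c.ring_slots    = (int)env_or("PDIO_RING_SLOTS", 64);
    c.n_writers     = (int)env_or("PDIO_WRITERS", 128);
    c.remote_fname  = fname ? fname : "output-%d.dat";
    return c;
}
```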
Demo Sequence (animation frames)
• Each frame shows the same diagram: the application on the Compute Nodes feeds pdiod daemons on the I/O Nodes at PSC, which route data over the ETF network to recv processes at iGRID, with a steering path returning from iGRID to the running job.
• HPC resource and renderer waiting…
• Launch PPM job, PDIO daemons, and iGRID recv'ers
• Aggregate data via Portals
• Route traffic to ETF net
• Recv data @ iGRID (a receiver sketch follows this list)
• Render real-time data
• Send steering data back to active job
• Dynamically update rendering
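The "Recv data @ iGRID" step relies on the parallel TCP receiver mentioned earlier, which supports a variable/dynamic number of connections. The sketch below shows one way such a receiver could look: a single accept loop that hands each incoming PDIO stream to its own thread. The port, the thread-per-connection design, and the drain loop are assumptions for illustration, not the actual iGRID receiver.

```c
/* Illustrative sketch (assumption): a parallel TCP receiver that accepts
 * however many PDIO streams arrive and drains each on its own thread.
 * A real receiver would write the data to disk or hand it to the
 * renderer instead of discarding it. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define RECV_PORT 5000                 /* illustrative listen port */

static void *drain_stream(void *arg)
{
    int fd = (int)(long)arg;
    char buf[1 << 16];
    while (read(fd, buf, sizeof buf) > 0)
        ;                              /* consume until the sender closes */
    close(fd);
    return NULL;
}

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(RECV_PORT);

    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, 16);

    for (;;) {                         /* variable/dynamic number of connections */
        int cfd = accept(lfd, NULL, NULL);
        if (cfd < 0)
            continue;
        pthread_t tid;
        pthread_create(&tid, NULL, drain_stream, (void *)(long)cfd);
        pthread_detach(tid);
    }
}
```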