OpenFabrics 2.0 rsockets+ requirements
Sean Hefty - Intel Corporation
Bob Russell, Patrick MacArthur - UNH
Data Streaming
• Current: RDMA CM for connection setup
• Single wait object and event queue
• CM and CQ – use same fd
• In-band disconnect notification
• Associate transport resource with an fd
• fstat, dup2
• Fork support
• Migrate resources between user space and kernel
• chroot support
www.openfabrics.org
Data Streaming
• Current: RDMA write with immediate
• Eliminate address and rkey exchange
• Receiver selects key
• Sender uses offset
• Eliminate need for immediate data
• Generate event based on write: location and length
Data Streaming
• Eliminate posting receives
• No buffer is provided
• Concern is overrunning CQ, not RQ
• Replace RDMA write with send
• Receiver posts single buffer that hardware packs multiple messages into
• Eliminates RDMA header
• Count of completed sends
• Full completion data unnecessary
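The packed receive buffer idea can be modeled in a few lines of C. This is only a sketch: `struct packed_rx` and `packed_rx_deliver` are hypothetical names, not part of verbs or rsockets, and the function stands in for work the hardware would do. The receiver posts one buffer, each arriving send is placed directly after the previous one, and the only completion state is a running count.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one large receive buffer that the transport
 * packs incoming sends into back to back. Not a real verbs or rsockets API. */
struct packed_rx {
    unsigned char *base;   /* buffer posted once by the receiver */
    size_t size;           /* total bytes available */
    size_t used;           /* bytes consumed by packed messages so far */
    unsigned long count;   /* number of completed sends */
};

/* Stand-in for what the hardware would do on each arriving send.
 * Returns the offset where the message was placed, or -1 if it does not fit. */
static long packed_rx_deliver(struct packed_rx *rx, const void *msg, size_t len)
{
    if (len > rx->size - rx->used)
        return -1;

    long off = (long)rx->used;
    memcpy(rx->base + rx->used, msg, len);
    rx->used += len;
    rx->count++;           /* a count replaces full completion data */
    return off;
}
```

The receiver learns how much data arrived from `used` and how many sends completed from `count`, without consuming one receive entry per message.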
Data Streaming
• Split received data into two buffers
• Separate header and user data
• Pack tightly, but use multiple buffers
• Partial completion event
• Notification of partial transfer for large requests
• Allow receive side to begin processing
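For comparison, plain sockets already offer a header/data split through scatter reads. The sketch below uses the POSIX `readv` call to land the first bytes of a message in a header buffer and the rest in a user data buffer. `recv_split` is a hypothetical helper name, and this shows the socket-side analogue of the request, not an existing RDMA feature.

```c
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: read one message from fd, placing the first
 * hdr_len bytes in hdr and the remainder in data.
 * Returns the total bytes read, or -1 with errno set (readv semantics). */
static ssize_t recv_split(int fd, void *hdr, size_t hdr_len,
                          void *data, size_t data_len)
{
    struct iovec iov[2] = {
        { .iov_base = hdr,  .iov_len = hdr_len  },  /* filled first */
        { .iov_base = data, .iov_len = data_len },  /* filled once hdr is full */
    };
    return readv(fd, iov, 2);
}
```

Data is packed tightly: the second buffer receives bytes only after the first is full, so the header length must be known in advance.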
Data Streaming
• Nonblocking support
• Signal when transport is ready to accept new data
• Available QP and CQ resources, send credits
• Keepalive support
• 0-byte send that does not generate a remote event
• Similar to RDMA-write, but eliminate header
Datagram
• User selectable transport address (QPN)
• High QPN lookup costs
• Message backlog
• Multi-receive message buffer
• Single buffer receives multiple messages
• Split received data into two buffers
• Separate header and user data
• Pack tightly, but use multiple buffers
Datagram
• Fast address resolution
• Compact address data
• Multicast support
• Fast access to multicast group
General Requests
• Increase size of immediate data
• Provide easy mechanism to discover if immediate data is supported and size
• Slab based allocation for receive buffers
• Eliminate wasted space dealing with max message size
• Eliminate posting of ‘dummy’ receive for immediate data
General Requests
• Add timeout parameters to all CM operations
• E.g. connect, accept, disconnect, join multicast
• Timeout parameters for reading events
• Ability to cancel a pending I/O
• Including CM operations
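A timeout on reading events could look like the following sketch, which assumes events are signaled through a pollable fd (as with the single wait object requested earlier). `wait_event` is a hypothetical helper built on the POSIX `poll` call; the choice to report a timeout as `-ETIMEDOUT` is an assumption made here, not something the slides specify.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 0 when an event is ready, -ETIMEDOUT on timeout, -errno on error. */
static int wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);  /* a signal restarts the full timeout */

    if (n < 0)
        return -errno;
    if (n == 0)
        return -ETIMEDOUT;
    return 0;
}
```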
General Requests
• Error handling must be consistent
• Do not leave to providers
• Document which error codes every call can return
• Similar to POSIX error code documentation
• Use a single error return convention
• Return -1 and set errno?
• Return -errno? (preferred)
• Return +errno?
• Consistent error values in events
• Do not mix transport and errno values
• Easy mechanism to display error text
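The preferred "return -errno" convention can be illustrated with a short C sketch. `stub_send` and `err_text` are hypothetical names used only for illustration; the error values and `strerror` come from the standard C library.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical call following the "return -errno" convention:
 * 0 on success, a negated errno value on failure; errno itself is untouched. */
static int stub_send(const void *buf, size_t len, size_t max_len)
{
    if (buf == NULL)
        return -EINVAL;
    if (len > max_len)
        return -EMSGSIZE;
    return 0;
}

/* Map a return value back to readable text for display. */
static const char *err_text(int ret)
{
    return ret < 0 ? strerror(-ret) : "success";
}
```

Because the error travels in the return value, the caller never depends on errno, which matters when completions and errors are reported from another thread or through an event.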
General Requests
• Query current status of local queues
• Generating an async event (e.g. SRQ) compounds the issue of dealing with multiple fds
• Eliminate need for these events or provide in-band notification
• Support memory registration across multiple devices
• Register at the system level, not per PD per HCA
General Requests
• Need simple, programmatic way to detect memory alignment restrictions
• Or avoid any alignment needs
• Need better way to discover supported ‘inline’ sizes
• Providers should ensure that reported values actually improve performance
General Requests
• Define reasonable minimum requirements on providers for:
• Number of SGEs
• Inline size
• Immediate data size
• CM private data length
• With a supported minimum for any message
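One way to make such minimums checkable is to compare what a provider reports against a required set. The sketch below is an assumption about how that might look: `struct provider_limits`, the `LIMIT_*` flags, and `check_minimums` are hypothetical, and the slides do not propose specific values.

```c
#include <stdint.h>

/* Hypothetical description of the limits listed above. */
struct provider_limits {
    uint32_t max_sge;          /* number of SGEs */
    uint32_t max_inline;       /* inline size in bytes */
    uint32_t max_imm;          /* immediate data size in bytes */
    uint32_t max_cm_private;   /* CM private data length in bytes */
};

/* One flag per attribute, so a caller can see exactly what fell short. */
enum {
    LIMIT_SGE        = 1 << 0,
    LIMIT_INLINE     = 1 << 1,
    LIMIT_IMM        = 1 << 2,
    LIMIT_CM_PRIVATE = 1 << 3,
};

/* Returns 0 if every reported limit meets the required minimum,
 * otherwise a bitmask of the attributes that do not. */
static unsigned check_minimums(const struct provider_limits *reported,
                               const struct provider_limits *required)
{
    unsigned failed = 0;

    if (reported->max_sge < required->max_sge)
        failed |= LIMIT_SGE;
    if (reported->max_inline < required->max_inline)
        failed |= LIMIT_INLINE;
    if (reported->max_imm < required->max_imm)
        failed |= LIMIT_IMM;
    if (reported->max_cm_private < required->max_cm_private)
        failed |= LIMIT_CM_PRIVATE;
    return failed;
}
```

A check of this shape could also sit in a provider conformance test, with the required values taken from whatever the specification eventually defines.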
General Requests
• Asynchronous interface can be source of races
• E.g. completions before call returns
• Have provider update user counters before generating completion
• Support multiple providers at run-time
• Provide test suite to verify provider conformance to API specifications
• Example programs
• Error conventions
• Min/max values