iSER on InfiniBand (and SCTP)
Problem Statement
• Currently defined IB Storage I/O protocol
• SRP (SCSI RDMA Protocol)
• SRP does not have a discovery or management protocol
• SRP does not have a wide following
• The RDMA Consortium voted overwhelmingly to create iSER instead of porting SRP to IP
• Missing the new function of iSCSI & iSER
• Immediate Data
• Unsolicited Data
• Version 2 is at Level 0 & has not been updated for 1.5 years
• SCTP is not defined for iSER
Reason for iSER over IB or SCTP
• Would like to have the same basic Storage Protocol across all RDMA Networks
• Easier to train staff
• Easier to create bridging products
• Motivate storage industry into an iSCSI/iSER mentality
• May help the acceptance of iSCSI/iSER on IP networks
• Desire for a common Discovery and Management protocol across iSCSI, iSER/iWARP, and IB
• Want the same Management and Discovery process and software to handle IP networks and IB networks
Similarities (iWARP term: IB term)
• Local STags: L_Key
• Remote STags: R_Key
• RDMA SendSE
• RDMA SendInvSE (New)
• RDMA Read/Write
• Shared RQs (New)
• ZBTOs: ZBVA (New)
Proposed New Logical Structure
           +-------------------------------------+
           |                SCSI                 |
           +-------------------------------------+
           |                iSCSI                |
DI ------> +-------------------------------------+
           |                iSER                 |
           +-------------------------------------+
           |                RDMAP                |
           +------------------------+------------+
           |          DDP           |            |
           +--------------+---------+ InfiniBand |
           |     MPA      |         |    (RC)    |
           +--------------+  SCTP   |            |
           |     TCP      |         |            |
           +--------------+---------+------------+
Example of iSCSI/iSER Layering in Full Feature Mode
Clarify the Term iWARP
• Update the iSER Draft:
• Use the term iWARP to mean either TCP or SCTP implementations
• Use the term iWARP/TCP to mean iWARP over a TCP/IP base
• Use the term iWARP/SCTP to mean iWARP over an SCTP base
Clarify the Term RDMAP
• Update the iSER Draft:
• Use the term RDMAP to mean any RDMA protocol over iWARP, InfiniBand, or any other carrier of RDMA Protocols
• Use the term RDMAP/iWARP to mean an implementation using iWARP
• Use the term RDMAP/IB to mean the implementation using InfiniBand
• Etc.
Things to be addressed for iSER on IB or SCTP
• Defining, Addressing and Discovery of IB Storage Nodes
• Handling of Login (SCTP or IB)
• Selection of one path to storage vs. others
• Handling older IB networks
• (Network equipment with pre-1.2 Architecture)
I. Addressing
• Background: IB has IP addressing
• Part of IP-over-IB (IPoIB)
• Proposal for mapping Port to IB ServiceID
• IETF IPS WG should validate that:
• iSCSI and iSER Discovery and Management can operate with IB via IPoIB
• If validated, may not even require normative changes to the draft
• IBTA (InfiniBand Trade Association) to standardize the Port to ServiceID mapping
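The slide proposes a Port to ServiceID mapping but does not give its layout. Purely as an illustration, the sketch below assumes a 64-bit ServiceID built from a fixed prefix, the 8-bit IP protocol number, and the 16-bit port. The prefix value, the field layout, and both function names are assumptions, not part of the slides.

```python
# Hypothetical layout (NOT from the slides):
#   40-bit fixed prefix | 8-bit IP protocol number | 16-bit port
SERVICE_ID_PREFIX = 0x0000000001  # assumed prefix value
IPPROTO_TCP = 6                   # IP protocol number for TCP
ISCSI_DEFAULT_PORT = 3260         # well-known iSCSI port


def port_to_service_id(port, ip_protocol=IPPROTO_TCP):
    """Pack a port and IP protocol number into a 64-bit ServiceID."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    if not 0 <= ip_protocol <= 0xFF:
        raise ValueError("ip_protocol must fit in 8 bits")
    return (SERVICE_ID_PREFIX << 24) | (ip_protocol << 16) | port


def service_id_to_port(service_id):
    """Recover (ip_protocol, port) from a ServiceID in the assumed layout."""
    if service_id >> 24 != SERVICE_ID_PREFIX:
        raise ValueError("ServiceID does not carry the assumed prefix")
    return (service_id >> 16) & 0xFF, service_id & 0xFFFF
```

With such a mapping, an initiator that learns an IP address and port from SendTargets, SLP, or iSNS could derive the ServiceID to connect to without any change to the discovery protocols.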
II. Login
• SCTP and IB need a way to send the iSCSI Login PDUs
• SCTP and IB need a way to transition to full iSER mode
• IETF IPS WG discussion needed to ascertain the best way to do this
• iSCSI assumes that TCP/IP streaming is used
• But iSER does not care, as long as it can transition into Full RDMA mode
• iSER Spec needs language to permit this
• No need to define details, just language to permit
• Leave details up to implementations
• May have examples in Appendix, or separate informational drafts
III. Path Selection
• A target could have several types of portal groups
• iSCSI, iSER/TCP, iSER/SCTP, IB, …
• Some Host Systems may prefer one type vs. others
• Can leave this completely up to implementation
• Therefore not an IETF IPS issue (except informational)
• For IB let IBTA standardize connection approach
• Preference for direct Endport connection
• Preference for iSER Gateway vs. IPoIB
And/Or
• Can add TPG type information to:
• SendTargets, SLP, iSNS
• Would be an IETF IPS issue
IV. Handling older RDMA Networks
• May be an IETF IPS Workgroup issue
Or
• May be out of scope as a compatibility hack
However:
• Some applications have requested to have these features
• VA Based TO
• Explicit Invalidates only
• Toleration Language and Hello Flags permit both
Reference
• http://www.haifa.il.ibm.com/satran/ips/iSER-in-an-IB-network-V9.pdf
I. Defining, Addressing and Discovery
• IB nodes are addressed via a GID (Global ID)
• With IP-over-IB (IPoIB) all nodes have normal IP addresses
• IP addresses are converted to GIDs via ARP
• Returned like a MAC address
• Therefore, SendTargets, SLP and iSNS can continue to function in the same way
• SendTargets, SLP, and iSNS can all use normal TCP/IP via IPoIB
II. Handling of Login
• iSCSI Login depends on the value of MaxRecvDataSegmentLength = 8192
• iSCSI Login & Login Reply is basically a half duplex process
• IB (and SCTP) can send Login PDUs to Target with <= 8K data
• IB Node will work with RC connections using “RDMA Sends”
• No issue of Flow Control (it is half duplex), & the expected receive buffer can be queued in advance (max 8192 bytes + iSCSI header)
• Transition to iSER mode is not something special in IB
• Therefore, words are being proposed for the Login to be done in IB with Sends (or normal SCTP messages)
• iSCSI Login PDUs remain unchanged
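The buffer arithmetic on this slide can be sketched as follows. The example assumes the iSCSI header is the 48-byte Basic Header Segment only (no additional header segments or digests), and the function names are illustrative rather than taken from any draft.

```python
MAX_RECV_DATA_SEGMENT_LENGTH = 8192  # login-phase value quoted on the slide
ISCSI_BHS_LENGTH = 48                # iSCSI Basic Header Segment, in bytes


def login_recv_buffer_size(max_data=MAX_RECV_DATA_SEGMENT_LENGTH,
                           header=ISCSI_BHS_LENGTH):
    """Size of the single receive buffer to post before each login message.

    Login is half duplex, so one buffer of this size is enough: each side
    waits for the peer's PDU before sending its next one.
    """
    return header + max_data


def split_login_payload(payload, max_data=MAX_RECV_DATA_SEGMENT_LENGTH):
    """Split login text into data segments that each fit in one Send."""
    chunks = [payload[i:i + max_data]
              for i in range(0, len(payload), max_data)]
    # An empty payload still needs one (data-less) Login PDU.
    return chunks or [b""]
```

Because only one message is ever outstanding, no credit or flow-control scheme is needed during login; the receiver simply reposts one buffer of this size after each message it consumes.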
III. Selection of Paths to Storage
• In an IB environment it is useful to have a way to select an IB Storage Endpoint in preference to
• An IB to: an iSCSI or iSER/iWARP Gateway, or
• An iSCSI TCP IPoIB Gateway to IP Network
• And a way to select an IB to: iSCSI or iSER/iWARP Gateway in preference to
• An iSCSI TCP IPoIB Gateway to IP Network
• This is done via the IB defined connection process
• Being addressed in the IBTA
• Not an IETF issue
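The preference order described on this slide can be illustrated with a small ranking sketch. The slide leaves the actual mechanism to the IB connection process, so the enum, the tuple shape, and `choose_path` below are hypothetical names used only to show the ordering.

```python
from enum import IntEnum


class PathKind(IntEnum):
    """Lower value means more preferred, following the slide's ordering."""
    IB_STORAGE_ENDPOINT = 0      # direct IB storage endpoint
    IB_TO_ISER_GATEWAY = 1       # IB to an iSCSI or iSER/iWARP gateway
    IPOIB_ISCSI_TCP_GATEWAY = 2  # iSCSI over TCP via an IPoIB gateway


def choose_path(candidates):
    """Pick the most preferred path from (PathKind, name) pairs.

    Ties are resolved in favor of the candidate listed first.
    """
    candidates = list(candidates)
    if not candidates:
        raise ValueError("no candidate paths")
    return min(candidates, key=lambda c: c[0])
```

For example, given one IPoIB gateway and one iSER gateway to the same storage, `choose_path` returns the iSER gateway; if a direct IB storage endpoint is also listed, that one wins.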
IV. Handling Older IB Networks (ZBTO vs. VABTOs)
• Some IB Networks will not support ZBTO
• They require a VA (VABTOs)
• By using a previously reserved bit in the Hello/HelloReply message, Initiators can request VABTOs
• Can treat the Actual STag and VABTO as a Virtual STag (96 bits instead of 32 bits) in iSER Headers (only)
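A 96-bit Virtual STag is the 32-bit STag plus a 64-bit virtual address. The sketch below shows one way to pack and unpack such a value; the field order (STag first, then VA) and the big-endian encoding are assumptions, since the slide only gives the total width.

```python
import struct

# Assumed wire layout: 32-bit STag followed by a 64-bit virtual address,
# big-endian, no padding (4 + 8 = 12 bytes = 96 bits).
_VIRTUAL_STAG = struct.Struct(">IQ")


def pack_virtual_stag(stag, va):
    """Combine an actual STag and a VA-based tagged offset into 12 bytes."""
    return _VIRTUAL_STAG.pack(stag, va)


def unpack_virtual_stag(data):
    """Split 12 bytes back into (stag, va)."""
    return _VIRTUAL_STAG.unpack(data)
```

Only the iSER header would carry the wider value; the underlying RDMA operations would still use the actual STag and address separately.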
IV. Handling Older IB Networks (Missing Auto HW Invalidate)
• Some IB network nodes cannot issue SendInvSE type messages
• Can just get by with SendSE type messages
• iSER requires Initiator side invalidates
• Some IB network nodes cannot receive SendInvSE and then automatically invalidate STags
• Initiator tells Target by using a previously reserved bit in the Hello Message
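The Hello-flag negotiation for older IB networks might look like the sketch below. The slides say only that previously reserved bits are used, so the bit positions and all names here are hypothetical.

```python
from enum import IntFlag


class HelloFlags(IntFlag):
    """Hypothetical bit assignments; the slides do not specify positions."""
    NONE = 0
    USE_VABTO = 0x1           # initiator needs VA-based tagged offsets
    NO_AUTO_INVALIDATE = 0x2  # initiator cannot handle SendInvSE


def target_send_type(initiator_flags):
    """Message type the target uses when it returns a SCSI response."""
    if initiator_flags & HelloFlags.NO_AUTO_INVALIDATE:
        return "SendSE"    # initiator invalidates its STags explicitly
    return "SendInvSE"     # receipt invalidates the STag automatically


def stag_width_bits(initiator_flags):
    """Width of the STag field carried in iSER headers."""
    return 96 if initiator_flags & HelloFlags.USE_VABTO else 32
```

An initiator on pre-1.2 equipment would set both bits in its Hello message, and the target would then answer with SendSE messages and 96-bit Virtual STags for that connection.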