Diffstat (limited to 'doc/rfc/rfc5046.txt')
-rw-r--r-- | doc/rfc/rfc5046.txt | 4763 |
1 files changed, 4763 insertions, 0 deletions
diff --git a/doc/rfc/rfc5046.txt b/doc/rfc/rfc5046.txt new file mode 100644 index 0000000..07af160 --- /dev/null +++ b/doc/rfc/rfc5046.txt @@ -0,0 +1,4763 @@ + + + + + + +Network Working Group M. Ko +Request for Comments: 5046 IBM Corporation +Category: Standards Track M. Chadalapaka + Hewlett-Packard Company + J. Hufferd + Brocade, Inc. + U. Elzur + H. Shah + P. Thaler + Broadcom Corporation + October 2007 + + + Internet Small Computer System Interface (iSCSI) Extensions + for Remote Direct Memory Access (RDMA) + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + Internet Small Computer System Interface (iSCSI) Extensions for + Remote Direct Memory Access (RDMA) provides the RDMA data transfer + capability to iSCSI by layering iSCSI on top of an RDMA-Capable + Protocol, such as the iWARP protocol suite. An RDMA-Capable Protocol + provides RDMA Read and Write services, which enable data to be + transferred directly into SCSI I/O Buffers without intermediate data + copies. This document describes the extensions to the iSCSI protocol + to support RDMA services as provided by an RDMA-Capable Protocol, + such as the iWARP protocol suite. + + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 1] + +RFC 5046 iSER Specification October 2007 + + +Table of Contents + + 1. Introduction ....................................................5 + 1.1. Motivation .................................................5 + 1.2. Architectural Goals ........................................6 + 1.3. Protocol Overview ..........................................7 + 1.4. RDMA Services and iSER .....................................8 + 1.4.1. 
STag ................................................8 + 1.4.2. Send ................................................9 + 1.4.3. RDMA Write ..........................................9 + 1.4.4. RDMA Read ...........................................9 + 1.5. SCSI Read Overview ........................................10 + 1.6. SCSI Write Overview .......................................10 + 1.7. iSCSI/iSER Layering .......................................10 + 2. Definitions and Acronyms .......................................11 + 2.1. Definitions ...............................................11 + 2.2. Acronyms ..................................................17 + 2.3. Conventions ...............................................19 + 3. Upper Layer Interface Requirements .............................19 + 3.1. Operational Primitives Offered by iSER ....................20 + 3.1.1. Send_Control .......................................20 + 3.1.2. Put_Data ...........................................20 + 3.1.3. Get_Data ...........................................21 + 3.1.4. Allocate_Connection_Resources ......................21 + 3.1.5. Deallocate_Connection_Resources ....................22 + 3.1.6. Enable_Datamover ...................................22 + 3.1.7. Connection_Terminate ...............................22 + 3.1.8. Notice_Key_Values ..................................23 + 3.1.9. Deallocate_Task_Resources ..........................23 + 3.2. Operational Primitives Used by iSER .......................23 + 3.2.1. Control_Notify .....................................24 + 3.2.2. Data_Completion_Notify .............................24 + 3.2.3. Data_ACK_Notify ....................................24 + 3.2.4. Connection_Terminate_Notify ........................25 + 3.3. iSCSI Protocol Usage Requirements .........................25 + 4. Lower Layer Interface Requirements .............................26 + 4.1. Interactions with the RCaP Layer ..........................26 + 4.2. 
Interactions with the Transport Layer .....................27 + 5. Connection Setup and Termination ...............................27 + 5.1. iSCSI/iSER Connection Setup ...............................27 + 5.1.1. Initiator Behavior .................................29 + 5.1.2. Target Behavior ....................................30 + 5.1.3. iSER Hello Exchange ................................32 + 5.2. iSCSI/iSER Connection Termination .........................33 + 5.2.1. Normal Connection Termination at the Initiator .....33 + 5.2.2. Normal Connection Termination at the Target ........34 + 5.2.3. Termination without Logout Request/Response PDUs ...34 + + + + +Ko, et al. Standards Track [Page 2] + +RFC 5046 iSER Specification October 2007 + + + 6. Login/Text Operational Keys ....................................35 + 6.1. HeaderDigest and DataDigest ...............................35 + 6.2. MaxRecvDataSegmentLength ..................................36 + 6.3. RDMAExtensions ............................................36 + 6.4. TargetRecvDataSegmentLength ...............................37 + 6.5. InitiatorRecvDataSegmentLength ............................38 + 6.6. OFMarker and IFMarker .....................................38 + 6.7. MaxOutstandingUnexpectedPDUs ..............................38 + 7. iSCSI PDU Considerations .......................................39 + 7.1. iSCSI Data-Type PDU .......................................39 + 7.2. iSCSI Control-Type PDU ....................................40 + 7.3. iSCSI PDUs ................................................40 + 7.3.1. SCSI Command .......................................40 + 7.3.2. SCSI Response ......................................42 + 7.3.3. Task Management Function Request/Response ..........44 + 7.3.4. SCSI Data-Out ......................................45 + 7.3.5. SCSI Data-In .......................................46 + 7.3.6. Ready to Transfer (R2T) ............................48 + 7.3.7. 
Asynchronous Message ...............................50 + 7.3.8. Text Request and Text Response .....................50 + 7.3.9. Login Request and Login Response ...................50 + 7.3.10. Logout Request and Logout Response ................51 + 7.3.11. SNACK Request .....................................51 + 7.3.12. Reject ............................................51 + 7.3.13. NOP-Out and NOP-In ................................51 + 8. Flow Control and STag Management ...............................52 + 8.1. Flow Control for RDMA Send Message Types ..................52 + 8.1.1. Flow Control for Control-Type PDUs from the + Initiator ..........................................52 + 8.1.2. Flow Control for Control-Type PDUs from the + Target .............................................55 + 8.2. Flow Control for RDMA Read Resources ......................56 + 8.3. STag Management ...........................................56 + 8.3.1. Allocation of STags ................................57 + 8.3.2. Invalidation of STags ..............................57 + 9. iSER Control and Data Transfer .................................58 + 9.1. iSER Header Format ........................................58 + 9.2. iSER Header Format for the iSCSI Control-Type PDU .........59 + 9.3. iSER Header Format for the iSER Hello Message .............60 + 9.4. iSER Header Format for the iSER HelloReply Message ........61 + 9.5. SCSI Data Transfer Operations .............................62 + 9.5.1. SCSI Write Operation ...............................62 + 9.5.2. SCSI Read Operation ................................63 + 9.5.3. Bidirectional Operation ............................64 + 10. iSER Error Handling and Recovery ..............................64 + 10.1. Error Handling ...........................................64 + 10.1.1. Errors in the Transport Layer .....................64 + 10.1.2. Errors in the RCaP Layer ..........................65 + + + +Ko, et al. 
Standards Track [Page 3] + +RFC 5046 iSER Specification October 2007 + + + 10.1.3. Errors in the iSER Layer ..........................66 + 10.1.4. Errors in the iSCSI Layer .........................67 + 10.2. Error Recovery ...........................................69 + 10.2.1. PDU Recovery ......................................69 + 10.2.2. Connection Recovery ...............................70 + 11. Security Considerations .......................................71 + 12. References ....................................................71 + 12.1. Normative References .....................................71 + 12.2. Informative References ...................................72 + Appendix A. iWARP Message Format for iSER .........................73 + A.1. iWARP Message Format for iSER Hello Message ...............73 + A.2. iWARP Message Format for iSER HelloReply Message ..........74 + A.3. iWARP Message Format for SCSI Read Command PDU ............75 + A.4. iWARP Message Format for SCSI Read Data ...................76 + A.5. iWARP Message Format for SCSI Write Command PDU ...........77 + A.6. iWARP Message Format for RDMA Read Request ................78 + A.7. iWARP Message Format for Solicited SCSI Write Data ........79 + A.8. iWARP Message Format for SCSI Response PDU ................80 + Appendix B. Architectural Discussion of iSER over InfiniBand ......81 + B.1. The Host Side of the iSCSI and iSER Connections + in InfiniBand .............................................81 + B.2. The Storage Side of the iSCSI and iSER Mixed + Network Environment .......................................82 + B.3. Discovery Processes for an InfiniBand Host ................82 + B.4. IBTA Connection Specifications ............................83 + Acknowledgments ...................................................83 + +Table of Figures + + Figure 1. Example of iSCSI/iSER Layering in Full Feature Phase ....11 + Figure 2. iSER Header Format ......................................58 + Figure 3. 
iSER Header Format for iSCSI Control-Type PDU ...........59 + Figure 4. iSER Header Format for iSER Hello Message ...............60 + Figure 5. iSER Header Format for iSER HelloReply Message ..........61 + Figure 6. SendSE Message containing an iSER Hello Message .........73 + Figure 7. SendSE Message containing an iSER HelloReply Message ....74 + Figure 8. SendSE Message containing a SCSI Read Command PDU .......75 + Figure 9. RDMA Write Message containing SCSI Read Data ............76 + Figure 10. SendSE Message containing a SCSI Write Command PDU .....77 + Figure 11. RDMA Read Request Message ..............................78 + Figure 12. RDMA Read Response Message containing SCSI Write Data ..79 + Figure 13. SendInvSE Message containing SCSI Response PDU .........80 + Figure 14. iSCSI and iSER on IB ...................................81 + Figure 15. Storage Controller with TCP, iWARP, and IB Connections .82 + + + + + + + +Ko, et al. Standards Track [Page 4] + +RFC 5046 iSER Specification October 2007 + + +1. Introduction + +1.1. Motivation + + The iSCSI protocol [RFC3720] is a mapping of the SCSI Architecture + Model (see [SAM2]) over the TCP protocol. SCSI commands are carried + by iSCSI requests, and SCSI responses and status are carried by iSCSI + responses. Other iSCSI protocol exchanges and SCSI data are also + transported in iSCSI Protocol Data Units (PDUs). + + Out-of-order TCP segments in the Traditional iSCSI model have to be + stored and reassembled before the iSCSI protocol layer within an end + node can place the data in the iSCSI buffers. This reassembly is + required because not every TCP segment is likely to contain an iSCSI + header to enable its placement, and TCP itself does not have a + built-in mechanism for signaling Upper Level Protocol (ULP) message + boundaries to aid placement of out-of-order segments.
This TCP + reassembly at high network speeds is quite counter-productive for the + following reasons: wasted memory bandwidth in data copying, the need + for reassembly memory, wasted CPU cycles in data copying, and the + general store-and-forward latency from an application perspective. + TCP reassembly was recognized as a serious issue in [RFC3720], and + the notion of a "sync and steering layer" was introduced that is + optional to implement and use. One specific sync and steering + mechanism, called "markers", was defined in [RFC3720], which provides + an application-level way of framing iSCSI Protocol Data Units (PDUs) + within the TCP data stream even when the TCP segments are not yet + reassembled to be in-order. + + With these defined techniques in [RFC3720], a Network Interface + Controller customized for iSCSI (SNIC) could offload the TCP/IP + processing and support direct data placement, but most iSCSI + implementations do not support iSCSI "markers", making SNIC marker- + based direct data placement unusable in practice. + + The iWARP protocol stack provides direct data placement functionality + that is usable in practice. In addition, there is interest in using + iSCSI with other Remote Direct Memory Access (RDMA) protocol stacks + that support direct data placement, such as the one provided by + InfiniBand. The generic term RDMA-Capable Protocol (RCaP) is used to + refer to the RDMA functionality provided by such protocol stacks. + + With the availability of RDMA-Capable Controllers within a host + system, which does not have SNICs, it is appropriate for iSCSI to be + able to exploit the direct data placement function of the RDMA- + Capable Controller like other applications. + + + + + +Ko, et al. 
Standards Track [Page 5] + +RFC 5046 iSER Specification October 2007 + + + iSCSI Extensions for RDMA (iSER) is designed precisely to take + advantage of generic RDMA technologies -- iSER's goal is to permit + iSCSI to employ direct data placement and RDMA capabilities using a + generic RDMA-Capable Controller. In summary, the iSCSI/iSER protocol + stack is designed to enable scaling to high speeds by relying on a + generic data placement process and RDMA technologies and products, + which enable direct data placement of both in-order and out-of-order + data. + + This document describes iSER as a protocol extension to iSCSI, both + for convenience of description and because it is true in a very + strict protocol sense. However, note that iSER is in reality + extending the connectivity of the iSCSI protocol defined in + [RFC3720], and the name iSER reflects this reality. + + When the iSCSI protocol as defined in [RFC3720] (i.e., without the + iSER enhancements) is intended in the rest of the document, the term + "Traditional iSCSI" is used to make the intention clear. + +1.2. Architectural Goals + + This section summarizes the architectural goals that guided the + design of iSER. + + 1. Provide an RDMA data transfer model for iSCSI that enables direct + in-order or out-of-order data placement of SCSI data into pre- + allocated SCSI buffers while maintaining in-order data delivery. + + 2. Not require any major changes to the SCSI Architecture Model + [SAM2] and SCSI command set standards. + + 3. Utilize existing iSCSI infrastructure (sometimes referred to as + "iSCSI ecosystem") including but not limited to MIB, + bootstrapping, negotiation, naming and discovery, and security. + + 4. 
Require a session to operate in the Traditional iSCSI data + transfer mode if iSER is not supported by either the initiator or + the target (i.e., not require iSCSI Full Feature Phase + interoperability between an end node operating in Traditional + iSCSI mode, and an end node operating in iSER-assisted mode). + + 5. Allow initiator and target implementations to utilize generic + RDMA-Capable Controllers such as RDMA-enabled Network Interface + Controllers (RNICs), or to implement iSCSI and iSER in software + (not require iSCSI- or iSER-specific assists in the RCaP + implementation or RDMA-Capable Controller). + + + + + +Ko, et al. Standards Track [Page 6] + +RFC 5046 iSER Specification October 2007 + + + 6. Require full and only generic RCaP functionality at both the + initiator and the target. + + 7. Implement a lightweight Datamover protocol for iSCSI with minimal + state maintenance. + +1.3. Protocol Overview + + Consistent with the architectural goals stated in Section 1.2, the + iSER protocol does not require changes in the iSCSI ecosystem or any + related SCSI specifications. The iSER protocol defines the mapping + of iSCSI PDUs to RCaP Messages in such a way that it is entirely + feasible to realize iSCSI/iSER implementations that are based on + generic RDMA-Capable Controllers. The iSER protocol layer requires + minimal state maintenance to assist an iSCSI Full Feature Phase + connection, besides being oblivious to the notion of an iSCSI + session. The crucial protocol aspects of iSER may be summarized + thus: + + 1. iSER-assisted mode is negotiated during the iSCSI login for each + session, and an entire iSCSI session can only operate in one mode + (i.e., a connection in a session cannot operate in iSER-assisted + mode if a different connection of the same session is already in + Full Feature Phase in the Traditional iSCSI mode). + + 2. Once in iSER-assisted mode, all iSCSI interactions on that + connection use RCaP Messages. + + 3.
A Send Message Type is used for carrying an iSCSI control-type PDU + preceded by an iSER header. See Section 7.2 for more details on + iSCSI control-type PDUs. + + 4. RDMA Write, RDMA Read Request, and RDMA Read Response Messages are + used for carrying control and all data information associated with + the iSCSI data-type PDUs. See Section 7.1 for more details on + iSCSI data-type PDUs. + + 5. Target drives all data transfer (with the exception of iSCSI + unsolicited data) for SCSI writes and SCSI reads, by issuing RDMA + Read Requests and RDMA Writes, respectively. + + 6. RCaP is responsible for ensuring data integrity. (For example, + iWARP includes a CRC-enhanced framing layer called Marker PDU + Aligned Framing for TCP (MPA) on top of TCP; and for InfiniBand, + the CRCs are included in the Reliable Connection mode). For this + reason, iSCSI header and data digests are negotiated to "None" for + iSCSI/iSER sessions. + + + + +Ko, et al. Standards Track [Page 7] + +RFC 5046 iSER Specification October 2007 + + + 7. The iSCSI error recovery hierarchy defined in [RFC3720] is fully + supported by iSER. (However, see Section 7.3.11 on the handling + of SNACK Request PDUs.) + + 8. iSER requires no changes to iSCSI authentication, security, and + text mode negotiation mechanisms. + + Note that Traditional iSCSI implementations may have to be adapted to + employ iSER. It is expected that the adaptation when required is + likely to be centered around the upper layer interface requirements + of iSER (Section 3). + +1.4. RDMA Services and iSER + + iSER is designed to work with software and/or hardware protocol + stacks providing the protocol services defined in RCaP documents such + as [RDMAP], [IB], etc. The following subsections describe the key + protocol elements of RCaP services that iSER relies on. + +1.4.1. 
STag + + A Steering Tag (STag) is the identifier of an I/O Buffer unique to an + RDMA-Capable Controller that the iSER layer Advertises to the remote + iSCSI/iSER node in order to complete a SCSI I/O. + + In iSER, Advertisement is the act of informing the target by the + initiator that an I/O Buffer is available at the initiator for RDMA + Read or RDMA Write access by the target. The initiator Advertises + the I/O Buffer by including the STag in the header of an iSER Message + containing the SCSI Command PDU to the target. The base Tagged + Offset is not explicitly specified, but the target must always assume + it as zero. The buffer length is as specified in the SCSI Command + PDU. + + The iSER layer at the initiator Advertises the STag for the I/O + Buffer of each SCSI I/O to the iSER layer at the target in the iSER + header of the Send with Solicited Event (SendSE) Message containing + the SCSI Command PDU, unless the I/O can be completely satisfied by + unsolicited data alone. + + The iSER layer at the target provides the STag for the I/O Buffer + that is the Data Sink of an RDMA Read Operation (Section 1.4.4) to + the RCaP layer on the target node -- i.e., this is completely + transparent to the iSER layer at the initiator. + + The iSER protocol is defined so that the Advertised STag is + automatically invalidated upon a normal completion of the associated + task. This automatic invalidation is realized via the Send with + + + +Ko, et al. Standards Track [Page 8] + +RFC 5046 iSER Specification October 2007 + + + Solicited Event and Invalidate (SendInvSE) Message carrying the SCSI + Response PDU. There are two exceptions to this automatic + invalidation -- bidirectional commands, and abnormal completion of a + command. The iSER layer at the initiator is required to explicitly + invalidate the STag in these cases, in addition to sanity checking + the automatic invalidation even when that does happen. + +1.4.2.
Send + + Send is the RDMA Operation that is not addressed to an Advertised + buffer by the sending side, and thus uses Untagged buffers on the + receiving side. + + The iSER layer at the initiator uses the Send Operation to transmit + any iSCSI control-type PDU to the target. As an example, the + initiator uses Send Operations to transfer iSER Messages containing + SCSI Command PDUs to the iSER layer at the target. + + An iSER layer at the target uses the Send Operation to transmit any + iSCSI control-type PDU to the initiator. As an example, the target + uses Send Operations to transfer iSER Messages containing SCSI + Response PDUs to the iSER layer at the initiator. + +1.4.3. RDMA Write + + RDMA Write is the RDMA Operation that is used to place data into an + Advertised buffer on the receiving side. The sending side addresses + the Message using an STag and a Tagged Offset that are valid on the + Data Sink. + + The iSER layer at the target uses the RDMA Write Operation to + transfer the contents of a local I/O Buffer to an Advertised I/O + Buffer at the initiator. The iSER layer at the target uses the RDMA + Write to transfer whole or part of the data required to complete a + SCSI read command. + + The iSER layer at the initiator does not employ RDMA Writes. + +1.4.4. RDMA Read + + RDMA Read is the RDMA Operation that is used to retrieve data from an + Advertised buffer on a remote node. The sending side of the RDMA + Read Request addresses the Message using an STag and a Tagged Offset + that are valid on the Data Source in addition to providing a valid + local STag and Tagged Offset that identify the Data Sink. + + The iSER layer at the target uses the RDMA Read Operation to transfer + the contents of an Advertised I/O Buffer at the initiator to a local + + + +Ko, et al. Standards Track [Page 9] + +RFC 5046 iSER Specification October 2007 + + + I/O Buffer at the target. 
The iSER layer at the target uses the RDMA + Read to fetch whole or part of the data required to complete a SCSI + write command. + + The iSER layer at the initiator does not employ RDMA Reads. + +1.5. SCSI Read Overview + + The iSER layer at the initiator receives the SCSI Command PDU from + the iSCSI layer. The iSER layer at the initiator generates an STag + for the I/O Buffer of the SCSI Read and Advertises the buffer by + including the STag as part of the iSER header for the PDU. The iSER + Message is transferred to the target using a SendSE Message. + + The iSER layer at the target uses one or more RDMA Writes to transfer + the data required to complete the SCSI Read. + + The iSER layer at the target uses a SendInvSE Message to transfer the + SCSI Response PDU back to the iSER layer at the initiator. The iSER + layer at the initiator notifies the iSCSI layer of the availability + of the SCSI Response PDU. + +1.6. SCSI Write Overview + + The iSER layer at the initiator receives the SCSI Command PDU from + the iSCSI layer. If solicited data transfer is involved, the iSER + layer at the initiator generates an STag for the I/O Buffer of the + SCSI Write and Advertises the buffer by including the STag as part of + the iSER header for the PDU. The iSER Message is transferred to the + target using a SendSE Message. + + The iSER layer at the initiator may optionally send one or more non- + immediate unsolicited data PDUs to the target using Send Message + Types. + + If solicited data transfer is involved, the iSER layer at the target + uses one or more RDMA Reads to transfer the data required to complete + the SCSI Write. + + The iSER layer at the target uses a SendInvSE Message to transfer the + SCSI Response PDU back to the iSER layer at the initiator. The iSER + layer at the initiator notifies the iSCSI layer of the availability + of the SCSI Response PDU. + +1.7. 
iSCSI/iSER Layering + + iSCSI Extensions for RDMA (iSER) is layered between the iSCSI layer + and the RCaP layer. Note that the RCaP layer may be composed of one + + + +Ko, et al. Standards Track [Page 10] + +RFC 5046 iSER Specification October 2007 + + + or more distinct protocol layers depending on the specifics of the + RCaP. Figure 1 shows an example of the relationship between SCSI, + iSCSI, iSER, and the different RCaP layers. For TCP, the RCaP is + iWARP. For InfiniBand, the RCaP is the Reliable Connected Transport + Service. Note that the iSCSI layer as described here supports the + RDMA Extensions as used in iSER. + + +-------------------------------------+ + | SCSI | + +-------------------------------------+ + | iSCSI | + DI ------> +-------------------------------------+ + | iSER | + +---------+--------------+------------+ + | RDMAP | | | + +---------+ InfiniBand | | + | DDP | Reliable | Other | + +---------+ Connected | RDMA- | + | MPA | Transport | Capable | + +---------+ Service | Protocol | + | TCP | | | + +---------+--------------+------------+ + | | InfiniBand | Other | + | IP | Network | Network | + | | Layer | Layer | + +---------+--------------+------------+ + + Figure 1. Example of iSCSI/iSER Layering in Full Feature Phase + +2. Definitions and Acronyms + +2.1. Definitions + + Advertisement (Advertised, Advertise, Advertisements, Advertises) - + The act of informing a remote iSER layer that a local node's + buffer is available to it. A Node makes a buffer available for + incoming RDMA Read Request Message or incoming RDMA Write Message + access by informing the remote iSER layer of the Tagged Buffer + identifiers (STag, TO, and buffer length). Note that this + Advertisement of Tagged Buffer information is the responsibility + of the iSER layer on either end and is not defined by the RDMA- + Capable Protocol. 
A typical method would be for the iSER layer to + embed the Tagged Buffer's STag, TO, and buffer length in a Send + Message destined for the remote iSER layer. + + Completion (Completed, Complete, Completes) - Completion is defined + as the process by the RDMA-Capable Protocol layer to inform the + + + + +Ko, et al. Standards Track [Page 11] + +RFC 5046 iSER Specification October 2007 + + + iSER layer, that a particular RDMA Operation has performed all + functions specified for the RDMA Operation. + + Connection - A connection is a logical circuit between the initiator + and the target, e.g., a TCP connection. Communication between the + initiator and the target occurs over one or more connections. The + connections carry control messages, SCSI commands, parameters, and + data within iSCSI Protocol Data Units (iSCSI PDUs). + + Connection Handle - An information element that identifies the + particular iSCSI connection and is unique for a given iSCSI-iSER + pair. Every invocation of an Operational Primitive is qualified + with the Connection Handle. + + Data Sink - The peer receiving a data payload. Note that the Data + Sink can be required to both send and receive RCaP Messages to + transfer a data payload. + + Data Source - The peer sending a data payload. Note that the Data + Source can be required to both send and receive RCaP Messages to + transfer a data payload. + + Datamover Interface (DI) - The interface between the iSCSI layer and + the Datamover layer as described in [DA]. + + Datamover Layer - A layer that is directly below the iSCSI layer and + above the underlying transport layers. This layer exposes and + uses a set of transport independent Operational Primitives for the + communication between the iSCSI layer and itself. The Datamover + layer, operating in conjunction with the transport layers, moves + the control and data information on the iSCSI connection. In this + specification, the iSER layer is the Datamover layer. 
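The Advertisement definition above describes a Tagged Buffer by three identifiers (STag, TO, and buffer length). As an illustration only, the record could be modelled as in the following Python sketch. The names TaggedBuffer, covers, and advertise are hypothetical and do not come from this specification; the zero base Tagged Offset follows the rule stated in Section 1.4.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedBuffer:
    """Hypothetical record of an Advertised buffer (illustrative names)."""
    stag: int    # Steering Tag identifying the I/O Buffer
    to: int      # base Tagged Offset of the buffer
    length: int  # buffer length in bytes

    def covers(self, offset: int, nbytes: int) -> bool:
        """Return True if [offset, offset + nbytes) lies inside the buffer."""
        if nbytes < 0:
            return False
        return self.to <= offset and offset + nbytes <= self.to + self.length


def advertise(stag: int, length: int) -> TaggedBuffer:
    # Per Section 1.4.1 the base Tagged Offset is not sent explicitly and
    # the target assumes zero; the length comes from the SCSI Command PDU.
    return TaggedBuffer(stag=stag, to=0, length=length)
```

A remote peer holding such a record could use covers() to check that a planned RDMA access stays within the Advertised range. That check is a design choice of this sketch, not a requirement taken from the text.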
+ + Datamover Protocol - A Datamover protocol is the wire-protocol that + is defined to realize the Datamover layer functionality. In this + specification, the iSER protocol is the Datamover protocol. + + Event - An indication provided by the RDMA-Capable Protocol layer to + the iSER layer to indicate a Completion or other condition + requiring immediate attention. + + Inbound RDMA Read Queue Depth (IRD) - The maximum number of incoming + outstanding RDMA Read Requests that the RDMA-Capable Controller + can handle on a particular RCaP Stream at the Data Source. For + some RDMA-Capable Protocol layers, the term "IRD" may be known by + a different name. For example, for InfiniBand, the equivalent for + IRD is the Responder Resources. + + + + +Ko, et al. Standards Track [Page 12] + +RFC 5046 iSER Specification October 2007 + + + Invalidate STag - A mechanism used to prevent the Remote Peer from + reusing a previous explicitly Advertised STag, until the iSER + layer at the local node makes it available through a subsequent + explicit Advertisement. + + I/O Buffer - A buffer that is used in a SCSI Read or Write operation + so SCSI data may be sent from or received into that buffer. + + iSCSI - The iSCSI protocol as defined in [RFC3720] is a mapping of + the SCSI Architecture Model of SAM-2 over TCP. + + iSCSI control-type PDU - Any iSCSI PDU that is not an iSCSI data- + type PDU and also not a SCSI Data-out PDU carrying solicited data + is defined as an iSCSI control-type PDU. Specifically, it is to + be noted that SCSI Data-out PDUs for unsolicited data are defined + as iSCSI control-type PDUs. + + iSCSI data-type PDU - An iSCSI data-type PDU is defined as an iSCSI + PDU that causes data transfer, transparent to the remote iSCSI + layer, to take place between the peer iSCSI nodes on a Full + Feature Phase iSCSI connection. 
An iSCSI data-type PDU, when + requested for transmission by the sender iSCSI layer, results in + the associated data transfer without the participation of the + remote iSCSI layer, i.e. the PDU itself is not delivered as-is to + the remote iSCSI layer. The following iSCSI PDUs constitute the + set of iSCSI data-type PDUs - SCSI Data-In PDU and R2T PDU. + + iSCSI Layer - A layer in the protocol stack implementation within an + end node that implements the iSCSI protocol and interfaces with + the iSER layer via the Datamover Interface. + + iSCSI PDU (iSCSI Protocol Data Unit) - The iSCSI layer at the + initiator and the iSCSI layer at the target divide their + communications into messages. The term "iSCSI protocol data unit" + (iSCSI PDU) is used for these messages. + + iSCSI/iSER Connection - An iSER-assisted iSCSI connection. + + iSCSI/iSER Session - An iSER-assisted iSCSI session. + + iSCSI-iSER Pair - The iSCSI layer and the underlying iSER layer. + + iSER - iSCSI Extensions for RDMA, the protocol defined in this + document. + + iSER-assisted - A term generally used to describe the operation of + iSCSI when the iSER functionality is also enabled below the iSCSI + layer for the specific iSCSI/iSER connection in question. + + + +Ko, et al. Standards Track [Page 13] + +RFC 5046 iSER Specification October 2007 + + + iSER-IRD - This variable represents the maximum number of incoming + outstanding RDMA Read Requests that the iSER layer at the + initiator declares on a particular RCaP Stream. + + iSER-ORD - This variable represents the maximum number of outstanding + RDMA Read Requests that the iSER layer can initiate on a + particular RCaP Stream. This variable is maintained only by the + iSER layer at the target. + + iSER Layer - The layer that implements the iSCSI Extensions for RDMA + (iSER) protocol. + + iWARP - A suite of wire protocols comprising of [RDMAP], [DDP], and + [MPA] when layered above [TCP]. 
[RDMAP] and [DDP] may be layered + above SCTP or other transport protocols. + + Local Mapping - A task state record maintained by the iSER layer that + associates the Initiator Task Tag to the local STag(s). The + specifics of the record structure are implementation dependent. + + Local Peer - The implementation of the RDMA-Capable Protocol on the + local end of the connection. Used to refer to the local entity + when describing protocol exchanges or other interactions between + two Nodes. + + Node - A computing device attached to one or more links of a network. + A Node in this context does not refer to a specific application or + protocol instantiation running on the computer. A Node may + consist of one or more RDMA-Capable Controllers installed in a + host computer. + + Operational Primitive - An Operational Primitive is an abstract + functional interface procedure that requests that another layer + perform a specific action on the requestor's behalf or notifies + the other layer of some event. The Datamover Interface between an + iSCSI layer and a Datamover layer within an iSCSI end node uses a + set of Operational Primitives to define the functional interface + between the two layers. Note that an invocation of an + Operational Primitive does not always elicit a response from the + requested layer. A full discussion of the Operational Primitive + types and request-response semantics available to iSCSI and iSER + can be found in [DA]. + + Outbound RDMA Read Queue Depth (ORD) - The maximum number of + outstanding RDMA Read Requests that the RDMA-Capable Controller + can initiate on a particular RCaP Stream at the Data Sink. For + some RDMA-Capable Protocol layers, the term "ORD" may be known by a + different name. For example, for InfiniBand, the equivalent for + ORD is the Initiator Depth. 
+ + Phase-Collapse - Refers to the optimization in iSCSI where the SCSI + status is transferred along with the final SCSI Data-in PDU from a + target. See Section 3.2 in [RFC3720]. + + RCaP Message - One or more packets of the network layer comprising a + single RDMA Operation or a part of an RDMA Read Operation of the + RDMA-Capable Protocol. For iWARP, an RCaP Message is known as an + RDMAP Message. + + RCaP Stream - A single bidirectional association between the peer + RDMA-Capable Protocol layers on two Nodes over a single + transport-level stream. For iWARP, an RCaP Stream is known as an + RDMAP Stream, and the association is created when the connection + transitions to iSER-assisted mode following a successful Login + Phase during which iSER support is negotiated. + + RDMA-Capable Protocol (RCaP) - The protocol or protocol suite that + provides a reliable RDMA transport functionality, e.g., iWARP, + InfiniBand, etc. + + RDMA-Capable Controller - A network I/O adapter or embedded + controller with RDMA functionality. For example, for iWARP, this + could be an RNIC, and for InfiniBand, this could be a HCA (Host + Channel Adapter) or TCA (Target Channel Adapter). + + RDMA-enabled Network Interface Controller (RNIC) - A network I/O + adapter or embedded controller with iWARP functionality. + + RDMA Operation - A sequence of RCaP Messages, including control + Messages, to transfer data from a Data Source to a Data Sink. The + following RDMA Operations are defined - RDMA Write Operation, RDMA + Read Operation, Send Operation, Send with Invalidate Operation, + Send with Solicited Event Operation, Send with Solicited Event and + Invalidate Operation, and Terminate Operation. + + RDMA Protocol (RDMAP) - A wire protocol that supports RDMA Operations + to transfer ULP data between a Local Peer and the Remote Peer as + described in [RDMAP]. 
+ + RDMA Read Operation - An RDMA Operation used by the Data Sink to + transfer the contents of a Data Source buffer from the Remote Peer + to a Data Sink buffer at the Local Peer. An RDMA Read operation + consists of a single RDMA Read Request Message and a single RDMA + Read Response Message. + + + +Ko, et al. Standards Track [Page 15] + +RFC 5046 iSER Specification October 2007 + + + RDMA Read Request - An RCaP Message used by the Data Sink to request + that the Data Source transfer the contents of a buffer. The RDMA + Read Request Message describes both the Data Source and the Data + Sink buffers. + + RDMA Read Response - An RCaP Message used by the Data Source to + transfer the contents of a buffer to the Data Sink, in response to + an RDMA Read Request. The RDMA Read Response Message only + describes the Data Sink buffer. + + RDMA Write Operation - An RDMA Operation used by the Data Source to + transfer the contents of a Data Source buffer from the Local Peer + to a Data Sink buffer at the Remote Peer. The RDMA Write Message + only describes the Data Sink buffer. + + Remote Direct Memory Access (RDMA) - A method of accessing memory on + a remote system in which the local system specifies the remote + location of the data to be transferred. Employing an RDMA- + Capable Controller in the remote system allows the access to take + place without interrupting the processing of the CPU(s) on the + system. + + Remote Mapping - A task state record maintained by the iSER layer + that associates the Initiator Task Tag to the Advertised STag(s). + The specifics of the record structure are implementation + dependent. + + Remote Peer - The implementation of the RDMA-Capable Protocol on the + opposite end of the connection. Used to refer to the remote + entity when describing protocol exchanges or other interactions + between two Nodes. 
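The Remote Mapping defined above can be pictured as a small per-task table. The following is an illustrative, non-normative sketch only: the specification states that the record structure is implementation dependent, so the class names, field names, and methods here (`RemoteMapping`, `RemoteMappingTable`, `record`, `lookup`, `discard`) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RemoteMapping:
    """Hypothetical task state record: one ITT and the STag(s) Advertised for it."""
    itt: int
    advertised_stags: List[int] = field(default_factory=list)


class RemoteMappingTable:
    """Hypothetical container keyed by Initiator Task Tag (ITT)."""

    def __init__(self) -> None:
        self._by_itt: Dict[int, RemoteMapping] = {}

    def record(self, itt: int, stag: int) -> None:
        # Create the record on first use, then append the Advertised STag.
        mapping = self._by_itt.setdefault(itt, RemoteMapping(itt))
        mapping.advertised_stags.append(stag)

    def lookup(self, itt: int) -> Optional[RemoteMapping]:
        # Returns None when no record exists for this ITT.
        return self._by_itt.get(itt)

    def discard(self, itt: int) -> None:
        # Drop the record, e.g., when the task concludes; a missing ITT is ignored.
        self._by_itt.pop(itt, None)
```

A Local Mapping could be sketched the same way, with local STags in place of Advertised STags.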
+ + SCSI Layer - This layer builds/receives SCSI CDBs (Command Descriptor + Blocks) and sends/receives them with the remaining command execute + [SAM2] parameters to/from the iSCSI layer. + + Send - An RDMA Operation that transfers the contents of a Buffer from + the Local Peer to a Buffer at the Remote Peer. + + Send Message Type - A Send Message, Send with Invalidate Message, + Send with Solicited Event Message, or Send with Solicited Event + and Invalidate Message. + + SendInvSE Message - A Send with Solicited Event and Invalidate + Message. + + SendSE Message - A Send with Solicited Event Message. + + + + +Ko, et al. Standards Track [Page 16] + +RFC 5046 iSER Specification October 2007 + + + Sequence Number (SN) - DataSN for a SCSI Data-in PDU and R2TSN for an + R2T PDU. The semantics for both types of sequence numbers are as + defined in [RFC3720]. + + Session, iSCSI Session - The group of connections that link an + initiator SCSI port with a target SCSI port form an iSCSI session + (equivalent to a SCSI I-T nexus). Connections can be added to and + removed from a session even while the I-T nexus is intact. Across + all connections within a session, an initiator sees one and the + same target. + + Solicited Event (SE) - A facility by which an RDMA Operation sender + may cause an Event to be generated at the recipient, if the + recipient is configured to generate such an Event, when a Send + with Solicited Event or Send with Solicited Event and Invalidate + Message is received. + + Steering Tag (STag) - An identifier of a Tagged Buffer on a Node + (Local or Remote) as defined in [RDMAP] and [DDP]. For other + RDMA-Capable Protocols, the Steering Tag may be known by different + names but will be herein referred to as STags. For example, for + InfiniBand, a Remote STag is known as an R-Key, and a local STag + is known as an L-Key, and both will be considered STags. 
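The four Send Message Types defined above differ only in whether they request a Solicited Event and whether they carry an STag to invalidate. The following non-normative sketch encodes that distinction; the enum member names and the helper `raises_event` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class SendMessageType(Enum):
    # Each value is (solicited_event, invalidate).
    SEND = (False, False)          # Send Message
    SEND_INV = (False, True)       # Send with Invalidate Message
    SEND_SE = (True, False)        # SendSE Message
    SEND_INV_SE = (True, True)     # SendInvSE Message

    @property
    def solicited_event(self) -> bool:
        return self.value[0]

    @property
    def invalidates_stag(self) -> bool:
        return self.value[1]


def raises_event(msg: SendMessageType, recipient_configured: bool) -> bool:
    """An Event is generated only for SE-type messages, and only if the
    recipient is configured to generate such an Event."""
    return msg.solicited_event and recipient_configured
```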
+ + Tagged Buffer - A buffer that is explicitly Advertised to the iSER + layer at the remote node through the exchange of an STag, Tagged + Offset, and length. + + Tagged Offset (TO) - The offset within a Tagged Buffer. + + Traditional iSCSI - Refers to the iSCSI protocol as defined in + [RFC3720] (i.e., without the iSER enhancements). + + Untagged Buffer - A buffer that is not explicitly Advertised to the + iSER layer at the remote node. + +2.2. Acronyms + + Acronym Definition + -------------------------------------------------------------- + + AHS Additional Header Segment + + BHS Basic Header Segment + + CO Connection Only + + CRC Cyclic Redundancy Check + + DDP Direct Data Placement Protocol + + DI Datamover Interface + + HCA Host Channel Adapter + + IANA Internet Assigned Numbers Authority + + IB InfiniBand + + IETF Internet Engineering Task Force + + I/O Input - Output + + IO Initialize Only + + IP Internet Protocol + + IPoIB IP over InfiniBand + + IPsec Internet Protocol Security + + iSER iSCSI Extensions for RDMA + + ITT Initiator Task Tag + + LO Leading Only + + MPA Marker PDU Aligned Framing for TCP + + NOP No Operation + + NSG Next Stage (during the iSCSI Login Phase) + + OS Operating System + + PDU Protocol Data Unit + + R2T Ready To Transfer + + R2TSN Ready To Transfer Sequence Number + + RDMA Remote Direct Memory Access + + RDMAP Remote Direct Memory Access Protocol + + RFC Request For Comments + + + + +Ko, et al. 
Standards Track [Page 18] + +RFC 5046 iSER Specification October 2007 + + + RNIC RDMA-enabled Network Interface Controller + + SAM2 SCSI Architecture Model - 2 + + SCSI Small Computer System Interface + + SNACK Selective Negative Acknowledgment - also + Sequence Number Acknowledgement for data + + STag Steering Tag + + SW Session Wide + + TCA Target Channel Adapter + + TCP Transmission Control Protocol + + TMF Task Management Function + + TTT Target Transfer Tag + + TO Tagged Offset + + ULP Upper Level Protocol + +2.3. Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [RFC2119]. + +3. Upper Layer Interface Requirements + + This section discusses the upper layer interface requirements in the + form of an abstract model of the required interactions between the + iSCSI layer and the iSER layer. The abstract model used here is + derived from the architectural model described in [DA]. [DA] also + provides a functional overview of the interactions between the iSCSI + layer and the Datamover layer as intended by the Datamover + Architecture. + + The interface requirements are specified by Operational Primitives. + An Operational Primitive is an abstract functional interface + procedure between the iSCSI layer and the iSER layer that requests + one layer to perform a specific action on behalf of the other layer + or notifies the other layer of some event. Whenever an Operational + Primitive is invoked, the Connection_Handle qualifier is used to + identify a particular iSCSI connection. For some Operational + Primitives, a Data_Descriptor is used to identify the iSCSI/SCSI data + buffer associated with the requested or completed operation. 
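As a non-normative illustration of the two qualifiers just described, the sketch below models an Operational Primitive invocation that always carries a Connection_Handle and only sometimes carries a Data_Descriptor. The specification leaves the realization of qualifiers to the implementation, so every field shown here (`conn_id`, `segments`, the `Invocation` record itself) is a hypothetical choice for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ConnectionHandle:
    """Identifies one iSCSI connection; the integer id is an assumption."""
    conn_id: int


@dataclass(frozen=True)
class DataDescriptor:
    """Identifies an iSCSI/SCSI data buffer as hypothetical (address, length) pairs."""
    segments: Tuple[Tuple[int, int], ...]

    def total_length(self) -> int:
        return sum(length for _, length in self.segments)


@dataclass(frozen=True)
class Invocation:
    """One Operational Primitive invocation with its qualifiers."""
    primitive: str
    connection_handle: ConnectionHandle          # present on every invocation
    data_descriptor: Optional[DataDescriptor] = None   # only for some primitives
```

For example, a Put_Data invocation would carry a Data_Descriptor, while a Send_Control invocation in this sketch would not.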
+ + The abstract model and the Operational Primitives defined in this + section facilitate the description of the iSER protocol. In the rest + of the iSER specification, the compliance statements related to the + use of these Operational Primitives are only for the purpose of the + required interactions between the iSCSI layer and the iSER layer. + Note that the compliance statements related to the Operational + Primitives in the rest of this specification only mandate functional + equivalence on implementations, but do not put any requirements on + the implementation specifics of the interface between the iSCSI layer + and the iSER layer. + + Each Operational Primitive is invoked with a set of qualifiers that + specify the information context for performing the specific action + being requested of the Operational Primitive. While the qualifiers + are required, the method of realizing the qualifiers (e.g., by + passing synchronously with invocation, or by retrieving from task + context, or by retrieving from shared memory, etc.) is implementation + dependent. + +3.1. Operational Primitives Offered by iSER + + The iSER protocol layer MUST support the following Operational + Primitives to be used by the iSCSI protocol layer. + +3.1.1. Send_Control + + Input qualifiers: Connection_Handle, BHS and AHS (if any) of the + iSCSI PDU, PDU-specific qualifiers + + Return results: Not specified + + This is used by the iSCSI layers at the initiator and the target to + request the outbound transfer of an iSCSI control-type PDU (see + Section 7.2). Qualifiers that only apply for a particular control- + type PDU are known as PDU-specific qualifiers, e.g., + ImmediateDataSize for a SCSI write command. For details on PDU- + specific qualifiers, see Section 7.3. The iSCSI layer can only + invoke the Send_Control Operational Primitive when the connection is + in iSER-assisted mode. + +3.1.2. 
Put_Data + + Input qualifiers: Connection_Handle, content of a SCSI Data-in + PDU header, Data_Descriptor, Notify_Enable + + + + +Ko, et al. Standards Track [Page 20] + +RFC 5046 iSER Specification October 2007 + + + Return results: Not specified + + This is used by the iSCSI layer at the target to request the outbound + transfer of data for a SCSI Data-in PDU from the buffer identified by + the Data_Descriptor qualifier. The iSCSI layer can only invoke the + Put_Data Operational Primitive when the connection is in iSER- + assisted mode. + + The Notify_Enable qualifier is used to indicate to the iSER layer + whether or not it should generate an eventual local completion + notification to the iSCSI layer. See Section 3.2.2 on + Data_Completion_Notify for details. + +3.1.3. Get_Data + + Input qualifiers: Connection_Handle, content of an R2T PDU, + Data_Descriptor, Notify_Enable + + Return results: Not specified + + This is used by the iSCSI layer at the target to request the inbound + transfer of solicited data requested by an R2T PDU into the buffer + identified by the Data_Descriptor qualifier. The iSCSI layer can + only invoke the Get_Data Operational Primitive when the connection is + in iSER-assisted mode. + + The Notify_Enable qualifier is used to indicate to the iSER layer + whether or not it should generate the eventual local completion + notification to the iSCSI layer. See Section 3.2.2 on + Data_Completion_Notify for details. + +3.1.4. Allocate_Connection_Resources + + Input qualifiers: Connection_Handle, Resource_Descriptor + (optional) + + Return results: Status + + This is used by the iSCSI layers at the initiator and the target to + request the allocation of all connection resources necessary to + support RCaP for an operational iSCSI/iSER connection. The iSCSI + layer may optionally specify the implementation-specific resource + requirements for the iSCSI connection using the Resource_Descriptor + qualifier. 
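The request/status behavior of Allocate_Connection_Resources, including the Status values and the repeated-invocation rule given in the remainder of this section, can be sketched as follows. This is a non-normative illustration: the class `IserLayerSketch`, the `allocator` callback, and the use of a plain integer as the Connection_Handle and a dict as the Resource_Descriptor are all assumptions made for the example.

```python
from enum import Enum
from typing import Callable, Optional, Set


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class IserLayerSketch:
    """Hypothetical iSER layer tracking which connections hold resources."""

    def __init__(self, allocator: Callable[[int, Optional[dict]], bool]) -> None:
        self._allocator = allocator      # stands in for real RCaP resource allocation
        self._allocated: Set[int] = set()
        self.allocator_calls = 0

    def allocate_connection_resources(
        self, connection_handle: int, resource_descriptor: Optional[dict] = None
    ) -> Status:
        # A repeated request for a handle that already succeeded is ignored
        # and reported as success.
        if connection_handle in self._allocated:
            return Status.SUCCESS
        self.allocator_calls += 1
        if self._allocator(connection_handle, resource_descriptor):
            self._allocated.add(connection_handle)
            return Status.SUCCESS
        return Status.FAILURE

    def deallocate_connection_resources(self, connection_handle: int) -> None:
        # Releases whatever an earlier successful allocation obtained.
        self._allocated.discard(connection_handle)
```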
+ + A return result of Status=success means that the invocation + succeeded, and a return result of Status=failure means that the + invocation failed. If the invocation is for a Connection_Handle for + which an earlier invocation succeeded, the request will be ignored by + the iSER layer and the result of Status=success will be returned. + Only one Allocate_Connection_Resources Operational Primitive + invocation can be outstanding for a given Connection_Handle at any + time. + +3.1.5. Deallocate_Connection_Resources + + Input qualifiers: Connection_Handle + + Return results: Not specified + + This is used by the iSCSI layers at the initiator and the target to + request the deallocation of all connection resources that were + allocated earlier as a result of a successful invocation of the + Allocate_Connection_Resources Operational Primitive. + +3.1.6. Enable_Datamover + + Input qualifiers: Connection_Handle, + Transport_Connection_Descriptor, Final_Login_Response_PDU + (optional) + + Return results: Not specified + + This is used by the iSCSI layers at the initiator and the target to + request that a specified iSCSI connection be transitioned to iSER- + assisted mode. The Transport_Connection_Descriptor qualifier is used + to identify the specific connection associated with the + Connection_Handle. The iSCSI layer can only invoke the + Enable_Datamover Operational Primitive when there is a corresponding + prior resource allocation. + + The Final_Login_Response_PDU input qualifier is applicable only for a + target, and contains the final Login Response PDU that concludes the + iSCSI Login Phase. If the underlying transport is TCP, the final + Login Response PDU must be sent as a byte stream as expected by the + iSCSI layer at the initiator. 
When this qualifier is used, the iSER + layer at the target MUST transmit this final Login Response PDU + before transitioning to iSER-assisted mode. + +3.1.7. Connection_Terminate + + Input qualifiers: Connection_Handle + + Return results: Not specified + + + + + +Ko, et al. Standards Track [Page 22] + +RFC 5046 iSER Specification October 2007 + + + This is used by the iSCSI layers at the initiator and the target to + request that a specified iSCSI/iSER connection be terminated and all + associated connection and task resources be freed. When this + Operational Primitive invocation returns to the iSCSI layer, the + iSCSI layer may assume full ownership of all iSCSI-level resources, + e.g., I/O Buffers, associated with the connection. + +3.1.8. Notice_Key_Values + + Input qualifiers: Connection_Handle, number of keys, list of + Key-Value pairs + + Return results: Not specified + + This is used by the iSCSI layers at the initiator and the target to + request that the iSER layer take note of the specified Key-Value + pairs that were negotiated by the iSCSI peers for the connection. + +3.1.9. Deallocate_Task_Resources + + Input qualifiers: Connection_Handle, ITT + + Return results: Not specified + + This is used by the iSCSI layers at the initiator and the target to + request the deallocation of all RCaP-specific resources allocated by + the iSER layer for the task identified by the ITT qualifier. The + iSER layer may require a certain number of RCaP-specific resources + associated with the ITT for each new iSCSI task. In the normal + course of execution, these task-level resources in the iSER layer are + assumed to be transparently allocated on each task initiation and + deallocated on the conclusion of each task as appropriate. In + exception scenarios where the task does not conclude with a SCSI + Response PDU, the iSER layer needs to be notified of the individual + task terminations to aid its task-level resource management. 
This + Operational Primitive is used for this purpose, and is not needed + when a SCSI Response PDU normally concludes a task. Note that RCaP- + specific task resources are deallocated by the iSER layer when a SCSI + Response PDU normally concludes a task, even if the SCSI status was + not success. + +3.2. Operational Primitives Used by iSER + + The iSER layer MUST use the following Operational Primitives offered + by the iSCSI protocol layer when the connection is in iSER-assisted + mode. + + + + + +Ko, et al. Standards Track [Page 23] + +RFC 5046 iSER Specification October 2007 + + +3.2.1. Control_Notify + + Input qualifiers: Connection_Handle, an iSCSI control-type PDU + + Return results: Not specified + + This is used by the iSER layers at the initiator and the target to + notify the iSCSI layer of the availability of an inbound iSCSI + control-type PDU. A PDU is described as "available" to the iSCSI + layer when the iSER layer notifies the iSCSI layer of the reception + of that inbound PDU, along with an implementation-specific indication + as to where the received PDU is. + +3.2.2. Data_Completion_Notify + + Input qualifiers: Connection_Handle, ITT, SN + + Return results: Not specified + + This is used by the iSER layer to notify the iSCSI layer of the + completion of outbound data transfer that was requested by the iSCSI + layer only if the invocation of the Put_Data Operational Primitive + (see Section 3.1.2) was qualified with Notify_Enable set. SN refers + to the DataSN associated with the SCSI Data-in PDU. + + This is used by the iSER layer to notify the iSCSI layer of the + completion of inbound data transfer that was requested by the iSCSI + layer only if the invocation of the Get_Data Operational Primitive + (see Section 3.1.3) was qualified with Notify_Enable set. SN refers + to the R2TSN associated with the R2T PDU. + +3.2.3. 
Data_ACK_Notify + + Input qualifiers: Connection_Handle, ITT, DataSN + + Return results: Not specified + + This is used by the iSER layer at the target to notify the iSCSI + layer of the arrival of the data acknowledgement (as defined in + [RFC3720]) requested earlier by the iSCSI layer for the outbound data + transfer via an invocation of the Put_Data Operational Primitive + where the A-bit in the SCSI Data-in PDU is set to 1. See Section + 7.3.5. DataSN refers to the expected DataSN of the next SCSI Data-in + PDU, which immediately follows the SCSI Data-in PDU with the A-bit + set to which this notification corresponds, with semantics as defined + in [RFC3720]. + +3.2.4. Connection_Terminate_Notify + + Input qualifiers: Connection_Handle + + Return results: Not specified + + This is used by the iSER layers at the initiator and the target to + notify the iSCSI layer of the unsolicited termination or failure of + an iSCSI/iSER connection. The iSER layer MUST deallocate the + connection and task resources associated with the terminated + connection before the invocation of this Operational Primitive. Note + that the Connection_Terminate_Notify Operational Primitive is not + invoked when the termination of the connection was requested earlier + by the local iSCSI layer. + +3.3. iSCSI Protocol Usage Requirements + + To operate in an iSER-assisted mode, the iSCSI layers at both the + initiator and the target MUST negotiate the RDMAExtensions key (see + Section 6.3) to "Yes" on the leading connection. If the + RDMAExtensions key is not negotiated to "Yes", then iSER-assisted + mode MUST NOT be used. If the RDMAExtensions key is negotiated to + "Yes" but the invocation of the Allocate_Connection_Resources + Operational Primitive to the iSER layer fails, the iSCSI layer MUST + fail the iSCSI Login process or terminate the connection as + appropriate. 
See Section 10.1.3.1 for details. + + If the RDMAExtensions key is negotiated to "Yes", the iSCSI layer + MUST satisfy the following protocol usage requirements from the iSER + protocol: + + 1. The iSCSI layer at the initiator MUST set ExpDataSN to 0 in Task + Management Function Requests for Task Allegiance Reassignment for + read/bidirectional commands, so as to cause the target to send + all unacknowledged read data. + + 2. The iSCSI layer at the target MUST always return the SCSI status + in a separate SCSI Response PDU for read commands, i.e., there + MUST NOT be a "phase collapse" in concluding a SCSI read command. + + 3. The iSCSI layers at both the initiator and the target MUST + support the keys as defined in Section 6 on Login/Text + Operational Keys. If used as specified, these keys MUST NOT be + answered with NotUnderstood, and the semantics as defined MUST be + followed for each iSER-assisted connection. + + 4. The iSCSI layer at the initiator MUST NOT issue SNACKs for PDUs. + + + + +Ko, et al. Standards Track [Page 25] + +RFC 5046 iSER Specification October 2007 + + +4. Lower Layer Interface Requirements + +4.1. Interactions with the RCaP Layer + + The iSER protocol layer is layered on top of an RCaP layer (see + Figure 1) and the following are the key features that are assumed to + be supported by any RCaP layer: + + * The RCaP layer supports all basic RDMA operations, including RDMA + Write Operation, RDMA Read Operation, Send Operation, Send with + Invalidate Operation, Send with Solicited Event Operation, Send + with Solicited Event and Invalidate Operation, and Terminate + Operation. + + * The RCaP layer provides reliable, in-order message delivery and + direct data placement. 
+ + * When the iSER layer initiates an RDMA Read Operation following an + RDMA Write Operation on one RCaP Stream, the RDMA Read Response + Message processing on the remote node will be started only after + the preceding RDMA Write Message payload is placed in the memory + of the remote node. + + * The RCaP layer encapsulates a single iSER Message into a single + RCaP Message on the Data Source side. The RCaP layer decapsulates + the iSER Message before delivering it to the iSER layer on the + Data Sink side. + + * When the iSER layer provides the STag to be remotely invalidated + to the RCaP layer for a SendInvSE Message, the RCaP layer uses + this STag as the STag to be invalidated in the SendInvSE Message. + + * The RCaP layer uses the STag and Tagged Offset provided by the + iSER layer for the RDMA Write and RDMA Read Request Messages. + + * When the RCaP layer delivers the content of an RDMA Send Message + Type to the iSER layer, the RCaP layer provides the length of the + RDMA Send message. This ensures that the iSER layer does not have + to carry a length field in the iSER header. + + * When the RCaP layer delivers the SendSE or SendInvSE Message to + the iSER layer, it notifies the iSER layer with the mechanism + provided on that interface. + + * When the RCaP layer delivers a SendInvSE Message to the iSER + layer, it passes the value of the STag that was invalidated. + + + + + +Ko, et al. Standards Track [Page 26] + +RFC 5046 iSER Specification October 2007 + + + * The RCaP layer propagates all status and error indications to the + iSER layer. + + * For a transport layer that operates in byte stream mode such as + TCP, the RCaP implementation supports the enabling of the RDMA + mode after connection establishment and the exchange of Login + parameters in byte stream mode. 
For a transport layer that + provides message delivery capability such as [IB], the RCaP + implementation supports the use of the messaging capability by the + iSCSI layer directly for the Login Phase after connection + establishment before enabling iSER-assisted mode. + + * Whenever the iSER layer terminates the RCaP Stream, the RCaP layer + terminates the associated connection. + +4.2. Interactions with the Transport Layer + + The iSER layer does not directly set up the transport layer connection + (e.g., TCP or [IB]). During connection setup, the iSCSI layer is + responsible for setting up the connection. If the login is + successful, the iSCSI layer invokes the Enable_Datamover Operational + Primitive to request that the iSER layer transition to the iSER- + assisted mode for that iSCSI connection. See Section 5.1 on + iSCSI/iSER connection setup. After transitioning to iSER-assisted + mode, the RCaP layer and the underlying transport layer are + responsible for maintaining the connection and reporting any + connection failures to the iSER layer. + +5. Connection Setup and Termination + +5.1. iSCSI/iSER Connection Setup + + During connection setup, the iSCSI layer at the initiator is + responsible for establishing a connection with the target. After the + connection is established, the iSCSI layers at the initiator and the + target enter the Login Phase using the same rules as outlined in + [RFC3720]. Transition to iSER-assisted mode occurs when the + connection transitions into the iSCSI Full Feature Phase following a + successful login negotiation between the initiator and the target in + which iSER-assisted mode is negotiated and the connection resources + necessary to support RCaP have been allocated at both the initiator + and the target. The same connection MUST be used for both the iSCSI + Login Phase and the subsequent iSER-assisted Full Feature Phase. 
+ + iSER-assisted mode MUST be enabled only if it is negotiated on the + leading connection during the LoginOperationalNegotiation stage of + the iSCSI Login Phase. iSER-assisted mode is negotiated using the + RDMAExtensions=<boolean-value> key. Both the initiator and the + target MUST exchange the RDMAExtensions key with the value set to + "Yes" to enable iSER-assisted mode. If the initiator and the + target do not negotiate the RDMAExtensions key to "Yes", then + the connection MUST continue with the login semantics as defined in + [RFC3720]. If the RDMAExtensions key is not negotiated to "Yes", then + for some RCaP implementations (such as [IB]), the connection may need + to be re-established in TCP-capable mode. (For InfiniBand this will + require an [IPoIB] type connection.) + + iSER-assisted mode is defined for a Normal session only and the + RDMAExtensions key MUST NOT be negotiated for a Discovery session. + Discovery sessions are always conducted using the transport layer as + described in [RFC3720]. + + An iSER-enabled node is not required to initiate the RDMAExtensions + key exchange if its preference is for the Traditional iSCSI mode. + The RDMAExtensions key, if offered, MUST be sent in the first + available Login Response or Login Request PDU in the + LoginOperationalNegotiation stage. This is because the + value of some login parameters might depend on whether iSER-assisted + mode is enabled. + + iSER-assisted mode is a session-wide attribute. If both the + initiator and the target negotiate RDMAExtensions="Yes" on the + leading connection of a session, then all subsequent connections of + the same session MUST enable iSER-assisted mode without having to + exchange an RDMAExtensions key during the iSCSI Login Phase. 
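The negotiation outcome described above reduces to a simple rule that is decided once, on the leading connection, and then applies to the whole session. The following non-normative sketch shows that rule; the function and class names are hypothetical, and key values are modelled as plain strings with `None` meaning the key was never offered.

```python
from typing import Optional


def iser_enabled_on_leading(
    initiator_value: Optional[str],
    target_value: Optional[str],
    session_type: str = "Normal",
) -> bool:
    """Decide iSER-assisted mode from the leading connection's key exchange."""
    # The key MUST NOT be negotiated on a Discovery session, so this sketch
    # never enables the mode there.
    if session_type != "Normal":
        return False
    # Both sides must exchange RDMAExtensions with the value "Yes".
    return initiator_value == "Yes" and target_value == "Yes"


class SessionSketch:
    """Hypothetical session object holding the session-wide attribute."""

    def __init__(self, leading_result: bool) -> None:
        self.iser_assisted = leading_result

    def mode_for_new_connection(self) -> str:
        # Later connections inherit the mode without exchanging the key again.
        return "iSER-assisted" if self.iser_assisted else "Traditional iSCSI"
```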
+ + Conversely, if both the initiator and the target fail to negotiate + RDMAExtensions to "Yes" on the leading connection of a session, then + the RDMAExtensions key MUST NOT be negotiated further on any + additional subsequent connection of the session. + + When the RDMAExtensions key is negotiated to "Yes", the HeaderDigest + and the DataDigest keys MUST be negotiated to "None" on all + iSCSI/iSER connections participating in that iSCSI session. This is + because, for an iSCSI/iSER connection, RCaP is responsible for + providing error detection that is at least as good as a 32-bit CRC + for all iSER Messages. Furthermore, all SCSI Read data are sent + using RDMA Write Messages instead of the SCSI Data-in PDUs, and all + solicited SCSI write data are sent using RDMA Read Response Messages + instead of the SCSI Data-out PDUs. HeaderDigest and DataDigest that + apply to iSCSI PDUs, would not be appropriate for RDMA Read and RDMA + Write operations used with iSER. + + + + + + + +Ko, et al. Standards Track [Page 28] + +RFC 5046 iSER Specification October 2007 + + +5.1.1. Initiator Behavior + + If the outcome of the iSCSI negotiation is to enable iSER-assisted + mode, then on the initiator side, prior to sending the Login Request + with the T (Transit) bit set to 1 and the NSG (Next Stage) field set + to FullFeaturePhase, the iSCSI layer MUST request that the iSER layer + allocate the connection resources necessary to support RCaP by + invoking the Allocate_Connection_Resources Operational Primitive. + The connection resources required are defined by implementation and + are outside the scope of this specification. The iSCSI layer may + invoke the Notice_Key_Values Operational Primitive before invoking + the Allocate_Connection_Resources Operational Primitive to request + that the iSER layer take note of the negotiated values of the iSCSI + keys for the connection. The specific keys to be passed as input + qualifiers are implementation dependent. 
These may include, but are + not limited to, MaxOutstandingR2T and ErrorRecoveryLevel. + + To minimize the potential for a denial-of-service attack, the iSCSI + layer MUST NOT request that the iSER layer allocate the connection + resources necessary to support RCaP until the iSCSI layer is + sufficiently far along in the iSCSI Login Phase that it is reasonably + certain that the peer side is not an attacker. In particular, if the + Login Phase includes a SecurityNegotiation stage, the iSCSI layer + MUST defer the connection resource allocation (i.e., invoking the + Allocate_Connection_Resources Operational Primitive) to the + LoginOperationalNegotiation stage [RFC3720] so that the resource + allocation occurs after the authentication phase is completed. + + Among the connection resources allocated at the initiator is the + Inbound RDMA Read Queue Depth (IRD). As described in Section 9.5.1, + R2Ts are transformed by the target into RDMA Read operations. IRD + limits the maximum number of simultaneously incoming outstanding RDMA + Read Requests per RCaP Stream from the target to the initiator. + The required value of IRD is outside the scope of the iSER + specification. The iSER layer at the initiator MUST set IRD to 1 or + higher if R2Ts are to be used in the connection. However, the iSER + layer at the initiator MAY set IRD to 0 based on implementation + configuration, which indicates that no R2Ts will be used on that + connection. Initially, the iSER-IRD value at the initiator SHOULD be + set to the IRD value at the initiator and MUST NOT be more than the + IRD value. + + On the other hand, the Outbound RDMA Read Queue Depth (ORD) MAY be + set to 0, since the iSER layer at the initiator does not issue RDMA + Read Requests to the target. + + + + + + +Ko, et al. 
Standards Track [Page 29] + +RFC 5046 iSER Specification October 2007 + + + Failure to allocate the requested connection resources locally + results in a login failure and its handling is described in Section + 10.1.3.1. + + If the iSER layer at the initiator is successful in allocating the + connection resources necessary to support RCaP, the following events + MUST occur in the specified sequence: + + 1. The iSER layer MUST return a success status to the iSCSI layer in + response to the Allocate_Connection_Resources Operational + Primitive. + + 2. After the target returns the Login Response with the T bit set to + 1 and the NSG field set to FullFeaturePhase, and a status class + of 0 (Success), the iSCSI layer MUST request that the iSER layer + transition to iSER-assisted mode by invoking the Enable_Datamover + Operational Primitive with the following qualifiers. (See + Section 10.1.4.6 for the case when the status class is not + Success.): + + a. Connection_Handle that identifies the iSCSI connection. + + b. Transport_Connection_Descriptor that identifies the specific + transport connection associated with the Connection_Handle. + + 3. If necessary, the iSER layer should enable RCaP and transition + the connection to iSER-assisted mode. When the RCaP is iWARP, + then this step MUST be done. Not all RCaPs may need it depending + on the RCaP Stream start-up state. + + 4. The iSER layer MUST send the iSER Hello Message as the first iSER + Message. See Section 5.1.3 on iSER Hello Exchange. + +5.1.2. Target Behavior + + If the outcome of the iSCSI negotiation is to enable iSER-assisted + mode, then on the target side, prior to sending the Login Response + with the T (Transit) bit set to 1 and the NSG (Next Stage) field set + to FullFeaturePhase, the iSCSI layer MUST request that the iSER layer + allocate the resources necessary to support RCaP by invoking the + Allocate_Connection_Resources Operational Primitive. 
The connection + resources required are defined by implementation and are outside the + scope of this specification. Optionally, the iSCSI layer may invoke + the Notice_Key_Values Operational Primitive before invoking the + Allocate_Connection_Resources Operational Primitive to request that + the iSER layer take note of the negotiated values of the iSCSI keys + for the connection. The specific keys to be passed as input + + + + +Ko, et al. Standards Track [Page 30] + +RFC 5046 iSER Specification October 2007 + + + qualifiers are implementation dependent. These may include, but are + not limited to, MaxOutstandingR2T, ErrorRecoveryLevel, etc. + + To minimize the potential for a denial-of-service attack, the iSCSI + layer MUST NOT request that the iSER layer allocate the connection + resources necessary to support RCaP until the iSCSI layer is + sufficiently far along in the iSCSI Login Phase that it is reasonably + certain that the peer side is not an attacker. In particular, if the + Login Phase includes a SecurityNegotiation stage, the iSCSI layer + MUST defer the connection resource allocation (i.e., invoking the + Allocate_Connection_Resources Operational Primitive) to the + LoginOperationalNegotiation stage [RFC3720] so that the resource + allocation occurs after the authentication phase is completed. + + Among the connection resources allocated at the target is the + Outbound RDMA Read Queue Depth (ORD). As described in Section 9.5.1, + R2Ts are transformed by the target into RDMA Read operations. The + ORD limits the maximum number of simultaneously outstanding RDMA Read + Requests per RCaP Stream from the target to the initiator. + Initially, the iSER-ORD value at the target SHOULD be set to the ORD + value at the target. + + On the other hand, the IRD at the target MAY be set to 0 since the + iSER layer at the target does not expect RDMA Read Requests to be + issued by the initiator. 
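
The initial queue-depth rules for the two sides can be illustrated with a minimal sketch. It is not part of the specification: the function names and the dictionary layout are hypothetical, and the choice to force IRD to 0 whenever R2Ts are not used is only one permitted configuration (the text says MAY).

```python
def initial_initiator_depths(r2ts_used: bool, ird: int) -> dict:
    """Initial values at the initiator (hypothetical helper).

    IRD must be at least 1 if R2Ts are to be used; ORD may be 0 because the
    initiator issues no RDMA Read Requests; iSER-IRD starts at IRD and must
    never exceed it.
    """
    if r2ts_used and ird < 1:
        raise ValueError("IRD must be 1 or higher when R2Ts are used")
    if not r2ts_used:
        ird = 0  # assumption: implementation chooses the optional IRD=0 setting
    return {"IRD": ird, "ORD": 0, "iSER-IRD": ird}


def initial_target_depths(ord_depth: int) -> dict:
    """Initial values at the target (hypothetical helper).

    ORD bounds outstanding RDMA Read Requests toward the initiator; IRD may
    be 0 because the initiator issues no RDMA Read Requests; iSER-ORD starts
    at ORD.
    """
    if ord_depth < 0:
        raise ValueError("ORD cannot be negative")
    return {"ORD": ord_depth, "IRD": 0, "iSER-ORD": ord_depth}
```

The iSER-IRD and iSER-ORD values produced here are only starting points; they are later reconciled by the iSER Hello Exchange.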
+ + Failure to allocate the requested connection resources locally + results in a login failure and its handling is described in Section + 10.1.3.1. + + If the iSER layer at the target is successful in allocating the + connection resources necessary to support RCaP, the following events + MUST occur in the specified sequence: + + 1. The iSER layer MUST return a success status to the iSCSI layer in + response to the Allocate_Connection_Resources Operational + Primitive. + + 2. The iSCSI layer MUST request that the iSER layer transition to + iSER-assisted mode by invoking the Enable_Datamover Operational + Primitive with the following qualifiers: + + a. Connection_Handle that identifies the iSCSI connection. + + b. Transport_Connection_Descriptor that identifies the specific + transport connection associated with the Connection_Handle. + + + + + +Ko, et al. Standards Track [Page 31] + +RFC 5046 iSER Specification October 2007 + + + c. The final transport layer (e.g., TCP) message containing the + Login Response with the T bit set to 1 and the NSG field set + to FullFeaturePhase. + + 3. The iSER layer MUST send the final Login Response PDU in the + native transport mode to conclude the iSCSI Login Phase. If the + underlying transport is TCP, then the iSER layer MUST send the + final Login Response PDU in byte stream mode. + + 4. After sending the final Login Response PDU, the iSER layer should + enable RCaP if necessary and transition the connection to iSER- + assisted mode. When the RCaP is iWARP, then this step MUST be + done. Not all RCaPs may need it depending on the RCaP Stream + start-up state. + + 5. After receiving the iSER Hello Message from the initiator, the + iSER layer MUST respond with the iSER HelloReply Message to be + sent as the first iSER Message. See Section 5.1.3 on iSER Hello + Exchange for more details. + + Note: In the above sequence, the operations as described in bullets 3 + and 4 MUST be performed atomically for iWARP connections. 
Failure to + do this may result in race conditions. + +5.1.3. iSER Hello Exchange + + After the connection transitions into iSER-assisted mode, the first + iSER Message sent by the iSER layer at the initiator to the target + MUST be the iSER Hello Message. The iSER Hello Message is used by + the iSER layer at the initiator to declare iSER parameters to the + target. See Section 9.3 on iSER Header Format for the iSER Hello + Message. + + In response to the iSER Hello Message, the iSER layer at the target + MUST return the iSER HelloReply Message as the first iSER Message + sent by the target. The iSER HelloReply Message is used by the iSER + layer at the target to declare iSER parameters to the initiator. See + Section 9.4 on iSER Header Format for the iSER HelloReply Message. + + In the iSER Hello Message, the iSER layer at the initiator declares + the iSER-IRD value to the target. + + Upon receiving the iSER Hello Message, the iSER layer at the target + MUST set the iSER-ORD value to the minimum of the iSER-ORD value at + the target and the iSER-IRD value declared by the initiator. The + iSER layer at the target MAY adjust (lower) its ORD value to match + the iSER-ORD value if the iSER-ORD value is smaller than the ORD + value at the target in order to free up the unused resources. + + + +Ko, et al. Standards Track [Page 32] + +RFC 5046 iSER Specification October 2007 + + + In the iSER HelloReply Message, the iSER layer at the target declares + the iSER-ORD value to the initiator. + + Upon receiving the iSER HelloReply Message, the iSER layer at the + initiator MAY adjust (lower) its IRD value to match the iSER-ORD + value in order to free up the unused resources, if the iSER-ORD value + declared by the target is smaller than the iSER-IRD value declared by + the initiator. + + It is an iSER level negotiation failure if the iSER parameters + declared in the iSER Hello Message by the initiator are unacceptable + to the target. 
This includes the following: + + * The initiator-declared iSER-IRD value is greater than 0 and the + target-declared iSER-ORD value is 0. + + * The initiator-supported and the target-supported iSER protocol + versions do not overlap. + + See Section 10.1.3.2 for the handling of the error situation. + +5.2. iSCSI/iSER Connection Termination + +5.2.1. Normal Connection Termination at the Initiator + + The iSCSI layer at the initiator terminates an iSCSI/iSER connection + normally by invoking the Send_Control Operational Primitive qualified + with the Logout Request PDU. The iSER layer at the initiator MUST + use a SendSE Message to send the Logout Request PDU to the target. + After the iSER layer at the initiator receives the SendSE Message + containing the Logout Response PDU from the target, it MUST notify + the iSCSI layer by invoking the Control_Notify Operational Primitive + qualified with the Logout Response PDU. + + After the iSCSI logout process is complete, the iSCSI layer at the + target is responsible for closing the iSCSI/iSER connection as + described in Section 5.2.2. After the RCaP layer at the initiator + reports that the connection has been closed, the iSER layer at the + initiator MUST deallocate all connection and task resources (if any) + associated with the connection, and invalidate the Local Mapping(s) + (if any) that associate the ITT(s) used on that connection to the + local STag(s) before notifying the iSCSI layer by invoking the + Connection_Terminate_Notify Operational Primitive. + + + + + + + + +Ko, et al. Standards Track [Page 33] + +RFC 5046 iSER Specification October 2007 + + +5.2.2. Normal Connection Termination at the Target + + Upon receiving the SendSE Message containing the Logout Request PDU, + the iSER layer at the target MUST notify the iSCSI layer at the + target by invoking the Control_Notify Operational Primitive qualified + with the Logout Request PDU. 
The iSCSI layer completes the logout + process by invoking the Send_Control Operational Primitive qualified + with the Logout Response PDU. The iSER layer at the target MUST use + a SendSE Message to send the Logout Response PDU to the initiator. + After the iSCSI logout process is complete, the iSCSI layer at the + target MUST request that the iSER layer at the target terminate the + RCaP Stream by invoking the Connection_Terminate Operational + Primitive. + + As part of the termination process, the RCaP layer MUST close the + connection. When the RCaP layer notifies the iSER layer after the + RCaP Stream and the associated connection are terminated, the iSER + layer MUST deallocate all connection and task resources (if any) + associated with the connection, and invalidate the Local and Remote + Mapping(s) (if any) that associate the ITT(s) used on that connection + to the local STag(s) and the Advertised STag(s) respectively. + +5.2.3. Termination without Logout Request/Response PDUs + +5.2.3.1. Connection Termination Initiated by the iSCSI Layer + + The Connection_Terminate Operational Primitive MAY be invoked by the + iSCSI layer to request that the iSER layer terminate the RCaP Stream + without having previously exchanged the Logout Request and Logout + Response PDUs between the two iSCSI/iSER nodes. As part of the + termination process, the RCaP layer will close the connection. When + the RCaP layer notifies the iSER layer after the RCaP Stream and the + associated connection are terminated, the iSER layer MUST perform the + following actions. + + If the Connection_Terminate Operational Primitive is invoked by the + iSCSI layer at the target, then the iSER layer at the target MUST + deallocate all connection and task resources (if any) associated with + the connection, and invalidate the Local and Remote Mappings (if any) + that associate the ITT(s) used on the connection to the local STag(s) + and the Advertised STag(s), respectively. 
+ + If the Connection_Terminate Operational Primitive is invoked by the + iSCSI layer at the initiator, then the iSER layer at the initiator + MUST deallocate all connection and task resources (if any) associated + with the connection, and invalidate the Local Mapping(s) (if any) + that associate the ITT(s) used on the connection to the local + STag(s). + + + +Ko, et al. Standards Track [Page 34] + +RFC 5046 iSER Specification October 2007 + + +5.2.3.2. Connection Termination Notification to the iSCSI Layer + + If the iSCSI/iSER connection is terminated without the invocation of + Connection_Terminate from the iSCSI layer, the iSER layer MUST notify + the iSCSI layer that the iSCSI/iSER connection has been terminated by + invoking the Connection_Terminate_Notify Operational Primitive. + + Prior to invoking Connection_Terminate_Notify, the iSER layer at the + target MUST deallocate all connection and task resources (if any) + associated with the connection, and invalidate the Local and Remote + Mappings (if any) that associate the ITT(s) used on the connection to + the local STag(s) and the Advertised STag(s), respectively. + + Prior to invoking Connection_Terminate_Notify, the iSER layer at the + initiator MUST deallocate all connection and task resources (if any) + associated with the connection, and invalidate the Local Mappings (if + any) that associate the ITT(s) used on the connection to the local + STag(s). + + If the remote iSCSI/iSER node initiated the closing of the connection + (e.g., by sending a TCP FIN or TCP RST), the iSER layer MUST notify + the iSCSI layer after the RCaP layer reports that the connection is + closed by invoking the Connection_Terminate_Notify Operational + Primitive. + + Another example of a connection termination without a preceding + logout is when the iSCSI layer at the initiator does an implicit + logout (connection reinstatement). + +6. 
Login/Text Operational Keys + + Certain iSCSI login/text operational keys have restricted usage in + iSER, and additional keys are used to support the iSER protocol + functionality. All other keys defined in [RFC3720] and not discussed + in this section may be used on iSCSI/iSER connections with the same + semantics. + +6.1. HeaderDigest and DataDigest + + Irrelevant when: RDMAExtensions=Yes + + A negotiation resulting in RDMAExtensions=Yes for a session implies + HeaderDigest=None and DataDigest=None for all connections in that + session and overrides both the default and any explicit setting. + +6.2. MaxRecvDataSegmentLength + + For an iSCSI connection belonging to a session in which + RDMAExtensions=Yes was negotiated on the leading connection of the + session, MaxRecvDataSegmentLength need not be declared in the Login + Phase. Instead, InitiatorRecvDataSegmentLength (as described in + Section 6.5) and TargetRecvDataSegmentLength (as described in Section + 6.4) keys are negotiated. The values of the local and remote + MaxRecvDataSegmentLength are derived from the + InitiatorRecvDataSegmentLength and TargetRecvDataSegmentLength keys + even if the MaxRecvDataSegmentLength is declared during the Login + Phase. + + In the Full Feature Phase, the initiator MUST consider the value of + its local MaxRecvDataSegmentLength (that it would have declared to + the target) as having the value of InitiatorRecvDataSegmentLength, + and the value of the remote MaxRecvDataSegmentLength (that would have + been declared by the target) as having the value of + TargetRecvDataSegmentLength. 
Similarly, the target MUST consider the + value of its local MaxRecvDataSegmentLength (that it would have + declared to the initiator) as having the value of + TargetRecvDataSegmentLength, and the value of the remote + MaxRecvDataSegmentLength (that would have been declared by the + initiator) as having the value of InitiatorRecvDataSegmentLength. + + The MaxRecvDataSegmentLength key is applicable only for iSCSI + control-type PDUs. + +6.3. RDMAExtensions + + Use: LO (leading only) + + Senders: Initiator and Target + + Scope: SW (session-wide) + + RDMAExtensions=<boolean-value> + + Irrelevant when: SessionType=Discovery + + Default is No + + Result function is AND + + This key is used by the initiator and the target to negotiate support + for iSER-assisted mode. To enable the use of iSER-assisted mode, + both the initiator and the target MUST exchange RDMAExtensions=Yes. + + + + +Ko, et al. Standards Track [Page 36] + +RFC 5046 iSER Specification October 2007 + + + iSER-assisted mode MUST NOT be used if either the initiator or the + target offers RDMAExtensions=No. + + An iSER-enabled node is not required to initiate the RDMAExtensions + key exchange if it prefers to operate in the Traditional iSCSI mode. + However, if the RDMAExtensions key is to be negotiated, an initiator + MUST offer the key in the first Login Request PDU in the + LoginOperationalNegotiation stage of the leading connection, and a + target MUST offer the key in the first Login Response PDU with which + it is allowed to do so (i.e., the first Login Response PDU issued + after the first Login Request PDU with the C bit set to 0) in the + LoginOperationalNegotiation stage of the leading connection. In + response to the offered key=value pair of RDMAExtensions=yes, an + initiator MUST respond in the next Login Request PDU with which it is + allowed to do so, and a target MUST respond in the next Login + Response PDU with which it is allowed to do so. 
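
As an illustration of the AND result function and of the keys that RDMAExtensions=Yes overrides (Sections 6.1 and 6.6), the following sketch may help. The function names are hypothetical, key values are modeled as plain strings, and comparing against the exact spelling "Yes" is a simplifying assumption of the sketch.

```python
from typing import Dict, Optional


def negotiate_rdma_extensions(initiator_offer: Optional[str],
                              target_offer: Optional[str]) -> str:
    """Apply the boolean AND result function; a key not offered defaults to No."""
    initiator_yes = initiator_offer == "Yes"
    target_yes = target_offer == "Yes"
    return "Yes" if (initiator_yes and target_yes) else "No"


def apply_iser_overrides(keys: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the negotiated keys with the iSER-implied values applied.

    When RDMAExtensions=Yes, digests and markers are forced off regardless of
    defaults or explicit settings. Otherwise the keys are returned unchanged.
    """
    result = dict(keys)
    if result.get("RDMAExtensions") == "Yes":
        result.update({
            "HeaderDigest": "None",
            "DataDigest": "None",
            "OFMarker": "No",
            "IFMarker": "No",
        })
    return result
```

For example, a session that offered HeaderDigest=CRC32C and then negotiated RDMAExtensions=Yes would end up with HeaderDigest=None under this sketch.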
+ + Negotiating the RDMAExtensions key first enables a node to negotiate + the optimal value for other keys. Certain iSCSI keys such as + MaxBurstLength, MaxOutstandingR2T, ErrorRecoveryLevel, InitialR2T, + ImmediateData, etc., may be negotiated differently depending on + whether the connection is in Traditional iSCSI mode or iSER-assisted + mode. + +6.4. TargetRecvDataSegmentLength + + Use: IO (Initialize only) + + Senders: Initiator and Target + + Scope: CO (connection-only) + + Irrelevant when: RDMAExtensions=No + + TargetRecvDataSegmentLength=<numerical-value-512-to-(2**24-1)> + + Default is 8192 bytes + + Result function is minimum + + This key is relevant only for the iSCSI connection of an iSCSI + session if RDMAExtensions=Yes is negotiated on the leading connection + of the session. It is used by the initiator and target to negotiate + the maximum size of the data segment that an initiator may send to + the target in an iSCSI control-type PDU in the Full Feature Phase. + For SCSI Command PDUs and SCSI Data-out PDUs containing non-immediate + unsolicited data to be sent by the initiator, the initiator MUST send + all non-Final PDUs with a data segment size of exactly + + + +Ko, et al. Standards Track [Page 37] + +RFC 5046 iSER Specification October 2007 + + + TargetRecvDataSegmentLength whenever the PDUs constitute a data + sequence whose size is larger than TargetRecvDataSegmentLength. + +6.5. InitiatorRecvDataSegmentLength + + Use: IO (Initialize only) + + Senders: Initiator and Target + + Scope: CO (connection-only) + + Irrelevant when: RDMAExtensions=No + + InitiatorRecvDataSegmentLength=<numerical-value-512-to-(2**24-1)> + + Default is 8192 bytes + + Result function is minimum + + This key is relevant only for the iSCSI connection of an iSCSI + session if RDMAExtensions=Yes is negotiated on the leading connection + of the session. 
It is used by the initiator and target to negotiate + the maximum size of the data segment that a target may send to the + initiator in an iSCSI control-type PDU in the Full Feature Phase. + +6.6. OFMarker and IFMarker + + Irrelevant when: RDMAExtensions=Yes + + A negotiation resulting in RDMAExtensions=Yes for a session implies + OFMarker=No and IFMarker=No for all connections in that session and + overrides both the default and any explicit setting. + +6.7. MaxOutstandingUnexpectedPDUs + + Use: LO (leading only), Declarative + + Senders: Initiator and Target + + Scope: SW (session-wide) + + Irrelevant when: RDMAExtensions=No + + MaxOutstandingUnexpectedPDUs=<numerical-value-from-2-to-(2**32-1) | + 0> + + Default is 0 + + This key is used by the initiator and the target to declare the + maximum number of outstanding "unexpected" iSCSI control-type PDUs + that each can receive in the Full Feature Phase. It is intended to + allow the receiving side to determine the amount of buffer resources + needed beyond the normal flow control mechanism available in iSCSI. + An initiator or target should select a value such that it would not + impose an unnecessary constraint on the iSCSI layer under normal + circumstances. The value of 0 is defined to indicate that the + declarer has no limit on the maximum number of outstanding + "unexpected" iSCSI control-type PDUs that it can receive. See + Sections 8.1.1 and 8.1.2 for the usage of this key. Note that iSER + Hello and HelloReply Messages are not iSCSI control-type PDUs and are + not affected by this key. + +7. iSCSI PDU Considerations + + When a connection is in the iSER-assisted mode, two types of message + transfers are allowed between the iSCSI layer at the initiator and + the iSCSI layer at the target. 
These are known as the iSCSI data- + type PDUs and the iSCSI control-type PDUs, and these terms are + described in the following sections. + +7.1. iSCSI Data-Type PDU + + An iSCSI data-type PDU is defined as an iSCSI PDU that causes data + transfer, transparent to the remote iSCSI layer, to take place + between the peer iSCSI nodes in the full feature phase of an + iSCSI/iSER connection. An iSCSI data-type PDU, when requested for + transmission by the iSCSI layer in the sending node, results in the + data being transferred without the participation of the iSCSI layers + at the sending and the receiving nodes. This is due to the fact that + the PDU itself is not delivered as-is to the iSCSI layer in the + receiving node. Instead, the data transfer operations are + transformed into the appropriate RDMA operations that are handled by + the RDMA-Capable Controller. The set of iSCSI data-type PDUs + consists of SCSI Data-in PDUs and R2T PDUs. + + If the invocation of the Operational Primitive by the iSCSI layer to + request that the iSER layer process an iSCSI data-type PDU is + qualified with Notify_Enable set, then upon completing the RDMA + operation, the iSER layer at the target MUST notify the iSCSI layer + at the target by invoking the Data_Completion_Notify Operational + Primitive qualified with ITT and SN. There is no data completion + notification at the initiator since the RDMA operations are + completely handled by the RDMA-Capable Controller at the initiator + and the iSER layer at the initiator is not involved with the data + transfer associated with iSCSI data-type PDUs. + + + + +Ko, et al. 
Standards Track [Page 39] + +RFC 5046 iSER Specification October 2007 + + + If the invocation of the Operational Primitive by the iSCSI layer to + request that the iSER layer process an iSCSI data-type PDU is + qualified with Notify_Enable cleared, then upon completing the RDMA + operation, the iSER layer at the target MUST NOT notify the iSCSI + layer at the target and MUST NOT invoke the Data_Completion_Notify + Operational Primitive. + + If an operation associated with an iSCSI data-type PDU fails for any + reason, the contents of the Data Sink buffers associated with the + operation are considered indeterminate. + +7.2. iSCSI Control-Type PDU + + Any iSCSI PDU that is not an iSCSI data-type PDU and also not a SCSI + Data-out PDU carrying solicited data is defined as an iSCSI control- + type PDU. The iSCSI layer invokes the Send_Control Operational + Primitive to request that the iSER layer process an iSCSI control- + type PDU. iSCSI control-type PDUs are transferred using Send Message + Types of RCaP. Specifically, note that SCSI Data-out PDUs carrying + unsolicited data are defined as iSCSI control-type PDUs. See Section + 7.3.4 on the treatment of SCSI Data-out PDUs. + + When the iSER layer receives an iSCSI control-type PDU, it MUST + notify the iSCSI layer by invoking the Control_Notify Operational + Primitive qualified with the iSCSI control-type PDU. + +7.3. iSCSI PDUs + + This section describes the handling of each of the iSCSI PDU types by + the iSER layer. The iSCSI layer requests that the iSER layer process + the iSCSI PDU by invoking the appropriate Operational Primitive. A + Connection_Handle MUST qualify each of these invocations. In + addition, BHS and the optional AHS of the iSCSI PDU as defined in + [RFC3720] MUST qualify each of the invocations. The qualifying + Connection_Handle, the BHS, and the AHS are not explicitly listed in + the subsequent sections. + +7.3.1. 
SCSI Command + + Type: control-type PDU + + PDU-specific qualifiers (for SCSI Write or bidirectional command): + ImmediateDataSize, UnsolicitedDataSize, DataDescriptorOut + + PDU-specific qualifiers (for SCSI read or bidirectional command): + DataDescriptorIn + + + + + +Ko, et al. Standards Track [Page 40] + +RFC 5046 iSER Specification October 2007 + + + The iSER layer at the initiator MUST send the SCSI command in a + SendSE Message to the target. + + For a SCSI Write or bidirectional command, the iSCSI layer at the + initiator MUST invoke the Send_Control Operational Primitive as + follows: + + * If there is immediate data to be transferred for the SCSI Write or + bidirectional command, the qualifier ImmediateDataSize MUST be + used to define the number of bytes of immediate unsolicited data + to be sent with the Write or bidirectional command, and the + qualifier DataDescriptorOut MUST be used to define the initiator's + I/O Buffer containing the SCSI Write data. + + * If there is unsolicited data to be transferred for the SCSI Write + or bidirectional command, the qualifier UnsolicitedDataSize MUST + be used to define the number of bytes of immediate and non- + immediate unsolicited data for the command. The iSCSI layer will + issue one or more SCSI Data-out PDUs for the non-immediate + unsolicited data. See Section 7.3.4 on SCSI Data-out. + + * If there is solicited data to be transferred for the SCSI write or + bidirectional command, as indicated by the Expected Data Transfer + Length in the SCSI Command PDU exceeding the value of + UnsolicitedDataSize, the iSER layer at the initiator MUST do the + following: + + a. It MUST allocate a Write STag for the I/O Buffer defined by + the qualifier DataDescriptorOut. The DataDescriptorOut + describes the I/O buffer starting with the immediate + unsolicited data (if any), followed by the non-immediate + unsolicited data (if any) and solicited data. 
This means + that the BufferOffset for the SCSI Data-out for this + command is equal to the TO. This implies that a zero TO + for this STag points to the beginning of this I/O Buffer. + + b. It MUST establish a Local Mapping that associates the + Initiator Task Tag (ITT) to the Write STag. + + c. It MUST Advertise the Write STag to the target by sending + it as the Write STag in the iSER header of the iSER Message + (the payload of the SendSE Message of RCaP) containing the + SCSI write or bidirectional command PDU. See Section 9.2 + on iSER Header Format for the iSCSI Control-Type PDU. + + For a SCSI read or bidirectional command, the iSCSI layer at the + initiator MUST invoke the Send_Control Operational Primitive + qualified with DataDescriptorIn, which defines the initiator's I/O + + + +Ko, et al. Standards Track [Page 41] + +RFC 5046 iSER Specification October 2007 + + + Buffer for receiving the SCSI Read data. The iSER layer at the + initiator MUST do the following: + + a. It MUST allocate a Read STag for the I/O Buffer. + + b. It MUST establish a Local Mapping that associates the + Initiator Task Tag (ITT) to the Read STag. + + c. It MUST Advertise the Read STag to the target by sending it + as the Read STag in the iSER header of the iSER Message + (the payload of the SendSE Message of RCaP) containing the + SCSI read or bidirectional command PDU. See Section 9.2 on + iSER Header Format for the iSCSI Control-Type PDU. + + If the amount of unsolicited data to be transferred in a SCSI command + exceeds TargetRecvDataSegmentLength, then the iSCSI layer at the + initiator MUST segment the data into multiple iSCSI control-type + PDUs, with the data segment length in all PDUs generated except the + last one having exactly the size TargetRecvDataSegmentLength. The + data segment length of the last iSCSI control-type PDU carrying the + unsolicited data can be up to TargetRecvDataSegmentLength. 
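
The segmentation rule above can be sketched as follows. This is an illustration only: the function name is hypothetical, the data is modeled as a byte string, and other limits that apply to unsolicited data in iSCSI (such as FirstBurstLength) are deliberately left out.

```python
from typing import List


def segment_unsolicited(data: bytes, target_recv_dsl: int) -> List[bytes]:
    """Split unsolicited data into data segments for control-type PDUs.

    Every segment except the last is exactly target_recv_dsl bytes long;
    the last segment carries the remainder and may be shorter.
    """
    if target_recv_dsl <= 0:
        raise ValueError("TargetRecvDataSegmentLength must be positive")
    return [data[offset:offset + target_recv_dsl]
            for offset in range(0, len(data), target_recv_dsl)]
```

With the default TargetRecvDataSegmentLength of 8192 bytes, 20000 bytes of unsolicited data would be carried as two full segments of 8192 bytes followed by a final segment of 3616 bytes.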
+ + When the iSER layer at the target receives the SCSI command, it MUST + establish a Remote Mapping that associates the ITT to the Advertised + Write STag and the Read STag if present in the iSER header. The + Write STag is used by the iSER layer at the target in handling the + data transfer associated with the R2T PDU(s) as described in Section + 7.3.6. The Read STag is used in handling the SCSI Data-in PDU(s) + from the iSCSI layer at the target as described in Section 7.3.5. + +7.3.2. SCSI Response + + Type: control-type PDU + + PDU-specific qualifiers: DataDescriptorStatus + + The iSCSI layer at the target MUST invoke the Send_Control + Operational Primitive qualified with DataDescriptorStatus, which + defines the buffer containing the sense and response information. + The iSCSI layer at the target MUST always return the SCSI status for + a SCSI command in a separate SCSI Response PDU. "Phase collapse" for + transferring SCSI status in a SCSI Data-in PDU MUST NOT be used. The + iSER layer at the target sends the SCSI Response PDU according to the + following rules: + + * If no STags are Advertised by the initiator in the iSER Message + containing the SCSI command PDU, then the iSER layer at the target + MUST send a SendSE Message containing the SCSI Response PDU. + + + +Ko, et al. Standards Track [Page 42] + +RFC 5046 iSER Specification October 2007 + + + * If the initiator Advertised a Read STag in the iSER Message + containing the SCSI Command PDU, then the iSER layer at the target + MUST send a SendInvSE Message containing the SCSI Response PDU. + The header of the SendInvSE Message MUST carry the Read STag to be + invalidated at the initiator. + + * If the initiator Advertised only the Write STag in the iSER + Message containing the SCSI Command PDU, then the iSER layer at + the target MUST send a SendInvSE Message containing the SCSI + Response PDU. The header of the SendInvSE Message MUST carry the + Write STag to be invalidated at the initiator. 
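
The three rules above amount to a small decision function at the target. The sketch below is illustrative only: the function name is hypothetical, STags are modeled as integers, and an STag that was not Advertised is represented by None.

```python
from typing import Optional, Tuple


def select_response_message(read_stag: Optional[int],
                            write_stag: Optional[int]
                            ) -> Tuple[str, Optional[int]]:
    """Choose the Send Message Type for a SCSI Response PDU.

    Returns the message type and the STag carried in the message header for
    invalidation at the initiator (None when no STag was Advertised).
    """
    if read_stag is not None:
        # A Read STag was Advertised (alone or together with a Write STag).
        return ("SendInvSE", read_stag)
    if write_stag is not None:
        # Only the Write STag was Advertised.
        return ("SendInvSE", write_stag)
    return ("SendSE", None)
```

Because the header carries a single STag, a bidirectional command that Advertised both STags leaves the Write STag to be invalidated explicitly by the iSER layer at the initiator.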
+ + When the iSCSI layer at the target invokes the Send_Control + Operational Primitive to send the SCSI Response PDU, the iSER layer + at the target MUST invalidate the Remote Mapping that associates the + ITT to the Advertised STag(s) before transferring the SCSI Response + PDU to the initiator. + + Upon receiving the SendInvSE Message containing the SCSI Response PDU + from the target, the RCaP layer at the initiator will invalidate the + STag specified in the header. The iSER layer at the initiator MUST + ensure that the correct STag is invalidated. If both the Read and + the Write STags were Advertised earlier by the initiator, then the + iSER layer at the initiator MUST explicitly invalidate the Write STag + upon receiving the SendInvSE Message because the header of the + SendInvSE Message can only carry one STag (in this case, the Read + STag) to be invalidated. + + The iSER layer at the initiator MUST ensure the invalidation of the + STag(s) used in a command before notifying the iSCSI layer at the + initiator by invoking the Control_Notify Operational Primitive + qualified with the SCSI Response. This precludes the possibility of + the STag(s) being used after the completion of the command, which + could otherwise cause data corruption. + + When the iSER layer at the initiator receives the SendSE or the + SendInvSE Message containing the SCSI Response PDU, it SHOULD + invalidate the Local Mapping that associates the ITT to the local + STag(s). The iSER layer MUST ensure that all local STag(s) + associated with the ITT are invalidated before notifying the iSCSI + layer of the SCSI Response PDU by invoking the Control_Notify + Operational Primitive qualified with the SCSI Response PDU. + +7.3.3. 
Task Management Function Request/Response + + Type: control-type PDU + + PDU-specific qualifiers (for TMF Request): DataDescriptorOut, + DataDescriptorIn + + The iSER layer MUST use a SendSE Message to send the Task Management + Function Request/Response PDU. + + For the Task Management Function Request with the TASK REASSIGN + function, the iSER layer at the initiator MUST do the following: + + * It MUST use the ITT as specified in the Referenced Task Tag from + the Task Management Function Request PDU to locate the existing + STag(s), if any, in the Local Mapping(s) that associates the ITT + to the local STag(s). + + * It MUST invalidate the existing STag(s), if any, and the Local + Mapping(s) that associates the ITT to the local STag(s). + + * It MUST allocate a Read STag for the I/O Buffer as defined by the + qualifier DataDescriptorIn if the Send_Control Operational + Primitive invocation is qualified with DataDescriptorIn. + + * It MUST allocate a Write STag for the I/O Buffer as defined by the + qualifier DataDescriptorOut if the Send_Control Operational + Primitive invocation is qualified with DataDescriptorOut. + + * If STags are allocated, it MUST establish a new Local Mapping(s) + that associate the ITT to the allocated STag(s). + + * It MUST Advertise the STags, if allocated, to the target in the + iSER header of the SendSE Message carrying the iSCSI PDU, as + described in Section 9.2. + + For the Task Management Function Request with the TASK REASSIGN + function for a SCSI read or bidirectional command, the iSCSI layer at + the initiator MUST set ExpDataSN to 0 since the data transfer and + acknowledgements happen transparently to the iSCSI layer at the + initiator. This provides the flexibility to the iSCSI layer at the + target to request transmission of only the unacknowledged data as + specified in [RFC3720]. 
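The initiator-side steps for TASK REASSIGN listed above (locate, invalidate, allocate, remap, Advertise) can be sketched as follows. This is an illustration under stated assumptions: the class InitiatorIser, the counter-based STag allocator, and the set of valid STags are hypothetical stand-ins for whatever the RCaP layer actually provides.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional, Set


@dataclass
class LocalStags:
    write: Optional[int] = None
    read: Optional[int] = None


class InitiatorIser:
    """Hypothetical model of the initiator's Local Mappings (ITT -> STags)."""

    def __init__(self) -> None:
        self.local_mapping: Dict[int, LocalStags] = {}
        self.valid_stags: Set[int] = set()
        self._ids = count(0x100)  # stand-in for a real STag allocator

    def _allocate_stag(self) -> int:
        stag = next(self._ids)
        self.valid_stags.add(stag)
        return stag

    def task_reassign(self, referenced_itt: int,
                      data_in: bool, data_out: bool) -> LocalStags:
        # Locate and invalidate the existing STag(s) and Local Mapping, if any.
        old = self.local_mapping.pop(referenced_itt, None)
        if old is not None:
            self.valid_stags.discard(old.write)
            self.valid_stags.discard(old.read)
        # Allocate new STags only for the qualifiers that are present.
        new = LocalStags()
        if data_in:
            new.read = self._allocate_stag()
        if data_out:
            new.write = self._allocate_stag()
        # Establish a new Local Mapping only if something was allocated.
        if new.read is not None or new.write is not None:
            self.local_mapping[referenced_itt] = new
        # The caller would Advertise these STags in the iSER header.
        return new
```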
+ + When the iSER layer at the target receives the Task Management + Function Request with the TASK REASSIGN function, it MUST do the + following: + + + + +Ko, et al. Standards Track [Page 44] + +RFC 5046 iSER Specification October 2007 + + + * It MUST use the ITT as specified in the Referenced Task Tag from + the Task Management Function Request PDU to locate the mappings + that associate the ITT to the Advertised STag(s) and the local + STag(s), if any. + + * It MUST invalidate the local STag(s), if any, associated with the + ITT. + + * It MUST replace the Advertised STag(s) in the Remote Mapping that + associates the ITT to the Advertised STag(s) with the Write STag + and the Read STag if present in the iSER header. The Write STag + is used in the handling of the R2T PDU(s) from the iSCSI layer at + the target as described in Section 7.3.6. The Read STag is used + in the handling of the SCSI Data-in PDU(s) from the iSCSI layer at + the target as described in Section 7.3.5. + +7.3.4. SCSI Data-Out + + Type: control-type PDU + + PDU-specific qualifiers: DataDescriptorOut + + The iSCSI layer at the initiator MUST invoke the Send_Control + Operational Primitive qualified with DataDescriptorOut, which defines + the initiator's I/O Buffer containing unsolicited SCSI Write data. + + If the amount of unsolicited data to be transferred as SCSI Data-out + exceeds TargetRecvDataSegmentLength, then the iSCSI layer at the + initiator MUST segment the data into multiple iSCSI control-type + PDUs, with the DataSegmentLength having the value of + TargetRecvDataSegmentLength in all PDUs generated except the last + one. The DataSegmentLength of the last iSCSI control-type PDU + carrying the unsolicited data can be up to + TargetRecvDataSegmentLength. The iSCSI layer at the target MUST + perform the reassembly function for the unsolicited data. 
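The segmentation rule for unsolicited data can be illustrated with a short sketch. The function name segment_unsolicited and the tuple layout are assumptions for illustration; marking the last PDU of the sequence as final follows the usual iSCSI convention for the F bit and is also an assumption of this sketch.

```python
from typing import List, Tuple


def segment_unsolicited(total_len: int,
                        target_recv_dsl: int) -> List[Tuple[int, int, bool]]:
    """Split unsolicited data into (buffer_offset, data_segment_length, final).

    Every PDU except the last carries exactly target_recv_dsl bytes;
    the last carries whatever remains (at most target_recv_dsl).
    """
    if target_recv_dsl <= 0:
        raise ValueError("TargetRecvDataSegmentLength must be positive")
    pdus = []
    offset = 0
    while offset < total_len:
        length = min(target_recv_dsl, total_len - offset)
        final = offset + length == total_len
        pdus.append((offset, length, final))
        offset += length
    return pdus
```

For example, 10 bytes with a TargetRecvDataSegmentLength of 4 yields segments of 4, 4, and 2 bytes, with only the last marked final.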
+ + For unsolicited data, if the F bit is set to 0 in a SCSI Data-out + PDU, the iSER layer at the initiator MUST use a Send Message to send + the SCSI Data-out PDU. If the F bit is set to 1, the iSER layer at + the initiator MUST use a SendSE Message to send the SCSI Data-out + PDU. + + Note that for solicited data, the SCSI Data-out PDUs are not used + since R2T PDUs are not delivered to the iSCSI layer at the initiator; + instead, R2T PDUs are transformed by the iSER layer at the target + into RDMA Read operations. (See Section 7.3.6.) + + + + + +Ko, et al. Standards Track [Page 45] + +RFC 5046 iSER Specification October 2007 + + +7.3.5. SCSI Data-In + + Type: data-type PDU + + PDU-specific qualifiers: DataDescriptorIn + + When the iSCSI layer at the target is ready to return the SCSI Read + data to the initiator, it MUST invoke the Put_Data Operational + Primitive qualified with DataDescriptorIn, which defines the SCSI + Data-in buffer. See Section 7.1 on the general requirement on the + handling of iSCSI data-type PDUs. SCSI Data-in PDU(s) are used in + SCSI Read data transfer as described in Section 9.5.2. + + The iSER layer at the target MUST do the following for each + invocation of the Put_Data Operational Primitive: + + 1. It MUST use the ITT in the SCSI Data-in PDU to locate the remote + Read STag in the Remote Mapping that associates the ITT to + Advertised STag(s). The Remote Mapping was established earlier + by the iSER layer at the target when the SCSI read command was + received from the initiator. + + 2. It MUST generate and send an RDMA Write Message containing the + read data to the initiator. + + a. It MUST use the remote Read STag as the Data Sink STag of the + RDMA Write Message. + + b. It MUST use the Buffer Offset from the SCSI Data-in PDU as + the Data Sink Tagged Offset of the RDMA Write Message. + + c. It MUST use DataSegmentLength from the SCSI Data-in PDU to + determine the amount of data to be sent in the RDMA Write + Message. + + 3. 
It MUST associate DataSN and ITT from the SCSI Data-in PDU with + the RDMA Write operation. If the Put_Data Operational Primitive + invocation was qualified with Notify_Enable set, then when the + iSER layer at the target receives a completion from the RCaP + layer for the RDMA Write Message, the iSER layer at the target + MUST notify the iSCSI layer by invoking the + Data_Completion_Notify Operational Primitive qualified with + DataSN and ITT. Conversely, if the Put_Data Operational + Primitive invocation was qualified with Notify_Enable cleared, + then the iSER layer at the target MUST NOT notify the iSCSI layer + on completion and MUST NOT invoke the Data_Completion_Notify + Operational Primitive. + + + + +Ko, et al. Standards Track [Page 46] + +RFC 5046 iSER Specification October 2007 + + + When the A-bit is set to 1 in the SCSI Data-in PDU, the iSER layer at + the target MUST notify the iSCSI layer at the target when the data + transfer is complete at the initiator. To perform this additional + function, the iSER layer at the target can take advantage of the + operational ErrorRecoveryLevel if previously disclosed by the iSCSI + layer via an earlier invocation of the Notice_Key_Values Operational + Primitive. There are two approaches that can be taken: + + 1. If the iSER layer at the target knows that the operational + ErrorRecoveryLevel is 2, or if the iSER layer at the target does + not know the operational ErrorRecoveryLevel, then the iSER layer + at the target MUST issue a zero-length RDMA Read Request Message + following the RDMA Write Message. 
When the iSER layer at the + target receives a completion for the RDMA Read Request Message + from the RCaP layer, implying that the RDMA-Capable Controller at + the initiator has completed processing the RDMA Write Message due + to the completion ordering semantics of RCaP, the iSER layer at + the target MUST notify the iSCSI layer at the target by invoking + the Data_Ack_Notify Operational Primitive qualified with ITT and + DataSN (see Section 3.2.3). + + 2. If the iSER layer at the target knows that the operational + ErrorRecoveryLevel is 1, then the iSER layer at the target MUST + do one of the following: + + a. It MUST notify the iSCSI layer at the target by invoking the + Data_Ack_Notify Operational Primitive qualified with ITT and + DataSN (see Section 3.2.3) when it receives the local + completion from the RCaP layer for the RDMA Write Message. + This is allowed since digest errors do not occur in iSER (see + Section 10.1.4.2) and a CRC error will cause the connection + to be terminated and the task to be terminated anyway. The + local RDMA Write completion from the RCaP layer guarantees + that the RCaP layer will not access the I/O Buffer again to + transfer the data associated with that RDMA Write operation. + + b. Alternatively, it MUST use the same procedure for handling + the data transfer completion at the initiator as for + ErrorRecoveryLevel 2. + + Note that the iSCSI layer at the target cannot set the A-bit to 1 if + the ErrorRecoveryLevel=0. + + The SCSI status MUST always be returned in a separate SCSI Response + PDU. The S bit in the SCSI Data-in PDU MUST always be set to 0. + There MUST NOT be a "phase collapse" in the SCSI Data-in PDU. + + + + + +Ko, et al. 
Standards Track [Page 47] + +RFC 5046 iSER Specification October 2007 + + + Since the RDMA Write Message only transfers the data portion of the + SCSI Data-in PDU but not the control information in the header, such + as ExpCmdSN, if timely updates of such information are crucial, the + iSCSI layer at the initiator MAY issue NOP-Out PDUs to request that + the iSCSI layer at the target respond with the information using NOP- + In PDUs. + +7.3.6. Ready to Transfer (R2T) + + Type: data-type PDU + + PDU-specific qualifiers: DataDescriptorOut + + In order to send an R2T PDU, the iSCSI layer at the target MUST + invoke the Get_Data Operational Primitive qualified with + DataDescriptorOut, which defines the I/O Buffer for receiving the + SCSI Write data from the initiator. See Section 7.1 on the general + requirements on the handling of iSCSI data-type PDUs. + + The iSER layer at the target MUST do the following for each + invocation of the Get_Data Operational Primitive: + + 1. It MUST ensure a valid local STag for the I/O Buffer and a valid + Local Mapping that associates the Initiator Task Tag (ITT) to the + local STag. This may involve allocating a valid local STag and + establishing a Local Mapping. + + 2. It MUST use the ITT in the R2T to locate the remote Write STag in + the Remote Mapping that associates the ITT to Advertised STag(s). + The Remote Mapping is established earlier by the iSER layer at + the target when the iSER Message containing the Advertised Write + STag and the SCSI Command PDU for a SCSI write or bidirectional + command is received from the initiator. + + 3. If the iSER-ORD value at the target is set to 0, the iSER layer + at the target MUST terminate the connection and free up the + resources associated with the connection (as described in Section + 5.2.3) if it receives the R2T PDU from the iSCSI layer at the + target. 
Upon termination of the connection, the iSER layer at + the target MUST notify the iSCSI layer at the target by invoking + the Connection_Terminate_Notify Operational Primitive. + + 4. If the iSER-ORD value at the target is set to greater than 0, the + iSER layer at the target MUST transform the R2T PDU into an RDMA + Read Request Message. While transforming the R2T PDU, the iSER + layer at the target MUST ensure that the number of outstanding + RDMA Read Request Messages does not exceed the iSER-ORD value. + To transform the R2T PDU, the iSER layer at the target: + + + +Ko, et al. Standards Track [Page 48] + +RFC 5046 iSER Specification October 2007 + + + a. MUST derive the local STag and local Tagged Offset from the + DataDescriptorOut that qualified the Get_Data invocation. + + b. MUST use the local STag as the Data Sink STag of the RDMA + Read Request Message. + + c. MUST use the local Tagged Offset as the Data Sink Tagged + Offset of the RDMA Read Request Message. + + d. MUST use the Desired Data Transfer Length from the R2T PDU as + the RDMA Read Message Size of the RDMA Read Request Message. + + e. MUST use the remote Write STag as the Data Source STag of the + RDMA Read Request Message. + + f. MUST use the Buffer Offset from the R2T PDU as the Data + Source Tagged Offset of the RDMA Read Request Message. + + 5. It MUST associate R2TSN and ITT from the R2T PDU with the RDMA + Read operation. If the Get_Data Operational Primitive invocation + is qualified with Notify_Enable set, then when the iSER layer at + the target receives a completion from the RCaP layer for the RDMA + Read operation, the iSER layer at the target MUST notify the + iSCSI layer by invoking the Data_Completion_Notify Operational + Primitive qualified with R2TSN and ITT. 
Conversely, if the + Get_Data Operational Primitive invocation is qualified with + Notify_Enable cleared, then the iSER layer at the target MUST NOT + notify the iSCSI layer on completion and MUST NOT invoke the + Data_Completion_Notify Operational Primitive. + + When the RCaP layer at the initiator receives a valid RDMA Read + Request Message, it will return an RDMA Read Response Message + containing the solicited write data to the target. When the RCaP + layer at the target receives the RDMA Read Response Message from the + initiator, it will place the solicited data in the I/O Buffer + referenced by the Data Sink STag in the RDMA Read Response Message. + + Since the RDMA Read Request Message from the target does not transfer + the control information in the R2T PDU, such as ExpCmdSN, if timely + updates of such information are crucial, the iSCSI layer at the + initiator MAY issue NOP-Out PDUs to request that the iSCSI layer at + the target respond with the information using NOP-In PDUs. + + Similarly, since the RDMA Read Response Message from the initiator + only transfers the data but not the control information normally + found in the SCSI Data-out PDU, such as ExpStatSN, if timely updates + of such information are crucial, the iSCSI layer at the target MAY + issue NOP-In PDUs to request that the iSCSI layer at the initiator + respond with the information using NOP-Out PDUs. + +7.3.7. Asynchronous Message + + Type: control-type PDU + + PDU-specific qualifiers: DataDescriptorSense + + The iSCSI layer MUST invoke the Send_Control Operational Primitive + qualified with DataDescriptorSense, which defines the buffer + containing the sense and iSCSI Event information. The iSER layer + MUST use a SendSE Message to send the Asynchronous Message PDU. + +7.3.8.
Text Request and Text Response + + Type: control-type PDU + + PDU-specific qualifiers: DataDescriptorTextOut (for Text + Request), DataDescriptorIn (for Text Response) + + The iSCSI layer MUST invoke the Send_Control Operational Primitive + qualified with DataDescriptorTextOut (or DataDescriptorIn), which + defines the Text Request (or Text Response) buffer. The iSER layer + MUST use SendSE Messages to send the Text Request (or Text Response) + PDUs. + +7.3.9. Login Request and Login Response + + During the login negotiation, the iSCSI layer interacts with the + transport layer directly and the iSER layer is not involved. See + Section 5.1 on iSCSI/iSER connection setup. If the underlying + transport is TCP, the Login Request PDUs and the Login Response PDUs + are exchanged when the connection between the initiator and the + target is still in the byte stream mode. + + The iSCSI layer MUST NOT send a Login Request (or a Login Response) + PDU during the Full Feature Phase. A Login Request (or a Login + Response) PDU, if used, MUST be treated as an iSCSI protocol error. + The iSER layer MAY reject such a PDU from the iSCSI layer with an + appropriate error code. If a Login Request PDU is received by the + iSCSI layer at the target, it MUST respond with a Reject PDU with a + reason code of "protocol error". + +7.3.10. Logout Request and Logout Response + + Type: control-type PDU + + PDU-specific qualifiers: None + + The iSER layer MUST use a SendSE Message to send the Logout Request + or Logout Response PDU. Sections 5.2.1 and 5.2.2 describe the + handling of the Logout Request and the Logout Response at the + initiator and the target and the interactions between the initiator + and the target to terminate a connection. + +7.3.11.
SNACK Request + + Since HeaderDigest and DataDigest must be negotiated to "None", there + are no digest errors when the connection is in iSER-assisted mode. + Also, since RCaP delivers all messages in the order they were sent, + there are no sequence errors when the connection is in iSER-assisted + mode. Therefore, the iSCSI layer MUST NOT send SNACK Request PDUs. + A SNACK Request PDU, if used, MUST be treated as an iSCSI protocol + error. The iSER layer MAY reject such a PDU from the iSCSI layer + with an appropriate error code. If a SNACK Request PDU is received + by the iSCSI layer at the target, it MUST respond with a Reject PDU + with a reason code of "protocol error". + +7.3.12. Reject + + Type: control-type PDU + + PDU-specific qualifiers: DataDescriptorReject + + The iSCSI layer MUST invoke the Send_Control Operational Primitive + qualified with DataDescriptorReject, which defines the Reject buffer. + The iSER layer MUST use a SendSE Message to send the Reject PDU. + +7.3.13. NOP-Out and NOP-In + + Type: control-type PDU + + PDU-specific qualifiers: DataDescriptorNOPOut (for NOP-Out), + DataDescriptorNOPIn (for NOP-In) + + The iSCSI layer MUST invoke the Send_Control Operational Primitive + qualified with DataDescriptorNOPOut (or DataDescriptorNOPIn), which + defines the Ping (or Return Ping) data buffer. The iSER layer MUST + use SendSE Messages to send the NOP-Out (or NOP-In) PDU. + +8. Flow Control and STag Management + +8.1. Flow Control for RDMA Send Message Types + + Send Message Types in RCaP are used by the iSER layer to transfer + iSCSI control-type PDUs. Each Send Message Type in RCaP consumes an + Untagged Buffer at the Data Sink. However, neither the RCaP layer + nor the iSER layer provides an explicit flow control mechanism for + the Send Message Types.
Therefore, the iSER layer SHOULD provision + enough Untagged buffers for handling incoming Send Message Types to + prevent buffer exhaustion at the RCaP layer. If buffer exhaustion + occurs, it may result in the termination of the connection. + + An implementation may choose to satisfy the buffer requirement by + using a common buffer pool shared across multiple connections, with + usage limits on a per-connection basis and usage limits on the buffer + pool itself. In such an implementation, exceeding the buffer usage + limit for a connection or the buffer pool itself may trigger + interventions from the iSER layer to replenish the buffer pool and/or + to isolate the connection causing the problem. + + iSER also provides the MaxOutstandingUnexpectedPDUs key to be used by + the initiator and the target to declare the maximum number of + outstanding "unexpected" control-type PDUs that each can receive. It + is intended to allow the receiving side to determine the amount of + buffer resources needed beyond the normal flow control mechanism + available in iSCSI. + + The buffer resources required at both the initiator and the target as + a result of control-type PDUs sent by the initiator are described in + Section 8.1.1. The buffer resources required at both the initiator + and target as a result of control-type PDUs sent by the target are + described in Section 8.1.2. + +8.1.1. Flow Control for Control-Type PDUs from the Initiator + + The control-type PDUs that can be sent by an initiator to a target + can be grouped into the following categories: + + 1. Regulated: Control-type PDUs in this category are regulated by + the iSCSI CmdSN window mechanism and the immediate flag is not + set. + + 2. Unregulated but Expected: Control-type PDUs in this category are + not regulated by the iSCSI CmdSN window mechanism but are + expected by the target. + + 3.
Unregulated and Unexpected: Control-type PDUs in this category + are not regulated by the iSCSI CmdSN window mechanism and are + "unexpected" by the target. + +8.1.1.1. Control-Type PDUs from the Initiator in the Regulated Category + + Control-type PDUs that can be sent by the initiator in this category + are regulated by the iSCSI CmdSN window mechanism and the immediate + flag is not set. + + The queuing capacity required of the iSCSI layer at the target is + described in Section 3.2.2.1 of [RFC3720]. For each of the control- + type PDUs that can be sent by the initiator in this category, the + initiator MUST provision for the buffer resources required for the + corresponding control-type PDU sent as a response from the target. + The following is a list of the PDUs that can be sent by the initiator + and the PDUs that are sent by the target in response: + + a. When an initiator sends a SCSI Command PDU, it expects a SCSI + Response PDU from the target. + + b. When the initiator sends a Task Management Function Request + PDU, it expects a Task Management Function Response PDU from + the target. + + c. When the initiator sends a Text Request PDU, it expects a + Text Response PDU from the target. + + d. When the initiator sends a Logout Request PDU, it expects a + Logout Response PDU from the target. + + e. When the initiator sends a NOP-Out PDU as a ping request with + ITT != 0xffffffff and TTT = 0xffffffff, it expects a NOP-In + PDU from the target with the same ITT and TTT as in the ping + request. + + The response from the target for any of the PDUs enumerated here may + alternatively be in the form of a Reject PDU sent instead before the + task is active, as described in Section 6.3 of [RFC3720]. + +8.1.1.2. Control-Type PDUs from the Initiator in the Unregulated but + Expected Category + + For the control-type PDUs in the Unregulated but Expected category, + the amount of buffering resources required at the target can be + predetermined. 
The following is a list of the PDUs in this category: + + a. SCSI Data-out PDUs are used by the initiator to send + unsolicited data. The amount of buffer resources required by + the target can be determined using FirstBurstLength. Note + that SCSI Data-out PDUs are not used for solicited data since + the R2T PDU that is used for solicitation is transformed into + RDMA Read operations by the iSER layer at the target. See + Section 7.3.4. + + b. A NOP-Out PDU with TTT != 0xffffffff is sent as a ping + response by the initiator to the NOP-In PDU sent as a ping + request by the target. + +8.1.1.3. Control-Type PDUs from the Initiator in the Unregulated and + Unexpected Category + + PDUs in the Unregulated and Unexpected category are PDUs with the + immediate flag set. The number of PDUs in this category that can be + sent by an initiator is controlled by the value of + MaxOutstandingUnexpectedPDUs declared by the target (see Section + 6.7). After a PDU in this category is sent by the initiator, it is + outstanding until it is retired. At any time, the number of + outstanding unexpected PDUs MUST NOT exceed the value of + MaxOutstandingUnexpectedPDUs declared by the target. + + The target uses the value of MaxOutstandingUnexpectedPDUs that it + declared to determine the amount of buffer resources required for + control-type PDUs in this category that can be sent by an initiator. + For the initiator, for each of the control-type PDUs that can be sent + in this category, the initiator MUST provision for the buffer + resources if required for the corresponding control-type PDU that can + be sent as a response from the target. + + An outstanding PDU in this category is retired as follows.
If the + CmdSN of the PDU sent by the initiator in this category is x, the PDU + is outstanding until the initiator sends a non-immediate control-type + PDU on the same connection with CmdSN = y (where y is at least x) and + the target responds with a control-type PDU on any connection where + ExpCmdSN is at least y+1. + + When the number of outstanding unexpected control-type PDUs equals + MaxOutstandingUnexpectedPDUs, the iSCSI layer at the initiator MUST + NOT generate any unexpected PDUs that otherwise it would have + generated, even if it is intended for immediate delivery. + +8.1.2. Flow Control for Control-Type PDUs from the Target + + Control-type PDUs that can be sent by a target and are expected by + the initiator are listed in the Regulated category (see Section + 8.1.1.1). + + For the control-type PDUs that can be sent by a target and are + unexpected by the initiator, the number is controlled by + MaxOutstandingUnexpectedPDUs declared by the initiator (see Section + 6.7). After a PDU in this category is sent by a target, it is + outstanding until it is retired. At any time, the number of + outstanding unexpected PDUs MUST NOT exceed the value of + MaxOutstandingUnexpectedPDUs declared by the initiator. The + initiator uses the value of MaxOutstandingUnexpectedPDUs that it + declared to determine the amount of buffer resources required for + control-type PDUs in this category that can be sent by a target. The + following is a list of the PDUs in this category and the conditions + for retiring the outstanding PDU: + + a. For an Asynchronous Message PDU with StatSN = x, the PDU is + outstanding until the initiator sends a control-type PDU with + ExpStatSN set to at least x+1. + + b. For a Reject PDU with StatSN = x that is sent after a task is + active, the PDU is outstanding until the initiator sends a + control-type PDU with ExpStatSN set to at least x+1.
+ + c. For a NOP-In PDU with ITT = 0xffffffff and StatSN = x, the + PDU is outstanding until the initiator responds with a + control-type PDU on the same connection where ExpStatSN is at + least x+1. But if the NOP-In PDU is sent as a ping request + with TTT != 0xffffffff, the PDU can also be retired when the + initiator sends a NOP-Out PDU with the same ITT and TTT as in + the ping request. Note that when a target sends a NOP-In PDU + as a ping request, it must provision a buffer for the NOP-Out + PDU sent as a ping response from the initiator. + + When the number of outstanding unexpected control-type PDUs equals + MaxOutstandingUnexpectedPDUs, the iSCSI layer at the target MUST NOT + generate any unexpected PDUs that otherwise it would have generated, + even if its intent is to indicate an iSCSI error condition (e.g., + Asynchronous Message, Reject). Task timeouts, as in the initiator + waiting for a command completion or other connection and session + level exceptions, will ensure that correct operational behavior will + result in these cases despite not generating the PDU. This rule + overrides any other requirements elsewhere that require that a Reject + PDU MUST be sent. + + + + +Ko, et al. Standards Track [Page 55] + +RFC 5046 iSER Specification October 2007 + + + (Implementation note: A SCSI task timeout and recovery can be a + lengthy process and hence SHOULD be avoided by proper provisioning of + resources.) + + (Implementation note: To ensure that the initiator has a means to + inform the target that outstanding PDUs have been retired, the target + should reserve the last unexpected control-type PDU allowable by the + value of MaxOutstandingUnexpectedPDUs declared by the initiator for + sending a NOP-In ping request with TTT != 0xffffffff to allow the + initiator to return the NOP-Out ping response with the current + ExpStatSN.) + +8.2. 
Flow Control for RDMA Read Resources + + The total number of RDMA Read operations that can be active + simultaneously on an iSCSI/iSER connection depends on the amount of + resources allocated as declared in the iSER Hello exchange described + in Section 5.1.3. Exceeding the number of RDMA Read operations + allowed on a connection will result in the connection being + terminated by the RCaP layer. The iSER layer at the target maintains + the iSER-ORD to keep track of the maximum number of RDMA Read + Requests that can be issued by the iSER layer on a particular RCaP + Stream. + + During connection setup (see Section 5.1), iSER-IRD is known at the + initiator and iSER-ORD is known at the target after the iSER layers + at the initiator and the target have respectively allocated the + connection resources necessary to support RCaP, as directed by the + Allocate_Connection_Resources Operational Primitive from the iSCSI + layer before the end of the iSCSI Login Phase. In the Full Feature + Phase, the first message sent by the initiator is the iSER Hello + Message (see Section 9.3), which contains the value of iSER-IRD. In + response to the iSER Hello Message, the target sends the iSER + HelloReply Message (see Section 9.4), which contains the value of + iSER-ORD. The iSER layer at both the initiator and the target MAY + adjust (lower) the resources associated with iSER-IRD and iSER-ORD + respectively to match the iSER-ORD value declared in the HelloReply + Message. The iSER layer at the target MUST flow control the RDMA + Read Request Messages to not exceed the iSER-ORD value at the target. + +8.3. STag Management + + An STag, as defined in [RDMAP], is an identifier of a Tagged Buffer + used in an RDMA operation. The allocation and the subsequent + invalidation of the STags are specified in this document if the STags + are exposed on the wire by being Advertised in the iSER header or + declared in the header of an RCaP Message. + + + + +Ko, et al. 
Standards Track [Page 56] + +RFC 5046 iSER Specification October 2007 + + +8.3.1. Allocation of STags + + When the iSCSI layer at the initiator invokes the Send_Control + Operational Primitive to request that the iSER layer at the initiator + process a SCSI command, zero, one, or two STags may be allocated by + the iSER layer. See Section 7.3.1 for details. The number of STags + allocated depends on whether the command is unidirectional or + bidirectional and whether or not solicited write data transfer is + involved. + + When the iSCSI layer at the initiator invokes the Send_Control + Operational Primitive to request that the iSER layer at the initiator + process a Task Management Function Request with the TASK REASSIGN + function, besides allocating zero, one, or two STags, the iSER layer + MUST invalidate the existing STags, if any, associated with the ITT. + See Section 7.3.3 for details. + + The iSER layer at the target allocates a local Data Sink STag when + the iSCSI layer at the target invokes the Get_Data Operational + Primitive to request that the iSER layer process an R2T PDU. See + Section 7.3.6 for details. + +8.3.2. Invalidation of STags + + The invalidation of the STags at the initiator at the completion of a + unidirectional or bidirectional command when the associated SCSI + Response PDU is sent by the target is described in Section 7.3.2. + + When a unidirectional or bidirectional command concludes without the + associated SCSI Response PDU being sent by the target, the iSCSI + layer at the initiator MUST request that the iSER layer at the + initiator invalidate the STags by invoking the + Deallocate_Task_Resources Operational Primitive qualified with ITT. + In response, the iSER layer at the initiator MUST locate the STag(s) + (if any) in the Local Mapping that associates the ITT to the local + STag(s). The iSER layer at the initiator MUST invalidate the STag(s) + (if any) and the Local Mapping. 
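The initiator-side handling of Deallocate_Task_Resources described above amounts to a lookup followed by invalidation of both the STag(s) and the Local Mapping. The following is a rough sketch under stated assumptions: the class TaskStagTable and its register helper are hypothetical, and a real implementation would ask the RCaP layer to invalidate each STag rather than remove it from a set.

```python
from typing import Dict, Iterable, List, Set


class TaskStagTable:
    """Hypothetical Local Mapping table keyed by ITT."""

    def __init__(self) -> None:
        self.local_mapping: Dict[int, List[int]] = {}
        self.valid_stags: Set[int] = set()

    def register(self, itt: int, stags: Iterable[int]) -> None:
        # Record the local STag(s) allocated for this task.
        stags = list(stags)
        self.local_mapping[itt] = stags
        self.valid_stags.update(stags)

    def deallocate_task_resources(self, itt: int) -> List[int]:
        # Locate the STag(s), if any, and drop the Local Mapping.
        stags = self.local_mapping.pop(itt, [])
        # Invalidate each STag so it can no longer be used.
        for stag in stags:
            self.valid_stags.discard(stag)
        return stags
```

Calling deallocate_task_resources for an ITT with no mapping is a no-op, which mirrors the "(if any)" wording in the text.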
+ + For an RDMA Read operation used to realize a SCSI Write data + transfer, the iSER layer at the target SHOULD invalidate the Data + Sink STag at the conclusion of the RDMA Read operation referencing + the Data Sink STag (to permit the immediate reuse of buffer + resources). + + For an RDMA Write operation used to realize a SCSI Read data + transfer, the Data Source STag at the target is not declared to the + initiator and is not exposed on the wire. Invalidation of the STag + is thus not specified. + + + +Ko, et al. Standards Track [Page 57] + +RFC 5046 iSER Specification October 2007 + + + When a unidirectional or bidirectional command concludes without the + associated SCSI Response PDU being sent by the target, the iSCSI + layer at the target MUST request that the iSER layer at the target + invalidate the STags by invoking the Deallocate_Task_Resources + Operational Primitive qualified with ITT. In response, the iSER + layer at the target MUST locate the local STag(s) (if any) in the + Local Mapping that associates the ITT to the local STag(s). The iSER + layer at the target MUST invalidate the local STag(s) (if any) and + the mapping. + +9. iSER Control and Data Transfer + + For iSCSI data-type PDUs (see Section 7.1), the iSER layer uses RDMA + Read and RDMA Write operations to transfer the solicited data. For + iSCSI control-type PDUs (see Section 7.2), the iSER layer uses Send + Message Types of RCaP. + +9.1. iSER Header Format + + An iSER header MUST be present in every Send Message Type of RCaP. + The iSER header is located in the first 12 bytes of the message + payload of the Send Message Type of RCaP, as shown in Figure 2. 
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | Opcode|                Opcode Specific Fields                 |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                    Opcode Specific Fields                     |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                    Opcode Specific Fields                     |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                     Figure 2. iSER Header Format
+
+   Opcode - Operation Code: 4 bits
+
+      The Opcode field identifies the type of iSER Messages:
+
+         0001b = iSCSI control-type PDU
+
+         0010b = iSER Hello Message
+
+         0011b = iSER HelloReply Message
+
+         All other opcodes are reserved.
+
+9.2. iSER Header Format for the iSCSI Control-Type PDU
+
+   The iSER layer uses Send Message Types of RCaP to transfer iSCSI
+   control-type PDUs (see Section 7.2). The message payload of each of
+   the Send Message Types of RCaP used for transferring an iSER Message
+   contains an iSER Header followed by an iSCSI control-type PDU.
+
+   The iSER header in a Send Message Type of RCaP carrying an iSCSI
+   control-type PDU MUST have the format as described in Figure 3.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       |W|R|                                                   |
+   | 0001b |S|S|                     Reserved                      |
+   |       |V|V|                                                   |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                      Write STag (or N/A)                      |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                      Read STag (or N/A)                       |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+         Figure 3. iSER Header Format for iSCSI Control-Type PDU
+
+   WSV - Write STag Valid flag: 1 bit
+
+      This flag indicates the validity of the Write STag field of the
+      iSER Header. If set to one, the Write STag field in this iSER
+      Header is valid.
If set to zero, the Write STag field in this + iSER Header MUST be ignored at the receiver. The Write STag + Valid flag is set to one when there is solicited data to be + transferred for a SCSI write or bidirectional command, or when + there are non-immediate unsolicited and solicited data to be + transferred for the referenced task specified in a Task + Management Function Request with the TASK REASSIGN function. + + RSV - Read STag Valid flag: 1 bit + + This flag indicates the validity of the Read STag field of the + iSER Header. If set to one, the Read STag field in this iSER + Header is valid. If set to zero, the Read STag field in this + iSER Header MUST be ignored at the receiver. The Read STag Valid + flag is set to one for a SCSI read or bidirectional command, or + for a Task Management Function Request with the TASK REASSIGN + function. + + + + + + +Ko, et al. Standards Track [Page 59] + +RFC 5046 iSER Specification October 2007 + + + Write STag - Write Steering Tag: 32 bits + + This field contains the Write STag when the Write STag Valid flag + is set to one. For a SCSI write or bidirectional command, the + Write STag is used to Advertise the initiator's I/O Buffer + containing the solicited data. For a Task Management Function + Request with the TASK REASSIGN function, the Write STag is used + to Advertise the initiator's I/O Buffer containing the non- + immediate unsolicited data and solicited data. This Write STag + is used as the Data Source STag in the resultant RDMA Read + operation(s). When the Write STag Valid flag is set to zero, + this field MUST be set to zero. + + Read STag - Read Steering Tag: 32 bits + + This field contains the Read STag when the Read STag Valid flag + is set to one. The Read STag is used to Advertise the + initiator's Read I/O Buffer of a SCSI read or bidirectional + command, or of a Task Management Function Request with the TASK + REASSIGN function. 
This Read STag is used as the Data Sink STag
+      in the resultant RDMA Write operation(s). When the Read STag
+      Valid flag is zero, this field MUST be set to zero.
+
+   Reserved:
+
+      Reserved fields MUST be set to zero on transmit and MUST be
+      ignored on reception.
+
+9.3. iSER Header Format for the iSER Hello Message
+
+   An iSER Hello Message MUST only contain the iSER header, which MUST
+   have the format as described in Figure 4. The iSER Hello Message is
+   the first iSER Message sent on the RCaP Stream from the iSER layer at
+   the initiator to the iSER layer at the target.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       |       |       |       |                               |
+   | 0010b | Rsvd  | MaxVer| MinVer|           iSER-IRD            |
+   |       |       |       |       |                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           Reserved                            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           Reserved                            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+           Figure 4. iSER Header Format for iSER Hello Message
+
+   MaxVer - Maximum Version: 4 bits
+
+      This field specifies the maximum version of the iSER protocol
+      supported. It MUST be set to one to indicate the version of the
+      specification described in this document.
+
+   MinVer - Minimum Version: 4 bits
+
+      This field specifies the minimum version of the iSER protocol
+      supported. It MUST be set to one to indicate the version of the
+      specification described in this document.
+
+   iSER-IRD: 16 bits
+
+      This field contains the value of the iSER-IRD at the initiator.
+
+   Reserved (Rsvd):
+
+      Reserved fields MUST be set to zero on transmit, and MUST be
+      ignored on reception.
+
+9.4. iSER Header Format for the iSER HelloReply Message
+
+   An iSER HelloReply Message MUST only contain the iSER header which
+   MUST have the format as described in Figure 5.
The iSER HelloReply
+   Message is the first iSER Message sent on the RCaP Stream from the
+   iSER layer at the target to the iSER layer at the initiator.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |       |     |R|       |       |                               |
+   | 0011b |Rsvd |E| MaxVer| CurVer|           iSER-ORD            |
+   |       |     |J|       |       |                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           Reserved                            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           Reserved                            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+         Figure 5. iSER Header Format for iSER HelloReply Message
+
+   REJ - Reject flag: 1 bit
+
+      This flag indicates whether the target is rejecting this
+      connection. If set to one, the target is rejecting the
+      connection.
+
+   MaxVer - Maximum Version: 4 bits
+
+      This field specifies the maximum version of the iSER protocol
+      supported. It MUST be set to one to indicate the version of the
+      specification described in this document.
+
+   CurVer - Current Version: 4 bits
+
+      This field specifies the current version of the iSER protocol
+      supported. It MUST be set to one to indicate the version of the
+      specification described in this document.
+
+   iSER-ORD: 16 bits
+
+      This field contains the value of the iSER-ORD at the target.
+
+   Reserved (Rsvd):
+
+      Reserved fields MUST be set to zero on transmit, and MUST be
+      ignored on reception.
+
+9.5. SCSI Data Transfer Operations
+
+   The iSER layer at the initiator and the iSER layer at the target
+   handle each SCSI Write, SCSI Read, and bidirectional operation as
+   described below.
+
+9.5.1. SCSI Write Operation
+
+   The iSCSI layer at the initiator MUST invoke the Send_Control
+   Operational Primitive to request that the iSER layer at the initiator
+   send the SCSI write command.
The iSER layer at the initiator MUST + request that the RCaP layer transmit a SendSE Message with the + message payload consisting of the iSER header followed by the SCSI + Command PDU and immediate data (if any). If there is solicited data, + the iSER layer MUST Advertise the Write STag in the iSER header of + the SendSE Message, as described in Section 9.2. Upon receiving the + SendSE Message, the iSER layer at the target MUST notify the iSCSI + layer at the target by invoking the Control_Notify Operational + Primitive qualified with the SCSI Command PDU. See Section 7.3.1 for + details on the handling of the SCSI write command. + + For the non-immediate unsolicited data, the iSCSI layer at the + initiator MUST invoke a Send_Control Operational Primitive qualified + with the SCSI Data-out PDU. Upon receiving each Send or SendSE + Message containing the non-immediate unsolicited data, the iSER layer + at the target MUST notify the iSCSI layer at the target by invoking + the Control_Notify Operational Primitive qualified with the SCSI + + + +Ko, et al. Standards Track [Page 62] + +RFC 5046 iSER Specification October 2007 + + + Data-out PDU. See Section 7.3.4 for details on the handling of the + SCSI Data-out PDU. + + For the solicited data, when the iSCSI layer at the target has an I/O + Buffer available, it MUST invoke the Get_Data Operational Primitive + qualified with the R2T PDU. See Section 7.3.6 for details on the + handling of the R2T PDU. + + When the data transfer associated with this SCSI Write operation is + complete, the iSCSI layer at the target MUST invoke the Send_Control + Operational Primitive when it is ready to send the SCSI Response PDU. + Upon receiving a SendSE or SendInvSE Message containing the SCSI + Response PDU, the iSER layer at the initiator MUST notify the iSCSI + layer at the initiator by invoking the Control_Notify Operational + Primitive qualified with the SCSI Response PDU. 
See Section 7.3.2
+   for details on the handling of the SCSI Response PDU.
+
+9.5.2. SCSI Read Operation
+
+   The iSCSI layer at the initiator MUST invoke the Send_Control
+   Operational Primitive to request that the iSER layer at the initiator
+   send the SCSI read command. The iSER layer at the initiator MUST
+   request that the RCaP layer transmit a SendSE Message with the
+   message payload consisting of the iSER header followed by the SCSI
+   Command PDU. The iSER layer at the initiator MUST Advertise the Read
+   STag in the iSER header of the SendSE Message, as described in
+   Section 9.2. Upon receiving the SendSE Message, the iSER layer at
+   the target MUST notify the iSCSI layer at the target by invoking the
+   Control_Notify Operational Primitive qualified with the SCSI Command
+   PDU. See Section 7.3.1 for details on the handling of the SCSI read
+   command.
+
+   When the requested SCSI data is available in the I/O Buffer, the
+   iSCSI layer at the target MUST invoke the Put_Data Operational
+   Primitive qualified with the SCSI Data-in PDU. See Section 7.3.5 for
+   details on the handling of the SCSI Data-in PDU.
+
+   When the data transfer associated with this SCSI Read operation is
+   complete, the iSCSI layer at the target MUST invoke the Send_Control
+   Operational Primitive when it is ready to send the SCSI Response PDU.
+   Upon receiving the SendInvSE Message containing the SCSI Response
+   PDU, the iSER layer at the initiator MUST notify the iSCSI layer at
+   the initiator by invoking the Control_Notify Operational Primitive
+   qualified with the SCSI Response PDU. See Section 7.3.2 for details
+   on the handling of the SCSI Response PDU.
+
+9.5.3. Bidirectional Operation
+
+   The initiator and the target handle the SCSI Write and the SCSI Read
+   portions of this bidirectional operation the same as described in
+   Sections 9.5.1 and 9.5.2, respectively.
+
+10.
iSER Error Handling and Recovery + + RCaP provides the iSER layer with reliable in-order delivery. + Therefore, the error management needs of an iSER-assisted connection + are somewhat different than those of a Traditional iSCSI connection. + +10.1. Error Handling + + iSER error handling is described in the following sections, + classified loosely based on the sources of errors: + + 1. Those originating at the transport layer (e.g., TCP). + + 2. Those originating at the RCaP layer. + + 3. Those originating at the iSER layer. + + 4. Those originating at the iSCSI layer. + +10.1.1. Errors in the Transport Layer + + If the transport layer is TCP, then TCP packets with detected errors + are silently dropped by the TCP layer and result in retransmission at + the TCP layer. This has no impact on the iSER layer. However, + connection loss (e.g., link failure) and unexpected termination + (e.g., TCP graceful or abnormal close without the iSCSI Logout + exchanges) at the transport layer will cause the iSCSI/iSER + connection to be terminated as well. + +10.1.1.1. Failure in the Transport Layer before RCaP Mode Is Enabled + + If the connection is lost or terminated before the iSCSI layer + invokes the Allocate_Connection_Resources Operational Primitive, the + login process is terminated and no further action is required. + + If the connection is lost or terminated after the iSCSI layer has + invoked the Allocate_Connection_Resources Operational Primitive, then + the iSCSI layer MUST request that the iSER layer deallocate all + connection resources by invoking the Deallocate_Connection_Resources + Operational Primitive. + + + + + +Ko, et al. Standards Track [Page 64] + +RFC 5046 iSER Specification October 2007 + + +10.1.1.2. 
Failure in the Transport Layer after RCaP Mode Is Enabled + + If the connection is lost or terminated after the iSCSI layer has + invoked the Enable_Datamover Operational Primitive, the iSER layer + MUST notify the iSCSI layer of the connection loss by invoking the + Connection_Terminate_Notify Operational Primitive. Prior to invoking + the Connection_Terminate_Notify Operational Primitive, the iSER layer + MUST perform the actions described in Section 5.2.3.2. + +10.1.2. Errors in the RCaP Layer + + The RCaP layer does not have error recovery operations built in. If + errors are detected at the RCaP layer, the RCaP layer will terminate + the RCaP Stream and the associated connection. + +10.1.2.1. Errors Detected in the Local RCaP Layer + + If an error is encountered at the local RCaP layer, the RCaP layer + MAY send a Terminate Message to the Remote Peer to report the error + if possible. (For iWARP, see [RDMAP] for the list of errors where a + Terminate Message is sent.) The RCaP layer is responsible for + terminating the connection. After the RCaP layer notifies the iSER + layer that the connection is terminated, the iSER layer MUST notify + the iSCSI layer by invoking the Connection_Terminate_Notify + Operational Primitive. Prior to invoking the + Connection_Terminate_Notify Operational Primitive, the iSER layer + MUST perform the actions described in Section 5.2.3.2. + +10.1.2.2. Errors Detected in the RCaP Layer at the Remote Peer + + If an error is encountered at the RCaP layer at the Remote Peer, the + RCaP layer at the Remote Peer may send a Terminate Message to report + the error if possible. If it is unable to send the Terminate + Message, the connection is terminated. This is treated the same as a + failure in the transport layer after RDMA is enabled as described in + Section 10.1.1.2. 
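The ordering requirement that recurs in the sections above (perform the Section 5.2.3.2 actions first, then invoke Connection_Terminate_Notify) can be sketched as follows. This is a hypothetical illustration: the class, the method names, and both callbacks are assumptions, and the guard against a repeated notification is an implementation choice of the sketch rather than a requirement stated in the text.

```python
from typing import Callable


class ISERLayer:
    """Hypothetical iSER-layer object reacting to RCaP connection termination."""

    def __init__(self, pre_notify_actions: Callable[[], None],
                 connection_terminate_notify: Callable[[], None]) -> None:
        # Stand-in for the actions of Section 5.2.3.2 (contents not modeled here).
        self._pre_notify_actions = pre_notify_actions
        # Stand-in for the Connection_Terminate_Notify Operational Primitive.
        self._connection_terminate_notify = connection_terminate_notify
        self._notified = False

    def on_rcap_connection_terminated(self) -> None:
        """Called once the RCaP layer reports that the connection is terminated."""
        if self._notified:
            return  # assumption of this sketch: notify the iSCSI layer only once
        self._notified = True
        self._pre_notify_actions()            # must come first
        self._connection_terminate_notify()   # then notify the iSCSI layer
```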
+ + If an error is encountered at the RCaP layer at the Remote Peer and + it is able to send a Terminate Message, the RCaP layer at the Remote + Peer is responsible for terminating the connection. After the local + RCaP layer notifies the iSER layer that the connection is terminated, + the iSER layer MUST notify the iSCSI layer by invoking the + Connection_Terminate_Notify Operational Primitive. Prior to invoking + the Connection_Terminate_Notify Operational Primitive, the iSER layer + MUST perform the actions described in Section 5.2.3.2. + + + + + + +Ko, et al. Standards Track [Page 65] + +RFC 5046 iSER Specification October 2007 + + +10.1.3. Errors in the iSER Layer + + The error handling due to errors at the iSER layer is described in + the following sections. + +10.1.3.1. Insufficient Connection Resources to Support RCaP at + Connection Setup + + After the iSCSI layer at the initiator invokes the + Allocate_Connection_Resources Operational Primitive during the iSCSI + Login Negotiation Phase, if the iSER layer at the initiator fails to + allocate the connection resources necessary to support RCaP, it MUST + return a status of failure to the iSCSI layer at the initiator. The + iSCSI layer at the initiator MUST terminate the connection as + described in Section 5.2.3.1. + + After the iSCSI layer at the target invokes the + Allocate_Connection_Resources Operational Primitive during the iSCSI + Login Negotiation Phase, if the iSER layer at the target fails to + allocate the connection resources necessary to support RCaP, it MUST + return a status of failure to the iSCSI layer at the target. The + iSCSI layer at the target MUST send a Login Response with a status + class of 3 (Target Error), and a status code of "0302" (Out of + Resources). The iSCSI layers at the initiator and the target MUST + terminate the connection as described in Section 5.2.3.1. + +10.1.3.2. 
iSER Negotiation Failures
+
+   If the RCaP or iSER related parameters declared by the initiator in
+   the iSER Hello Message are unacceptable to the iSER layer at the
+   target, the iSER layer at the target MUST set the Reject (REJ) flag,
+   as described in Section 9.4, in the iSER HelloReply Message. The
+   following are the cases when the iSER layer MUST set the REJ flag to
+   one in the HelloReply Message:
+
+   *  The initiator-declared iSER-IRD value is greater than 0 and the
+      target-declared iSER-ORD value is 0.
+
+   *  The initiator-supported and the target-supported iSER protocol
+      versions do not overlap.
+
+   After requesting that the RCaP layer send the iSER HelloReply
+   Message, the handling of the error situation is the same as that for
+   iSER format errors as described in Section 10.1.3.3.
+
+10.1.3.3. iSER Format Errors
+
+   The following types of errors in an iSER header are considered format
+   errors:
+
+   *  Illegal contents of any iSER header field
+
+   *  Inconsistent field contents in an iSER header
+
+   *  Length error for an iSER Hello or HelloReply Message (see
+      Sections 9.3 and 9.4)
+
+   When a format error is detected, the following events MUST occur in
+   the specified sequence:
+
+   1. The iSER layer MUST request that the RCaP layer terminate the
+      RCaP Stream. The RCaP layer MUST terminate the associated
+      connection.
+
+   2. The iSER layer MUST notify the iSCSI layer of the connection
+      termination by invoking the Connection_Terminate_Notify
+      Operational Primitive. Prior to invoking the
+      Connection_Terminate_Notify Operational Primitive, the iSER layer
+      MUST perform the actions described in Section 5.2.3.2.
+
+10.1.3.4. iSER Protocol Errors
+
+   The first iSER Message sent by the iSER layer at the initiator after
+   transitioning into iSER-assisted mode MUST be the iSER Hello Message
+   (see Section 9.3).
Likewise, the first iSER Message sent by the iSER + layer at the target after transitioning into iSER-assisted mode MUST + be the iSER HelloReply Message (see Section 9.4). Failure to send + the iSER Hello or HelloReply Message, as indicated by the wrong + Opcode in the iSER header, is a protocol error. The handling of this + error situation is the same as that for iSER format errors as + described in Section 10.1.3.3. + + If the sending side of an iSER-enabled connection acts in a manner + not permitted by the negotiated or declared login/text operational + key values as described in Section 6, this is a protocol error, and + the receiving side MAY handle this the same as for iSER format errors + as described in Section 10.1.3.3. + +10.1.4. Errors in the iSCSI Layer + + The error handling due to errors at the iSCSI layer is described in + the following sections. For error recovery, see Section 10.2. + + + + +Ko, et al. Standards Track [Page 67] + +RFC 5046 iSER Specification October 2007 + + +10.1.4.1. iSCSI Format Errors + + When an iSCSI format error is detected, the iSCSI layer MUST request + that the iSER layer terminate the RCaP Stream by invoking the + Connection_Terminate Operational Primitive. For more details on the + connection termination, see Section 5.2.3.1. + +10.1.4.2. iSCSI Digest Errors + + In the iSER-assisted mode, the iSCSI layer will not see any digest + error because both the HeaderDigest and the DataDigest keys are + negotiated to "None". + +10.1.4.3. iSCSI Sequence Errors + + For Traditional iSCSI, sequence errors are caused by dropped PDUs due + to header or data digest errors. Since digests are not used in + iSER-assisted mode and the RCaP layer will deliver all messages in + the order they were sent, sequence errors will not occur in iSER- + assisted mode. + +10.1.4.4. 
iSCSI Protocol Error + + When the iSCSI layer handles certain protocol errors by dropping the + connection, the error handling is the same as that for iSCSI format + errors as described in Section 10.1.4.1. + + When the iSCSI layer uses the iSCSI Reject PDU and response codes to + handle certain other protocol errors, no special handling at the iSER + layer is required. + +10.1.4.5. SCSI Timeouts and Session Errors + + SCSI Timeouts and Session Errors are handled at the iSCSI layer and + no special handling at the iSER layer is required. + +10.1.4.6. iSCSI Negotiation Failures + + For negotiation failures that happen during the Login Phase at the + initiator after the iSCSI layer has invoked the + Allocate_Connection_Resources Operational Primitive and before the + Enable_Datamover Operational Primitive has been invoked, the iSCSI + layer MUST request that the iSER layer deallocate all connection + resources by invoking the Deallocate_Connection_Resources Operational + Primitive. The iSCSI layer at the initiator MUST terminate the + connection. + + + + + +Ko, et al. Standards Track [Page 68] + +RFC 5046 iSER Specification October 2007 + + + For negotiation failures during the Login Phase at the target, the + iSCSI layer can use a Login Response with a status class other than 0 + (success) to terminate the Login Phase. If the iSCSI layer has + invoked the Allocate_Connection_Resources Operational Primitive + before the Enable_Datamover Operational Primitive has been invoked, + the iSCSI layer at the target MUST request that the iSER layer at the + target deallocate all connection resources by invoking the + Deallocate_Connection_Resources Operational Primitive. The iSCSI + layer at both the initiator and the target MUST terminate the + connection. 
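The Login Phase cleanup rule above can be sketched as a small state tracker. This is an illustration under stated assumptions: the class and method names are hypothetical, the returned strings merely label the required steps, and the sketch only covers failures that occur before Enable_Datamover has been invoked.

```python
from typing import List


class LoginPhaseState:
    """Hypothetical per-connection record kept by the iSCSI layer during login."""

    def __init__(self) -> None:
        self.resources_allocated = False

    def allocate_connection_resources(self) -> None:
        """Note that Allocate_Connection_Resources has been invoked."""
        self.resources_allocated = True

    def on_negotiation_failure(self) -> List[str]:
        """Return, in order, the steps required after a Login Phase failure.

        Assumes Enable_Datamover has not been invoked yet.
        """
        steps: List[str] = []
        if self.resources_allocated:
            # Resources exist only if allocation was requested earlier.
            steps.append("Deallocate_Connection_Resources")
            self.resources_allocated = False
        steps.append("terminate connection")
        return steps
```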
+ + During the iSCSI Login Phase, if the iSCSI layer at the initiator + receives a Login Response from the target with a status class other + than 0 (Success) after the iSCSI layer at the initiator has invoked + the Allocate_Connection_Resources Operational Primitive, the iSCSI + layer MUST request the iSER layer to deallocate all connection + resources by invoking the Deallocate_Connection_Resources Operational + Primitive. The iSCSI layer MUST terminate the connection in this + case. + + For negotiation failures during the Full Feature Phase, the error + handling is left to the iSCSI layer and no special handling at the + iSER layer is required. + +10.2. Error Recovery + + Error recovery requirements of iSCSI/iSER are the same as that of + Traditional iSCSI. All three ErrorRecoveryLevels as defined in + [RFC3720] are supported in iSCSI/iSER. + + * For ErrorRecoveryLevel 0, session recovery is handled by iSCSI and + no special handling by the iSER layer is required. + + * For ErrorRecoveryLevel 1, see Section 10.2.1 on PDU Recovery. + + * For ErrorRecoveryLevel 2, see Section 10.2.2 on Connection + Recovery. + + The iSCSI layer may invoke the Notice_Key_Values Operational + Primitive during connection setup to request that the iSER layer take + note of the value of the operational ErrorRecoveryLevel, as described + in Sections 5.1.1 and 5.1.2. + +10.2.1. PDU Recovery + + As described in Sections 10.1.4.2 and 10.1.4.3, digest and sequence + errors will not occur in the iSER-assisted mode. If the RCaP layer + detects an error, it will close the iSCSI/iSER connection, as + + + +Ko, et al. Standards Track [Page 69] + +RFC 5046 iSER Specification October 2007 + + + described in Section 10.1.2. Therefore, PDU recovery is not useful + in the iSER-assisted mode. + + The iSCSI layer at the initiator SHOULD disable iSCSI timeout-driven + PDU retransmissions. + +10.2.2. 
Connection Recovery + + The iSCSI layer at the initiator MAY reassign connection allegiance + for non-immediate commands that are still in progress and are + associated with the failed connection by using a Task Management + Function Request with the TASK REASSIGN function. See Section 7.3.3 + for more details. + + When the iSCSI layer at the initiator does a task reassignment for a + SCSI write command, it MUST qualify the Send_Control Operational + Primitive invocation with DataDescriptorOut, which defines the I/O + Buffer for both the non-immediate unsolicited data and the solicited + data. This allows the iSCSI layer at the target to use recovery R2Ts + to request data originally sent as unsolicited and solicited from the + initiator. + + When the iSCSI layer at the target accepts a reassignment request for + a SCSI read command, it MUST request that the iSER layer process SCSI + Data-in for all unacknowledged data by invoking the Put_Data + Operational Primitive. See Section 7.3.5 on the handling of SCSI + Data-in. + + When the iSCSI layer at the target accepts a reassignment request for + a SCSI write command, it MUST request that the iSER layer process a + recovery R2T for any non-immediate unsolicited data and any solicited + data sequences that have not been received by invoking the Get_Data + Operational Primitive. See Section 7.3.6 on the handling of Ready To + Transfer (R2T). + + The iSCSI layer at the target MUST NOT issue recovery R2Ts on an + iSCSI/iSER connection for a task for which the connection allegiance + was never reassigned. The iSER layer at the target MAY reject such a + recovery R2T received via the Get_Data Operational Primitive + invocation from the iSCSI layer at the target, with an appropriate + error code. + + The iSER layer at the target will process the requests invoked by the + Put_Data and Get_Data Operational Primitives for a reassigned task in + the same way as for the original commands. + + + + + + +Ko, et al. 
Standards Track [Page 70]
+
+RFC 5046                  iSER Specification                October 2007
+
+
+11. Security Considerations
+
+   When iSER is layered on top of an RCaP layer and provides the RDMA
+   extensions to the iSCSI protocol, the security considerations of iSER
+   are the same as those of the underlying RCaP layer. For iWARP, this
+   is described in [RDMAP] and [RDDPSEC].
+
+   Since the iSER-assisted iSCSI protocol is still functionally iSCSI
+   from a security considerations perspective, all of the iSCSI security
+   requirements as described in [RFC3720] and [RFC3723] apply. If the
+   IPsec [IPSEC] mechanism is used, then it MUST be established before
+   the connection transitions to the iSER-assisted mode. If iSER is
+   layered on top of a non-IP based RCaP layer, all the security
+   protocol mechanisms applicable to that RCaP layer are also applicable
+   to an iSCSI/iSER connection. If iSER is layered on top of a non-IP
+   protocol, the IPsec mechanism as specified in [RFC3720] MUST be
+   implemented at any point where the iSER protocol enters the IP
+   network (e.g., via gateways), and the non-IP protocol SHOULD
+   implement (optional to use) a packet-by-packet security protocol
+   equal in strength to the IPsec mechanism specified by [RFC3720].
+
+   To minimize the potential for a denial-of-service attack, the iSCSI
+   layer MUST NOT request that the iSER layer allocate the connection
+   resources necessary to support RCaP until the iSCSI layer is
+   sufficiently far along in the iSCSI Login Phase that it is reasonably
+   certain that the peer side is not an attacker, as described in
+   Sections 5.1.1 and 5.1.2.
+
+   Note that the IPsec requirements for this document are based on the
+   version of IPsec specified in RFC 2401 [IPSEC] and related RFCs, as
+   profiled by RFC 3723 [RFC3723], despite the existence of a newer
+   version of IPsec specified in RFC 4301 [RFC4301] and related RFCs.
+
+12. References
+
+12.1.
Normative References + + [RFC3720] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and + E. Zeidner, "Internet Small Computer Systems Interface + (iSCSI)", RFC 3720, April 2004. + + [RFC3723] Aboba, B., Tseng, J., Walker, J., Rangan, V., and F. + Travostino, "Securing Block Storage Protocols over IP", RFC + 3723, April 2004. + + [RDMAP] Recio, R., Culley, P., Garcia, D., Hilland, J., and B. + Metzler, "A Remote Direct Memory Access Protocol + Specification", RFC 5040, October 2007. + + + +Ko, et al. Standards Track [Page 71] + +RFC 5046 iSER Specification October 2007 + + + [DDP] Shah, H., Pinkerton, J., Recio, R., and P. Culley, "Direct + Data Placement over Reliable Transports", RFC 5041, October + 2007. + + [IPSEC] Kent, S. and R. Atkinson, "Security Architecture for the + Internet Protocol", RFC 2401, November 1998. + + [MPA] Culley, P., Elzur, U., Recio, R., Bailey, S., and J. + Carrier, "Marker PDU Aligned Framing for TCP + Specification", RFC 5044, October 2007. + + [RDDPSEC] Pinkerton, J. and E. Deleganes, "Direct Data Placement + Protocol (DDP) / Remote Direct Memory Access Protocol + (RDMAP) Security", RFC 5042, October 2007. + + [TCP] Postel, J., "Transmission Control Protocol", STD 7, RFC + 793, September 1981. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + +12.2. Informative References + + [SAM2] T10/1157D, SCSI Architecture Model - 2 (SAM-2) + + [DA] Chadalapaka, M., Hufferd, J., Satran, J., and H. Shah, "DA: + Datamover Architecture for the Internet Small Computer + System Interface (iSCSI)", RFC 5047, October 2007. + + [IB] InfiniBand Architecture Specification Volume 1 Release 1.2, + October 2004 + + [IPoIB] Chu, J. and V. Kashyap, "Transmission of IP over InfiniBand + (IPoIB)", RFC 4391, April 2006. + + [RFC4301] Kent, S. and K. Seo, "Security Architecture for the + Internet Protocol", RFC 4301, December 2005. + + + + + + + + + + + + + + +Ko, et al. 
Standards Track [Page 72]
+
+RFC 5046                  iSER Specification                October 2007
+
+
+Appendix A. iWARP Message Format for iSER
+
+   This section is for information only and is NOT part of the standard.
+   It simply depicts the iWARP Message format for the various iSER
+   Messages when the transport layer is TCP.
+
+A.1. iWARP Message Format for iSER Hello Message
+
+   The following figure depicts an iSER Hello Message encapsulated in an
+   iWARP SendSE Message.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |          MPA Header           |  DDP Control  | RDMA Control  |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           Reserved                            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                      (Send) Queue Number                      |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                (Send) Message Sequence Number                 |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                     (Send) Message Offset                     |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | 0010b | Zeros | 0001b | 0001b |           iSER-IRD            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           All Zeros                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           All Zeros                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                            MPA CRC                            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+        Figure 6. SendSE Message Containing an iSER Hello Message
+
+A.2. iWARP Message Format for iSER HelloReply Message
+
+   The following figure depicts an iSER HelloReply Message encapsulated
+   in an iWARP SendSE Message. The Reject (REJ) flag is set to 0.
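As an informal companion to the Hello and HelloReply layouts (Sections 9.3 and 9.4) and the REJ rules of Section 10.1.3.2, the 12-byte iSER headers might be built as sketched here. The function names are hypothetical, network byte order is assumed for the 16-bit iSER-IRD and iSER-ORD fields, and the target-supported version range passed to `must_reject` is assumed to be local knowledge of the target, since the HelloReply itself carries only MaxVer and CurVer.

```python
import struct
from typing import Tuple

OPCODE_HELLO = 0x2        # 0010b
OPCODE_HELLO_REPLY = 0x3  # 0011b
ISER_VERSION = 1          # the only version defined by this document


def pack_hello(iser_ird: int, max_ver: int = ISER_VERSION,
               min_ver: int = ISER_VERSION) -> bytes:
    """Build the 12-byte iSER Hello header."""
    byte0 = OPCODE_HELLO << 4            # Opcode in the high nibble, Rsvd nibble zero
    byte1 = (max_ver << 4) | min_ver     # MaxVer, then MinVer
    return struct.pack("!BBH8x", byte0, byte1, iser_ird)  # 8 reserved zero bytes


def pack_hello_reply(iser_ord: int, reject: bool = False,
                     max_ver: int = ISER_VERSION,
                     cur_ver: int = ISER_VERSION) -> bytes:
    """Build the 12-byte iSER HelloReply header."""
    byte0 = (OPCODE_HELLO_REPLY << 4) | int(reject)  # REJ is the last bit of byte 0
    byte1 = (max_ver << 4) | cur_ver                 # MaxVer, then CurVer
    return struct.pack("!BBH8x", byte0, byte1, iser_ord)


def must_reject(initiator_ird: int, target_ord: int,
                initiator_versions: Tuple[int, int],
                target_versions: Tuple[int, int]) -> bool:
    """Apply the two REJ conditions; version tuples are (min, max)."""
    no_read_resources = initiator_ird > 0 and target_ord == 0
    init_min, init_max = initiator_versions
    tgt_min, tgt_max = target_versions
    no_overlap = init_max < tgt_min or tgt_max < init_min
    return no_read_resources or no_overlap
```

With the example values shown in the figures (both version fields set to 0001b), the first two header bytes come out as 0x20 0x11 for the Hello Message and 0x30 0x11 for a HelloReply Message with REJ cleared.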
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Queue Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Sequence Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | 0011b |Zeros|0| 0001b | 0001b | iSER-ORD | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 7. SendSE Message Containing an iSER HelloReply Message + + + + + + + + + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 74] + +RFC 5046 iSER Specification October 2007 + + +A.3. iWARP Message Format for SCSI Read Command PDU + + The following figure depicts a SCSI Read Command PDU embedded in an + iSER Message encapsulated in an iWARP SendSE Message. For this + particular example, in the iSER header, the Write STag Valid flag is + set to zero, the Read STag Valid flag is set to one, the Write STag + field is set to all zeros, and the Read STag field contains a valid + Read STag. 
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Queue Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Sequence Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | 0001b |0|1| All zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Read STag | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SCSI Read Command PDU | + // // + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 8. SendSE Message Containing a SCSI Read Command PDU + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 75] + +RFC 5046 iSER Specification October 2007 + + +A.4. 
iWARP Message Format for SCSI Read Data + + The following figure depicts an iWARP RDMA Write Message carrying + SCSI Read data in the payload: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Data Sink STag | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Data Sink Tagged Offset | + + + + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SCSI Read data | + // // + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 9. RDMA Write Message Containing SCSI Read Data + + + + + + + + + + + + + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 76] + +RFC 5046 iSER Specification October 2007 + + +A.5. iWARP Message Format for SCSI Write Command PDU + + The following figure depicts a SCSI Write Command PDU embedded in an + iSER Message encapsulated in an iWARP SendSE Message. For this + particular example, in the iSER header, the Write STag Valid flag is + set to one, the Read STag Valid flag is set to zero, the Write STag + field contains a valid Write STag, and the Read STag field is set to + all zeros since it is not used. 
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Queue Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Sequence Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | 0001b |1|0| All zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Write STag | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SCSI Write Command PDU | + // // + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 10. SendSE Message Containing a SCSI Write Command PDU + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 77] + +RFC 5046 iSER Specification October 2007 + + +A.6. iWARP Message Format for RDMA Read Request + + An iSCSI R2T is transformed into an iWARP RDMA Read Request Message. 
+ The following figure depicts an iWARP RDMA Read Request Message: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Reserved (Not Used) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | DDP (RDMA Read Request) Queue Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | DDP (RDMA Read Request) Message Sequence Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | DDP (RDMA Read Request) Message Offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Data Sink STag (SinkSTag) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + + Data Sink Tagged Offset (SinkTO) + + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | RDMA Read Message Size (RDMARDSZ) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Data Source STag (SrcSTag) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + + Data Source Tagged Offset (SrcTO) + + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 11. RDMA Read Request Message + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 78] + +RFC 5046 iSER Specification October 2007 + + +A.7. 
iWARP Message Format for Solicited SCSI Write Data + + The following figure depicts an iWARP RDMA Read Response Message + carrying the solicited SCSI Write data in the payload: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Data Sink STag | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Data Sink Tagged Offset | + + + + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SCSI Write Data | + // // + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 12. RDMA Read Response Message Containing SCSI Write Data + + + + + + + + + + + + + + + + + + + + + + + + + + + +Ko, et al. Standards Track [Page 79] + +RFC 5046 iSER Specification October 2007 + + +A.8. 
iWARP Message Format for SCSI Response PDU + + The following figure depicts a SCSI Response PDU embedded in an iSER + Message encapsulated in an iWARP SendInvSE Message: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA Header | DDP Control | RDMA Control | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Invalidate STag | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Queue Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Sequence Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | (Send) Message Offset | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | 0001b |0|0| All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | All Zeros | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | SCSI Response PDU | + // // + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | MPA CRC | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 13. SendInvSE Message Containing a SCSI Response PDU + +Appendix B. Architectural Discussion of iSER over InfiniBand + + This section explains how an InfiniBand network (with Gateways) would + be structured. It is informational only and is intended to provide + insight into how iSER is used in an InfiniBand environment. + +B.1. The Host Side of the iSCSI and iSER Connections in InfiniBand + + Figure 14 shows the topologies in which iSCSI and iSER can operate + on an InfiniBand Network.
+ + +---------+ +---------+ +---------+ +---------+ +---------+ + | Host | | Host | | Host | | Host | | Host | + | | | | | | | | | | + +---+-+---+ +---+-+---+ +---+-+---+ +---+-+---+ +---+-+---+ + |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| |HCA| + +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ +-v-+ + |----+------|-----+-----|-----+-----|-----+-----|-----+---> To IB + IB| IB | IB | IB | IB | SubNet2 SWTCH + +-v-----------v-----------v-----------v-----------v---------+ + | InfiniBand Switch for Subnet1 | + +---+-----+--------+-----+--------+-----+------------v------+ + | TCA | | TCA | | TCA | | + +-----+ +-----+ +-----+ | IB + / IB \ / IB \ / \ +--+--v--+--+ + | iSER | | iSER | | IPoIB | | | TCA | | + | Gateway | | Gateway | | Gateway | | +-----+ | + | to | | to | | to | | Storage | + | iSCSI | | iSER | | IP | | Controller| + | TCP | | iWARP | |Ethernet | +-----+-----+ + +---v-----+ +---v-----+ +----v----+ + | EN | EN | EN + +--------------+---------------+----> to IP based storage + Ethernet links that carry iSCSI or iWARP + + Figure 14. iSCSI and iSER on IB + + In Figure 14, the Host systems are connected via the InfiniBand Host + Channel Adapters (HCAs) to the InfiniBand links. With the use of IB + switch(es), the InfiniBand links connect the HCA to InfiniBand Target + Channel Adapters (TCAs) located in gateways or Storage Controllers. + An iSER-capable IB-IP Gateway converts the iSER Messages encapsulated + in IB protocols to either standard iSCSI, or iSER Messages for iWARP. + An [IPoIB] Gateway converts the InfiniBand [IPoIB] protocol to IP + protocol, and in the iSCSI case, permits iSCSI to be operated on an + IB Network between the Hosts and the [IPoIB] Gateway. + +B.2.
The Storage Side of the iSCSI and iSER Mixed Network Environment + + Figure 15 shows a storage controller that has three different portal + groups: one supporting only iSCSI (TPG-4), one supporting iSER/iWARP + or iSCSI (TPG-2), and one supporting iSER/IB (TPG-1). + + | | | + | | | + +--+--v--+----------+--v--+----------+--v--+--+ + | | IB | |iWARP| | EN | | + | | | | TCP | | NIC | | + | |(TCA)| | RNIC| | | | + | +-----+ +-----+ +-----+ | + | TPG-1 TPG-2 TPG-4 | + | 9.1.3.3 9.1.2.4 9.1.2.6 | + | | + | Storage Controller | + | | + +---------------------------------------------+ + + Figure 15. Storage Controller with TCP, iWARP, and IB Connections + + The normal iSCSI portal group advertising processes (via the Service + Location Protocol (SLP), the Internet Storage Name Service (iSNS), or + SendTargets) are available to a Storage Controller. + +B.3. Discovery Processes for an InfiniBand Host + + An InfiniBand Host system can gather portal group IP addresses from + SLP, iSNS, or the SendTargets discovery processes by using TCP/IP via + [IPoIB]. After obtaining one or more remote portal IP addresses, the + Initiator uses standard IP mechanisms to resolve each IP address + to a local outgoing interface and the destination hardware address + (Ethernet MAC or IB GID of the target or a gateway leading to the + target). If the resolved interface is an [IPoIB] network interface, + then the target portal can be reached through an InfiniBand fabric. + In this case, the Initiator can establish an iSCSI/TCP or iSCSI/iSER + session with the Target over that InfiniBand interface, using the + Hardware Address (InfiniBand GID) obtained through the standard + Address Resolution (ARP) processes. + + If more than one IP address is obtained through the discovery + process, the Initiator should select a Target IP address that is on + the same IP subnet as the Initiator, if one exists. This avoids the + potential overhead of going through a gateway when a direct path + exists.
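The same-subnet preference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the function name pick_target_portal, the /24 prefix length, and the Initiator address 9.1.2.10 are hypothetical, and the portal addresses are borrowed from Figure 15. The sketch returns the first discovered portal that shares a subnet with a local interface and otherwise falls back to the first portal, which would be reached through a gateway.

```python
import ipaddress


def pick_target_portal(discovered, local_interfaces):
    """Prefer a discovered portal on the same IP subnet as a local interface.

    discovered:       portal IP addresses as strings (for example, from
                      SLP, iSNS, or SendTargets).
    local_interfaces: local addresses in "address/prefix" form.
    Returns (portal, local_address). local_address is None when no portal
    is directly reachable, in which case the first portal is returned.
    """
    portals = [ipaddress.ip_address(p) for p in discovered]
    interfaces = [ipaddress.ip_interface(i) for i in local_interfaces]
    for portal in portals:
        for iface in interfaces:
            # Same address family and inside the interface's subnet.
            if portal.version == iface.version and portal in iface.network:
                return str(portal), str(iface.ip)
    if portals:
        # No direct path found: fall back to the first portal via a gateway.
        return str(portals[0]), None
    return None, None


if __name__ == "__main__":
    # Hypothetical Initiator address and prefix; portals from Figure 15.
    print(pick_target_portal(["9.1.3.3", "9.1.2.4", "9.1.2.6"],
                             ["9.1.2.10/24"]))
    # prints ('9.1.2.4', '9.1.2.10')
```

A real Initiator would normally rely on the host routing table rather than a hand-written comparison. The sketch only makes the selection rule explicit.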
+ + In addition, a user can configure manual static IP route entries if a + particular path to the target is preferred. + +B.4. IBTA Connection Specifications + + The InfiniBand Trade Association (IBTA) connection specifications are + outside the scope of this document, but it is expected that the IBTA + has defined or will define: + + * The iSER ServiceID. + + * A means for permitting a Host to establish a connection with a + peer InfiniBand end-node, and to fall back to iSCSI/TCP over + [IPoIB] if that peer indicates iSER is not supported. + + * A means for permitting the Host to establish IB iSER connections + to storage controllers or IB iSER-connected Gateways in + preference to [IPoIB]-connected Gateways/Bridges or + connections to Target Storage Controllers that also accept iSCSI + via [IPoIB]. + + * A means for combining the IB ServiceID for iSER and the IP port + number such that the IB Host can use normal IB connection + processes, yet ensure that the iSER target peer can actually + connect to the required IP port number. + +Acknowledgments + + This protocol was developed by a design team that, in addition to the + authors, included Dwight Barron (HP), John Carrier (formerly from + Adaptec), Ted Compton (EMC), Paul R. Culley (HP), Yaron Haviv + (Voltaire), Jeff Hilland (HP), Mike Krause (HP), Alex Nezhinsky + (Voltaire), Jim Pinkerton (Microsoft), Renato J. Recio (IBM), Julian + Satran (IBM), Tom Talpey (Network Appliance), and Jim Wendt (HP). + Special thanks to David Black (EMC) for his extensive review + comments. + +Authors' Addresses + + Mallikarjun Chadalapaka + Hewlett-Packard Company + 8000 Foothills Blvd.
+ Roseville, CA 95747-5668, USA + Phone: +1-916-785-5621 + EMail: cbm@rose.hp.com + + Uri Elzur + Broadcom Corporation + 5300 California Avenue + Irvine, CA 92617, USA + Phone: +1-949-926-6432 + EMail: Uri@Broadcom.com + + John Hufferd + Brocade Communications Systems, Inc. + 1745 Technology Drive + San Jose, CA 95110, USA + Phone: +1-408-333-5244 + EMail: jhufferd@brocade.com + + Mike Ko + IBM Corp. + 650 Harry Rd. + San Jose, CA 95120, USA + Phone: +1-408-927-2085 + EMail: mako@us.ibm.com + + Hemal Shah + Broadcom Corporation + 5300 California Avenue + Irvine, CA 92617, USA + Phone: +1-949-926-6941 + EMail: hemal@broadcom.com + + Patricia Thaler + Broadcom Corporation + 5300 California Avenue + Irvine, CA 92617, USA + Phone: +1-916-570-2707 + EMail: pthaler@broadcom.com + + + + + + + + +Ko, et al. Standards Track [Page 84] + +RFC 5046 iSER Specification October 2007 + + +Full Copyright Statement + + Copyright (C) The IETF Trust (2007). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND + THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF + THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org.