1 files changed, 1963 insertions, 0 deletions
diff --git a/doc/rfc/rfc5664.txt b/doc/rfc/rfc5664.txt
new file mode 100644
index 0000000..0ee5b3e
--- /dev/null
+++ b/doc/rfc/rfc5664.txt
@@ -0,0 +1,1963 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                         B. Halevy
+Request for Comments: 5664                                      B. Welch
+Category: Standards Track                                     J. Zelenka
+ISSN: 2070-1721                                                  Panasas
+                                                            January 2010
+
+
+              Object-Based Parallel NFS (pNFS) Operations
+
+Abstract
+
+   Parallel NFS (pNFS) extends Network File System version 4 (NFSv4) to
+   allow clients to directly access file data on the storage used by the
+   NFSv4 server.  This ability to bypass the server for data access can
+   increase both performance and parallelism, but requires additional
+   client functionality for data access, some of which is dependent on
+   the class of storage used, a.k.a. the Layout Type.  The main pNFS
+   operations and data types in NFSv4 Minor version 1 specify a layout-
+   type-independent layer; layout-type-specific information is conveyed
+   using opaque data structures whose internal structure is further
+   defined by the particular layout type specification.  This document
+   specifies the NFSv4.1 Object-Based pNFS Layout Type as a companion to
+   the main NFSv4 Minor version 1 specification.
+
+Status of This Memo
+
+   This is an Internet Standards Track document.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Further information on
+   Internet Standards is available in Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc5664.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Halevy, et al.              Standards Track                     [Page 1]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+Copyright Notice
+
+   Copyright (c) 2010 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+Table of Contents
+
+   1. Introduction ....................................................3
+      1.1. Requirements Language ......................................4
+   2. XDR Description of the Objects-Based Layout Protocol ............4
+      2.1. Code Components Licensing Notice ...........................4
+   3. Basic Data Type Definitions .....................................6
+      3.1. pnfs_osd_objid4 ............................................6
+      3.2. pnfs_osd_version4 ..........................................6
+      3.3. pnfs_osd_object_cred4 ......................................7
+      3.4. pnfs_osd_raid_algorithm4 ...................................8
+   4. Object Storage Device Addressing and Discovery ..................8
+      4.1. pnfs_osd_targetid_type4 ...................................10
+      4.2. pnfs_osd_deviceaddr4 ......................................10
+           4.2.1. SCSI Target Identifier .............................11
+           4.2.2. Device Network Address .............................11
+   5. Object-Based Layout ............................................12
+      5.1. pnfs_osd_data_map4 ........................................13
+      5.2. pnfs_osd_layout4 ..........................................14
+      5.3. Data Mapping Schemes ......................................14
+           5.3.1. Simple Striping ....................................15
+           5.3.2. Nested Striping ....................................16
+           5.3.3. Mirroring ..........................................17
+      5.4. RAID Algorithms ...........................................18
+           5.4.1. PNFS_OSD_RAID_0 ....................................18
+           5.4.2. PNFS_OSD_RAID_4 ....................................18
+           5.4.3. PNFS_OSD_RAID_5 ....................................18
+           5.4.4. PNFS_OSD_RAID_PQ ...................................19
+           5.4.5. RAID Usage and Implementation Notes ................19
+   6. Object-Based Layout Update .....................................20
+      6.1. pnfs_osd_deltaspaceused4 ..................................20
+      6.2. pnfs_osd_layoutupdate4 ....................................21
+   7. Recovering from Client I/O Errors ..............................21
+
+
+
+Halevy, et al.              Standards Track                     [Page 2]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+   8. Object-Based Layout Return .....................................22
+      8.1. pnfs_osd_errno4 ...........................................23
+      8.2. pnfs_osd_ioerr4 ...........................................24
+      8.3. pnfs_osd_layoutreturn4 ....................................24
+   9. Object-Based Creation Layout Hint ..............................25
+      9.1. pnfs_osd_layouthint4 ......................................25
+   10. Layout Segments ...............................................26
+      10.1. CB_LAYOUTRECALL and LAYOUTRETURN .........................27
+      10.2. LAYOUTCOMMIT .............................................27
+   11. Recalling Layouts .............................................27
+      11.1. CB_RECALL_ANY ............................................28
+   12. Client Fencing ................................................29
+   13. Security Considerations .......................................29
+      13.1. OSD Security Data Types ..................................30
+      13.2. The OSD Security Protocol ................................30
+      13.3. Protocol Privacy Requirements ............................32
+      13.4. Revoking Capabilities ....................................32
+   14. IANA Considerations ...........................................33
+   15. References ....................................................33
+      15.1. Normative References .....................................33
+      15.2. Informative References ...................................34
+   Appendix A.  Acknowledgments ......................................35
+
+1.  Introduction
+
+   In pNFS, the file server returns typed layout structures that
+   describe where file data is located.  There are different layouts for
+   different storage systems and methods of arranging data on storage
+   devices.  This document describes the layouts used with object-based
+   storage devices (OSDs) that are accessed according to the OSD storage
+   protocol standard (ANSI INCITS 400-2004 [1]).
+
+   An "object" is a container for data and attributes, and files are
+   stored in one or more objects.  The OSD protocol specifies several
+   operations on objects, including READ, WRITE, FLUSH, GET ATTRIBUTES,
+   SET ATTRIBUTES, CREATE, and DELETE.  However, with the object-based
+   layout, the client uses only the READ, WRITE, GET ATTRIBUTES, and
+   FLUSH commands.  The other commands are used only by the pNFS server.
+
+   An object-based layout for pNFS includes object identifiers,
+   capabilities that allow clients to READ or WRITE those objects, and
+   various parameters that control how file data is striped across the
+   component objects.  The OSD protocol has a capability-based security
+   scheme that allows the pNFS server to control what operations and
+   what objects can be used by clients.  This scheme is described in
+   more detail in the "Security Considerations" section (Section 13).
+
+
+
+
+
+Halevy, et al.              Standards Track                     [Page 3]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+1.1.  Requirements Language
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in RFC 2119 [2].
+
+2.  XDR Description of the Objects-Based Layout Protocol
+
+   This document contains the external data representation (XDR [3])
+   description of the NFSv4.1 objects layout protocol.  The XDR
+   description is embedded in this document in a way that makes it
+   simple for the reader to extract into a ready-to-compile form.  The
+   reader can feed this document into the following shell script to
+   produce the machine readable XDR description of the NFSv4.1 objects
+   layout protocol:
+
+   #!/bin/sh
+   grep '^ *///' "$@" | sed 's?^ */// ??' | sed 's?^ *///$??'
+
+   That is, if the above script is stored in a file called "extract.sh",
+   and this document is in a file called "spec.txt", then the reader can
+   do:
+
+   sh extract.sh < spec.txt > pnfs_osd_prot.x
+
+   The effect of the script is to remove leading white space from each
+   line, plus a sentinel sequence of "///".
+
+   The embedded XDR file header follows.  Subsequent XDR descriptions,
+   marked with the sentinel sequence, are embedded throughout the
+   document.
+
+   Note that the XDR code contained in this document depends on types
+   from the NFSv4.1 nfs4_prot.x file ([4]).  This includes both NFS
+   types that end with a 4, such as offset4 and length4, and more
+   generic types such as uint32_t and uint64_t.
+
+2.1.  Code Components Licensing Notice
+
+   The XDR description, marked with lines beginning with the sequence
+   "///", as well as scripts for extracting the XDR description are Code
+   Components as described in Section 4 of "Legal Provisions Relating to
+   IETF Documents" [5].  These Code Components are licensed according to
+   the terms of Section 4 of "Legal Provisions Relating to IETF
+   Documents".
+
+
+
+
+
+
+
+Halevy, et al.              Standards Track                     [Page 4]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+   /// /*
+   ///  * Copyright (c) 2010 IETF Trust and the persons identified
+   ///  * as authors of the code.  All rights reserved.
+   ///  *
+   ///  * Redistribution and use in source and binary forms, with
+   ///  * or without modification, are permitted provided that the
+   ///  * following conditions are met:
+   ///  *
+   ///  * o Redistributions of source code must retain the above
+   ///  *   copyright notice, this list of conditions and the
+   ///  *   following disclaimer.
+   ///  *
+   ///  * o Redistributions in binary form must reproduce the above
+   ///  *   copyright notice, this list of conditions and the
+   ///  *   following disclaimer in the documentation and/or other
+   ///  *   materials provided with the distribution.
+   ///  *
+   ///  * o Neither the name of Internet Society, IETF or IETF
+   ///  *   Trust, nor the names of specific contributors, may be
+   ///  *   used to endorse or promote products derived from this
+   ///  *   software without specific prior written permission.
+   ///  *
+   ///  *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS
+   ///  *   AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
+   ///  *   WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+   ///  *   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+   ///  *   FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO
+   ///  *   EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+   ///  *   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+   ///  *   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+   ///  *   NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+   ///  *   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+   ///  *   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+   ///  *   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+   ///  *   OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+   ///  *   IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+   ///  *   ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+   ///  *
+   ///  * This code was derived from RFC 5664.
+   ///  * Please reproduce this note if possible.
+   ///  */
+   ///
+   /// /*
+   ///  * pnfs_osd_prot.x
+   ///  */
+   ///
+   /// %#include <nfs4_prot.x>
+   ///
+
+
+
+Halevy, et al.              Standards Track                     [Page 5]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+3.  Basic Data Type Definitions
+
+   The following sections define basic data types and constants used by
+   the Object-Based Layout protocol.
+
+3.1.  pnfs_osd_objid4
+
+   An object is identified by a number, somewhat like an inode number.
+   The object storage model has a two-level scheme, where the objects
+   within an object storage device are grouped into partitions.
+
+   /// struct pnfs_osd_objid4 {
+   ///     deviceid4       oid_device_id;
+   ///     uint64_t        oid_partition_id;
+   ///     uint64_t        oid_object_id;
+   /// };
+   ///
+
+   The pnfs_osd_objid4 type is used to identify an object within a
+   partition on a specified object storage device. "oid_device_id"
+   selects the object storage device from the set of available storage
+   devices.  The device is identified with the deviceid4 type, which is
+   an index into addressing information about that device returned by
+   the GETDEVICELIST and GETDEVICEINFO operations.  The deviceid4 data
+   type is defined in NFSv4.1 [6].  Within an OSD, a partition is
+   identified with a 64-bit number, "oid_partition_id".  Within a
+   partition, an object is identified with a 64-bit number,
+   "oid_object_id".  Creation and management of partitions is outside
+   the scope of this document, and is a facility provided by the object-
+   based storage file system.
+
+3.2.  pnfs_osd_version4
+
+   /// enum pnfs_osd_version4 {
+   ///     PNFS_OSD_MISSING    = 0,
+   ///     PNFS_OSD_VERSION_1  = 1,
+   ///     PNFS_OSD_VERSION_2  = 2
+   /// };
+   ///
+
+   pnfs_osd_version4 is used to indicate the OSD protocol version or
+   whether an object is missing (i.e., unavailable).  Some of the
+   object-based layout-supported RAID algorithms encode redundant
+   information and can compensate for missing components, but the data
+   placement algorithm needs to know what parts are missing.
+
+
+
+
+
+
+Halevy, et al.              Standards Track                     [Page 6]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+   At this time, the OSD standard is at version 1.0, and we anticipate a
+   version 2.0 of the standard (SNIA T10/1729-D [14]).  The second
+   generation OSD protocol has additional proposed features to support
+   more robust error recovery, snapshots, and byte-range capabilities.
+   Therefore, the OSD version is explicitly called out in the
+   information returned in the layout.  (This information can also be
+   deduced by looking inside the capability type at the format field,
+   which is the first byte.  The format value is 0x1 for an OSD v1
+   capability.  However, it seems most robust to call out the version
+   explicitly.)
+
+3.3.  pnfs_osd_object_cred4
+
+   /// enum pnfs_osd_cap_key_sec4 {
+   ///     PNFS_OSD_CAP_KEY_SEC_NONE = 0,
+   ///     PNFS_OSD_CAP_KEY_SEC_SSV  = 1
+   /// };
+   ///
+   /// struct pnfs_osd_object_cred4 {
+   ///     pnfs_osd_objid4         oc_object_id;
+   ///     pnfs_osd_version4       oc_osd_version;
+   ///     pnfs_osd_cap_key_sec4   oc_cap_key_sec;
+   ///     opaque                  oc_capability_key<>;
+   ///     opaque                  oc_capability<>;
+   /// };
+   ///
+
+   The pnfs_osd_object_cred4 structure is used to identify each
+   component comprising the file.  The "oc_object_id" identifies the
+   component object, the "oc_osd_version" represents the OSD protocol
+   version, or whether that component is unavailable, and the
+   "oc_capability" and "oc_capability_key", along with the
+   "oda_systemid" from the pnfs_osd_deviceaddr4, provide the OSD
+   security credentials needed to access that object.  The
+   "oc_cap_key_sec" value denotes the method used to secure the
+   oc_capability_key (see Section 13.1 for more details).
+
+   To comply with the OSD security requirements, the capability key
+   SHOULD be transferred securely to prevent eavesdropping (see
+   Section 13).  Therefore, a client SHOULD either issue the LAYOUTGET
+   or GETDEVICEINFO operations via RPCSEC_GSS with the privacy service
+   or previously establish a secret state verifier (SSV) for the
+   sessions via the NFSv4.1 SET_SSV operation.  The
+   pnfs_osd_cap_key_sec4 type is used to identify the method used by the
+   server to secure the capability key.
+
+
+
+
+
+
+Halevy, et al.              Standards Track                     [Page 7]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+   o  PNFS_OSD_CAP_KEY_SEC_NONE denotes that the oc_capability_key is
+      not encrypted.  In this case, the client SHOULD issue the
+      LAYOUTGET or GETDEVICEINFO operations via RPCSEC_GSS with the
+      privacy service, or the NFSv4.1 transport should be secured by
+      methods external to NFSv4.1, such as the use of IPsec [15] for
+      transporting the NFSv4.1 protocol.
+
+   o  PNFS_OSD_CAP_KEY_SEC_SSV denotes that the oc_capability_key
+      contents are encrypted using the SSV GSS context and the
+      capability key as inputs to the GSS_Wrap() function (see GSS-API
+      [7]) with the conf_req_flag set to TRUE.  The client MUST use the
+      secret SSV key as part of the client's GSS context to decrypt the
+      capability key using the value of the oc_capability_key field as
+      the input_message to the GSS_unwrap() function.  Note that to
+      prevent eavesdropping of the SSV key, the client SHOULD issue
+      SET_SSV via RPCSEC_GSS with the privacy service.
+
+   The actual method chosen depends on whether the client established an
+   SSV key with the server and whether it issued the operation with the
+   RPCSEC_GSS privacy method.  Naturally, if the client did not
+   establish an SSV key via SET_SSV, the server MUST use the
+   PNFS_OSD_CAP_KEY_SEC_NONE method.  Otherwise, if the operation was
+   not issued with the RPCSEC_GSS privacy method, the server SHOULD
+   secure the oc_capability_key with the PNFS_OSD_CAP_KEY_SEC_SSV
+   method.  The server MAY use the PNFS_OSD_CAP_KEY_SEC_SSV method also
+   when the operation was issued with the RPCSEC_GSS privacy method.
+
+3.4.  pnfs_osd_raid_algorithm4
+
+   /// enum pnfs_osd_raid_algorithm4 {
+   ///     PNFS_OSD_RAID_0     = 1,
+   ///     PNFS_OSD_RAID_4     = 2,
+   ///     PNFS_OSD_RAID_5     = 3,
+   ///     PNFS_OSD_RAID_PQ    = 4     /* Reed-Solomon P+Q */
+   /// };
+   ///
+
+   pnfs_osd_raid_algorithm4 represents the data redundancy algorithm
+   used to protect the file's contents.  See Section 5.4 for more
+   details.
+
+4.  Object Storage Device Addressing and Discovery
+
+   Data operations to an OSD require the client to know the "address" of
+   each OSD's root object.  The root object is synonymous with the Small
+   Computer System Interface (SCSI) logical unit.  The client specifies
+   SCSI logical units to its SCSI protocol stack using a representation
+
+
+
+
+Halevy, et al.              Standards Track                     [Page 8]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+   local to the client.  Because these representations are local,
+   GETDEVICEINFO must return information that can be used by the client
+   to select the correct local representation.
+
+   In the block world, a set offset (logical block number or track/
+   sector) contains a disk label.  This label identifies the disk
+   uniquely.  In contrast, an OSD has a standard set of attributes on
+   its root object.  For device identification purposes, the OSD System
+   ID (root information attribute number 3) and the OSD Name (root
+   information attribute number 9) are used as the label.  These appear
+   in the pnfs_osd_deviceaddr4 type below under the "oda_systemid" and
+   "oda_osdname" fields.
+
+   In some situations, SCSI target discovery may need to be driven based
+   on information contained in the GETDEVICEINFO response.  One example
+   of this is Internet SCSI (iSCSI) targets that are not known to the
+   client until a layout has been requested.  The information provided
+   as the "oda_targetid", "oda_targetaddr", and "oda_lun" fields in the
+   pnfs_osd_deviceaddr4 type described below (see Section 4.2) allows
+   the client to probe a specific device given its network address and
+   optionally its iSCSI Name (see iSCSI [8]), or when the device network
+   address is omitted, allows it to discover the object storage device
+   using the provided device name or SCSI Device Identifier (see SPC-3
+   [9]).
+
+   The client uses the oda_systemid implicitly when it signs each
+   request with the request integrity check value, using the object
+   credential signing key.  This method protects the client from
+   unintentionally accessing a device if the device address mapping was
+   changed (or revoked).  The server computes the capability key using
+   its own view of the systemid associated with the respective deviceid
+   present in the credential.  If the client's view of the deviceid
+   mapping is stale, the client will use the wrong systemid (which must
+   be system-wide unique) and the I/O request to the OSD will fail to
+   pass the integrity check verification.
+
+   To recover from this condition, the client should report the error
+   and return the layout using LAYOUTRETURN, and invalidate all the
+   device address mappings associated with this layout.  The client can
+   then request a new layout using LAYOUTGET, if it wishes, and resolve
+   the referenced deviceids using GETDEVICEINFO or GETDEVICELIST.
+
+   The server MUST provide the oda_systemid and SHOULD also provide the
+   oda_osdname.  When the OSD name is present, the client SHOULD get the
+   root information attributes whenever it establishes communication
+   with the OSD and verify that the OSD name it got from the OSD matches
+   the one sent by the metadata server.  To do so, the client uses the
+   oda_root_obj_cred credentials.
+
+
+
+Halevy, et al.              Standards Track                     [Page 9]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+4.1.  pnfs_osd_targetid_type4
+
+   The following enum specifies the manner in which a SCSI target can be
+   specified.  The target can be specified as a SCSI Name or as a SCSI
+   Device Identifier, or it can be left anonymous (see Section 4.2.1).
+
+   /// enum pnfs_osd_targetid_type4 {
+   ///     OBJ_TARGET_ANON             = 1,
+   ///     OBJ_TARGET_SCSI_NAME        = 2,
+   ///     OBJ_TARGET_SCSI_DEVICE_ID   = 3
+   /// };
+   ///
+
+4.2.  pnfs_osd_deviceaddr4
+
+   The specification for an object device address is as follows:
+
+/// union pnfs_osd_targetid4 switch (pnfs_osd_targetid_type4 oti_type) {
+///     case OBJ_TARGET_SCSI_NAME:
+///         string              oti_scsi_name<>;
+///
+///     case OBJ_TARGET_SCSI_DEVICE_ID:
+///         opaque              oti_scsi_device_id<>;
+///
+///     default:
+///         void;
+/// };
+///
+/// union pnfs_osd_targetaddr4 switch (bool ota_available) {
+///     case TRUE:
+///         netaddr4            ota_netaddr;
+///     case FALSE:
+///         void;
+/// };
+///
+/// struct pnfs_osd_deviceaddr4 {
+///     pnfs_osd_targetid4      oda_targetid;
+///     pnfs_osd_targetaddr4    oda_targetaddr;
+///     opaque                  oda_lun[8];
+///     opaque                  oda_systemid<>;
+///     pnfs_osd_object_cred4   oda_root_obj_cred;
+///     opaque                  oda_osdname<>;
+/// };
+///
+
+
+
+
+
+
+
+Halevy, et al.              Standards Track                    [Page 10]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+4.2.1.  SCSI Target Identifier
+
+   When "oda_targetid" is specified as an OBJ_TARGET_SCSI_NAME, the
+   "oti_scsi_name" string MUST be formatted as an "iSCSI Name" as
+   specified in iSCSI [8] and [10].  Note that the specification of the
+   oti_scsi_name string format is outside the scope of this document.
+   Parsing the string is based on the string prefix, e.g., "iqn.",
+   "eui.", or "naa.", and more formats MAY be specified in the future
+   in accordance with iSCSI Name properties.
+
+   Currently, the iSCSI Name provides for naming the target device using
+   a string formatted as an iSCSI Qualified Name (IQN) or as an Extended
+   Unique Identifier (EUI) [11] string.  Those are typically used to
+   identify iSCSI or SCSI RDMA Protocol (SRP) [16] devices.  The
+   Network Address Authority (NAA) string format (see [10]) provides for
+   naming the device using globally unique identifiers, as defined in
+   Fibre Channel Framing and Signaling (FC-FS) [17].  These are
+   typically used to identify Fibre Channel or Serial Attached SCSI
+   (SAS) [18] devices, in particular devices that are dual-attached
+   both over Fibre Channel or SAS and over iSCSI.
+
+   When "oda_targetid" is specified as an OBJ_TARGET_SCSI_DEVICE_ID, the
+   "oti_scsi_device_id" opaque field MUST be formatted as a SCSI Device
+   Identifier as defined in SPC-3 [9] VPD Page 83h (Section 7.6.3,
+   "Device Identification VPD Page").  If the Device Identifier is
+   identical to the OSD System ID, as given by oda_systemid, the server
+   SHOULD provide a zero-length oti_scsi_device_id opaque value.  Note
+   that, as with "oti_scsi_name", the specification of the
+   oti_scsi_device_id opaque contents is outside the scope of this
+   document, and more formats MAY be specified in the future in
+   accordance with SPC-3.
+
+   The OBJ_TARGET_ANON pnfs_osd_targetid_type4 MAY be used for providing
+   no target identification.  In this case, only the OSD System ID, and
+   optionally the provided network address, are used to locate the
+   device.
+
+4.2.2.  Device Network Address
+
+   The optional "oda_targetaddr" field MAY be provided by the server as
+   a hint to accelerate device discovery over, e.g., the iSCSI transport
+   protocol.  The network address is given with the netaddr4 type, which
+   specifies a TCP/IP based endpoint (as specified in NFSv4.1 [6]).
+   When given, the client SHOULD use it to probe for the SCSI device at
+   the given network address.  The client MAY still use other discovery
+   mechanisms such as Internet Storage Name Service (iSNS) [12] to
+   locate the device using the oda_targetid.  In particular, such an
+
+
+
+
+Halevy, et al.              Standards Track                    [Page 11]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+   external name service SHOULD be used when the devices may be attached
+   to the network using multiple connections, and/or multiple storage
+   fabrics (e.g., Fibre-Channel and iSCSI).
+
+   The "oda_lun" field identifies the OSD 64-bit Logical Unit Number,
+   formatted in accordance with SAM-3 [13].  The client uses the Logical
+   Unit Number to communicate with the specific OSD Logical Unit.  Its
+   use is defined in detail by the SCSI transport protocol, e.g., iSCSI
+   [8].
+
+5.  Object-Based Layout
+
+   The layout4 type is defined in NFSv4.1 [6] as follows:
+
+   enum layouttype4 {
+       LAYOUT4_NFSV4_1_FILES   = 1,
+       LAYOUT4_OSD2_OBJECTS    = 2,
+       LAYOUT4_BLOCK_VOLUME    = 3
+   };
+
+   struct layout_content4 {
+       layouttype4             loc_type;
+       opaque                  loc_body<>;
+   };
+
+   struct layout4 {
+       offset4                 lo_offset;
+       length4                 lo_length;
+       layoutiomode4           lo_iomode;
+       layout_content4         lo_content;
+   };
+
+
+   This document defines the structure associated with the layouttype4
+   value LAYOUT4_OSD2_OBJECTS.  NFSv4.1 [6] specifies the loc_body
+   structure as an XDR type "opaque".  The opaque layout is
+   uninterpreted by the generic pNFS client layers, but must be
+   interpreted by the object storage layout driver.  This section
+   defines the structure of this opaque value, pnfs_osd_layout4.
+
+
+
+
+
+
+
+
+
+
+
+
+Halevy, et al.              Standards Track                    [Page 12]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+5.1.  pnfs_osd_data_map4
+
+   /// struct pnfs_osd_data_map4 {
+   ///     uint32_t                    odm_num_comps;
+   ///     length4                     odm_stripe_unit;
+   ///     uint32_t                    odm_group_width;
+   ///     uint32_t                    odm_group_depth;
+   ///     uint32_t                    odm_mirror_cnt;
+   ///     pnfs_osd_raid_algorithm4    odm_raid_algorithm;
+   /// };
+   ///
+
+   The pnfs_osd_data_map4 structure parameterizes the algorithm that
+   maps a file's contents over the component objects.  Instead of
+   limiting the system to a simple striping scheme where loss of a
+   single component object results in data loss, the map parameters
+   support mirroring and more complicated schemes that protect against
+   loss of a component object.
+
+   "odm_num_comps" is the number of component objects the file is
+   striped over.  The server MAY grow the file by adding more components
+   to the stripe while clients hold valid layouts until the file has
+   reached its final stripe width.  The file length in this case MUST be
+   limited to the number of bytes in a full stripe.
+
+   The "odm_stripe_unit" is the number of bytes placed on one component
+   before advancing to the next one in the list of components.  The
+   number of bytes in a full stripe is odm_stripe_unit times the number
+   of components.  In some RAID schemes, a stripe includes redundant
+   information (i.e., parity) that lets the system recover from loss or
+   damage to a component object.
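+
+   As an illustration only (the values are assumed and not taken from
+   any implementation), consider a layout in which odm_stripe_unit is
+   65536 bytes and odm_num_comps is 8.  Applying the rule above:
+
+      bytes in a full stripe = odm_stripe_unit * odm_num_comps
+                             = 65536 * 8
+                             = 524288
+
+   Under a scheme that includes parity, part of those 524288 bytes
+   would hold redundant information rather than file data.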
+
+   The "odm_group_width" and "odm_group_depth" parameters allow a nested
+   striping pattern (see Section 5.3.2 for details).  If there is no
+   nesting, then odm_group_width and odm_group_depth MUST be zero.  The
+   size of the components array MUST be a multiple of odm_group_width.
+
+   The "odm_mirror_cnt" is used to replicate a file by replicating its
+   component objects.  If there is no mirroring, then odm_mirror_cnt
+   MUST be 0.  If odm_mirror_cnt is greater than zero, then the size of
+   the component array MUST be a multiple of (odm_mirror_cnt+1).
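+
+   For example, under the assumed value odm_mirror_cnt = 2, the
+   component array size must be a multiple of odm_mirror_cnt + 1 = 3:
+
+      array size 12:  12 mod 3 = 0   (satisfies the rule)
+      array size 10:  10 mod 3 = 1   (violates the rule)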
+
+   See Section 5.3 for more details.
+
+
+
+
+
+
+
+
+Halevy, et al.              Standards Track                    [Page 13]
+
+RFC 5664                      pNFS Objects                  January 2010
+
+
+5.2.  pnfs_osd_layout4
+
+   /// struct pnfs_osd_layout4 {
+   ///     pnfs_osd_data_map4      olo_map;
+   ///     uint32_t                olo_comps_index;
+   ///     pnfs_osd_object_cred4   olo_components<>;
+   /// };
+   ///
+
+   The pnfs_osd_layout4 structure specifies a layout over a set of
+   component objects.  The "olo_components" field is an array of object
+   identifiers and security credentials that grant access to each
+   object.  The organization of the data is defined by the
+   pnfs_osd_data_map4 type that specifies how the file's data is mapped
+   onto the component objects (i.e., the striping pattern).  The data
+   placement algorithm that maps file data onto component objects
+   assumes that each component object occurs exactly once in the array
+   of components.  Therefore, component objects MUST appear in the
+   olo_components array only once.  The components array may represent
+   all objects comprising the file, in which case "olo_comps_index" is
+   set to zero and the number of entries in the olo_components array is
+   equal to olo_map.odm_num_comps.  The server MAY return fewer
+   components than odm_num_comps, provided that the returned components
+   are sufficient to access any byte in the layout's data range (e.g., a
+   sub-stripe of "odm_group_width" components).  In this case,
+   olo_comps_index represents the position of the returned components
+   array within the full array of components that comprise the file.
+
+   Note that the layout depends on the file size, which the client
+   learns from the generic return parameters of LAYOUTGET, by doing
+   GETATTR commands to the metadata server.  The client uses the file
+   size to decide if it should fill holes with zeros or return a short
+   read.  Striping patterns can cause cases where component objects are
+   shorter than other components because a hole happens to correspond to
+   the last part of the component object.
+
+5.3.  Data Mapping Schemes
+
+   This section describes the different data mapping schemes in detail.
+   The object layout always uses a "dense" layout as described in
+   NFSv4.1 [6].  This means that the second stripe unit of the file
+   starts at offset 0 of the second component, rather than at offset
+   stripe_unit bytes.  After a full stripe has been written, the next
+   stripe unit is appended to the first component object in the list
+   without any holes in the component objects.
+
+5.3.1.  Simple Striping
+
+   The mapping from the logical offset within a file (L) to the
+   component object C and object-specific offset O is defined by the
+   following equations:
+
+   L = logical offset into the file
+   W = total number of components
+   S = W * stripe_unit
+   N = L / S
+   C = (L-(N*S)) / stripe_unit
+   O = (N*stripe_unit)+(L%stripe_unit)
+
+   In these equations, S is the number of bytes in a full stripe, and N
+   is the stripe number.  C is an index into the array of components, so
+   it selects a particular object storage device.  Both N and C count
+   from zero.  O is the offset within the object that corresponds to the
+   file offset.  Note that this computation does not accommodate the
+   same object appearing in the olo_components array multiple times.
+
+   For example, consider an object striped over four devices, <D0 D1 D2
+   D3>.  The stripe_unit is 4096 bytes.  The stripe width S is thus 4 *
+   4096 = 16384.
+
+   Offset 0:
+     N = 0 / 16384 = 0
+     C = (0-(0*16384)) / 4096 = 0 (D0)
+     O = 0*4096 + (0%4096) = 0
+
+   Offset 4096:
+     N = 4096 / 16384 = 0
+     C = (4096-(0*16384)) / 4096 = 1 (D1)
+     O = (0*4096)+(4096%4096) = 0
+
+   Offset 9000:
+     N = 9000 / 16384 = 0
+     C = (9000-(0*16384)) / 4096 = 2 (D2)
+     O = (0*4096)+(9000%4096) = 808
+
+   Offset 132000:
+     N = 132000 / 16384 = 8
+     C = (132000-(8*16384)) / 4096 = 0 (D0)
+     O = (8*4096) + (132000%4096) = 33696
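+
+   As a non-normative sketch, the equations and worked offsets above
+   can be reproduced in Python.  The function name simple_stripe is an
+   assumption of this sketch, and "/" in the equations is taken to be
+   integer division.
+
```python
def simple_stripe(L, stripe_unit, W):
    """Map logical file offset L to (component index C, object offset O)."""
    S = W * stripe_unit                    # bytes in a full stripe
    N = L // S                             # stripe number
    C = (L - N * S) // stripe_unit         # index into the components array
    O = N * stripe_unit + L % stripe_unit  # offset within component C
    return C, O


# The four offsets from the example (four devices, 4096-byte stripe unit).
for offset in (0, 4096, 9000, 132000):
    print(offset, simple_stripe(offset, 4096, 4))
```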
+
+5.3.2.  Nested Striping
+
+   The odm_group_width and odm_group_depth parameters allow a nested
+   striping pattern. odm_group_width defines the width of a data stripe
+   and odm_group_depth defines how many stripes are written before
+   advancing to the next group of components in the list of component
+   objects for the file.  The math used to map from a file offset to a
+   component object and offset within that object is shown below.  The
+   computations map from the logical offset L to the component index C
+   and the relative offset O within that component object.
+
+   L = logical offset into the file
+   W = total number of components
+   S = stripe_unit * group_depth * W
+   T = stripe_unit * group_depth * group_width
+   U = stripe_unit * group_width
+   M = L / S
+   G = (L - (M * S)) / T
+   H = (L - (M * S)) % T
+   N = H / U
+   C = (H - (N * U)) / stripe_unit + G * group_width
+   O = L % stripe_unit + N * stripe_unit + M * group_depth * stripe_unit
+
+   In these equations, S is the number of bytes striped across all
+   component objects before the pattern repeats.  T is the number of
+   bytes striped within a group of component objects before advancing to
+   the next group.  U is the number of bytes in a stripe within a group.
+   M is the "major" (i.e., across all components) stripe number, and N
+   is the "minor" (i.e., across the group) stripe number.  G counts the
+   groups from the beginning of the major stripe, and H is the byte
+   offset within the group.
+
+   For example, consider an object striped over 100 devices with a
+   group_width of 10, a group_depth of 50, and a stripe_unit of 1 MB.
+   In this scheme, 500 MB are written to the first 10 components, and
+   5000 MB are written before the pattern wraps back around to the first
+   component in the array.
+
+   Offset 0:
+     W = 100
+     S = 1 MB * 50 * 100 = 5000 MB
+     T = 1 MB * 50 * 10 = 500 MB
+     U = 1 MB * 10 = 10 MB
+     M = 0 / 5000 MB = 0
+     G = (0 - (0 * 5000 MB)) / 500 MB = 0
+     H = (0 - (0 * 5000 MB)) % 500 MB = 0
+     N = 0 / 10 MB = 0
+     C = (0 - (0 * 10 MB)) / 1 MB + 0 * 10 = 0
+     O = 0 % 1 MB + 0 * 1 MB + 0 * 50 * 1 MB = 0
+
+   Offset 27 MB:
+     M = 27 MB / 5000 MB = 0
+     G = (27 MB - (0 * 5000 MB)) / 500 MB = 0
+     H = (27 MB - (0 * 5000 MB)) % 500 MB = 27 MB
+     N = 27 MB / 10 MB = 2
+     C = (27 MB - (2 * 10 MB)) / 1 MB + 0 * 10 = 7
+     O = 27 MB % 1 MB + 2 * 1 MB + 0 * 50 * 1 MB = 2 MB
+
+   Offset 7232 MB:
+     M = 7232 MB / 5000 MB = 1
+     G = (7232 MB - (1 * 5000 MB)) / 500 MB = 4
+     H = (7232 MB - (1 * 5000 MB)) % 500 MB = 232 MB
+     N = 232 MB / 10 MB = 23
+     C = (232 MB - (23 * 10 MB)) / 1 MB + 4 * 10 = 42
+     O = 7232 MB % 1 MB + 23 * 1 MB + 1 * 50 * 1 MB = 73 MB
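+
+   As a non-normative sketch, the nested striping equations can be
+   written in Python and checked against the three offsets above.  The
+   function name nested_stripe and the choice of 2**20 bytes for "MB"
+   are assumptions of this sketch; any consistent unit gives the same
+   component indices.
+
```python
MB = 1 << 20  # assumption: "MB" in the example taken as 2**20 bytes


def nested_stripe(L, stripe_unit, W, group_width, group_depth):
    """Map logical file offset L to (component index C, object offset O)."""
    S = stripe_unit * group_depth * W            # bytes before the pattern repeats
    T = stripe_unit * group_depth * group_width  # bytes written to one group
    U = stripe_unit * group_width                # bytes in one stripe of a group
    M = L // S                                   # major stripe number
    G = (L - M * S) // T                         # group within the major stripe
    H = (L - M * S) % T                          # byte offset within the group
    N = H // U                                   # minor stripe number
    C = (H - N * U) // stripe_unit + G * group_width
    O = L % stripe_unit + N * stripe_unit + M * group_depth * stripe_unit
    return C, O
```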
+
+5.3.3.  Mirroring
+
+   The odm_mirror_cnt is used to replicate a file by replicating its
+   component objects.  If there is no mirroring, then odm_mirror_cnt
+   MUST be 0.  If odm_mirror_cnt is greater than zero, then the size of
+   the olo_components array MUST be a multiple of (odm_mirror_cnt+1).
+   Thus, for a classic mirror on two objects, odm_mirror_cnt is one.
+   Note that mirroring can be defined over any RAID algorithm and
+   striping pattern (either simple or nested).  If odm_group_width is
+   also non-zero, then the size of the olo_components array MUST be a
+   multiple of odm_group_width * (odm_mirror_cnt+1).  Replicas are
+   adjacent in the olo_components array, and the value C produced by the
+   above equations is not a direct index into the olo_components array.
+   Instead, the following equations determine the replica component
+   index RCi, where i ranges from 0 to odm_mirror_cnt.
+
+   C = component index for striping or two-level striping
+   i ranges from 0 to odm_mirror_cnt, inclusive
+   RCi = C * (odm_mirror_cnt+1) + i
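+
+   As a non-normative sketch, the replica index computation can be
+   expressed in Python.  The function name replica_indices is an
+   assumption of this sketch.
+
```python
def replica_indices(C, mirror_cnt):
    """Return the olo_components indices RC0..RCn holding replicas of C."""
    return [C * (mirror_cnt + 1) + i for i in range(mirror_cnt + 1)]
```
+
+   For a classic two-way mirror (odm_mirror_cnt of one), component
+   index 2 maps to array entries 4 and 5.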
+
+5.4.  RAID Algorithms
+
+   pnfs_osd_raid_algorithm4 determines the algorithm and placement of
+   redundant data.  This section defines the different redundancy
+   algorithms.  Note: The term "RAID" (Redundant Array of Independent
+   Disks) is used in this document to represent an array of component
+   objects that store data for an individual file.  The objects are
+   stored on independent object-based storage devices.  File data is
+   encoded and striped across the array of component objects using
+   algorithms developed for block-based RAID systems.
+
+5.4.1.  PNFS_OSD_RAID_0
+
+   PNFS_OSD_RAID_0 means there is no parity data, so all bytes in the
+   component objects are data bytes located by the above equations for C
+   and O.  If a component object is marked as PNFS_OSD_MISSING and a
+   read touches that component, the pNFS client MUST either return an
+   I/O error or retry the READ against the pNFS server.
+
+5.4.2.  PNFS_OSD_RAID_4
+
+   PNFS_OSD_RAID_4 means that the last component object, or the last in
+   each group (if odm_group_width is greater than zero), contains parity
+   information computed over the rest of the stripe with an XOR
+   operation.  If a component object is unavailable, the client can read
+   the rest of the stripe units in the damaged stripe and recompute the
+   missing stripe unit by XORing the other stripe units in the stripe.
+   Or the client can replay the READ against the pNFS server that will
+   presumably perform the reconstructed read on the client's behalf.
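+
+   As a non-normative sketch, the XOR reconstruction of one missing
+   stripe unit can be written in Python.  The function name xor_units
+   is an assumption of this sketch, and all stripe units are assumed
+   to have the same length.
+
```python
from functools import reduce


def xor_units(units):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


# Parity is the XOR of the data units; XORing the surviving units with
# the parity unit yields the missing one again.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xaa\x55"
parity = xor_units([d0, d1, d2])
recovered_d1 = xor_units([d0, d2, parity])
```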
+
+   When parity is present in the file, then there is an additional
+   computation to map from the file offset L to the offset that accounts
+   for embedded parity, L'.  First compute L', and then use L' in the
+   above equations for C and O.
+
+   L = file offset, not accounting for parity
+   P = number of parity devices in each stripe
+   W = group_width, if not zero, else size of olo_components array
+   N = L / ((W-P) * stripe_unit)
+   L' = N * (W * stripe_unit) +
+        (L % ((W-P) * stripe_unit))
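+
+   As a non-normative sketch, the parity adjustment can be written in
+   Python.  The function name is an assumption of this sketch, and the
+   number of data bytes per stripe is taken to be (W-P) multiplied by
+   stripe_unit, i.e., the subtraction is grouped before the
+   multiplication.
+
```python
def parity_adjusted_offset(L, stripe_unit, W, P=1):
    """Return L', the file offset L adjusted for embedded parity."""
    data_bytes = (W - P) * stripe_unit  # data bytes in one stripe
    N = L // data_bytes                 # stripe number
    return N * (W * stripe_unit) + L % data_bytes
```
+
+   With four components, one parity unit per stripe, and a 4096-byte
+   stripe unit, each stripe carries 12288 data bytes, so file offset
+   12288 maps to L' = 16384, the start of the second stripe.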
+
+5.4.3.  PNFS_OSD_RAID_5
+
+   PNFS_OSD_RAID_5 means that the position of the parity data is rotated
+   on each stripe or each group (if odm_group_width is greater than
+   zero).  In the first stripe, the last component holds the parity.  In
+   the second stripe, the next-to-last component holds the parity, and
+   so on.  In this scheme, all stripe units are rotated so that I/O is
+   evenly spread across objects as the file is read sequentially.  The
+   rotated parity layout is illustrated here, with numbers indicating
+   the stripe unit.
+
+   0 1 2 P
+   4 5 P 3
+   8 P 6 7
+   P 9 a b
+
+   To compute the component object C, first compute the offset that
+   accounts for parity L' and use that to compute C.  Then rotate C to
+   get C'.  Finally, increase C' by one if the parity information comes
+   at or before C' within that stripe.  The following equations
+   illustrate this by computing I, which is the index of the component
+   that contains parity for a given stripe.
+
+   L = file offset, not accounting for parity
+   W = odm_group_width, if not zero, else size of olo_components array
+   N = L / ((W-1) * stripe_unit)
+   (Compute L' as described above)
+   (Compute C based on L' as described above)
+   C' = (C - (N%W)) % W
+   I = W - (N%W) - 1
+   if (C' <= I) {
+     C'++
+   }
+
+5.4.4.  PNFS_OSD_RAID_PQ
+
+   PNFS_OSD_RAID_PQ is a double-parity scheme that uses the Reed-Solomon
+   P+Q encoding scheme [19].  In this layout, the last two component
+   objects hold the P and Q data, respectively.  P is parity computed
+   with XOR, and Q is a more complex equation that is not described
+   here.  The equations given above for embedded parity can be used to
+   map a file offset to the correct component object by setting the
+   number of parity components to 2 instead of 1 for RAID4 or RAID5.
+   Clients may simply choose to read data through the metadata server if
+   two components are missing or damaged.
+
+5.4.5.  RAID Usage and Implementation Notes
+
+   RAID layouts with redundant data in their stripes require additional
+   serialization of updates to ensure correct operation.  Otherwise, if
+   two clients simultaneously write to the same logical range of an
+   object, the result could include different data in the same ranges of
+   mirrored tuples, or corrupt parity information.  It is the
+   responsibility of the metadata server to enforce serialization
+   requirements such as this.  For example, the metadata server may do
+   so by not granting overlapping write layouts within mirrored objects.
+
+6.  Object-Based Layout Update
+
+   layoutupdate4 is used in the LAYOUTCOMMIT operation to convey updates
+   to the layout and additional information to the metadata server.  It
+   is defined in NFSv4.1 [6] as follows:
+
+   struct layoutupdate4 {
+       layouttype4             lou_type;
+       opaque                  lou_body<>;
+   };
+
+   The layoutupdate4 type is an opaque value at the generic pNFS client
+   level.  If the lou_type layout type is LAYOUT4_OSD2_OBJECTS, then the
+   lou_body opaque value is defined by the pnfs_osd_layoutupdate4 type.
+
+   Object-Based pNFS clients are not allowed to modify the layout.
+   Therefore, the information passed in pnfs_osd_layoutupdate4 is used
+   only to update the file's attributes.  In addition to the generic
+   information the client can pass to the metadata server in
+   LAYOUTCOMMIT such as the highest offset the client wrote to and the
+   last time it modified the file, the client MAY use
+   pnfs_osd_layoutupdate4 to convey the capacity consumed (or released)
+   by writes using the layout, and to indicate that I/O errors were
+   encountered by such writes.
+
+6.1.  pnfs_osd_deltaspaceused4
+
+   /// union pnfs_osd_deltaspaceused4 switch (bool dsu_valid) {
+   ///     case TRUE:
+   ///         int64_t     dsu_delta;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+
+   pnfs_osd_deltaspaceused4 is used to convey space utilization
+   information at the time of LAYOUTCOMMIT.  For the file system to
+   properly maintain capacity-used information, it needs to track how
+   much capacity was consumed by WRITE operations performed by the
+   client.  In this protocol, the OSD returns the capacity consumed by a
+   write, which can be different from the number of bytes written
+   because of internal overhead like block-level allocation and indirect
+   blocks, and the client reflects this back to the pNFS server so it
+   can accurately track quota.  The pNFS server can choose to trust this
+   information coming from the clients and therefore avoid querying the
+   OSDs at the time of LAYOUTCOMMIT.  If the client is unable to obtain
+   this information from the OSD, it simply returns
+   olu_delta_space_used with dsu_valid set to FALSE.
+
+6.2.  pnfs_osd_layoutupdate4
+
+   /// struct pnfs_osd_layoutupdate4 {
+   ///     pnfs_osd_deltaspaceused4    olu_delta_space_used;
+   ///     bool                        olu_ioerr_flag;
+   /// };
+   ///
+
+   "olu_delta_space_used" is used to convey capacity usage information
+   back to the metadata server.
+
+   The "olu_ioerr_flag" is used when I/O errors were encountered while
+   writing the file.  The client MUST report the errors using the
+   pnfs_osd_ioerr4 structure (see Section 8.2) at LAYOUTRETURN time.
+
+   If the client updated the file successfully before hitting the I/O
+   errors, it MAY use LAYOUTCOMMIT to update the metadata server as
+   described above.  Typically, in the error-free case, the server MAY
+   turn around and update the file's attributes on the storage devices.
+   However, if I/O errors were encountered, the server should not
+   attempt to write the new attributes on the storage devices until it
+   receives the I/O error report; therefore, the client MUST set the
+   olu_ioerr_flag to true.  Note that in this case, the client SHOULD
+   send both the LAYOUTCOMMIT and LAYOUTRETURN operations in the same
+   COMPOUND RPC.
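+
+   As a non-normative sketch, the XDR encoding of
+   pnfs_osd_layoutupdate4 can be produced with Python's struct module.
+   The function name and its parameters are assumptions of this
+   sketch; the encoding relies only on the standard XDR rules that a
+   bool occupies four bytes and an int64_t eight bytes, both
+   big-endian.
+
```python
import struct


def encode_layoutupdate(delta=None, ioerr=False):
    """Encode pnfs_osd_layoutupdate4; delta=None selects the void arm."""
    if delta is None:
        body = struct.pack(">I", 0)          # dsu_valid = FALSE
    else:
        body = struct.pack(">Iq", 1, delta)  # dsu_valid = TRUE, dsu_delta
    return body + struct.pack(">I", 1 if ioerr else 0)  # olu_ioerr_flag
```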
+
+7.  Recovering from Client I/O Errors
+
+   The pNFS client may encounter errors when directly accessing the
+   object storage devices.  However, it is the responsibility of the
+   metadata server to handle the I/O errors.  When the
+   LAYOUT4_OSD2_OBJECTS layout type is used, the client MUST report the
+   I/O errors to the server at LAYOUTRETURN time using the
+   pnfs_osd_ioerr4 structure (see Section 8.2).
+
+   The metadata server analyzes the error and determines the required
+   recovery operations such as repairing any parity inconsistencies,
+   recovering media failures, or reconstructing missing objects.
+
+   The metadata server SHOULD recall any outstanding layouts to allow it
+   exclusive write access to the stripes being recovered and to prevent
+   other clients from hitting the same error condition.  In these cases,
+   the server MUST complete recovery before handing out any new layouts
+   to the affected byte ranges.
+
+   Although it MAY be acceptable for the client to propagate a
+   corresponding error to the application that initiated the I/O
+   operation and drop any unwritten data, the client SHOULD attempt to
+   retry the original I/O operation by requesting a new layout using
+   LAYOUTGET and retry the I/O operation(s) using the new layout, or the
+   client MAY just retry the I/O operation(s) using regular NFS READ or
+   WRITE operations via the metadata server.  The client SHOULD attempt
+   to retrieve a new layout and retry the I/O operation using OSD
+   commands first and only if the error persists, retry the I/O
+   operation via the metadata server.
+
+8.  Object-Based Layout Return
+
+   layoutreturn_file4 is used in the LAYOUTRETURN operation to convey
+   layout-type specific information to the server.  It is defined in
+   NFSv4.1 [6] as follows:
+
+   struct layoutreturn_file4 {
+           offset4         lrf_offset;
+           length4         lrf_length;
+           stateid4        lrf_stateid;
+           /* layouttype4 specific data */
+           opaque          lrf_body<>;
+   };
+
+   union layoutreturn4 switch(layoutreturn_type4 lr_returntype) {
+           case LAYOUTRETURN4_FILE:
+                   layoutreturn_file4      lr_layout;
+           default:
+                   void;
+   };
+
+   struct LAYOUTRETURN4args {
+           /* CURRENT_FH: file */
+           bool                    lora_reclaim;
+           layoutreturn_stateid    lora_recallstateid;
+           layouttype4             lora_layout_type;
+           layoutiomode4           lora_iomode;
+           layoutreturn4           lora_layoutreturn;
+   };
+
+   If the lora_layout_type layout type is LAYOUT4_OSD2_OBJECTS, then the
+   lrf_body opaque value is defined by the pnfs_osd_layoutreturn4 type.
+
+   The pnfs_osd_layoutreturn4 type allows the client to report I/O error
+   information back to the metadata server as defined below.
+
+8.1.  pnfs_osd_errno4
+
+   /// enum pnfs_osd_errno4 {
+   ///     PNFS_OSD_ERR_EIO            = 1,
+   ///     PNFS_OSD_ERR_NOT_FOUND      = 2,
+   ///     PNFS_OSD_ERR_NO_SPACE       = 3,
+   ///     PNFS_OSD_ERR_BAD_CRED       = 4,
+   ///     PNFS_OSD_ERR_NO_ACCESS      = 5,
+   ///     PNFS_OSD_ERR_UNREACHABLE    = 6,
+   ///     PNFS_OSD_ERR_RESOURCE       = 7
+   /// };
+   ///
+
+   pnfs_osd_errno4 is used to represent error types when read/write
+   errors are reported to the metadata server.  The error codes serve as
+   hints to the metadata server that may help it in diagnosing the exact
+   reason for the error and in repairing it.
+
+   o  PNFS_OSD_ERR_EIO indicates the operation failed because the object
+      storage device experienced a failure trying to access the object.
+      The most common source of these errors is media errors, but other
+      internal errors might cause this as well.  In this case, the
+      metadata server should examine the broken object more closely;
+      hence, PNFS_OSD_ERR_EIO should be used as the default error code.
+
+   o  PNFS_OSD_ERR_NOT_FOUND indicates the object ID specifies an object
+      that does not exist on the object storage device.
+
+   o  PNFS_OSD_ERR_NO_SPACE indicates the operation failed because the
+      object storage device ran out of free capacity during the
+      operation.
+
+   o  PNFS_OSD_ERR_BAD_CRED indicates the security parameters are not
+      valid.  The primary cause of this is that the capability has
+      expired, or the access policy tag (a.k.a., capability version
+      number) has been changed to revoke capabilities.  The client will
+      need to return the layout and get a new one with fresh
+      capabilities.
+
+   o  PNFS_OSD_ERR_NO_ACCESS indicates the capability does not allow the
+      requested operation.  This should not occur in normal operation
+      because the metadata server should give out correct capabilities,
+      or none at all.
+
+   o  PNFS_OSD_ERR_UNREACHABLE indicates the client did not complete the
+      I/O operation at the object storage device due to a communication
+      failure.  Whether or not the I/O operation was executed by the OSD
+      is undetermined.
+
+   o  PNFS_OSD_ERR_RESOURCE indicates the client did not issue the I/O
+      operation due to a local problem on the initiator (i.e., client)
+      side, e.g., when running out of memory.  The client MUST guarantee
+      that the OSD command was never dispatched to the OSD.
+
+8.2.  pnfs_osd_ioerr4
+
+   /// struct pnfs_osd_ioerr4 {
+   ///     pnfs_osd_objid4     oer_component;
+   ///     length4             oer_comp_offset;
+   ///     length4             oer_comp_length;
+   ///     bool                oer_iswrite;
+   ///     pnfs_osd_errno4     oer_errno;
+   /// };
+   ///
+
+   The pnfs_osd_ioerr4 structure is used to return error indications for
+   objects that generated errors during data transfers.  These are hints
+   to the metadata server that there are problems with that object.  For
+   each error, "oer_component", "oer_comp_offset", and "oer_comp_length"
+   represent the object and byte range within the component object in
+   which the error occurred; "oer_iswrite" is set to "true" if the
+   failed OSD operation was data modifying, and "oer_errno" represents
+   the type of error.
+
+   Component byte ranges in the optional pnfs_osd_ioerr4 structure are
+   used for recovering the object and MUST be set by the client to cover
+   all failed I/O operations to the component.
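+
+   As a non-normative sketch, a client could derive one covering byte
+   range per component from its failed I/O operations as follows.  The
+   function name, the tuple layout, and the choice to report a single
+   merged range per component (rather than one entry per failure) are
+   assumptions of this sketch.
+
```python
def covering_ranges(failed_ios):
    """Merge failed I/Os into one (offset, length) range per component.

    failed_ios is an iterable of (component, offset, length) tuples.
    """
    spans = {}
    for comp, offset, length in failed_ios:
        end = offset + length
        if comp in spans:
            lo, hi = spans[comp]
            spans[comp] = (min(lo, offset), max(hi, end))
        else:
            spans[comp] = (offset, end)
    # Convert (start, end) back to (oer_comp_offset, oer_comp_length).
    return {comp: (lo, hi - lo) for comp, (lo, hi) in spans.items()}
```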
+
+8.3.  pnfs_osd_layoutreturn4
+
+   /// struct pnfs_osd_layoutreturn4 {
+   ///     pnfs_osd_ioerr4             olr_ioerr_report<>;
+   /// };
+   ///
+
+   When OSD I/O operations fail, "olr_ioerr_report<>" is used to
+   report these errors to the metadata server as an array of elements of
+   type pnfs_osd_ioerr4.  Each element in the array represents an error
+   that occurred on the object specified by oer_component.  If no errors
+   are to be reported, the size of the olr_ioerr_report<> array is set
+   to zero.
+
+9.  Object-Based Creation Layout Hint
+
+   The layouthint4 type is defined in NFSv4.1 [6] as follows:
+
+   struct layouthint4 {
+       layouttype4           loh_type;
+       opaque                loh_body<>;
+   };
+
+   The layouthint4 structure is used by the client to pass a hint about
+   the type of layout it would like created for a particular file.  If
+   the loh_type layout type is LAYOUT4_OSD2_OBJECTS, then the loh_body
+   opaque value is defined by the pnfs_osd_layouthint4 type.
+
+9.1.  pnfs_osd_layouthint4
+
+   /// union pnfs_osd_max_comps_hint4 switch (bool omx_valid) {
+   ///     case TRUE:
+   ///         uint32_t            omx_max_comps;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+   /// union pnfs_osd_stripe_unit_hint4 switch (bool osu_valid) {
+   ///     case TRUE:
+   ///         length4             osu_stripe_unit;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+   /// union pnfs_osd_group_width_hint4 switch (bool ogw_valid) {
+   ///     case TRUE:
+   ///         uint32_t            ogw_group_width;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+   /// union pnfs_osd_group_depth_hint4 switch (bool ogd_valid) {
+   ///     case TRUE:
+   ///         uint32_t            ogd_group_depth;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+   /// union pnfs_osd_mirror_cnt_hint4 switch (bool omc_valid) {
+   ///     case TRUE:
+   ///         uint32_t            omc_mirror_cnt;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+   /// union pnfs_osd_raid_algorithm_hint4 switch (bool ora_valid) {
+   ///     case TRUE:
+   ///         pnfs_osd_raid_algorithm4    ora_raid_algorithm;
+   ///     case FALSE:
+   ///         void;
+   /// };
+   ///
+   /// struct pnfs_osd_layouthint4 {
+   ///     pnfs_osd_max_comps_hint4        olh_max_comps_hint;
+   ///     pnfs_osd_stripe_unit_hint4      olh_stripe_unit_hint;
+   ///     pnfs_osd_group_width_hint4      olh_group_width_hint;
+   ///     pnfs_osd_group_depth_hint4      olh_group_depth_hint;
+   ///     pnfs_osd_mirror_cnt_hint4       olh_mirror_cnt_hint;
+   ///     pnfs_osd_raid_algorithm_hint4   olh_raid_algorithm_hint;
+   /// };
+   ///
+
+   This type conveys hints for the desired data map.  All parameters are
+   optional so the client can give values for only the parameters it
+   cares about, e.g., it can provide a hint for the desired number of
+   mirrored components, regardless of the RAID algorithm selected for
+   the file.  The server should make an attempt to honor the hints, but
+   it can ignore any or all of them at its own discretion and without
+   failing the respective CREATE operation.
+
+   The "olh_max_comps_hint" can be used to limit the total number of
+   component objects comprising the file.  All other hints correspond
+   directly to the different fields of pnfs_osd_data_map4.
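+
+   As a non-normative sketch, the optional hints can be XDR-encoded
+   with Python's struct module.  The function names and keyword
+   parameters are assumptions of this sketch; each union is encoded as
+   a four-byte bool discriminant followed by its value only when the
+   discriminant is TRUE, and length4 is taken to be a 64-bit unsigned
+   integer as in NFSv4.1.
+
```python
import struct


def _optional(fmt, value):
    """Encode a bool-discriminated union; None selects the void arm."""
    if value is None:
        return struct.pack(">I", 0)
    return struct.pack(">I" + fmt, 1, value)


def encode_layouthint(max_comps=None, stripe_unit=None, group_width=None,
                      group_depth=None, mirror_cnt=None, raid_algorithm=None):
    """Encode pnfs_osd_layouthint4 with only the supplied hints set."""
    return b"".join([
        _optional("I", max_comps),
        _optional("Q", stripe_unit),     # length4: 64-bit unsigned
        _optional("I", group_width),
        _optional("I", group_depth),
        _optional("I", mirror_cnt),
        _optional("i", raid_algorithm),  # enum: 32-bit signed
    ])
```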
+
+10.  Layout Segments
+
+   The pnfs layout operations operate on logical byte ranges.  There is
+   no requirement in the protocol for any relationship between byte
+   ranges used in LAYOUTGET to acquire layouts and byte ranges used in
+   CB_LAYOUTRECALL, LAYOUTCOMMIT, or LAYOUTRETURN.  However, using OSD
+   byte-range capabilities poses limitations on these operations since
+   the capabilities associated with layout segments cannot be merged or
+   split.  The following guidelines should be followed for proper
+   operation of object-based layouts.
+
+10.1.  CB_LAYOUTRECALL and LAYOUTRETURN
+
+   In general, the object-based layout driver should keep track of each
+   layout segment it got, keeping record of the segment's iomode,
+   offset, and length.  The server should allow the client to get
+   multiple overlapping layout segments but is free to recall the layout
+   to prevent overlap.
+
+   In response to CB_LAYOUTRECALL, the client should return all layout
+   segments matching the given iomode and overlapping with the recalled
+   range.  When returning the layouts for this byte range with
+   LAYOUTRETURN, the client MUST NOT return a sub-range of a layout
+   segment it has; each LAYOUTRETURN sent MUST completely cover at least
+   one outstanding layout segment.
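+
+   As a non-normative sketch, a client could select the segments to
+   return for a recall as follows.  The function name and the
+   (iomode, offset, length) tuple layout are assumptions of this
+   sketch, and iomode matching is simplified to plain equality.
+
```python
def segments_to_return(held, iomode, offset, length):
    """Pick held segments matching iomode that overlap the recalled range.

    Whole segments are selected, never sub-ranges, so that every
    resulting LAYOUTRETURN covers a complete outstanding segment.
    """
    end = offset + length
    return [seg for seg in held
            if seg[0] == iomode and seg[1] < end and offset < seg[1] + seg[2]]
```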
+
+   The server, in turn, should release any segment that exactly matches
+   the clientid, iomode, and byte range given in LAYOUTRETURN.  If no
+   exact match is found, then the server should release all layout
+   segments matching the clientid and iomode and that are fully
+   contained in the returned byte range.  If none are found and the byte
+   range is a subset of an outstanding layout segment for the same
+   clientid and iomode, then the client can be considered malfunctioning
+   and the server SHOULD recall all layouts from this client to reset
+   its state.  If this behavior repeats, the server SHOULD deny all
+   LAYOUTGETs from this client.
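+
+   As a non-normative sketch, the server-side matching order described
+   above can be written in Python.  The function name, the
+   (clientid, iomode, offset, length) tuple layout, and the returned
+   action strings are assumptions of this sketch.
+
```python
def process_layoutreturn(segments, clientid, iomode, offset, length):
    """Decide which segments a LAYOUTRETURN releases.

    Returns (action, segments) where action is "release", "recall_all"
    or "none".
    """
    mine = [s for s in segments if s[0] == clientid and s[1] == iomode]
    # 1. Prefer a segment that matches the returned byte range exactly.
    exact = [s for s in mine if (s[2], s[3]) == (offset, length)]
    if exact:
        return "release", exact
    # 2. Otherwise release every segment fully contained in the range.
    end = offset + length
    inside = [s for s in mine if offset <= s[2] and s[2] + s[3] <= end]
    if inside:
        return "release", inside
    # 3. A strict sub-range of a held segment indicates a broken client.
    if any(s[2] <= offset and end <= s[2] + s[3] for s in mine):
        return "recall_all", []
    return "none", []
```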
+
+10.2.  LAYOUTCOMMIT
+
+   LAYOUTCOMMIT is only used by object-based pNFS to convey
+   modified-attribute hints and/or to report the presence of I/O
+   errors to the
+   metadata server (MDS).  Therefore, the offset and length in
+   LAYOUTCOMMIT4args are reserved for future use and should be set to 0.
+
+11.  Recalling Layouts
+
+   The object-based metadata server should recall outstanding layouts in
+   the following cases:
+
+   o  When the file's security policy changes, i.e., Access Control
+      Lists (ACLs) or permission mode bits are set.
+
+   o  When the file's aggregation map changes, rendering outstanding
+      layouts invalid.
+
+   o  When there are sharing conflicts.  For example, the server will
+      issue stripe-aligned layout segments for RAID-5 objects.  To
+      prevent corruption of the file's parity, multiple clients must not
+      hold valid write layouts for the same stripes.  An outstanding
+      READ/WRITE (RW) layout should be recalled when a conflicting
+      LAYOUTGET is received from a different client for LAYOUTIOMODE4_RW
+      and for a byte range overlapping with the outstanding layout
+      segment.
+
+11.1.  CB_RECALL_ANY
+
+   The metadata server can use the CB_RECALL_ANY callback operation to
+   notify the client to return some or all of its layouts.  NFSv4.1
+   [6] defines the following types:
+
+   const RCA4_TYPE_MASK_OBJ_LAYOUT_MIN     = 8;
+   const RCA4_TYPE_MASK_OBJ_LAYOUT_MAX     = 9;
+
+   struct  CB_RECALL_ANY4args      {
+       uint32_t        craa_objects_to_keep;
+       bitmap4         craa_type_mask;
+   };
+
+   Typically, CB_RECALL_ANY will be used to recall client state when the
+   server needs to reclaim resources.  The craa_type_mask bitmap
+   specifies the type of resources that are recalled and the
+   craa_objects_to_keep value specifies how many of the recalled objects
+   the client is allowed to keep.  The object-based layout type mask
+   flags are defined as follows.  They represent the iomode of the
+   recalled layouts.  In response, the client SHOULD return layouts of
+   the recalled iomode that it needs the least, keeping at most
+   craa_objects_to_keep object-based layouts.
+
+   /// enum pnfs_osd_cb_recall_any_mask {
+   ///     PNFS_OSD_RCA4_TYPE_MASK_READ = 8,
+   ///     PNFS_OSD_RCA4_TYPE_MASK_RW   = 9
+   /// };
+   ///
+
+   The PNFS_OSD_RCA4_TYPE_MASK_READ flag notifies the client to return
+   layouts of iomode LAYOUTIOMODE4_READ.  Similarly, the
+   PNFS_OSD_RCA4_TYPE_MASK_RW flag notifies the client to return layouts
+   of iomode LAYOUTIOMODE4_RW.  When both mask flags are set, the client
+   is notified to return layouts of either iomode.
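+
+   As a non-normative illustration, a client might decode the mask and
+   choose layouts to return as sketched below in Python.  The function
+   names and the (iomode, last_used) tuples are hypothetical.  Treating
+   "least recently used" as "needs the least", and applying
+   craa_objects_to_keep to the total number of object-based layouts
+   held, are assumptions of this sketch rather than requirements of
+   this document.
+
```python
PNFS_OSD_RCA4_TYPE_MASK_READ = 8
PNFS_OSD_RCA4_TYPE_MASK_RW = 9
LAYOUTIOMODE4_READ = 1
LAYOUTIOMODE4_RW = 2


def bit_is_set(bitmap_words, bit):
    """Test bit number `bit` in a bitmap4 given as a list of 32-bit words."""
    word, pos = divmod(bit, 32)
    return word < len(bitmap_words) and bool(bitmap_words[word] >> pos & 1)


def recalled_iomodes(craa_type_mask):
    """Map the object-layout mask flags to the iomodes being recalled."""
    modes = set()
    if bit_is_set(craa_type_mask, PNFS_OSD_RCA4_TYPE_MASK_READ):
        modes.add(LAYOUTIOMODE4_READ)
    if bit_is_set(craa_type_mask, PNFS_OSD_RCA4_TYPE_MASK_RW):
        modes.add(LAYOUTIOMODE4_RW)
    return modes


def layouts_to_return(layouts, craa_type_mask, craa_objects_to_keep):
    """Pick layouts to return from a list of (iomode, last_used) tuples.

    Only layouts of a recalled iomode are candidates; the least recently
    used candidates are returned first (an assumed heuristic).
    """
    modes = recalled_iomodes(craa_type_mask)
    excess = max(0, len(layouts) - craa_objects_to_keep)
    candidates = sorted((l for l in layouts if l[0] in modes),
                        key=lambda l: l[1])
    return candidates[:excess]
```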
+
+12.  Client Fencing
+
+   In cases where clients are uncommunicative and their lease has
+   expired, or when clients fail to return recalled layouts within at
+   least one lease period (see "Recalling a Layout" [6]), the server
+   MAY revoke client layouts and/or device address mappings and reassign
+   these resources to other clients.  To avoid data corruption, the
+   metadata server MUST fence off the revoked clients from the
+   respective objects as described in Section 13.4.
+
+13.  Security Considerations
+
+   The pNFS extension partitions the NFSv4 file system protocol into two
+   parts, the control path and the data path (storage protocol).  The
+   control path contains all the new operations described by this
+   extension; all existing NFSv4 security mechanisms and features apply
+   to the control path.  The combination of components in a pNFS system
+   is required to preserve the security properties of NFSv4 with respect
+   to an entity accessing data via a client, including security
+   countermeasures to defend against threats that NFSv4 provides
+   defenses for in environments where these threats are considered
+   significant.
+
+   The metadata server enforces the file access-control policy at
+   LAYOUTGET time.  The client should use suitable authorization
+   credentials for getting the layout for the requested iomode (READ or
+   RW) and the server verifies the permissions and ACL for these
+   credentials, possibly returning NFS4ERR_ACCESS if the client is not
+   allowed the requested iomode.  If the LAYOUTGET operation succeeds,
+   the client receives, as part of the layout, a set of object
+   capabilities allowing it I/O access to the specified objects
+   corresponding to the requested iomode.  When the client acts on I/O
+   operations on behalf of its local users, it MUST authenticate and
+   authorize the user by issuing respective OPEN and ACCESS calls to the
+   metadata server, similar to having NFSv4 data delegations.  If access
+   is allowed, the client uses the corresponding (READ or RW)
+   capabilities to perform the I/O operations at the object storage
+   devices.  When the metadata server receives a request to change a
+   file's permissions or ACL, it SHOULD recall all layouts for that file
+   and it MUST change the capability version attribute on all objects
+   comprising the file to implicitly invalidate any outstanding
+   capabilities before committing to the new permissions and ACL.  Doing
+   this will ensure that clients re-authorize their layouts according to
+   the modified permissions and ACL by requesting new layouts.
+   Recalling the layouts in this case is a courtesy of the server,
+   intended to prevent clients from getting an error on I/Os done after
+   the capability version changed.
+
+   The object storage protocol MUST implement the security aspects
+   described in version 1 of the T10 OSD protocol definition [1].  The
+   standard defines four security methods: NOSEC, CAPKEY, CMDRSP, and
+   ALLDATA.  To provide a minimum level of security that allows
+   verification and enforcement of the server access control policy
+   using the layout security credentials, the NOSEC security method
+   MUST NOT be used for any I/O operation.  The remainder of this
+   section gives an overview of the security mechanism described in
+   that standard.  The goal is to give the reader a basic understanding
+   of the object security model.  Any discrepancies between this text
+   and the actual standard are to be resolved in favor of the OSD
+   standard.
+
+13.1.  OSD Security Data Types
+
+   There are three main data types associated with object security: a
+   capability, a credential, and security parameters.  The capability is
+   a set of fields that specifies an object and what operations can be
+   performed on it.  A credential is a signed capability.  Only a
+   security manager that knows the secret device keys can correctly sign
+   a capability to form a valid credential.  In pNFS, the file server
+   acts as the security manager and returns signed capabilities (i.e.,
+   credentials) to the pNFS client.  The security parameters are values
+   computed by the issuer of OSD commands (i.e., the client) that prove
+   they hold valid credentials.  The client uses the credential as a
+   signing key to sign the requests it makes to the OSD, and puts the
+   resulting signatures into the security_parameters field of the OSD
+   command.  The object storage device uses the secret keys it shares
+   with the security manager to validate the signature values in the
+   security parameters.
+
+   The security types are opaque to the generic layers of the pNFS
+   client.  The credential contents are defined as opaque within the
+   pnfs_osd_object_cred4 type.  Instead of repeating the definitions
+   here, the reader is referred to Section 4.9.2.2 of the OSD standard.
+
+13.2.  The OSD Security Protocol
+
+   The object storage protocol relies on a cryptographically secure
+   capability to control accesses at the object storage devices.
+   Capabilities are generated by the metadata server, returned to the
+   client, and used by the client as described below to authenticate
+   its requests to the object-based storage device.  Capabilities
+   therefore achieve the required access and open mode checking.  They
+   allow the file server to define and check a policy (e.g., open mode)
+   and the OSD to enforce that policy without knowing the details (e.g.,
+   user IDs and ACLs).
+
+   Since capabilities are tied to layouts, and since they are used to
+   enforce access control, when the file ACL or mode changes, the
+   outstanding capabilities MUST be revoked to enforce the new access
+   permissions.  The server SHOULD recall layouts to allow clients to
+   gracefully return their capabilities before the access permissions
+   change.
+
+   Each capability is specific to a particular object, an operation on
+   that object, and a byte range within the object (in OSDv2), and it
+   has an explicit expiration time.  The capabilities are signed with a
+   secret key that is shared by the object storage devices and the
+   metadata managers.  Clients do not have device keys, so they are
+   unable to forge the signatures in the security parameters.  The
+   combination of a capability, the OSD System ID, and a signature is
+   called a "credential" in the OSD specification.
+
+   The details of the security and privacy model for object storage are
+   defined in the T10 OSD standard.  The following sketch of the
+   algorithm should help the reader understand the basic model.
+
+   LAYOUTGET returns a CapKey and a Cap, which, together with the OSD
+   System ID, are also called a credential.  It is a capability and a
+   signature over that capability and the SystemID.  The OSD Standard
+   refers to the CapKey as the "Credential integrity check value" and to
+   the ReqMAC as the "Request integrity check value".
+
+   CapKey = MAC<SecretKey>(Cap, SystemID)
+   Credential = {Cap, SystemID, CapKey}
+
+   The client uses CapKey to sign all the requests it issues for that
+   object using the respective Cap.  In other words, the Cap appears in
+   the request to the storage device, and that request is signed with
+   the CapKey as follows:
+
+   ReqMAC = MAC<CapKey>(Req, ReqNonce)
+   Request = {Cap, Req, ReqNonce, ReqMAC}
+
+   The following is sent to the OSD: {Cap, Req, ReqNonce, ReqMAC}.  The
+   OSD uses the SecretKey it shares with the metadata server to compare
+   the ReqMAC the client sent with a locally computed value:
+
+   LocalCapKey = MAC<SecretKey>(Cap, SystemID)
+   LocalReqMAC = MAC<LocalCapKey>(Req, ReqNonce)
+
+   and if they match, the OSD assumes that the capabilities came from an
+   authentic metadata server and allows access to the object, as allowed
+   by the Cap.
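+
+   As a non-normative illustration, the exchange sketched above can be
+   modeled in Python as follows.  The use of HMAC-SHA1, the
+   length-prefixed field encoding, the dictionary layout, and all
+   function names are assumptions made for this sketch; the actual
+   algorithms and wire formats are defined by the OSD standard [1].
+
```python
import hashlib
import hmac


def mac(key: bytes, *fields: bytes) -> bytes:
    """Keyed MAC over length-prefixed fields (the encoding is an assumption)."""
    h = hmac.new(key, digestmod=hashlib.sha1)
    for field in fields:
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


def issue_credential(secret_key: bytes, cap: bytes, system_id: bytes) -> dict:
    """Metadata server: CapKey = MAC<SecretKey>(Cap, SystemID)."""
    cap_key = mac(secret_key, cap, system_id)
    return {"cap": cap, "system_id": system_id, "cap_key": cap_key}


def sign_request(credential: dict, req: bytes, nonce: bytes) -> dict:
    """Client: ReqMAC = MAC<CapKey>(Req, ReqNonce); CapKey is never sent."""
    req_mac = mac(credential["cap_key"], req, nonce)
    return {"cap": credential["cap"], "req": req,
            "nonce": nonce, "req_mac": req_mac}


def osd_verify(secret_key: bytes, system_id: bytes, request: dict) -> bool:
    """OSD: recompute both MACs locally and compare in constant time."""
    local_cap_key = mac(secret_key, request["cap"], system_id)
    local_req_mac = mac(local_cap_key, request["req"], request["nonce"])
    return hmac.compare_digest(local_req_mac, request["req_mac"])
```
+
+   In this model, a request whose Cap has been altered, or one checked
+   against a different SecretKey, fails verification because the
+   locally derived CapKey no longer matches the one used by the client.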
+
+13.3.  Protocol Privacy Requirements
+
+   Note that if the server's LAYOUTGET reply, holding CapKey and Cap, is
+   snooped by another client, it can be used to generate valid OSD
+   requests (within the Cap access restrictions).
+
+   To meet the privacy requirements for the capability key returned by
+   LAYOUTGET, the GSS-API [7] framework can be used, e.g., by using the
+   RPCSEC_GSS privacy method to send the LAYOUTGET operation or by
+   using the SSV key to encrypt the oc_capability_key using the
+   GSS_Wrap() function.  Two general ways to provide privacy in the
+   absence of GSS-API that are independent of NFSv4 are either an
+   isolated network such as a VLAN or a secure channel provided by
+   IPsec [15].
+
+13.4.  Revoking Capabilities
+
+   At any time, the metadata server may invalidate all outstanding
+   capabilities on an object by changing its POLICY ACCESS TAG
+   attribute.  The value of the POLICY ACCESS TAG is part of a
+   capability, and it must match the state of the object attribute.  If
+   they do not match, the OSD rejects accesses to the object with the
+   sense key set to ILLEGAL REQUEST and an additional sense code set to
+   INVALID FIELD IN CDB.  When a client attempts to use a capability and
+   is rejected this way, it should issue a LAYOUTCOMMIT for the object
+   and specify PNFS_OSD_BAD_CRED in the olr_ioerr_report parameter.  The
+   client may elect to issue a compound LAYOUTRETURN/LAYOUTGET (or
+   LAYOUTCOMMIT/LAYOUTRETURN/LAYOUTGET) to attempt to fetch a refreshed
+   set of capabilities.
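+
+   As a non-normative illustration, the revocation check can be
+   modeled in Python as follows.  The class and function names are
+   hypothetical, only the field relevant to revocation is modeled, and
+   incrementing the tag is just one assumed way of choosing a new
+   value.
+
```python
from dataclasses import dataclass


@dataclass
class StoredObject:  # hypothetical model of an object held by the OSD
    policy_access_tag: int


@dataclass
class Capability:  # only the field relevant to revocation is modeled
    policy_access_tag: int


def revoke_capabilities(obj: StoredObject) -> None:
    """Metadata server: changing the tag invalidates outstanding capabilities."""
    obj.policy_access_tag += 1


def osd_check_tag(obj: StoredObject, cap: Capability):
    """OSD: return None on a match, else (sense key, additional sense code)."""
    if cap.policy_access_tag != obj.policy_access_tag:
        return ("ILLEGAL REQUEST", "INVALID FIELD IN CDB")
    return None
```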
+
+   The metadata server may elect to change the POLICY ACCESS TAG on an
+   object at any time, for any reason (with the understanding that there
+   is likely an associated performance penalty, especially if there are
+   outstanding layouts for this object).  The metadata server MUST
+   revoke outstanding capabilities when any one of the following occurs:
+
+   o  the permissions on the object change,
+
+   o  a conflicting mandatory byte-range lock is granted, or
+
+   o  a layout is revoked and reassigned to another client.
+
+   A pNFS client will typically hold one layout for each byte range for
+   either READ or READ/WRITE.  The client's credentials are checked by
+   the metadata server at LAYOUTGET time and it is the client's
+   responsibility to enforce access control among multiple users
+   accessing the same file.  It is neither required nor expected that
+   the pNFS client will obtain a separate layout for each user accessing
+   a shared object.  The client SHOULD use OPEN and ACCESS calls to
+   check user permissions when performing I/O so that the server's
+   access control policies are correctly enforced.  The result of the
+   ACCESS operation may be cached while the client holds a valid layout
+   as the server is expected to recall layouts when the file's access
+   permissions or ACL change.
+
+14.  IANA Considerations
+
+   As described in NFSv4.1 [6], new layout type numbers have been
+   assigned by IANA.  This document defines the protocol associated with
+   the existing layout type number, LAYOUT4_OSD2_OBJECTS, and it
+   requires no further action from IANA.
+
+15.  References
+
+15.1.  Normative References
+
+   [1]   Weber, R., "Information Technology - SCSI Object-Based Storage
+         Device Commands (OSD)", ANSI INCITS 400-2004, December 2004.
+
+   [2]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
+         Levels", BCP 14, RFC 2119, March 1997.
+
+   [3]   Eisler, M., "XDR: External Data Representation Standard",
+         STD 67, RFC 4506, May 2006.
+
+   [4]   Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., "Network
+         File System (NFS) Version 4 Minor Version 1 External Data
+         Representation Standard (XDR) Description", RFC 5662,
+         January 2010.
+
+   [5]   IETF Trust, "Legal Provisions Relating to IETF Documents",
+         November 2008,
+         <http://trustee.ietf.org/docs/IETF-Trust-License-Policy.pdf>.
+
+   [6]   Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., "Network
+         File System (NFS) Version 4 Minor Version 1 Protocol",
+         RFC 5661, January 2010.
+
+   [7]   Linn, J., "Generic Security Service Application Program
+         Interface Version 2, Update 1", RFC 2743, January 2000.
+
+   [8]   Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E.
+         Zeidner, "Internet Small Computer Systems Interface (iSCSI)",
+         RFC 3720, April 2004.
+
+   [9]   Weber, R., "SCSI Primary Commands - 3 (SPC-3)", ANSI
+         INCITS 408-2005, October 2005.
+
+   [10]  Krueger, M., Chadalapaka, M., and R. Elliott, "T11 Network
+         Address Authority (NAA) Naming Format for iSCSI Node Names",
+         RFC 3980, February 2005.
+
+   [11]  IEEE, "Guidelines for 64-bit Global Identifier (EUI-64)
+         Registration Authority",
+         <http://standards.ieee.org/regauth/oui/tutorials/EUI64.html>.
+
+   [12]  Tseng, J., Gibbons, K., Travostino, F., Du Laney, C., and J.
+         Souza, "Internet Storage Name Service (iSNS)", RFC 4171,
+         September 2005.
+
+   [13]  Weber, R., "SCSI Architecture Model - 3 (SAM-3)", ANSI
+         INCITS 402-2005, February 2005.
+
+15.2.  Informative References
+
+   [14]  Weber, R., "SCSI Object-Based Storage Device Commands -2
+         (OSD-2)", January 2009,
+         <http://www.t10.org/cgi-bin/ac.pl?t=f&f=osd2r05a.pdf>.
+
+   [15]  Kent, S. and K. Seo, "Security Architecture for the Internet
+         Protocol", RFC 4301, December 2005.
+
+   [16]  T10 1415-D, "SCSI RDMA Protocol (SRP)", ANSI INCITS 365-2002,
+         December 2002.
+
+   [17]  T11 1619-D, "Fibre Channel Framing and Signaling - 2
+         (FC-FS-2)", ANSI INCITS 424-2007, February 2007.
+
+   [18]  T10 1601-D, "Serial Attached SCSI - 1.1 (SAS-1.1)", ANSI
+         INCITS 417-2006, June 2006.
+
+   [19]  MacWilliams, F. and N. Sloane, "The Theory of Error-Correcting
+         Codes, Part I", 1977.
+
+Appendix A.  Acknowledgments
+
+   Todd Pisek was a co-editor of the initial versions of this document.
+   Daniel E. Messinger, Pete Wyckoff, Mike Eisler, Sean P. Turner, Brian
+   E. Carpenter, Jari Arkko, David Black, and Jason Glasgow reviewed and
+   commented on this document.
+
+Authors' Addresses
+
+   Benny Halevy
+   Panasas, Inc.
+   1501 Reedsdale St. Suite 400
+   Pittsburgh, PA  15233
+   USA
+
+   Phone: +1-412-323-3500
+   EMail: bhalevy@panasas.com
+   URI:   http://www.panasas.com/
+
+
+   Brent Welch
+   Panasas, Inc.
+   6520 Kaiser Drive
+   Fremont, CA  95444
+   USA
+
+   Phone: +1-510-608-7770
+   EMail: welch@panasas.com
+   URI:   http://www.panasas.com/
+
+
+   Jim Zelenka
+   Panasas, Inc.
+   1501 Reedsdale St. Suite 400
+   Pittsburgh, PA  15233
+   USA
+
+   Phone: +1-412-323-3500
+   EMail: jimz@panasas.com
+   URI:   http://www.panasas.com/
+