| author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
|---|---|---|
| committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
| commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
| tree | e3989f47a7994642eb325063d46e8f08ffa681dc | /doc/rfc/rfc3530.txt |
| parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) | |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc3530.txt')

    -rw-r--r--  doc/rfc/rfc3530.txt  15403

1 file changed, 15403 insertions, 0 deletions
diff --git a/doc/rfc/rfc3530.txt b/doc/rfc/rfc3530.txt
new file mode 100644
index 0000000..93422d3
--- /dev/null
+++ b/doc/rfc/rfc3530.txt
@@ -0,0 +1,15403 @@

Network Working Group                                         S. Shepler
Request for Comments: 3530                                  B. Callaghan
Obsoletes: 3010                                              D. Robinson
Category: Standards Track                                     R. Thurlow
                                                  Sun Microsystems, Inc.
                                                                C. Beame
                                                        Hummingbird Ltd.
                                                               M. Eisler
                                                               D. Noveck
                                                 Network Appliance, Inc.
                                                              April 2003

              Network File System (NFS) version 4 Protocol

Status of this Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

The Network File System (NFS) version 4 is a distributed filesystem protocol which owes its heritage to NFS protocol version 2, RFC 1094, and version 3, RFC 1813. Unlike earlier versions, the NFS version 4 protocol supports traditional file access while integrating support for file locking and the mount protocol. In addition, support for strong security (and its negotiation), compound operations, client caching, and internationalization has been added. Of course, attention has been applied to making NFS version 4 operate well in an Internet environment.

This document replaces RFC 3010 as the definition of the NFS version 4 protocol.

Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Table of Contents

1.  Introduction
    1.1.  Changes since RFC 3010
    1.2.  NFS version 4 Goals
    1.3.  Inconsistencies of this Document with Section 18
    1.4.  Overview of NFS version 4 Features
          1.4.1.  RPC and Security
          1.4.2.  Procedure and Operation Structure
          1.4.3.  Filesystem Model
                  1.4.3.1.  Filehandle Types
                  1.4.3.2.  Attribute Types
                  1.4.3.3.  Filesystem Replication and Migration
          1.4.4.  OPEN and CLOSE
          1.4.5.  File locking
          1.4.6.  Client Caching and Delegation
    1.5.  General Definitions
2.  Protocol Data Types
    2.1.  Basic Data Types
    2.2.  Structured Data Types
3.  RPC and Security Flavor
    3.1.  Ports and Transports
          3.1.1.  Client Retransmission Behavior
    3.2.  Security Flavors
          3.2.1.  Security mechanisms for NFS version 4
                  3.2.1.1.  Kerberos V5 as a security triple
                  3.2.1.2.  LIPKEY as a security triple
                  3.2.1.3.  SPKM-3 as a security triple
    3.3.  Security Negotiation
          3.3.1.  SECINFO
          3.3.2.  Security Error
    3.4.  Callback RPC Authentication
4.  Filehandles
    4.1.  Obtaining the First Filehandle
          4.1.1.  Root Filehandle
          4.1.2.  Public Filehandle
    4.2.  Filehandle Types
          4.2.1.  General Properties of a Filehandle
          4.2.2.  Persistent Filehandle
          4.2.3.  Volatile Filehandle
          4.2.4.  One Method of Constructing a Volatile Filehandle
    4.3.  Client Recovery from Filehandle Expiration
5.  File Attributes
    5.1.  Mandatory Attributes
    5.2.  Recommended Attributes
    5.3.  Named Attributes
    5.4.  Classification of Attributes
    5.5.  Mandatory Attributes - Definitions
    5.6.  Recommended Attributes - Definitions
    5.7.  Time Access
    5.8.  Interpreting owner and owner_group
    5.9.  Character Case Attributes
    5.10. Quota Attributes
    5.11. Access Control Lists
          5.11.1.  ACE type
          5.11.2.  ACE Access Mask
          5.11.3.  ACE flag
          5.11.4.  ACE who
          5.11.5.  Mode Attribute
          5.11.6.  Mode and ACL Attribute
          5.11.7.  mounted_on_fileid
6.  Filesystem Migration and Replication
    6.1.  Replication
    6.2.  Migration
    6.3.  Interpretation of the fs_locations Attribute
    6.4.  Filehandle Recovery for Migration or Replication
7.  NFS Server Name Space
    7.1.  Server Exports
    7.2.  Browsing Exports
    7.3.  Server Pseudo Filesystem
    7.4.  Multiple Roots
    7.5.  Filehandle Volatility
    7.6.  Exported Root
    7.7.  Mount Point Crossing
    7.8.  Security Policy and Name Space Presentation
8.  File Locking and Share Reservations
    8.1.  Locking
          8.1.1.  Client ID
          8.1.2.  Server Release of Clientid
          8.1.3.  lock_owner and stateid Definition
          8.1.4.  Use of the stateid and Locking
          8.1.5.  Sequencing of Lock Requests
          8.1.6.  Recovery from Replayed Requests
          8.1.7.  Releasing lock_owner State
          8.1.8.  Use of Open Confirmation
    8.2.  Lock Ranges
    8.3.  Upgrading and Downgrading Locks
    8.4.  Blocking Locks
    8.5.  Lease Renewal
    8.6.  Crash Recovery
          8.6.1.  Client Failure and Recovery
          8.6.2.  Server Failure and Recovery
          8.6.3.  Network Partitions and Recovery
    8.7.  Recovery from a Lock Request Timeout or Abort
    8.8.  Server Revocation of Locks
    8.9.  Share Reservations
    8.10. OPEN/CLOSE Operations
          8.10.1.  Close and Retention of State Information
    8.11. Open Upgrade and Downgrade
    8.12. Short and Long Leases
    8.13. Clocks, Propagation Delay, and Calculating Lease Expiration
    8.14. Migration, Replication and State
          8.14.1.  Migration and State
          8.14.2.  Replication and State
          8.14.3.  Notification of Migrated Lease
          8.14.4.  Migration and the Lease_time Attribute
9.  Client-Side Caching
    9.1.  Performance Challenges for Client-Side Caching
    9.2.  Delegation and Callbacks
          9.2.1.  Delegation Recovery
    9.3.  Data Caching
          9.3.1.  Data Caching and OPENs
          9.3.2.  Data Caching and File Locking
          9.3.3.  Data Caching and Mandatory File Locking
          9.3.4.  Data Caching and File Identity
    9.4.  Open Delegation
          9.4.1.  Open Delegation and Data Caching
          9.4.2.  Open Delegation and File Locks
          9.4.3.  Handling of CB_GETATTR
          9.4.4.  Recall of Open Delegation
          9.4.5.  Clients that Fail to Honor Delegation Recalls
          9.4.6.  Delegation Revocation
    9.5.  Data Caching and Revocation
          9.5.1.  Revocation Recovery for Write Open Delegation
    9.6.  Attribute Caching
    9.7.  Data and Metadata Caching and Memory Mapped Files
    9.8.  Name Caching
    9.9.  Directory Caching
10. Minor Versioning
11. Internationalization
    11.1. Stringprep profile for the utf8str_cs type
          11.1.1.  Intended applicability of the nfs4_cs_prep profile
          11.1.2.  Character repertoire of nfs4_cs_prep
          11.1.3.  Mapping used by nfs4_cs_prep
          11.1.4.  Normalization used by nfs4_cs_prep
          11.1.5.  Prohibited output for nfs4_cs_prep
          11.1.6.  Bidirectional output for nfs4_cs_prep
    11.2. Stringprep profile for the utf8str_cis type
          11.2.1.  Intended applicability of the nfs4_cis_prep profile
          11.2.2.  Character repertoire of nfs4_cis_prep
          11.2.3.  Mapping used by nfs4_cis_prep
          11.2.4.  Normalization used by nfs4_cis_prep
          11.2.5.  Prohibited output for nfs4_cis_prep
          11.2.6.  Bidirectional output for nfs4_cis_prep
    11.3. Stringprep profile for the utf8str_mixed type
          11.3.1.  Intended applicability of the nfs4_mixed_prep profile
          11.3.2.  Character repertoire of nfs4_mixed_prep
          11.3.3.  Mapping used by nfs4_cis_prep
          11.3.4.  Normalization used by nfs4_mixed_prep
          11.3.5.  Prohibited output for nfs4_mixed_prep
          11.3.6.  Bidirectional output for nfs4_mixed_prep
    11.4. UTF-8 Related Errors
12. Error Definitions
13. NFS version 4 Requests
    13.1. Compound Procedure
    13.2. Evaluation of a Compound Request
    13.3. Synchronous Modifying Operations
    13.4. Operation Values
14. NFS version 4 Procedures
    14.1. Procedure 0: NULL - No Operation
    14.2. Procedure 1: COMPOUND - Compound Operations
          14.2.1.   Operation 3: ACCESS - Check Access Rights
          14.2.2.   Operation 4: CLOSE - Close File
          14.2.3.   Operation 5: COMMIT - Commit Cached Data
          14.2.4.   Operation 6: CREATE - Create a Non-Regular File Object
          14.2.5.   Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery
          14.2.6.   Operation 8: DELEGRETURN - Return Delegation
          14.2.7.   Operation 9: GETATTR - Get Attributes
          14.2.8.   Operation 10: GETFH - Get Current Filehandle
          14.2.9.   Operation 11: LINK - Create Link to a File
          14.2.10.  Operation 12: LOCK - Create Lock
          14.2.11.  Operation 13: LOCKT - Test For Lock
          14.2.12.  Operation 14: LOCKU - Unlock File
          14.2.13.  Operation 15: LOOKUP - Lookup Filename
          14.2.14.  Operation 16: LOOKUPP - Lookup Parent Directory
          14.2.15.  Operation 17: NVERIFY - Verify Difference in Attributes
          14.2.16.  Operation 18: OPEN - Open a Regular File
          14.2.17.  Operation 19: OPENATTR - Open Named Attribute Directory
          14.2.18.  Operation 20: OPEN_CONFIRM - Confirm Open
          14.2.19.  Operation 21: OPEN_DOWNGRADE - Reduce Open File Access
          14.2.20.  Operation 22: PUTFH - Set Current Filehandle
          14.2.21.  Operation 23: PUTPUBFH - Set Public Filehandle
          14.2.22.  Operation 24: PUTROOTFH - Set Root Filehandle
          14.2.23.  Operation 25: READ - Read from File
          14.2.24.  Operation 26: READDIR - Read Directory
          14.2.25.  Operation 27: READLINK - Read Symbolic Link
          14.2.26.  Operation 28: REMOVE - Remove Filesystem Object
          14.2.27.  Operation 29: RENAME - Rename Directory Entry
          14.2.28.  Operation 30: RENEW - Renew a Lease
          14.2.29.  Operation 31: RESTOREFH - Restore Saved Filehandle
          14.2.30.  Operation 32: SAVEFH - Save Current Filehandle
          14.2.31.  Operation 33: SECINFO - Obtain Available Security
          14.2.32.  Operation 34: SETATTR - Set Attributes
          14.2.33.  Operation 35: SETCLIENTID - Negotiate Clientid
          14.2.34.  Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid
          14.2.35.  Operation 37: VERIFY - Verify Same Attributes
          14.2.36.  Operation 38: WRITE - Write to File
          14.2.37.  Operation 39: RELEASE_LOCKOWNER - Release Lockowner State
          14.2.38.  Operation 10044: ILLEGAL - Illegal operation
15. NFS version 4 Callback Procedures
    15.1. Procedure 0: CB_NULL - No Operation
    15.2. Procedure 1: CB_COMPOUND - Compound Operations
          15.2.1.  Operation 3: CB_GETATTR - Get Attributes
          15.2.2.  Operation 4: CB_RECALL - Recall an Open Delegation
          15.2.3.  Operation 10044: CB_ILLEGAL - Illegal Callback Operation
16. Security Considerations
17. IANA Considerations
    17.1. Named Attribute Definition
    17.2. ONC RPC Network Identifiers (netids)
18. RPC definition file
19. Acknowledgements
20. Normative References
21. Informative References
22. Authors' Information
    22.1. Editor's Address
    22.2. Authors' Addresses
23. Full Copyright Statement

1.  Introduction

1.1.  Changes since RFC 3010

This definition of the NFS version 4 protocol replaces or obsoletes the definition present in [RFC3010]. While portions of the two documents have remained the same, there have been substantive changes in others. The changes made between [RFC3010] and this document represent implementation experience and further review of the protocol. While some modifications were made for ease of implementation or clarification, most updates address errors or situations where the [RFC3010] definition was untenable.

The following list is not exhaustive, but presents some of the most notable changes and additions:

o  The state model has added an open_owner4 identifier. This was done to accommodate POSIX-based clients and the model they use for file locking. For POSIX clients, an open_owner4 would correspond to a file descriptor potentially shared amongst a set of processes, and the lock_owner4 identifier would correspond to a process that is locking a file.

o  Clarifications and error conditions were added for the handling of the owner and group attributes. Since these attributes are string based (as opposed to the numeric uid/gid of previous versions of NFS), translations may not be available; hence the changes.

o  Clarifications for the ACL and mode attributes to address evaluation and partial support.

o  For identifiers that are defined as XDR opaque, limits were set on their size.

o  Added the mounted_on_fileid attribute to allow POSIX clients to correctly construct local mounts.

o  Modified the SETCLIENTID/SETCLIENTID_CONFIRM operations to deal correctly with confirmation details along with adding the ability to specify new client callback information. Also added clarification of the callback information itself.

o  Added a new operation RELEASE_LOCKOWNER to enable notifying the server that a lock_owner4 will no longer be used by the client.

o  RENEW operation changes to identify the client correctly and allow for additional error returns.

o  Verified error return possibilities for all operations.

o  Removed use of the pathname4 data type from LOOKUP and OPEN in favor of having the client construct a sequence of LOOKUP operations to achieve the same effect.

o  Clarification of the internationalization issues and adoption of the new stringprep profile framework.

1.2.  NFS Version 4 Goals

The NFS version 4 protocol is a further revision of the NFS protocol defined already by versions 2 [RFC1094] and 3 [RFC1813]. It retains the essential characteristics of previous versions: design for easy recovery; independence from transport protocols, operating systems, and filesystems; simplicity; and good performance.
The NFS version 4 revision has the following goals:

o  Improved access and good performance on the Internet.

   The protocol is designed to transit firewalls easily, perform well where latency is high and bandwidth is low, and scale to very large numbers of clients per server.

o  Strong security with negotiation built into the protocol.

   The protocol builds on the work of the ONCRPC working group in supporting the RPCSEC_GSS protocol. Additionally, the NFS version 4 protocol provides a mechanism that allows clients and servers to negotiate security, and it requires clients and servers to support a minimal set of security schemes.

o  Good cross-platform interoperability.

   The protocol features a filesystem model that provides a useful, common set of features that does not unduly favor one filesystem or operating system over another.

o  Designed for protocol extensions.

   The protocol is designed to accept standard extensions that do not compromise backward compatibility.

1.3.  Inconsistencies of this Document with Section 18

Section 18, RPC Definition File, contains the definitions, in XDR description language, of the constructs used by the protocol. Prior to Section 18, several of the constructs are reproduced for purposes of explanation. The reader is warned of the possibility of errors in the reproduced constructs outside of Section 18. For any part of the document that is inconsistent with Section 18, Section 18 is to be considered authoritative.

1.4.  Overview of NFS version 4 Features

To provide a reasonable context for the reader, the major features of the NFS version 4 protocol will be reviewed in brief. This is done to provide an appropriate context both for the reader who is familiar with the previous versions of the NFS protocol and for the reader who is new to the NFS protocols. For the reader new to the NFS protocols, some fundamental knowledge is still expected: the reader should be familiar with the XDR and RPC protocols as described in [RFC1831] and [RFC1832]. A basic knowledge of filesystems and distributed filesystems is expected as well.

1.4.1.  RPC and Security

As with previous versions of NFS, the External Data Representation (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS version 4 protocol are those defined in [RFC1831] and [RFC1832]. To meet end-to-end security requirements, the RPCSEC_GSS framework [RFC2203] will be used to extend the basic RPC security. With the use of RPCSEC_GSS, various mechanisms can be provided to offer authentication, integrity, and privacy to the NFS version 4 protocol. Kerberos V5 will be used as described in [RFC1964] to provide one security framework. The LIPKEY GSS-API mechanism described in [RFC2847] will be used to provide for the use of user password and server public key by the NFS version 4 protocol. With the use of RPCSEC_GSS, other mechanisms may also be specified and used for NFS version 4 security.
To enable in-band security negotiation, the NFS version 4 protocol adds a new operation which provides the client a method of querying the server about its policies regarding which security mechanisms must be used for access to the server's filesystem resources. With this, the client can securely match the security mechanism that meets the policies specified at both the client and server.

1.4.2.  Procedure and Operation Structure

A significant departure from the previous versions of the NFS protocol is the introduction of the COMPOUND procedure. For the NFS version 4 protocol, there are two RPC procedures, NULL and COMPOUND. The COMPOUND procedure is defined in terms of operations, and these operations correspond more closely to the traditional NFS procedures.

With the use of the COMPOUND procedure, the client is able to build simple or complex requests. These COMPOUND requests allow for a reduction in the number of RPCs needed for logical filesystem operations. For example, without previous contact with a server, a client will be able to read data from a file in one request by combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. With previous versions of the NFS protocol, this type of single request was not possible.

The model used for COMPOUND is very simple. There is no logical OR or AND construct for combining operations. The operations combined within a COMPOUND request are evaluated in order by the server. Once an operation returns a failing result, the evaluation ends and the results of all evaluated operations are returned to the client.

The NFS version 4 protocol continues to have the client refer to a file or directory at the server by a "filehandle". The COMPOUND procedure has a method of passing a filehandle from one operation to another within the sequence of operations. There is a concept of a "current filehandle" and a "saved filehandle". Most operations use the "current filehandle" as the filesystem object to operate upon. The "saved filehandle" is used as temporary filehandle storage within a COMPOUND procedure as well as an additional operand for certain operations.
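The evaluation model is small enough to sketch in code. Below is a minimal, hypothetical illustration in C of server-side COMPOUND evaluation; the types and names are illustrative, not the protocol's actual XDR. Operations run strictly in order, evaluation stops at the first failure, and a result exists for every operation that was evaluated, including the failing one.

      /* compound_eval.c -- hypothetical sketch of COMPOUND evaluation.
       * Types and names are illustrative, not taken from the RFC's XDR. */
      #include <stddef.h>

      typedef int nfsstat4;          /* operation status; 0 == NFS4_OK */
      #define NFS4_OK 0

      struct nfs_op {
          /* handler for one operation, e.g. LOOKUP, OPEN, or READ;
           * fh_state carries the current/saved filehandles */
          nfsstat4 (*run)(void *args, void *fh_state);
          void *args;
      };

      /* Evaluate ops in order; stop after the first failing result.
       * Returns the number of operations actually evaluated; the caller
       * encodes one result per evaluated op, including the failed one. */
      size_t compound_eval(struct nfs_op *ops, size_t nops, void *fh_state)
      {
          size_t i;
          for (i = 0; i < nops; i++) {
              if (ops[i].run(ops[i].args, fh_state) != NFS4_OK)
                  return i + 1;    /* evaluation ends on first failure */
          }
          return nops;
      }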
1.4.3.  Filesystem Model

The general filesystem model used for the NFS version 4 protocol is the same as previous versions. The server filesystem is hierarchical, with the regular files contained within being treated as opaque byte streams. In a slight departure, file and directory names are encoded with UTF-8 to deal with the basics of internationalization.

The NFS version 4 protocol does not require a separate protocol to provide for the initial mapping between path name and filehandle. Instead of using the older MOUNT protocol for this mapping, the server provides a ROOT filehandle that represents the logical root or top of the filesystem tree provided by the server. The server provides multiple filesystems by gluing them together with pseudo filesystems. These pseudo filesystems provide for potential gaps in the path names between real filesystems.

1.4.3.1.  Filehandle Types

In previous versions of the NFS protocol, the filehandle provided by the server was guaranteed to be valid or persistent for the lifetime of the filesystem object to which it referred. For some server implementations, this persistence requirement has been difficult to meet. For the NFS version 4 protocol, this requirement has been relaxed by introducing another type of filehandle, volatile. With persistent and volatile filehandle types, the server implementation can match the abilities of the filesystem at the server along with the operating environment. The client will have knowledge of the type of filehandle being provided by the server and can be prepared to deal with the semantics of each.

1.4.3.2.  Attribute Types

The NFS version 4 protocol introduces three classes of filesystem or file attributes. Like the additional filehandle type, the classification of file attributes has been done to ease server implementations along with extending the overall functionality of the NFS protocol. This attribute model is structured to be extensible such that new attributes can be introduced in minor revisions of the protocol without requiring significant rework.

The three classifications are mandatory, recommended, and named attributes. This is a significant departure from the previous attribute model used in the NFS protocol. Previously, the attributes for the filesystem and file objects were a fixed set of mainly UNIX attributes. If the server or client did not support a particular attribute, it would have to simulate the attribute as best it could.

Mandatory attributes are the minimal set of file or filesystem attributes that must be provided by the server and must be properly represented by the server. Recommended attributes represent different filesystem types and operating environments; they allow for better interoperability and the inclusion of more operating environments. The mandatory and recommended attribute sets are traditional file or filesystem attributes. The third type of attribute is the named attribute. A named attribute is an opaque byte stream that is associated with a directory or file and referred to by a string name. Named attributes are meant to be used by client applications as a method to associate application-specific data with a regular file or directory.

One significant addition to the recommended set of file attributes is the Access Control List (ACL) attribute. This attribute provides for directory and file access control beyond the model used in previous versions of the NFS protocol. The ACL definition allows for specification of user and group level access control.

1.4.3.3.  Filesystem Replication and Migration

With the use of a special file attribute, the ability to migrate or replicate server filesystems is enabled within the protocol. The filesystem locations attribute provides a method for the client to probe the server about the location of a filesystem. In the event of a migration of a filesystem, the client will receive an error when operating on the filesystem, and it can then query the server for the new filesystem location. Similar steps are used for replication: the client is able to query the server for the multiple available locations of a particular filesystem. From this information, the client can use its own policies to access the appropriate filesystem location.
1.4.4.  OPEN and CLOSE

The NFS version 4 protocol introduces OPEN and CLOSE operations. The OPEN operation provides a single point where file lookup, creation, and share semantics can be combined. The CLOSE operation also provides for the release of state accumulated by OPEN.

1.4.5.  File locking

With the NFS version 4 protocol, the support for byte-range file locking is part of the NFS protocol. The file locking support is structured so that an RPC callback mechanism is not required. This is a departure from the previous versions of the NFS file locking protocol, Network Lock Manager (NLM). The state associated with file locks is maintained at the server under a lease-based model. The server defines a single lease period for all state held by an NFS client. If the client does not renew its lease within the defined period, all state associated with the client's lease may be released by the server. The client may renew its lease with use of the RENEW operation or implicitly by use of other operations (primarily READ).

1.4.6.  Client Caching and Delegation

The file, attribute, and directory caching for the NFS version 4 protocol is similar to previous versions. Attributes and directory information are cached for a duration determined by the client. At the end of a predefined timeout, the client will query the server to see if the related filesystem object has been updated.

For file data, the client checks its cache validity when the file is opened. A query is sent to the server to determine if the file has been changed. Based on this information, the client determines if the data cache for the file should be kept or released. Also, when the file is closed, any modified data is written to the server.

If an application wants to serialize access to file data, file locking of the file data ranges in question should be used.

The major addition to NFS version 4 in the area of caching is the ability of the server to delegate certain responsibilities to the client. When the server grants a delegation for a file to a client, the client is guaranteed certain semantics with respect to the sharing of that file with other clients. At OPEN, the server may provide the client either a read or write delegation for the file. If the client is granted a read delegation, it is assured that no other client has the ability to write to the file for the duration of the delegation. If the client is granted a write delegation, the client is assured that no other client has read or write access to the file.
Delegations can be recalled by the server. If another client requests access to the file in such a way that the access conflicts with the granted delegation, the server is able to notify the initial client and recall the delegation. This requires that a callback path exist between the server and client. If this callback path does not exist, then delegations cannot be granted. The essence of a delegation is that it allows the client to locally service operations such as OPEN, CLOSE, LOCK, LOCKU, READ, and WRITE without immediate interaction with the server.

1.5.  General Definitions

The following definitions are provided for the purpose of providing an appropriate context for the reader.

Client    The "client" is the entity that accesses the NFS server's resources. The client may be an application which contains the logic to access the NFS server directly. The client may also be the traditional operating system client that provides remote filesystem services for a set of applications.

          In the case of file locking, the client is the entity that maintains a set of locks on behalf of one or more applications. This client is responsible for crash or failure recovery for those locks it manages.

          Note that multiple clients may share the same transport and multiple clients may exist on the same network node.

Clientid  A 64-bit quantity used as a unique, short-hand reference to a client-supplied Verifier and ID. The server is responsible for supplying the Clientid.

Lease     An interval of time defined by the server for which the client is irrevocably granted a lock. At the end of a lease period the lock may be revoked if the lease has not been extended. The lock must be revoked if a conflicting lock has been granted after the lease interval.

          All leases granted by a server have the same fixed interval. Note that the fixed interval was chosen to alleviate the expense a server would have in maintaining state about variable-length leases across server failures.

Lock      The term "lock" is used to refer to both record (byte-range) locks as well as share reservations unless specifically stated otherwise.
Server    The "server" is the entity responsible for coordinating client access to a set of filesystems.

Stable Storage
          NFS version 4 servers must be able to recover without data loss from multiple power failures (including cascading power failures, that is, several power failures in quick succession), operating system failures, and hardware failure of components other than the storage medium itself (for example, disk, nonvolatile RAM).

          Some examples of stable storage that are allowable for an NFS server include:

          1. Media commit of data, that is, the modified data has been successfully written to the disk media, for example, the disk platter.

          2. An immediate reply disk drive with battery-backed on-drive intermediate storage or uninterruptible power system (UPS).

          3. Server commit of data with battery-backed intermediate storage and recovery software.

          4. Cache commit with uninterruptible power system (UPS) and recovery software.

Stateid   A 128-bit quantity returned by a server that uniquely defines the open and locking state provided by the server for a specific open or lock owner for a specific file.

          Stateids composed of all bits 0 or all bits 1 have special meaning and are reserved values.

Verifier  A 64-bit quantity generated by the client that the server can use to determine if the client has restarted and lost all previous lock state.

2.  Protocol Data Types

The syntax and semantics to describe the data types of the NFS version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] documents. The next sections build upon the XDR data types to define types and structures specific to this protocol.

2.1.  Basic Data Types

   Data Type       Definition
   ____________________________________________________________________
   int32_t         typedef int             int32_t;

   uint32_t        typedef unsigned int    uint32_t;

   int64_t         typedef hyper           int64_t;

   uint64_t        typedef unsigned hyper  uint64_t;

   attrlist4       typedef opaque          attrlist4<>;
                   Used for file/directory attributes

   bitmap4         typedef uint32_t        bitmap4<>;
                   Used in attribute array encoding

   changeid4       typedef uint64_t        changeid4;
                   Used in definition of change_info

   clientid4       typedef uint64_t        clientid4;
                   Shorthand reference to client identification

   component4      typedef utf8str_cs      component4;
                   Represents path name components

   count4          typedef uint32_t        count4;
                   Various count parameters (READ, WRITE, COMMIT)

   length4         typedef uint64_t        length4;
                   Describes LOCK lengths
   linktext4       typedef utf8str_cs      linktext4;
                   Symbolic link contents

   mode4           typedef uint32_t        mode4;
                   Mode attribute data type

   nfs_cookie4     typedef uint64_t        nfs_cookie4;
                   Opaque cookie value for READDIR

   nfs_fh4         typedef opaque          nfs_fh4<NFS4_FHSIZE>;
                   Filehandle definition; NFS4_FHSIZE is defined as 128

   nfs_ftype4      enum nfs_ftype4;
                   Various defined file types

   nfsstat4        enum nfsstat4;
                   Return value for operations

   offset4         typedef uint64_t        offset4;
                   Various offset designations (READ, WRITE, LOCK,
                   COMMIT)

   pathname4       typedef component4      pathname4<>;
                   Represents path name for LOOKUP, OPEN and others

   qop4            typedef uint32_t        qop4;
                   Quality of protection designation in SECINFO

   sec_oid4        typedef opaque          sec_oid4<>;
                   Security Object Identifier.  The sec_oid4 data type
                   is not really opaque.  Instead, it contains an ASN.1
                   OBJECT IDENTIFIER as used by GSS-API in the mech_type
                   argument to GSS_Init_sec_context.  See [RFC2743] for
                   details.

   seqid4          typedef uint32_t        seqid4;
                   Sequence identifier used for file locking

   utf8string      typedef opaque          utf8string<>;
                   UTF-8 encoding for strings

   utf8str_cis     typedef opaque          utf8str_cis;
                   Case-insensitive UTF-8 string

   utf8str_cs      typedef opaque          utf8str_cs;
                   Case-sensitive UTF-8 string

   utf8str_mixed   typedef opaque          utf8str_mixed;
                   UTF-8 string with a case-sensitive prefix and a
                   case-insensitive suffix

   verifier4       typedef opaque          verifier4[NFS4_VERIFIER_SIZE];
                   Verifier used for various operations (COMMIT, CREATE,
                   OPEN, READDIR, SETCLIENTID, SETCLIENTID_CONFIRM,
                   WRITE); NFS4_VERIFIER_SIZE is defined as 8

2.2.  Structured Data Types

nfstime4

                  struct nfstime4 {
                          int64_t         seconds;
                          uint32_t        nseconds;
                  };

The nfstime4 structure gives the number of seconds and nanoseconds since midnight or 0 hour January 1, 1970 Coordinated Universal Time (UTC). Values greater than zero for the seconds field denote dates after the 0 hour January 1, 1970. Values less than zero for the seconds field denote dates before the 0 hour January 1, 1970. In both cases, the nseconds field is to be added to the seconds field for the final time representation. For example, if the time to be represented is one-half second before 0 hour January 1, 1970, the seconds field would have a value of negative one (-1) and the nseconds field would have a value of one-half second (500000000). Values greater than 999,999,999 for nseconds are considered invalid.

This data type is used to pass time and date information. A server converts to and from its local representation of time when processing time values, preserving as much accuracy as possible. If the precision of timestamps stored for a filesystem object is less than defined, loss of precision can occur. An adjunct time maintenance protocol is recommended to reduce client and server time skew.
+   Values greater than 999,999,999 for nseconds are considered invalid. + +   This data type is used to pass time and date information.  A server +   converts to and from its local representation of time when processing +   time values, preserving as much accuracy as possible.  If the +   precision of timestamps stored for a filesystem object is less than +   defined, loss of precision can occur.  An adjunct time maintenance +   protocol is recommended to reduce client and server time skew. + +   time_how4 + +                  enum time_how4 { +                          SET_TO_SERVER_TIME4 = 0, +                          SET_TO_CLIENT_TIME4 = 1 +                  }; + + + + + + + + +Shepler, et al.             Standards Track                    [Page 18] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   settime4 + +                  union settime4 switch (time_how4 set_it) { +                   case SET_TO_CLIENT_TIME4: +                           nfstime4       time; +                   default: +                           void; +                  }; + +   The above definitions are used as the attribute definitions to set +   time values.  If set_it is SET_TO_SERVER_TIME4, then the server uses +   its local representation of time for the time value. + +   specdata4 + +                  struct specdata4 { +                          uint32_t specdata1; /* major device number */ +                          uint32_t specdata2; /* minor device number */ +                  }; + +   This data type represents additional information for the device file +   types NF4CHR and NF4BLK. + +   fsid4 + +                  struct fsid4 { +                    uint64_t        major; +                    uint64_t        minor; +                  }; + +   This type is the filesystem identifier that is used as a mandatory +   attribute. + +   fs_location4 + +                  struct fs_location4 { +                          utf8str_cis    server<>; +                          pathname4     rootpath; +                  }; + + +   fs_locations4 + +                  struct fs_locations4 { +                          pathname4     fs_root; +                          fs_location4  locations<>; +                  }; + + + + +Shepler, et al.             Standards Track                    [Page 19] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   The fs_location4 and fs_locations4 data types are used for the +   fs_locations recommended attribute which is used for migration and +   replication support. + +   fattr4 + +                  struct fattr4 { +                          bitmap4       attrmask; +                          attrlist4     attr_vals; +                  }; + +   The fattr4 structure is used to represent file and directory +   attributes. + +   The bitmap is a counted array of 32 bit integers used to contain bit +   values.  The position of the integer in the array that contains bit n +   can be computed from the expression (n / 32) and its bit within that +   integer is (n mod 32). + +                           0            1 +         +-----------+-----------+-----------+-- +         |  count    | 31  ..  0 | 63  .. 
change_info4

                  struct change_info4 {
                          bool          atomic;
                          changeid4     before;
                          changeid4     after;
                  };

This structure is used with the CREATE, LINK, REMOVE, and RENAME operations to let the client know the value of the change attribute for the directory in which the target filesystem object resides.

clientaddr4

                  struct clientaddr4 {
                          /* see struct rpcb in RFC 1833 */
                          string r_netid<>;    /* network id */
                          string r_addr<>;     /* universal address */
                  };

The clientaddr4 structure is used as part of the SETCLIENTID operation, either to specify the address of the client that is using a clientid or as part of the callback registration. The r_netid and r_addr fields are specified in [RFC1833], but they are underspecified in [RFC1833] as far as what they should look like for specific protocols.

For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the US-ASCII string:

      h1.h2.h3.h4.p1.p2

The prefix, "h1.h2.h3.h4", is the standard textual form for representing an IPv4 address, which is always four octets long. Assuming big-endian ordering, h1, h2, h3, and h4 are, respectively, the first through fourth octets, each converted to ASCII-decimal. Assuming big-endian ordering, p1 and p2 are, respectively, the first and second octets of the port, each converted to ASCII-decimal. For example, if a host, in big-endian order, has an address of 0x0A010307 and there is a service listening on, in big-endian order, port 0x020F (decimal 527), then the complete universal address is "10.1.3.7.2.15".

For TCP over IPv4 the value of r_netid is the string "tcp". For UDP over IPv4 the value of r_netid is the string "udp".

For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the US-ASCII string:

      x1:x2:x3:x4:x5:x6:x7:x8.p1.p2

The suffix "p1.p2" is the service port, and is computed the same way as with universal addresses for TCP and UDP over IPv4. The prefix, "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for representing an IPv6 address as defined in Section 2.2 of [RFC2373]. Additionally, the two alternative forms specified in Section 2.2 of [RFC2373] are also acceptable.

For TCP over IPv6 the value of r_netid is the string "tcp6". For UDP over IPv6 the value of r_netid is the string "udp6".
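A sketch of the IPv4 universal-address rule above (the formatter name and signature are hypothetical): four address octets and two port octets, each rendered in ASCII decimal and joined with dots.

      /* uaddr.c -- hypothetical formatter for an IPv4 universal address. */
      #include <stdint.h>
      #include <stdio.h>

      /* addr and port are in host order; out must hold at least 24 bytes
       * ("255.255.255.255.255.255" plus NUL). */
      void format_uaddr_ipv4(uint32_t addr, uint16_t port, char out[32])
      {
          snprintf(out, 32, "%u.%u.%u.%u.%u.%u",
                   (addr >> 24) & 0xff, (addr >> 16) & 0xff,  /* h1.h2 */
                   (addr >> 8) & 0xff,   addr & 0xff,         /* h3.h4 */
                   (port >> 8) & 0xff,   port & 0xff);        /* p1.p2 */
      }

Called as format_uaddr_ipv4(0x0A010307, 0x020F, buf), this yields "10.1.3.7.2.15", matching the worked example in the text.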
cb_client4

                  struct cb_client4 {
                          unsigned int  cb_program;
                          clientaddr4   cb_location;
                  };

This structure is used by the client to inform the server of its callback address; it includes the program number and client address.

nfs_client_id4

                  struct nfs_client_id4 {
                          verifier4     verifier;
                          opaque        id<NFS4_OPAQUE_LIMIT>;
                  };

This structure is part of the arguments to the SETCLIENTID operation. NFS4_OPAQUE_LIMIT is defined as 1024.

open_owner4

                  struct open_owner4 {
                          clientid4     clientid;
                          opaque        owner<NFS4_OPAQUE_LIMIT>;
                  };

This structure is used to identify the owner of open state. NFS4_OPAQUE_LIMIT is defined as 1024.

lock_owner4

                  struct lock_owner4 {
                          clientid4     clientid;
                          opaque        owner<NFS4_OPAQUE_LIMIT>;
                  };

This structure is used to identify the owner of file locking state. NFS4_OPAQUE_LIMIT is defined as 1024.

open_to_lock_owner4

                  struct open_to_lock_owner4 {
                          seqid4          open_seqid;
                          stateid4        open_stateid;
                          seqid4          lock_seqid;
                          lock_owner4     lock_owner;
                  };

This structure is used for the first LOCK operation done for an open_owner4. It provides both the open_stateid and lock_owner such that the transition is made from a valid open_stateid sequence to that of the new lock_stateid sequence. Using this mechanism avoids the confirmation of the lock_owner/lock_seqid pair since it is tied to established state in the form of the open_stateid/open_seqid.

stateid4

                  struct stateid4 {
                          uint32_t        seqid;
                          opaque          other[12];
                  };

This structure is used for the various state sharing mechanisms between the client and server. For the client, this data structure is read-only. The starting value of the seqid field is undefined. The server is required to increment the seqid field monotonically at each transition of the stateid. This is important since the client will inspect the seqid in OPEN stateids to determine the order of OPEN processing done by the server.
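A minimal sketch (hypothetical helper, not from the specification) of how a client might use the monotonically increasing seqid to order two OPEN stateids; it is only meaningful when both stateids refer to the same state, and it ignores seqid wraparound for brevity:

      /* stateid_order.c -- hypothetical seqid comparison for OPEN stateids. */
      #include <stdint.h>
      #include <string.h>

      struct stateid4 {
          uint32_t seqid;
          uint8_t  other[12];   /* opaque to the client */
      };

      /* Returns nonzero if b reflects later OPEN processing than a.
       * Valid only for stateids with matching "other" fields; seqid
       * wraparound is not handled here. */
      int stateid_newer(const struct stateid4 *a, const struct stateid4 *b)
      {
          return memcmp(a->other, b->other, 12) == 0 && b->seqid > a->seqid;
      }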
   Where an NFS version 4 implementation supports operation over the
   IP network protocol, the supported transports between NFS and IP
   MUST be among the IETF-approved congestion control transport
   protocols, which include TCP and SCTP.  To enhance the
   possibilities for interoperability, an NFS version 4 implementation
   MUST support operation over the TCP transport protocol, at least
   until such time as a standards track RFC revises this requirement
   to use a different IETF-approved congestion control transport
   protocol.

   If TCP is used as the transport, the client and server SHOULD use
   persistent connections.  This will prevent the weakening of TCP's
   congestion control via short lived connections and will improve
   performance for the WAN environment by eliminating the need for
   SYN handshakes.

   As noted in the Security Considerations section, the authentication
   model for NFS version 4 has moved from machine-based to principal-
   based.  However, this modification of the authentication model does
   not imply a technical requirement to move the TCP connection
   management model from whole machine-based to one based on a per
   user model.  In particular, NFS over TCP client implementations
   have traditionally multiplexed traffic for multiple users over a
   common TCP connection between an NFS client and server.  This has
   been true, regardless of whether the NFS client is using AUTH_SYS,
   AUTH_DH, RPCSEC_GSS or any other flavor.  Similarly, NFS over TCP
   server implementations have assumed such a model and thus scale the
   implementation of TCP connection management in proportion to the
   number of expected client machines.  It is intended that NFS
   version 4 will not modify this connection management model.  NFS
   version 4 clients that violate this assumption can expect scaling
   issues on the server and hence reduced service.

   Note that for various timers, the client and server should avoid
   inadvertent synchronization of those timers.  For further
   discussion of the general issue refer to [Floyd].

3.1.1.  Client Retransmission Behavior

   When processing a request received over a reliable transport such
   as TCP, the NFS version 4 server MUST NOT silently drop the
   request, except if the transport connection has been broken.  Given
   such a contract between NFS version 4 clients and servers, clients
   MUST NOT retry a request unless one or both of the following are
   true:

   o  The transport connection has been broken

   o  The procedure being retried is the NULL procedure

   Since reliable transports, such as TCP, do not always synchronously
   inform a peer when the other peer has broken the connection (for
   example, when an NFS server reboots), the NFS version 4 client may
   want to actively "probe" the connection to see if it has been
   broken.  Use of the NULL procedure is one recommended way to do so.
   So, when a client experiences a remote procedure call timeout (of
   some arbitrary implementation-specific amount), rather than
   retrying the remote procedure call, it could instead issue a NULL
   procedure call to the server.  If the server has died, the
   transport connection break will eventually be indicated to the NFS
   version 4 client.  The client can then reconnect, and then retry
   the original request.  If the NULL procedure call gets a response,
   the connection has not broken.  The client can decide to wait
   longer for the original request's response, or it can break the
   transport connection and reconnect before re-sending the original
   request.
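   The following C sketch makes the above decision sequence concrete.
   It is non-normative; the rpc_* helpers, the enum, and the 30 second
   timeout are hypothetical stand-ins for an implementation's RPC
   transport layer.

      /* Hypothetical transport helpers; any real RPC layer differs.
       * Declared only so the sketch is self-contained. */
      enum rpc_status { RPC_OK, RPC_TIMEOUT, RPC_CONN_BROKEN };

      extern enum rpc_status rpc_call(void *conn, void *req, int secs);
      extern enum rpc_status rpc_null(void *conn, int secs);
      extern void rpc_disconnect(void *conn);
      extern void rpc_reconnect(void *conn);

      static enum rpc_status call_with_probe(void *conn, void *req)
      {
              for (;;) {
                      switch (rpc_call(conn, req, 30)) {
                      case RPC_OK:
                              return RPC_OK;
                      case RPC_CONN_BROKEN:
                              /* A broken connection is the one case
                               * in which a retry is permitted. */
                              rpc_reconnect(conn);
                              continue;
                      case RPC_TIMEOUT:
                              /* Do not retry the request; probe with
                               * the NULL procedure instead. */
                              if (rpc_null(conn, 30) != RPC_CONN_BROKEN) {
                                      /* Server is alive.  This sketch
                                       * takes the option of breaking
                                       * the connection itself before
                                       * re-sending. */
                                      rpc_disconnect(conn);
                              }
                              rpc_reconnect(conn);
                              continue;
                      }
              }
      }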
   For callbacks from the server to the client, the same rules apply,
   but the server doing the callback becomes the client, and the
   client receiving the callback becomes the server.

3.2.  Security Flavors

   Traditional RPC implementations have included AUTH_NONE, AUTH_SYS,
   AUTH_DH, and AUTH_KRB4 as security flavors.  With [RFC2203] an
   additional security flavor of RPCSEC_GSS has been introduced which
   uses the functionality of GSS-API [RFC2743].  This allows for the
   use of various security mechanisms by the RPC layer without the
   additional implementation overhead of adding RPC security flavors.
   For NFS version 4, the RPCSEC_GSS security flavor MUST be used to
   enable the mandatory security mechanism.  Other flavors, such as
   AUTH_NONE, AUTH_SYS, and AUTH_DH, MAY be implemented as well.

3.2.1.  Security mechanisms for NFS version 4

   The use of RPCSEC_GSS requires selection of: mechanism, quality of
   protection, and service (authentication, integrity, privacy).  The
   remainder of this document will refer to these three parameters of
   the RPCSEC_GSS security as the security triple.

3.2.1.1.  Kerberos V5 as a security triple

   The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be
   implemented and provide the following security triples.

   column descriptions:

   1 == number of pseudo flavor
   2 == name of pseudo flavor
   3 == mechanism's OID
   4 == mechanism's algorithm(s)
   5 == RPCSEC_GSS service

   1      2     3                    4             5
   --------------------------------------------------------------------
   390003 krb5  1.2.840.113554.1.2.2 DES MAC MD5   rpc_gss_svc_none
   390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5   rpc_gss_svc_integrity
   390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5   rpc_gss_svc_privacy
                                     for integrity;
                                     56 bit DES
                                     for privacy

   The continuation lines in column 4 belong to the krb5p triple: it
   uses DES MAC MD5 for integrity and 56 bit DES for privacy.

   Note that the pseudo flavor is presented here as a mapping aid to
   the implementor.  Because this NFS protocol includes a method to
   negotiate security and it understands the GSS-API mechanism, the
   pseudo flavor is not needed.  The pseudo flavor is needed for NFS
   version 3 since the security negotiation is done via the MOUNT
   protocol.

   For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please
   see [RFC2623].

   Users and implementors are warned that 56 bit DES is no longer
   considered state of the art in terms of resistance to brute force
   attacks.  Once a revision to [RFC1964] is available that adds
   support for AES, implementors are urged to incorporate AES into
   their NFSv4 over Kerberos V5 protocol stacks, and users are
   similarly urged to migrate to the use of AES.
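   As a mapping aid of the same kind, an implementation might carry
   the triples above in a table such as the following C sketch.  The
   struct and its field names are illustrative; nothing of this form
   appears on the wire.

      struct sec_triple {
              unsigned int pseudo_flavor;
              const char  *name;
              const char  *mech_oid;
              const char  *service;
      };

      static const struct sec_triple krb5_triples[] = {
              { 390003, "krb5",  "1.2.840.113554.1.2.2",
                "rpc_gss_svc_none" },
              { 390004, "krb5i", "1.2.840.113554.1.2.2",
                "rpc_gss_svc_integrity" },
              { 390005, "krb5p", "1.2.840.113554.1.2.2",
                "rpc_gss_svc_privacy" },
      };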
3.2.1.2.  LIPKEY as a security triple

   The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be
   implemented and provide the following security triples.  The
   definition of the columns matches the previous subsection "Kerberos
   V5 as a security triple".

   1      2        3                   4              5
   --------------------------------------------------------------------
   390006 lipkey   1.3.6.1.5.5.9       negotiated  rpc_gss_svc_none
   390007 lipkey-i 1.3.6.1.5.5.9       negotiated  rpc_gss_svc_integrity
   390008 lipkey-p 1.3.6.1.5.5.9       negotiated  rpc_gss_svc_privacy

   The mechanism algorithm is listed as "negotiated".  This is because
   LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the
   confidentiality and integrity algorithms are negotiated.  Since
   SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, and 128 bit
   cast5CBC for confidentiality (the privacy service) as MANDATORY,
   and further specifies that HMAC-MD5 and cast5CBC MUST be listed
   first before weaker algorithms, specifying "negotiated" in column 4
   does not impair interoperability.  In the event an SPKM-3 peer does
   not support the mandatory algorithms, the other peer is free to
   accept or reject the GSS-API context creation.

   Because SPKM-3 negotiates the algorithms, subsequent calls to
   LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a
   quality of protection value of 0 (zero).  See section 5.2 of
   [RFC2025] for an explanation.

   LIPKEY uses SPKM-3 to create a secure channel in which to pass a
   user name and password from the client to the server.  Once the
   user name and password have been accepted by the server, calls to
   the LIPKEY context are redirected to the SPKM-3 context.  See
   [RFC2847] for more details.

3.2.1.3.  SPKM-3 as a security triple

   The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be
   implemented and provide the following security triples.  The
   definition of the columns matches the previous subsection "Kerberos
   V5 as a security triple".

   1      2        3                   4              5
   --------------------------------------------------------------------
   390009 spkm3    1.3.6.1.5.5.1.3     negotiated  rpc_gss_svc_none
   390010 spkm3i   1.3.6.1.5.5.1.3     negotiated  rpc_gss_svc_integrity
   390011 spkm3p   1.3.6.1.5.5.1.3     negotiated  rpc_gss_svc_privacy

   For a discussion as to why the mechanism algorithm is listed as
   "negotiated", see the previous section "LIPKEY as a security
   triple."

   Because SPKM-3 negotiates the algorithms, subsequent calls to
   SPKM-3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a
   quality of protection value of 0 (zero).  See section 5.2 of
   [RFC2025] for an explanation.

   Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a
   mandatory set of triples to handle the situations where the
   initiator (the client) is anonymous or where the initiator has its
   own certificate.  If the initiator is anonymous, there will not be
   a user name and password to send to the target (the server).  If
   the initiator has its own certificate, then using passwords is
   superfluous.
3.3.  Security Negotiation

   With the NFS version 4 server potentially offering multiple
   security mechanisms, the client needs a method to determine or
   negotiate which mechanism is to be used for its communication with
   the server.  The NFS server may have multiple points within its
   filesystem name space that are available for use by NFS clients.
   In turn the NFS server may be configured such that each of these
   entry points may have different or multiple security mechanisms in
   use.

   The security negotiation between client and server must be done
   with a secure channel to eliminate the possibility of a third party
   intercepting the negotiation sequence and forcing the client and
   server to choose a lower level of security than required or
   desired.  See the section "Security Considerations" for further
   discussion.

3.3.1.  SECINFO

   The new SECINFO operation will allow the client to determine, on a
   per filehandle basis, what security triple is to be used for server
   access.  In general, the client will not have to use the SECINFO
   operation except during initial communication with the server or
   when the client crosses policy boundaries at the server.  It is
   possible that the server's policies change during the client's
   interaction, thereby forcing the client to negotiate a new security
   triple.

3.3.2.  Security Error

   Based on the assumption that each NFS version 4 client and server
   must support a minimum set of security (i.e., LIPKEY, SPKM-3, and
   Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its
   communication with the server with one of the minimal security
   triples.  During communication with the server, the client may
   receive an NFS error of NFS4ERR_WRONGSEC.  This error allows the
   server to notify the client that the security triple currently
   being used is not appropriate for access to the server's filesystem
   resources.  The client is then responsible for determining what
   security triples are available at the server and choosing one which
   is appropriate for the client.  See the section for the "SECINFO"
   operation for further discussion of how the client will respond to
   the NFS4ERR_WRONGSEC error and use SECINFO.
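   A non-normative sketch of this recovery sequence follows.  The
   helper functions, the session and operation types, and the policy
   of retrying exactly once are all illustrative assumptions; only the
   NFS4ERR_WRONGSEC value and the use of SECINFO come from the
   protocol.

      #define NFS4ERR_WRONGSEC 10016

      /* Hypothetical client machinery, declared for completeness. */
      extern int nfs_op(void *sess, void *op);  /* returns NFS4 error */
      extern int secinfo(void *sess, const char *name, void **triples);
      extern int pick_triple(void *triples, void *sess);

      static int do_op_negotiating(void *sess, void *op,
                                   const char *name)
      {
              int err = nfs_op(sess, op);

              if (err == NFS4ERR_WRONGSEC) {
                      void *triples = 0;

                      /* Ask the server which triples it will accept
                       * for this name ... */
                      if (secinfo(sess, name, &triples) != 0)
                              return err;
                      /* ... switch the session to one of them ... */
                      if (pick_triple(triples, sess) != 0)
                              return err;
                      /* ... and re-issue the original operation. */
                      err = nfs_op(sess, op);
              }
              return err;
      }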
3.4.  Callback RPC Authentication

   Except as noted elsewhere in this section, the callback RPC
   (described later) MUST mutually authenticate the NFS server to the
   principal that acquired the clientid (also described later), using
   the security flavor the original SETCLIENTID operation used.

   For AUTH_NONE, there are no principals, so this is a non-issue.

   AUTH_SYS has no notions of mutual authentication or a server
   principal, so the callback from the server simply uses the AUTH_SYS
   credential that the user used when setting up the delegation.

   For AUTH_DH, one commonly used convention is that the server uses
   the credential corresponding to this AUTH_DH principal:

         unix.host@domain

   where host and domain are variables corresponding to the name of
   the server host and the directory services domain in which it
   lives, such as a Network Information System domain or a DNS domain.

   Because LIPKEY is layered over SPKM-3, it is permissible for the
   server to use SPKM-3 and not LIPKEY for the callback even if the
   client used LIPKEY for SETCLIENTID.

   Regardless of what security mechanism under RPCSEC_GSS is being
   used, the NFS server MUST identify itself in GSS-API via a
   GSS_C_NT_HOSTBASED_SERVICE name type.  GSS_C_NT_HOSTBASED_SERVICE
   names are of the form:

         service@hostname

   For NFS, the "service" element is

         nfs

   Implementations of security mechanisms will convert nfs@hostname
   to various different forms.  For Kerberos V5 and LIPKEY, the
   following form is RECOMMENDED:

         nfs/hostname

   For Kerberos V5, nfs/hostname would be a server principal in the
   Kerberos Key Distribution Center database.  This is the same
   principal the client acquired a GSS-API context for when it issued
   the SETCLIENTID operation; therefore, the realm name for the server
   principal must be the same for the callback as it was for the
   SETCLIENTID.

   For LIPKEY, this would be the username passed to the target (the
   NFS version 4 client that receives the callback).

   It should be noted that LIPKEY may not work for callbacks, since
   the LIPKEY client uses a user id/password.  If the NFS client
   receiving the callback can authenticate the NFS server's user
   name/password pair, and if the user that the NFS server is
   authenticating to has a public key certificate, then it works.

   In situations where the NFS client uses LIPKEY and uses a per-host
   principal for the SETCLIENTID operation, instead of using LIPKEY
   for SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual
   authentication be used.  This effectively means that the client
   will use a certificate to authenticate and identify the initiator
   to the target on the NFS server.  Using SPKM-3 and not LIPKEY has
   the following advantages:

   o  When the server does a callback, it must authenticate to the
      principal used in the SETCLIENTID.  Even if LIPKEY is used,
      because LIPKEY is layered over SPKM-3, the NFS client will need
      to have a certificate that corresponds to the principal used in
      the SETCLIENTID operation.  From an administrative perspective,
      having a user name, password, and certificate for both the
      client and server is redundant.

   o  LIPKEY was intended to minimize additional infrastructure
      requirements beyond a certificate for the target, and the
      expectation is that existing password infrastructure can be
      leveraged for the initiator.  In some environments, a per-host
      password does not exist yet.  If certificates are used for any
      per-host principals, then additional password infrastructure is
      not needed.

   o  In cases when a host is both an NFS client and server, it can
      share the same per-host certificate.
4.  Filehandles

   The filehandle in the NFS protocol is a per server unique
   identifier for a filesystem object.  The contents of the filehandle
   are opaque to the client.  Therefore, the server is responsible for
   translating the filehandle to an internal representation of the
   filesystem object.

4.1.  Obtaining the First Filehandle

   The operations of the NFS protocol are defined in terms of one or
   more filehandles.  Therefore, the client needs a filehandle to
   initiate communication with the server.  With the NFS version 2
   protocol [RFC1094] and the NFS version 3 protocol [RFC1813], there
   exists an ancillary protocol to obtain this first filehandle.  The
   MOUNT protocol, RPC program number 100005, provides the mechanism
   for translating a string based filesystem path name to a filehandle
   which can then be used by the NFS protocols.

   The MOUNT protocol has deficiencies in the area of security and use
   via firewalls.  This is one reason that the use of the public
   filehandle was introduced in [RFC2054] and [RFC2055].  With the use
   of the public filehandle in combination with the LOOKUP operation
   in the NFS version 2 and 3 protocols, it has been demonstrated that
   the MOUNT protocol is unnecessary for viable interaction between
   NFS client and server.

   Therefore, the NFS version 4 protocol will not use an ancillary
   protocol for translation from string based path names to a
   filehandle.  Two special filehandles will be used as starting
   points for the NFS client.

4.1.1.  Root Filehandle

   The first of the special filehandles is the ROOT filehandle.  The
   ROOT filehandle is the "conceptual" root of the filesystem name
   space at the NFS server.  The client uses or starts with the ROOT
   filehandle by employing the PUTROOTFH operation.  The PUTROOTFH
   operation instructs the server to set the "current" filehandle to
   the ROOT of the server's file tree.  Once this PUTROOTFH operation
   is used, the client can then traverse the entirety of the server's
   file tree with the LOOKUP operation.  A complete discussion of the
   server name space is in the section "NFS Server Name Space".
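   A non-normative C sketch of such a traversal follows.  The
   compound-building helpers and the path components are illustrative
   assumptions; the operation sequence (PUTROOTFH, LOOKUP, GETFH) is
   the protocol's.

      /* Hypothetical helpers for assembling a COMPOUND request. */
      extern void compound_begin(void *c);
      extern void op_putrootfh(void *c);
      extern void op_lookup(void *c, const char *component);
      extern void op_getfh(void *c);
      extern int  compound_send(void *c, void *session);

      /* Obtain a first filehandle for "/export/home" starting from
       * the ROOT filehandle; no MOUNT protocol is involved. */
      static int lookup_from_root(void *c, void *session)
      {
              compound_begin(c);
              op_putrootfh(c);          /* current FH := ROOT */
              op_lookup(c, "export");   /* one component per LOOKUP */
              op_lookup(c, "home");
              op_getfh(c);              /* return the resulting FH */
              return compound_send(c, session);
      }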
4.1.2.  Public Filehandle

   The second special filehandle is the PUBLIC filehandle.  Unlike the
   ROOT filehandle, the PUBLIC filehandle may be bound to, or
   represent, an arbitrary filesystem object at the server.  The
   server is responsible for this binding.  It may be that the PUBLIC
   filehandle and the ROOT filehandle refer to the same filesystem
   object.  However, it is up to the administrative software at the
   server and the policies of the server administrator to define the
   binding of the PUBLIC filehandle and server filesystem object.  The
   client may not make any assumptions about this binding.  The client
   uses the PUBLIC filehandle via the PUTPUBFH operation.

4.2.  Filehandle Types

   In the NFS version 2 and 3 protocols, there was one type of
   filehandle with a single set of semantics.  This type of filehandle
   is termed "persistent" in NFS Version 4.  The semantics of a
   persistent filehandle remain the same as before.  A new type of
   filehandle introduced in NFS Version 4 is the "volatile"
   filehandle, which attempts to accommodate certain server
   environments.

   The volatile filehandle type was introduced to address server
   functionality or implementation issues which make correct
   implementation of a persistent filehandle infeasible.  Some server
   environments do not provide a filesystem level invariant that can
   be used to construct a persistent filehandle.  The underlying
   server filesystem may not provide the invariant, or the server's
   filesystem programming interfaces may not provide access to the
   needed invariant.  Volatile filehandles may ease the implementation
   of server functionality such as hierarchical storage management or
   filesystem reorganization or migration.  However, the volatile
   filehandle increases the implementation burden for the client.

   Since the client will need to handle persistent and volatile
   filehandles differently, a file attribute is defined which may be
   used by the client to determine the filehandle types being returned
   by the server.

4.2.1.  General Properties of a Filehandle

   The filehandle contains all the information the server needs to
   distinguish an individual file.  To the client, the filehandle is
   opaque.  The client stores filehandles for use in a later request
   and can compare two filehandles from the same server for equality
   by doing a byte-by-byte comparison.  However, the client MUST NOT
   otherwise interpret the contents of filehandles.  If two
   filehandles from the same server are equal, they MUST refer to the
   same file.  Servers SHOULD try to maintain a one-to-one
   correspondence between filehandles and files but this is not
   required.  Clients MUST use filehandle comparisons only to improve
   performance, not for correct behavior.  All clients need to be
   prepared for situations in which it cannot be determined whether
   two filehandles denote the same object and, in such cases, avoid
   making invalid assumptions which might cause incorrect behavior.
   Further discussion of filehandle and attribute comparison in the
   context of data caching is presented in the section "Data Caching
   and File Identity".

   As an example, in the case that two different path names when
   traversed at the server terminate at the same filesystem object,
   the server SHOULD return the same filehandle for each path.  This
   can occur if a hard link is used to create two file names which
   refer to the same underlying file object and associated data.  For
   example, if paths /a/b/c and /a/d/c refer to the same file, the
   server SHOULD return the same filehandle for both path name
   traversals.
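   In C, the permitted comparison amounts to no more than the
   following non-normative sketch; the struct is a simplified stand-in
   for the XDR nfs_fh4 type (opaque data of up to NFS4_FHSIZE == 128
   octets).

      #include <string.h>

      struct nfs_fh4 {
              unsigned int  len;
              unsigned char data[128];
      };

      /* Equal handles from the same server MUST denote the same
       * object, but the client may use this result only to improve
       * performance, never for correctness. */
      static int fh_equal(const struct nfs_fh4 *a,
                          const struct nfs_fh4 *b)
      {
              return a->len == b->len &&
                     memcmp(a->data, b->data, a->len) == 0;
      }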
4.2.2.  Persistent Filehandle

   A persistent filehandle is defined as having a fixed value for the
   lifetime of the filesystem object to which it refers.  Once the
   server creates the filehandle for a filesystem object, the server
   MUST accept the same filehandle for the object for the lifetime of
   the object.  If the server restarts or reboots, the NFS server must
   honor the same filehandle value as it did in the server's previous
   instantiation.  Similarly, if the filesystem is migrated, the new
   NFS server must honor the same filehandle as the old NFS server.

   The persistent filehandle will become stale or invalid when the
   filesystem object is removed.  When the server is presented with a
   persistent filehandle that refers to a deleted object, it MUST
   return an error of NFS4ERR_STALE.  A filehandle may become stale
   when the filesystem containing the object is no longer available.
   The filesystem may become unavailable if it exists on removable
   media and the media is no longer available at the server, or the
   filesystem in whole has been destroyed, or the filesystem has
   simply been removed from the server's name space (i.e., unmounted
   in a UNIX environment).

4.2.3.  Volatile Filehandle

   A volatile filehandle does not share the same longevity
   characteristics of a persistent filehandle.  The server may
   determine that a volatile filehandle is no longer valid at many
   different points in time.  If the server can definitively determine
   that a volatile filehandle refers to an object that has been
   removed, the server should return NFS4ERR_STALE to the client (as
   is the case for persistent filehandles).  In all other cases where
   the server determines that a volatile filehandle can no longer be
   used, it should return an error of NFS4ERR_FHEXPIRED.

   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular filesystem.  This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT
             The value of FH4_PERSISTENT is used to indicate a
             persistent filehandle, which is valid until the object is
             removed from the filesystem.  The server will not return
             NFS4ERR_FHEXPIRED for this filehandle.  FH4_PERSISTENT is
             defined as a value in which none of the bits specified
             below are set.

   FH4_VOLATILE_ANY
             The filehandle may expire at any time, except as
             specifically excluded (i.e., FH4_NOEXPIRE_WITH_OPEN).

   FH4_NOEXPIRE_WITH_OPEN
             May only be set when FH4_VOLATILE_ANY is set.  If this
             bit is set, then the meaning of FH4_VOLATILE_ANY is
             qualified to exclude any expiration of the filehandle
             when it is open.

   FH4_VOL_MIGRATION
             The filehandle will expire as a result of migration.  If
             FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is redundant.

   FH4_VOL_RENAME
             The filehandle will expire during rename.  This includes
             a rename by the requesting client or a rename by any
             other client.  If FH4_VOLATILE_ANY is set, FH4_VOL_RENAME
             is redundant.

   Servers which provide volatile filehandles that may expire while
   open (i.e., if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set or if
   FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set)
   should deny a RENAME or REMOVE that would affect an OPEN file of
   any of the components leading to the OPEN file.  In addition, the
   server should deny all RENAME or REMOVE requests during the grace
   period upon server restart.

   Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow the
   client to determine that expiration has occurred whenever a
   specific event occurs, without an explicit filehandle expiration
   error from the server.  FH4_VOLATILE_ANY does not provide this form
   of information.

   In situations where the server will expire many, but not all
   filehandles upon migration (e.g., all but those that are open),
   FH4_VOLATILE_ANY (in this case with FH4_NOEXPIRE_WITH_OPEN) is a
   better choice since the client may not assume that all filehandles
   will expire when migration occurs, and it is likely that additional
   expirations will occur (as a result of file CLOSE) that are
   separated in time from the migration event itself.
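   The bit values below are those of the protocol's XDR definition;
   the helper function is a non-normative sketch of the test a client
   might apply when deciding whether open files need expiration
   handling.

      #define FH4_PERSISTENT         0x00000000
      #define FH4_NOEXPIRE_WITH_OPEN 0x00000001
      #define FH4_VOLATILE_ANY       0x00000002
      #define FH4_VOL_MIGRATION      0x00000004
      #define FH4_VOL_RENAME         0x00000008

      /* May a filehandle on this filesystem expire while open?
       * (With no bits set, i.e., FH4_PERSISTENT, the answer is no.) */
      static int fh_may_expire_while_open(unsigned int fh_expire_type)
      {
              if ((fh_expire_type & FH4_VOLATILE_ANY) &&
                  !(fh_expire_type & FH4_NOEXPIRE_WITH_OPEN))
                      return 1;
              return (fh_expire_type &
                      (FH4_VOL_MIGRATION | FH4_VOL_RENAME)) != 0;
      }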
4.2.4.  One Method of Constructing a Volatile Filehandle

   A volatile filehandle, while opaque to the client, could contain:

   [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table
      entry/slot

   When the client presents a volatile filehandle, the server makes
   the following checks, which assume that the check for the volatile
   bit has passed.  If the server boot time is less than the current
   server boot time, return NFS4ERR_FHEXPIRED.  If slot is out of
   range, return NFS4ERR_BADHANDLE.  If the generation number does not
   match, return NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If the volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.
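   A non-normative sketch of these checks in C follows.  The field
   widths, the table size, and the extern variables are illustrative
   assumptions; the error values are the protocol's.

      #include <stdint.h>

      #define NFS4ERR_BADHANDLE 10001
      #define NFS4ERR_FHEXPIRED 10014
      #define FH_SLOTS          4096   /* illustrative table size */

      struct vfh {                     /* decoded filehandle body */
              uint8_t  volatile_bit;   /* assumed already checked */
              uint32_t boot_time;
              uint32_t slot;
              uint32_t generation;
      };

      extern uint32_t server_boot_time;
      extern uint32_t slot_generation[FH_SLOTS];

      static int check_volatile_fh(const struct vfh *fh)
      {
              if (fh->boot_time < server_boot_time)
                      return NFS4ERR_FHEXPIRED; /* table lost at boot */
              if (fh->slot >= FH_SLOTS)
                      return NFS4ERR_BADHANDLE;
              if (fh->generation != slot_generation[fh->slot])
                      return NFS4ERR_FHEXPIRED;
              return 0;                         /* valid */
      }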
4.3.  Client Recovery from Filehandle Expiration

   If possible, the client SHOULD recover from the receipt of an
   NFS4ERR_FHEXPIRED error.  The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle.  If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to
   store the component names leading up to and including the
   filesystem object in question.  With these names, the client
   should be able to recover by finding a filehandle in the name space
   that is still available or by starting at the root of the server's
   filesystem name space.

   If the expired filehandle refers to an object that has been removed
   from the filesystem, obviously the client will not be able to
   recover from the expired filehandle.

   It is also possible that the expired filehandle refers to a file
   that has been renamed.  If the file was renamed by another client,
   again it is possible that the original client will not be able to
   recover.  However, in the case that the client itself is renaming
   the file and the file is open, it is possible that the client may
   be able to recover.  The client can determine the new path name
   based on the processing of the rename request.  The client can
   then regenerate the new filehandle based on the new path name.  The
   client could also use the compound operation mechanism to construct
   a set of operations like:

           RENAME A B
           LOOKUP B
           GETFH

   Note that the COMPOUND procedure does not provide atomicity.  This
   example only reduces the overhead of recovering from an expired
   filehandle.

5.  File Attributes

   To meet the requirements of extensibility and increased
   interoperability with non-UNIX platforms, attributes must be
   handled in a flexible manner.  The NFS version 3 fattr3 structure
   contains a fixed list of attributes that not all clients and
   servers are able to support or care about.  The fattr3 structure
   cannot be extended as new needs arise and it provides no way to
   indicate non-support.  With the NFS version 4 protocol, the client
   is able to query what attributes the server supports and construct
   requests with only those supported attributes (or a subset
   thereof).

   To this end, attributes are divided into three groups: mandatory,
   recommended, and named.  Both mandatory and recommended attributes
   are supported in the NFS version 4 protocol by a specific and
   well-defined encoding and are identified by number.  They are
   requested by setting a bit in the bit vector sent in the GETATTR
   request; the server response includes a bit vector to list what
   attributes were returned in the response.  New mandatory or
   recommended attributes may be added to the NFS protocol between
   major revisions by publishing a standards-track RFC which allocates
   a new attribute number value and defines the encoding for the
   attribute.  See the section "Minor Versioning" for further
   discussion.
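   In the XDR, the bit vector is a counted array of 32 bit words in
   which attribute number n occupies bit (n % 32) of word (n / 32).
   The following non-normative sketch builds such a request mask; the
   word count and helper names are illustrative.

      #include <stdint.h>

      #define BITMAP4_WORDS 2   /* enough for attributes 0..63 */

      static void attr_set(uint32_t *bm, unsigned int attrnum)
      {
              bm[attrnum / 32] |= (uint32_t)1 << (attrnum % 32);
      }

      static int attr_isset(const uint32_t *bm, unsigned int attrnum)
      {
              return (bm[attrnum / 32] >> (attrnum % 32)) & 1;
      }

      /* Request the mandatory attributes type (1) and change (3). */
      static void build_getattr_mask(uint32_t bm[BITMAP4_WORDS])
      {
              bm[0] = bm[1] = 0;
              attr_set(bm, 1);
              attr_set(bm, 3);
      }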
   Named attributes are accessed by the new OPENATTR operation, which
   accesses a hidden directory of attributes associated with a file
   system object.  OPENATTR takes a filehandle for the object and
   returns the filehandle for the attribute hierarchy.  The filehandle
   for the named attributes is a directory object accessible by LOOKUP
   or READDIR and contains files whose names represent the named
   attributes and whose data bytes are the value of the attribute.
   For example:

      LOOKUP     "foo"       ; look up file
      GETATTR    attrbits
      OPENATTR               ; access foo's named attributes
      LOOKUP     "x11icon"   ; look up specific attribute
      READ       0,4096      ; read stream of bytes

   Named attributes are intended for data needed by applications
   rather than by an NFS client implementation.  NFS implementors are
   strongly encouraged to define their new attributes as recommended
   attributes by bringing them to the IETF standards-track process.

   The set of attributes which are classified as mandatory is
   deliberately small since servers must do whatever it takes to
   support them.  A server should support as many of the recommended
   attributes as possible but by their definition, the server is not
   required to support all of them.  Attributes are deemed mandatory
   if the data is both needed by a large number of clients and is not
   otherwise reasonably computable by the client when support is not
   provided on the server.

   Note that the hidden directory returned by OPENATTR is a
   convenience for protocol processing.  The client should not make
   any assumptions about the server's implementation of named
   attributes and whether the underlying filesystem at the server has
   a named attribute directory or not.  Therefore, operations such as
   SETATTR and GETATTR on the named attribute directory are undefined.

5.1.  Mandatory Attributes

   These MUST be supported by every NFS version 4 client and server in
   order to ensure a minimum level of interoperability.  The server
   must store and return these attributes and the client must be able
   to function with an attribute set limited to these attributes.
   With just the mandatory attributes some client functionality may be
   impaired or limited in some ways.  A client may ask for any of
   these attributes to be returned by setting a bit in the GETATTR
   request and the server must return their value.

5.2.  Recommended Attributes

   These attributes are understood well enough to warrant support in
   the NFS version 4 protocol.  However, they may not be supported on
   all clients and servers.  A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request
   but must handle the case where the server does not return them.  A
   client may ask for the set of attributes the server supports and
   should not request attributes the server does not support.  A
   server should be tolerant of requests for unsupported attributes
   and simply not return them rather than considering the request an
   error.  It is expected that servers will support all attributes
   they comfortably can and only fail to support attributes which are
   difficult to support in their operating environments.  A server
   should provide attributes whenever they don't have to "tell lies"
   to the client.  For example, a file modification time should be
   either an accurate time or should not be supported by the server.
   This will not always be comfortable to clients but the client is
   better positioned to decide whether and how to fabricate or
   construct an attribute or whether to do without the attribute.

5.3.  Named Attributes

   These attributes are not supported by direct encoding in the NFS
   Version 4 protocol but are accessed by string names rather than
   numbers and correspond to an uninterpreted stream of bytes which
   are stored with the filesystem object.  The name space for these
   attributes may be accessed by using the OPENATTR operation.  The
   OPENATTR operation returns a filehandle for a virtual "attribute
   directory" and further perusal of the name space may be done using
   READDIR and LOOKUP operations on this filehandle.  Named attributes
   may then be examined or changed by normal READ and WRITE and CREATE
   operations on the filehandles returned from READDIR and LOOKUP.
   Named attributes may have attributes.

   It is recommended that servers support arbitrary named attributes.
   A client should not depend on the ability to store any named
   attributes in the server's filesystem.  If a server does support
   named attributes, a client which is also able to handle them should
   be able to copy a file's data and meta-data with complete
   transparency from one location to another; this would imply that
   names allowed for regular directory entries are valid for named
   attribute names as well.

   Names of attributes will not be controlled by this document or
   other IETF standards track documents.  See the section "IANA
   Considerations" for further discussion.
5.4.  Classification of Attributes

   Each of the Mandatory and Recommended attributes can be classified
   in one of three categories: per server, per filesystem, or per
   filesystem object.  Note that it is possible that some per
   filesystem attributes may vary within the filesystem.  See the
   "homogeneous" attribute for its definition.  Note that the
   attributes time_access_set and time_modify_set are not listed in
   this section because they are write-only attributes corresponding
   to time_access and time_modify, and are used in a special instance
   of SETATTR.

   o  The per server attribute is:

         lease_time

   o  The per filesystem attributes are:

      supp_attr, fh_expire_type, link_support, symlink_support,
      unique_handles, aclsupport, cansettime, case_insensitive,
      case_preserving, chown_restricted, files_avail, files_free,
      files_total, fs_locations, homogeneous, maxfilesize, maxname,
      maxread, maxwrite, no_trunc, space_avail, space_free,
      space_total, time_delta

   o  The per filesystem object attributes are:

      type, change, size, named_attr, fsid, rdattr_error, filehandle,
      ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks,
      owner, owner_group, rawdev, space_used, system, time_access,
      time_backup, time_create, time_metadata, time_modify,
      mounted_on_fileid

   For quota_avail_hard, quota_avail_soft, and quota_used see their
   definitions below for the appropriate classification.

5.5.  Mandatory Attributes - Definitions

   Name              #    Data Type    Access
   ___________________________________________________________________

   supp_attr         0    bitmap       READ
         The bit vector which would retrieve all mandatory and
         recommended attributes that are supported for this object.
         The scope of this attribute applies to all objects with a
         matching fsid.

   type              1    nfs4_ftype   READ
         The type of the object (file, directory, symlink, etc.)

   fh_expire_type    2    uint32       READ
         Server uses this to specify filehandle expiration behavior to
         the client.  See the section "Filehandles" for additional
         description.
   change            3    uint64       READ
         A value created by the server that the client can use to
         determine if file data, directory contents or attributes of
         the object have been modified.  The server may return the
         object's time_metadata attribute for this attribute's value
         but only if the filesystem object cannot be updated more
         frequently than the resolution of time_metadata.

   size              4    uint64       R/W
         The size of the object in bytes.

   link_support      5    bool         READ
         True, if the object's filesystem supports hard links.

   symlink_support   6    bool         READ
         True, if the object's filesystem supports symbolic links.

   named_attr        7    bool         READ
         True, if this object has named attributes.  In other words,
         this object has a non-empty named attribute directory.

   fsid              8    fsid4        READ
         Unique filesystem identifier for the filesystem holding this
         object.  fsid contains major and minor components each of
         which are uint64.

   unique_handles    9    bool         READ
         True, if two distinct filehandles are guaranteed to refer to
         two different filesystem objects.

   lease_time        10   nfs_lease4   READ
         Duration of leases at server in seconds.

   rdattr_error      11   enum         READ
         Error returned from getattr during readdir.
   filehandle        19   nfs_fh4      READ
         The filehandle of this object (primarily for readdir
         requests).

5.6.  Recommended Attributes - Definitions

   Name                #    Data Type      Access
   ___________________________________________________________________

   ACL                 12   nfsace4<>      R/W
         The access control list for the object.

   aclsupport          13   uint32         READ
         Indicates what types of ACLs are supported on the current
         filesystem.

   archive             14   bool           R/W
         True, if this file has been archived since the time of last
         modification (deprecated in favor of time_backup).

   cansettime          15   bool           READ
         True, if the server is able to change the times for a
         filesystem object as specified in a SETATTR operation.

   case_insensitive    16   bool           READ
         True, if filename comparisons on this filesystem are case
         insensitive.

   case_preserving     17   bool           READ
         True, if filename case on this filesystem is preserved.

   chown_restricted    18   bool           READ
         If TRUE, the server will reject any request to change either
         the owner or the group associated with a file if the caller
         is not a privileged user (for example, "root" in UNIX
         operating environments or, in Windows 2000, the "Take
         Ownership" privilege).
   fileid              20   uint64         READ
         A number uniquely identifying the file within the filesystem.

   files_avail         21   uint64         READ
         File slots available to this user on the filesystem
         containing this object - this should be the smallest relevant
         limit.

   files_free          22   uint64         READ
         Free file slots on the filesystem containing this object -
         this should be the smallest relevant limit.

   files_total         23   uint64         READ
         Total file slots on the filesystem containing this object.

   fs_locations        24   fs_locations   READ
         Locations where this filesystem may be found.  If the server
         returns NFS4ERR_MOVED as an error, this attribute MUST be
         supported.

   hidden              25   bool           R/W
         True, if the file is considered hidden with respect to the
         Windows API.

   homogeneous         26   bool           READ
         True, if this object's filesystem is homogeneous, i.e., the
         per filesystem attributes are the same for all of the
         filesystem's objects.
   maxfilesize         27   uint64         READ
         Maximum supported file size for the filesystem of this
         object.

   maxlink             28   uint32         READ
         Maximum number of links for this object.

   maxname             29   uint32         READ
         Maximum filename size supported for this object.

   maxread             30   uint64         READ
         Maximum read size supported for this object.

   maxwrite            31   uint64         READ
         Maximum write size supported for this object.  This attribute
         SHOULD be supported if the file is writable.  Lack of this
         attribute can lead to the client either wasting bandwidth or
         not receiving the best performance.

   mimetype            32   utf8<>         R/W
         MIME body type/subtype of this object.

   mode                33   mode4          R/W
         UNIX-style mode and permission bits for this object.

   no_trunc            34   bool           READ
         True, if a name longer than name_max is used, an error will
         be returned and the name will not be truncated.

   numlinks            35   uint32         READ
         Number of hard links to this object.

   owner               36   utf8<>         R/W
         The string name of the owner of this object.

   owner_group         37   utf8<>         R/W
         The string name of the group ownership of this object.

   quota_avail_hard    38   uint64         READ
         For definition see "Quota Attributes" section below.
   quota_avail_soft    39   uint64         READ
         For definition see "Quota Attributes" section below.

   quota_used          40   uint64         READ
         For definition see "Quota Attributes" section below.

   rawdev              41   specdata4      READ
         Raw device identifier.  UNIX device major/minor node
         information.  If the value of type is not NF4BLK or NF4CHR,
         the value returned SHOULD NOT be considered useful.

   space_avail         42   uint64         READ
         Disk space in bytes available to this user on the filesystem
         containing this object - this should be the smallest relevant
         limit.

   space_free          43   uint64         READ
         Free disk space in bytes on the filesystem containing this
         object - this should be the smallest relevant limit.

   space_total         44   uint64         READ
         Total disk space in bytes on the filesystem containing this
         object.

   space_used          45   uint64         READ
         Number of filesystem bytes allocated to this object.

   system              46   bool           R/W
         True, if this file is a "system" file with respect to the
         Windows API.

   time_access         47   nfstime4       READ
         The time of last access to the object by a read that was
         satisfied by the server.
+
+   time_access_set     48   settime4       WRITE    Set the time of last
+                                                    access to the
+                                                    object.  SETATTR
+                                                    use only.
+
+   time_backup         49   nfstime4       R/W      The time of last
+                                                    backup of the
+                                                    object.
+
+   time_create         50   nfstime4       R/W      The time of creation
+                                                    of the object.  This
+                                                    attribute does not
+                                                    have any relation to
+                                                    the traditional UNIX
+                                                    file attribute
+                                                    "ctime" or "change
+                                                    time".
+
+   time_delta          51   nfstime4       READ     Smallest useful
+                                                    server time
+                                                    granularity.
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 45]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   time_metadata       52   nfstime4       READ     The time of last
+                                                    meta-data
+                                                    modification of the
+                                                    object.
+
+   time_modify         53   nfstime4       READ     The time of last
+                                                    modification to the
+                                                    object.
+
+   time_modify_set     54   settime4       WRITE    Set the time of last
+                                                    modification to the
+                                                    object.  SETATTR use
+                                                    only.
+
+   mounted_on_fileid   55   uint64         READ     Like fileid, but if
+                                                    the target
+                                                    filehandle is the
+                                                    root of a
+                                                    filesystem, return
+                                                    the fileid of the
+                                                    underlying
+                                                    directory.
+
+5.7.  Time Access
+
+   As defined above, the time_access attribute represents the time of
+   last access to the object by a read that was satisfied by the server.
+   The notion of what is an "access" depends on the server's operating
+   environment and/or the server's filesystem semantics.  For example,
+   for servers obeying POSIX semantics, time_access would be updated
+   only by the READLINK, READ, and READDIR operations and not any of the
+   operations that modify the content of the object.  Of course, setting
+   the corresponding time_access_set attribute is another way to modify
+   the time_access attribute.
+
+   Whenever the file object resides on a writable filesystem, the server
+   should make best efforts to record time_access into stable storage.
+   However, to mitigate the performance effects of doing so, and most
+   especially whenever the server is satisfying the read of the object's
+   content from its cache, the server MAY cache access time updates and
+   lazily write them to stable storage.  It is also acceptable to give
+   administrators of the server the option to disable time_access
+   updates.
+
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 46]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+5.8.  Interpreting owner and owner_group
+
+   The recommended attributes "owner" and "owner_group" (and also users
+   and groups within the "acl" attribute) are represented in terms of a
+   UTF-8 string.  To avoid a representation that is tied to a particular
+   underlying implementation at the client or server, the use of the
+   UTF-8 string has been chosen.  Note that section 6.1 of [RFC2624]
+   provides additional rationale.  It is expected that the client and
+   server will have their own local representation of owner and
+   owner_group that is used for local storage or presentation to the end
+   user.  Therefore, it is expected that when these attributes are
+   transferred between the client and server, the local representation
+   is translated to a syntax of the form "user@dns_domain".  This allows
+   a client and server that do not use the same local representation to
+   translate to a common syntax that can be interpreted by both.
+
+   Similarly, security principals may be represented in different ways
+   by different security mechanisms.  Servers normally translate these
+   representations into a common format, generally that used by local
+   storage, to serve as a means of identifying the users corresponding
+   to these security principals.  When these local identifiers are
+   translated to the form of the owner attribute, associated with files
+   created by such principals, they identify, in a common format, the
+   users associated with each corresponding set of security principals.
+
+   The translation used to interpret owner and group strings is not
+   specified as part of the protocol.  This allows various solutions to
+   be employed.  For example, a local translation table may be consulted
+   that maps between a numeric id and the user@dns_domain syntax.  A
+   name service may also be used to accomplish the translation.  A
+   server may provide a more general service, not limited by any
+   particular translation (which would only translate a limited set of
+   possible strings), by storing the owner and owner_group attributes in
+   local storage without any translation, or it may augment a
+   translation method by storing the entire string for attributes for
+   which no translation is available while using the local
+   representation for those cases in which a translation is available.
+
+   Servers that do not provide support for all possible values of the
+   owner and owner_group attributes should return an error
+   (NFS4ERR_BADOWNER) when a string is presented that has no
+   translation, as the value to be set for a SETATTR of the owner,
+   owner_group, or acl attributes.  When a server does accept an owner
+   or owner_group value as valid on a SETATTR (and similarly for the
+   owner and group strings in an acl), it is promising to return that
+   same string when a corresponding GETATTR is done.  Configuration
+   changes and ill-constructed name translations (those that contain
+
+
+
+Shepler, et al.             Standards Track                    [Page 47]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   aliasing) may make that promise impossible to honor.  Servers should
+   make appropriate efforts to avoid a situation in which these
+   attributes have their values changed when no real change to ownership
+   has occurred.
+
+   The "dns_domain" portion of the owner string is meant to be a DNS
+   domain name.  For example, user@ietf.org.  Servers should accept as
+   valid a set of users for at least one domain.  A server may treat
+   other domains as having no valid translations.  A more general
+   service is provided when a server is capable of accepting users for
+   multiple domains, or for all domains, subject to security
+   constraints.
+
+   In the case where there is no translation available to the client or
+   server, the attribute value must be constructed without the "@".
+   Therefore, the absence of the @ from the owner or owner_group
+   attribute signifies that no translation was available at the sender
+   and that the receiver of the attribute should not use that string as
+   a basis for translation into its own internal format.  Even though
+   the attribute value cannot be translated, it may still be useful.  In
+   the case of a client, the attribute string may be used for local
+   display of ownership.
+
+   To provide a greater degree of compatibility with previous versions
+   of NFS (i.e., v2 and v3), which identified users and groups by 32-bit
+   unsigned uid's and gid's, owner and group strings that consist of
+   decimal numeric values with no leading zeros can be given a special
+   interpretation by clients and servers which choose to provide such
+   support.  The receiver may treat such a user or group string as
+   representing the same user as would be represented by a v2/v3 uid or
+   gid having the corresponding numeric value.  A server is not
+   obligated to accept such a string, but may return an NFS4ERR_BADOWNER
+   instead.  To avoid this mechanism being used to subvert user and
+   group translation (whereby a client might pass all of the owners and
+   groups in numeric form), a server SHOULD return an NFS4ERR_BADOWNER
+   error when there is a valid translation for the user or group
+   designated in this way.  In that case, the client must use the
+   appropriate name@domain string and not the special form for
+   compatibility.
+
+   The owner string "nobody" may be used to designate an anonymous user,
+   which will be associated with a file created by a security principal
+   that cannot be mapped through normal means to the owner attribute.
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 48]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+5.9.  Character Case Attributes
+
+   With respect to the case_insensitive and case_preserving attributes,
+   each UCS-4 character (which UTF-8 encodes) has a "long descriptive
+   name" [RFC1345] which may or may not include the word "CAPITAL" or
+   "SMALL".  The presence of SMALL or CAPITAL allows an NFS server to
+   implement unambiguous and efficient table-driven mappings for case-
+   insensitive comparisons and non-case-preserving storage.  For
+   general character handling and internationalization issues, see the
+   section "Internationalization".
+
+5.10.  Quota Attributes
+
+   For the attributes related to filesystem quotas, the following
+   definitions apply:
+
+   quota_avail_soft
+         The value in bytes which represents the amount of additional
+         disk space that can be allocated to this file or directory
+         before the user may reasonably be warned.  It is understood
+         that this space may be consumed by allocations to other files
+         or directories, though there is a rule as to which other files
+         or directories.
+
+   quota_avail_hard
+         The value in bytes which represents the amount of additional
+         disk space beyond the current allocation that can be allocated
+         to this file or directory before further allocations will be
+         refused.  It is understood that this space may be consumed by
+         allocations to other files or directories.
+
+   quota_used
+         The value in bytes which represents the amount of disk space
+         used by this file or directory and possibly a number of other
+         similar files or directories, where the set of "similar" meets
+         at least the criterion that allocating space to any file or
+         directory in the set will reduce the "quota_avail_hard" of
+         every other file or directory in the set.
+
+         Note that there may be a number of distinct but overlapping
+         sets of files or directories for which a quota_used value is
+         maintained (e.g., "all files with a given owner", "all files
+         with a given group owner", etc.).
+
+         The server is at liberty to choose any of those sets but should
+         do so in a repeatable way.  The rule may be configured per-
+         filesystem or may be "choose the set with the smallest quota".
+
+
+
+Shepler, et al.             Standards Track                    [Page 49]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+5.11.  Access Control Lists
+
+   The NFS version 4 ACL attribute is an array of access control entries
+   (ACEs).  Although the client can read and write the ACL attribute,
+   the NFSv4 model is that the server does all access control based on
+   the server's interpretation of the ACL.  If at any point the client
+   wants to check access without issuing an operation that modifies or
+   reads data or metadata, the client can use the OPEN and ACCESS
+   operations to do so.  There are various access control entry types,
+   as defined in the Section "ACE type".  The server is able to
+   communicate which ACE types are supported by returning the
+   appropriate value within the aclsupport attribute.  Each ACE covers
+   one or more operations on a file or directory as described in the
+   Section "ACE Access Mask".  It may also contain one or more flags
+   that modify the semantics of the ACE as defined in the Section "ACE
+   flag".
+
+   The NFS ACE attribute is defined as follows:
+
+         typedef uint32_t        acetype4;
+         typedef uint32_t        aceflag4;
+         typedef uint32_t        acemask4;
+
+         struct nfsace4 {
+                 acetype4        type;
+                 aceflag4        flag;
+                 acemask4        access_mask;
+                 utf8str_mixed   who;
+         };
+
+   To determine if a request succeeds, each nfsace4 entry is processed
+   in order by the server.  Only ACEs which have a "who" that matches
+   the requester are considered.  Each ACE is processed until all of the
+   bits of the requester's access have been ALLOWED.  Once a bit (see
+   below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no longer
+   considered in the processing of later ACEs.  If an ACCESS_DENIED_ACE
+   is encountered where the requester's access still has unALLOWED bits
+   in common with the "access_mask" of the ACE, the request is denied.
+   However, unlike the ALLOWED and DENIED ACE types, the ALARM and AUDIT
+   ACE types do not affect a requester's access, and instead are for
+   triggering events as a result of a requester's access attempt.
+
+   Therefore, all AUDIT and ALARM ACEs are processed until the end of
+   the ACL.  When the ACL is fully processed, if there are bits in the
+   requester's mask that have not been considered, whether the server
+   allows or denies the access is undefined.  If there is a mode
+   attribute on the file, then this cannot happen, since the mode's
+   MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify
+   the requester's access.
+
+
+
+Shepler, et al.             Standards Track                    [Page 50]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The NFS version 4 ACL model is quite rich.  Some server platforms may
+   provide access control functionality that goes beyond the UNIX-style
+   mode attribute, but which is not as rich as the NFS ACL model.  So
+   that users can take advantage of this more limited functionality, the
+   server may indicate that it supports ACLs as long as it follows the
+   guidelines for mapping between its ACL model and the NFS version 4
+   ACL model.
+
+   The situation is complicated by the fact that a server may have
+   multiple modules that enforce ACLs.  For example, the enforcement for
+   NFS version 4 access may be different from the enforcement for local
+   access, and both may be different from the enforcement for access
+   through other protocols such as SMB.  So it may be useful for a
+   server to accept an ACL even if not all of its modules are able to
+   support it.
+
+   The guiding principle in all cases is that the server must not accept
+   ACLs that appear to make the file more secure than it really is.
+
+5.11.1.  ACE type
+
+   Type         Description
+   _____________________________________________________
+   ALLOW        Explicitly grants the access defined in
+                acemask4 to the file or directory.
+
+   DENY         Explicitly denies the access defined in
+                acemask4 to the file or directory.
+
+   AUDIT        LOG (system dependent) any access
+                attempt to a file or directory which
+                uses any of the access methods specified
+                in acemask4.
+
+   ALARM        Generate a system ALARM (system
+                dependent) when any access attempt is
+                made to a file or directory for the
+                access methods specified in acemask4.
+
+   A server need not support all of the above ACE types.  The bitmask
+   constants used to represent the above definitions within the
+   aclsupport attribute are as follows:
+
+      const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001;
+      const ACL4_SUPPORT_DENY_ACL     = 0x00000002;
+      const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004;
+      const ACL4_SUPPORT_ALARM_ACL    = 0x00000008;
+
+
+
+Shepler, et al.             Standards Track                    [Page 51]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The semantics of the "type" field follow the descriptions provided
+   above.
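+
+   As a non-normative illustration of the ordered ACE evaluation
+   described at the start of this section, the following C sketch shows
+   how a server might check one access request against an ACL.  The
+   simplified matches_who() helper is an assumption of this sketch: it
+   ignores group membership and the ACE4_IDENTIFIER_GROUP flag (see the
+   Section "ACE flag"), and AUDIT and ALARM ACEs are skipped rather
+   than triggering events.  The acetype4 constants used here are the
+   ones listed immediately below.
+
+      #include <stdint.h>
+      #include <stddef.h>
+      #include <string.h>
+
+      typedef uint32_t acetype4;
+      typedef uint32_t aceflag4;
+      typedef uint32_t acemask4;
+
+      #define ACE4_ACCESS_ALLOWED_ACE_TYPE 0x00000000
+      #define ACE4_ACCESS_DENIED_ACE_TYPE  0x00000001
+
+      struct nfsace4 {
+              acetype4    type;
+              aceflag4    flag;        /* not used in this sketch */
+              acemask4    access_mask;
+              const char *who;         /* utf8str_mixed in the protocol */
+      };
+
+      enum verdict { V_ALLOW, V_DENY, V_UNDEFINED };
+
+      /* Simplified "who" match: exact principal or EVERYONE@. */
+      static int
+      matches_who(const struct nfsace4 *ace, const char *who)
+      {
+              return strcmp(ace->who, who) == 0 ||
+                     strcmp(ace->who, "EVERYONE@") == 0;
+      }
+
+      enum verdict
+      acl_check(const struct nfsace4 *acl, size_t nace,
+                const char *who, acemask4 requested)
+      {
+              acemask4 allowed = 0;    /* bits ALLOWED so far */
+              size_t   i;
+
+              for (i = 0; i < nace; i++) {
+                      if (!matches_who(&acl[i], who))
+                              continue;   /* ACE does not cover "who" */
+                      if (acl[i].type == ACE4_ACCESS_ALLOWED_ACE_TYPE) {
+                              allowed |= acl[i].access_mask & requested;
+                              if ((requested & ~allowed) == 0)
+                                      return V_ALLOW;  /* all bits ALLOWED */
+                      } else if (acl[i].type ==
+                                 ACE4_ACCESS_DENIED_ACE_TYPE) {
+                              /* deny if a still-unALLOWED bit is denied */
+                              if (acl[i].access_mask &
+                                  (requested & ~allowed))
+                                      return V_DENY;
+                      }
+                      /* AUDIT/ALARM types: note events, no verdict */
+              }
+              /* some bits were never considered: result is undefined */
+              return V_UNDEFINED;
+      }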
+ +   The constants used for the type field (acetype4) are as follows: + +      const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000; +      const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001; +      const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002; +      const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003; + +   Clients should not attempt to set an ACE unless the server claims +   support for that ACE type.  If the server receives a request to set +   an ACE that it cannot store, it MUST reject the request with +   NFS4ERR_ATTRNOTSUPP.  If the server receives a request to set an ACE +   that it can store but cannot enforce, the server SHOULD reject the +   request with NFS4ERR_ATTRNOTSUPP. + +   Example: suppose a server can enforce NFS ACLs for NFS access but +   cannot enforce ACLs for local access.  If arbitrary processes can run +   on the server, then the server SHOULD NOT indicate ACL support.  On +   the other hand, if only trusted administrative programs run locally, +   then the server may indicate ACL support. + +5.11.2.  ACE Access Mask + +   The access_mask field contains values based on the following: + +   Access                 Description +   _______________________________________________________________ +   READ_DATA              Permission to read the data of the file +   LIST_DIRECTORY         Permission to list the contents of a +                          directory +   WRITE_DATA             Permission to modify the file's data +   ADD_FILE               Permission to add a new file to a +                          directory +   APPEND_DATA            Permission to append data to a file +   ADD_SUBDIRECTORY       Permission to create a subdirectory to a +                          directory +   READ_NAMED_ATTRS       Permission to read the named attributes +                          of a file +   WRITE_NAMED_ATTRS      Permission to write the named attributes +                          of a file +   EXECUTE                Permission to execute a file +   DELETE_CHILD           Permission to delete a file or directory +                          within a directory +   READ_ATTRIBUTES        The ability to read basic attributes +                          (non-acls) of a file +   WRITE_ATTRIBUTES       Permission to change basic attributes + + + +Shepler, et al.             
Standards Track                    [Page 52] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +                          (non-acls) of a file +   DELETE                 Permission to Delete the file +   READ_ACL               Permission to Read the ACL +   WRITE_ACL              Permission to Write the ACL +   WRITE_OWNER            Permission to change the owner +   SYNCHRONIZE            Permission to access file locally at the +                          server with synchronous reads and writes + +   The bitmask constants used for the access mask field are as follows: + +   const ACE4_READ_DATA            = 0x00000001; +   const ACE4_LIST_DIRECTORY       = 0x00000001; +   const ACE4_WRITE_DATA           = 0x00000002; +   const ACE4_ADD_FILE             = 0x00000002; +   const ACE4_APPEND_DATA          = 0x00000004; +   const ACE4_ADD_SUBDIRECTORY     = 0x00000004; +   const ACE4_READ_NAMED_ATTRS     = 0x00000008; +   const ACE4_WRITE_NAMED_ATTRS    = 0x00000010; +   const ACE4_EXECUTE              = 0x00000020; +   const ACE4_DELETE_CHILD         = 0x00000040; +   const ACE4_READ_ATTRIBUTES      = 0x00000080; +   const ACE4_WRITE_ATTRIBUTES     = 0x00000100; +   const ACE4_DELETE               = 0x00010000; +   const ACE4_READ_ACL             = 0x00020000; +   const ACE4_WRITE_ACL            = 0x00040000; +   const ACE4_WRITE_OWNER          = 0x00080000; +   const ACE4_SYNCHRONIZE          = 0x00100000; + +   Server implementations need not provide the granularity of control +   that is implied by this list of masks.  For example, POSIX-based +   systems might not distinguish APPEND_DATA (the ability to append to a +   file) from WRITE_DATA (the ability to modify existing contents); both +   masks would be tied to a single "write" permission.  When such a +   server returns attributes to the client, it would show both +   APPEND_DATA and WRITE_DATA if and only if the write permission is +   enabled. + +   If a server receives a SETATTR request that it cannot accurately +   implement, it should error in the direction of more restricted +   access.  For example, suppose a server cannot distinguish overwriting +   data from appending new data, as described in the previous paragraph. +   If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is +   not (or vice versa), the server should reject the request with +   NFS4ERR_ATTRNOTSUPP.  Nonetheless, if the ACE has type DENY, the +   server may silently turn on the other bit, so that both APPEND_DATA +   and WRITE_DATA are denied. + + + + + +Shepler, et al.             Standards Track                    [Page 53] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +5.11.3.  ACE flag + +   The "flag" field contains values based on the following descriptions. + +   ACE4_FILE_INHERIT_ACE +      Can be placed on a directory and indicates that this ACE should be +      added to each new non-directory file created. + +   ACE4_DIRECTORY_INHERIT_ACE +      Can be placed on a directory and indicates that this ACE should be +      added to each new directory created. + +   ACE4_INHERIT_ONLY_ACE +      Can be placed on a directory but does not apply to the directory, +      only to newly created files/directories as specified by the above +      two flags. + +   ACE4_NO_PROPAGATE_INHERIT_ACE +      Can be placed on a directory.  
Normally when a new directory is
+      created and an ACE exists on the parent directory which is marked
+      ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new
+      directory: one for the directory itself, and one which is an
+      inheritable ACE for newly created directories.  This flag tells
+      the server to not place an ACE on the newly created directory
+      which is inheritable by subdirectories of the created directory.
+
+   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG
+
+   ACE4_FAILED_ACCESS_ACE_FLAG
+      The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
+      ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to
+      ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
+      (ALARM) ACE types.  If during the processing of the file's ACL,
+      the server encounters an AUDIT or ALARM ACE that matches the
+      principal attempting the OPEN, the server notes that fact, and the
+      presence, if any, of the SUCCESS and FAILED flags encountered in
+      the AUDIT or ALARM ACE.  Once the server completes the ACL
+      processing, and the share reservation processing, and the OPEN
+      call, it then notes if the OPEN succeeded or failed.  If the OPEN
+      succeeded, and if the SUCCESS flag was set for a matching AUDIT or
+      ALARM, then the appropriate AUDIT or ALARM event occurs.  If the
+      OPEN failed, and if the FAILED flag was set for the matching AUDIT
+      or ALARM, then the appropriate AUDIT or ALARM event occurs.
+      Clearly either or both of the SUCCESS or FAILED flags can be set,
+      but if neither is set, the AUDIT or ALARM ACE is not useful.
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 54]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      The previously described processing applies to the ACCESS
+      operation as well.  The difference is that "success" or "failure"
+      does not mean whether ACCESS returns NFS4_OK or not.  Success
+      means that ACCESS returns all requested and supported bits.
+      Failure means that ACCESS failed to return a bit that was
+      requested and supported.
+
+   ACE4_IDENTIFIER_GROUP
+      Indicates that the "who" refers to a GROUP as defined under UNIX.
+
+   The bitmask constants used for the flag field are as follows:
+
+   const ACE4_FILE_INHERIT_ACE             = 0x00000001;
+   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
+   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
+   const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
+   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
+   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
+   const ACE4_IDENTIFIER_GROUP             = 0x00000040;
+
+   A server need not support any of these flags.  If the server supports
+   flags that are similar to, but not exactly the same as, these flags,
+   the implementation may define a mapping between the protocol-defined
+   flags and the implementation-defined flags.  Again, the guiding
+   principle is that the file not appear to be more secure than it
+   really is.
+
+   For example, suppose a client tries to set an ACE with
+   ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE.  If the
+   server does not support any form of ACL inheritance, the server
+   should reject the request with NFS4ERR_ATTRNOTSUPP.  If the server
+   supports a single "inherit ACE" flag that applies to both files and
+   directories, the server may reject the request (i.e., requiring the
+   client to set both the file and directory inheritance flags).  The
+   server may also accept the request and silently turn on the
+   ACE4_DIRECTORY_INHERIT_ACE flag.
+
+5.11.4.  ACE who
+
+   There are several special identifiers ("who") which need to be
+   understood universally, rather than in the context of a particular
+   DNS domain.  Some of these identifiers cannot be understood when an
+   NFS client accesses the server, but have meaning when a local process
+   accesses the file.  The ability to display and modify these
+   permissions is permitted over NFS, even if none of the access methods
+   on the server understands the identifiers.
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 55]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   Who                    Description
+   _______________________________________________________________
+   "OWNER"                The owner of the file.
+   "GROUP"                The group associated with the file.
+   "EVERYONE"             The world.
+   "INTERACTIVE"          Accessed from an interactive terminal.
+   "NETWORK"              Accessed via the network.
+   "DIALUP"               Accessed as a dialup user to the server.
+   "BATCH"                Accessed from a batch job.
+   "ANONYMOUS"            Accessed without any authentication.
+   "AUTHENTICATED"        Any authenticated user (opposite of
+                          ANONYMOUS)
+   "SERVICE"              Access from a system service.
+
+   To avoid conflict, these special identifiers are distinguished by an
+   appended "@" and should appear in the form "xxxx@" (note: no domain
+   name after the "@").  For example: ANONYMOUS@.
+
+5.11.5.  Mode Attribute
+
+   The NFS version 4 mode attribute is based on the UNIX mode bits.  The
+   following bits are defined:
+
+      const MODE4_SUID = 0x800;  /* set user id on execution */
+      const MODE4_SGID = 0x400;  /* set group id on execution */
+      const MODE4_SVTX = 0x200;  /* save text even after use */
+      const MODE4_RUSR = 0x100;  /* read permission: owner */
+      const MODE4_WUSR = 0x080;  /* write permission: owner */
+      const MODE4_XUSR = 0x040;  /* execute permission: owner */
+      const MODE4_RGRP = 0x020;  /* read permission: group */
+      const MODE4_WGRP = 0x010;  /* write permission: group */
+      const MODE4_XGRP = 0x008;  /* execute permission: group */
+      const MODE4_ROTH = 0x004;  /* read permission: other */
+      const MODE4_WOTH = 0x002;  /* write permission: other */
+      const MODE4_XOTH = 0x001;  /* execute permission: other */
+
+   Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
+   identified in the owner attribute.  Bits MODE4_RGRP, MODE4_WGRP, and
+   MODE4_XGRP apply to the principals identified in the owner_group
+   attribute.  Bits MODE4_ROTH, MODE4_WOTH, and MODE4_XOTH apply to any
+   principal that does not match that in the owner attribute and does
+   not have a group matching that of the owner_group attribute.
+
+   The remaining bits are not defined by this protocol and MUST NOT be
+   used.  The minor version mechanism must be used to define further bit
+   usage.
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 56]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   Note that in UNIX, if a file has the MODE4_SGID bit set and no
+   MODE4_XGRP bit set, then READ and WRITE must use mandatory file
+   locking.
+
+5.11.6.  Mode and ACL Attribute
+
+   A server that supports both mode and ACL must take care to
+   synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
+   ACEs which have respective who fields of "OWNER@", "GROUP@", and
+   "EVERYONE@" so that the client can see semantically equivalent access
+   permissions exist whether the client asks for the owner, owner_group,
+   and mode attributes, or for just the ACL.
+
+   Because the mode attribute includes bits (e.g., MODE4_SVTX) that have
+   nothing to do with ACL semantics, it is permitted for clients to
+   specify both the ACL attribute and mode in the same SETATTR
+   operation.  However, because there is no prescribed order for
+   processing the attributes in a SETATTR, the client must ensure that
+   the ACL attribute, if specified without mode, would produce the
+   desired mode bits and, conversely, that the mode attribute, if
+   specified without ACL, would produce the desired "OWNER@", "GROUP@",
+   and "EVERYONE@" ACEs.
+
+5.11.7.  mounted_on_fileid
+
+   UNIX-based operating environments connect a filesystem into the
+   namespace by connecting (mounting) the filesystem onto the existing
+   file object (the mount point, usually a directory) of an existing
+   filesystem.  When the mount point's parent directory is read via an
+   API like readdir(), the return results are directory entries, each
+   with a component name and a fileid.  The fileid of the mount point's
+   directory entry will be different from the fileid that the stat()
+   system call returns.  The stat() system call is returning the fileid
+   of the root of the mounted filesystem, whereas readdir() is returning
+   the fileid stat() would have returned before any filesystems were
+   mounted on the mount point.
+
+   Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request
+   to cross other filesystems.  The client detects the filesystem
+   crossing whenever the filehandle argument of LOOKUP has an fsid
+   attribute different from that of the filehandle returned by LOOKUP.
+   A UNIX-based client will consider this a "mount point crossing".
+   UNIX has a legacy scheme for allowing a process to determine its
+   current working directory.  This relies on readdir() of a mount
+   point's parent and stat() of the mount point returning fileids as
+   previously described.  The mounted_on_fileid attribute corresponds to
+   the fileid that readdir() would have returned as described
+   previously.
+
+
+
+Shepler, et al.             Standards Track                    [Page 57]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   While the NFS version 4 client could simply fabricate a fileid
+   corresponding to what mounted_on_fileid provides (and if the server
+   does not support mounted_on_fileid, the client has no choice), there
+   is a risk that the client will generate a fileid that conflicts with
+   one that is already assigned to another object in the filesystem.
+   Instead, if the server can provide the mounted_on_fileid, the
+   potential for client operational problems in this area is eliminated.
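+
+   The following non-normative sketch shows how a UNIX-based server
+   might derive mounted_on_fileid.  It is illustrative only: struct
+   fsobj and its fields are assumptions standing in for the server's
+   internal filesystem representation, not part of the protocol.
+
+      #include <stdint.h>
+      #include <stddef.h>
+
+      struct fsobj {
+              uint64_t      fileid;      /* the fileid attribute      */
+              int           is_fs_root;  /* root of a mounted fs?     */
+              struct fsobj *covered;     /* directory covered by this */
+                                         /* mount, or NULL            */
+      };
+
+      uint64_t
+      mounted_on_fileid(const struct fsobj *obj)
+      {
+              /*
+               * Walk down through stacked mounts to the base mount
+               * point.  For an object that is not the root of a
+               * mounted filesystem, the loop body never runs and the
+               * plain fileid is returned.
+               */
+              while (obj->is_fs_root && obj->covered != NULL)
+                      obj = obj->covered;
+              return obj->fileid;
+      }
+
+   The loop reflects the "base mount point" invariant for stacked
+   mounts discussed below, and the fall-through case matches the rule
+   in the next paragraph.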
+
+   If the server detects that there is no mount point at the target
+   file object, then the value for mounted_on_fileid that it returns is
+   the same as that of the fileid attribute.
+
+   The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
+   provide it if possible, and for a UNIX-based server, this is
+   straightforward.  Usually, mounted_on_fileid will be requested during
+   a READDIR operation, in which case it is trivial (at least for UNIX-
+   based servers) to return mounted_on_fileid since it is equal to the
+   fileid of a directory entry returned by readdir().  If
+   mounted_on_fileid is requested in a GETATTR operation, the server
+   should obey an invariant that has it returning a value that is equal
+   to the fileid of the file object's entry in the object's parent
+   directory, i.e., what readdir() would have returned.  Some operating
+   environments allow a series of two or more filesystems to be mounted
+   onto a single mount point.  In this case, for the server to obey the
+   aforementioned invariant, it will need to find the base mount point,
+   and not the intermediate mount points.
+
+6.  Filesystem Migration and Replication
+
+   With the use of the recommended attribute "fs_locations", the NFS
+   version 4 server has a method of providing filesystem migration or
+   replication services.  For the purposes of migration and replication,
+   a filesystem will be defined as all files that share a given fsid
+   (both major and minor values are the same).
+
+   The fs_locations attribute provides a list of filesystem locations.
+   These locations are specified by providing the server name (either
+   DNS domain or IP address) and the path name representing the root of
+   the filesystem.  Depending on the type of service being provided, the
+   list will provide a new location or a set of alternate locations for
+   the filesystem.  The client will use this information to redirect its
+   requests to the new server.
+
+6.1.  Replication
+
+   It is expected that filesystem replication will be used in the case
+   of read-only data.  Typically, the filesystem will be replicated on
+   two or more servers.  The fs_locations attribute will provide the
+
+
+
+Shepler, et al.             Standards Track                    [Page 58]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   list of these locations to the client.  On first access of the
+   filesystem, the client should obtain the value of the fs_locations
+   attribute.  If, in the future, the client finds the server
+   unresponsive, the client may attempt to use another server specified
+   by fs_locations.
+
+   If applicable, the client must take the appropriate steps to recover
+   valid filehandles from the new server.  This is described in more
+   detail in the following sections.
+
+6.2.  Migration
+
+   Filesystem migration is used to move a filesystem from one server to
+   another.  Migration is typically used for a filesystem that is
+   writable and has a single copy.  The expected use of migration is for
+   load balancing or general resource reallocation.  The protocol does
+   not specify how the filesystem will be moved between servers.  This
+   server-to-server transfer mechanism is left to the server
+   implementor.  However, the method used to communicate the migration
+   event between client and server is specified here.
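+
+   Before turning to the details of how the migration event is
+   communicated, the replication scheme of the previous subsection can
+   be made concrete with a short client-side sketch.  It is
+   non-normative and assumes a hypothetical helper, nfs_read_replica(),
+   standing in for the client's RPC machinery.
+
+      #include <stdint.h>
+      #include <stddef.h>
+
+      struct location {
+              const char *server;    /* DNS name or IP address */
+              const char *rootpath;  /* root of the fs there   */
+      };
+
+      /* Hypothetical transport call: 0 on success, -1 on timeout. */
+      extern int nfs_read_replica(const struct location *loc,
+                                  void *buf, size_t len,
+                                  uint64_t offset);
+
+      /*
+       * Read through the locations listed in fs_locations, moving to
+       * the next replica when the current server is found to be
+       * unresponsive.  Returns 0 on success, -1 if all replicas fail.
+       */
+      int
+      read_with_failover(const struct location *locs, size_t nlocs,
+                         size_t *cur, void *buf, size_t len,
+                         uint64_t offset)
+      {
+              size_t tried;
+
+              for (tried = 0; tried < nlocs; tried++) {
+                      if (nfs_read_replica(&locs[*cur], buf, len,
+                                           offset) == 0)
+                              return 0;           /* server answered */
+                      *cur = (*cur + 1) % nlocs;  /* try next replica */
+              }
+              return -1;
+      }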
+
+   Once the servers participating in the migration have completed the
+   move of the filesystem, the error NFS4ERR_MOVED will be returned for
+   subsequent requests received by the original server.  The
+   NFS4ERR_MOVED error is returned for all operations except PUTFH and
+   GETATTR.  Upon receiving the NFS4ERR_MOVED error, the client will
+   obtain the value of the fs_locations attribute.  The client will then
+   use the contents of the attribute to redirect its requests to the
+   specified server.  To facilitate the use of GETATTR, operations such
+   as PUTFH must also be accepted by the server for the migrated file
+   system's filehandles.  Note that if the server returns NFS4ERR_MOVED,
+   the server MUST support the fs_locations attribute.
+
+   If the client requests more attributes than just fs_locations, the
+   server may return fs_locations only.  This is to be expected since
+   the server has migrated the filesystem and may not have a method of
+   obtaining additional attribute data.
+
+   The server implementor needs to be careful in developing a migration
+   solution.  The server must consider all of the state information
+   clients may have outstanding at the server.  This includes but is not
+   limited to locking/share state, delegation state, and asynchronous
+   file writes which are represented by WRITE and COMMIT verifiers.  The
+   server should strive to minimize the impact on its clients during and
+   after the migration process.
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 59]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+6.3.  Interpretation of the fs_locations Attribute
+
+   The fs_locations attribute is structured in the following way:
+
+   struct fs_location {
+           utf8str_cis     server<>;
+           pathname4       rootpath;
+   };
+
+   struct fs_locations {
+           pathname4       fs_root;
+           fs_location     locations<>;
+   };
+
+   The fs_location struct is used to represent the location of a
+   filesystem by providing a server name and the path to the root of the
+   filesystem.  For a multi-homed server or a set of servers that use
+   the same rootpath, an array of server names may be provided.  An
+   entry in the server array is a UTF-8 string and represents one of a
+   traditional DNS host name, IPv4 address, or IPv6 address.  It is not
+   a requirement that all servers that share the same rootpath be listed
+   in one fs_location struct.  The array of server names is provided for
+   convenience.  Servers that share the same rootpath may also be listed
+   in separate fs_location entries in the fs_locations attribute.
+
+   The fs_locations struct and attribute then contains an array of
+   locations.  Since the name space of each server may be constructed
+   differently, the "fs_root" field is provided.  The path represented
+   by fs_root represents the location of the filesystem in the server's
+   name space.  Therefore, the fs_root path is only associated with the
+   server from which the fs_locations attribute was obtained.  The
+   fs_root path is meant to aid the client in locating the filesystem at
+   the various servers listed.
+
+   As an example, there is a replicated filesystem located at two
+   servers (servA and servB).  At servA the filesystem is located at
+   path "/a/b/c".  At servB the filesystem is located at path "/x/y/z".
+
+   In this example the client accesses the filesystem first at servA
+   with a multi-component lookup path of "/a/b/c/d".  Since the client
+   used a multi-component lookup to obtain the filehandle at "/a/b/c/d",
+   it is unaware that the filesystem's root is located in servA's name
+   space at "/a/b/c".  When the client switches to servB, it will need
+   to determine that the directory it first referenced at servA is now
+   represented by the path "/x/y/z/d" on servB.  To facilitate this, the
+   fs_locations attribute provided by servA would have an fs_root value
+   of "/a/b/c" and two entries in fs_location.  One entry in fs_location
+   will be for itself (servA) and the other will be for servB with a
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 60]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   path of "/x/y/z".  With this information, the client is able to
+   substitute "/x/y/z" for the "/a/b/c" at the beginning of its access
+   path and construct "/x/y/z/d" to use for the new server.
+
+   See the section "Security Considerations" for a discussion on the
+   recommendations for the security flavor to be used by any GETATTR
+   operation that requests the "fs_locations" attribute.
+
+6.4.  Filehandle Recovery for Migration or Replication
+
+   Filehandles for filesystems that are replicated or migrated generally
+   have the same semantics as for filesystems that are not replicated or
+   migrated.  For example, if a filesystem has persistent filehandles
+   and it is migrated to another server, the filehandle values for the
+   filesystem will be valid at the new server.
+
+   For volatile filehandles, the servers involved likely do not have a
+   mechanism to transfer filehandle format and content between
+   themselves.  Therefore, a server may have difficulty in determining
+   if a volatile filehandle from an old server should return an error of
+   NFS4ERR_FHEXPIRED.  For this reason, the client is informed, with the
+   use of the fh_expire_type attribute, whether volatile filehandles
+   will expire at the migration or replication event.  If the bit
+   FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client
+   must treat the volatile filehandle as if the server had returned the
+   NFS4ERR_FHEXPIRED error.  At the migration or replication event in
+   the presence of the FH4_VOL_MIGRATION bit, the client will not
+   present the original or old volatile filehandle to the new server.
+   The client will start its communication with the new server by
+   recovering its filehandles using the saved file names.
+
+7.  NFS Server Name Space
+
+7.1.  Server Exports
+
+   On a UNIX server the name space describes all the files reachable by
+   pathnames under the root directory or "/".  On a Windows NT server
+   the name space constitutes all the files on disks named by mapped
+   disk letters.  NFS server administrators rarely make the entire
+   server's filesystem name space available to NFS clients.  More often
+   portions of the name space are made available via an "export"
+   feature.  In previous versions of the NFS protocol, the root
+   filehandle for each export is obtained through the MOUNT protocol;
+   the client sends a string that identifies the export of name space
+   and the server returns the root filehandle for it.  The MOUNT
+   protocol supports an EXPORTS procedure that will enumerate the
+   server's exports.
+
+
+
+Shepler, et al.             Standards Track                    [Page 61]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+7.2.  Browsing Exports
+
+   The NFS version 4 protocol provides a root filehandle that clients
+   can use to obtain filehandles for these exports via a multi-component
+   LOOKUP.  A common user experience is to use a graphical user
+   interface (perhaps a file "Open" dialog window) to find a file via
+   progressive browsing through a directory tree.  The client must be
+   able to move from one export to another export via single-component,
+   progressive LOOKUP operations.
+
+   This style of browsing is not well supported by the NFS version 2 and
+   3 protocols.  The client expects all LOOKUP operations to remain
+   within a single server filesystem.  For example, the device attribute
+   will not change.  This prevents a client from taking name space paths
+   that span exports.
+
+   An automounter on the client can obtain a snapshot of the server's
+   name space using the EXPORTS procedure of the MOUNT protocol.  If it
+   understands the server's pathname syntax, it can create an image of
+   the server's name space on the client.  The parts of the name space
+   that are not exported by the server are filled in with a "pseudo
+   filesystem" that allows the user to browse from one mounted
+   filesystem to another.  There is a drawback to this representation of
+   the server's name space on the client: it is static.  If the server
+   administrator adds a new export, the client will be unaware of it.
+
+7.3.  Server Pseudo Filesystem
+
+   NFS version 4 servers avoid this name space inconsistency by
+   presenting all the exports within the framework of a single server
+   name space.  An NFS version 4 client uses LOOKUP and READDIR
+   operations to browse seamlessly from one export to another.  Portions
+   of the server name space that are not exported are bridged via a
+   "pseudo filesystem" that provides a view of exported directories
+   only.  A pseudo filesystem has a unique fsid and behaves like a
+   normal, read-only filesystem.
+
+   Based on the construction of the server's name space, it is possible
+   that multiple pseudo filesystems may exist.  For example,
+
+   /a         pseudo filesystem
+   /a/b       real filesystem
+   /a/b/c     pseudo filesystem
+   /a/b/c/d   real filesystem
+
+   Each of the pseudo filesystems is considered a separate entity and
+   therefore will have a unique fsid.
+
+
+
+Shepler, et al.             Standards Track                    [Page 62]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+7.4.  Multiple Roots
+
+   The DOS and Windows operating environments are sometimes described as
+   having "multiple roots".  Filesystems are commonly represented as
+   disk letters.  MacOS represents filesystems as top level names.  NFS
+   version 4 servers for these platforms can construct a pseudo file
+   system above these root names so that disk letters or volume names
+   are simply directory names in the pseudo root.
+
+7.5.  Filehandle Volatility
+
+   The nature of the server's pseudo filesystem is that it is a logical
+   representation of filesystem(s) available from the server.
+   Therefore, the pseudo filesystem is most likely constructed
+   dynamically when the server is first instantiated.  It is expected
+   that the pseudo filesystem may not have an on disk counterpart from
+   which persistent filehandles could be constructed.  Even though it is
+   preferable that the server provide persistent filehandles for the
+   pseudo filesystem, the NFS client should expect that pseudo file
+   system filehandles are volatile.  This can be confirmed by checking
+   the associated "fh_expire_type" attribute for those filehandles in
+   question.  If the filehandles are volatile, the NFS client must be
+   prepared to recover a filehandle value (e.g., with a multi-component
+   LOOKUP) when receiving an error of NFS4ERR_FHEXPIRED.
+
+7.6.  Exported Root
+
+   If the server's root filesystem is exported, one might conclude that
+   a pseudo-filesystem is not needed.  This would be wrong.  Assume the
+   following filesystems on a server:
+
+         /       disk1  (exported)
+         /a      disk2  (not exported)
+         /a/b    disk3  (exported)
+
+   Because disk2 is not exported, disk3 cannot be reached with simple
+   LOOKUPs.  The server must bridge the gap with a pseudo-filesystem.
+
+7.7.  Mount Point Crossing
+
+   The server filesystem environment may be constructed in such a way
+   that one filesystem contains a directory which is 'covered' or
+   mounted upon by a second filesystem.  For example:
+
+         /a/b            (filesystem 1)
+         /a/b/c/d        (filesystem 2)
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 63]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The pseudo filesystem for this server may be constructed to look
+   like:
+
+         /               (place holder/not exported)
+         /a/b            (filesystem 1)
+         /a/b/c/d        (filesystem 2)
+
+   It is the server's responsibility to present a complete pseudo
+   filesystem to the client.  If the client sends a lookup request for
+   the path "/a/b/c/d", the server's response is the filehandle of the
+   filesystem "/a/b/c/d".  In previous versions of the NFS protocol, the
+   server would respond with the filehandle of directory "/a/b/c/d"
+   within the filesystem "/a/b".
+
+   The NFS client will be able to determine if it crosses a server mount
+   point by a change in the value of the "fsid" attribute.
+
+7.8.  Security Policy and Name Space Presentation
+
+   The application of the server's security policy needs to be carefully
+   considered by the implementor.  One may choose to limit the
+   viewability of portions of the pseudo filesystem based on the
+   server's perception of the client's ability to authenticate itself
+   properly.  However, with the support of multiple security mechanisms
+   and the ability to negotiate the appropriate use of these mechanisms,
+   the server is unable to properly determine if a client will be able
+   to authenticate itself.  If, based on its policies, the server
+   chooses to limit the contents of the pseudo filesystem, the server
+   may effectively hide filesystems from a client that may otherwise
+   have legitimate access.
+
+   As suggested practice, the server should apply the security policy of
+   a shared resource in the server's namespace to the components of the
+   resource's ancestors.  For example:
+
+         /
+         /a/b
+         /a/b/c
+
+   The /a/b/c directory is a real filesystem and is the shared resource.
+   The security policy for /a/b/c is Kerberos with integrity.  The
+   server should apply the same security policy to /, /a, and /a/b.
+
+   This allows for the extension of the protection of the server's
+   namespace to the ancestors of the real shared resource.
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 64]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   For the case of the use of multiple, disjoint security mechanisms in
+   the server's resources, the security for a particular object in the
+   server's namespace should be the union of all security mechanisms of
+   all direct descendants.
+
+8.  File Locking and Share Reservations
+
+   Integrating locking into the NFS protocol necessarily causes it to be
+   stateful.  With the inclusion of share reservations the protocol
+   becomes substantially more dependent on state than the traditional
+   combination of NFS and NLM [XNFS].  There are three components to
+   making this state manageable:
+
+   o  Clear division between client and server
+
+   o  Ability to reliably detect inconsistency in state between client
+      and server
+
+   o  Simple and robust recovery mechanisms
+
+   In this model, the server owns the state information.  The client
+   communicates its view of this state to the server as needed.  The
+   client is also able to detect inconsistent state before modifying a
+   file.
+
+   To support Win32 share reservations, it is necessary to atomically
+   OPEN or CREATE files.  Having a separate share/unshare operation
+   would not allow correct implementation of the Win32 OpenFile API.  In
+   order to correctly implement share semantics, the previous NFS
+   protocol mechanisms used when a file is opened or created (LOOKUP,
+   CREATE, ACCESS) need to be replaced.  The NFS version 4 protocol has
+   an OPEN operation that subsumes the NFS version 3 methodology of
+   LOOKUP, CREATE, and ACCESS.  However, because many operations require
+   a filehandle, the traditional LOOKUP is preserved to map a file name
+   to filehandle without establishing state on the server.  The policy
+   of granting access or modifying files is managed by the server based
+   on the client's state.  These mechanisms can implement policy ranging
+   from advisory only locking to full mandatory locking.
+
+8.1.  Locking
+
+   It is assumed that manipulating a lock is rare when compared to READ
+   and WRITE operations.  It is also assumed that crashes and network
+   partitions are relatively rare.  Therefore, it is important that the
+   READ and WRITE operations have a lightweight mechanism to indicate if
+   they possess a held lock.  A lock request contains the heavyweight
+   information required to establish a lock and uniquely define the lock
+   owner.
+
+
+
+Shepler, et al.             Standards Track                    [Page 65]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The following sections describe the transition from the heavyweight
+   information to the eventual stateid used for most client and server
+   locking and lease interactions.
+
+8.1.1.  Client ID
+
+   For each LOCK request, the client must identify itself to the server.
+   This is done in such a way as to allow for correct lock
+   identification and crash recovery.  A sequence of a SETCLIENTID
+   operation followed by a SETCLIENTID_CONFIRM operation is required to
+   establish the identification with the server.  Establishment of
+   identification by a new incarnation of the client also has the effect
+   of immediately breaking any leased state that a previous incarnation
+   of the client might have had on the server, as opposed to forcing the
+   new client incarnation to wait for the leases to expire.  Breaking
+   the lease state amounts to the server removing all lock, share
+   reservation, and, where the server is not supporting the
+   CLAIM_DELEGATE_PREV claim type, all delegation state associated with
+   the same client with the same identity.  For discussion of delegation
+   state recovery, see the section "Delegation Recovery".
+
+   Client identification is encapsulated in the following structure:
+
+         struct nfs_client_id4 {
+                 verifier4     verifier;
+                 opaque        id<NFS4_OPAQUE_LIMIT>;
+         };
+
+   The first field, verifier, is a client incarnation verifier that is
+   used to detect client reboots.  Only if the verifier is different
+   from that which the server has previously recorded for the client (as
+   identified by the second field of the structure, id) does the server
+   start the process of canceling the client's leased state.
+
+   The second field, id, is a variable-length string that uniquely
+   defines the client.
+
+   There are several considerations for how the client generates the id
+   string (a small illustrative sketch follows this list):
+
+   o  The string should be unique so that multiple clients do not
+      present the same string.  The consequences of two clients
+      presenting the same string range from one client getting an error
+      to one client having its leased state abruptly and unexpectedly
+      canceled.
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 66]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  The string should be selected so that subsequent incarnations
+      (e.g., reboots) of the same client cause the client to present the
+      same string.  The implementor is cautioned against an approach
+      that requires the string to be recorded in a local file because
+      this precludes the use of the implementation in an environment
+      where there is no local disk and all file access is from an NFS
+      version 4 server.
+
+   o  The string should be different for each server network address
+      that the client accesses, rather than common to all server network
+      addresses.  The reason is that it may not be possible for the
+      client to tell if the same server is listening on multiple network
+      addresses.  If the client issues SETCLIENTID with the same id
+      string to each network address of such a server, the server will
+      think it is the same client, and each successive SETCLIENTID will
+      cause the server to begin the process of removing the client's
+      previous leased state.
+
+   o  The algorithm for generating the string should not assume that the
+      client's network address won't change.  This includes changes
+      between client incarnations and even changes while the client is
+      still running in its current incarnation.  This means that if the
+      client includes just the client's and server's network address in
+      the id string, there is a real risk, after the client gives up the
+      network address, that another client, using a similar algorithm
+      for generating the id string, will generate a conflicting id
+      string.
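+
+   One possible construction reflecting these considerations is
+   sketched below.  It is illustrative only: the inputs are assumed to
+   be gathered elsewhere, and the layout of the string carries no
+   protocol meaning.
+
+      #include <stdio.h>
+      #include <stddef.h>
+
+      /*
+       * Build an nfs_client_id4 "id" string such as
+       *   "192.0.2.10/203.0.113.7/00:11:22:33:44:55"
+       * from the server address (for per-server uniqueness), the
+       * client address, and a stable per-host token such as a MAC
+       * address.  Returns the length written, or -1 if buf is too
+       * small.
+       */
+      int
+      make_client_id(char *buf, size_t buflen,
+                     const char *server_addr,
+                     const char *client_addr,
+                     const char *stable_token)
+      {
+              int n = snprintf(buf, buflen, "%s/%s/%s",
+                               server_addr, client_addr, stable_token);
+              return (n < 0 || (size_t)n >= buflen) ? -1 : n;
+      }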
+
+   Given the above considerations, an example of a well-generated id
+   string is one that includes:
+
+   o  The server's network address.
+
+   o  The client's network address.
+
+   o  For a user-level NFS version 4 client, it should contain
+      additional information to distinguish the client from other
+      user-level clients running on the same host, such as a process id
+      or other unique sequence.
+
+   o  Additional information that tends to be unique, such as one or
+      more of:
+
+      -  The client machine's serial number (for privacy reasons, it is
+         best to perform some one-way function on the serial number).
+
+      -  A MAC address.
+
+      -  The timestamp of when the NFS version 4 software was first
+         installed on the client (though this is subject to the
+         previously mentioned caution about using information that is
+         stored in a file, because the file might only be accessible
+         over NFS version 4).
+
+      -  A true random number.  However, since this number ought to be
+         the same between client incarnations, this shares the same
+         problem as that of using the timestamp of the software
+         installation.
+
+   As a security measure, the server MUST NOT cancel a client's leased
+   state if the principal that established the state for a given id
+   string is not the same as the principal issuing the SETCLIENTID.
+
+   Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary
+   purpose of establishing the information the server needs to make
+   callbacks to the client for the purpose of supporting delegations.
+   It is permitted to change this information via SETCLIENTID and
+   SETCLIENTID_CONFIRM within the same incarnation of the client without
+   removing the client's leased state.
+
+   Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully
+   completed, the client uses the shorthand client identifier, of type
+   clientid4, instead of the longer and less compact nfs_client_id4
+   structure.  This shorthand client identifier (a clientid) is assigned
+   by the server and should be chosen so that it will not conflict with
+   a clientid previously assigned by the server.  This applies across
+   server restarts or reboots.  When a clientid is presented to a server
+   and that clientid is not recognized, as would happen after a server
+   reboot, the server will reject the request with the error
+   NFS4ERR_STALE_CLIENTID.  When this happens, the client must obtain a
+   new clientid by use of the SETCLIENTID operation and then proceed to
+   any other necessary recovery for the server reboot case (see the
+   section "Server Failure and Recovery").
+
+   The client must also employ the SETCLIENTID operation when it
+   receives an NFS4ERR_STALE_STATEID error using a stateid derived from
+   its current clientid, since this also indicates a server reboot which
+   has invalidated the existing clientid (see the section "lock_owner
+   and stateid Definition" for details).
+
+   See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
+   for a complete specification of the operations.
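+
+   The recovery path just described can be summarized in a short,
+   hedged sketch.  The nfs_client structure and the new_boot_verifier,
+   build_id_string, do_setclientid, and do_setclientid_confirm helpers
+   are hypothetical stand-ins for a client's RPC layer, not names
+   defined by this protocol; the XDR types are approximated in C.
+
+      #include <stdint.h>
+
+      enum { NFS4_OK = 0 };                /* protocol success code */
+      typedef uint64_t clientid4;          /* from the protocol XDR */
+      typedef unsigned char verifier4[8];  /* NFS4_VERIFIER_SIZE    */
+
+      /* C approximation of the XDR nfs_client_id4 shown above. */
+      struct nfs_client_id4 {
+          verifier4 verifier;
+          char      id[1024];
+      };
+
+      struct nfs_client { clientid4 clientid; /* per-server state */ };
+
+      /* Hypothetical client RPC-layer helpers. */
+      extern void new_boot_verifier(verifier4 v);
+      extern void build_id_string(struct nfs_client *clp,
+                                  struct nfs_client_id4 *args);
+      extern int do_setclientid(struct nfs_client *clp,
+                                const struct nfs_client_id4 *args,
+                                clientid4 *out, verifier4 confirm);
+      extern int do_setclientid_confirm(struct nfs_client *clp,
+                                        clientid4 id,
+                                        const verifier4 confirm);
+
+      /* Re-establish identification after NFS4ERR_STALE_CLIENTID or
+         NFS4ERR_STALE_STATEID indicates a server reboot. */
+      int reestablish_clientid(struct nfs_client *clp)
+      {
+          struct nfs_client_id4 args;
+          clientid4 clientid;
+          verifier4 confirm;
+          int status;
+
+          new_boot_verifier(args.verifier);  /* fresh per incarnation */
+          build_id_string(clp, &args);       /* as sketched earlier   */
+
+          status = do_setclientid(clp, &args, &clientid, confirm);
+          if (status != NFS4_OK)
+              return status;
+
+          /* The new clientid is unusable until confirmed. */
+          status = do_setclientid_confirm(clp, clientid, confirm);
+          if (status == NFS4_OK)
+              clp->clientid = clientid;  /* then proceed to recovery */
+          return status;
+      }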
+
+8.1.2.  Server Release of Clientid
+
+   If the server determines that the client holds no associated state
+   for its clientid, the server may choose to release the clientid.  The
+   server may make this choice for an inactive client so that resources
+   are not consumed by intermittently active clients.  If the client
+   contacts the server after this release, the server must ensure that
+   the client receives the appropriate error so that it will use the
+   SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity.
+   It should be clear that the server must be very hesitant to release a
+   clientid since the resulting work on the client to recover from such
+   an event will be the same burden as if the server had failed and
+   restarted.  Typically a server would not release a clientid unless
+   there had been no activity from that client for many minutes.
+
+   Note that if the id string in a SETCLIENTID request is properly
+   constructed, and if the client takes care to use the same principal
+   for each successive use of SETCLIENTID, then, barring an active
+   denial-of-service attack, NFS4ERR_CLID_INUSE should never be
+   returned.
+
+   However, client bugs, server bugs, or perhaps a deliberate change of
+   the principal owner of the id string (such as the case of a client
+   that changes security flavors, and under the new flavor, there is no
+   mapping to the previous owner) will in rare cases result in
+   NFS4ERR_CLID_INUSE.
+
+   In that event, when the server gets a SETCLIENTID for a client id
+   that currently has no state, or that has state whose lease has
+   expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST
+   allow the SETCLIENTID, and confirm the new clientid if followed by
+   the appropriate SETCLIENTID_CONFIRM.
+
+8.1.3.  lock_owner and stateid Definition
+
+   When requesting a lock, the client must present to the server the
+   clientid and an identifier for the owner of the requested lock.
+   These two fields are referred to as the lock_owner, and they are
+   defined as follows:
+
+   o  A clientid returned by the server as part of the client's use of
+      the SETCLIENTID operation.
+
+   o  A variable-length opaque array used to uniquely define the owner
+      of a lock managed by the client.
+
+      This may be a thread id, process id, or other unique value.
+
+   When the server grants the lock, it responds with a unique stateid.
+   The stateid is used as a shorthand reference to the lock_owner, since
+   the server will be maintaining the correspondence between them.
+
+   The server is free to form the stateid in any manner that it chooses
+   as long as it is able to recognize invalid and out-of-date stateids.
+   This requirement includes those stateids generated by earlier
+   instances of the server.  From this, the client can be properly
+   notified of a server restart.  This notification will occur when the
+   client presents a stateid to the server from a previous
+   instantiation.
+
+   The server must be able to distinguish the following situations and
+   return the error as specified:
+
+   o  The stateid was generated by an earlier server instance (i.e.,
+      before a server reboot).  The error NFS4ERR_STALE_STATEID should
+      be returned.
+
+   o  The stateid was generated by the current server instance but the
+      stateid no longer designates the current locking state for the
+      lockowner-file pair in question (i.e., one or more locking
+      operations have occurred).  The error NFS4ERR_OLD_STATEID should
+      be returned.
+
+      This error condition will only occur when the client issues a
+      locking request which changes a stateid while an I/O request that
+      uses that stateid is outstanding.
+
+   o  The stateid was generated by the current server instance but the
+      stateid does not designate a locking state for any active
+      lockowner-file pair.  The error NFS4ERR_BAD_STATEID should be
+      returned.
+
+      This error condition will occur when there has been a logic error
+      on the part of the client or server.  This should not happen.
+
+   One mechanism that may be used to satisfy these requirements is for
+   the server to:
+
+   o  divide the "other" field of each stateid into two fields:
+
+      -  A server verifier which uniquely designates a particular server
+         instantiation.
+
+      -  An index into a table of locking-state structures.
+
+   o  utilize the "seqid" field of each stateid, such that seqid is
+      monotonically incremented for each stateid that is associated with
+      the same index into the locking-state table.
+
+   By matching the incoming stateid and its field values with the state
+   held at the server, the server is able to easily determine if a
+   stateid is valid for its current instantiation and state.  If the
+   stateid is not valid, the appropriate error can be supplied to the
+   client.
+
+8.1.4.  Use of the stateid and Locking
+
+   All READ, WRITE, and SETATTR operations contain a stateid.  For the
+   purposes of this section, SETATTR operations which change the size
+   attribute of a file are treated as if they are writing the area
+   between the old and new size (i.e., the range truncated or added to
+   the file by means of the SETATTR), even where SETATTR is not
+   explicitly mentioned in the text.
+
+   If the lock_owner performs a READ or WRITE in a situation in which it
+   has established a lock or share reservation on the server (any OPEN
+   constitutes a share reservation), the stateid (previously returned by
+   the server) must be used to indicate what locks, including both
+   record locks and share reservations, are held by the lockowner.  If
+   no state is established by the client, either record lock or share
+   reservation, a stateid of all bits 0 is used.  Regardless of whether
+   a stateid of all bits 0 or a stateid returned by the server is used,
+   if there is a conflicting share reservation or mandatory record lock
+   held on the file, the server MUST refuse to service the READ or WRITE
+   operation.
+
+   Share reservations are established by OPEN operations and by their
+   nature are mandatory in that when the OPEN denies READ or WRITE
+   operations, that denial results in such operations being rejected
+   with error NFS4ERR_LOCKED.  
Record locks may be implemented by the
+   server as either mandatory or advisory, or the choice of mandatory or
+   advisory behavior may be determined by the server on the basis of the
+   file being accessed (for example, some UNIX-based servers support a
+   "mandatory lock bit" on the mode attribute such that if set, record
+   locks are required on the file before I/O is possible).  When record
+   locks are advisory, they only prevent the granting of conflicting
+   lock requests and have no effect on READs or WRITEs.  Mandatory
+   record locks, however, prevent conflicting I/O operations.  When such
+   operations are attempted, they are rejected with NFS4ERR_LOCKED.
+   When the client gets NFS4ERR_LOCKED on a file it knows it has the
+   proper share reservation for, it will need to issue a LOCK request on
+   the region of the file that includes the region the I/O was to be
+   performed on, with an appropriate locktype (i.e., READ*_LT for a READ
+   operation, WRITE*_LT for a WRITE operation).
+
+   With NFS version 3, there was no notion of a stateid so there was no
+   way to tell if the application process of the client sending the READ
+   or WRITE operation had also acquired the appropriate record lock on
+   the file.  Thus there was no way to implement mandatory locking.
+   With the stateid construct, this barrier has been removed.
+
+   Note that for UNIX environments that support mandatory file locking,
+   the distinction between advisory and mandatory locking is subtle.  In
+   fact, advisory and mandatory record locks are exactly the same
+   insofar as the APIs and requirements on implementation are concerned.
+   If the mandatory lock attribute is set on the file, the server checks
+   to see if the lockowner has an appropriate shared (read) or exclusive
+   (write) record lock on the region it wishes to read or write to.  If
+   there is no appropriate lock, the server checks if there is a
+   conflicting lock (which can be done by attempting to acquire the
+   conflicting lock on behalf of the lockowner, and, if successful,
+   releasing the lock after the READ or WRITE is done), and if there is,
+   the server returns NFS4ERR_LOCKED.
+
+   For Windows environments, there are no advisory record locks, so the
+   server always checks for record locks during I/O requests.
+
+   Thus, the NFS version 4 LOCK operation does not need to distinguish
+   between advisory and mandatory record locks.  It is the NFS version 4
+   server's processing of the READ and WRITE operations that introduces
+   the distinction.
+
+   Every stateid other than the special stateid values noted in this
+   section, whether returned by an OPEN-type operation (i.e., OPEN,
+   OPEN_DOWNGRADE), or by a LOCK-type operation (i.e., LOCK or LOCKU),
+   defines an access mode for the file (i.e., READ, WRITE, or READ-
+   WRITE) as established by the original OPEN which began the stateid
+   sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs
+   within that stateid sequence.  When a READ, WRITE, or SETATTR which
+   specifies the size attribute is done, the operation is subject to
+   checking against the access mode to verify that the operation is
+   appropriate given the OPEN with which the operation is associated.
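+
+   The table-based stateid construction suggested in the section
+   "lock_owner and stateid Definition" can be sketched as follows.
+   This is one possible server-side layout, not a mandated one; the
+   field split, the table representation, and the names below are
+   assumptions of this sketch.  The error values are those assigned by
+   this protocol.
+
+      #include <stdint.h>
+      #include <string.h>
+
+      enum { NFS4_OK = 0, NFS4ERR_STALE_STATEID = 10023,
+             NFS4ERR_OLD_STATEID = 10024, NFS4ERR_BAD_STATEID = 10025 };
+
+      #define STATE_TABLE_SIZE 4096   /* illustrative size */
+
+      /* A stateid as on the wire: a 4-byte seqid plus 12 opaque bytes
+         ("other").  Here "other" holds an 8-byte boot verifier and a
+         4-byte index into a locking-state table. */
+      struct stateid4 {
+          uint32_t seqid;
+          uint8_t  other[12];
+      };
+
+      struct lock_state {
+          int      in_use;      /* entry designates active state   */
+          uint32_t cur_seqid;   /* seqid currently valid for entry */
+      };
+
+      extern uint64_t boot_verifier;  /* fixed per server instance */
+      extern struct lock_state state_table[STATE_TABLE_SIZE];
+
+      int check_stateid(const struct stateid4 *sid)
+      {
+          uint64_t verf;
+          uint32_t idx;
+
+          memcpy(&verf, sid->other, sizeof(verf));
+          memcpy(&idx, sid->other + 8, sizeof(idx));
+
+          if (verf != boot_verifier)
+              return NFS4ERR_STALE_STATEID;  /* earlier instance   */
+          if (idx >= STATE_TABLE_SIZE || !state_table[idx].in_use)
+              return NFS4ERR_BAD_STATEID;    /* no active state    */
+          if (sid->seqid != state_table[idx].cur_seqid)
+              return NFS4ERR_OLD_STATEID;    /* superseded stateid */
+          return NFS4_OK;
+      }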
+
+   In the case of WRITE-type operations (i.e., WRITEs and SETATTRs which
+   set size), the server must verify that the access mode allows writing
+   and return an NFS4ERR_OPENMODE error if it does not.  In the case of
+   READ, the server may perform the corresponding check on the access
+   mode, or it may choose to allow READ on opens for WRITE only, to
+   accommodate clients whose write implementation may unavoidably do
+   reads (e.g., due to buffer cache constraints).  However, even if
+   READs are allowed in these circumstances, the server MUST still check
+   for locks that conflict with the READ (e.g., another open specifying
+   denial of READs).  Note that a server which does enforce the access
+   mode check on READs need not explicitly check for conflicting share
+   reservations since the existence of OPEN for read access guarantees
+   that no conflicting share reservation can exist.
+
+   A stateid of all bits 1 (one) MAY allow READ operations to bypass
+   locking checks at the server.  However, WRITE operations with a
+   stateid with bits all 1 (one) MUST NOT bypass locking checks and are
+   treated exactly the same as if a stateid of all bits 0 were used.
+
+   A lock may not be granted while a READ or WRITE operation using one
+   of the special stateids is being performed and the range of the lock
+   request conflicts with the range of the READ or WRITE operation.  For
+   the purposes of this paragraph, a conflict occurs when a shared lock
+   is requested and a WRITE operation is being performed, or an
+   exclusive lock is requested and either a READ or a WRITE operation is
+   being performed.  A SETATTR that sets size is treated similarly to a
+   WRITE as discussed above.
+
+8.1.5.  Sequencing of Lock Requests
+
+   Locking is different from most NFS operations as it requires "at-
+   most-one" semantics that are not provided by ONCRPC.  ONCRPC over a
+   reliable transport is not sufficient because a sequence of locking
+   requests may span multiple TCP connections.  In the face of
+   retransmission or reordering, lock or unlock requests must have a
+   well-defined and consistent behavior.  To accomplish this, each lock
+   request contains a sequence number that is a consecutively increasing
+   integer.  Different lock_owners have different sequences.  The server
+   maintains the last sequence number (L) received and the response that
+   was returned.  The first request issued for any given lock_owner is
+   issued with a sequence number of zero.
+
+   Note that for requests that contain a sequence number, for each
+   lock_owner, there should be no more than one outstanding request.
+
+   If a request (r) with a previous sequence number (r < L) is received,
+   it is rejected with the return of error NFS4ERR_BAD_SEQID.  Given a
+   properly-functioning client, the response to (r) must have been
+   received before the last request (L) was sent.  If a duplicate of the
+   last request (r == L) is received, the stored response is returned.
+   If a request beyond the next sequence (r > L + 1) is received, it is
+   rejected with the return of error NFS4ERR_BAD_SEQID.  Sequence
+   history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM
+   sequence changes the client verifier.
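+
+   These sequencing rules lend themselves to a small dispatcher.  The
+   sketch below is illustrative only: the structure, the reply
+   representation, and the execute_and_cache_reply and make_error_reply
+   helpers are assumptions.  Because the seqid is an unsigned 32-bit
+   value in C, the "+ 1" below wraps mod 2^32, as the following
+   paragraph requires.
+
+      #include <stdint.h>
+
+      enum { NFS4ERR_BAD_SEQID = 10026 };  /* from the protocol */
+
+      struct reply;
+      extern struct reply *execute_and_cache_reply(uint32_t seqid);
+      extern struct reply *make_error_reply(int error);
+
+      /* Per-lock_owner sequencing state held by the server. */
+      struct seqid_state {
+          uint32_t last;        /* last seqid received (L) */
+          struct reply *saved;  /* response returned for L */
+      };
+
+      /* Returns the saved reply for a duplicate, a fresh reply for
+         the expected next seqid, and an error otherwise. */
+      struct reply *sequence_check(struct seqid_state *ss, uint32_t r)
+      {
+          if (r == ss->last)               /* r == L: retransmission */
+              return ss->saved;            /* replay stored response */
+          if (r == ss->last + 1) {         /* the expected request   */
+              ss->last = r;
+              ss->saved = execute_and_cache_reply(r);
+              return ss->saved;
+          }
+          /* r < L or r > L + 1: reject the misordered request. */
+          return make_error_reply(NFS4ERR_BAD_SEQID);
+      }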
+
+   Since the sequence number is represented with an unsigned 32-bit
+   integer, the arithmetic involved with the sequence number is mod
+   2^32.  For an example of modulo arithmetic involving sequence numbers
+   see [RFC793].
+
+   It is critical that the server maintain the last response sent to the
+   client to provide a more reliable cache of duplicate non-idempotent
+   requests than that of the traditional cache described in [Juszczak].
+   The traditional duplicate request cache uses a least-recently-used
+   algorithm for removing unneeded requests.  However, the last lock
+   request and response on a given lock_owner must be cached as long as
+   the lock state exists on the server.
+
+   The client MUST monotonically increment the sequence number for the
+   CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
+   operations.  This is true even in the event that the previous
+   operation that used the sequence number received an error.  The only
+   exception to this rule is if the previous operation received one of
+   the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
+   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
+   NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.
+
+8.1.6.  Recovery from Replayed Requests
+
+   As described above, the sequence number is per lock_owner.  As long
+   as the server maintains the last sequence number received and follows
+   the methods described above, there are no risks of a Byzantine router
+   re-sending old requests.  The server need only maintain the
+   (lock_owner, sequence number) state as long as there are open files
+   or closed files with locks outstanding.
+
+   LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence
+   number and therefore the risk of the replay of these operations
+   resulting in undesired effects is non-existent while the server
+   maintains the lock_owner state.
+
+8.1.7.  Releasing lock_owner State
+
+   When a particular lock_owner no longer holds open or file locking
+   state at the server, the server may choose to release the sequence
+   number state associated with the lock_owner.  The server may make
+   this choice based on lease expiration, for the reclamation of server
+   memory, or for other implementation-specific reasons.  In any event,
+   the server is able to do this safely only when the lock_owner is no
+   longer being utilized by the client.  The server may choose to hold
+   the lock_owner state in the event that retransmitted requests are
+   received.  However, the period for which this state is held is
+   implementation specific.
+
+   In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is
+   retransmitted after the server has previously released the lock_owner
+   state, the server will find that the lock_owner has no files open and
+   an error will be returned to the client.  If the lock_owner does have
+   a file open, the stateid will not match and again an error is
+   returned to the client.
+
+8.1.8.  Use of Open Confirmation
+
+   In the case that an OPEN is retransmitted and the lock_owner is being
+   used for the first time or the lock_owner state has been previously
+   released by the server, the use of the OPEN_CONFIRM operation will
+   prevent incorrect behavior.  When the server observes the use of the
+   lock_owner for the first time, it will direct the client to perform
+   the OPEN_CONFIRM for the corresponding OPEN.  This sequence
+   establishes the use of a lock_owner and associated sequence number.
+   Since the OPEN_CONFIRM sequence connects a new open_owner on the
+   server with an existing open_owner on a client, the sequence number
+   may have any value.  The OPEN_CONFIRM step assures the server that
+   the value received is the correct one.  See the section "OPEN_CONFIRM
+   - Confirm Open" for further details.
+
+   There are a number of situations in which the requirement to confirm
+   an OPEN would pose difficulties for the client and server, in that
+   they would be prevented from acting in a timely fashion on
+   information received, because that information would be provisional,
+   subject to deletion upon non-confirmation.  Fortunately, these are
+   situations in which the server can avoid the need for confirmation
+   when responding to open requests.  The two constraints are:
+
+   o  The server must not bestow a delegation for any open which would
+      require confirmation.
+
+   o  The server MUST NOT require confirmation on a reclaim-type open
+      (i.e., one specifying claim type CLAIM_PREVIOUS or
+      CLAIM_DELEGATE_PREV).
+
+   These constraints are related in that reclaim-type opens are the only
+   ones in which the server may be required to send a delegation.  For
+   CLAIM_NULL, sending the delegation is optional, while for
+   CLAIM_DELEGATE_CUR, no delegation is sent.
+
+   Delegations being sent with an open requiring confirmation are
+   troublesome because recovering from non-confirmation adds undue
+   complexity to the protocol, while requiring confirmation on reclaim-
+   type opens poses difficulties in that the inability to resolve the
+   status of the reclaim until lease expiration may make it difficult to
+   have timely determination of the set of locks being reclaimed (since
+   the grace period may expire).
+
+   Requiring open confirmation on reclaim-type opens is avoidable
+   because of the nature of the environments in which such opens are
+   done.  For CLAIM_PREVIOUS opens, this is immediately after server
+   reboot, so there should be no time for lockowners to be created,
+   found to be unused, and recycled.  For CLAIM_DELEGATE_PREV opens, we
+   are dealing with a client reboot situation.  A server which supports
+   delegation can be sure that no lockowners for that client have been
+   recycled since client initialization and thus can ensure that
+   confirmation will not be required.
+
+8.2.  Lock Ranges
+
+   The protocol allows a lock owner to request a lock with a byte range
+   and then either upgrade or unlock a sub-range of the initial lock.
+   It is expected that this will be an uncommon type of request.  In any
+   case, servers or server filesystems may not be able to support sub-
+   range lock semantics.  
In the event that a server receives a locking +   request that represents a sub-range of current locking state for the +   lock owner, the server is allowed to return the error +   NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock +   operations.  Therefore, the client should be prepared to receive this +   error and, if appropriate, report the error to the requesting +   application. + +   The client is discouraged from combining multiple independent locking +   ranges that happen to be adjacent into a single request since the +   server may not support sub-range requests and for reasons related to +   the recovery of file locking state in the event of server failure. +   As discussed in the section "Server Failure and Recovery" below, the +   server may employ certain optimizations during recovery that work +   effectively only when the client's behavior during lock recovery is +   similar to the client's locking behavior prior to server failure. + +8.3.  Upgrading and Downgrading Locks + +   If a client has a write lock on a record, it can request an atomic +   downgrade of the lock to a read lock via the LOCK request, by setting +   the type to READ_LT.  If the server supports atomic downgrade, the +   request will succeed.  If not, it will return NFS4ERR_LOCK_NOTSUPP. +   The client should be prepared to receive this error, and if +   appropriate, report the error to the requesting application. + + + + + + +Shepler, et al.             Standards Track                    [Page 76] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   If a client has a read lock on a record, it can request an atomic +   upgrade of the lock to a write lock via the LOCK request by setting +   the type to WRITE_LT or WRITEW_LT.  If the server does not support +   atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP.  If the upgrade +   can be achieved without an existing conflict, the request will +   succeed.  Otherwise, the server will return either NFS4ERR_DENIED or +   NFS4ERR_DEADLOCK.  The error NFS4ERR_DEADLOCK is returned if the +   client issued the LOCK request with the type set to WRITEW_LT and the +   server has detected a deadlock.  The client should be prepared to +   receive such errors and if appropriate, report the error to the +   requesting application. + +8.4.  Blocking Locks + +   Some clients require the support of blocking locks.  The NFS version +   4 protocol must not rely on a callback mechanism and therefore is +   unable to notify a client when a previously denied lock has been +   granted.  Clients have no choice but to continually poll for the +   lock.  This presents a fairness problem.  Two new lock types are +   added, READW and WRITEW, and are used to indicate to the server that +   the client is requesting a blocking lock.  The server should maintain +   an ordered list of pending blocking locks.  When the conflicting lock +   is released, the server may wait the lease period for the first +   waiting client to re-request the lock.  After the lease period +   expires the next waiting client request is allowed the lock.  Clients +   are required to poll at an interval sufficiently small that it is +   likely to acquire the lock in a timely manner.  The server is not +   required to maintain a list of pending blocked locks as it is used to +   increase fairness and not correct operation.  
Because of the +   unordered nature of crash recovery, storing of lock state to stable +   storage would be required to guarantee ordered granting of blocking +   locks. + +   Servers may also note the lock types and delay returning denial of +   the request to allow extra time for a conflicting lock to be +   released, allowing a successful return.  In this way, clients can +   avoid the burden of needlessly frequent polling for blocking locks. +   The server should take care in the length of delay in the event the +   client retransmits the request. + +8.5.  Lease Renewal + +   The purpose of a lease is to allow a server to remove stale locks +   that are held by a client that has crashed or is otherwise +   unreachable.  It is not a mechanism for cache consistency and lease +   renewals may not be denied if the lease interval has not expired. + + + + + +Shepler, et al.             Standards Track                    [Page 77] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   The following events cause implicit renewal of all of the leases for +   a given client (i.e., all those sharing a given clientid).  Each of +   these is a positive indication that the client is still active and +   that the associated state held at the server, for the client, is +   still valid. + +   o  An OPEN with a valid clientid. + +   o  Any operation made with a valid stateid (CLOSE, DELEGPURGE, +      DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, +      READ, RENEW, SETATTR, WRITE).  This does not include the special +      stateids of all bits 0 or all bits 1. + +      Note that if the client had restarted or rebooted, the client +      would not be making these requests without issuing the +      SETCLIENTID/SETCLIENTID_CONFIRM sequence.  The use of the +      SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that changes the +      client verifier) notifies the server to drop the locking state +      associated with the client.  SETCLIENTID/SETCLIENTID_CONFIRM never +      renews a lease. + +      If the server has rebooted, the stateids (NFS4ERR_STALE_STATEID +      error) or the clientid (NFS4ERR_STALE_CLIENTID error) will not be +      valid hence preventing spurious renewals. + +   This approach allows for low overhead lease renewal which scales +   well.  In the typical case no extra RPC calls are required for lease +   renewal and in the worst case one RPC is required every lease period +   (i.e., a RENEW operation).  The number of locks held by the client is +   not a factor since all state for the client is involved with the +   lease renewal action. + +   Since all operations that create a new lease also renew existing +   leases, the server must maintain a common lease expiration time for +   all valid leases for a given client.  This lease time can then be +   easily updated upon implicit lease renewal actions. + +8.6.  Crash Recovery + +   The important requirement in crash recovery is that both the client +   and the server know when the other has failed.  Additionally, it is +   required that a client sees a consistent view of data across server +   restarts or reboots.  All READ and WRITE operations that may have +   been queued within the client or network buffers must wait until the +   client has successfully recovered the locks protecting the READ and +   WRITE operations. + + + + + +Shepler, et al.             Standards Track                    [Page 78] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +8.6.1. 
 Client Failure and Recovery + +   In the event that a client fails, the server may recover the client's +   locks when the associated leases have expired.  Conflicting locks +   from another client may only be granted after this lease expiration. +   If the client is able to restart or reinitialize within the lease +   period the client may be forced to wait the remainder of the lease +   period before obtaining new locks. + +   To minimize client delay upon restart, lock requests are associated +   with an instance of the client by a client supplied verifier.  This +   verifier is part of the initial SETCLIENTID call made by the client. +   The server returns a clientid as a result of the SETCLIENTID +   operation.  The client then confirms the use of the clientid with +   SETCLIENTID_CONFIRM.  The clientid in combination with an opaque +   owner field is then used by the client to identify the lock owner for +   OPEN.  This chain of associations is then used to identify all locks +   for a particular client. + +   Since the verifier will be changed by the client upon each +   initialization, the server can compare a new verifier to the verifier +   associated with currently held locks and determine that they do not +   match.  This signifies the client's new instantiation and subsequent +   loss of locking state.  As a result, the server is free to release +   all locks held which are associated with the old clientid which was +   derived from the old verifier. + +   Note that the verifier must have the same uniqueness properties of +   the verifier for the COMMIT operation. + +8.6.2.  Server Failure and Recovery + +   If the server loses locking state (usually as a result of a restart +   or reboot), it must allow clients time to discover this fact and re- +   establish the lost locking state.  The client must be able to re- +   establish the locking state without having the server deny valid +   requests because the server has granted conflicting access to another +   client.  Likewise, if there is the possibility that clients have not +   yet re-established their locking state for a file, the server must +   disallow READ and WRITE operations for that file.  The duration of +   this recovery period is equal to the duration of the lease period. + +   A client can determine that server failure (and thus loss of locking +   state) has occurred, when it receives one of two errors.  The +   NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a +   reboot or restart.  The NFS4ERR_STALE_CLIENTID error indicates a + + + + + +Shepler, et al.             Standards Track                    [Page 79] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   clientid invalidated by reboot or restart.  When either of these are +   received, the client must establish a new clientid (See the section +   "Client ID") and re-establish the locking state as discussed below. + +   The period of special handling of locking and READs and WRITEs, equal +   in duration to the lease period, is referred to as the "grace +   period".  During the grace period, clients recover locks and the +   associated state by reclaim-type locking requests (i.e., LOCK +   requests with reclaim set to true and OPEN operations with a claim +   type of CLAIM_PREVIOUS).  During the grace period, the server must +   reject READ and WRITE operations and non-reclaim locking requests +   (i.e., other LOCK and OPEN operations) with an error of +   NFS4ERR_GRACE. 
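+
+   A minimal sketch of this gating, assuming a server-internal clock
+   and helper names that are not part of the protocol (a server with
+   stable-storage lock records could relax the checks, as discussed
+   below):
+
+      #include <stdbool.h>
+      #include <time.h>
+
+      enum { NFS4_OK = 0, NFS4ERR_GRACE = 10013,
+             NFS4ERR_NO_GRACE = 10033 };   /* from the protocol */
+
+      extern time_t boot_time;     /* start of this server instance */
+      extern time_t lease_period;  /* at least the prior lease time */
+
+      static bool in_grace_period(void)
+      {
+          return time(NULL) < boot_time + lease_period;
+      }
+
+      /* Gate a locking request: reclaims are honored only during the
+         grace period; other LOCK/OPEN requests (and READs/WRITEs)
+         are turned away with NFS4ERR_GRACE until it ends. */
+      int check_grace(bool reclaim)
+      {
+          if (in_grace_period())
+              return reclaim ? NFS4_OK : NFS4ERR_GRACE;
+          return reclaim ? NFS4ERR_NO_GRACE : NFS4_OK;
+      }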
+ +   If the server can reliably determine that granting a non-reclaim +   request will not conflict with reclamation of locks by other clients, +   the NFS4ERR_GRACE error does not have to be returned and the non- +   reclaim client request can be serviced.  For the server to be able to +   service READ and WRITE operations during the grace period, it must +   again be able to guarantee that no possible conflict could arise +   between an impending reclaim locking request and the READ or WRITE +   operation.  If the server is unable to offer that guarantee, the +   NFS4ERR_GRACE error must be returned to the client. + +   For a server to provide simple, valid handling during the grace +   period, the easiest method is to simply reject all non-reclaim +   locking requests and READ and WRITE operations by returning the +   NFS4ERR_GRACE error.  However, a server may keep information about +   granted locks in stable storage.  With this information, the server +   could determine if a regular lock or READ or WRITE operation can be +   safely processed. + +   For example, if a count of locks on a given file is available in +   stable storage, the server can track reclaimed locks for the file and +   when all reclaims have been processed, non-reclaim locking requests +   may be processed.  This way the server can ensure that non-reclaim +   locking requests will not conflict with potential reclaim requests. +   With respect to I/O requests, if the server is able to determine that +   there are no outstanding reclaim requests for a file by information +   from stable storage or another similar mechanism, the processing of +   I/O requests could proceed normally for the file. + +   To reiterate, for a server that allows non-reclaim lock and I/O +   requests to be processed during the grace period, it MUST determine +   that no lock subsequently reclaimed will be rejected and that no lock +   subsequently reclaimed would have prevented any I/O operation +   processed during the grace period. + + + + +Shepler, et al.             Standards Track                    [Page 80] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   Clients should be prepared for the return of NFS4ERR_GRACE errors for +   non-reclaim lock and I/O requests.  In this case the client should +   employ a retry mechanism for the request.  A delay (on the order of +   several seconds) between retries should be used to avoid overwhelming +   the server.  Further discussion of the general issue is included in +   [Floyd].  The client must account for the server that is able to +   perform I/O and non-reclaim locking requests within the grace period +   as well as those that can not do so. + +   A reclaim-type locking request outside the server's grace period can +   only succeed if the server can guarantee that no conflicting lock or +   I/O request has been granted since reboot or restart. + +   A server may, upon restart, establish a new value for the lease +   period.  Therefore, clients should, once a new clientid is +   established, refetch the lease_time attribute and use it as the basis +   for lease renewal for the lease associated with that server. +   However, the server must establish, for this restart event, a grace +   period at least as long as the lease period for the previous server +   instantiation.  This allows the client state obtained during the +   previous server instance to be reliably re-established. + +8.6.3.  
Network Partitions and Recovery
+
+   If the duration of a network partition is greater than the lease
+   period provided by the server, the server will not have received a
+   lease renewal from the client.  If this occurs, the server may free
+   all locks held for the client.  As a result, all stateids held by the
+   client will become invalid or stale.  Once the client is able to
+   reach the server after such a network partition, all I/O submitted by
+   the client with the now invalid stateids will fail with the server
+   returning the error NFS4ERR_EXPIRED.  Once this error is received,
+   the client will suitably notify the application that held the lock.
+
+   As a courtesy to the client or as an optimization, the server may
+   continue to hold locks on behalf of a client for which recent
+   communication has extended beyond the lease period.  If the server
+   receives a lock or I/O request that conflicts with one of these
+   courtesy locks, the server must free the courtesy lock and grant the
+   new request.
+
+   When a network partition is combined with a server reboot, there are
+   edge conditions that place requirements on the server in order to
+   avoid silent data corruption following the server reboot.  Two of
+   these edge conditions are known, and are discussed below.
+
+   The first edge condition has the following scenario:
+
+      1. Client A acquires a lock.
+
+      2. Client A and server experience mutual network partition, such
+         that client A is unable to renew its lease.
+
+      3. Client A's lease expires, so the server releases the lock.
+
+      4. Client B acquires a lock that would have conflicted with that
+         of Client A.
+
+      5. Client B releases the lock.
+
+      6. Server reboots.
+
+      7. Network partition between client A and server heals.
+
+      8. Client A issues a RENEW operation, and gets back an
+         NFS4ERR_STALE_CLIENTID.
+
+      9. Client A reclaims its lock within the server's grace period.
+
+   Thus, at the final step, the server has erroneously granted client
+   A's lock reclaim.  If client B modified the object the lock was
+   protecting, client A will experience object corruption.
+
+   The second known edge condition follows:
+
+      1. Client A acquires a lock.
+
+      2. Server reboots.
+
+      3. Client A and server experience mutual network partition, such
+         that client A is unable to reclaim its lock within the grace
+         period.
+
+      4. Server's reclaim grace period ends.  Client A has no locks
+         recorded on the server.
+
+      5. Client B acquires a lock that would have conflicted with that
+         of Client A.
+
+      6. Client B releases the lock.
+
+      7. Server reboots a second time.
+
+      8. Network partition between client A and server heals.
+
+      9. Client A issues a RENEW operation, and gets back an
+         NFS4ERR_STALE_CLIENTID.
+
+     10. Client A reclaims its lock within the server's grace period.
+
+   As with the first edge condition, the final step of the scenario of
+   the second edge condition has the server erroneously granting client
+   A's lock reclaim.
+
+   Solving the first and second edge conditions requires that the server
+   either assume after it reboots that an edge condition has occurred,
+   and thus return NFS4ERR_NO_GRACE for all reclaim attempts, or that
+   the server record some information in stable storage.  The amount of
+   information the server records in stable storage is in inverse
+   proportion to how harsh the server wants to be whenever the edge
+   conditions occur.  A server that is completely tolerant of all edge
+   conditions will record in stable storage every lock that is acquired,
+   removing the lock record from stable storage only when the lock is
+   unlocked by the client and the lock's lockowner advances the sequence
+   number such that the lock release is not the last stateful event for
+   the lockowner's sequence.  For the two aforementioned edge
+   conditions, the harshest a server can be, and still support a grace
+   period for reclaims, requires that the server record some minimal
+   information in stable storage.  For example, a server implementation
+   could, for each client, save in stable storage a record containing:
+
+   o  the client's id string
+
+   o  a boolean that indicates if the client's lease expired or if there
+      was administrative intervention (see the section "Server
+      Revocation of Locks") to revoke a record lock, share reservation,
+      or delegation
+
+   o  a timestamp that is updated the first time, after a server boot or
+      reboot, that the client acquires record locking, share
+      reservation, or delegation state on the server.  The timestamp
+      need not be updated on subsequent lock requests until the server
+      reboots.
+
+   The server implementation would also record in stable storage the
+   timestamps from the two most recent server reboots.
+
+   Assuming the above record keeping, for the first edge condition,
+   after the server reboots, the record that client A's lease expired
+   means that another client could have acquired a conflicting record
+   lock, share reservation, or delegation.  Hence the server must reject
+   a reclaim from client A with the error NFS4ERR_NO_GRACE.
+
+   For the second edge condition, after the server reboots for a second
+   time, the record that the client had an unexpired record lock, share
+   reservation, or delegation established before the server's previous
+   incarnation means that the server must reject a reclaim from client A
+   with the error NFS4ERR_NO_GRACE.
+
+   Regardless of the level and approach to record keeping, the server
+   MUST implement one of the following strategies (which apply to
+   reclaims of share reservations, record locks, and delegations):
+
+      1. Reject all reclaims with NFS4ERR_NO_GRACE.  This is extremely
+         harsh, but necessary if the server does not want to record lock
+         state in stable storage.
+
+      2. Record sufficient state in stable storage such that all known
+         edge conditions involving server reboot, including the two
+         noted in this section, are detected.  False positives are
+         acceptable.  Note that at this time, it is not known if there
+         are other edge conditions.
+
+         In the event that, after a server reboot, the server determines
+         that there is unrecoverable damage or corruption to the stable
+         storage, then for all clients and/or locks affected, the server
+         MUST return NFS4ERR_NO_GRACE.
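+
+   The record keeping described above can be illustrated with a short
+   sketch.  The structure layout and names below are one hypothetical
+   encoding of the per-client record, not a format defined by the
+   protocol.
+
+      #include <stdbool.h>
+      #include <stdint.h>
+
+      /* Per-client record kept in stable storage. */
+      struct client_recovery_record {
+          char     id_string[1024];  /* the client's id string       */
+          bool     state_revoked;    /* lease expired, or lock/share/
+                                        delegation administratively
+                                        revoked                      */
+          uint64_t first_state_time; /* first acquisition of locking
+                                        state after a server boot    */
+      };
+
+      /* Also recorded: timestamps of the two most recent reboots;
+         boot_time[0] is the current instance, boot_time[1] the
+         previous one. */
+      extern uint64_t boot_time[2];
+
+      /* Decide whether a reclaim from this client may be honored
+         during the grace period; false means NFS4ERR_NO_GRACE. */
+      bool may_honor_reclaim(const struct client_recovery_record *r)
+      {
+          if (r->state_revoked)
+              return false;  /* first edge condition: state was lost */
+          if (r->first_state_time < boot_time[1])
+              return false;  /* second edge condition: state predates
+                                the previous server incarnation      */
+          return true;
+      }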
+
+   A mandate for the client's handling of the NFS4ERR_NO_GRACE error is
+   outside the scope of this specification, since the strategies for
+   such handling are very dependent on the client's operating
+   environment.  However, one potential approach is described below.
+
+   When the client receives NFS4ERR_NO_GRACE, it could examine the
+   change attribute of the objects the client is trying to reclaim state
+   for, and use that to determine whether to re-establish the state via
+   normal OPEN or LOCK requests.  This is acceptable provided the
+   client's operating environment allows it.  In other words, the client
+   implementor is advised to document this behavior for users.  The
+   client could also inform the application that its record lock or
+   share reservations (whether they were delegated or not) have been
+   lost, such as via a UNIX signal, a GUI pop-up window, etc.  See the
+   section "Data Caching and Revocation" for a discussion of how the
+   client should deal with unreclaimed delegations on client state.
+
+   For further discussion of revocation of locks see the section "Server
+   Revocation of Locks".
+
+8.7.  Recovery from a Lock Request Timeout or Abort
+
+   In the event a lock request times out, a client may decide not to
+   retry the request.  The client may also abort the request when the
+   process for which it was issued is terminated (e.g., in UNIX due to a
+   signal).  It is possible though that the server received the request
+   and acted upon it.  This would change the state on the server without
+   the client being aware of the change.  It is paramount that the
+   client re-synchronize state with the server before it attempts any
+   other operation that takes a seqid and/or a stateid with the same
+   lock_owner.  This is straightforward to do without a special re-
+   synchronize operation.
+
+   Since the server maintains the last lock request and response
+   received on the lock_owner, for each lock_owner, the client should
+   cache the last lock request it sent for which it did not receive a
+   response.  From this, the next time the client does a lock operation
+   for the lock_owner, it can send the cached request, if there is one,
+   and if the request was one that established state (e.g., a LOCK or
+   OPEN operation), the server will return the cached result or, if it
+   never saw the request, perform it.  The client can follow up with a
+   request to remove the state (e.g., a LOCKU or CLOSE operation).  With
+   this approach, the sequencing and stateid information on the client
+   and server for the given lock_owner will re-synchronize and in turn
+   the lock state will re-synchronize.
+
+8.8.  Server Revocation of Locks
+
+   At any point, the server can revoke locks held by a client and the
+   client must be prepared for this event.  When the client detects that
+   its locks have been or may have been revoked, the client is
+   responsible for validating the state information between itself and
+   the server.  
Validating locking state for the client means that it +   must verify or reclaim state for each lock currently held. + +   The first instance of lock revocation is upon server reboot or re- +   initialization.  In this instance the client will receive an error +   (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will +   proceed with normal crash recovery as described in the previous +   section. + +   The second lock revocation event is the inability to renew the lease +   before expiration.  While this is considered a rare or unusual event, +   the client must be prepared to recover.  Both the server and client +   will be able to detect the failure to renew the lease and are capable +   of recovering without data corruption.  For the server, it tracks the +   last renewal event serviced for the client and knows when the lease +   will expire.  Similarly, the client must track operations which will + + + +Shepler, et al.             Standards Track                    [Page 85] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   renew the lease period.  Using the time that each such request was +   sent and the time that the corresponding reply was received, the +   client should bound the time that the corresponding renewal could +   have occurred on the server and thus determine if it is possible that +   a lease period expiration could have occurred. + +   The third lock revocation event can occur as a result of +   administrative intervention within the lease period.  While this is +   considered a rare event, it is possible that the server's +   administrator has decided to release or revoke a particular lock held +   by the client.  As a result of revocation, the client will receive an +   error of NFS4ERR_ADMIN_REVOKED.  In this instance the client may +   assume that only the lock_owner's locks have been lost.  The client +   notifies the lock holder appropriately.  The client may not assume +   the lease period has been renewed as a result of failed operation. + +   When the client determines the lease period may have expired, the +   client must mark all locks held for the associated lease as +   "unvalidated".  This means the client has been unable to re-establish +   or confirm the appropriate lock state with the server.  As described +   in the previous section on crash recovery, there are scenarios in +   which the server may grant conflicting locks after the lease period +   has expired for a client.  When it is possible that the lease period +   has expired, the client must validate each lock currently held to +   ensure that a conflicting lock has not been granted.  The client may +   accomplish this task by issuing an I/O request, either a pending I/O +   or a zero-length read, specifying the stateid associated with the +   lock in question.  If the response to the request is success, the +   client has validated all of the locks governed by that stateid and +   re-established the appropriate state between itself and the server. + +   If the I/O request is not successful, then one or more of the locks +   associated with the stateid was revoked by the server and the client +   must notify the owner. + +8.9.  Share Reservations + +   A share reservation is a mechanism to control access to a file.  It +   is a separate and independent mechanism from record locking.  
When a
+   client opens a file, it issues an OPEN operation to the server
+   specifying the type of access required (READ, WRITE, or BOTH) and the
+   type of access to deny others (deny NONE, READ, WRITE, or BOTH).  If
+   the OPEN fails, the client will fail the application's open request.
+
+   Pseudo-code definition of the semantics:
+
+   if (request.access == 0)
+      return (NFS4ERR_INVAL)
+   else
+      if ((request.access & file_state.deny) ||
+          (request.deny & file_state.access))
+             return (NFS4ERR_DENIED)
+
+   This checking of share reservations on OPEN is done with no exception
+   for an existing OPEN for the same open_owner.
+
+   The constants used for the OPEN and OPEN_DOWNGRADE operations for the
+   access and deny fields are as follows:
+
+   const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
+   const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
+   const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;
+
+   const OPEN4_SHARE_DENY_NONE     = 0x00000000;
+   const OPEN4_SHARE_DENY_READ     = 0x00000001;
+   const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
+   const OPEN4_SHARE_DENY_BOTH     = 0x00000003;
+
+8.10.  OPEN/CLOSE Operations
+
+   To provide correct share semantics, a client MUST use the OPEN
+   operation to obtain the initial filehandle and indicate the desired
+   access and what, if any, access to deny.  Even if the client intends
+   to use a stateid of all 0's or all 1's, it must still obtain the
+   filehandle for the regular file with the OPEN operation so the
+   appropriate share semantics can be applied.  For clients that do not
+   have a deny mode built into their open programming interfaces, deny
+   equal to NONE should be used.
+
+   The OPEN operation with the CREATE flag also subsumes the CREATE
+   operation for regular files as used in previous versions of the NFS
+   protocol.  This allows a create with a share to be done atomically.
+
+   The CLOSE operation removes all share reservations held by the
+   lock_owner on that file.  If record locks are held, the client SHOULD
+   release all locks before issuing a CLOSE.  The server MAY free all
+   outstanding locks on CLOSE, but some servers may not support the
+   CLOSE of a file that still has record locks held.  The server MUST
+   return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after
+   the CLOSE.
+
+   The LOOKUP operation will return a filehandle without establishing
+   any lock state on the server.  Without a valid stateid, the server
+   will assume the client has the least access.  For example, a file
+   opened with deny READ/WRITE cannot be accessed using a filehandle
+   obtained through LOOKUP because it would not have a valid stateid
+   (i.e., using a stateid of all bits 0 or all bits 1).
+
+8.10.1.  Close and Retention of State Information
+
+   Since a CLOSE operation requests deallocation of a stateid, dealing
+   with retransmission of the CLOSE may pose special difficulties, since
+   the state information, which normally would be used to determine the
+   state of the open file being designated, might be deallocated,
+   resulting in an NFS4ERR_BAD_STATEID error.
+
+   Servers may deal with this problem in a number of ways.  
To provide
+   the greatest degree of assurance that the protocol is being used
+   properly, a server should, rather than deallocate the stateid, mark
+   it as close-pending and retain the stateid with this status until
+   later deallocation.  In this way, a retransmitted CLOSE can be
+   recognized since the stateid points to state information with this
+   distinctive status, so that it can be handled without error.
+
+   When adopting this strategy, a server should retain the state
+   information until the earliest of:
+
+   o  Another validly sequenced request for the same lockowner, that is
+      not a retransmission.
+
+   o  The time that a lockowner is freed by the server due to a period
+      with no activity.
+
+   o  All locks for the client are freed as a result of a SETCLIENTID.
+
+   Servers may avoid this complexity, at the cost of less complete
+   protocol error checking, by simply responding NFS4_OK in the event of
+   a CLOSE for a deallocated stateid, on the assumption that this case
+   must be caused by a retransmitted close.  When adopting this
+   approach, it is desirable to at least log an error when returning a
+   no-error indication in this situation.  If the server maintains a
+   reply-cache mechanism, it can verify that the CLOSE is indeed a
+   retransmission and avoid error logging in most cases.
+
+8.11.  Open Upgrade and Downgrade
+
+   When an OPEN is done for a file and the lockowner for which the open
+   is being done already has the file open, the result is to upgrade the
+   open file status maintained on the server to include the access and
+   deny bits specified by the new OPEN as well as those for the existing
+   OPEN.  The result is that there is one open file, as far as the
+   protocol is concerned, and it includes the union of the access and
+   deny bits for all of the OPEN requests completed.  Only a single
+   CLOSE will be done to reset the effects of both OPENs.  Note that the
+   client, when issuing the OPEN, may not know that the same file is in
+   fact being opened.  The above only applies if both OPENs result in
+   the OPENed object being designated by the same filehandle.
+
+   When the server chooses to export multiple filehandles corresponding
+   to the same file object and returns different filehandles on two
+   different OPENs of the same file object, the server MUST NOT "OR"
+   together the access and deny bits and coalesce the two open files.
+   Instead, the server must maintain separate OPENs with separate
+   stateids and will require separate CLOSEs to free them.
+
+   When multiple open files on the client are merged into a single open
+   file object on the server, the close of one of the open files (on the
+   client) may necessitate a change of the access and deny status of the
+   open file on the server.  This is because the union of the access and
+   deny bits for the remaining opens may be smaller (i.e., a proper
+   subset) than previously.  The OPEN_DOWNGRADE operation is used to
+   make the necessary change and the client should use it to update the
+   server so that share reservation requests by other clients are
+   handled properly.
+
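+   As a brief sketch of this recomputation, under the assumption that
+   the client tracks each of its own still-open instances of the file
+   (the representation below is hypothetical):
+
+      #include <stddef.h>
+      #include <stdint.h>
+
+      /* Access/deny pair using the OPEN4_SHARE_* constants defined
+         earlier. */
+      struct open_mode {
+          uint32_t access;
+          uint32_t deny;
+      };
+
+      /* After one client-side open is closed, recompute the union of
+         the remaining opens.  If the result is smaller than what the
+         server currently holds, the client would issue an
+         OPEN_DOWNGRADE carrying the new access and deny values. */
+      struct open_mode union_of_opens(const struct open_mode *opens,
+                                      size_t n)
+      {
+          struct open_mode u = { 0, 0 };
+          size_t i;
+
+          for (i = 0; i < n; i++) {
+              u.access |= opens[i].access;
+              u.deny   |= opens[i].deny;
+          }
+          return u;
+      }
+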
   Short leases are good for fast server recovery at a cost of
   increased RENEW or READ (with zero length) requests.  Longer leases
   are certainly kinder and gentler to servers trying to handle very
   large numbers of clients.  The number of RENEW requests drops in
   proportion to the lease time.  The disadvantages of long leases are
   slower recovery after server failure (the server must wait for the
   leases to expire and the grace period to elapse before granting new
   lock requests) and increased file contention (if a client fails to
   transmit an unlock request, the server must wait for lease
   expiration before granting new locks).

   Long leases are usable if the server is able to store lease state in
   non-volatile memory.  Upon recovery, the server can reconstruct the
   lease state from its non-volatile memory and continue operation with
   its clients, so the slow-recovery disadvantage of long leases does
   not apply.

8.13.  Clocks, Propagation Delay, and Calculating Lease Expiration

   To avoid the need for synchronized clocks, lease times are granted
   by the server as a time delta.  However, there is a requirement that
   the client and server clocks do not drift excessively over the
   duration of the lock.  There is also the issue of propagation delay
   across the



Shepler, et al.             Standards Track                    [Page 89]

RFC 3530                 NFS version 4 Protocol               April 2003


   network, which could easily be several hundred milliseconds, as well
   as the possibility that requests will be lost and need to be
   retransmitted.

   To take propagation delay into account, the client should subtract
   it from lease times (e.g., if the client estimates the one-way
   propagation delay as 200 msec, then it can assume that the lease is
   already 200 msec old when it gets it).  In addition, it will take
   another 200 msec to get a response back to the server.  So the
   client must send a lock renewal or write data back to the server
   400 msec before the lease would expire.

   The server's lease period configuration should take into account the
   network distance of the clients that will be accessing the server's
   resources.  It is expected that the lease period will take into
   account the network propagation delays and other network delay
   factors for the client population.  Since the protocol does not
   allow for an automatic method to determine an appropriate lease
   period, the server's administrator may have to tune the lease
   period.

8.14.  Migration, Replication and State

   When responsibility for handling a given file system is transferred
   to a new server (migration) or the client chooses to use an
   alternate server (e.g., in response to server unresponsiveness) in
   the context of file system replication, the appropriate handling of
   state shared between the client and server (i.e., locks, leases,
   stateids, and clientids) is as described below.  The handling
   differs between migration and replication.  For related discussion
   of file server state and recovery of such state, see the sections
   under "File Locking and Share Reservations".

   If a server replica or a server immigrating a filesystem agrees to,
   or is expected to, accept opaque values from the client that
   originated from another server, then it is a wise implementation
   practice for the servers to encode the "opaque" values in network
   byte order.
   This way, servers acting as replicas or immigrating filesystems will
   be able to parse values like stateids, directory cookies,
   filehandles, etc. even if their native byte order is different from
   that of other servers cooperating in the replication and migration
   of the filesystem.

8.14.1.  Migration and State

   In the case of migration, the servers involved in the migration of a
   filesystem SHOULD transfer all server state from the original to the
   new server.  This must be done in a way that is transparent to the
   client.  This state transfer will ease the client's transition when
   a



Shepler, et al.             Standards Track                    [Page 90]

RFC 3530                 NFS version 4 Protocol               April 2003


   filesystem migration occurs.  If the servers are successful in
   transferring all state, the client will continue to use stateids
   assigned by the original server.  Therefore, the new server must
   recognize these stateids as valid.  This holds true for the clientid
   as well.  Since responsibility for an entire filesystem is
   transferred with a migration event, there is no possibility that
   conflicts will arise on the new server as a result of the transfer
   of locks.

   As part of the transfer of information between servers, leases would
   be transferred as well.  The leases being transferred to the new
   server will typically have a different expiration time from those
   the same client previously held on the old server.  To maintain the
   property that all leases on a given server for a given client expire
   at the same time, the server should advance the expiration time to
   the later of the leases being transferred or the leases already
   present.  This allows the client to maintain lease renewal of both
   classes without special effort.

   The servers may choose not to transfer the state information upon
   migration.  However, this choice is discouraged.  In this case, when
   the client presents state information from the original server, the
   client must be prepared to receive either NFS4ERR_STALE_CLIENTID or
   NFS4ERR_STALE_STATEID from the new server.  The client should then
   recover its state information as it normally would in response to a
   server failure.  The new server must take care to allow for the
   recovery of state information as it would in the event of server
   restart.

8.14.2.  Replication and State

   Since client switch-over in the case of replication is not under
   server control, the handling of state is different.  In this case,
   leases, stateids, and clientids do not have validity across a
   transition from one server to another.  The client must re-establish
   its locks on the new server.  This can be compared to the re-
   establishment of locks by means of reclaim-type requests after a
   server reboot.  The difference is that the server has no provision
   to distinguish requests reclaiming locks from those obtaining new
   locks or to defer the latter.  Thus, a client re-establishing a lock
   on the new server (by means of a LOCK or OPEN request) may have the
   request denied due to a conflicting lock.  Since replication is
   intended for read-only use of filesystems, such denial of locks
   should not pose large difficulties in practice.  When an attempt to
   re-establish a lock on a new server is denied, the client should
   treat the situation as if its original lock had been revoked.
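
   The following fragment sketches one way a client might implement
   this re-establishment after switching to a replica.  It is an
   illustrative sketch only, in C-like pseudo-code; the types and
   helpers (held_lock, reacquire_lock, mark_lock_revoked) are
   hypothetical and not defined by this protocol.

      /*
       * Walk the locks held on the old server and re-establish each
       * one on the replica with an ordinary LOCK or OPEN request.
       * The replica cannot distinguish these from new lock requests,
       * so a conflict yields NFS4ERR_DENIED.
       */
      void reestablish_locks(struct client_state *client)
      {
          struct held_lock *lp;

          for (lp = client->held_locks; lp != NULL; lp = lp->next) {
              nfsstat4 status = reacquire_lock(client, lp);

              if (status == NFS4ERR_DENIED) {
                  /* Treat the original lock as revoked and notify
                   * the application accordingly. */
                  mark_lock_revoked(client, lp);
              }
          }
      }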




Shepler, et al.             Standards Track                    [Page 91]

RFC 3530                 NFS version 4 Protocol               April 2003


8.14.3.  Notification of Migrated Lease

   In the case of lease renewal, the client may not be submitting
   requests for a filesystem that has been migrated to another server.
   This can occur because of the implicit lease renewal mechanism.  The
   client renews leases for all filesystems when submitting a request
   to any one filesystem at the server.

   In order for the client to schedule renewal of leases that may have
   been relocated to the new server, the client must find out about
   lease relocation before those leases expire.  To accomplish this,
   all operations which implicitly renew leases for a client (i.e.,
   OPEN, CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU) will return the
   error NFS4ERR_LEASE_MOVED if responsibility for any of the leases to
   be renewed has been transferred to a new server.  This condition
   will continue until the client receives an NFS4ERR_MOVED error and
   the server receives the subsequent GETATTR(fs_locations) for each
   filesystem for which a lease has been moved to a new server.

   When a client receives an NFS4ERR_LEASE_MOVED error, it should
   perform an operation on each filesystem associated with the server
   in question.  When the client receives an NFS4ERR_MOVED error, the
   client can follow the normal process to obtain the new server
   information (through the fs_locations attribute) and perform renewal
   of those leases on the new server.  If the server has not had state
   transferred to it transparently, the client will receive either
   NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server,
   as described above, and the client can then recover state
   information as it does in the event of server failure.

8.14.4.  Migration and the Lease_time Attribute

   In order that the client may appropriately manage its leases in the
   case of migration, the destination server must establish proper
   values for the lease_time attribute.

   When state is transferred transparently, that state should include
   the correct value of the lease_time attribute.  The lease_time
   attribute on the destination server must never be less than that on
   the source since this would result in premature expiration of leases
   granted by the source server.  Upon migration in which state is
   transferred transparently, the client is under no obligation to re-
   fetch the lease_time attribute and may continue to use the value
   previously fetched (on the source server).

   If state has not been transferred transparently (i.e., the client
   sees a real or simulated server reboot), the client should fetch the
   value of lease_time on the new (i.e., destination) server, and use
   it



Shepler, et al.             Standards Track                    [Page 92]

RFC 3530                 NFS version 4 Protocol               April 2003


   for subsequent locking requests.  However, the server must respect a
   grace period at least as long as the lease_time on the source
   server, in order to ensure that clients have ample time to reclaim
   their locks before potentially conflicting non-reclaimed locks are
   granted.  The means by which the new server obtains the value of
   lease_time on the old server is left to the server implementations.
   It is not specified by the NFS version 4 protocol.

9.  Client-Side Caching

   Client-side caching of data, of file attributes, and of file names
   is essential to providing good performance with the NFS protocol.
   Providing distributed cache coherence is a difficult problem, and
   previous versions of the NFS protocol have not attempted it.
   Instead, several NFS client implementation techniques have been used
   to reduce the problems that a lack of coherence poses for users.
   These techniques have not been clearly defined by earlier protocol
   specifications, and it is often unclear what is valid or invalid
   client behavior.

   The NFS version 4 protocol uses many techniques similar to those
   that have been used in previous protocol versions.  The NFS version
   4 protocol does not provide distributed cache coherence.  However,
   it defines a more limited set of caching guarantees to allow locks
   and share reservations to be used without destructive interference
   from client side caching.

   In addition, the NFS version 4 protocol introduces a delegation
   mechanism which allows many decisions normally made by the server to
   be made locally by clients.  This mechanism provides efficient
   support of the common cases where sharing is infrequent or where
   sharing is read-only.

9.1.  Performance Challenges for Client-Side Caching

   Caching techniques used in previous versions of the NFS protocol
   have been successful in providing good performance.  However,
   several scalability challenges can arise when those techniques are
   used with very large numbers of clients.  This is particularly true
   when clients are geographically distributed, which classically
   increases the latency for cache revalidation requests.

   The previous versions of the NFS protocol repeat their file data
   cache validation requests at the time the file is opened.  This
   behavior can have serious performance drawbacks.  A common case is
   one in which a file is only accessed by a single client.  Therefore,
   sharing is infrequent.




Shepler, et al.             Standards Track                    [Page 93]

RFC 3530                 NFS version 4 Protocol               April 2003


   In this case, repeated reference to the server to find that no
   conflicts exist is expensive.  A better option with regard to
   performance is to allow a client that repeatedly opens a file to do
   so without reference to the server.  This is done until potentially
   conflicting operations from another client actually occur.

   A similar situation arises in connection with file locking.  Sending
   file lock and unlock requests to the server as well as the read and
   write requests necessary to make data caching consistent with the
   locking semantics (see the section "Data Caching and File Locking")
   can severely limit performance.  When locking is used to provide
   protection against infrequent conflicts, a large penalty is
   incurred.  This penalty may discourage the use of file locking by
   applications.

   The NFS version 4 protocol provides more aggressive caching
   strategies with the following design goals:

   o  Compatibility with a large range of server semantics.

   o  Provide the same caching benefits as previous versions of the NFS
      protocol when unable to provide the more aggressive model.

   o  Requirements for aggressive caching are organized so that a large
      portion of the benefit can be obtained even when not all of the
      requirements can be met.

   The appropriate requirements for the server are discussed in later
   sections in which specific forms of caching are covered (see the
   section "Open Delegation").

9.2.  Delegation and Callbacks

   Recallable delegation of server responsibilities for a file to a
   client improves performance by avoiding repeated requests to the
   server in the absence of inter-client conflict.  With the use of a
   "callback" RPC from server to client, a server recalls delegated
   responsibilities when another client engages in sharing of a
   delegated file.

   A delegation is passed from the server to the client, specifying the
   object of the delegation and the type of delegation.  There are
   different types of delegations, but each type contains a stateid to
   be used to represent the delegation when performing operations that
   depend on the delegation.  This stateid is similar to those
   associated with locks and share reservations but differs in that the
   stateid for a delegation is associated with a clientid and may be




Shepler, et al.             Standards Track                    [Page 94]

RFC 3530                 NFS version 4 Protocol               April 2003


   used on behalf of all the open_owners for the given client.  A
   delegation is made to the client as a whole and not to any specific
   process or thread of control within it.

   Because callback RPCs may not work in all environments (due to
   firewalls, for example), correct protocol operation does not depend
   on them.  Preliminary testing of callback functionality by means of
   a CB_NULL procedure determines whether callbacks can be supported.
   The CB_NULL procedure checks the continuity of the callback path.
   A server makes a preliminary assessment of callback availability to
   a given client and avoids delegating responsibilities until it has
   determined that callbacks are supported.  Because the granting of a
   delegation is always conditional upon the absence of conflicting
   access, clients must not assume that a delegation will be granted,
   and they must always be prepared for OPENs to be processed without
   any delegations being granted.

   Once granted, a delegation behaves in most ways like a lock.  There
   is an associated lease that is subject to renewal together with all
   of the other leases held by that client.

   Unlike locks, an operation by a second client to a delegated file
   will cause the server to recall a delegation through a callback.

   On recall, the client holding the delegation must flush modified
   state (such as modified data) to the server and return the
   delegation.  The conflicting request will not receive a response
   until the recall is complete.  The recall is considered complete
   when the client returns the delegation or the server times out on
   the recall and revokes the delegation as a result of the timeout.
   Following the resolution of the recall, the server has the
   information necessary to grant or deny the second client's request.

   At the time the client receives a delegation recall, it may have
   substantial state that needs to be flushed to the server.
   Therefore, the server should allow sufficient time for the
   delegation to be returned since it may involve numerous RPCs to the
   server.  If the server is able to determine that the client is
   diligently flushing state to the server as a result of the recall,
   the server may extend the usual time allowed for a recall.  However,
   the time allowed for recall completion should not be unbounded.

   An example of this is when responsibility to mediate opens on a
   given file is delegated to a client (see the section "Open
   Delegation").  The server will not know what opens are in effect on
   the client.  Without this knowledge, the server will be unable to
   determine if the access and deny state for the file allows any
   particular open until the delegation for the file has been returned.



Shepler, et al.             Standards Track                    [Page 95]

RFC 3530                 NFS version 4 Protocol               April 2003


   A client failure or a network partition can result in failure to
   respond to a recall callback.  In this case, the server will revoke
   the delegation, which in turn will render useless any modified state
   still on the client.

9.2.1.  Delegation Recovery

   There are three situations that delegation recovery must deal with:

   o   Client reboot or restart

   o   Server reboot or restart

   o   Network partition (full or callback-only)

   In the event the client reboots or restarts, the failure to renew
   leases will result in the revocation of record locks and share
   reservations.  Delegations, however, may be treated a bit
   differently.

   There will be situations in which delegations will need to be
   reestablished after a client reboots or restarts.  The reason for
   this is that the client may have file data stored locally and this
   data was associated with the previously held delegations.  The
   client will need to reestablish the appropriate file state on the
   server.

   To allow for this type of client recovery, the server MAY extend the
   period for delegation recovery beyond the typical lease expiration
   period.  This implies that requests from other clients that conflict
   with these delegations will need to wait.  Because the normal recall
   process may require significant time for the client to flush changed
   state to the server, other clients need to be prepared for delays
   that occur because of a conflicting delegation.  This longer
   interval would increase the window for clients to reboot and consult
   stable storage so that the delegations can be reclaimed.  For open
   delegations, such delegations are reclaimed using OPEN with a claim
   type of CLAIM_DELEGATE_PREV.  (See the sections on "Data Caching and
   Revocation" and "Operation 18: OPEN" for discussion of open
   delegation and the details of OPEN, respectively.)

   A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it
   does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and
   instead MUST, for a period of time no less than the value of the
   lease_time attribute, maintain the client's delegations to allow
   time for the client to issue CLAIM_DELEGATE_PREV requests.  A server
   that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE
   operation.



Shepler, et al.             Standards Track                    [Page 96]

RFC 3530                 NFS version 4 Protocol               April 2003


   When the server reboots or restarts, delegations are reclaimed
   (using the OPEN operation with CLAIM_PREVIOUS) in a similar fashion
   to record locks and share reservations.  However, there is a slight
   semantic difference.  In the normal case, if the server decides that
   a delegation should not be granted, it performs the requested action
   (e.g., OPEN) without granting any delegation.  For reclaim, the
   server grants the delegation, but a special designation is applied
   so that the client treats the delegation as having been granted but
   recalled by the server.  Because of this, the client has the duty to
   write all modified state to the server and then return the
   delegation.  This process of handling delegation reclaim reconciles
   three principles of the NFS version 4 protocol:

   o  Upon reclaim, a client reporting resources assigned to it by an
      earlier server instance must be granted those resources.

   o  The server has unquestionable authority to determine whether
      delegations are to be granted and, once granted, whether they are
      to be continued.

   o  The use of callbacks is not to be depended upon until the client
      has proven its ability to receive them.

   When a network partition occurs, delegations are subject to freeing
   by the server when the lease renewal period expires.  This is
   similar to the behavior for locks and share reservations.  For
   delegations, however, the server may extend the period in which
   conflicting requests are held off.  Eventually, the occurrence of a
   conflicting request from another client will cause revocation of the
   delegation.  A loss of the callback path (e.g., by later network
   configuration change) will have the same effect.  A recall request
   will fail and revocation of the delegation will result.

   A client normally finds out about revocation of a delegation when it
   uses a stateid associated with a delegation and receives the error
   NFS4ERR_EXPIRED.  It also may find out about delegation revocation
   after a client reboot when it attempts to reclaim a delegation and
   receives that same error.  Note that in the case of a revoked write
   open delegation, there are issues because data may have been
   modified by the client whose delegation is revoked and separately by
   other clients.  See the section "Revocation Recovery for Write Open
   Delegation" for a discussion of such issues.  Note also that when
   delegations are revoked, information about the revoked delegation
   will be written by the server to stable storage (as described in the
   section "Crash Recovery").  This is done to deal with the case in
   which a server reboots after revoking a delegation but before the
   client holding the revoked delegation is notified about the
   revocation.



Shepler, et al.             Standards Track                    [Page 97]

RFC 3530                 NFS version 4 Protocol               April 2003


9.3.  Data Caching

   When applications share access to a set of files, they need to be
   implemented so as to take account of the possibility of conflicting
   access by another application.  This is true whether the
   applications in question execute on different clients or reside on
   the same client.

   Share reservations and record locks are the facilities the NFS
   version 4 protocol provides to allow applications to coordinate
   access by providing mutual exclusion facilities.  The NFS version 4
   protocol's data caching must be implemented such that it does not
   invalidate the assumptions that those using these facilities depend
   upon.

9.3.1.  Data Caching and OPENs

   In order to avoid invalidating the sharing assumptions that
   applications rely on, NFS version 4 clients should not provide
   cached data to applications or modify it on behalf of an application
   when it would not be valid to obtain or modify that same data via a
   READ or WRITE operation.

   Furthermore, in the absence of open delegation (see the section
   "Open Delegation"), two additional rules apply.  Note that these
   rules are obeyed in practice by many NFS version 2 and version 3
   clients.

   o  First, cached data present on a client must be revalidated after
      doing an OPEN.  Revalidating means that the client fetches the
      change attribute from the server, compares it with the cached
      change attribute, and if different, declares the cached data (as
      well as the cached attributes) as invalid.  This is to ensure
      that the data for the OPENed file is still correctly reflected in
      the client's cache.  This validation must be done at least when
      the client's OPEN operation includes DENY=WRITE or BOTH, thus
      terminating a period in which other clients may have had the
      opportunity to open the file with WRITE access.  Clients may
      choose to do the revalidation more often (i.e., at OPENs
      specifying DENY=NONE) to parallel the NFS version 3 protocol's
      practice for the benefit of users assuming this degree of cache
      revalidation.

      Since the change attribute is updated for data and metadata
      modifications, some client implementors may be tempted to use the
      time_modify attribute rather than change to validate cached data,
      so that metadata changes do not spuriously invalidate clean data.
      The implementor is cautioned against this approach.  The change
      attribute is guaranteed to change for each update to the file,



Shepler, et al.             Standards Track                    [Page 98]

RFC 3530                 NFS version 4 Protocol               April 2003


      whereas time_modify is guaranteed to change only at the
      granularity of the time_delta attribute.  Use of time_modify
      rather than change by the client's data cache validation logic
      runs the risk of the client incorrectly marking stale data as
      valid.

   o  Second, modified data must be flushed to the server before
      closing a file OPENed for write.  This is complementary to the
      first rule.  If the data is not flushed at CLOSE, the
      revalidation done after the client OPENs a file is unable to
      achieve its purpose.  The other aspect to flushing the data
      before close is that the data must be committed to stable
      storage, at the server, before the CLOSE operation is requested
      by the client.  In the case of a server reboot or restart and a
      CLOSEd file, it may not be possible to retransmit the data to be
      written to the file.  Hence, this requirement.

9.3.2.  Data Caching and File Locking

   For those applications that choose to use file locking instead of
   share reservations to exclude inconsistent file access, there is an
   analogous set of constraints that apply to client side data caching.
   These rules are effective only if the file locking is used in a way
   that matches the actual READ and WRITE operations executed.  This is
   as opposed to file locking that is based on pure convention.  For
   example, it is possible to manipulate a two-megabyte file by
   dividing the file into two one-megabyte regions and protecting
   access to the two regions by file locks on bytes zero and one.  A
   lock for write on byte zero of the file would represent the right to
   do READ and WRITE operations on the first region.  A lock for write
   on byte one of the file would represent the right to do READ and
   WRITE operations on the second region.  As long as all applications
   manipulating the file obey this convention, they will work on a
   local filesystem.  However, they may not work with the NFS version 4
   protocol unless clients refrain from data caching.

   The rules for data caching in the file locking environment are:

   o  First, when a client obtains a file lock for a particular region,
      the data cache corresponding to that region (if any cached data
      exists) must be revalidated.  If the change attribute indicates
      that the file may have been updated since the cached data was
      obtained, the client must flush or invalidate the cached data for
      the newly locked region.  A client might choose to invalidate all
      of the non-modified cached data that it has for the file, but the
      only requirement for correct operation is to invalidate all of
      the data in the newly locked region.




Shepler, et al.             Standards Track                    [Page 99]

RFC 3530                 NFS version 4 Protocol               April 2003


   o  Second, before releasing a write lock for a region, all modified
      data for that region must be flushed to the server.  The modified
      data must also be written to stable storage.

   Note that flushing data to the server and the invalidation of cached
   data must reflect the actual byte ranges locked or unlocked.
   Rounding these up or down to reflect client cache block boundaries
   will cause problems if not carefully done.  For example, writing a
   modified block when only half of that block is within an area being
   unlocked may cause invalid modification to the region outside the
   unlocked area.  This, in turn, may be part of a region locked by
   another client.  Clients can avoid this situation by synchronously
   performing portions of write operations that overlap that portion
   (initial or final) that is not a full block.  Similarly,
   invalidating a locked area which is not an integral number of full
   buffer blocks would require the client to read one or two partial
   blocks from the server if the revalidation procedure shows that the
   data which the client possesses may not be valid.

   The data that is written to the server as a prerequisite to the
   unlocking of a region must be written, at the server, to stable
   storage.  The client may accomplish this either with synchronous
   writes or by following asynchronous writes with a COMMIT operation.
   This is required because retransmission of the modified data after a
   server reboot might conflict with a lock held by another client.

   A client implementation may choose to accommodate applications which
   use record locking in non-standard ways (e.g., using a record lock
   as a global semaphore) by flushing to the server more data upon a
   LOCKU than is covered by the locked range.  This may include
   modified data within files other than the one for which the unlocks
   are being done.  In such cases, the client must not interfere with
   applications whose READs and WRITEs are being done only within the
   bounds of record locks which the application holds.  For example, an
   application locks a single byte of a file and proceeds to write that
   single byte.  A client that chose to handle a LOCKU by flushing all
   modified data to the server could validly write that single byte in
   response to an unrelated unlock.  However, it would not be valid to
   write the entire block in which that single written byte was located
   since it includes an area that is not locked and might be locked by
   another client.  Client implementations can avoid this problem by
   dividing files with modified data into those for which all
   modifications are done to areas covered by an appropriate record
   lock and those for which there are modifications not covered by a
   record lock.  Any writes done for the former class of files must not
   include areas not locked and thus not modified on the client.




Shepler, et al.             Standards Track                   [Page 100]

RFC 3530                 NFS version 4 Protocol               April 2003


9.3.3.  Data Caching and Mandatory File Locking

   Client side data caching needs to respect mandatory file locking
   when it is in effect.  The presence of mandatory file locking for a
   given file is indicated when the client gets back NFS4ERR_LOCKED
   from a READ or WRITE on a file it has an appropriate share
   reservation for.  When mandatory locking is in effect for a file,
   the client must check for an appropriate file lock for data being
   read or written.  If a lock exists for the range being read or
   written, the client may satisfy the request using the client's
   validated cache.  If an appropriate file lock is not held for the
   range of the read or write, the read or write request must not be
   satisfied by the client's cache and the request must be sent to the
   server for processing.  When a read or write request partially
   overlaps a locked region, the request should be subdivided into
   multiple pieces with each region (locked or not) treated
   appropriately.

9.3.4.  Data Caching and File Identity

   When clients cache data, the file data needs to be organized
   according to the filesystem object to which the data belongs.  For
   NFS version 3 clients, the typical practice has been to assume for
   the purpose of caching that distinct filehandles represent distinct
   filesystem objects.  The client then has the choice to organize and
   maintain the data cache on this basis.

   In the NFS version 4 protocol, there is now the possibility to have
   significant deviations from a "one filehandle per object" model
   because a filehandle may be constructed on the basis of the object's
   pathname.  Therefore, clients need a reliable method to determine if
   two filehandles designate the same filesystem object.
   If clients were simply to assume that all distinct filehandles
   denote distinct objects and proceed to do data caching on this
   basis, caching inconsistencies would arise between the distinct
   client side objects which mapped to the same server side object.

   By providing a method to differentiate filehandles, the NFS version
   4 protocol alleviates a potential functional regression in
   comparison with the NFS version 3 protocol.  Without this method,
   caching inconsistencies within the same client could occur, and this
   possibility has not been present in previous versions of the NFS
   protocol.  Note that it is possible to have such inconsistencies
   with applications executing on multiple clients, but that is not the
   issue being addressed here.

   For the purposes of data caching, the following steps allow an NFS
   version 4 client to determine whether two distinct filehandles
   denote the same server side object:




Shepler, et al.             Standards Track                   [Page 101]

RFC 3530                 NFS version 4 Protocol               April 2003


   o  If GETATTR directed to two filehandles returns different values
      of the fsid attribute, then the filehandles represent distinct
      objects.

   o  If GETATTR for any file with an fsid that matches the fsid of the
      two filehandles in question returns a unique_handles attribute
      with a value of TRUE, then the two objects are distinct.

   o  If GETATTR directed to the two filehandles does not return the
      fileid attribute for both of the handles, then it cannot be
      determined whether the two objects are the same.  Therefore,
      operations which depend on that knowledge (e.g., client side data
      caching) cannot be done reliably.

   o  If GETATTR directed to the two filehandles returns different
      values for the fileid attribute, then they are distinct objects.

   o  Otherwise, they are the same object.

9.4.  Open Delegation

   When a file is being OPENed, the server may delegate further
   handling of opens and closes for that file to the opening client.
   Any such delegation is recallable, since the circumstances that
   allowed for the delegation are subject to change.  In particular, if
   the server receives a conflicting OPEN from another client, it must
   recall the delegation before deciding whether that OPEN may be
   granted.  Making a delegation is up to the server, and clients
   should not assume that any particular OPEN either will or will not
   result in an open delegation.  The following is a typical set of
   conditions that servers might use in deciding whether OPEN should be
   delegated:

   o  The client must be able to respond to the server's callback
      requests.  The server will use the CB_NULL procedure for a test
      of callback ability.

   o  The client must have responded properly to previous recalls.

   o  There must be no current open conflicting with the requested
      delegation.

   o  There should be no current delegation that conflicts with the
      delegation being requested.

   o  The probability of future conflicting open requests should be low
      based on the recent history of the file.




Shepler, et al.             Standards Track                   [Page 102]

RFC 3530                 NFS version 4 Protocol               April 2003


   o  There must be no server-specific semantics of OPEN/CLOSE that
      would make the required handling incompatible with the prescribed
      handling that the delegated client would apply (see below).

   There are two types of open delegations, read and write.  A read
   open delegation allows a client to handle, on its own, requests to
   open a file for reading that do not deny read access to others.
   Multiple read open delegations may be outstanding simultaneously and
   do not conflict.  A write open delegation allows the client to
   handle, on its own, all opens.  Only one write open delegation may
   exist for a given file at a given time, and it is inconsistent with
   any read open delegations.

   When a client has a read open delegation, it may not make any
   changes to the contents or attributes of the file, but it is assured
   that no other client may do so.  When a client has a write open
   delegation, it may modify the file data since no other client will
   be accessing the file's data.  The client holding a write delegation
   may only affect those file attributes which are intimately connected
   with the file data: size, time_modify, and change.

   When a client has an open delegation, it does not send OPENs or
   CLOSEs to the server but updates the appropriate status internally.
   For a read open delegation, opens that cannot be handled locally
   (opens for write or that deny read access) must be sent to the
   server.

   When an open delegation is made, the response to the OPEN contains
   an open delegation structure which specifies the following:

   o  the type of delegation (read or write)

   o  space limitation information to control flushing of data on close
      (write open delegation only; see the section "Open Delegation and
      Data Caching")

   o  an nfsace4 specifying read and write permissions

   o  a stateid to represent the delegation for READ and WRITE

   The delegation stateid is separate and distinct from the stateid for
   the OPEN proper.  The standard stateid, unlike the delegation
   stateid, is associated with a particular lock_owner and will
   continue to be valid after the delegation is recalled and the file
   remains open.




Shepler, et al.             Standards Track                   [Page 103]

RFC 3530                 NFS version 4 Protocol               April 2003


   When a request internal to the client is made to open a file and
   open delegation is in effect, it will be accepted or rejected solely
   on the basis of the following conditions.  Any requirement for other
   checks to be made by the delegate should result in open delegation
   being denied so that the checks can be made by the server itself.

   o  The access and deny bits for the request and the file as
      described in the section "Share Reservations".

   o  The read and write permissions as determined below.

   The nfsace4 passed with the delegation can be used to avoid frequent
   ACCESS calls.  The permission check should be as follows:

   o  If the nfsace4 indicates that the open may be done, then it
      should be granted without reference to the server.

   o  If the nfsace4 indicates that the open may not be done, then an
      ACCESS request must be sent to the server to obtain the
      definitive answer.
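
   A minimal sketch of this two-step check, in C-like pseudo-code (the
   structures and the helpers ace_allows and server_access are
   hypothetical, not part of the protocol):

      /*
       * Decide whether an open internal to the client may be granted
       * under an open delegation: check share reservations first,
       * then the nfsace4 supplied with the delegation.
       */
      nfsstat4 open_under_delegation(struct file_state *fs,
                                     struct delegation *dp,
                                     uint32_t access, uint32_t deny)
      {
          /* Share reservation check, as in the section "Share
           * Reservations". */
          if ((access & fs->deny) || (deny & fs->access))
              return (NFS4ERR_DENIED);

          /* The delegation's nfsace4 permits the open: grant it
           * without reference to the server. */
          if (ace_allows(&dp->permissions, access))
              return (NFS4_OK);

          /* Otherwise, an ACCESS request to the server obtains the
           * definitive answer. */
          return (server_access(fs, access));
      }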

   The server may return an nfsace4 that is more restrictive than the
   actual ACL of the file.  This includes an nfsace4 that specifies
   denial of all access.  Note that some common practices, such as
   mapping the traditional user "root" to the user "nobody", may make
   it incorrect to return the actual ACL of the file in the delegation
   response.

   The use of delegation together with various other forms of caching
   creates the possibility that no server authentication will ever be
   performed for a given user since all of the user's requests might be
   satisfied locally.  Where the client is depending on the server for
   authentication, the client should be sure that authentication occurs
   for each user by use of the ACCESS operation.  This should be the
   case even if an ACCESS operation would not be required otherwise.
   As mentioned before, the server may enforce frequent authentication
   by returning an nfsace4 denying all access with every open
   delegation.

9.4.1.  Open Delegation and Data Caching

   OPEN delegation allows much of the message overhead associated with
   the opening and closing of files to be eliminated.  When an open
   delegation is in effect, an open does not require that a validation
   message be sent to the server.  The continued endurance of the "read
   open delegation" provides a guarantee that no OPEN for write and
   thus no write has occurred.  Similarly, when a file opened for write
   is closed while a write open delegation is in effect, the data
   written does not have to be flushed to the server until the open
   delegation is



Shepler, et al.             Standards Track                   [Page 104]

RFC 3530                 NFS version 4 Protocol               April 2003


   recalled.  The continued endurance of the open delegation provides a
   guarantee that no open and thus no read or write has been done by
   another client.

   For the purposes of open delegation, READs and WRITEs done without
   an OPEN are treated as the functional equivalents of a corresponding
   type of OPEN.  This refers to the READs and WRITEs that use the
   special stateids consisting of all zero bits or all one bits.
   Therefore, READs or WRITEs with a special stateid done by another
   client will force the server to recall a write open delegation.  A
   WRITE with a special stateid done by another client will force a
   recall of read open delegations.

   With delegations, a client is able to avoid writing data to the
   server when the CLOSE of a file is serviced.  The file close system
   call is the usual point at which the client is notified of a lack of
   stable storage for the modified file data generated by the
   application.  At the close, file data is written to the server and,
   through normal accounting, the server is able to determine if the
   available filesystem space for the data has been exceeded (i.e., the
   server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT).  This accounting
   includes quotas.  The introduction of delegations requires that an
   alternative method be in place for the same type of communication to
   occur between client and server.

   In the delegation response, the server provides either the limit of
   the size of the file or the number of modified blocks and associated
   block size.  The server must ensure that the client will be able to
   flush data to the server of a size equal to that provided in the
   original delegation.
   The server must make this assurance for all outstanding delegations.
   Therefore, the server must be careful in its management of available
   space for new or modified data, taking into account available
   filesystem space and any applicable quotas.  The server can recall
   delegations as a result of managing the available filesystem space.
   The client should abide by the server's stated space limits for
   delegations.  If the client exceeds the stated limits for the
   delegation, the server's behavior is undefined.

   Based on server conditions, quotas, or available filesystem space,
   the server may grant write open delegations with very restrictive
   space limitations.  The limitations may be defined in a way that
   will always force modified data to be flushed to the server on
   close.

   With respect to authentication, flushing modified data to the server
   after a CLOSE has occurred may be problematic.  For example, the
   user of the application may have logged off the client, and
   unexpired authentication credentials may not be present.  In this
   case, the client may need to take special care to ensure that local
   unexpired



Shepler, et al.             Standards Track                   [Page 105]

RFC 3530                 NFS version 4 Protocol               April 2003


   credentials will in fact be available.  This may be accomplished by
   tracking the expiration time of credentials and flushing data well
   in advance of their expiration or by making private copies of
   credentials to assure their availability when needed.

9.4.2.  Open Delegation and File Locks

   When a client holds a write open delegation, lock operations may be
   performed locally.  This includes those required for mandatory file
   locking.  This can be done since the delegation implies that there
   can be no conflicting locks.  Similarly, all of the revalidations
   that would normally be associated with obtaining locks and the
   flushing of data associated with the releasing of locks need not be
   done.

   When a client holds a read open delegation, lock operations are not
   performed locally.  All lock operations, including those requesting
   non-exclusive locks, are sent to the server for resolution.

9.4.3.  Handling of CB_GETATTR

   The server needs to employ special handling for a GETATTR where the
   target is a file that has a write open delegation in effect.  The
   reason for this is that the client holding the write delegation may
   have modified the data, and the server needs to reflect this change
   to the second client that submitted the GETATTR.  Therefore, the
   client holding the write delegation needs to be interrogated.  The
   server will use the CB_GETATTR operation.  The only attributes that
   the server can reliably query via CB_GETATTR are size and change.

   Since CB_GETATTR is being used to satisfy another client's GETATTR
   request, the server only needs to know if the client holding the
   delegation has a modified version of the file.  If the client's copy
   of the delegated file is not modified (data or size), the server can
   satisfy the second client's GETATTR request from the attributes
   stored locally at the server.  If the file is modified, the server
   only needs to know about this modified state.  If the server
   determines that the file is currently modified, it will respond to
   the second client's GETATTR as if the file had been modified locally
   at the server.
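
   In outline, the server-side decision might look like the following
   C-like pseudo-code (all names are hypothetical, not part of the
   protocol):

      /*
       * Another client's GETATTR has arrived for a file with an
       * outstanding write delegation.  Ask the delegate, via
       * CB_GETATTR, whether it holds a modified copy.
       */
      if (!cb_getattr_shows_modification(dp)) {
          /* The delegate's copy is unmodified; answer from the
           * attributes stored locally at the server. */
          reply_with_local_attributes(reply, fp);
      } else {
          /* Respond as if the file had been modified locally at the
           * server; the construction of the change and time
           * attributes is described below. */
          reply_with_constructed_attributes(reply, fp, dp);
      }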

   Since the form of the change attribute is determined by the server
   and is opaque to the client, the client and server need to agree on
   a method of communicating the modified state of the file.  For the
   size attribute, the client will report its current view of the file
   size.

   For the change attribute, the handling is more involved.




Shepler, et al.             Standards Track                   [Page 106]

RFC 3530                 NFS version 4 Protocol               April 2003


   For the client, the following steps will be taken when receiving a
   write delegation:

   o  The value of the change attribute will be obtained from the
      server and cached.  Let this value be represented by c.

   o  The client will create a value greater than c that will be used
      for communicating that modified data is held at the client.  Let
      this value be represented by d.

   o  When the client is queried via CB_GETATTR for the change
      attribute, it checks to see if it holds modified data.  If the
      file is modified, the value d is returned for the change
      attribute value.  If this file is not currently modified, the
      client returns the value c for the change attribute.

   For simplicity of implementation, the client MAY for each CB_GETATTR
   return the same value d.  This is true even if, between successive
   CB_GETATTR operations, the client again modifies the file's data or
   metadata in its cache.  The client can return the same value because
   the only requirement is that the client be able to indicate to the
   server that the client holds modified data.  Therefore, the value of
   d may always be c + 1.

   While the change attribute is opaque to the client in the sense that
   it has no idea what units of time, if any, the server is counting
   change with, it is not opaque in that the client has to treat it as
   an unsigned integer, and the server has to be able to see the
   results of the client's changes to that integer.  Therefore, the
   server MUST encode the change attribute in network order when
   sending it to the client.  The client MUST decode it from network
   order to its native order when receiving it, and the client MUST
   encode it in network order when sending it to the server.  For this
   reason, change is defined as an unsigned integer rather than an
   opaque array of octets.

   For the server, the following steps will be taken when providing a
   write delegation:

   o  Upon providing a write delegation, the server will cache a copy
      of the change attribute in the data structure it uses to record
      the delegation.  Let this value be represented by sc.

   o  When a second client sends a GETATTR operation on the same file
      to the server, the server obtains the change attribute from the
      first client.  Let this value be cc.




Shepler, et al.             Standards Track                   [Page 107]

RFC 3530                 NFS version 4 Protocol               April 2003


   o  If the value cc is equal to sc, the file is not modified and the
      server returns the current values for change, time_metadata, and
      time_modify (for example) to the second client.

   o  If the value cc is NOT equal to sc, the file is currently
      modified at the first client and most likely will be modified at
      the server at a future time.
      The server then uses its current time to construct attribute
      values for time_metadata and time_modify.  A new value of sc,
      which we will call nsc, is computed by the server, such that
      nsc >= sc + 1.  The server then returns the constructed
      time_metadata, time_modify, and nsc values to the requester.  The
      server replaces sc in the delegation record with nsc.  To prevent
      time_modify, time_metadata, and change from appearing to go
      backward (which would happen if the client holding the delegation
      fails to write its modified data to the server before the
      delegation is revoked or returned), the server SHOULD update the
      file's metadata record with the constructed attribute values.
      For performance reasons, committing the constructed attribute
      values to stable storage is OPTIONAL.

      As discussed earlier in this section, the client MAY return the
      same cc value on subsequent CB_GETATTR calls, even if the file
      was modified in the client's cache yet again between successive
      CB_GETATTR calls.  Therefore, the server must assume that the
      file has been modified yet again, and MUST take care to ensure
      that the new nsc it constructs and returns is greater than the
      previous nsc it returned.  An example implementation's delegation
      record would satisfy this mandate by including a boolean field
      (let us call it "modified") that is set to false when the
      delegation is granted, and an sc value set at the time of grant
      to the change attribute value.  The modified field would be set
      to true the first time cc != sc, and would stay true until the
      delegation is returned or revoked.  The processing for
      constructing nsc, time_modify, and time_metadata would use this
      pseudo-code:

      if (!modified) {
          do CB_GETATTR for change and size;

          if (cc != sc)
              modified = TRUE;
      } else {
          do CB_GETATTR for size;
      }

      if (modified) {
          sc = sc + 1;
          time_modify = time_metadata = current_time;



Shepler, et al.             Standards Track                   [Page 108]

RFC 3530                 NFS version 4 Protocol               April 2003


          update sc, time_modify, time_metadata into file's metadata;
      }

      return to client (that sent GETATTR) the attributes
      it requested, but make sure size comes from what
      CB_GETATTR returned.  Do not update the file's metadata
      with the client's modified size.

   o  In the case that the file attribute size is different from the
      server's current value, the server treats this as a modification
      regardless of the value of the change attribute retrieved via
      CB_GETATTR and responds to the second client as in the last step.

   This methodology resolves issues of clock differences between client
   and server and other scenarios where the use of CB_GETATTR breaks
   down.

   It should be noted that the server is under no obligation to use
   CB_GETATTR, and therefore the server MAY simply recall the
   delegation to avoid its use.

9.4.4.  Recall of Open Delegation

   The following events necessitate recall of an open delegation:

   o  Potentially conflicting OPEN request (or READ/WRITE done with
      "special" stateid)

   o  SETATTR issued by another client

   o  REMOVE request for the file

   o  RENAME request for the file as either source or target of the
      RENAME

   Whether a RENAME of a directory in the path leading to the file
   results in recall of an open delegation depends on the semantics of
   the server filesystem.  If that filesystem denies such RENAMEs when
   a file is open, the recall must be performed to determine whether
   the file in question is, in fact, open.

   In addition to the situations above, the server may choose to recall
   open delegations at any time if resource constraints make it
   advisable to do so.  Clients should always be prepared for the
   possibility of recall.




Shepler, et al.             Standards Track                   [Page 109]

RFC 3530                 NFS version 4 Protocol               April 2003


   When a client receives a recall for an open delegation, it needs to
   update state on the server before returning the delegation.  These
   same updates must be done whenever a client chooses to return a
   delegation voluntarily.  The following items of state need to be
   dealt with:

   o  If the file associated with the delegation is no longer open and
      no previous CLOSE operation has been sent to the server, a CLOSE
      operation must be sent to the server.

   o  If a file has other open references at the client, then OPEN
      operations must be sent to the server.  The appropriate stateids
      will be provided by the server for subsequent use by the client
      since the delegation stateid will no longer be valid.  These OPEN
      requests are done with the claim type of CLAIM_DELEGATE_CUR.
      This will allow the presentation of the delegation stateid so
      that the client can establish the appropriate rights to perform
      the OPEN.  (See the section "Operation 18: OPEN" for details.)

   o  If there are granted file locks, the corresponding LOCK
      operations need to be performed.  This applies to the write open
      delegation case only.

   o  For a write open delegation, if at the time of recall the file is
      not open for write, all modified data for the file must be
      flushed to the server.  If the delegation had not existed, the
      client would have done this data flush before the CLOSE
      operation.

   o  For a write open delegation when a file is still open at the time
      of recall, any modified data for the file needs to be flushed to
      the server.

   o  With the write open delegation in place, it is possible that the
      file was truncated during the lifetime of the delegation.  For
      example, the truncation could have occurred as a result of an
      OPEN UNCHECKED with a size attribute value of zero.  Therefore,
      if a truncation of the file has occurred and this operation has
      not been propagated to the server, the truncation must occur
      before any modified data is written to the server.

   In the case of write open delegation, file locking imposes some
   additional requirements.  To precisely maintain the associated
   invariant, it is required to flush any modified data in any region
   for which a write lock was released while the write delegation was
   in effect.
+   In the case of write open delegation, file locking imposes some
+   additional requirements.  To precisely maintain the associated
+   invariant, it is required to flush any modified data in any region
+   for which a write lock was released while the write delegation was
+   in effect.  However, because the write open delegation implies no
+   other locking by other clients, a simpler implementation is to flush
+   all modified data for the file (as described just above) if any
+   write lock has been released while the write open delegation was in
+   effect.
+
+
+
+Shepler, et al.             Standards Track                   [Page 110]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   An implementation need not wait until delegation recall (or deciding
+   to voluntarily return a delegation) to perform any of the above
+   actions, if implementation considerations (e.g., resource
+   availability constraints) make that desirable.  Generally, however,
+   the fact that the actual open state of the file may continue to
+   change makes it not worthwhile to send information about opens and
+   closes to the server, except as part of delegation return.  Only in
+   the case of closing the open that resulted in obtaining the
+   delegation would clients be likely to do this early, since, in that
+   case, the close once done will not be undone.  Regardless of the
+   client's choices on scheduling these actions, all must be performed
+   before the delegation is returned, including (when applicable) the
+   close that corresponds to the open that resulted in the delegation.
+   These actions can be performed either in previous requests or in
+   previous operations in the same COMPOUND request.
+
+9.4.5.  Clients that Fail to Honor Delegation Recalls
+
+   A client may fail to respond to a recall for various reasons, such
+   as a failure of the callback path from the server to the client.
+   The client may be unaware of a failure in the callback path.  This
+   lack of awareness could result in the client finding out long after
+   the failure that its delegation has been revoked, and another client
+   has modified the data for which the client had a delegation.  This
+   is especially a problem for the client that held a write delegation.
+
+   The server also has a dilemma in that the client that fails to
+   respond to the recall might also be sending other NFS requests,
+   including those that renew the lease before the lease expires.
+   Without returning an error for those lease-renewing operations, the
+   server leads the client to believe that the delegation it has is in
+   force.
+
+   This difficulty is solved by the following rules:
+
+   o  When the callback path is down, the server MUST NOT revoke the
+      delegation if one of the following occurs:
+
+      -  The client has issued a RENEW operation and the server has
+         returned an NFS4ERR_CB_PATH_DOWN error.  The server MUST renew
+         the lease for any record locks and share reservations the
+         client has that the server has known about (as opposed to those
+         locks and share reservations the client has established but not
+         yet sent to the server, due to the delegation).  The server
+         SHOULD give the client a reasonable time to return its
+         delegations to the server before revoking the client's
+         delegations.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 111]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      -  The client has not issued a RENEW operation for some period of
+         time after the server attempted to recall the delegation.  This
+         period of time MUST NOT be less than the value of the
+         lease_time attribute.
+
+   o  When the client holds a delegation, it cannot rely on operations
+      that take a stateid (other than RENEW) to renew delegation leases
+      across callback path failures.  The client that wants to keep
+      delegations in force across callback path failures must use RENEW
+      to do so.
+
+9.4.6.  Delegation Revocation
+
+   At the point a delegation is revoked, if there are associated opens
+   on the client, the applications holding these opens need to be
+   notified.  This notification usually occurs by returning errors for
+   READ/WRITE operations or when a close is attempted for the open file.
+
+   If no opens exist for the file at the point the delegation is
+   revoked, then notification of the revocation is unnecessary.
+   However, if there is modified data present at the client for the
+   file, the user of the application should be notified.  Unfortunately,
+   it may not be possible to notify the user since active applications
+   may not be present at the client.  See the section "Revocation
+   Recovery for Write Open Delegation" for additional details.
+
+9.5.  Data Caching and Revocation
+
+   When locks and delegations are revoked, the assumptions upon which
+   successful caching depends are no longer guaranteed.  For any locks
+   or share reservations that have been revoked, the corresponding owner
+   needs to be notified.  This notification includes applications with a
+   file open that has a corresponding delegation which has been revoked.
+   Cached data associated with the revocation must be removed from the
+   client.  In the case of modified data existing in the client's cache,
+   that data must be removed from the client without it being written to
+   the server.  As mentioned, the assumptions made by the client are no
+   longer valid at the point when a lock or delegation has been revoked.
+   For example, another client may have been granted a conflicting lock
+   after the revocation of the lock at the first client.  Therefore, the
+   data within the lock range may have been modified by the other
+   client.  Obviously, the first client is unable to guarantee to the
+   application what has occurred to the file in the case of revocation.
+
+   Notification to a lock owner will in many cases consist of simply
+   returning an error on the next and all subsequent READs/WRITEs to the
+   open file or on the close.  Where the methods available to a client
+   make such notification impossible because errors for certain
+
+
+
+Shepler, et al.             Standards Track                   [Page 112]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   operations may not be returned, more drastic action such as signals
+   or process termination may be appropriate.  The justification for
+   this is that an invariant on which an application depends may be
+   violated.  Depending on how errors are typically treated for the
+   client operating environment, further levels of notification
+   including logging, console messages, and GUI pop-ups may be
+   appropriate.
+
+9.5.1.  Revocation Recovery for Write Open Delegation
+
+   Revocation recovery for a write open delegation poses the special
+   issue of modified data in the client cache while the file is not
+   open.  In this situation, any client that does not flush modified
+   data to the server on each close must ensure that the user receives
+   appropriate notification of the failure as a result of the
+   revocation.
+   Since such situations may require human action to correct problems,
+   notification schemes in which the appropriate user or administrator
+   is notified may be necessary.  Logging and console messages are
+   typical examples.
+
+   If there is modified data on the client, it must not be flushed
+   normally to the server.  A client may attempt to provide a copy of
+   the file data as modified during the delegation under a different
+   name in the filesystem name space to ease recovery.  Note that when
+   the client can determine that the file has not been modified by any
+   other client, or when the client has a complete cached copy of the
+   file in question, such a saved copy of the client's view of the file
+   may be of particular value for recovery.  In other cases, recovery
+   using a copy of the file based partially on the client's cached data
+   and partially on the server copy as modified by other clients will be
+   anything but straightforward, so clients may avoid saving file
+   contents in these situations or mark the results specially to warn
+   users of possible problems.
+
+   Saving of such modified data in delegation revocation situations may
+   be limited to files of a certain size or might be used only when
+   sufficient disk space is available within the target filesystem.
+   Such saving may also be restricted to situations when the client has
+   sufficient buffering resources to keep the cached copy available
+   until it is properly stored to the target filesystem.
+
+9.6.  Attribute Caching
+
+   The attributes discussed in this section do not include named
+   attributes.  Individual named attributes are analogous to files and
+   caching of the data for these needs to be handled just as data
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 113]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   caching is for ordinary files.  Similarly, LOOKUP results from an
+   OPENATTR directory are to be cached on the same basis as any other
+   pathnames and similarly for directory contents.
+
+   Clients may cache file attributes obtained from the server and use
+   them to avoid subsequent GETATTR requests.  Such caching is write
+   through in that modification to file attributes is always done by
+   means of requests to the server and should not be done locally and
+   cached.  The exceptions to this are modifications to attributes that
+   are intimately connected with data caching.  Therefore, extending a
+   file by writing data to the local data cache is reflected immediately
+   in the size as seen on the client without this change being
+   immediately reflected on the server.  Normally such changes are not
+   propagated directly to the server but when the modified data is
+   flushed to the server, analogous attribute changes are made on the
+   server.  When open delegation is in effect, the modified attributes
+   may be returned to the server in the response to a CB_RECALL call.
+
+   The result of local caching of attributes is that the attribute
+   caches maintained on individual clients will not be coherent.
+   Changes made in one order on the server may be seen in a different
+   order on one client and in a third order on a different client.
+
+   The typical filesystem application programming interfaces do not
+   provide means to atomically modify or interrogate attributes for
+   multiple files at the same time.
+   The following rules provide an environment where the potential
+   incoherences mentioned above can be reasonably managed.  These rules
+   are derived from the practice of previous NFS protocols.
+
+   o  All attributes for a given file (per-fsid attributes excepted) are
+      cached as a unit at the client so that no non-serializability can
+      arise within the context of a single file.
+
+   o  An upper time boundary is maintained on how long a client cache
+      entry can be kept without being refreshed from the server.
+
+   o  When operations are performed that change attributes at the
+      server, the updated attribute set is requested as part of the
+      containing RPC.  This includes directory operations that update
+      attributes indirectly.  This is accomplished by following the
+      modifying operation with a GETATTR operation and then using the
+      results of the GETATTR to update the client's cached attributes.
+
+   Note that if the full set of attributes to be cached is requested by
+   READDIR, the results can be cached by the client on the same basis as
+   attributes obtained via GETATTR.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 114]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   A client may validate its cached version of attributes for a file by
+   fetching both the change and time_access attributes and assuming
+   that if the change attribute has the same value as it did when the
+   attributes were cached, then no attributes other than time_access
+   have changed.  The reason that time_access is also fetched is that
+   many servers operate in environments where the operation that updates
+   change does not update time_access.  For example, POSIX file
+   semantics do not update access time when a file is modified by the
+   write system call.  Therefore, the client that wants a current
+   time_access value should fetch it with change during the attribute
+   cache validation processing and update its cached time_access.
+
+   The client may maintain a cache of modified attributes for those
+   attributes intimately connected with data of modified regular files
+   (size, time_modify, and change).  Other than those three attributes,
+   the client MUST NOT maintain a cache of modified attributes.
+   Instead, attribute changes are immediately sent to the server.
+
+   In some operating environments, the equivalent to time_access is
+   expected to be implicitly updated by each read of the content of the
+   file object.  If an NFS client is caching the content of a file
+   object, whether it is a regular file, directory, or symbolic link,
+   the client SHOULD NOT update the time_access attribute (via SETATTR
+   or a small READ or READDIR request) on the server with each read that
+   is satisfied from cache.  The reason is that this can defeat the
+   performance benefits of caching content, especially since an explicit
+   SETATTR of time_access may alter the change attribute on the server.
+   If the change attribute changes, clients that are caching the content
+   will think the content has changed, and will re-read unmodified data
+   from the server.  Nor is the client encouraged to maintain a modified
+   version of time_access in its cache, since this would mean that the
+   client will either eventually have to write the access time to the
+   server with bad performance effects, or it would never update the
+   server's time_access, thereby resulting in a situation where an
+   application that caches access time between a close and open of the
+   same file observes the access time oscillating between the past and
+   present.  The time_access attribute always means the time of last
+   access to a file by a read that was satisfied by the server.  This
+   way clients will tend to see only time_access changes that go forward
+   in time.
+
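+   The validation scheme described earlier in this section can be
+   summarized with a minimal C sketch.  The getattr_change_atime()
+   helper is a hypothetical stand-in for a GETATTR of just the change
+   and time_access attributes; it is not part of this protocol.
+
+      #include <stdint.h>
+
+      struct cached_attrs {
+          uint64_t change;       /* change value when attrs cached */
+          uint64_t time_access;  /* cached time_access */
+          /* ... other cached attributes ... */
+      };
+
+      /* Hypothetical stand-in: GETATTR of change and time_access. */
+      static int getattr_change_atime(uint64_t *chg, uint64_t *at)
+      {
+          *chg = 0;  /* a real client fills these from the server */
+          *at = 0;
+          return 0;
+      }
+
+      /* Returns 1 if the cached attributes are still valid, 0 if
+         they must be refetched, -1 on RPC failure. */
+      static int revalidate_attrs(struct cached_attrs *c)
+      {
+          uint64_t chg, at;
+
+          if (getattr_change_atime(&chg, &at) != 0)
+              return -1;
+          c->time_access = at;   /* always refresh time_access */
+          if (chg == c->change)
+              return 1;          /* nothing else has changed */
+          return 0;              /* stale: refetch the attributes */
+      }
+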
+9.7.  Data and Metadata Caching and Memory Mapped Files
+
+   Some operating environments include the capability for an application
+   to map a file's content into the application's address space.  Each
+   time the application accesses a memory location that corresponds to a
+   block that has not been loaded into the address space, a page fault
+   occurs and the file is read (or if the block does not exist in the
+
+
+
+Shepler, et al.             Standards Track                   [Page 115]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   file, the block is allocated and then instantiated in the
+   application's address space).
+
+   As long as each memory mapped access to the file requires a page
+   fault, the relevant attributes of the file that are used to detect
+   access and modification (time_access, time_metadata, time_modify, and
+   change) will be updated.  However, in many operating environments,
+   when page faults are not required these attributes will not be
+   updated on reads or updates to the file via memory access (regardless
+   of whether the file is a local file or is being accessed remotely).
+   A client or server MAY fail to update attributes of a file that is
+   being accessed via memory mapped I/O.  This has several implications:
+
+   o  If there is an application on the server that has memory mapped a
+      file that a client is also accessing, the client may not be able
+      to get a consistent value of the change attribute to determine
+      whether its cache is stale or not.  A server that knows that the
+      file is memory mapped could always pessimistically return updated
+      values for change so as to force the application to always get the
+      most up to date data and metadata for the file.  However, due to
+      the negative performance implications of this, such behavior is
+      OPTIONAL.
+
+   o  If the memory mapped file is not being modified on the server, and
+      instead is just being read by an application via the memory mapped
+      interface, the client will not see an updated time_access
+      attribute.  However, in many operating environments, neither will
+      any process running on the server.  Thus NFS clients are at no
+      disadvantage with respect to local processes.
+
+   o  If there is another client that is memory mapping the file, and if
+      that client is holding a write delegation, the same set of issues
+      as discussed in the previous two bullet items apply.  So, when a
+      server does a CB_GETATTR to a file that the client has modified in
+      its cache, the response from CB_GETATTR will not necessarily be
+      accurate.
+      As discussed earlier, the client's obligation is to report that
+      the file has been modified since the delegation was granted, not
+      whether it has been modified again between successive CB_GETATTR
+      calls, and the server MUST assume that any file the client has
+      modified in cache has been modified again between successive
+      CB_GETATTR calls.  Depending on the nature of the client's memory
+      management system, this weak obligation may not be possible.  A
+      client MAY return stale information in CB_GETATTR whenever the
+      file is memory mapped.
+
+   o  The mixture of memory mapping and file locking on the same file is
+      problematic.  Consider the following scenario, where the page size
+      on each client is 8192 bytes.
+
+
+
+Shepler, et al.             Standards Track                   [Page 116]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      -  Client A memory maps the first page (8192 bytes) of file X
+
+      -  Client B memory maps the first page (8192 bytes) of file X
+
+      -  Client A write locks the first 4096 bytes
+
+      -  Client B write locks the second 4096 bytes
+
+      -  Client A, via a STORE instruction, modifies part of its locked
+         region.
+
+      -  Simultaneously with client A, client B issues a STORE on part
+         of its locked region.
+
+   Here the challenge is for each client to resynchronize to get a
+   correct view of the first page.  In many operating environments, the
+   virtual memory management systems on each client only know that a
+   page is modified, not that a subset of the page corresponding to the
+   respective lock regions has been modified.  So it is not possible for
+   each client to do the right thing, which is to write to the server
+   only that portion of the page that is locked.  For example, if client
+   A simply writes out the page, and then client B writes out the page,
+   client A's data is lost.
+
+   Moreover, if mandatory locking is enabled on the file, then we have a
+   different problem.  When clients A and B issue the STORE
+   instructions, the resulting page faults require a record lock on the
+   entire page.  Each client then tries to extend its locked range to
+   the entire page, which results in a deadlock.
+
+   Communicating the NFS4ERR_DEADLOCK error to a STORE instruction is
+   difficult at best.
+
+   If a client is locking the entire memory mapped file, there is no
+   problem with advisory or mandatory record locking, at least until the
+   client unlocks a region in the middle of the file.
+
+   Given the above issues, the following are permitted:
+
+   -  Clients and servers MAY deny memory mapping a file for which they
+      know record locks exist.
+
+   -  Clients and servers MAY deny a record lock on a file they know is
+      memory mapped.
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 117]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   -  A client MAY deny memory mapping a file that it knows requires
+      mandatory locking for I/O.  If mandatory locking is enabled after
+      the file is opened and mapped, the client MAY deny the application
+      further access to its mapped file.
+
+9.8.  Name Caching
+
+   The results of LOOKUP and READDIR operations may be cached to avoid
+   the cost of subsequent LOOKUP operations.  Just as in the case of
+   attribute caching, inconsistencies may arise among the various client
+   caches.
+   To mitigate the effects of these inconsistencies and given the
+   context of typical filesystem APIs, an upper time boundary is
+   maintained on how long a client name cache entry can be kept without
+   verifying that the entry has not been made invalid by a directory
+   change operation performed by another client.
+
+   When a client is not making changes to a directory for which there
+   exist name cache entries, the client needs to periodically fetch
+   attributes for that directory to ensure that it is not being
+   modified.  After determining that no modification has occurred, the
+   expiration time for the associated name cache entries may be updated
+   to be the current time plus the name cache staleness bound.
+
+   When a client is making changes to a given directory, it needs to
+   determine whether there have been changes made to the directory by
+   other clients.  It does this by using the change attribute as
+   reported before and after the directory operation in the associated
+   change_info4 value returned for the operation.  The server is able to
+   communicate to the client whether the change_info4 data is provided
+   atomically with respect to the directory operation.  If the change
+   values are provided atomically, the client is then able to compare
+   the pre-operation change value with the change value in the client's
+   name cache.  If the comparison indicates that the directory was
+   updated by another client, the name cache associated with the
+   modified directory is purged from the client.  If the comparison
+   indicates no modification, the name cache can be updated on the
+   client to reflect the directory operation and the associated timeout
+   extended.  The post-operation change value needs to be saved as the
+   basis for future change_info4 comparisons.  A sketch of this decision
+   procedure appears at the end of this section.
+
+   As demonstrated by the scenario above, name caching requires that the
+   client revalidate name cache data by inspecting the change attribute
+   of a directory at the point when the name cache item was cached.
+   This requires that the server update the change attribute for
+   directories when the contents of the corresponding directory are
+   modified.  For a client to use the change_info4 information
+   appropriately and correctly, the server must report the pre- and
+   post-operation change attribute values atomically.  When the server
+
+
+
+Shepler, et al.             Standards Track                   [Page 118]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   is unable to report the before and after values atomically with
+   respect to the directory operation, the server must indicate that
+   fact in the change_info4 return value.  When the information is not
+   atomically reported, the client should not assume that other clients
+   have not changed the directory.
+
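+   The decision procedure above can be sketched in C.  The structures
+   below are hypothetical stand-ins (change_info models the protocol's
+   change_info4 result), and purge() and apply_local_change() stand in
+   for client cache internals:
+
+      #include <stdint.h>
+
+      struct change_info {      /* models the change_info4 result */
+          int      atomic;      /* before/after reported atomically? */
+          uint64_t before;      /* pre-operation change value */
+          uint64_t after;       /* post-operation change value */
+      };
+
+      struct name_cache {
+          uint64_t change;      /* change value the cache is based on */
+          /* ... cached name entries and expiration time ... */
+      };
+
+      static void purge(struct name_cache *nc)
+      {
+          (void)nc;  /* drop all entries for the directory */
+      }
+
+      static void apply_local_change(struct name_cache *nc)
+      {
+          (void)nc;  /* reflect this client's own operation */
+      }
+
+      /* Called after a directory-modifying operation returns. */
+      static void update_name_cache(struct name_cache *nc,
+                                    const struct change_info *ci)
+      {
+          if (!ci->atomic || ci->before != nc->change) {
+              /* Another client may have changed the directory. */
+              purge(nc);
+          } else {
+              /* Only this client's operation intervened; keep the
+                 cache and extend its timeout. */
+              apply_local_change(nc);
+          }
+          nc->change = ci->after;  /* basis for future comparisons */
+      }
+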
+9.9.  Directory Caching
+
+   The results of READDIR operations may be used to avoid subsequent
+   READDIR operations.  Just as in the cases of attribute and name
+   caching, inconsistencies may arise among the various client caches.
+   To mitigate the effects of these inconsistencies, and given the
+   context of typical filesystem APIs, the following rules should be
+   followed:
+
+   o  Cached READDIR information for a directory which is not obtained
+      in a single READDIR operation must always be a consistent snapshot
+      of directory contents.  This is determined by using a GETATTR
+      before the first READDIR and after the last READDIR that
+      contributes to the cache.
+
+   o  An upper time boundary is maintained to indicate the length of
+      time a directory cache entry is considered valid before the client
+      must revalidate the cached information.
+
+   The revalidation technique parallels that discussed in the case of
+   name caching.  When the client is not changing the directory in
+   question, checking the change attribute of the directory with GETATTR
+   is adequate.  The lifetime of the cache entry can be extended at
+   these checkpoints.  When a client is modifying the directory, the
+   client needs to use the change_info4 data to determine whether there
+   are other clients modifying the directory.  If it is determined that
+   no other client modifications are occurring, the client may update
+   its directory cache to reflect its own changes.
+
+   As demonstrated previously, directory caching requires that the
+   client revalidate directory cache data by inspecting the change
+   attribute of a directory at the point when the directory was cached.
+   This requires that the server update the change attribute for
+   directories when the contents of the corresponding directory are
+   modified.  For a client to use the change_info4 information
+   appropriately and correctly, the server must report the pre- and
+   post-operation change attribute values atomically.  When the server
+   is unable to report the before and after values atomically with
+   respect to the directory operation, the server must indicate that
+   fact in the change_info4 return value.  When the information is not
+   atomically reported, the client should not assume that other clients
+   have not changed the directory.
+
+
+
+Shepler, et al.             Standards Track                   [Page 119]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+10.  Minor Versioning
+
+   To address the requirement of an NFS protocol that can evolve as the
+   need arises, the NFS version 4 protocol contains the rules and
+   framework to allow for future minor changes or versioning.
+
+   The base assumption with respect to minor versioning is that any
+   future accepted minor version must follow the IETF process and be
+   documented in a standards track RFC.  Therefore, each minor version
+   number will correspond to an RFC.  Minor version zero of the NFS
+   version 4 protocol is represented by this RFC.  The COMPOUND
+   procedure will support the encoding of the minor version being
+   requested by the client.
+
+   The following items represent the basic rules for the development of
+   minor versions.  Note that a future minor version may decide to
+   modify or add to the following rules as part of the minor version
+   definition.
+
+    1.  Procedures are not added or deleted
+
+        To maintain the general RPC model, NFS version 4 minor versions
+        will not add to or delete procedures from the NFS program.
+
+    2.  Minor versions may add operations to the COMPOUND and
+        CB_COMPOUND procedures.
+
+        The addition of operations to the COMPOUND and CB_COMPOUND
+        procedures does not affect the RPC model.
+
+    2.1 Minor versions may append attributes to GETATTR4args, bitmap4,
+        and GETATTR4res.
+
+        This allows for the expansion of the attribute model to allow
+        for future growth or adaptation.
+
+    2.2 Minor version X must append any new attributes after the last
+        documented attribute.
+
+        Since attribute results are specified as an opaque array of
+        per-attribute XDR-encoded results, the complexity of adding new
+        attributes in the midst of the current definitions would be too
+        burdensome.
+
+    3.  Minor versions must not modify the structure of an existing
+        operation's arguments or results.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 120]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+        Again, the complexity of handling multiple structure definitions
+        for a single operation is too burdensome.  New operations should
+        be added instead of modifying existing structures for a minor
+        version.
+
+        This rule does not preclude the following adaptations in a minor
+        version:
+
+      o  adding bits to flag fields such as new attributes to GETATTR's
+         bitmap4 data type
+
+      o  adding bits to existing attributes like ACLs that have flag
+         words
+
+      o  extending enumerated types (including NFS4ERR_*) with new
+         values
+
+    4.  Minor versions may not modify the structure of existing
+        attributes.
+
+    5.  Minor versions may not delete operations.
+
+        This prevents the potential reuse of a particular operation
+        "slot" in a future minor version.
+
+    6.  Minor versions may not delete attributes.
+
+    7.  Minor versions may not delete flag bits or enumeration values.
+
+    8.  Minor versions may declare an operation as mandatory to NOT
+        implement.
+
+        Specifying an operation as "mandatory to not implement" is
+        equivalent to obsoleting an operation.  For the client, it means
+        that the operation should not be sent to the server.  For the
+        server, an NFS error can be returned as opposed to "dropping"
+        the request as an XDR decode error.  This approach allows for
+        the obsolescence of an operation while maintaining its structure
+        so that a future minor version can reintroduce the operation.
+
+    8.1 Minor versions may declare attributes mandatory to NOT
+        implement.
+
+    8.2 Minor versions may declare flag bits or enumeration values as
+        mandatory to NOT implement.
+
+    9.  Minor versions may downgrade features from mandatory to
+        recommended, or recommended to optional.
+
+
+
+Shepler, et al.             Standards Track                   [Page 121]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+    10. Minor versions may upgrade features from optional to recommended
+        or recommended to mandatory.
+
+    11. A client and server that support minor version X must support
+        minor versions 0 (zero) through X-1 as well.
+
+    12. No new features may be introduced as mandatory in a minor
+        version.
+
+        This rule allows for the introduction of new functionality and
+        forces the use of implementation experience before designating a
+        feature as mandatory.
+
+    13. A client MUST NOT attempt to use a stateid, filehandle, or
+        similar returned object from the COMPOUND procedure with minor
+        version X for another COMPOUND procedure with minor version Y,
+        where X != Y.
+
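+   Since minor version zero is the only minor version defined by this
+   RFC, a server can screen the minorversion field at the start of
+   COMPOUND processing.  The C sketch below is illustrative only; the
+   compound_res structure is a hypothetical abbreviation of
+   COMPOUND4res, and the behavior follows the definition of
+   NFS4ERR_MINOR_VERS_MISMATCH in the Error Definitions section:
+
+      #include <stdint.h>
+
+      #define NFS4_OK                     0
+      #define NFS4ERR_MINOR_VERS_MISMATCH 10021
+
+      struct compound_res {      /* abbreviated COMPOUND4res */
+          uint32_t status;
+          uint32_t numres;       /* count of encoded op results */
+      };
+
+      /* Screen the requested minor version before evaluating ops. */
+      static int check_minorversion(uint32_t minorversion,
+                                    struct compound_res *res)
+      {
+          if (minorversion != 0) {    /* only version 0 is defined */
+              res->status = NFS4ERR_MINOR_VERS_MISMATCH;
+              res->numres = 0;        /* zero-length result array */
+              return -1;
+          }
+          res->status = NFS4_OK;
+          return 0;
+      }
+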
+11.  Internationalization
+
+   The primary internationalization (I18N) issue for NFS version 4 is
+   the handling of file names and other strings as used within the
+   protocol.  The choice of string representation must allow reasonable
+   name/string access to clients which use various languages.  The UTF-8
+   encoding of the UCS as defined by [ISO10646] allows for this type of
+   access and follows the policy described in "IETF Policy on Character
+   Sets and Languages", [RFC2277].
+
+   [RFC3454], otherwise known as "stringprep", documents a framework for
+   using Unicode/UTF-8 in networking protocols, so as "to increase the
+   likelihood that string input and string comparison work in ways that
+   make sense for typical users throughout the world."  A protocol must
+   define a profile of stringprep "in order to fully specify the
+   processing options."  The remainder of this Internationalization
+   section defines the NFS version 4 stringprep profiles.  Much of the
+   terminology used for the remainder of this section comes from
+   stringprep.
+
+   There are three UTF-8 string types defined for NFS version 4:
+   utf8str_cs, utf8str_cis, and utf8str_mixed.  Separate profiles are
+   defined for each.  Each profile defines the following, as required by
+   stringprep:
+
+   o  The intended applicability of the profile
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 122]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  The character repertoire that is the input and output to
+      stringprep (which is Unicode 3.2 for the referenced version of
+      stringprep)
+
+   o  The mapping tables from stringprep used (as described in section 3
+      of stringprep)
+
+   o  Any additional mapping tables specific to the profile
+
+   o  The Unicode normalization used, if any (as described in section 4
+      of stringprep)
+
+   o  The tables from the stringprep listing of characters that are
+      prohibited as output (as described in section 5 of stringprep)
+
+   o  The bidirectional string testing used, if any (as described in
+      section 6 of stringprep)
+
+   o  Any additional characters that are prohibited as output specific
+      to the profile
+
+   Stringprep discusses Unicode characters, whereas NFS version 4
+   renders UTF-8 characters.  Since there is a one-to-one mapping from
+   UTF-8 to Unicode, wherever the remainder of this document refers to
+   Unicode, the reader should assume UTF-8.
+
+   Much of the text for the profiles comes from [RFC3454].
+
+11.1.  Stringprep profile for the utf8str_cs type
+
+   Every use of the utf8str_cs type definition in the NFS version 4
+   protocol specification follows the profile named nfs4_cs_prep.
+
+11.1.1.  Intended applicability of the nfs4_cs_prep profile
+
+   The utf8str_cs type is a case sensitive string of UTF-8 characters.
+   Its primary use in NFS Version 4 is for naming components and
+   pathnames.  Components and pathnames are stored on the server's
+   filesystem.  Two valid distinct UTF-8 strings might be the same after
+   processing via the utf8str_cs profile.
+   If the strings are two names inside a directory, the NFS version 4
+   server will need to either:
+
+   o  disallow the creation of a second name if its post-processed form
+      collides with that of an existing name, or
+
+   o  allow the creation of the second name, but arrange so that after
+      post processing, the second name is different from the post-
+      processed form of the first name.
+
+
+
+Shepler, et al.             Standards Track                   [Page 123]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+11.1.2.  Character repertoire of nfs4_cs_prep
+
+   The nfs4_cs_prep profile uses Unicode 3.2, as defined in stringprep's
+   Appendix A.1.
+
+11.1.3.  Mapping used by nfs4_cs_prep
+
+   The nfs4_cs_prep profile specifies mapping using the following table
+   from stringprep:
+
+      Table B.1
+
+   Table B.2 is normally not part of the nfs4_cs_prep profile as it is
+   primarily for dealing with case-insensitive comparisons.  However, if
+   the NFS version 4 file server supports the case_insensitive
+   filesystem attribute, and if case_insensitive is true, the NFS
+   version 4 server MUST use Table B.2 (in addition to Table B.1) when
+   processing utf8str_cs strings, and the NFS version 4 client MUST
+   assume Table B.2 (in addition to Table B.1) is being used.
+
+   If the case_preserving attribute is present and set to false, then
+   the NFS version 4 server MUST use Table B.2 to map case when
+   processing utf8str_cs strings.  Whether the server maps from lower to
+   upper case or from upper to lower case is an implementation
+   dependency.
+
+11.1.4.  Normalization used by nfs4_cs_prep
+
+   The nfs4_cs_prep profile does not specify a normalization form.  A
+   later revision of this specification may specify a particular
+   normalization form.  Therefore, the server and client can expect that
+   they may receive unnormalized characters within protocol requests and
+   responses.  If the operating environment requires normalization, then
+   the implementation must normalize utf8str_cs strings within the
+   protocol before presenting the information to an application (at the
+   client) or local filesystem (at the server).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 124]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+11.1.5.  Prohibited output for nfs4_cs_prep
+
+   The nfs4_cs_prep profile specifies prohibiting the use of the
+   following tables from stringprep:
+
+      Table C.3
+      Table C.4
+      Table C.5
+      Table C.6
+      Table C.7
+      Table C.8
+      Table C.9
+
+11.1.6.  Bidirectional output for nfs4_cs_prep
+
+   The nfs4_cs_prep profile does not specify any checking of
+   bidirectional strings.
+
+11.2.  Stringprep profile for the utf8str_cis type
+
+   Every use of the utf8str_cis type definition in the NFS version 4
+   protocol specification follows the profile named nfs4_cis_prep.
+
+11.2.1.  Intended applicability of the nfs4_cis_prep profile
+
+   The utf8str_cis type is a case insensitive string of UTF-8
+   characters.  Its primary use in NFS Version 4 is for naming NFS
+   servers.
+
+11.2.2.  Character repertoire of nfs4_cis_prep
+
+   The nfs4_cis_prep profile uses Unicode 3.2, as defined in
+   stringprep's Appendix A.1.
+
+11.2.3.  Mapping used by nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies mapping using the following
+   tables from stringprep:
+
+      Table B.1
+      Table B.2
+
+11.2.4.  Normalization used by nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies using Unicode normalization form
+   KC, as described in stringprep.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 125]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+11.2.5.  Prohibited output for nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies prohibiting the use of the
+   following tables from stringprep:
+
+      Table C.1.2
+      Table C.2.2
+      Table C.3
+      Table C.4
+      Table C.5
+      Table C.6
+      Table C.7
+      Table C.8
+      Table C.9
+
+11.2.6.  Bidirectional output for nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies checking bidirectional strings as
+   described in stringprep's section 6.
+
+11.3.  Stringprep profile for the utf8str_mixed type
+
+   Every use of the utf8str_mixed type definition in the NFS version 4
+   protocol specification follows the profile named nfs4_mixed_prep.
+
+11.3.1.  Intended applicability of the nfs4_mixed_prep profile
+
+   The utf8str_mixed type is a string of UTF-8 characters, with a prefix
+   that is case sensitive, a separator equal to '@', and a suffix that
+   is a fully qualified domain name.  Its primary use in NFS Version 4
+   is for naming principals identified in an Access Control Entry.
+
+11.3.2.  Character repertoire of nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile uses Unicode 3.2, as defined in
+   stringprep's Appendix A.1.
+
+11.3.3.  Mapping used by nfs4_mixed_prep
+
+   For the prefix and the separator of a utf8str_mixed string, the
+   nfs4_mixed_prep profile specifies mapping using the following table
+   from stringprep:
+
+      Table B.1
+
+   For the suffix of a utf8str_mixed string, the nfs4_mixed_prep profile
+   specifies mapping using the following tables from stringprep:
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 126]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      Table B.1
+      Table B.2
+
+11.3.4.  Normalization used by nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile specifies using Unicode normalization
+   form KC, as described in stringprep.
+
+11.3.5.  Prohibited output for nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile specifies prohibiting the use of the
+   following tables from stringprep:
+
+      Table C.1.2
+      Table C.2.2
+      Table C.3
+      Table C.4
+      Table C.5
+      Table C.6
+      Table C.7
+      Table C.8
+      Table C.9
+
+11.3.6.  Bidirectional output for nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile specifies checking bidirectional strings
+   as described in stringprep's section 6.
+
+11.4.  UTF-8 Related Errors
+
+   Where the client sends an invalid UTF-8 string, the server should
+   return an NFS4ERR_INVAL error.  This includes cases in which
+   inappropriate prefixes are detected and where the count includes
+   trailing bytes that do not constitute a full UCS character.
+
+   Where the client-supplied string is valid UTF-8 but contains
+   characters that are not supported by the server as a value for that
+   string (e.g., names containing characters that have more than two
+   octets on a filesystem that supports Unicode characters only), the
+   server should return an NFS4ERR_BADCHAR error.
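+
+   The validity checks behind NFS4ERR_INVAL can be sketched in C.  The
+   following simplified validator is illustrative only; a production
+   server would also need to reject overlong three- and four-octet
+   forms and surrogate code points.  The error values match those in
+   the Error Definitions section:
+
+      #include <stddef.h>
+
+      #define NFS4_OK       0
+      #define NFS4ERR_INVAL 22
+
+      static int check_utf8(const unsigned char *s, size_t len)
+      {
+          size_t i = 0, n;
+
+          while (i < len) {
+              unsigned char c = s[i];
+
+              if (c < 0x80)      n = 0;  /* single octet */
+              else if (c < 0xC2) return NFS4ERR_INVAL;
+                                 /* bare continuation octet or an
+                                    overlong two-octet form */
+              else if (c < 0xE0) n = 1;
+              else if (c < 0xF0) n = 2;
+              else if (c < 0xF5) n = 3;
+              else               return NFS4ERR_INVAL;
+
+              if (len - i - 1 < n)
+                  return NFS4ERR_INVAL;  /* count ends inside a
+                                            multi-octet character */
+              while (n-- > 0) {
+                  c = s[++i];
+                  if ((c & 0xC0) != 0x80)  /* not 10xxxxxx */
+                      return NFS4ERR_INVAL;
+              }
+              i++;
+          }
+          return NFS4_OK;
+      }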
+
+   Where a UTF-8 string is used as a file name, and the filesystem,
+   while supporting all of the characters within the name, does not
+   allow that particular name to be used, the server should return the
+   error NFS4ERR_BADNAME.  This includes situations in which the server
+   filesystem imposes a normalization constraint on name strings, but
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 127]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   will also include such situations as filesystem prohibitions of "."
+   and ".." as file names for certain operations, and other such
+   constraints.
+
+12.  Error Definitions
+
+   NFS error numbers are assigned to failed operations within a compound
+   request.  A compound request contains a number of NFS operations that
+   have their results encoded in sequence in a compound reply.  The
+   results of successful operations will consist of an NFS4_OK status
+   followed by the encoded results of the operation.  If an NFS
+   operation fails, an error status will be entered in the reply and the
+   compound request will be terminated.
+
+   A description of each defined error follows:
+
+   NFS4_OK               Indicates the operation completed successfully.
+
+
+   NFS4ERR_ACCESS        Permission denied.  The caller does not have
+                         the correct permission to perform the requested
+                         operation.  Contrast this with NFS4ERR_PERM,
+                         which restricts itself to owner or privileged
+                         user permission failures.
+
+   NFS4ERR_ATTRNOTSUPP   An attribute specified is not supported by the
+                         server.  Does not apply to the GETATTR
+                         operation.
+
+   NFS4ERR_ADMIN_REVOKED Due to administrator intervention, the
+                         lockowner's record locks, share reservations,
+                         and delegations have been revoked by the
+                         server.
+
+   NFS4ERR_BADCHAR       A UTF-8 string contains a character which is
+                         not supported by the server in the context in
+                         which it is being used.
+
+   NFS4ERR_BAD_COOKIE    READDIR cookie is stale.
+
+   NFS4ERR_BADHANDLE     Illegal NFS filehandle.  The filehandle failed
+                         internal consistency checks.
+
+   NFS4ERR_BADNAME       A name string in a request consists of valid
+                         UTF-8 characters supported by the server but
+                         the name is not supported by the server as a
+                         valid name for the current operation.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 128]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   NFS4ERR_BADOWNER      An owner, owner_group, or ACL attribute value
+                         can not be translated to local representation.
+
+   NFS4ERR_BADTYPE       An attempt was made to create an object of a
+                         type not supported by the server.
+
+   NFS4ERR_BAD_RANGE     The range for a LOCK, LOCKT, or LOCKU operation
+                         is not appropriate to the allowable range of
+                         offsets for the server.
+
+   NFS4ERR_BAD_SEQID     The sequence number in a locking request is
+                         neither the next expected number nor the last
+                         number processed.
+
+   NFS4ERR_BAD_STATEID   A stateid generated by the current server
+                         instance, but which does not designate any
+                         locking state (either current or superseded)
+                         for a current lockowner-file pair, was used.
+
+   NFS4ERR_BADXDR        The server encountered an XDR decoding error
+                         while processing an operation.
+
+   NFS4ERR_CLID_INUSE    The SETCLIENTID operation has found that a
+                         client id is already in use by another client.
+
+   NFS4ERR_DEADLOCK      The server has been able to determine a file
+                         locking deadlock condition for a blocking lock
+                         request.
+
+   NFS4ERR_DELAY         The server initiated the request, but was not
+                         able to complete it in a timely fashion.  The
+                         client should wait and then try the request
+                         with a new RPC transaction ID.  For example,
+                         this error should be returned from a server
+                         that supports hierarchical storage and receives
+                         a request to process a file that has been
+                         migrated.  In this case, the server should
+                         start the migration process and respond to the
+                         client with this error.  This error may also
+                         occur when a necessary delegation recall makes
+                         processing a request in a timely fashion
+                         impossible.
+
+   NFS4ERR_DENIED        An attempt to lock a file is denied.  Since
+                         this may be a temporary condition, the client
+                         is encouraged to retry the lock request until
+                         the lock is accepted.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 129]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   NFS4ERR_DQUOT         Resource (quota) hard limit exceeded.  The
+                         user's resource limit on the server has been
+                         exceeded.
+
+   NFS4ERR_EXIST         File exists.  The file specified already
+                         exists.
+
+   NFS4ERR_EXPIRED       A lease has expired that is being used in the
+                         current operation.
+
+   NFS4ERR_FBIG          File too large.  The operation would have
+                         caused a file to grow beyond the server's
+                         limit.
+
+   NFS4ERR_FHEXPIRED     The filehandle provided is volatile and has
+                         expired at the server.
+
+   NFS4ERR_FILE_OPEN     The operation can not be successfully processed
+                         because a file involved in the operation is
+                         currently open.
+
+   NFS4ERR_GRACE         The server is in its recovery or grace period
+                         which should match the lease period of the
+                         server.
+
+   NFS4ERR_INVAL         Invalid argument or unsupported argument for an
+                         operation.  Two examples are attempting a
+                         READLINK on an object other than a symbolic
+                         link or specifying a value for an enum field
+                         that is not defined in the protocol (e.g.,
+                         nfs_ftype4).
+
+   NFS4ERR_IO            I/O error. 
A hard error (for example, a disk +                         error) occurred while processing the requested +                         operation. + +   NFS4ERR_ISDIR         Is a directory. The caller specified a +                         directory in a non-directory operation. + +   NFS4ERR_LEASE_MOVED   A lease being renewed is associated with a +                         filesystem that has been migrated to a new +                         server. + +   NFS4ERR_LOCKED        A read or write operation was attempted on a +                         locked file. + +   NFS4ERR_LOCK_NOTSUPP  Server does not support atomic upgrade or +                         downgrade of locks. + + + + + +Shepler, et al.             Standards Track                   [Page 130] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   NFS4ERR_LOCK_RANGE    A lock request is operating on a sub-range of a +                         current lock for the lock owner and the server +                         does not support this type of request. + +   NFS4ERR_LOCKS_HELD    A CLOSE was attempted and file locks would +                         exist after the CLOSE. + +   NFS4ERR_MINOR_VERS_MISMATCH +                         The server has received a request that +                         specifies an unsupported minor version.  The +                         server must return a COMPOUND4res with a zero +                         length operations result array. + +   NFS4ERR_MLINK         Too many hard links. + +   NFS4ERR_MOVED         The filesystem which contains the current +                         filehandle object has been relocated or +                         migrated to another server.  The client may +                         obtain the new filesystem location by obtaining +                         the "fs_locations" attribute for the current +                         filehandle.  For further discussion, refer to +                         the section "Filesystem Migration or +                         Relocation". + +   NFS4ERR_NAMETOOLONG   The filename in an operation was too long. + +   NFS4ERR_NOENT         No such file or directory. The file or +                         directory name specified does not exist. + +   NFS4ERR_NOFILEHANDLE  The logical current filehandle value (or, in +                         the case of RESTOREFH, the saved filehandle +                         value) has not been set properly.  This may be +                         a result of a malformed COMPOUND operation +                         (i.e., no PUTFH or PUTROOTFH before an +                         operation that requires the current filehandle +                         be set). + +   NFS4ERR_NO_GRACE      A reclaim of client state has fallen outside of +                         the grace period of the server.  As a result, +                         the server can not guarantee that conflicting +                         state has not been provided to another client. + +   NFS4ERR_NOSPC         No space left on device. The operation would +                         have caused the server's filesystem to exceed +                         its limit. + +   NFS4ERR_NOTDIR        Not a directory. The caller specified a non- +                         directory in a directory operation. + + + +Shepler, et al.             
Standards Track                   [Page 131]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   NFS4ERR_NOTEMPTY      An attempt was made to remove a directory that
+                         was not empty.
+
+   NFS4ERR_NOTSUPP       Operation is not supported.
+
+   NFS4ERR_NOT_SAME      This error is returned by the VERIFY operation
+                         to signify that the attributes compared were
+                         not the same as provided in the client's
+                         request.
+
+   NFS4ERR_NXIO          I/O error.  No such device or address.
+
+   NFS4ERR_OLD_STATEID   A stateid which designates the locking state
+                         for a lockowner-file pair at an earlier time
+                         was used.
+
+   NFS4ERR_OPENMODE      The client attempted a READ, WRITE, LOCK or
+                         SETATTR operation not sanctioned by the stateid
+                         passed (e.g., writing to a file opened only for
+                         read).
+
+   NFS4ERR_OP_ILLEGAL    An illegal operation value has been specified
+                         in the argop field of a COMPOUND or CB_COMPOUND
+                         procedure.
+
+   NFS4ERR_PERM          Not owner.  The operation was not allowed
+                         because the caller is either not a privileged
+                         user (root) or not the owner of the target of
+                         the operation.
+
+   NFS4ERR_RECLAIM_BAD   The reclaim provided by the client does not
+                         match any of the server's state consistency
+                         checks and is bad.
+
+   NFS4ERR_RECLAIM_CONFLICT
+                         The reclaim provided by the client has
+                         encountered a conflict and can not be provided.
+                         Potentially indicates a misbehaving client.
+
+   NFS4ERR_RESOURCE      For the processing of the COMPOUND procedure,
+                         the server may exhaust available resources and
+                         can not continue processing operations within
+                         the COMPOUND procedure.  This error will be
+                         returned from the server in those instances of
+                         resource exhaustion related to the processing
+                         of the COMPOUND procedure.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 132]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   NFS4ERR_RESTOREFH     The RESTOREFH operation does not have a saved
+                         filehandle (identified by SAVEFH) to operate
+                         upon.
+
+   NFS4ERR_ROFS          Read-only filesystem.  A modifying operation
+                         was attempted on a read-only filesystem.
+
+   NFS4ERR_SAME          This error is returned by the NVERIFY operation
+                         to signify that the attributes compared were
+                         the same as provided in the client's request.
+
+   NFS4ERR_SERVERFAULT   An error occurred on the server which does not
+                         map to any of the legal NFS version 4 protocol
+                         error values.  The client should translate this
+                         into an appropriate error.  UNIX clients may
+                         choose to translate this to EIO.
+
+   NFS4ERR_SHARE_DENIED  An attempt to OPEN a file with a share
+                         reservation has failed because of a share
+                         conflict.
+
+   NFS4ERR_STALE         Invalid filehandle.  The filehandle given in
+                         the arguments was invalid.  The file referred
+                         to by that filehandle no longer exists or
+                         access to it has been revoked.
+
+   NFS4ERR_STALE_CLIENTID A clientid not recognized by the server was
+                          used in a locking or SETCLIENTID_CONFIRM
+                          request.
+
+   NFS4ERR_STALE_STATEID A stateid generated by an earlier server
+                         instance was used.
+
+   NFS4ERR_SYMLINK       The current filehandle provided for a LOOKUP is
+                         not a directory but a symbolic link.  Also used
+                         if the final component of the OPEN path is a
+                         symbolic link.
+
+   NFS4ERR_TOOSMALL      The encoded response to a READDIR request
+                         exceeds the size limit set by the initial
+                         request.
+
+   NFS4ERR_WRONGSEC      The security mechanism being used by the client
+                         for the operation does not match the server's
+                         security policy.  The client should change the
+                         security mechanism being used and retry the
+                         operation.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 133]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   NFS4ERR_XDEV          Attempt to do an operation between different
+                         fsids.
+
+13.  NFS version 4 Requests
+
+   For the NFS version 4 RPC program, there are two traditional RPC
+   procedures: NULL and COMPOUND.  All other functionality is defined as
+   a set of operations and these operations are defined in normal
+   XDR/RPC syntax and semantics.  However, these operations are
+   encapsulated within the COMPOUND procedure.  This requires that the
+   client combine one or more of the NFS version 4 operations into a
+   single request.
+
+   The NFS4_CALLBACK program is used to provide server to client
+   signaling and is constructed in a fashion similar to that of the NFS
+   version 4 program.  The procedures CB_NULL and CB_COMPOUND are
+   defined in the same way as NULL and COMPOUND are within the NFS
+   program.  The CB_COMPOUND request also encapsulates the remaining
+   operations of the NFS4_CALLBACK program.  There is no predefined RPC
+   program number for the NFS4_CALLBACK program.  It is up to the client
+   to specify a program number in the "transient" program range.  The
+   program and port number of the NFS4_CALLBACK program are provided by
+   the client as part of the SETCLIENTID/SETCLIENTID_CONFIRM sequence.
+   The program and port can be changed by another
+   SETCLIENTID/SETCLIENTID_CONFIRM sequence, and it is possible to use
+   the sequence to change them within a client incarnation without
+   removing relevant leased client state.
+
+13.1.  Compound Procedure
+
+   The COMPOUND procedure provides the opportunity for better
+   performance within high latency networks.  The client can avoid
+   cumulative latency of multiple RPCs by combining multiple dependent
+   operations into a single COMPOUND procedure.
A compound operation
+   may provide for protocol simplification by allowing the client to
+   combine basic procedures into a single request that is customized for
+   the client's environment.
+
+   The CB_COMPOUND procedure precisely parallels the features of
+   COMPOUND as described above.
+
+   The basic structure of the COMPOUND procedure is:
+
+   +-----+--------------+--------+-----------+-----------+-----------+--
+   | tag | minorversion | numops | op + args | op + args | op + args |
+   +-----+--------------+--------+-----------+-----------+-----------+--
+
+
+
+Shepler, et al.             Standards Track                   [Page 134]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   and the reply's structure is:
+
+      +------------+-----+--------+-----------------------+--
+      |last status | tag | numres | status + op + results |
+      +------------+-----+--------+-----------------------+--
+
+   The numops and numres fields, used in the depiction above, represent
+   the count for the counted array encoding used to signify the number
+   of arguments or results encoded in the request and response.  As per
+   the XDR encoding, these counts must match exactly the number of
+   operation arguments or results encoded.
+
+13.2.  Evaluation of a Compound Request
+
+   The server will process the COMPOUND procedure by evaluating each of
+   the operations within the COMPOUND procedure in order.  Each
+   component operation consists of a 32 bit operation code, followed by
+   the argument of length determined by the type of operation.  The
+   results of each operation are encoded in sequence into a reply
+   buffer.  The results of each operation are preceded by the opcode and
+   a status code (normally zero).  If an operation results in a non-zero
+   status code, the status will be encoded, evaluation of the compound
+   sequence will halt, and the reply will be returned.  Note that
+   evaluation stops even in the event of "non error" conditions such as
+   NFS4ERR_SAME.
+
+   There are no atomicity requirements for the operations contained
+   within the COMPOUND procedure.  The operations being evaluated as
+   part of a COMPOUND request may be evaluated simultaneously with other
+   COMPOUND requests that the server receives.
+
+   It is the client's responsibility to recover from any partially
+   completed COMPOUND procedure.  Partially completed COMPOUND
+   procedures may occur at any point due to errors such as
+   NFS4ERR_RESOURCE and NFS4ERR_DELAY.  This may occur even given an
+   otherwise valid operation string.  Further, a server reboot which
+   occurs in the middle of processing a COMPOUND procedure may leave the
+   client with the difficult task of determining how far COMPOUND
+   processing has proceeded.  Therefore, the client should avoid overly
+   complex COMPOUND procedures in the event of the failure of an
+   operation within the procedure.
+
+   Each operation assumes a "current" and "saved" filehandle that is
+   available as part of the execution context of the compound request.
+   Operations may set, change, or return the current filehandle.  The
+   "saved" filehandle is used for temporary storage of a filehandle
+   value and as an operand for the RENAME and LINK operations.
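+
+   For example (an illustrative sequence; the name "foo" is used purely
+   for illustration), consider the request:
+
+      PUTFH  (directory filehandle)
+      LOOKUP "foo"
+      GETFH
+
+   If the LOOKUP fails with NFS4ERR_NOENT, evaluation halts at that
+   point: GETFH is not evaluated, the reply carries the results of
+   PUTFH and the failed LOOKUP only, and the status of the COMPOUND
+   procedure as a whole is NFS4ERR_NOENT.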
+
+
+
+Shepler, et al.             Standards Track                   [Page 135]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+13.3.  Synchronous Modifying Operations
+
+   NFS version 4 operations that modify the filesystem are synchronous.
+   When an operation is successfully completed at the server, the client
+   can depend on the fact that any data associated with the request is
+   now on stable storage (the one exception is the file data in a WRITE
+   operation with the UNSTABLE option specified).
+
+   This implies that any previous operations within the same compound
+   request are also reflected in stable storage.  This behavior enables
+   the client to recover from a partially executed compound request
+   which may have resulted from the failure of the server.  For
+   example, if a compound request contains operations A and B and the
+   server is unable to send a response to the client, depending on the
+   progress the server made in servicing the request the result of both
+   operations may be reflected in stable storage or just operation A may
+   be reflected.  The server must not have just the results of operation
+   B in stable storage.
+
+13.4.  Operation Values
+
+   The operations encoded in the COMPOUND procedure are identified by
+   operation values.  To avoid overlap with the RPC procedure numbers,
+   operations 0 (zero) and 1 are not defined.  Operation 2 is not
+   defined but reserved for future use with minor versioning.
+
+14.  NFS version 4 Procedures
+
+14.1.  Procedure 0: NULL - No Operation
+
+   SYNOPSIS
+
+      <null>
+
+   ARGUMENT
+
+      void;
+
+   RESULT
+
+      void;
+
+
+
+Shepler, et al.             Standards Track                   [Page 136]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+      Standard NULL procedure.  Void argument, void response.  This
+      procedure has no functionality associated with it.  Because of
+      this it is sometimes used to measure the overhead of processing a
+      service request.  Therefore, the server should ensure that no
+      unnecessary work is done in servicing this procedure.
+
+   ERRORS
+
+      None.
+
+14.2.  Procedure 1: COMPOUND - Compound Operations
+
+   SYNOPSIS
+
+     compoundargs -> compoundres
+
+   ARGUMENT
+
+     union nfs_argop4 switch (nfs_opnum4 argop) {
+             case <OPCODE>: <argument>;
+             ...
+     };
+
+     struct COMPOUND4args {
+             utf8str_cs      tag;
+             uint32_t        minorversion;
+             nfs_argop4      argarray<>;
+     };
+
+   RESULT
+
+     union nfs_resop4 switch (nfs_opnum4 resop) {
+             case <OPCODE>: <result>;
+             ...
+     };
+
+     struct COMPOUND4res {
+             nfsstat4        status;
+             utf8str_cs      tag;
+             nfs_resop4      resarray<>;
+     };
+
+
+
+Shepler, et al.             Standards Track                   [Page 137]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   The COMPOUND procedure is used to combine one or more of the NFS
+   operations into a single RPC request.  The main NFS RPC program has
+   two main procedures: NULL and COMPOUND.  All other operations use the
+   COMPOUND procedure as a wrapper.
+
+   The server interprets each of the operations in the COMPOUND request
+   in turn.  If an operation is executed by the server and the status of
+   that operation is NFS4_OK, then the next operation in the COMPOUND
+   procedure is executed.  The server continues this process until
+   there are no more operations to be executed or one of the operations
+   has a status value other than NFS4_OK.
+
+   In the processing of the COMPOUND procedure, the server may find that
+   it does not have the available resources to execute any or all of the
+   operations within the COMPOUND sequence.  In this case, the error
+   NFS4ERR_RESOURCE will be returned for the particular operation within
+   the COMPOUND procedure where the resource exhaustion occurred.  This
+   assumes that all previous operations within the COMPOUND sequence
+   have been evaluated successfully.  The results for all of the
+   evaluated operations must be returned to the client.
+
+   The server will generally choose between two methods of decoding the
+   client's request.  The first would be the traditional one-pass XDR
+   decode, in which decoding of the entire COMPOUND precedes execution
+   of any operation within it.  If there is an XDR decoding error in
+   this case, an RPC XDR decode error would be returned.  The second
+   method would be to make an initial pass to decode the basic COMPOUND
+   request and then to XDR decode each of the individual operations, as
+   the server is ready to execute it.  In this case, the server may
+   encounter an XDR decode error during such an operation decode, after
+   previous operations within the COMPOUND have been executed.  In this
+   case, the server would return the error NFS4ERR_BADXDR to signify the
+   decode error.
+
+   The COMPOUND arguments contain a "minorversion" field.  The initial
+   and default value for this field is 0 (zero).  This field will be
+   used by future minor versions such that the client can communicate to
+   the server what minor version is being requested.  If the server
+   receives a COMPOUND procedure with a minorversion field value that it
+   does not support, the server MUST return an error of
+   NFS4ERR_MINOR_VERS_MISMATCH and a zero length resultdata array.
+
+   Contained within the COMPOUND results is a "status" field.  If the
+   results array length is non-zero, this status must be equivalent to
+   the status of the last operation that was executed within the
+
+
+
+Shepler, et al.             Standards Track                   [Page 138]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   COMPOUND procedure.  Therefore, if an operation incurred an error
+   then the "status" value will be the same error value as is being
+   returned for the operation that failed.
+
+   Note that operations 0 (zero) and 1 (one) are not defined for the
+   COMPOUND procedure.  Operation 2 is not defined but reserved for
+   future definition and use with minor versioning.  If the server
+   receives an operation array that contains operation 2 and the
+   minorversion field has a value of 0 (zero), an error of
+   NFS4ERR_OP_ILLEGAL, as described in the next paragraph, is returned
+   to the client.  If an operation array contains an operation 2 and the
+   minorversion field is non-zero and the server does not support the
+   minor version, the server returns an error of
+   NFS4ERR_MINOR_VERS_MISMATCH.  Therefore, the
+   NFS4ERR_MINOR_VERS_MISMATCH error takes precedence over all other
+   errors.
+
+   It is possible that the server receives a request that contains an
+   operation that is less than the first legal operation (OP_ACCESS) or
+   greater than the last legal operation (OP_RELEASE_LOCKOWNER).
+
+   In this case, the server's response will encode the opcode OP_ILLEGAL
+   rather than the illegal opcode of the request.  The status field in
+   the ILLEGAL return results will be set to NFS4ERR_OP_ILLEGAL.  The
+   status of the COMPOUND procedure will also be NFS4ERR_OP_ILLEGAL.
+
+   The definition of the "tag" in the request is left to the
+   implementor.  It may be used to summarize the content of the compound
+   request for the benefit of packet sniffers and engineers debugging
+   implementations.  However, the value of "tag" in the response SHOULD
+   be the same value as provided in the request.  This applies to the
+   tag field of the CB_COMPOUND procedure as well.
+
+   IMPLEMENTATION
+
+   Since an error of any type may occur after only a portion of the
+   operations have been evaluated, the client must be prepared to
+   recover from any failure.  If the source of an NFS4ERR_RESOURCE error
+   was a complex or lengthy set of operations, it is likely that if the
+   number of operations were reduced the server would be able to
+   evaluate them successfully.  Therefore, the client is responsible for
+   dealing with this type of complexity in recovery.
+
+   ERRORS
+
+   All errors defined in the protocol
+
+
+
+Shepler, et al.             Standards Track                   [Page 139]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.1.  Operation 3: ACCESS - Check Access Rights
+
+   SYNOPSIS
+
+     (cfh), accessreq -> supported, accessrights
+
+   ARGUMENT
+
+     const ACCESS4_READ      = 0x00000001;
+     const ACCESS4_LOOKUP    = 0x00000002;
+     const ACCESS4_MODIFY    = 0x00000004;
+     const ACCESS4_EXTEND    = 0x00000008;
+     const ACCESS4_DELETE    = 0x00000010;
+     const ACCESS4_EXECUTE   = 0x00000020;
+
+     struct ACCESS4args {
+             /* CURRENT_FH: object */
+             uint32_t        access;
+     };
+
+   RESULT
+
+     struct ACCESS4resok {
+             uint32_t        supported;
+             uint32_t        access;
+     };
+
+     union ACCESS4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              ACCESS4resok   resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   ACCESS determines the access rights that a user, as identified by the
+   credentials in the RPC request, has with respect to the file system
+   object specified by the current filehandle.  The client encodes the
+   set of access rights that are to be checked in the bit mask "access".
+   The server checks the permissions encoded in the bit mask.  If a
+   status of NFS4_OK is returned, two bit masks are included in the
+   response.  The first, "supported", represents the access rights that
+   the server can verify reliably.  The second, "access", represents the
+   access rights available to the user for the filehandle provided.  On
+   success, the current filehandle retains its value.
+
+
+
+Shepler, et al.             Standards Track                   [Page 140]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   Note that the supported field will contain only as many values as
+   were originally sent in the arguments.  For example, if the client
+   sends an ACCESS operation with only the ACCESS4_READ value set and
+   the server supports this value, the server will return only
+   ACCESS4_READ even if it could have reliably checked other values.
+
+   The results of this operation are necessarily advisory in nature.
+   A return status of NFS4_OK and the appropriate bit set in the bit
+   mask does not imply that such access will be allowed to the file
+   system object in the future.  This is because access rights can be
+   revoked by the server at any time.
+
+   The following access permissions may be requested:
+
+   ACCESS4_READ    Read data from file or read a directory.
+
+   ACCESS4_LOOKUP  Look up a name in a directory (no meaning for non-
+                   directory objects).
+
+   ACCESS4_MODIFY  Rewrite existing file data or modify existing
+                   directory entries.
+
+   ACCESS4_EXTEND  Write new data or add directory entries.
+
+   ACCESS4_DELETE  Delete an existing directory entry.
+
+   ACCESS4_EXECUTE Execute file (no meaning for a directory).
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   In general, it is not sufficient for the client to attempt to deduce
+   access permissions by inspecting the uid, gid, and mode fields in the
+   file attributes or by attempting to interpret the contents of the ACL
+   attribute.  This is because the server may perform uid or gid mapping
+   or enforce additional access control restrictions.  It is also
+   possible that the server may not be in the same ID space as the
+   client.  In these cases (and perhaps others), the client can not
+   reliably perform an access check with only current file attributes.
+
+   In the NFS version 2 protocol, the only reliable way to determine
+   whether an operation was allowed was to try it and see if it
+   succeeded or failed.  Using the ACCESS operation in the NFS version 4
+   protocol, the client can ask the server to indicate whether or not
+   one or more classes of operations are permitted.  The ACCESS
+   operation is provided to allow clients to check before doing a series
+   of operations which will result in an access failure.  The OPEN
+
+
+
+Shepler, et al.             Standards Track                   [Page 141]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   operation provides a point where the server can verify access to the
+   file object and a method to return that information to the client.
+   The ACCESS operation is still useful for directory operations or for
+   use in the case where the UNIX API "access" is used on the client.
+
+   The information returned by the server in response to an ACCESS call
+   is not permanent.  It was correct at the exact time that the server
+   performed the checks, but not necessarily afterwards.  The server can
+   revoke access permission at any time.
+
+   The client should use the effective credentials of the user to build
+   the authentication information in the ACCESS request used to
+   determine access rights.  It is the effective user and group
+   credentials that are used in subsequent read and write operations.
+
+   Many implementations do not directly support the ACCESS4_DELETE
+   permission.  Operating systems like UNIX will ignore the
+   ACCESS4_DELETE bit if set on an access request on a non-directory
+   object.  In these systems, delete permission on a file is determined
+   by the access permissions on the directory in which the file resides,
+   instead of being determined by the permissions of the file itself.
+   Therefore, the mask returned enumerating which access rights can be
+   determined will have the ACCESS4_DELETE value set to 0.  This
+   indicates to the client that the server was unable to check that
+   particular access right.  The ACCESS4_DELETE bit in the access mask
+   returned will then be ignored by the client.
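+
+   For example (an illustrative sequence; the filename is hypothetical),
+   a client that wants to check read and delete permission before
+   removing a file might issue:
+
+      PUTFH  (directory filehandle)
+      LOOKUP "datafile"
+      ACCESS (ACCESS4_READ | ACCESS4_DELETE)
+
+   A UNIX server that cannot reliably check ACCESS4_DELETE for a non-
+   directory object would clear that bit in the "supported" mask of its
+   reply, and the client would then fall back on the permissions of the
+   containing directory.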
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.2.  Operation 4: CLOSE - Close File
+
+   SYNOPSIS
+
+     (cfh), seqid, open_stateid -> open_stateid
+
+
+
+Shepler, et al.             Standards Track                   [Page 142]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ARGUMENT
+
+     struct CLOSE4args {
+             /* CURRENT_FH: object */
+             seqid4          seqid;
+             stateid4        open_stateid;
+     };
+
+   RESULT
+
+     union CLOSE4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              stateid4       open_stateid;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The CLOSE operation releases share reservations for the regular or
+   named attribute file as specified by the current filehandle.  The
+   share reservations and other state information released at the server
+   as a result of this CLOSE is only associated with the supplied
+   stateid.  The sequence id provides for the correct ordering.  State
+   associated with other OPENs is not affected.
+
+   If record locks are held, the client SHOULD release all locks before
+   issuing a CLOSE.  The server MAY free all outstanding locks on CLOSE
+   but some servers may not support the CLOSE of a file that still has
+   record locks held.  The server MUST return failure if any locks would
+   exist after the CLOSE.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   Even though CLOSE returns a stateid, this stateid is not useful to
+   the client and should be treated as deprecated.  CLOSE "shuts down"
+   the state associated with all OPENs for the file by a single
+   open_owner.  As noted above, CLOSE will either release all file
+   locking state or return an error.  Therefore, the stateid returned by
+   CLOSE is not useful for operations that follow.
+
+   ERRORS
+
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_SEQID
+
+
+
+Shepler, et al.             Standards Track                   [Page 143]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKS_HELD
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.3.  Operation 5: COMMIT - Commit Cached Data
+
+   SYNOPSIS
+
+     (cfh), offset, count -> verifier
+
+   ARGUMENT
+
+     struct COMMIT4args {
+             /* CURRENT_FH: file */
+             offset4         offset;
+             count4          count;
+     };
+
+   RESULT
+
+     struct COMMIT4resok {
+             verifier4       writeverf;
+     };
+
+     union COMMIT4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              COMMIT4resok   resok4;
+      default:
+              void;
+     };
+
+
+
+Shepler, et al.             Standards Track                   [Page 144]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   The COMMIT operation forces or flushes data to stable storage for the
+   file specified by the current filehandle.  The flushed data is that
+   which was previously written with a WRITE operation which had the
+   stable field set to UNSTABLE4.
+
+   The offset specifies the position within the file where the flush is
+   to begin.  An offset value of 0 (zero) means to flush data starting
+   at the beginning of the file.  The count specifies the number of
+   bytes of data to flush.  If count is 0 (zero), a flush from offset to
+   the end of the file is done.
+
+   The server returns a write verifier upon successful completion of the
+   COMMIT.  The write verifier is used by the client to determine if the
+   server has restarted or rebooted between the initial WRITE(s) and the
+   COMMIT.  The client does this by comparing the write verifier
+   returned from the initial writes and the verifier returned by the
+   COMMIT operation.  The server must vary the value of the write
+   verifier at each server event or instantiation that may lead to a
+   loss of uncommitted data.  Most commonly this occurs when the server
+   is rebooted; however, other events at the server may result in
+   uncommitted data loss as well.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   The COMMIT operation is similar in operation and semantics to the
+   POSIX fsync(2) system call that synchronizes a file's state with the
+   disk (file data and metadata are flushed to disk or stable storage).
+   COMMIT performs the same operation for a client, flushing any
+   unsynchronized data and metadata on the server to the server's disk
+   or stable storage for the specified file.  Like fsync(2), it may be
+   that there is some modified data or no modified data to synchronize.
+   The data may have been synchronized by the server's normal periodic
+   buffer synchronization activity.  COMMIT should return NFS4_OK,
+   unless there has been an unexpected error.
+
+   COMMIT differs from fsync(2) in that it is possible for the client to
+   flush a range of the file (most likely triggered by a buffer-
+   reclamation scheme on the client before the file has been completely
+   written).
+
+   The server implementation of COMMIT is reasonably simple.  If the
+   server receives a full file COMMIT request, that is, starting at
+   offset 0 and count 0, it should do the equivalent of fsync()'ing the
+   file.  Otherwise, it should arrange to have the cached data in the
+
+
+
+Shepler, et al.             Standards Track                   [Page 145]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   range specified by offset and count flushed to stable storage.  In
+   both cases, any metadata associated with the file must be flushed to
+   stable storage before returning.  It is not an error for there to be
+   nothing to flush on the server.  This means that the data and
+   metadata that needed to be flushed have already been flushed or lost
+   during the last server failure.
+
+   The client implementation of COMMIT is a little more complex.  There
+   are two reasons for wanting to commit a client buffer to stable
+   storage.  The first is that the client wants to reuse a buffer.  In
+   this case, the offset and count of the buffer are sent to the server
+   in the COMMIT request.
+   The server then flushes any cached data based on the offset and
+   count, and flushes any metadata associated with the file.  It then
+   returns the status of the flush and the write verifier.  The other
+   reason for the client to generate a COMMIT is for a full file flush,
+   such as may be done at close.  In this case, the client would gather
+   all of the buffers for this file that contain uncommitted data, do
+   the COMMIT operation with an offset of 0 and count of 0, and then
+   free all of those buffers.  Any other dirty buffers would be sent to
+   the server in the normal fashion.
+
+   After a buffer is written by the client with the stable parameter set
+   to UNSTABLE4, the buffer must be considered as modified by the client
+   until the buffer has either been flushed via a COMMIT operation or
+   written via a WRITE operation with stable parameter set to FILE_SYNC4
+   or DATA_SYNC4.  This is done to prevent the buffer from being freed
+   and reused before the data can be flushed to stable storage on the
+   server.
+
+   When a response is returned from either a WRITE or a COMMIT operation
+   and it contains a write verifier that is different than previously
+   returned by the server, the client will need to retransmit all of the
+   buffers containing uncommitted cached data to the server.  How this
+   is to be done is up to the implementor.  If there is only one buffer
+   of interest, then it should probably be sent back over in a WRITE
+   request with the appropriate stable parameter.  If there is more than
+   one buffer, it might be worthwhile retransmitting all of the buffers
+   in WRITE requests with the stable parameter set to UNSTABLE4 and then
+   retransmitting the COMMIT operation to flush all of the data on the
+   server to stable storage.  The timing of these retransmissions is
+   left to the implementor.
+
+   The above description applies to page-cache-based systems as well as
+   buffer-cache-based systems.  In those systems, the virtual memory
+   system will need to be modified instead of the buffer cache.
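+
+   For example (an illustrative exchange; the verifier values V1 and V2
+   are hypothetical), a client flushing a dirty buffer might see:
+
+      PUTFH  (file filehandle)
+      WRITE  (offset, count, UNSTABLE4, data)  -> writeverf V1
+
+      ... later ...
+
+      PUTFH  (file filehandle)
+      COMMIT (offset, count)                   -> writeverf V2
+
+   If V2 does not equal V1, the server may have restarted and lost the
+   uncommitted data, so the client must retransmit the buffers in
+   question as described above.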
+
+
+
+Shepler, et al.             Standards Track                   [Page 146]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_ISDIR
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.4.  Operation 6: CREATE - Create a Non-Regular File Object
+
+   SYNOPSIS
+
+     (cfh), name, type, attrs -> (cfh), change_info, attrs_set
+
+   ARGUMENT
+
+     union createtype4 switch (nfs_ftype4 type) {
+      case NF4LNK:
+              linktext4      linkdata;
+      case NF4BLK:
+      case NF4CHR:
+              specdata4      devdata;
+      case NF4SOCK:
+      case NF4FIFO:
+      case NF4DIR:
+              void;
+     };
+
+     struct CREATE4args {
+             /* CURRENT_FH: directory for creation */
+             createtype4     objtype;
+             component4      objname;
+             fattr4          createattrs;
+     };
+
+   RESULT
+
+     struct CREATE4resok {
+             change_info4    cinfo;
+             bitmap4         attrset;        /* attributes set */
+
+
+
+Shepler, et al.             Standards Track                   [Page 147]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     };
+
+     union CREATE4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              CREATE4resok resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The CREATE operation creates a non-regular file object in a directory
+   with a given name.  The OPEN operation MUST be used to create a
+   regular file.
+
+   The objname specifies the name for the new object.  The objtype
+   determines the type of object to be created: directory, symlink, etc.
+
+   If an object of the same name already exists in the directory, the
+   server will return the error NFS4ERR_EXIST.
+
+   For the directory where the new file object was created, the server
+   returns change_info4 information in cinfo.  With the atomic field of
+   the change_info4 struct, the server will indicate if the before and
+   after change attributes were obtained atomically with respect to the
+   file object creation.
+
+   If the objname has a length of 0 (zero), or if objname does not obey
+   the UTF-8 definition, the error NFS4ERR_INVAL will be returned.
+
+   The current filehandle is replaced by that of the new object.
+
+   The createattrs specifies the initial set of attributes for the
+   object.  The set of attributes may include any writable attribute
+   valid for the object type.  When the operation is successful, the
+   server will return to the client an attribute mask signifying which
+   attributes were successfully set for the object.
+
+   If createattrs includes neither the owner attribute nor an ACL with
+   an ACE for the owner, and if the server's filesystem both supports
+   and requires an owner attribute (or an owner ACE) then the server
+   MUST derive the owner (or the owner ACE).  This would typically be
+   from the principal indicated in the RPC credentials of the call, but
+   the server's operating environment or filesystem semantics may
+   dictate other methods of derivation.  Similarly, if createattrs
+   includes neither the group attribute nor a group ACE, and if the
+   server's filesystem both supports and requires the notion of a group
+   attribute (or group ACE), the server MUST derive the group attribute
+
+
+
+Shepler, et al.             Standards Track                   [Page 148]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   (or the corresponding group ACE) for the file.  This could be from
+   the RPC call's credentials, such as the group principal if the
+   credentials include it (such as with AUTH_SYS), from the group
+   identifier associated with the principal in the credentials (e.g.,
+   POSIX systems have a passwd database that has the group identifier
+   for every user identifier), inherited from the directory the object
+   is created in, or whatever else the server's operating environment or
+   filesystem semantics dictate.  This applies to the OPEN operation
+   too.
+
+   Conversely, it is possible that the client will specify in
+   createattrs an owner attribute, group attribute, or ACL that the
+   principal indicated in the RPC call's credentials does not have
+   permission to create files for.  The error to be returned in this
+   instance is NFS4ERR_PERM.  This applies to the OPEN operation too.
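+
+   For example (an illustrative sequence; the names are hypothetical), a
+   symbolic link could be created and its filehandle returned with:
+
+      PUTFH  (directory filehandle)
+      CREATE (NF4LNK, "latest", linkdata = "releases/v2", createattrs)
+      GETFH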
+ +   IMPLEMENTATION + +   If the client desires to set attribute values after the create, a +   SETATTR operation can be added to the COMPOUND request so that the +   appropriate attributes will be set. + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_ATTRNOTSUPP +      NFS4ERR_BADCHAR +      NFS4ERR_BADHANDLE +      NFS4ERR_BADNAME +      NFS4ERR_BADOWNER +      NFS4ERR_BADTYPE +      NFS4ERR_BADXDR +      NFS4ERR_DELAY +      NFS4ERR_DQUOT +      NFS4ERR_EXIST +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_IO +      NFS4ERR_MOVED +      NFS4ERR_NAMETOOLONG +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOSPC +      NFS4ERR_NOTDIR +      NFS4ERR_PERM +      NFS4ERR_RESOURCE +      NFS4ERR_ROFS +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + + + +Shepler, et al.             Standards Track                   [Page 149] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +14.2.5.  Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery + +   SYNOPSIS + +     clientid -> + +   ARGUMENT + +     struct DELEGPURGE4args { +             clientid4       clientid; +     }; + +   RESULT + +     struct DELEGPURGE4res { +             nfsstat4        status; +     }; + +   DESCRIPTION + +   Purges all of the delegations awaiting recovery for a given client. +   This is useful for clients which do not commit delegation information +   to stable storage to indicate that conflicting requests need not be +   delayed by the server awaiting recovery of delegation information. + +   This operation should be used by clients that record delegation +   information on stable storage on the client.  In this case, +   DELEGPURGE should be issued immediately after doing delegation +   recovery on all delegations known to the client.  Doing so will +   notify the server that no additional delegations for the client will +   be recovered allowing it to free resources, and avoid delaying other +   clients who make requests that conflict with the unrecovered +   delegations.  The set of delegations known to the server and the +   client may be different.  The reason for this is that a client may +   fail after making a request which resulted in delegation but before +   it received the results and committed them to the client's stable +   storage. + +   The server MAY support DELEGPURGE, but if it does not, it MUST NOT +   support CLAIM_DELEGATE_PREV. + +   ERRORS + +      NFS4ERR_BADXDR +      NFS4ERR_NOTSUPP +      NFS4ERR_LEASE_MOVED +      NFS4ERR_MOVED +      NFS4ERR_RESOURCE + + + +Shepler, et al.             Standards Track                   [Page 150] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE_CLIENTID + +14.2.6.  Operation 8: DELEGRETURN - Return Delegation + +   SYNOPSIS + +     (cfh), stateid -> + +   ARGUMENT + +     struct DELEGRETURN4args { +             /* CURRENT_FH: delegated file */ +             stateid4        stateid; +     }; + +   RESULT + +     struct DELEGRETURN4res { +             nfsstat4        status; +     }; + +   DESCRIPTION + +   Returns the delegation represented by the current filehandle and +   stateid. + +   Delegations may be returned when recalled or voluntarily (i.e., +   before the server has recalled them).  In either case the client must +   properly propagate state changed under the context of the delegation +   to the server before returning the delegation. 
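+
+   For example (an illustrative sequence), a client holding a write
+   delegation that has been recalled might flush its modified data and
+   then return the delegation with:
+
+      PUTFH  (delegated file filehandle)
+      WRITE  (offset, count, FILE_SYNC4, data)
+      DELEGRETURN (delegation stateid)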
+ +   ERRORS + +      NFS4ERR_ADMIN_REVOKED +      NFS4ERR_BAD_STATEID +      NFS4ERR_BADXDR +      NFS4ERR_EXPIRED +      NFS4ERR_INVAL +      NFS4ERR_LEASE_MOVED +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOTSUPP +      NFS4ERR_OLD_STATEID +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE +      NFS4ERR_STALE_STATEID + + + +Shepler, et al.             Standards Track                   [Page 151] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +14.2.7.  Operation 9: GETATTR - Get Attributes + +   SYNOPSIS + +     (cfh), attrbits -> attrbits, attrvals + +   ARGUMENT + +     struct GETATTR4args { +             /* CURRENT_FH: directory or file */ +             bitmap4         attr_request; +     }; + +   RESULT + +     struct GETATTR4resok { +             fattr4          obj_attributes; +     }; + +     union GETATTR4res switch (nfsstat4 status) { +      case NFS4_OK: +              GETATTR4resok  resok4; +      default: +              void; +     }; + +   DESCRIPTION + +   The GETATTR operation will obtain attributes for the filesystem +   object specified by the current filehandle.  The client sets a bit in +   the bitmap argument for each attribute value that it would like the +   server to return.  The server returns an attribute bitmap that +   indicates the attribute values for which it was able to return, +   followed by the attribute values ordered lowest attribute number +   first. + +   The server must return a value for each attribute that the client +   requests if the attribute is supported by the server.  If the server +   does not support an attribute or cannot approximate a useful value +   then it must not return the attribute value and must not set the +   attribute bit in the result bitmap.  The server must return an error +   if it supports an attribute but cannot obtain its value.  In that +   case no attribute values will be returned. + +   All servers must support the mandatory attributes as specified in the +   section "File Attributes". + +   On success, the current filehandle retains its value. + + + +Shepler, et al.             Standards Track                   [Page 152] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   IMPLEMENTATION + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_BADHANDLE +      NFS4ERR_BADXDR +      NFS4ERR_DELAY +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_IO +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + +14.2.8.  Operation 10: GETFH - Get Current Filehandle + +   SYNOPSIS + +     (cfh) -> filehandle + +   ARGUMENT + +     /* CURRENT_FH: */ +     void; + + +   RESULT + +     struct GETFH4resok { +             nfs_fh4         object; +     }; + +     union GETFH4res switch (nfsstat4 status) { +      case NFS4_OK: +             GETFH4resok     resok4; +      default: +             void; +     }; + +   DESCRIPTION + +   This operation returns the current filehandle value. + +   On success, the current filehandle retains its value. + + + + +Shepler, et al.             Standards Track                   [Page 153] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   IMPLEMENTATION + +   Operations that change the current filehandle like LOOKUP or CREATE +   do not automatically return the new filehandle as a result.  
For +   instance, if a client needs to lookup a directory entry and obtain +   its filehandle then the following request is needed. + +      PUTFH  (directory filehandle) +      LOOKUP (entry name) +      GETFH + +   ERRORS + +      NFS4ERR_BADHANDLE +      NFS4ERR_FHEXPIRED +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + +14.2.9.  Operation 11: LINK - Create Link to a File + +   SYNOPSIS + +     (sfh), (cfh), newname -> (cfh), change_info + +   ARGUMENT + +     struct LINK4args { +             /* SAVED_FH: source object */ +             /* CURRENT_FH: target directory */ +             component4      newname; +     }; + +   RESULT + +     struct LINK4resok { +             change_info4    cinfo; +     }; + +     union LINK4res switch (nfsstat4 status) { +      case NFS4_OK: +              LINK4resok resok4; +      default: +              void; +     }; + + + + +Shepler, et al.             Standards Track                   [Page 154] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   DESCRIPTION + +   The LINK operation creates an additional newname for the file +   represented by the saved filehandle, as set by the SAVEFH operation, +   in the directory represented by the current filehandle.  The existing +   file and the target directory must reside within the same filesystem +   on the server.  On success, the current filehandle will continue to +   be the target directory.  If an object exists in the target directory +   with the same name as newname, the server must return NFS4ERR_EXIST. + +   For the target directory, the server returns change_info4 information +   in cinfo.  With the atomic field of the change_info4 struct, the +   server will indicate if the before and after change attributes were +   obtained atomically with respect to the link creation. + +   If the newname has a length of 0 (zero), or if newname does not obey +   the UTF-8 definition, the error NFS4ERR_INVAL will be returned. + +   IMPLEMENTATION + +   Changes to any property of the "hard" linked files are reflected in +   all of the linked files.  When a link is made to a file, the +   attributes for the file should have a value for numlinks that is one +   greater than the value before the LINK operation. + +   The statement "file and the target directory must reside within the +   same filesystem on the server" means that the fsid fields in the +   attributes for the objects are the same. If they reside on different +   filesystems, the error, NFS4ERR_XDEV, is returned.  On some servers, +   the filenames, "." and "..", are illegal as newname. + +   In the case that newname is already linked to the file represented by +   the saved filehandle, the server will return NFS4ERR_EXIST. + +   Note that symbolic links are created with the CREATE operation. + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_BADCHAR +      NFS4ERR_BADHANDLE +      NFS4ERR_BADNAME +      NFS4ERR_BADXDR +      NFS4ERR_DELAY +      NFS4ERR_DQUOT +      NFS4ERR_EXIST +      NFS4ERR_FHEXPIRED +      NFS4ERR_FILE_OPEN + + + +Shepler, et al.             
Standards Track                   [Page 155]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_ISDIR
+      NFS4ERR_MLINK
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NOTDIR
+      NFS4ERR_NOTSUPP
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_WRONGSEC
+      NFS4ERR_XDEV
+
+14.2.10.  Operation 12: LOCK - Create Lock
+
+   SYNOPSIS
+
+     (cfh) locktype, reclaim, offset, length, locker -> stateid
+
+   ARGUMENT
+
+     struct open_to_lock_owner4 {
+             seqid4          open_seqid;
+             stateid4        open_stateid;
+             seqid4          lock_seqid;
+             lock_owner4     lock_owner;
+     };
+
+     struct exist_lock_owner4 {
+             stateid4        lock_stateid;
+             seqid4          lock_seqid;
+     };
+
+     union locker4 switch (bool new_lock_owner) {
+      case TRUE:
+             open_to_lock_owner4     open_owner;
+      case FALSE:
+             exist_lock_owner4       lock_owner;
+     };
+
+     enum nfs_lock_type4 {
+             READ_LT         = 1,
+             WRITE_LT        = 2,
+
+
+
+Shepler, et al.             Standards Track                   [Page 156]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+             READW_LT        = 3,    /* blocking read */
+             WRITEW_LT       = 4     /* blocking write */
+     };
+
+     struct LOCK4args {
+             /* CURRENT_FH: file */
+             nfs_lock_type4  locktype;
+             bool            reclaim;
+             offset4         offset;
+             length4         length;
+             locker4         locker;
+     };
+
+   RESULT
+
+     struct LOCK4denied {
+             offset4         offset;
+             length4         length;
+             nfs_lock_type4  locktype;
+             lock_owner4     owner;
+     };
+
+     struct LOCK4resok {
+             stateid4        lock_stateid;
+     };
+
+     union LOCK4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              LOCK4resok     resok4;
+      case NFS4ERR_DENIED:
+              LOCK4denied    denied;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The LOCK operation requests a record lock for the byte range
+   specified by the offset and length parameters.  The lock type is also
+   specified to be one of the nfs_lock_type4s.  If this is a reclaim
+   request, the reclaim parameter will be TRUE.
+
+   Bytes in a file may be locked even if those bytes are not currently
+   allocated to the file.  To lock the file from a specific offset
+   through the end-of-file (no matter how long the file actually is) use
+   a length field with all bits set to 1 (one).  If the length is zero,
+
+
+
+Shepler, et al.             Standards Track                   [Page 157]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   or if a length which is not all bits set to one is specified and the
+   length when added to the offset exceeds the maximum 64-bit unsigned
+   integer value, the error NFS4ERR_INVAL will result.
+
+   Some servers may only support locking for byte offsets that fit
+   within 32 bits.
+   If the client specifies a range that includes a byte beyond the
+   last byte offset of the 32-bit range, but does not include the last
+   byte offset of the 32-bit range and all of the byte offsets beyond
+   it, up to the end of the valid 64-bit range, such a 32-bit server
+   MUST return the error NFS4ERR_BAD_RANGE.
+
+   In the case that the lock is denied, the owner, offset, and length of
+   a conflicting lock are returned.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   If the server is unable to determine the exact offset and length of
+   the conflicting lock, the same offset and length that were provided
+   in the arguments should be returned in the denied results.  The File
+   Locking section contains a full description of this and the other
+   file locking operations.
+
+   LOCK operations are subject to permission checks and to checks
+   against the access type of the associated file.  However, the
+   specific rights and modes required for various types of locks reflect
+   the semantics of the server-exported filesystem, and are not
+   specified by the protocol.  For example, Windows 2000 allows a write
+   lock of a file open for READ, while a POSIX-compliant system does
+   not.
+
+   When the client makes a lock request that corresponds to a range that
+   the lockowner has locked already (with the same or different lock
+   type), or to a sub-region of such a range, or to a region which
+   includes multiple locks already granted to that lockowner, in whole
+   or in part, and the server does not support such locking operations
+   (i.e., does not support POSIX locking semantics), the server will
+   return the error NFS4ERR_LOCK_RANGE.  In that case, the client may
+   return an error, or it may emulate the required operations, using
+   only LOCK for ranges that do not include any bytes already locked by
+   that lock_owner and LOCKU of locks held by that lock_owner
+   (specifying an exactly-matching range and type).  Similarly, when the
+   client makes a lock request that amounts to upgrading (changing from
+   a read lock to a write lock) or downgrading (changing from write lock
+   to a read lock) an existing record lock, and the server does not
+
+
+
+Shepler, et al.             Standards Track                   [Page 158]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   support such a lock, the server will return NFS4ERR_LOCK_NOTSUPP.
+   Such operations may not perfectly reflect the required semantics in
+   the face of conflicting lock requests from other clients.
+
+   The locker argument specifies the lock_owner that is associated with
+   the LOCK request.  The locker4 structure is a switched union that
+   indicates whether the lock_owner is known to the server or if the
+   lock_owner is new to the server.  In the case that the lock_owner is
+   known to the server and has an established lock_seqid, the argument
+   is just the lock_owner and lock_seqid.  In the case that the
+   lock_owner is not known to the server, the argument contains not only
+   the lock_owner and lock_seqid but also the open_stateid and
+   open_seqid.  The new lock_owner case covers the very first lock done
+   by the lock_owner and offers a method to use the established state of
+   the open_stateid to transition to the use of the lock_owner.
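+
+   For example (an illustrative sequence), the very first lock by a new
+   lock_owner on a freshly opened file might be requested as:
+
+      PUTFH  (file filehandle)
+      LOCK   (WRITE_LT, reclaim = FALSE, offset, length,
+              new_lock_owner = TRUE: open_seqid, open_stateid,
+              lock_seqid, lock_owner)
+
+   Subsequent LOCK requests by the same lock_owner would instead set
+   new_lock_owner to FALSE and supply the lock_stateid returned by the
+   first LOCK together with the incremented lock_seqid.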
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_RANGE
+      NFS4ERR_BAD_SEQID
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DEADLOCK
+      NFS4ERR_DELAY
+      NFS4ERR_DENIED
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCK_NOTSUPP
+      NFS4ERR_LOCK_RANGE
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NO_GRACE
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_OPENMODE
+      NFS4ERR_RECLAIM_BAD
+      NFS4ERR_RECLAIM_CONFLICT
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_CLIENTID
+      NFS4ERR_STALE_STATEID
+
+
+
+Shepler, et al.             Standards Track                   [Page 159]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.11.  Operation 13: LOCKT - Test For Lock
+
+   SYNOPSIS
+
+     (cfh) locktype, offset, length, owner -> {void, NFS4ERR_DENIED ->
+     owner}
+
+   ARGUMENT
+
+     struct LOCKT4args {
+             /* CURRENT_FH: file */
+             nfs_lock_type4  locktype;
+             offset4         offset;
+             length4         length;
+             lock_owner4     owner;
+     };
+
+   RESULT
+
+     struct LOCK4denied {
+             offset4         offset;
+             length4         length;
+             nfs_lock_type4  locktype;
+             lock_owner4     owner;
+     };
+
+     union LOCKT4res switch (nfsstat4 status) {
+      case NFS4ERR_DENIED:
+              LOCK4denied    denied;
+      case NFS4_OK:
+              void;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The LOCKT operation tests the lock as specified in the arguments.  If
+   a conflicting lock exists, the owner, offset, length, and type of the
+   conflicting lock are returned; if no lock is held, nothing other than
+   NFS4_OK is returned.  Lock types READ_LT and READW_LT are processed
+   in the same way in that a conflicting lock test is done without
+   regard to blocking or non-blocking.  The same is true for WRITE_LT
+   and WRITEW_LT.
+
+   The ranges are specified as for LOCK.  The NFS4ERR_INVAL and
+   NFS4ERR_BAD_RANGE errors are returned under the same circumstances as
+   for LOCK.
+
+
+
+Shepler, et al.             Standards Track                   [Page 160]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   If the server is unable to determine the exact offset and length of
+   the conflicting lock, the same offset and length that were provided
+   in the arguments should be returned in the denied results.  The File
+   Locking section contains further discussion of the file locking
+   mechanisms.
+
+   LOCKT uses a lock_owner4 rather than a stateid4, as is used in LOCK,
+   to identify the owner.  This is because the client does not have to
+   open the file to test for the existence of a lock, so a stateid may
+   not be available.
+
+   The test for conflicting locks should exclude locks for the current
+   lockowner.  Note that since such locks are not examined, the possible
+   existence of overlapping ranges may not affect the results of LOCKT.
+   If the server does examine locks that match the lockowner for the
+   purpose of range checking, NFS4ERR_LOCK_RANGE may be returned.
+   In the event that it returns NFS4_OK, clients may do a LOCK and
+   receive NFS4ERR_LOCK_RANGE on the LOCK request because of the
+   flexibility provided to the server.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_RANGE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DENIED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCK_RANGE
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_CLIENTID
+
+
+
+Shepler, et al.             Standards Track                   [Page 161]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.12.  Operation 14: LOCKU - Unlock File
+
+   SYNOPSIS
+
+     (cfh) type, seqid, stateid, offset, length -> stateid
+
+   ARGUMENT
+
+     struct LOCKU4args {
+             /* CURRENT_FH: file */
+             nfs_lock_type4  locktype;
+             seqid4          seqid;
+             stateid4        stateid;
+             offset4         offset;
+             length4         length;
+     };
+
+   RESULT
+
+     union LOCKU4res switch (nfsstat4 status) {
+      case   NFS4_OK:
+              stateid4       stateid;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The LOCKU operation unlocks the record lock specified by the
+   parameters.  The client may set the locktype field to any value that
+   is legal for the nfs_lock_type4 enumerated type, and the server MUST
+   accept any legal value for locktype.  Any legal value for locktype
+   has no effect on the success or failure of the LOCKU operation.
+
+   The ranges are specified as for LOCK.  The NFS4ERR_INVAL and
+   NFS4ERR_BAD_RANGE errors are returned under the same circumstances as
+   for LOCK.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   If the area to be unlocked does not correspond exactly to a lock
+   actually held by the lockowner, the server may return the error
+   NFS4ERR_LOCK_RANGE.  This includes the case in which the area is not
+   locked, where the area is a sub-range of the area locked, where it
+   overlaps the area locked without matching exactly, or where the area
+   specified includes multiple locks held by the lockowner.  In all of
+
+
+
+Shepler, et al.             Standards Track                   [Page 162]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   these cases, allowed by POSIX locking semantics, a client receiving
+   this error should, if it desires support for such operations,
+   simulate the operation using LOCKU on ranges corresponding to locks
+   it actually holds, possibly followed by LOCK requests for the sub-
+   ranges not being unlocked.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_RANGE
+      NFS4ERR_BAD_SEQID
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCK_RANGE
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.13.  
Operation 15: LOOKUP - Lookup Filename + +   SYNOPSIS + +     (cfh), component -> (cfh) + +   ARGUMENT + +     struct LOOKUP4args { +             /* CURRENT_FH: directory */ +             component4      objname; +     }; + + + + + + + + + +Shepler, et al.             Standards Track                   [Page 163] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   RESULT + +     struct LOOKUP4res { +             /* CURRENT_FH: object */ +             nfsstat4        status; +     }; + +   DESCRIPTION + +   This operation LOOKUPs or finds a filesystem object using the +   directory specified by the current filehandle.  LOOKUP evaluates the +   component and if the object exists the current filehandle is replaced +   with the component's filehandle. + +   If the component cannot be evaluated either because it does not exist +   or because the client does not have permission to evaluate the +   component, then an error will be returned and the current filehandle +   will be unchanged. + +   If the component is a zero length string or if any component does not +   obey the UTF-8 definition, the error NFS4ERR_INVAL will be returned. + +   IMPLEMENTATION + +   If the client wants to achieve the effect of a multi-component +   lookup, it may construct a COMPOUND request such as (and obtain each +   filehandle): + +      PUTFH  (directory filehandle) +      LOOKUP "pub" +      GETFH +      LOOKUP "foo" +      GETFH +      LOOKUP "bar" +      GETFH + +   NFS version 4 servers depart from the semantics of previous NFS +   versions in allowing LOOKUP requests to cross mountpoints on the +   server.  The client can detect a mountpoint crossing by comparing the +   fsid attribute of the directory with the fsid attribute of the +   directory looked up.  If the fsids are different then the new +   directory is a server mountpoint.  UNIX clients that detect a +   mountpoint crossing will need to mount the server's filesystem.  This +   needs to be done to maintain the file object identity checking +   mechanisms common to UNIX clients. + + + + + + +Shepler, et al.             Standards Track                   [Page 164] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   Servers that limit NFS access to "shares" or "exported" filesystems +   should provide a pseudo-filesystem into which the exported +   filesystems can be integrated, so that clients can browse the +   server's name space.  The clients' view of a pseudo filesystem will +   be limited to paths that lead to exported filesystems. + +   Note: previous versions of the protocol assigned special semantics to +   the names "." and "..".  NFS version 4 assigns no special semantics +   to these names.  The LOOKUPP operator must be used to lookup a parent +   directory. + +   Note that this operation does not follow symbolic links.  The client +   is responsible for all parsing of filenames including filenames that +   are modified by symbolic links encountered during the lookup process. + +   If the current filehandle supplied is not a directory but a symbolic +   link, the error NFS4ERR_SYMLINK is returned as the error.  For all +   other non-directory file types, the error NFS4ERR_NOTDIR is returned. 
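+
+   For example (an illustrative sequence; the name "subdir" is
+   hypothetical), a client can detect a mountpoint crossing by fetching
+   the fsid attribute on both sides of the LOOKUP:
+
+      PUTFH   (directory filehandle)
+      GETATTR (fsid)
+      LOOKUP  "subdir"
+      GETATTR (fsid)
+
+   If the two fsid values differ, the directory looked up is the root
+   of a server mountpoint.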
+ +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_BADCHAR +      NFS4ERR_BADHANDLE +      NFS4ERR_BADNAME +      NFS4ERR_BADXDR +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_IO +      NFS4ERR_MOVED +      NFS4ERR_NAMETOOLONG +      NFS4ERR_NOENT +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOTDIR +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE +      NFS4ERR_SYMLINK +      NFS4ERR_WRONGSEC + +14.2.14.  Operation 16: LOOKUPP - Lookup Parent Directory + +   SYNOPSIS + +     (cfh) -> (cfh) + + + + + + +Shepler, et al.             Standards Track                   [Page 165] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   ARGUMENT + +     /* CURRENT_FH: object */ +     void; + +   RESULT + +     struct LOOKUPP4res { +             /* CURRENT_FH: directory */ +             nfsstat4        status; +     }; + +   DESCRIPTION + +   The current filehandle is assumed to refer to a regular directory +   or a named attribute directory.  LOOKUPP assigns the filehandle for +   its parent directory to be the current filehandle.  If there is no +   parent directory an NFS4ERR_NOENT error must be returned. +   Therefore, NFS4ERR_NOENT will be returned by the server when the +   current filehandle is at the root or top of the server's file tree. + +   IMPLEMENTATION + +   As for LOOKUP, LOOKUPP will also cross mountpoints. + +   If the current filehandle is not a directory or named attribute +   directory, the error NFS4ERR_NOTDIR is returned. + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_BADHANDLE +      NFS4ERR_FHEXPIRED +      NFS4ERR_IO +      NFS4ERR_MOVED +      NFS4ERR_NOENT +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOTDIR +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + +14.2.15.  Operation 17: NVERIFY - Verify Difference in Attributes + +   SYNOPSIS + +     (cfh), fattr -> - + + + + +Shepler, et al.             Standards Track                   [Page 166] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   ARGUMENT + +     struct NVERIFY4args { +             /* CURRENT_FH: object */ +             fattr4          obj_attributes; +     }; + +   RESULT + +     struct NVERIFY4res { +             nfsstat4        status; +     }; + +   DESCRIPTION + +   This operation is used to prefix a sequence of operations to be +   performed if one or more attributes have changed on some filesystem +   object.  If all the attributes match then the error NFS4ERR_SAME must +   be returned. + +   On success, the current filehandle retains its value. + +   IMPLEMENTATION + +   This operation is useful as a cache validation operator.  If the +   object to which the attributes belong has changed then the following +   operations may obtain new data associated with that object.  For +   instance, to check if a file has been changed and obtain new data if +   it has: + +      PUTFH  (public) +      LOOKUP "foobar" +      NVERIFY attrbits attrs +      READ 0 32767 + +   In the case that a recommended attribute is specified in the NVERIFY +   operation and the server does not support that attribute for the +   filesystem object, the error NFS4ERR_ATTRNOTSUPP is returned to the +   client. + +   When the attribute rdattr_error or any write-only attribute (e.g., +   time_modify_set) is specified, the error NFS4ERR_INVAL is returned to +   the client. + + + + + + + + +Shepler, et al.             
Standards Track                   [Page 167] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_ATTRNOTSUPP +      NFS4ERR_BADCHAR +      NFS4ERR_BADHANDLE +      NFS4ERR_BADXDR +      NFS4ERR_DELAY +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_IO +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_RESOURCE +      NFS4ERR_SAME +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + +14.2.16.  Operation 18: OPEN - Open a Regular File + +   SYNOPSIS + +     (cfh), seqid, share_access, share_deny, owner, openhow, claim -> +     (cfh), stateid, cinfo, rflags, open_confirm, attrset delegation + +   ARGUMENT + +     struct OPEN4args { +             seqid4          seqid; +             uint32_t        share_access; +             uint32_t        share_deny; +             open_owner4     owner; +             openflag4       openhow; +             open_claim4     claim; +     }; + +     enum createmode4 { +             UNCHECKED4      = 0, +             GUARDED4        = 1, +             EXCLUSIVE4      = 2 +     }; + +     union createhow4 switch (createmode4 mode) { +      case UNCHECKED4: +      case GUARDED4: +              fattr4         createattrs; +      case EXCLUSIVE4: +              verifier4      createverf; + + + +Shepler, et al.             Standards Track                   [Page 168] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +     }; + +     enum opentype4 { +             OPEN4_NOCREATE  = 0, +             OPEN4_CREATE    = 1 +     }; + +     union openflag4 switch (opentype4 opentype) { +      case OPEN4_CREATE: +              createhow4     how; +      default: +              void; +     }; + +     /* Next definitions used for OPEN delegation */ +     enum limit_by4 { +             NFS_LIMIT_SIZE          = 1, +             NFS_LIMIT_BLOCKS        = 2 +             /* others as needed */ +     }; + +     struct nfs_modified_limit4 { +             uint32_t        num_blocks; +             uint32_t        bytes_per_block; +     }; + +     union nfs_space_limit4 switch (limit_by4 limitby) { +      /* limit specified as file size */ +      case NFS_LIMIT_SIZE: +              uint64_t               filesize; +      /* limit specified by number of blocks */ +      case NFS_LIMIT_BLOCKS: +              nfs_modified_limit4    mod_blocks; +     } ; + +     enum open_delegation_type4 { +             OPEN_DELEGATE_NONE      = 0, +             OPEN_DELEGATE_READ      = 1, +             OPEN_DELEGATE_WRITE     = 2 +     }; + +     enum open_claim_type4 { +             CLAIM_NULL              = 0, +             CLAIM_PREVIOUS          = 1, +             CLAIM_DELEGATE_CUR      = 2, +             CLAIM_DELEGATE_PREV     = 3 +     }; + + + + +Shepler, et al.             Standards Track                   [Page 169] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +     struct open_claim_delegate_cur4 { +             stateid4        delegate_stateid; +             component4      file; +     }; + +     union open_claim4 switch (open_claim_type4 claim) { +      /* +       * No special rights to file. Ordinary OPEN of the specified file. +       */ +      case CLAIM_NULL: +              /* CURRENT_FH: directory */ +              component4     file; + +      /* +       * Right to the file established by an open previous to server +       * reboot.  File identified by filehandle obtained at that time +       * rather than by name. 
+       */ +      case CLAIM_PREVIOUS: +              /* CURRENT_FH: file being reclaimed */ +              open_delegation_type4   delegate_type; + +      /* +       * Right to file based on a delegation granted by the server. +       * File is specified by name. +       */ +      case CLAIM_DELEGATE_CUR: +              /* CURRENT_FH: directory */ +              open_claim_delegate_cur4       delegate_cur_info; + +      /* Right to file based on a delegation granted to a previous boot +       * instance of the client.  File is specified by name. +       */ +      case CLAIM_DELEGATE_PREV: +              /* CURRENT_FH: directory */ +              component4     file_delegate_prev; +     }; + +   RESULT + +   struct open_read_delegation4 { +           stateid4        stateid;        /* Stateid for delegation*/ +           bool            recall;         /* Pre-recalled flag for +                                              delegations obtained +                                              by reclaim +                                              (CLAIM_PREVIOUS) */ +           nfsace4         permissions;    /* Defines users who don't +                                              need an ACCESS call to + + + +Shepler, et al.             Standards Track                   [Page 170] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +                                              open for read */ +   }; + +   struct open_write_delegation4 { +           stateid4        stateid;        /* Stateid for delegation*/ +           bool            recall;         /* Pre-recalled flag for +                                              delegations obtained +                                              by reclaim +                                              (CLAIM_PREVIOUS) */ +           nfs_space_limit4 space_limit;   /* Defines condition that +                                              the client must check to +                                              determine whether the +                                              file needs to be flushed +                                              to the server on close. +                                              */ +           nfsace4         permissions;    /* Defines users who don't +                                              need an ACCESS call as +                                              part of a delegated +                                              open. */ +   }; + +   union open_delegation4 +   switch (open_delegation_type4 delegation_type) { +           case OPEN_DELEGATE_NONE: +                   void; +           case OPEN_DELEGATE_READ: +                   open_read_delegation4 read; +           case OPEN_DELEGATE_WRITE: +                   open_write_delegation4 write; +   }; + +   const OPEN4_RESULT_CONFIRM      = 0x00000002; +   const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004; + +   struct OPEN4resok { +           stateid4        stateid;        /* Stateid for open */ +           change_info4    cinfo;          /* Directory Change Info */ +           uint32_t        rflags;         /* Result flags */ +           bitmap4         attrset;        /* attributes on create */ +           open_delegation4 delegation;    /* Info on any open +                                              delegation */ +   }; + +   union OPEN4res switch (nfsstat4 status) { +    case NFS4_OK: +           /* CURRENT_FH: opened file */ +           OPEN4resok      resok4; +    default: + + + +Shepler, et al.            
Standards Track                   [Page 171]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           void;
+   };
+
+   WARNING TO CLIENT IMPLEMENTORS
+
+   OPEN resembles LOOKUP in that it generates a filehandle for the
+   client to use.  Unlike LOOKUP, though, OPEN creates server state on
+   the filehandle.  In normal circumstances, the client can only release
+   this state with a CLOSE operation.  CLOSE uses the current filehandle
+   to determine which file to close.  Therefore the client MUST follow
+   every OPEN operation with a GETFH operation in the same COMPOUND
+   procedure.  This will supply the client with the filehandle such that
+   CLOSE can be used appropriately.
+
+   Simply waiting for the lease on the file to expire is insufficient
+   because the server may maintain the state indefinitely as long as
+   another client does not attempt to make a conflicting access to the
+   same file.
+
+   DESCRIPTION
+
+   The OPEN operation creates and/or opens a regular file in a directory
+   with the provided name.  If the file does not exist at the server and
+   creation is desired, specification of the method of creation is
+   provided by the openhow parameter.  The client has the choice of
+   three creation methods: UNCHECKED, GUARDED, or EXCLUSIVE.
+
+   If the current filehandle is a named attribute directory, OPEN will
+   then create or open a named attribute file.  Note that exclusive
+   create of a named attribute is not supported.  If the createmode is
+   EXCLUSIVE4 and the current filehandle is a named attribute directory,
+   the server will return NFS4ERR_INVAL.
+
+   UNCHECKED means that the file should be created if a file of that
+   name does not exist and encountering an existing regular file of that
+   name is not an error.  For this type of create, createattrs specifies
+   the initial set of attributes for the file.  The set of attributes
+   may include any writable attribute valid for regular files.  When an
+   UNCHECKED create encounters an existing file, the attributes
+   specified by createattrs are not used, except that when a size of
+   zero is specified, the existing file is truncated.  If GUARDED is
+   specified, the server checks for the presence of a duplicate object
+   by name before performing the create.  If a duplicate exists, an
+   error of NFS4ERR_EXIST is returned as the status.  If the object does
+   not exist, the request is performed as described for UNCHECKED.  For
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 172]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   each of these cases (UNCHECKED and GUARDED) where the operation is
+   successful, the server will return to the client an attribute mask
+   signifying which attributes were successfully set for the object.
+
+   EXCLUSIVE specifies that the server is to follow exclusive creation
+   semantics, using the verifier to ensure exclusive creation of the
+   target.  The server should check for the presence of a duplicate
+   object by name.  If the object does not exist, the server creates the
+   object and stores the verifier with the object.  If the object does
+   exist and the stored verifier matches the client provided verifier,
+   the server uses the existing object as the newly created object.  If
+   the stored verifier does not match, then an error of NFS4ERR_EXIST is
+   returned.  
No attributes may be provided in this case, since the
+   server may use an attribute of the target object to store the
+   verifier.  If the server uses an attribute to store the exclusive
+   create verifier, it will signify which attribute by setting the
+   appropriate bit in the attribute mask that is returned in the
+   results.
+
+   For the target directory, the server returns change_info4 information
+   in cinfo.  With the atomic field of the change_info4 struct, the
+   server will indicate if the before and after change attributes were
+   obtained atomically with respect to the file creation.
+
+   Upon successful creation, the current filehandle is replaced by that
+   of the new object.
+
+   The OPEN operation provides for Windows share reservation capability
+   with the use of the share_access and share_deny fields of the OPEN
+   arguments.  The client specifies at OPEN the required share_access
+   and share_deny modes.  For clients that do not directly support
+   SHAREs (i.e., UNIX), the expected deny value is DENY_NONE.  In the
+   case that there is an existing SHARE reservation that conflicts with
+   the OPEN request, the server returns the error NFS4ERR_SHARE_DENIED.
+   For a complete SHARE request, the client must provide values for the
+   owner and seqid fields for the OPEN argument.  For additional
+   discussion of SHARE semantics see the section on 'Share
+   Reservations'.
+
+   In the case that the client is recovering state from a server
+   failure, the claim field of the OPEN argument is used to signify that
+   the request is meant to reclaim state previously held.
+
+   The "claim" field of the OPEN argument is used to specify the file to
+   be opened and the state information which the client claims to
+   possess.  There are four basic claim types which cover the various
+   situations for an OPEN.  They are as follows:
+
+
+
+Shepler, et al.             Standards Track                   [Page 173]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   CLAIM_NULL
+                         For the client, this is a new OPEN
+                         request and there is no previous state
+                         associated with the file for the client.
+
+   CLAIM_PREVIOUS
+                         The client is claiming basic OPEN state
+                         for a file that was held previous to a
+                         server reboot.  Generally used when a
+                         server is returning persistent
+                         filehandles; the client may not have the
+                         file name to reclaim the OPEN.
+
+   CLAIM_DELEGATE_CUR
+                         The client is claiming a delegation for
+                         OPEN as granted by the server.
+                         Generally this is done as part of
+                         recalling a delegation.
+
+   CLAIM_DELEGATE_PREV
+                         The client is claiming a delegation
+                         granted to a previous client instance;
+                         used after the client reboots.  The
+                         server MAY support CLAIM_DELEGATE_PREV.
+                         If it does support CLAIM_DELEGATE_PREV,
+                         SETCLIENTID_CONFIRM MUST NOT remove the
+                         client's delegation state, and the
+                         server MUST support the DELEGPURGE
+                         operation.
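+
+   As a rough illustration of how these claim types partition the OPEN
+   cases, a client might select among them as sketched below.  The enum
+   mirrors open_claim_type4 above; the decision inputs are hypothetical
+   client state, which a real client tracks per file and per server.
+
+     typedef enum {
+             CLAIM_NULL          = 0,
+             CLAIM_PREVIOUS      = 1,
+             CLAIM_DELEGATE_CUR  = 2,
+             CLAIM_DELEGATE_PREV = 3
+     } open_claim_type4;
+
+     static open_claim_type4
+     choose_open_claim(int reclaim_after_server_reboot,
+                       int reclaim_after_client_reboot,
+                       int holds_delegation)
+     {
+             if (reclaim_after_server_reboot)
+                     return CLAIM_PREVIOUS;      /* reclaim by filehandle
+                                                    in the grace period */
+             if (reclaim_after_client_reboot)
+                     return CLAIM_DELEGATE_PREV; /* delegation held by a
+                                                    previous client
+                                                    instance */
+             if (holds_delegation)
+                     return CLAIM_DELEGATE_CUR;  /* open under a current
+                                                    delegation, e.g., on
+                                                    recall */
+             return CLAIM_NULL;                  /* ordinary open by name */
+     }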
+
+   For OPEN requests whose claim type is other than CLAIM_PREVIOUS
+   (i.e., requests other than those devoted to reclaiming opens after a
+   server reboot) that reach the server during its grace or lease
+   expiration period, the server returns an error of NFS4ERR_GRACE.
+
+   For any OPEN request, the server may return an open delegation, which
+   allows further opens and closes to be handled locally on the client
+   as described in the section Open Delegation.  Note that delegation is
+   up to the server to decide.  The client should never assume that
+   delegation will or will not be granted in a particular instance.  It
+   should always be prepared for either case.  A partial exception is
+   the reclaim (CLAIM_PREVIOUS) case, in which a delegation type is
+   claimed.  In this case, delegation will always be granted, although
+   the server may specify an immediate recall in the delegation
+   structure.
+
+   The rflags returned by a successful OPEN allow the server to return
+   information governing how the open file is to be handled.
+
+
+
+Shepler, et al.             Standards Track                   [Page 174]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   OPEN4_RESULT_CONFIRM indicates that the client MUST execute an
+   OPEN_CONFIRM operation before using the open file.
+   OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking
+   behavior supports the complete set of POSIX locking techniques.  From
+   this, the client can choose how to manage its file locking state so
+   as to handle any mismatch in file locking management.
+
+   If the component is of zero length, NFS4ERR_INVAL will be returned.
+   The component is also subject to the normal UTF-8, character support,
+   and name checks.  See the section "UTF-8 Related Errors" for further
+   discussion.
+
+   When an OPEN is done and the specified lockowner already has the
+   resulting filehandle open, the result is to "OR" together the new
+   share and deny status with the existing status.  In this
+   case, only a single CLOSE need be done, even though multiple OPENs
+   were completed.  When such an OPEN is done, checking of share
+   reservations for the new OPEN proceeds normally, with no exception
+   for the existing OPEN held by the same lockowner.
+
+   If the underlying filesystem at the server is only accessible in a
+   read-only mode and the OPEN request has specified ACCESS_WRITE or
+   ACCESS_BOTH, the server will return NFS4ERR_ROFS to indicate a read-
+   only filesystem.
+
+   As with the CREATE operation, the server MUST derive the owner, owner
+   ACE, group, or group ACE if any of the four attributes are required
+   and supported by the server's filesystem.  For an OPEN with the
+   EXCLUSIVE4 createmode, the server has no choice, since such OPEN
+   calls do not include the createattrs field.  Conversely, if
+   createattrs is specified, and includes owner or group (or
+   corresponding ACEs) that the principal in the RPC call's credentials
+   does not have authorization to create files for, then the server may
+   return NFS4ERR_PERM.
+
+   In the case of an OPEN which specifies a size of zero (e.g.,
+   truncation) and the file has named attributes, the named attributes
+   are left as is.  They are not removed.
+
+   IMPLEMENTATION
+
+   The OPEN operation contains support for EXCLUSIVE create.  The
+   mechanism is similar to the support in NFS version 3 [RFC1813].  
As +   in NFS version 3, this mechanism provides reliable exclusive +   creation.  Exclusive create is invoked when the how parameter is +   EXCLUSIVE.  In this case, the client provides a verifier that can +   reasonably be expected to be unique.  A combination of a client + + + + +Shepler, et al.             Standards Track                   [Page 175] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   identifier, perhaps the client network address, and a unique number +   generated by the client, perhaps the RPC transaction identifier, may +   be appropriate. + +   If the object does not exist, the server creates the object and +   stores the verifier in stable storage. For filesystems that do not +   provide a mechanism for the storage of arbitrary file attributes, the +   server may use one or more elements of the object meta-data to store +   the verifier. The verifier must be stored in stable storage to +   prevent erroneous failure on retransmission of the request. It is +   assumed that an exclusive create is being performed because exclusive +   semantics are critical to the application. Because of the expected +   usage, exclusive CREATE does not rely solely on the normally volatile +   duplicate request cache for storage of the verifier. The duplicate +   request cache in volatile storage does not survive a crash and may +   actually flush on a long network partition, opening failure windows. +   In the UNIX local filesystem environment, the expected storage +   location for the verifier on creation is the meta-data (time stamps) +   of the object. For this reason, an exclusive object create may not +   include initial attributes because the server would have nowhere to +   store the verifier. + +   If the server can not support these exclusive create semantics, +   possibly because of the requirement to commit the verifier to stable +   storage, it should fail the OPEN request with the error, +   NFS4ERR_NOTSUPP. + +   During an exclusive CREATE request, if the object already exists, the +   server reconstructs the object's verifier and compares it with the +   verifier in the request. If they match, the server treats the request +   as a success. The request is presumed to be a duplicate of an +   earlier, successful request for which the reply was lost and that the +   server duplicate request cache mechanism did not detect.  If the +   verifiers do not match, the request is rejected with the status, +   NFS4ERR_EXIST. + +   Once the client has performed a successful exclusive create, it must +   issue a SETATTR to set the correct object attributes.  Until it does +   so, it should not rely upon any of the object attributes, since the +   server implementation may need to overload object meta-data to store +   the verifier.  The subsequent SETATTR must not occur in the same +   COMPOUND request as the OPEN.  This separation will guarantee that +   the exclusive create mechanism will continue to function properly in +   the face of retransmission of the request. + +   Use of the GUARDED attribute does not provide exactly-once semantics. +   In particular, if a reply is lost and the server does not detect the +   retransmission of the request, the operation can fail with + + + +Shepler, et al.             Standards Track                   [Page 176] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   NFS4ERR_EXIST, even though the create was performed successfully. 
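+
+   As a concrete, purely illustrative sketch of the verifier handling
+   described above, a client might construct its verifier from its
+   network address and an RPC transaction id, and a server might detect
+   a retransmission by comparing verifiers.  The helper names here are
+   assumptions, not protocol elements.
+
+     #include <stdbool.h>
+     #include <stdint.h>
+     #include <string.h>
+
+     #define NFS4_VERIFIER_SIZE 8
+     typedef struct { char data[NFS4_VERIFIER_SIZE]; } verifier4;
+
+     /* Client side: build a verifier that can reasonably be expected
+        to be unique, per the guidance above. */
+     static verifier4
+     make_create_verifier(uint32_t client_addr, uint32_t rpc_xid)
+     {
+             verifier4 v;
+             memcpy(v.data, &client_addr, sizeof client_addr);
+             memcpy(v.data + 4, &rpc_xid, sizeof rpc_xid);
+             return v;
+     }
+
+     /* Server side: the object already exists; a matching stored
+        verifier means this is a replay of an earlier, successful
+        create and is treated as success, while a mismatch yields
+        NFS4ERR_EXIST. */
+     static bool
+     exclusive_create_is_replay(const verifier4 *stored,
+                                const verifier4 *request)
+     {
+             return memcmp(stored->data, request->data,
+                           NFS4_VERIFIER_SIZE) == 0;
+     }
+
+   GUARDED, by contrast, performs only the existence check and so can
+   fail with NFS4ERR_EXIST on a retransmission, as noted above.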
+
+   A client would use GUARDED in the case that the application has not
+   requested an exclusive create but has asked to have the file
+   truncated when the file is opened.  If the client times out and
+   retransmits the create request, GUARDED prevents a sequence such as
+   create, write, create (retransmitted) from occurring.
+
+   For SHARE reservations, the client must specify a value for
+   share_access that is one of READ, WRITE, or BOTH.  For share_deny,
+   the client must specify one of NONE, READ, WRITE, or BOTH.  If the
+   client fails to do this, the server must return NFS4ERR_INVAL.
+
+   Based on the share_access value (READ, WRITE, or BOTH) the server
+   should check that the requester has the proper access rights to
+   perform the specified operation.  This would generally be the results
+   of applying the ACL access rules to the file for the current
+   requester.  However, just as with the ACCESS operation, the client
+   should not attempt to second-guess the server's decisions, as access
+   rights may change and may be subject to server administrative
+   controls outside the ACL framework.  If the requester is not
+   authorized to READ or WRITE (depending on the share_access value),
+   the server must return NFS4ERR_ACCESS.  Note that since the NFS
+   version 4 protocol does not impose any requirement that READs and
+   WRITEs issued for an open file have the same credentials as the OPEN
+   itself, the server still must do appropriate access checking on the
+   READs and WRITEs themselves.
+
+   If the component provided to OPEN is a symbolic link, the error
+   NFS4ERR_SYMLINK will be returned to the client.  If the current
+   filehandle is not a directory, the error NFS4ERR_NOTDIR will be
+   returned.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_ATTRNOTSUPP
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADNAME
+      NFS4ERR_BADOWNER
+      NFS4ERR_BAD_SEQID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXIST
+      NFS4ERR_EXPIRED
+
+
+
+Shepler, et al.             Standards Track                   [Page 177]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_IO
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NOTDIR
+      NFS4ERR_NOTSUPP
+      NFS4ERR_NO_GRACE
+      NFS4ERR_PERM
+      NFS4ERR_RECLAIM_BAD
+      NFS4ERR_RECLAIM_CONFLICT
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_SHARE_DENIED
+      NFS4ERR_STALE
+      NFS4ERR_STALE_CLIENTID
+      NFS4ERR_SYMLINK
+      NFS4ERR_WRONGSEC
+
+14.2.17.  Operation 19: OPENATTR - Open Named Attribute Directory
+
+   SYNOPSIS
+
+     (cfh) createdir -> (cfh)
+
+   ARGUMENT
+
+     struct OPENATTR4args {
+             /* CURRENT_FH: object */
+             bool    createdir;
+     };
+
+   RESULT
+
+     struct OPENATTR4res {
+             /* CURRENT_FH: named attr directory*/
+             nfsstat4        status;
+     };
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 178]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   The OPENATTR operation is used to obtain the filehandle of the named
+   attribute directory associated with the current filehandle.  The
+   result of the OPENATTR will be a filehandle to an object of type
+   NF4ATTRDIR.  From this filehandle, READDIR and LOOKUP operations can
+   be used to obtain filehandles for the various named attributes
+   associated with the original filesystem object.  Filehandles returned
+   within the named attribute directory will have a type of
+   NF4NAMEDATTR.
+
+   The createdir argument allows the client to signify whether a named
+   attribute directory should be created as a result of the OPENATTR
+   operation.  Some clients may use the OPENATTR operation with a value
+   of FALSE for createdir to determine if any named attributes exist for
+   the object.  If none exist, then NFS4ERR_NOENT will be returned.  If
+   createdir has a value of TRUE and no named attribute directory
+   exists, one is created.  Creation of a named attribute directory in
+   this fashion assumes that the server has implemented named attribute
+   support; a server is not required to do so by this definition.
+
+   IMPLEMENTATION
+
+   If the server does not support named attributes for the current
+   filehandle, an error of NFS4ERR_NOTSUPP will be returned to the
+   client.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NOTSUPP
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 179]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.18.  Operation 20: OPEN_CONFIRM - Confirm Open
+
+   SYNOPSIS
+
+     (cfh), seqid, stateid -> stateid
+
+   ARGUMENT
+
+     struct OPEN_CONFIRM4args {
+             /* CURRENT_FH: opened file */
+             stateid4        open_stateid;
+             seqid4          seqid;
+     };
+
+   RESULT
+
+     struct OPEN_CONFIRM4resok {
+             stateid4        open_stateid;
+     };
+
+     union OPEN_CONFIRM4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              OPEN_CONFIRM4resok     resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   This operation is used to confirm the sequence id usage for the first
+   time that an open_owner is used by a client.  The stateid returned
+   from the OPEN operation is used as the argument for this operation
+   along with the next sequence id for the open_owner.  The sequence id
+   passed to the OPEN_CONFIRM must be 1 (one) greater than the seqid
+   passed to the OPEN operation from which the open_confirm value was
+   obtained.  If the server receives an unexpected sequence id with
+   respect to the original open, then the server assumes that the client
+   will not confirm the original OPEN and all state associated with the
+   original OPEN is released by the server.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   A given client might generate many open_owner4 data structures for a
+   given clientid.  
The client will periodically either dispose of its +   open_owner4s or stop using them for indefinite periods of time.  The +   latter situation is why the NFS version 4 protocol does not have an + + + +Shepler, et al.             Standards Track                   [Page 180] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   explicit operation to exit an open_owner4: such an operation is of no +   use in that situation.  Instead, to avoid unbounded memory use, the +   server needs to implement a strategy for disposing of open_owner4s +   that have no current lock, open, or delegation state for any files +   and have not been used recently.  The time period used to determine +   when to dispose of open_owner4s is an implementation choice.  The +   time period should certainly be no less than the lease time plus any +   grace period the server wishes to implement beyond a lease time.  The +   OPEN_CONFIRM operation allows the server to safely dispose of unused +   open_owner4 data structures. + +   In the case that a client issues an OPEN operation and the server no +   longer has a record of the open_owner4, the server needs to ensure +   that this is a new OPEN and not a replay or retransmission. + +   Servers must not require confirmation on OPENs that grant delegations +   or are doing reclaim operations.  See section "Use of Open +   Confirmation" for details.  The server can easily avoid this by +   noting whether it has disposed of one open_owner4 for the given +   clientid.  If the server does not support delegation, it might simply +   maintain a single bit that notes whether any open_owner4 (for any +   client) has been disposed of. + +   The server must hold unconfirmed OPEN state until one of three events +   occur.  First, the client sends an OPEN_CONFIRM request with the +   appropriate sequence id and stateid within the lease period.  In this +   case, the OPEN state on the server goes to confirmed, and the +   open_owner4 on the server is fully established. + +   Second, the client sends another OPEN request with a sequence id that +   is incorrect for the open_owner4 (out of sequence).  In this case, +   the server assumes the second OPEN request is valid and the first one +   is a replay.  The server cancels the OPEN state of the first OPEN +   request, establishes an unconfirmed OPEN state for the second OPEN +   request, and responds to the second OPEN request with an indication +   that an OPEN_CONFIRM is needed.  The process then repeats itself. +   While there is a potential for a denial of service attack on the +   client, it is mitigated if the client and server require the use of a +   security flavor based on Kerberos V5, LIPKEY, or some other flavor +   that uses cryptography. + +   What if the server is in the unconfirmed OPEN state for a given +   open_owner4, and it receives an operation on the open_owner4 that has +   a stateid but the operation is not OPEN, or it is OPEN_CONFIRM but +   with the wrong stateid?  Then, even if the seqid is correct, the + + + + + + +Shepler, et al.             Standards Track                   [Page 181] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   server returns NFS4ERR_BAD_STATEID, because the server assumes the +   operation is a replay: if the server has no established OPEN state, +   then there is no way, for example, a LOCK operation could be valid. 
+ +   Third, neither of the two aforementioned events occur for the +   open_owner4 within the lease period.  In this case, the OPEN state is +   canceled and disposal of the open_owner4 can occur. + +   ERRORS + +      NFS4ERR_ADMIN_REVOKED +      NFS4ERR_BADHANDLE +      NFS4ERR_BAD_SEQID +      NFS4ERR_BAD_STATEID +      NFS4ERR_BADXDR +      NFS4ERR_EXPIRED +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_ISDIR +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_OLD_STATEID +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE +      NFS4ERR_STALE_STATEID + +14.2.19.  Operation 21: OPEN_DOWNGRADE - Reduce Open File Access + +   SYNOPSIS + +     (cfh), stateid, seqid, access, deny -> stateid + +   ARGUMENT + +     struct OPEN_DOWNGRADE4args { +             /* CURRENT_FH: opened file */ +             stateid4        open_stateid; +             seqid4          seqid; +             uint32_t        share_access; +             uint32_t        share_deny; +     }; + +   RESULT + +     struct OPEN_DOWNGRADE4resok { +             stateid4        open_stateid; +     }; + + + +Shepler, et al.             Standards Track                   [Page 182] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +     union OPEN_DOWNGRADE4res switch(nfsstat4 status) { +      case NFS4_OK: +             OPEN_DOWNGRADE4resok    resok4; +      default: +             void; +     }; + +   DESCRIPTION + +   This operation is used to adjust the share_access and share_deny bits +   for a given open.  This is necessary when a given openowner opens the +   same file multiple times with different share_access and share_deny +   flags.  In this situation, a close of one of the opens may change the +   appropriate share_access and share_deny flags to remove bits +   associated with opens no longer in effect. + +   The share_access and share_deny bits specified in this operation +   replace the current ones for the specified open file.  The +   share_access and share_deny bits specified must be exactly equal to +   the union of the share_access and share_deny bits specified for some +   subset of the OPENs in effect for current openowner on the current +   file.  If that constraint is not respected, the error NFS4ERR_INVAL +   should be returned.  Since share_access and share_deny bits are +   subsets of those already granted, it is not possible for this request +   to be denied because of conflicting share reservations. + +   On success, the current filehandle retains its value. + +   ERRORS + +      NFS4ERR_ADMIN_REVOKED +      NFS4ERR_BADHANDLE +      NFS4ERR_BAD_SEQID +      NFS4ERR_BAD_STATEID +      NFS4ERR_BADXDR +      NFS4ERR_EXPIRED +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_OLD_STATEID +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE +      NFS4ERR_STALE_STATEID + + + + + + +Shepler, et al.             Standards Track                   [Page 183] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +14.2.20.  Operation 22: PUTFH - Set Current Filehandle + +   SYNOPSIS + +     filehandle -> (cfh) + +   ARGUMENT + +     struct PUTFH4args { +             nfs_fh4         object; +     }; + +   RESULT + +     struct PUTFH4res { +             /* CURRENT_FH: */ +             nfsstat4        status; +     }; + +   DESCRIPTION + +   Replaces the current filehandle with the filehandle provided as an +   argument. 
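+
+   For example, a client that has cached the filehandle of a file and
+   wishes to revalidate its cached attributes might construct:
+
+      PUTFH  (cached filehandle)
+      GETATTR attrbits
+
+   with the PUTFH establishing the current filehandle on which the
+   following GETATTR then operates.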
+
+   If the security mechanism used by the requester does not meet the
+   requirements of the filehandle provided to this operation, the server
+   MUST return NFS4ERR_WRONGSEC.
+
+   IMPLEMENTATION
+
+   Commonly used as the first operator in an NFS request to set the
+   context for following operations.
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_MOVED
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_WRONGSEC
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 184]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.21.  Operation 23: PUTPUBFH - Set Public Filehandle
+
+   SYNOPSIS
+
+     - -> (cfh)
+
+   ARGUMENT
+
+     void;
+
+   RESULT
+
+     struct PUTPUBFH4res {
+             /* CURRENT_FH: public fh */
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+   Replaces the current filehandle with the filehandle that represents
+   the public filehandle of the server's name space.  This filehandle
+   may be different from the "root" filehandle which may be associated
+   with some other directory on the server.
+
+   The public filehandle represents the concepts embodied in [RFC2054],
+   [RFC2055], [RFC2224].  The intent for NFS version 4 is that the
+   public filehandle (represented by the PUTPUBFH operation) be used as
+   a method of providing WebNFS server compatibility with NFS versions 2
+   and 3.
+
+   The public filehandle and the root filehandle (represented by the
+   PUTROOTFH operation) should be equivalent.  If the public and root
+   filehandles are not equivalent, then the public filehandle MUST be a
+   descendant of the root filehandle.
+
+   IMPLEMENTATION
+
+   Used as the first operator in an NFS request to set the context for
+   following operations.
+
+   With the NFS version 2 and 3 public filehandle, the client is able to
+   specify whether the path name provided in the LOOKUP should be
+   evaluated as either an absolute path relative to the server's root or
+   relative to the public filehandle.  [RFC2224] contains further
+   discussion of the functionality.  With NFS version 4, that type of
+   specification is not directly available in the LOOKUP operation.  The
+   reason for this is that the component separators needed to specify
+   absolute vs. relative are not allowed in NFS version 4.  Therefore,
+
+
+
+Shepler, et al.             Standards Track                   [Page 185]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   the client is responsible for constructing its request such that
+   either PUTROOTFH or PUTPUBFH is used to signify absolute or
+   relative evaluation of an NFS URL, respectively.
+
+   Note that there are warnings mentioned in [RFC2224] with respect to
+   the use of absolute evaluation and the restrictions the server may
+   place on that evaluation with respect to how much of its namespace
+   has been made available.  These same warnings apply to NFS version 4.
+   It is likely, therefore, that because of server implementation
+   details, an NFS version 3 absolute public filehandle lookup may
+   behave differently from an NFS version 4 absolute resolution.
+
+   There is a form of security negotiation as described in [RFC2755]
+   that uses the public filehandle as a method of employing SNEGO.  
This +   method is not available with NFS version 4 as filehandles are not +   overloaded with special meaning and therefore do not provide the same +   framework as NFS versions 2 and 3.  Clients should therefore use the +   security negotiation mechanisms described in this RFC. + +   ERRORS + +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_WRONGSEC + +14.2.22.  Operation 24: PUTROOTFH - Set Root Filehandle + +   SYNOPSIS + +     - -> (cfh) + +   ARGUMENT + +     void; + +   RESULT + +     struct PUTROOTFH4res { +             /* CURRENT_FH: root fh */ +             nfsstat4        status; +     }; + + + + + + + + + + +Shepler, et al.             Standards Track                   [Page 186] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   DESCRIPTION + +   Replaces the current filehandle with the filehandle that represents +   the root of the server's name space.  From this filehandle a LOOKUP +   operation can locate any other filehandle on the server. This +   filehandle may be different from the "public" filehandle which may be +   associated with some other directory on the server. + +   IMPLEMENTATION + +   Commonly used as the first operator in an NFS request to set the +   context for following operations. + +   ERRORS + +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_WRONGSEC + +14.2.23.  Operation 25: READ - Read from File + +   SYNOPSIS + +     (cfh), stateid, offset, count -> eof, data + +   ARGUMENT + +     struct READ4args { +             /* CURRENT_FH: file */ +             stateid4        stateid; +             offset4         offset; +             count4          count; +     }; + +   RESULT + +     struct READ4resok { +             bool            eof; +             opaque          data<>; +     }; + +     union READ4res switch (nfsstat4 status) { +      case NFS4_OK: +              READ4resok     resok4; +      default: +              void; +     }; + + + + +Shepler, et al.             Standards Track                   [Page 187] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   DESCRIPTION + +   The READ operation reads data from the regular file identified by the +   current filehandle. + +   The client provides an offset of where the READ is to start and a +   count of how many bytes are to be read.  An offset of 0 (zero) means +   to read data starting at the beginning of the file.  If offset is +   greater than or equal to the size of the file, the status, NFS4_OK, +   is returned with a data length set to 0 (zero) and eof is set to +   TRUE.  The READ is subject to access permissions checking. + +   If the client specifies a count value of 0 (zero), the READ succeeds +   and returns 0 (zero) bytes of data again subject to access +   permissions checking.  The server may choose to return fewer bytes +   than specified by the client.  The client needs to check for this +   condition and handle the condition appropriately. + +   The stateid value for a READ request represents a value returned from +   a previous record lock or share reservation request.  The stateid is +   used by the server to verify that the associated share reservation +   and any record locks are still valid and to update lease timeouts for +   the client. 
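+
+   Taken together with the rules below, a client-side read of a whole
+   region can be driven by a simple loop that tolerates short reads.
+   This is a minimal sketch; nfs4_read() is a hypothetical wrapper
+   around a COMPOUND containing the READ operation, not an interface
+   defined by this document.
+
+     #include <stdbool.h>
+     #include <stdint.h>
+
+     /* Hypothetical wrapper: issue READ at 'offset' for up to 'count'
+        bytes; *got receives the bytes returned, *eof the eof flag.
+        Returns an nfsstat4-style status, 0 (NFS4_OK) on success. */
+     int nfs4_read(uint64_t offset, uint32_t count, char *buf,
+                   uint32_t *got, bool *eof);
+
+     static int
+     read_region(uint64_t offset, uint32_t count, char *buf)
+     {
+             bool eof = false;
+
+             while (count > 0 && !eof) {
+                     uint32_t got = 0;
+                     int status = nfs4_read(offset, count, buf,
+                                            &got, &eof);
+
+                     if (status != 0)
+                             return status;  /* e.g., NFS4ERR_LOCKED */
+                     if (got == 0 && !eof)
+                             break;          /* defensive: avoid spinning */
+
+                     /* Servers may return fewer bytes than requested;
+                        advance and continue until eof or completion. */
+                     offset += got;
+                     buf    += got;
+                     count  -= got;
+             }
+             return 0;
+     }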
+
+   If the read ended at the end-of-file (formally, in a correctly formed
+   READ request, if offset + count is equal to the size of the file), or
+   the read request extends beyond the size of the file (if offset +
+   count is greater than the size of the file), eof is returned as TRUE;
+   otherwise it is FALSE.  A successful READ of an empty file will
+   always return eof as TRUE.
+
+   If the current filehandle is not a regular file, an error will be
+   returned to the client.  In the case that the current filehandle
+   represents a directory, NFS4ERR_ISDIR is returned; otherwise,
+   NFS4ERR_INVAL is returned.
+
+   For a READ with a stateid value of all bits 0, the server MAY allow
+   the READ to be serviced subject to mandatory file locks or the
+   current share deny modes for the file.  For a READ with a stateid
+   value of all bits 1, the server MAY allow READ operations to bypass
+   locking checks at the server.
+
+   On success, the current filehandle retains its value.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 188]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   IMPLEMENTATION
+
+   It is possible for the server to return fewer than count bytes of
+   data.  If the server returns fewer than the count requested and eof
+   is set to FALSE, the client should issue another READ to get the
+   remaining data.  A server may return less data than requested under
+   several circumstances.  The file may have been truncated by another
+   client or perhaps on the server itself, changing the file size from
+   what the requesting client believes to be the case.  This would
+   reduce the actual amount of data available to the client.  It is
+   possible that the server may back off the transfer size and reduce
+   the read request return.  Server resource exhaustion may also occur,
+   necessitating a smaller read return.
+
+   If mandatory file locking is on for the file, and if the region
+   corresponding to the data to be read from the file is write locked by
+   an owner not associated with the stateid, the server will return the
+   NFS4ERR_LOCKED error.  The client should try to get the appropriate
+   read record lock via the LOCK operation before re-attempting the
+   READ.  When the READ completes, the client should release the record
+   lock via LOCKU.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_IO
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NXIO
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_OPENMODE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+
+
+Shepler, et al.             Standards Track                   [Page 189]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.24.  
Operation 26: READDIR - Read Directory + +   SYNOPSIS +      (cfh), cookie, cookieverf, dircount, maxcount, attr_request -> +      cookieverf { cookie, name, attrs } + +   ARGUMENT + +     struct READDIR4args { +             /* CURRENT_FH: directory */ +             nfs_cookie4     cookie; +             verifier4       cookieverf; +             count4          dircount; +             count4          maxcount; +             bitmap4         attr_request; +     }; + +   RESULT + +     struct entry4 { +             nfs_cookie4     cookie; +             component4      name; +             fattr4          attrs; +             entry4          *nextentry; +     }; + +     struct dirlist4 { +             entry4          *entries; +             bool            eof; +     }; + +     struct READDIR4resok { +             verifier4       cookieverf; +             dirlist4        reply; +     }; + + +     union READDIR4res switch (nfsstat4 status) { +      case NFS4_OK: +              READDIR4resok  resok4; +      default: +              void; +     }; + + + + + + + + +Shepler, et al.             Standards Track                   [Page 190] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   DESCRIPTION + +   The READDIR operation retrieves a variable number of entries from a +   filesystem directory and returns client requested attributes for each +   entry along with information to allow the client to request +   additional directory entries in a subsequent READDIR. + +   The arguments contain a cookie value that represents where the +   READDIR should start within the directory.  A value of 0 (zero) for +   the cookie is used to start reading at the beginning of the +   directory.  For subsequent READDIR requests, the client specifies a +   cookie value that is provided by the server on a previous READDIR +   request. + +   The cookieverf value should be set to 0 (zero) when the cookie value +   is 0 (zero) (first directory read).  On subsequent requests, it +   should be a cookieverf as returned by the server.  The cookieverf +   must match that returned by the READDIR in which the cookie was +   acquired.  If the server determines that the cookieverf is no longer +   valid for the directory, the error NFS4ERR_NOT_SAME must be returned. + +   The dircount portion of the argument is a hint of the maximum number +   of bytes of directory information that should be returned.  This +   value represents the length of the names of the directory entries and +   the cookie value for these entries.  This length represents the XDR +   encoding of the data (names and cookies) and not the length in the +   native format of the server. + +   The maxcount value of the argument is the maximum number of bytes for +   the result.  This maximum size represents all of the data being +   returned within the READDIR4resok structure and includes the XDR +   overhead.  The server may return less data.  If the server is unable +   to return a single directory entry within the maxcount limit, the +   error NFS4ERR_TOOSMALL will be returned to the client. + +   Finally, attr_request represents the list of attributes to be +   returned for each directory entry supplied by the server. + +   On successful return, the server's response will provide a list of +   directory entries.  Each of these entries contains the name of the +   directory entry, a cookie value for that entry, and the associated +   attributes as requested.  
The "eof" flag has a value of TRUE if there
+   are no more entries in the directory.
+
+   The cookie value is only meaningful to the server and is used as a
+   "bookmark" for the directory entry.  As mentioned, this cookie is
+   used by the client for subsequent READDIR operations so that it may
+   continue reading a directory.  The cookie is similar in concept to a
+
+
+
+Shepler, et al.             Standards Track                   [Page 191]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   READ offset but should not be interpreted as such by the client.
+   Ideally, the cookie value should not change if the directory is
+   modified since the client may be caching these values.
+
+   In some cases, the server may encounter an error while obtaining the
+   attributes for a directory entry.  Instead of returning an error for
+   the entire READDIR operation, the server can instead return the
+   attribute 'fattr4_rdattr_error'.  With this, the server is able to
+   communicate the failure to the client and not fail the entire
+   operation in the instance of what might be a transient failure.
+   Obviously, the client must request the fattr4_rdattr_error attribute
+   for this method to work properly.  If the client does not request the
+   attribute, the server has no choice but to return failure for the
+   entire READDIR operation.
+
+   For some filesystem environments, the directory entries "." and ".."
+   have special meaning and in other environments, they may not.  If the
+   server supports these special entries within a directory, they should
+   not be returned to the client as part of the READDIR response.  To
+   enable some client environments, the cookie values of 0, 1, and 2 are
+   to be considered reserved.  Note that the UNIX client will use these
+   values when combining the server's response and local representations
+   to enable a fully formed UNIX directory presentation to the
+   application.
+
+   For READDIR arguments, cookie values of 1 and 2 should not be used,
+   and for READDIR results, cookie values of 0, 1, and 2 should not be
+   returned.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   The server's filesystem directory representations can differ greatly.
+   A client's programming interfaces may also be bound to the local
+   operating environment in a way that does not translate well into the
+   NFS protocol.  Therefore, the dircount and maxcount fields are
+   provided to allow the client to give guidelines to the server.  If
+   the client is aggressive about attribute collection during a READDIR,
+   the server has an idea of how to limit the encoded response.  The
+   dircount field provides a hint on the number of entries based solely
+   on the names of the directory entries.  Since it is a hint, a
+   dircount value of zero is possible.  In this case, the server is free
+   to ignore the dircount value and return directory information based
+   on the specified maxcount value.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 192]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The cookieverf may be used by the server to help manage cookie values
+   that may become stale.  It should be a rare occurrence that a server
+   is unable to continue properly reading a directory with the provided
+   cookie/cookieverf pair.  
The server should make every effort to avoid +   this condition since the application at the client may not be able to +   properly handle this type of failure. + +   The use of the cookieverf will also protect the client from using +   READDIR cookie values that may be stale.  For example, if the file +   system has been migrated, the server may or may not be able to use +   the same cookie values to service READDIR as the previous server +   used.  With the client providing the cookieverf, the server is able +   to provide the appropriate response to the client.  This prevents the +   case where the server may accept a cookie value but the underlying +   directory has changed and the response is invalid from the client's +   context of its previous READDIR. + +   Since some servers will not be returning "." and ".." entries as has +   been done with previous versions of the NFS protocol, the client that +   requires these entries be present in READDIR responses must fabricate +   them. + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_BADHANDLE +      NFS4ERR_BAD_COOKIE +      NFS4ERR_BADXDR +      NFS4ERR_DELAY +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_IO +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOTDIR +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE +      NFS4ERR_TOOSMALL + +14.2.25.  Operation 27: READLINK - Read Symbolic Link + +   SYNOPSIS + +     (cfh) -> linktext + + + + + + +Shepler, et al.             Standards Track                   [Page 193] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   ARGUMENT + +     /* CURRENT_FH: symlink */ +     void; + +   RESULT + +     struct READLINK4resok { +             linktext4       link; +     }; + +     union READLINK4res switch (nfsstat4 status) { +      case NFS4_OK: +              READLINK4resok resok4; +      default: +              void; +     }; + +   DESCRIPTION + +   READLINK reads the data associated with a symbolic link.  The data is +   a UTF-8 string that is opaque to the server.  That is, whether +   created by an NFS client or created locally on the server, the data +   in a symbolic link is not interpreted when created, but is simply +   stored. + +   On success, the current filehandle retains its value. + +   IMPLEMENTATION + +   A symbolic link is nominally a pointer to another file.  The data is +   not necessarily interpreted by the server, just stored in the file. +   It is possible for a client implementation to store a path name that +   is not meaningful to the server operating system in a symbolic link. +   A READLINK operation returns the data to the client for +   interpretation. If different implementations want to share access to +   symbolic links, then they must agree on the interpretation of the +   data in the symbolic link. + +   The READLINK operation is only allowed on objects of type NF4LNK. +   The server should return the error, NFS4ERR_INVAL, if the object is +   not of type, NF4LNK. + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_BADHANDLE +      NFS4ERR_DELAY + + + +Shepler, et al.             Standards Track                   [Page 194] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_IO +      NFS4ERR_ISDIR +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOTSUPP +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + +14.2.26.  
Operation 28: REMOVE - Remove Filesystem Object
+
+   SYNOPSIS
+
+     (cfh), filename -> change_info
+
+   ARGUMENT
+
+     struct REMOVE4args {
+             /* CURRENT_FH: directory */
+             component4       target;
+     };
+
+   RESULT
+
+     struct REMOVE4resok {
+             change_info4    cinfo;
+     };
+
+     union REMOVE4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              REMOVE4resok   resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The REMOVE operation removes (deletes) a directory entry named by
+   filename from the directory corresponding to the current filehandle.
+   If the entry in the directory was the last reference to the
+   corresponding filesystem object, the object may be destroyed.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 195]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   For the directory where the filename was removed, the server returns
+   change_info4 information in cinfo.  With the atomic field of the
+   change_info4 struct, the server will indicate if the before and after
+   change attributes were obtained atomically with respect to the
+   removal.
+
+   If the target has a length of 0 (zero), or if the target does not
+   obey the UTF-8 definition, the error NFS4ERR_INVAL will be returned.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   NFS versions 2 and 3 required different operators: RMDIR for
+   directory removal and REMOVE for non-directory removal. This allowed
+   clients to skip checking the file type when being passed a non-
+   directory delete system call (e.g., unlink() in POSIX) to remove a
+   directory, as well as the converse (e.g., a rmdir() on a non-
+   directory) because they knew the server would check the file type.
+   NFS version 4 REMOVE can be used to delete any directory entry
+   independent of its file type. The implementor of an NFS version 4
+   client's entry points from the unlink() and rmdir() system calls
+   should first check the file type against the types the system call is
+   allowed to remove before issuing a REMOVE. Alternatively, the
+   implementor can produce a COMPOUND call that includes a LOOKUP/VERIFY
+   sequence to verify the file type before a REMOVE operation in the
+   same COMPOUND call.
+
+   The concept of last reference is server specific.  However, if the
+   numlinks field in the previous attributes of the object had the value
+   1, the client should not rely on referring to the object via a
+   filehandle.  Likewise, the client should not rely on the resources
+   (disk space, directory entry, and so on) formerly associated with the
+   object becoming immediately available.  Thus, if a client needs to be
+   able to continue to access a file after using REMOVE to remove it,
+   the client should take steps to make sure that the file will still be
+   accessible.  The usual mechanism used is to RENAME the file from its
+   old name to a new hidden name.
+
+   If the server finds that the file is still open when the REMOVE
+   arrives:
+
+   o  The server SHOULD NOT delete the file's directory entry if the
+      file was opened with OPEN4_SHARE_DENY_WRITE or
+      OPEN4_SHARE_DENY_BOTH.
+
+
+
+
+
+
+Shepler, et al.             
Standards Track                   [Page 196]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  If the file was not opened with OPEN4_SHARE_DENY_WRITE or
+      OPEN4_SHARE_DENY_BOTH, the server SHOULD delete the file's
+      directory entry.  However, until the last CLOSE of the file, the
+      server MAY continue to allow access to the file via its
+      filehandle.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADNAME
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_FILE_OPEN
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOTDIR
+      NFS4ERR_NOTEMPTY
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.27.  Operation 29: RENAME - Rename Directory Entry
+
+   SYNOPSIS
+
+     (sfh), oldname, (cfh), newname -> source_change_info,
+     target_change_info
+
+   ARGUMENT
+
+     struct RENAME4args {
+             /* SAVED_FH: source directory */
+             component4      oldname;
+             /* CURRENT_FH: target directory */
+             component4      newname;
+     };
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 197]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   RESULT
+
+     struct RENAME4resok {
+             change_info4    source_cinfo;
+             change_info4    target_cinfo;
+     };
+
+     union RENAME4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              RENAME4resok   resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The RENAME operation renames the object identified by oldname in the
+   source directory corresponding to the saved filehandle, as set by the
+   SAVEFH operation, to newname in the target directory corresponding to
+   the current filehandle.  The operation is required to be atomic to
+   the client.  Source and target directories must reside on the same
+   filesystem on the server.  On success, the current filehandle will
+   continue to be the target directory.
+
+   If the target directory already contains an entry with the name,
+   newname, the source object must be compatible with the target:
+   either both are non-directories or both are directories and the
+   target must be empty.  If compatible, the existing target is removed
+   before the rename occurs (See the IMPLEMENTATION subsection of the
+   section "Operation 28: REMOVE - Remove Filesystem Object" for client
+   and server actions whenever a target is removed).  If they are not
+   compatible or if the target is a directory but not empty, the server
+   will return the error, NFS4ERR_EXIST.
+
+   If oldname and newname both refer to the same file (they might be
+   hard links of each other), then RENAME should perform no action and
+   return success.
+
+   For both directories involved in the RENAME, the server returns
+   change_info4 information.  With the atomic field of the change_info4
+   struct, the server will indicate if the before and after change
+   attributes were obtained atomically with respect to the rename.
+
+   If the oldname refers to a named attribute and the saved and current
+   filehandles refer to different filesystem objects, the server will
+   return NFS4ERR_XDEV just as if the saved and current filehandles
+   represented directories on different filesystems. 
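+
+   Because the source directory is taken from the saved filehandle, a
+   RENAME is normally preceded by a SAVEFH sequence.  The following is
+   an illustrative sketch only (the directory filehandles and names
+   here are hypothetical, not part of the protocol):
+
+         PUTFH (source directory filehandle)
+         SAVEFH
+         PUTFH (target directory filehandle)
+         RENAME "oldname" "newname"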
+
+
+
+Shepler, et al.             Standards Track                   [Page 198]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   If the oldname or newname has a length of 0 (zero), or if oldname or
+   newname does not obey the UTF-8 definition, the error NFS4ERR_INVAL
+   will be returned.
+
+   IMPLEMENTATION
+
+   The RENAME operation must be atomic to the client.  The statement
+   "source and target directories must reside on the same filesystem on
+   the server" means that the fsid fields in the attributes for the
+   directories are the same. If they reside on different filesystems,
+   the error, NFS4ERR_XDEV, is returned.
+
+   Based on the value of the fh_expire_type attribute for the object,
+   the filehandle may or may not expire on a RENAME.  However, server
+   implementors are strongly encouraged to attempt to keep filehandles
+   from expiring in this fashion.
+
+   On some servers, the file names "." and ".." are illegal as either
+   oldname or newname, and will result in the error NFS4ERR_BADNAME.  In
+   addition, on many servers the case of oldname or newname being an
+   alias for the source directory will be checked for.  Such servers
+   will return the error NFS4ERR_INVAL in these cases.
+
+   If either the source or target filehandle is not a directory, the
+   server will return NFS4ERR_NOTDIR.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADNAME
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXIST
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_FILE_OPEN
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NOTDIR
+      NFS4ERR_NOTEMPTY
+      NFS4ERR_RESOURCE
+
+
+
+Shepler, et al.             Standards Track                   [Page 199]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_WRONGSEC
+      NFS4ERR_XDEV
+
+14.2.28.  Operation 30: RENEW - Renew a Lease
+
+   SYNOPSIS
+
+     clientid -> ()
+
+   ARGUMENT
+
+     struct RENEW4args {
+             clientid4       clientid;
+     };
+
+   RESULT
+
+     struct RENEW4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+   The RENEW operation is used by the client to renew leases which it
+   currently holds at a server.  In processing the RENEW request, the
+   server renews all leases associated with the client.  The associated
+   leases are determined by the clientid provided via the SETCLIENTID
+   operation.
+
+   IMPLEMENTATION
+
+   When the client holds delegations, it needs to use RENEW to detect
+   when the server has determined that the callback path is down.  When
+   the server has made such a determination, only the RENEW operation
+   will renew the lease on delegations.  If the server determines the
+   callback path is down, it returns NFS4ERR_CB_PATH_DOWN.  Even though
+   it returns NFS4ERR_CB_PATH_DOWN, the server MUST renew the lease on
+   the record locks and share reservations that the client has
+   established on the server.  If for some reason the lock and share
+   reservation lease cannot be renewed, then the server MUST return an
+   error other than NFS4ERR_CB_PATH_DOWN, even if the callback path is
+   also down.
+
+
+
+
+
+Shepler, et al.             
Standards Track                   [Page 200]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The client that issues RENEW MUST choose the principal, RPC security
+   flavor, and if applicable, GSS-API mechanism and service via one of
+   the following algorithms:
+
+   o  The client uses the same principal, RPC security flavor -- and if
+      the flavor was RPCSEC_GSS -- the same mechanism and service that
+      was used when the client id was established via
+      SETCLIENTID_CONFIRM.
+
+   o  The client uses any principal, RPC security flavor, mechanism, and
+      service combination that currently has an OPEN file on the server.
+      That is, the same principal had a successful OPEN operation, the
+      file is still open by that principal, and the flavor, mechanism,
+      and service of RENEW match that of the previous OPEN.
+
+   The server MUST reject a RENEW that does not use one of the
+   aforementioned algorithms, with the error NFS4ERR_ACCESS.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADXDR
+      NFS4ERR_CB_PATH_DOWN
+      NFS4ERR_EXPIRED
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE_CLIENTID
+
+14.2.29.  Operation 31: RESTOREFH - Restore Saved Filehandle
+
+   SYNOPSIS
+
+     (sfh) -> (cfh)
+
+   ARGUMENT
+
+     /* SAVED_FH: */
+     void;
+
+   RESULT
+
+     struct RESTOREFH4res {
+             /* CURRENT_FH: value of saved fh */
+             nfsstat4        status;
+     };
+
+
+
+Shepler, et al.             Standards Track                   [Page 201]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   Set the current filehandle to the value in the saved filehandle.  If
+   there is no saved filehandle then return the error NFS4ERR_RESTOREFH.
+
+   IMPLEMENTATION
+
+   Operations like OPEN and LOOKUP use the current filehandle to
+   represent a directory and replace it with a new filehandle.  Assuming
+   the previous filehandle was saved with a SAVEFH operator, the
+   previous filehandle can be restored as the current filehandle.  This
+   is commonly used to obtain post-operation attributes for the
+   directory, e.g.,
+
+         PUTFH (directory filehandle)
+         SAVEFH
+         GETATTR attrbits     (pre-op dir attrs)
+         CREATE optbits "foo" attrs
+         GETATTR attrbits     (file attributes)
+         RESTOREFH
+         GETATTR attrbits     (post-op dir attrs)
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_MOVED
+      NFS4ERR_RESOURCE
+      NFS4ERR_RESTOREFH
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_WRONGSEC
+
+14.2.30.  Operation 32: SAVEFH - Save Current Filehandle
+
+   SYNOPSIS
+
+     (cfh) -> (sfh)
+
+   ARGUMENT
+
+     /* CURRENT_FH: */
+     void;
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 202]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   RESULT
+
+     struct SAVEFH4res {
+             /* SAVED_FH: value of current fh */
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+   Save the current filehandle.  If a previous filehandle was saved then
+   it is no longer accessible.  The saved filehandle can be restored as
+   the current filehandle with the RESTOREFH operator.
+
+   On success, the current filehandle retains its value. 
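+
+   Together with RESTOREFH, SAVEFH is also what enables two-filehandle
+   operations such as LINK and RENAME within a single COMPOUND.  As an
+   illustrative sketch (the filehandles and name are hypothetical, not
+   part of the protocol), a client might create a hard link across
+   directories as follows:
+
+         PUTFH (source object filehandle)
+         SAVEFH
+         PUTFH (target directory filehandle)
+         LINK "newname"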
+
+   IMPLEMENTATION
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.31.  Operation 33: SECINFO - Obtain Available Security
+
+   SYNOPSIS
+
+     (cfh), name -> { secinfo }
+
+   ARGUMENT
+
+     struct SECINFO4args {
+             /* CURRENT_FH: directory */
+             component4     name;
+     };
+
+   RESULT
+
+     enum rpc_gss_svc_t {/* From RFC 2203 */
+             RPC_GSS_SVC_NONE        = 1,
+             RPC_GSS_SVC_INTEGRITY   = 2,
+             RPC_GSS_SVC_PRIVACY     = 3
+     };
+
+
+
+Shepler, et al.             Standards Track                   [Page 203]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     struct rpcsec_gss_info {
+             sec_oid4        oid;
+             qop4            qop;
+             rpc_gss_svc_t   service;
+     };
+
+     union secinfo4 switch (uint32_t flavor) {
+      case RPCSEC_GSS:
+              rpcsec_gss_info        flavor_info;
+      default:
+              void;
+     };
+
+     typedef secinfo4 SECINFO4resok<>;
+
+     union SECINFO4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              SECINFO4resok resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The SECINFO operation is used by the client to obtain a list of valid
+   RPC authentication flavors for a specific directory filehandle, file
+   name pair.  SECINFO should apply the same access methodology used for
+   LOOKUP when evaluating the name.  Therefore, if the requester does
+   not have the appropriate access to LOOKUP the name, then SECINFO must
+   behave the same way and return NFS4ERR_ACCESS.
+
+   The result will contain an array which represents the security
+   mechanisms available, with an order corresponding to the server's
+   preferences, the most preferred being first in the array. The client
+   is free to pick whatever security mechanism it both desires and
+   supports, or to pick in the server's preference order the first one
+   it supports.  The array entries are represented by the secinfo4
+   structure.  The field 'flavor' will contain a value of AUTH_NONE,
+   AUTH_SYS (as defined in [RFC1831]), or RPCSEC_GSS (as defined in
+   [RFC2203]).
+
+   For the flavors AUTH_NONE and AUTH_SYS, no additional security
+   information is returned.  For a return value of RPCSEC_GSS, a
+   security triple is returned that contains the mechanism object id (as
+   defined in [RFC2743]), the quality of protection (as defined in
+   [RFC2743]) and the service type (as defined in [RFC2203]).  It is
+   possible for SECINFO to return multiple entries with flavor equal to
+   RPCSEC_GSS with different security triple values.
+
+
+
+Shepler, et al.             Standards Track                   [Page 204]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   On success, the current filehandle retains its value.
+
+   If the name has a length of 0 (zero), or if the name does not obey
+   the UTF-8 definition, the error NFS4ERR_INVAL will be returned.
+
+   IMPLEMENTATION
+
+   The SECINFO operation is expected to be used by the NFS client when
+   the error value of NFS4ERR_WRONGSEC is returned from another NFS
+   operation.  This signifies to the client that the server's security
+   policy is different from what the client is currently using.  
At this
+   point, the client is expected to obtain a list of possible security
+   flavors and choose what best suits its policies.
+
+   As mentioned, the server's security policies will determine when a
+   client request receives NFS4ERR_WRONGSEC.  The operations which may
+   receive this error are: LINK, LOOKUP, OPEN, PUTFH, PUTPUBFH,
+   PUTROOTFH, RESTOREFH, RENAME, and indirectly READDIR.  LINK and
+   RENAME will only receive this error if the security used for the
+   operation is inappropriate for the saved filehandle.  With the
+   exception of READDIR, these operations represent the point at which
+   the client can instantiate a filehandle into the "current filehandle"
+   at the server.  The filehandle is either provided by the client
+   (PUTFH, PUTPUBFH, PUTROOTFH) or generated as a result of a name to
+   filehandle translation (LOOKUP and OPEN).  RESTOREFH is different
+   because the filehandle is a result of a previous SAVEFH.  Even though
+   the filehandle, for RESTOREFH, might have previously passed the
+   server's inspection for a security match, the server will check it
+   again on RESTOREFH to ensure that the security policy has not
+   changed.
+
+   If the client wants to resolve an error return of NFS4ERR_WRONGSEC,
+   the following will occur:
+
+   o  For LOOKUP and OPEN, the client will use SECINFO with the same
+      current filehandle and name as provided in the original LOOKUP or
+      OPEN to enumerate the available security triples.
+
+   o  For LINK, PUTFH, RENAME, and RESTOREFH, the client will use
+      SECINFO and provide the parent directory filehandle and object
+      name which corresponds to the filehandle originally provided by
+      the PUTFH or RESTOREFH, or for LINK and RENAME, the SAVEFH.
+
+   o  For PUTROOTFH and PUTPUBFH, the client will be unable to use the
+      SECINFO operation since SECINFO requires a current filehandle and
+      none exists for these two operations.  Therefore, the client must
+      iterate through the security triples available at the client and
+      reattempt the PUTROOTFH or PUTPUBFH operation. In the unfortunate
+      event none of the MANDATORY security triples are supported by the
+
+
+
+Shepler, et al.             Standards Track                   [Page 205]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      client and server, the client SHOULD try using others that support
+      integrity. Failing that, the client can try using AUTH_NONE, but
+      because such forms lack integrity checks, this puts the client at
+      risk.  Nonetheless, the server SHOULD allow the client to use
+      whatever security form the client requests and the server
+      supports, since the risks of doing so are on the client.
+
+   The READDIR operation will not directly return the NFS4ERR_WRONGSEC
+   error.  However, if the READDIR request included a request for
+   attributes, it is possible that the READDIR request's security triple
+   does not match that of a directory entry.  If this is the case and
+   the client has requested the rdattr_error attribute, the server will
+   return the NFS4ERR_WRONGSEC error in rdattr_error for the entry.
+
+   See the section "Security Considerations" for a discussion on the
+   recommendations for security flavor used by SECINFO. 
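+
+   As an illustrative sketch (the file name "foo" is hypothetical), a
+   client that receives NFS4ERR_WRONGSEC on a LOOKUP might recover as
+   follows:
+
+         PUTFH (directory filehandle)
+         LOOKUP "foo"            --> NFS4ERR_WRONGSEC
+
+   followed, in a new request, by:
+
+         PUTFH (directory filehandle)
+         SECINFO "foo"           --> array of secinfo4
+
+   after which the client selects a flavor or security triple it
+   supports from the returned array and retries the original request
+   using that security.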
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADNAME
+      NFS4ERR_BADXDR
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOTDIR
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.32.  Operation 34: SETATTR - Set Attributes
+
+   SYNOPSIS
+
+     (cfh), stateid, attrmask, attr_vals -> attrsset
+
+   ARGUMENT
+
+     struct SETATTR4args {
+             /* CURRENT_FH: target object */
+             stateid4        stateid;
+             fattr4          obj_attributes;
+     };
+
+
+
+Shepler, et al.             Standards Track                   [Page 206]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   RESULT
+
+     struct SETATTR4res {
+             nfsstat4        status;
+             bitmap4         attrsset;
+     };
+
+   DESCRIPTION
+
+   The SETATTR operation changes one or more of the attributes of a
+   filesystem object.  The new attributes are specified with a bitmap
+   and the attributes that follow the bitmap in bit order.
+
+   The stateid argument for SETATTR is used to provide file locking
+   context that is necessary for SETATTR requests that set the size
+   attribute.  Since setting the size attribute modifies the file's
+   data, it has the same locking requirements as a corresponding WRITE.
+   Any SETATTR that sets the size attribute is incompatible with a share
+   reservation that specifies DENY_WRITE.  The area between the old
+   end-of-file and the new end-of-file is considered to be modified just
+   as would have been the case had the area in question been specified
+   as the target of WRITE, for the purpose of checking conflicts with
+   record locks, for those cases in which a server is implementing
+   mandatory record locking behavior.  A valid stateid should always be
+   specified.  When the file size attribute is not set, the special
+   stateid consisting of all bits zero should be passed.
+
+   On either success or failure of the operation, the server will return
+   the attrsset bitmask to represent what (if any) attributes were
+   successfully set.  The attrsset in the response is a subset of the
+   bitmap4 that is part of the obj_attributes in the argument.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   If the request specifies the owner attribute to be set, the server
+   should allow the operation to succeed if the current owner of the
+   object matches the value specified in the request.  Some servers may
+   be implemented in such a way as to prohibit the setting of the owner
+   attribute unless the requester has privilege to do so.  If the server
+   is lenient in this one case of matching owner values, the client
+   implementation may be simplified in cases of creation of an object
+   followed by a SETATTR.
+
+   The file size attribute is used to request changes to the size of a
+   file. A value of 0 (zero) causes the file to be truncated, a value
+   less than the current size of the file causes data from the new size
+
+
+
+Shepler, et al.             Standards Track                   [Page 207]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   to the end of the file to be discarded, and a size greater than the
+   current size of the file causes logically zeroed data bytes to be
+   added to the end of the file.  
Servers are free to implement this
+   using holes or actual zero data bytes. Clients should not make any
+   assumptions regarding a server's implementation of this feature,
+   beyond that the bytes returned will be zeroed.  Servers must support
+   extending the file size via SETATTR.
+
+   SETATTR is not guaranteed atomic.  A failed SETATTR may partially
+   change a file's attributes.
+
+   Changing the size of a file with SETATTR indirectly changes the
+   time_modify.  A client must account for this as size changes can
+   result in data deletion.
+
+   The attributes time_access_set and time_modify_set are write-only
+   attributes constructed as a switched union so the client can direct
+   the server in setting the time values.  If the switched union
+   specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4 to
+   be used for the operation.  If the switched union does not specify
+   SET_TO_CLIENT_TIME4, the server is to use its current time for the
+   SETATTR operation.
+
+   If server and client times differ, programs that compare client time
+   to file times can break. A time maintenance protocol should be used
+   to limit client/server time skew.
+
+   Use of a COMPOUND containing a VERIFY operation specifying only the
+   change attribute, immediately followed by a SETATTR, provides a means
+   whereby a client may specify a request that emulates the
+   functionality of the SETATTR guard mechanism of NFS version 3.  Since
+   the function of the guard mechanism is to avoid changes to the file
+   attributes based on stale information, delays between checking of the
+   guard condition and the setting of the attributes have the potential
+   to compromise this function, as would the corresponding delay in the
+   NFS version 4 emulation.  Therefore, NFS version 4 servers should
+   take care to avoid such delays, to the degree possible, when
+   executing such a request.
+
+   If the server does not support an attribute as requested by the
+   client, the server should return NFS4ERR_ATTRNOTSUPP.
+
+   A mask of the attributes actually set is returned by SETATTR in all
+   cases.  That mask must not include attribute bits not requested to
+   be set by the client, and must be equal to the mask of attributes
+   requested to be set only if the SETATTR completes without error.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 208]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_ATTRNOTSUPP
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADOWNER
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXPIRED
+      NFS4ERR_FBIG
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_ISDIR
+      NFS4ERR_LOCKED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_OPENMODE
+      NFS4ERR_PERM
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.33.  
Operation 35: SETCLIENTID - Negotiate Clientid + +   SYNOPSIS + +     client, callback, callback_ident -> clientid, setclientid_confirm + +   ARGUMENT + +     struct SETCLIENTID4args { +             nfs_client_id4  client; +             cb_client4      callback; +             uint32_t        callback_ident; +     }; + + + + + + +Shepler, et al.             Standards Track                   [Page 209] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   RESULT + +     struct SETCLIENTID4resok { +             clientid4       clientid; +             verifier4       setclientid_confirm; +     }; + +     union SETCLIENTID4res switch (nfsstat4 status) { +      case NFS4_OK: +              SETCLIENTID4resok      resok4; +      case NFS4ERR_CLID_INUSE: +              clientaddr4    client_using; +      default: +              void; +     }; + +   DESCRIPTION + +   The client uses the SETCLIENTID operation to notify the server of its +   intention to use a particular client identifier, callback, and +   callback_ident for subsequent requests that entail creating lock, +   share reservation, and delegation state on the server.  Upon +   successful completion the server will return a shorthand clientid +   which, if confirmed via a separate step, will be used in subsequent +   file locking and file open requests. Confirmation of the clientid +   must be done via the SETCLIENTID_CONFIRM operation to return the +   clientid and setclientid_confirm values, as verifiers, to the server. +   The reason why two verifiers are necessary is that it is possible to +   use SETCLIENTID and SETCLIENTID_CONFIRM to modify the callback and +   callback_ident information but not the shorthand clientid.  In that +   event, the setclientid_confirm value is effectively the only +   verifier. + +   The callback information provided in this operation will be used if +   the client is provided an open delegation at a future point. +   Therefore, the client must correctly reflect the program and port +   numbers for the callback program at the time SETCLIENTID is used. + +   The callback_ident value is used by the server on the callback.  The +   client can leverage the callback_ident to eliminate the need for more +   than one callback RPC program number, while still being able to +   determine which server is initiating the callback. + + + + + + + + + +Shepler, et al.             Standards Track                   [Page 210] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   IMPLEMENTATION + +   To understand how to implement SETCLIENTID, make the following +   notations. Let: + +   x be the value of the client.id subfield of the SETCLIENTID4args +     structure. + +   v be the value of the client.verifier subfield of the +     SETCLIENTID4args structure. + +   c be the value of the clientid field returned in the +     SETCLIENTID4resok structure. + +   k represent the value combination of the fields callback and +     callback_ident fields of the SETCLIENTID4args structure. + +   s be the setclientid_confirm value returned in the +     SETCLIENTID4resok structure. + +   { v, x, c, k, s } +     be a quintuple for a client record. A client record is +     confirmed if there has been a SETCLIENTID_CONFIRM operation to +     confirm it.  Otherwise it is unconfirmed. An unconfirmed +     record is established by a SETCLIENTID call. 
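+
+   As a non-normative sketch, a server might represent such a client
+   record with a structure along the following lines (the type cb_info4
+   is a hypothetical aggregate of the callback and callback_ident
+   fields; it is not part of the protocol):
+
+     struct client_record {
+             verifier4       v;         /* client.verifier */
+             opaque          x<>;       /* client.id string */
+             clientid4       c;         /* shorthand clientid */
+             cb_info4        k;         /* callback + callback_ident */
+             verifier4       s;         /* setclientid_confirm value */
+             bool            confirmed; /* SETCLIENTID_CONFIRM seen? */
+     };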
+
+   Since SETCLIENTID is a non-idempotent operation, let us assume that
+   the server is implementing the duplicate request cache (DRC).
+
+   When the server gets a SETCLIENTID { v, x, k } request, it processes
+   it in the following manner.
+
+   o  It first looks up the request in the DRC. If there is a hit, it
+      returns the result cached in the DRC.  The server does NOT remove
+      client state (locks, shares, delegations) nor does it modify any
+      recorded callback and callback_ident information for client { x }.
+
+      For any DRC miss, the server takes the client id string x, and
+      searches for client records for x that the server may have
+      recorded from previous SETCLIENTID calls. For any confirmed record
+      with the same id string x, if the recorded principal does not
+      match that of the SETCLIENTID call, then the server returns an
+      NFS4ERR_CLID_INUSE error.
+
+      For brevity of discussion, the remaining description of the
+      processing assumes that there was a DRC miss, and that where the
+      server has previously recorded a confirmed record for client x,
+      the aforementioned principal check has successfully passed.
+
+
+
+Shepler, et al.             Standards Track                   [Page 211]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  The server checks if it has recorded a confirmed record for { v,
+      x, c, l, s }, where l may or may not equal k. If so, and since the
+      id verifier v of the request matches that which is confirmed and
+      recorded, the server treats this as a probable callback
+      information update and records an unconfirmed { v, x, c, k, t }
+      and leaves the confirmed { v, x, c, l, s } in place, such that t
+      != s. It does not matter if k equals l or not.  Any pre-existing
+      unconfirmed { v, x, c, *, * } is removed.
+
+      The server returns { c, t }. It is indeed returning the old
+      clientid4 value c, because the client apparently only wants to
+      update the callback value l to value k.  It's possible this
+      request is one from the Byzantine router that has stale callback
+      information, but this is not a problem.  The callback information
+      update is only confirmed if followed up by a SETCLIENTID_CONFIRM
+      { c, t }.
+
+      The server awaits confirmation of k via
+      SETCLIENTID_CONFIRM { c, t }.
+
+      The server does NOT remove client (lock/share/delegation) state
+      for x.
+
+   o  The server has previously recorded a confirmed { u, x, c, l, s }
+      record such that v != u, l may or may not equal k, and has not
+      recorded any unconfirmed { *, x, *, *, * } record for x.  The
+      server records an unconfirmed { v, x, d, k, t } (d != c, t != s).
+
+      The server returns { d, t }.
+
+      The server awaits confirmation of { d, k } via SETCLIENTID_CONFIRM
+      { d, t }.
+
+      The server does NOT remove client (lock/share/delegation) state
+      for x.
+
+   o  The server has previously recorded a confirmed { u, x, c, l, s }
+      record such that v != u, l may or may not equal k, and recorded an
+      unconfirmed { w, x, d, m, t } record such that c != d, t != s, m
+      may or may not equal k, m may or may not equal l, and k may or may
+      not equal l. Whether w == v or w != v makes no difference.  
The
+      server simply removes the unconfirmed { w, x, d, m, t } record and
+      replaces it with an unconfirmed { v, x, e, k, r } record, such
+      that e != d, e != c, r != t, r != s.
+
+      The server returns { e, r }.
+
+      The server awaits confirmation of { e, k } via
+      SETCLIENTID_CONFIRM { e, r }.
+
+
+
+Shepler, et al.             Standards Track                   [Page 212]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      The server does NOT remove client (lock/share/delegation) state
+      for x.
+
+   o  The server has no confirmed { *, x, *, *, * } for x. It may or may
+      not have recorded an unconfirmed { u, x, c, l, s }, where l may or
+      may not equal k, and u may or may not equal v.  Any unconfirmed
+      record { u, x, c, l, * }, regardless of whether u == v or l == k,
+      is replaced with an unconfirmed record { v, x, d, k, t } where
+      d != c, t != s.
+
+      The server returns { d, t }.
+
+      The server awaits confirmation of { d, k } via SETCLIENTID_CONFIRM
+      { d, t }.  The server does NOT remove client
+      (lock/share/delegation) state for x.
+
+   The server generates the clientid and setclientid_confirm values and
+   must take care to ensure that these values are extremely unlikely to
+   ever be regenerated.
+
+   ERRORS
+
+      NFS4ERR_BADXDR
+      NFS4ERR_CLID_INUSE
+      NFS4ERR_INVAL
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+
+14.2.34.  Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid
+
+   SYNOPSIS
+
+     clientid, verifier -> -
+
+   ARGUMENT
+
+     struct SETCLIENTID_CONFIRM4args {
+             clientid4       clientid;
+             verifier4       setclientid_confirm;
+     };
+
+   RESULT
+
+     struct SETCLIENTID_CONFIRM4res {
+             nfsstat4        status;
+     };
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 213]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   This operation is used by the client to confirm the results from a
+   previous call to SETCLIENTID.  The client provides the server-
+   supplied clientid (from the SETCLIENTID response).  The server
+   responds with a simple status of success or failure.
+
+   IMPLEMENTATION
+
+   The client must use the SETCLIENTID_CONFIRM operation to confirm the
+   following two distinct cases:
+
+   o  The client's use of a new shorthand client identifier (as returned
+      from the server in the response to SETCLIENTID), a new callback
+      value (as specified in the arguments to SETCLIENTID) and a new
+      callback_ident (as specified in the arguments to SETCLIENTID)
+      value.  The client's use of SETCLIENTID_CONFIRM in this case also
+      confirms the removal of any of the client's previous relevant
+      leased state. Relevant leased client state includes record locks,
+      share reservations, and where the server does not support the
+      CLAIM_DELEGATE_PREV claim type, delegations.  If the server
+      supports CLAIM_DELEGATE_PREV, then SETCLIENTID_CONFIRM MUST NOT
+      remove delegations for this client; relevant leased client state
+      would then just include record locks and share reservations.
+
+   o  The client's re-use of an old, previously confirmed, shorthand
+      client identifier, a new callback value, and a new callback_ident
+      value.  
The client's use of SETCLIENTID_CONFIRM in this case MUST
+      NOT result in the removal of any previous leased state (locks,
+      share reservations, and delegations).
+
+   We use the same notation and definitions for v, x, c, k, s, and
+   unconfirmed and confirmed client records as introduced in the
+   description of the SETCLIENTID operation. The arguments to
+   SETCLIENTID_CONFIRM are indicated by the notation { c, s }, where c
+   is a value of type clientid4, and s is a value of type verifier4
+   corresponding to the setclientid_confirm field.
+
+   As with SETCLIENTID, SETCLIENTID_CONFIRM is a non-idempotent
+   operation, and we assume that the server is implementing the
+   duplicate request cache (DRC).
+
+   When the server gets a SETCLIENTID_CONFIRM { c, s } request, it
+   processes it in the following manner.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 214]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  It first looks up the request in the DRC. If there is a hit, it
+      returns the result cached in the DRC.  The server does not remove
+      any relevant leased client state nor does it modify any recorded
+      callback and callback_ident information for client { x } as
+      represented by the shorthand value c.
+
+   For a DRC miss, the server checks for client records that match the
+   shorthand value c.  The processing cases are as follows:
+
+   o  The server has recorded an unconfirmed { v, x, c, k, s } record
+      and a confirmed { v, x, c, l, t } record, such that s != t.  If
+      the principals of the records do not match that of the
+      SETCLIENTID_CONFIRM, the server returns NFS4ERR_CLID_INUSE, and no
+      relevant leased client state is removed and no recorded callback
+      and callback_ident information for client { x } is changed.
+      Otherwise, the confirmed { v, x, c, l, t } record is removed and
+      the unconfirmed { v, x, c, k, s } is marked as confirmed, thereby
+      modifying recorded and confirmed callback and callback_ident
+      information for client { x }.
+
+      The server does not remove any relevant leased client state.
+
+      The server returns NFS4_OK.
+
+   o  The server has not recorded an unconfirmed { v, x, c, *, * } and
+      has recorded a confirmed { v, x, c, *, s }. If the principals of
+      the record and of SETCLIENTID_CONFIRM do not match, the server
+      returns NFS4ERR_CLID_INUSE without removing any relevant leased
+      client state and without changing recorded callback and
+      callback_ident values for client { x }.
+
+      If the principals match, then what has likely happened is that the
+      client never got the response from the SETCLIENTID_CONFIRM, and
+      the DRC entry has been purged. Whatever the scenario, since the
+      principals match, as well as { c, s } matching a confirmed record,
+      the server leaves client x's relevant leased client state intact,
+      leaves its callback and callback_ident values unmodified, and
+      returns NFS4_OK.
+
+   o  The server has not recorded a confirmed { *, *, c, *, * }, and has
+      recorded an unconfirmed { *, x, c, k, s }.  Even if this is a
+      retry from the client, nonetheless the client's first
+      SETCLIENTID_CONFIRM attempt was not received by the server.  Retry
+      or not, the server doesn't know, but it processes it as if it
+      were a first try.  
If the principal of the unconfirmed { *, x, c, k, s }
+      record mismatches that of the SETCLIENTID_CONFIRM request, the
+      server returns NFS4ERR_CLID_INUSE without removing any relevant
+      leased client state.
+
+
+
+Shepler, et al.             Standards Track                   [Page 215]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      Otherwise, the server records a confirmed { *, x, c, k, s }. If
+      there is also a confirmed { *, x, d, *, t }, the server MUST
+      remove client x's relevant leased client state, and overwrite
+      the callback state with k. The confirmed record { *, x, d, *, t }
+      is removed.
+
+      The server returns NFS4_OK.
+
+   o  The server has no record of a confirmed or unconfirmed { *, *, c,
+      *, s }.  The server returns NFS4ERR_STALE_CLIENTID.  The server
+      does not remove any relevant leased client state, nor does it
+      modify any recorded callback and callback_ident information for
+      any client.
+
+   The server needs to cache unconfirmed { v, x, c, k, s } client
+   records and await their confirmation for some time.  As should be
+   clear from the record processing discussions for SETCLIENTID and
+   SETCLIENTID_CONFIRM, there are cases where the server does not
+   deterministically remove unconfirmed client records.  To avoid
+   running out of resources, the server is not required to hold
+   unconfirmed records indefinitely.  One strategy the server might use
+   is to set a limit on how many unconfirmed client records it will
+   maintain, and then when the limit would be exceeded, remove the
+   oldest record. Another strategy might be to remove an unconfirmed
+   record when some amount of time has elapsed. The choice of the amount
+   of time is fairly arbitrary, but it is surely no higher than the
+   server's lease time period. Consider that leases need to be renewed
+   before the lease time expires via an operation from the client.  If
+   the client cannot issue a SETCLIENTID_CONFIRM after a SETCLIENTID
+   before a period of time equal to that of a lease expires, then the
+   client is unlikely to be able to maintain state on the server during
+   steady state operation.
+
+   If the client does send a SETCLIENTID_CONFIRM for an unconfirmed
+   record that the server has already deleted, the client will get
+   NFS4ERR_STALE_CLIENTID back.  If so, the client should then start
+   over, and send SETCLIENTID to reestablish an unconfirmed client
+   record and get back an unconfirmed clientid and setclientid_confirm
+   verifier.  The client should then send the SETCLIENTID_CONFIRM to
+   confirm the clientid.
+
+   SETCLIENTID_CONFIRM does not establish or renew a lease.  However, if
+   SETCLIENTID_CONFIRM removes relevant leased client state, and that
+   state does not include existing delegations, the server MUST allow
+   the client a period of time no less than the value of the lease_time
+   attribute, to reclaim (via the CLAIM_DELEGATE_PREV claim type of the
+   OPEN operation) its delegations before removing unreclaimed
+   delegations.
+
+
+
+Shepler, et al.             Standards Track                   [Page 216]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ERRORS
+
+      NFS4ERR_BADXDR
+      NFS4ERR_CLID_INUSE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE_CLIENTID
+
+14.2.35.  
Operation 37: VERIFY - Verify Same Attributes + +   SYNOPSIS + +     (cfh), fattr -> - + +   ARGUMENT + +     struct VERIFY4args { +             /* CURRENT_FH: object */ +             fattr4          obj_attributes; +     }; + +   RESULT + +     struct VERIFY4res { +             nfsstat4        status; +     }; + +   DESCRIPTION + +   The VERIFY operation is used to verify that attributes have a value +   assumed by the client before proceeding with following operations in +   the compound request.  If any of the attributes do not match then the +   error NFS4ERR_NOT_SAME must be returned.  The current filehandle +   retains its value after successful completion of the operation. + +   IMPLEMENTATION + +   One possible use of the VERIFY operation is the following compound +   sequence.  With this the client is attempting to verify that the file +   being removed will match what the client expects to be removed.  This +   sequence can help prevent the unintended deletion of a file. + +         PUTFH (directory filehandle) +         LOOKUP (file name) +         VERIFY (filehandle == fh) +         PUTFH (directory filehandle) +         REMOVE (file name) + + + + +Shepler, et al.             Standards Track                   [Page 217] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   This sequence does not prevent a second client from removing and +   creating a new file in the middle of this sequence but it does help +   avoid the unintended result. + +   In the case that a recommended attribute is specified in the VERIFY +   operation and the server does not support that attribute for the +   filesystem object, the error NFS4ERR_ATTRNOTSUPP is returned to the +   client. + +   When the attribute rdattr_error or any write-only attribute (e.g., +   time_modify_set) is specified, the error NFS4ERR_INVAL is returned to +   the client. + +   ERRORS + +      NFS4ERR_ACCESS +      NFS4ERR_ATTRNOTSUPP +      NFS4ERR_BADCHAR +      NFS4ERR_BADHANDLE +      NFS4ERR_BADXDR +      NFS4ERR_DELAY +      NFS4ERR_FHEXPIRED +      NFS4ERR_INVAL +      NFS4ERR_MOVED +      NFS4ERR_NOFILEHANDLE +      NFS4ERR_NOT_SAME +      NFS4ERR_RESOURCE +      NFS4ERR_SERVERFAULT +      NFS4ERR_STALE + +14.2.36.  Operation 38: WRITE - Write to File + +   SYNOPSIS + +     (cfh), stateid, offset, stable, data -> count, committed, writeverf + +   ARGUMENT + +     enum stable_how4 { +             UNSTABLE4       = 0, +             DATA_SYNC4      = 1, +             FILE_SYNC4      = 2 +     }; + +     struct WRITE4args { +             /* CURRENT_FH: file */ +             stateid4        stateid; +             offset4         offset; + + + +Shepler, et al.             Standards Track                   [Page 218] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +             stable_how4     stable; +             opaque          data<>; +     }; + +   RESULT + +     struct WRITE4resok { +             count4          count; +             stable_how4     committed; +             verifier4       writeverf; +     }; + +     union WRITE4res switch (nfsstat4 status) { +      case NFS4_OK: +              WRITE4resok    resok4; +      default: +              void; +     }; + +   DESCRIPTION + +   The WRITE operation is used to write data to a regular file.  The +   target file is specified by the current filehandle.  The offset +   specifies the offset where the data should be written.  
An offset of
+   0 (zero) specifies that the write should start at the beginning of
+   the file.  The count, as encoded as part of the opaque data
+   parameter, represents the number of bytes of data that are to be
+   written.  If the count is 0 (zero), the WRITE will succeed and return
+   a count of 0 (zero) subject to permissions checking.  The server may
+   choose to write fewer bytes than requested by the client.
+
+   Part of the write request is a specification of how the write is to
+   be performed.  The client specifies with the stable parameter how
+   the data is to be processed by the server.  If stable
+   is FILE_SYNC4, the server must commit the data written plus all
+   filesystem metadata to stable storage before returning results.  This
+   corresponds to the NFS version 2 protocol semantics.  Any other
+   behavior constitutes a protocol violation.  If stable is DATA_SYNC4,
+   then the server must commit all of the data to stable storage and
+   enough of the metadata to retrieve the data before returning.  The
+   server implementor is free to implement DATA_SYNC4 in the same
+   fashion as FILE_SYNC4, but with a possible performance drop.  If
+   stable is UNSTABLE4, the server is free to commit any part of the
+   data and the metadata to stable storage, including all or none,
+   before returning a reply to the client. There is no guarantee whether
+   or when any uncommitted data will subsequently be committed to stable
+   storage. The only guarantees made by the server are that it will not
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 219]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   destroy any data without changing the value of verf and that it will
+   not commit the data and metadata at a level less than that requested
+   by the client.
+
+   The stateid value for a WRITE request represents a value returned
+   from a previous record lock or share reservation request.  The
+   stateid is used by the server to verify that the associated share
+   reservation and any record locks are still valid and to update lease
+   timeouts for the client.
+
+   Upon successful completion, the following results are returned.  The
+   count result is the number of bytes of data written to the file. The
+   server may write fewer bytes than requested. If so, the actual number
+   of bytes written starting at location, offset, is returned.
+
+   The server also returns an indication of the level of commitment of
+   the data and metadata via committed. If the server committed all data
+   and metadata to stable storage, committed should be set to
+   FILE_SYNC4. If the level of commitment was at least as strong as
+   DATA_SYNC4, then committed should be set to DATA_SYNC4.  Otherwise,
+   committed must be returned as UNSTABLE4. If stable was FILE_SYNC4,
+   then committed must also be FILE_SYNC4: anything else constitutes a
+   protocol violation. If stable was DATA_SYNC4, then committed may be
+   FILE_SYNC4 or DATA_SYNC4: anything else constitutes a protocol
+   violation. If stable was UNSTABLE4, then committed may be either
+   FILE_SYNC4, DATA_SYNC4, or UNSTABLE4.
+
+   The final portion of the result is the write verifier.  The write
+   verifier is a cookie that the client can use to determine whether the
+   server has changed instance (boot) state between a call to WRITE and
+   a subsequent call to either WRITE or COMMIT.  
This cookie must be
+   consistent during a single instance of the NFS version 4 protocol
+   service and must be unique between instances of the NFS version 4
+   protocol server, where uncommitted data may be lost.
+
+   If a client writes data to the server with the stable argument set to
+   UNSTABLE4 and the reply yields a committed response of DATA_SYNC4 or
+   UNSTABLE4, the client will follow up some time in the future with a
+   COMMIT operation to synchronize outstanding asynchronous data and
+   metadata with the server's stable storage, barring client error. It
+   is possible that, due to a client crash or other error, a subsequent
+   COMMIT will not be received by the server.
+
+   For a WRITE with a stateid value of all bits 0, the server MAY allow
+   the WRITE to be serviced subject to mandatory file locks or the
+   current share deny modes for the file.  For a WRITE with a stateid
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 220]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   value of all bits 1, the server MUST NOT allow the WRITE operation to
+   bypass locking checks at the server; such a WRITE is treated exactly
+   the same as if a stateid of all bits 0 were used.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   It is possible for the server to write fewer bytes of data than
+   requested by the client.  In this case, the server should not return
+   an error unless no data was written at all.  If the server writes
+   less than the number of bytes specified, the client should issue
+   another WRITE to write the remaining data.
+
+   It is assumed that the act of writing data to a file will cause the
+   time_modify of the file to be updated.  However, the time_modify
+   of the file should not be changed unless the contents of the file are
+   changed.  Thus, a WRITE request with count set to 0 should not cause
+   the time_modify of the file to be updated.
+
+   The definition of stable storage has been historically a point of
+   contention.  The following expected properties of stable storage may
+   help in resolving design issues in the implementation. Stable storage
+   is persistent storage that survives:
+
+      1. Repeated power failures.
+      2. Hardware failures (of any board, power supply, etc.).
+      3. Repeated software crashes, including reboot cycle.
+
+   This definition does not address failure of the stable storage module
+   itself.
+
+   The verifier is defined to allow a client to detect different
+   instances of an NFS version 4 protocol server over which cached,
+   uncommitted data may be lost. In the most likely case, the verifier
+   allows the client to detect server reboots.  This information is
+   required so that the client can safely determine whether the server
+   could have lost cached data.  If the server fails unexpectedly and
+   the client has uncommitted data from previous WRITE requests (done
+   with the stable argument set to UNSTABLE4 and in which the result
+   committed was returned as UNSTABLE4 as well) it may not have flushed
+   cached data to stable storage. The burden of recovery is on the
+   client and the client will need to retransmit the data to the server.
+
+   A suggested verifier would be to use the time that the server was
+   booted or the time the server was last started (if restarting the
+   server without a reboot results in lost buffers).
+
+
+
+Shepler, et al.             
Standards Track                   [Page 221]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The committed field in the results allows the client to do more
+   effective caching.  If the server is committing all WRITE requests to
+   stable storage, then it should return with committed set to
+   FILE_SYNC4, regardless of the value of the stable field in the
+   arguments. A server that uses an NVRAM accelerator may choose to
+   implement this policy.  The client can use this to increase the
+   effectiveness of the cache by discarding cached data that has already
+   been committed on the server.
+
+   Some implementations may return NFS4ERR_NOSPC instead of
+   NFS4ERR_DQUOT when a user's quota is exceeded.  In the case that the
+   current filehandle is a directory, the server will return
+   NFS4ERR_ISDIR.  If the current filehandle is not a regular file or a
+   directory, the server will return NFS4ERR_INVAL.
+
+   If mandatory file locking is on for the file, and the corresponding
+   record of the data to be written to the file is read or write locked
+   by an owner that is not associated with the stateid, the server will
+   return NFS4ERR_LOCKED. If so, the client must check if the owner
+   corresponding to the stateid used with the WRITE operation has a
+   conflicting read lock that overlaps with the region that was to be
+   written. If the stateid's owner has no conflicting read lock, then
+   the client should try to get the appropriate write record lock via
+   the LOCK operation before re-attempting the WRITE. When the WRITE
+   completes, the client should release the record lock via LOCKU.
+
+   If the stateid's owner had a conflicting read lock, then the client
+   has no choice but to return an error to the application that
+   attempted the WRITE. The reason is that since the stateid's owner had
+   a read lock, the server either attempted to temporarily effectively
+   upgrade this read lock to a write lock, or the server has no upgrade
+   capability. If the server attempted to upgrade the read lock and
+   failed, it is pointless for the client to re-attempt the upgrade via
+   the LOCK operation, because there might be another client also trying
+   to upgrade.  If two clients are blocked trying to upgrade the same
+   lock, the clients deadlock.  If the server has no upgrade capability,
+   then it is pointless to try a LOCK operation to upgrade.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXPIRED
+
+
+
+Shepler, et al.             Standards Track                   [Page 222]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_FBIG
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NXIO
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_OPENMODE
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.37.  
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXPIRED
+
+
+
+Shepler, et al.             Standards Track                   [Page 222]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_FBIG
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NXIO
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_OPENMODE
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.37.  Operation 39: RELEASE_LOCKOWNER - Release Lockowner State
+
+   SYNOPSIS
+
+     lockowner -> ()
+
+   ARGUMENT
+
+     struct RELEASE_LOCKOWNER4args {
+             lock_owner4     lock_owner;
+     };
+
+   RESULT
+
+     struct RELEASE_LOCKOWNER4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+   This operation is used to notify the server that the lock_owner is no
+   longer in use by the client.  This allows the server to release
+   cached state related to the specified lock_owner.  If file locks
+   associated with the lock_owner are held at the server, the error
+   NFS4ERR_LOCKS_HELD will be returned and no further action will be
+   taken.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 223]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   IMPLEMENTATION
+
+   The client may choose to use this operation to reduce the amount of
+   server state that is held.  Depending on the behavior of applications
+   at the client, it may be important for the client to use this
+   operation, since the server has certain obligations with respect to
+   holding a reference to a lock_owner as long as the associated file is
+   open.  Therefore, if the client knows for certain that the lock_owner
+   will no longer be used under the context of the associated
+   open_owner4, it should use RELEASE_LOCKOWNER.
+
+   ERRORS
+
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADXDR
+      NFS4ERR_EXPIRED
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKS_HELD
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE_CLIENTID
+
+14.2.38.  Operation 10044: ILLEGAL - Illegal operation
+
+   SYNOPSIS
+
+     <null> -> ()
+
+   ARGUMENT
+
+             void;
+
+   RESULT
+
+             struct ILLEGAL4res {
+                     nfsstat4        status;
+             };
+
+   DESCRIPTION
+
+   This operation is a placeholder for encoding a result to handle the
+   case of the client sending an operation code within COMPOUND that is
+   not supported.  See the COMPOUND procedure description for more
+   details.
+
+   The status field of ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 224]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   IMPLEMENTATION
+
+   A client will probably not send an operation with code OP_ILLEGAL,
+   but if it does, the response will be ILLEGAL4res, just as it would be
+   with any other invalid operation code.  Note that if the server gets
+   an illegal operation code that is not OP_ILLEGAL, and if the server
+   checks for legal operation codes during the XDR decode phase, then
+   the ILLEGAL4res would not be returned.
+
+   ERRORS
+
+   NFS4ERR_OP_ILLEGAL
+
+15.  NFS version 4 Callback Procedures
+
+   The procedures used for callbacks are defined in the following
+   sections.  In the interest of clarity, the terms "client" and
+   "server" refer to NFS clients and servers, despite the fact that for
+   an individual callback RPC, the sense of these terms would be
+   precisely the opposite.
+
+15.1.  Procedure 0: CB_NULL - No Operation
+
+   SYNOPSIS
+
+     <null>
+
+   ARGUMENT
+
+     void;
+
+   RESULT
+
+     void;
+
+   DESCRIPTION
+
+   Standard NULL procedure.  Void argument, void response.  Even though
+   there is no direct functionality associated with this procedure, the
+   server will use CB_NULL to confirm the existence of a path for RPCs
+   from server to client.
+
+   ERRORS
+
+   None.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 225]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+15.2.  Procedure 1: CB_COMPOUND - Compound Operations
+
+   SYNOPSIS
+
+     compoundargs -> compoundres
+
+   ARGUMENT
+
+     enum nfs_cb_opnum4 {
+             OP_CB_GETATTR           = 3,
+             OP_CB_RECALL            = 4,
+             OP_CB_ILLEGAL           = 10044
+     };
+
+     union nfs_cb_argop4 switch (unsigned argop) {
+      case OP_CB_GETATTR:    CB_GETATTR4args opcbgetattr;
+      case OP_CB_RECALL:     CB_RECALL4args  opcbrecall;
+      case OP_CB_ILLEGAL:    void            opcbillegal;
+     };
+
+     struct CB_COMPOUND4args {
+             utf8str_cs      tag;
+             uint32_t        minorversion;
+             uint32_t        callback_ident;
+             nfs_cb_argop4   argarray<>;
+     };
+
+   RESULT
+
+     union nfs_cb_resop4 switch (unsigned resop) {
+      case OP_CB_GETATTR:    CB_GETATTR4res  opcbgetattr;
+      case OP_CB_RECALL:     CB_RECALL4res   opcbrecall;
+     };
+
+     struct CB_COMPOUND4res {
+             nfsstat4        status;
+             utf8str_cs      tag;
+             nfs_cb_resop4   resarray<>;
+     };
+
+   DESCRIPTION
+
+   The CB_COMPOUND procedure is used to combine one or more of the
+   callback procedures into a single RPC request.  The callback RPC
+   program has two main procedures: CB_NULL and CB_COMPOUND.  All other
+   operations use the CB_COMPOUND procedure as a wrapper.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 226]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   In the processing of the CB_COMPOUND procedure, the client may find
+   that it does not have the available resources to execute any or all
+   of the operations within the CB_COMPOUND sequence.  In this case, the
+   error NFS4ERR_RESOURCE will be returned for the particular operation
+   within the CB_COMPOUND procedure where the resource exhaustion
+   occurred.  This assumes that all previous operations within the
+   CB_COMPOUND sequence have been evaluated successfully.
+
+   Contained within the CB_COMPOUND results is a 'status' field.  This
+   status must be equivalent to the status of the last operation that
+   was executed within the CB_COMPOUND procedure.  Therefore, if an
+   operation incurred an error, then the 'status' value will be the same
+   error value as is being returned for the operation that failed.
+
+   For the definition of the "tag" field, see the section "Procedure 1:
+   COMPOUND - Compound Operations".
+
+   The value of callback_ident is supplied by the client during
+   SETCLIENTID.  The server must use the client-supplied callback_ident
+   during the CB_COMPOUND to allow the client to properly identify the
+   server.
+
+   Illegal operation codes are handled in the same way as they are
+   handled for the COMPOUND procedure.
+
+   IMPLEMENTATION
+
+   The CB_COMPOUND procedure is used to combine individual operations
+   into a single RPC request.  The client interprets each of the
+   operations in turn.  If an operation is executed by the client and
+   the status of that operation is NFS4_OK, then the next operation in
+   the CB_COMPOUND procedure is executed.  The client continues this
+   process until there are no more operations to be executed or one of
+   the operations has a status value other than NFS4_OK.
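+
+   A non-normative sketch of this stop-at-first-error evaluation
+   follows.  The cb_op type and the dispatch_cb_op() helper are
+   hypothetical stand-ins for a client's decoded argarray and its
+   per-operation handlers; they are not names defined by this protocol.
+
+      /* Sketch only: evaluate CB_COMPOUND operations in order,
+       * stopping at the first status other than NFS4_OK.  The value
+       * returned becomes the overall status of the reply, i.e., the
+       * status of the last operation that was executed. */
+      #define NFS4_OK 0
+
+      struct cb_op { unsigned argop; void *args; void *res; };
+
+      extern int dispatch_cb_op(struct cb_op *op);
+
+      int cb_compound(struct cb_op *ops, unsigned nops)
+      {
+          int status = NFS4_OK;
+          unsigned i;
+
+          for (i = 0; i < nops; i++) {
+              status = dispatch_cb_op(&ops[i]);
+              if (status != NFS4_OK)
+                  break;    /* remaining operations are not evaluated */
+          }
+          return status;    /* becomes CB_COMPOUND4res.status */
+      }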
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_OP_ILLEGAL
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 227]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+15.2.1.  Operation 3: CB_GETATTR - Get Attributes
+
+   SYNOPSIS
+
+     fh, attr_request -> attrmask, attr_vals
+
+   ARGUMENT
+
+     struct CB_GETATTR4args {
+             nfs_fh4 fh;
+             bitmap4 attr_request;
+     };
+
+   RESULT
+
+     struct CB_GETATTR4resok {
+             fattr4  obj_attributes;
+     };
+
+     union CB_GETATTR4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              CB_GETATTR4resok       resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+   The CB_GETATTR operation is used by the server to obtain the current
+   modified state of a file that has been write delegated.  The
+   attributes size and change are the only ones guaranteed to be
+   serviced by the client.  See the section "Handling of CB_GETATTR" for
+   a full description of how the client and server are to interact with
+   the use of CB_GETATTR.
+
+   If the filehandle specified is not one for which the client holds a
+   write open delegation, an NFS4ERR_BADHANDLE error is returned.
+
+   IMPLEMENTATION
+
+   The client returns attrmask bits and the associated attribute values
+   only for the change attribute and the attributes that it may change
+   (time_modify and size).
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 228]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+
+15.2.2.  Operation 4: CB_RECALL - Recall an Open Delegation
+
+   SYNOPSIS
+
+     stateid, truncate, fh -> ()
+
+   ARGUMENT
+
+     struct CB_RECALL4args {
+             stateid4        stateid;
+             bool            truncate;
+             nfs_fh4         fh;
+     };
+
+   RESULT
+
+     struct CB_RECALL4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+   The CB_RECALL operation is used to begin the process of recalling an
+   open delegation and returning it to the server.
+
+   The truncate flag is used to optimize recall for a file which is
+   about to be truncated to zero.  When it is set, the client is freed
+   of the obligation to propagate modified data for the file to the
+   server, since this data is irrelevant.
+
+   If the handle specified is not one for which the client holds an open
+   delegation, an NFS4ERR_BADHANDLE error is returned.
+
+   If the stateid specified is not one corresponding to an open
+   delegation for the file specified by the filehandle, an
+   NFS4ERR_BAD_STATEID is returned.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 229]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   IMPLEMENTATION
+
+   The client should reply to the callback immediately.  Replying does
+   not complete the recall except when an error was returned.  The
+   recall is not complete until the delegation is returned using a
+   DELEGRETURN.
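+
+   As a non-normative illustration of that ordering, a client's recall
+   handler might be structured as sketched below.  The queue_task(),
+   flush_dirty_data(), and send_delegreturn() helpers are hypothetical
+   stand-ins for the client's task and RPC machinery.
+
+      /* Sketch only: reply to CB_RECALL at once, then finish the
+       * recall asynchronously with writeback and DELEGRETURN. */
+      struct recall_args { void *stateid; int truncate; void *fh; };
+
+      extern int  flush_dirty_data(void *fh);
+      extern void send_delegreturn(void *fh, void *stateid);
+      extern void queue_task(void (*fn)(struct recall_args *),
+                             struct recall_args *a);
+
+      static void finish_recall(struct recall_args *a)
+      {
+          /* When truncate is set the modified data is irrelevant,
+           * so the client need not propagate it to the server. */
+          if (!a->truncate)
+              flush_dirty_data(a->fh);
+
+          /* The recall is complete only once the delegation is
+           * returned with DELEGRETURN. */
+          send_delegreturn(a->fh, a->stateid);
+      }
+
+      int handle_cb_recall(struct recall_args *a)
+      {
+          queue_task(finish_recall, a);  /* long-running work later */
+          return 0;                      /* reply NFS4_OK immediately */
+      }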
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+
+15.2.3.  Operation 10044: CB_ILLEGAL - Illegal Callback Operation
+
+   SYNOPSIS
+
+     <null> -> ()
+
+   ARGUMENT
+
+       void;
+
+   RESULT
+
+             struct CB_ILLEGAL4res {
+                     nfsstat4        status;
+             };
+
+   DESCRIPTION
+
+   This operation is a placeholder for encoding a result to handle the
+   case of the server sending an operation code within CB_COMPOUND that
+   is not supported.  See the CB_COMPOUND procedure description for more
+   details.
+
+   The status field of CB_ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL.
+
+   IMPLEMENTATION
+
+   A server will probably not send an operation with code OP_CB_ILLEGAL,
+   but if it does, the response will be CB_ILLEGAL4res, just as it would
+   be with any other invalid operation code.  Note that if the client
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 230]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   gets an illegal operation code that is not OP_CB_ILLEGAL, and if the
+   client checks for legal operation codes during the XDR decode phase,
+   then the CB_ILLEGAL4res would not be returned.
+
+   ERRORS
+
+   NFS4ERR_OP_ILLEGAL
+
+16.  Security Considerations
+
+   NFS has historically used a model where, from an authentication
+   perspective, the client was the entire machine, or at least the
+   source IP address of the machine.  The NFS server relied on the NFS
+   client to make the proper authentication of the end-user.  The NFS
+   server in turn shared its files only to specific clients, as
+   identified by the client's source IP address.  Given this model, the
+   AUTH_SYS RPC security flavor simply identified the end-user using the
+   client to the NFS server.  When processing NFS responses, the client
+   ensured that the responses came from the same IP address and port
+   number that the request was sent to.  While such a model is easy to
+   implement and simple to deploy and use, it is certainly not a safe
+   model.  Thus, NFSv4 mandates that implementations support a security
+   model that uses end to end authentication, where an end-user on a
+   client mutually authenticates (via cryptographic schemes that do not
+   expose passwords or keys in the clear on the network) to a principal
+   on an NFS server.  Consideration should also be given to the
+   integrity and privacy of NFS requests and responses.  The issues of
+   end to end mutual authentication, integrity, and privacy are
+   discussed as part of the section on "RPC and Security Flavor".
+
+   Note that while NFSv4 mandates an end to end mutual authentication
+   model, the "classic" model of machine authentication via IP address
+   checking and AUTH_SYS identification can still be supported with the
+   caveat that the AUTH_SYS flavor is neither MANDATORY nor RECOMMENDED
+   by this specification, and so interoperability via AUTH_SYS is not
+   assured.
+
+   For reasons of reduced administration overhead, better performance
+   and/or reduction of CPU utilization, users of NFS version 4
+   implementations may choose not to use security mechanisms that enable
+   integrity protection on each remote procedure call and response.  The
+   use of mechanisms without integrity leaves the customer vulnerable to
+   an attacker in between the NFS client and server that modifies the
+   RPC request and/or the response.  While implementations are free to
+   provide the option to use weaker security mechanisms, there are two
+   operations in particular that warrant the implementation overriding
+   user choices.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 231]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The first such operation is SECINFO.  It is recommended that the
+   client issue the SECINFO call such that it is protected with a
+   security flavor that has integrity protection, such as RPCSEC_GSS
+   with a security triple that uses either the rpc_gss_svc_integrity or
+   rpc_gss_svc_privacy service (rpc_gss_svc_privacy includes integrity
+   protection).  Without integrity protection encapsulating SECINFO and
+   therefore its results, an attacker in the middle could modify results
+   such that the client might select a weaker algorithm in the set
+   allowed by the server, making the client and/or server vulnerable to
+   further attacks.
+
+   The second operation that should definitely use integrity protection
+   is any GETATTR for the fs_locations attribute.  The attack has two
+   steps.  First the attacker modifies the unprotected results of some
+   operation to return NFS4ERR_MOVED.  Second, when the client follows
+   up with a GETATTR for the fs_locations attribute, the attacker
+   modifies the results to cause the client to migrate its traffic to a
+   server controlled by the attacker.
+
+   Because the operations SETCLIENTID/SETCLIENTID_CONFIRM are
+   responsible for the release of client state, it is imperative that
+   the principal used for these operations is checked against and
+   matches the previous use of these operations.  See the section
+   "Client ID" for further discussion.
+
+17.  IANA Considerations
+
+17.1.  Named Attribute Definition
+
+   The NFS version 4 protocol provides for the association of named
+   attributes to files.  The name space identifiers for these attributes
+   are defined as string names.  The protocol does not define the
+   specific assignment of the name space for these file attributes.
+   Even though the name space is not specifically controlled to prevent
+   collisions, an IANA registry has been created for the registration of
+   NFS version 4 named attributes.  Registration will be achieved
+   through the publication of an Informational RFC and will require not
+   only the name of the attribute but also the syntax and semantics of
+   the named attribute contents; the intent is to promote
+   interoperability where common interests exist.  While application
+   developers are allowed to define and use attributes as needed, they
+   are encouraged to register the attributes with IANA.
+
+17.2.  ONC RPC Network Identifiers (netids)
+
+   The section "Structured Data Types" discussed the r_netid field and
+   the corresponding r_addr field of a clientaddr4 structure.  The NFS
+   version 4 protocol depends on the syntax and semantics of these
+
+
+
+Shepler, et al.             Standards Track                   [Page 232]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   fields to effectively communicate callback information between client
+   and server.  Therefore, an IANA registry has been created to include
+   the values defined in this document and to allow for future expansion
+   based on transport usage/availability.  Additions to this ONC RPC
+   Network Identifier registry must be done with the publication of an
+   RFC.
+
+   The initial values for this registry are as follows (some of this
+   text is replicated from section 2.2 for clarity):
+
+   The Network Identifier (or r_netid for short) is used to specify a
+   transport protocol and associated universal address (or r_addr for
+   short).  The syntax of the Network Identifier is a US-ASCII string.
+   The initial definitions for r_netid are:
+
+      "tcp"   - TCP over IP version 4
+
+      "udp"   - UDP over IP version 4
+
+      "tcp6"  - TCP over IP version 6
+
+      "udp6"  - UDP over IP version 6
+
+   Note: the '"' marks are used for delimiting the strings for this
+   document and are not part of the Network Identifier string.
+
+   For the "tcp" and "udp" Network Identifiers the Universal Address or
+   r_addr (for IPv4) is a US-ASCII string and is of the form:
+
+   h1.h2.h3.h4.p1.p2
+
+   The prefix, "h1.h2.h3.h4", is the standard textual form for
+   representing an IPv4 address, which is always four octets long.
+   Assuming big-endian ordering, h1, h2, h3, and h4, are respectively,
+   the first through fourth octets each converted to ASCII-decimal.
+   Assuming big-endian ordering, p1 and p2 are, respectively, the first
+   and second octets each converted to ASCII-decimal.  For example, if a
+   host, in big-endian order, has an address of 0x0A010307 and there is
+   a service listening on, in big-endian order, port 0x020F (decimal
+   527), then the complete universal address is "10.1.3.7.2.15".
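+
+   A small, non-normative C sketch that reproduces the computation in
+   the example above (nothing here is part of the protocol itself):
+
+      /* Sketch only: derive the "tcp"/"udp" universal address string
+       * of the form h1.h2.h3.h4.p1.p2 from an IPv4 address and port,
+       * both given in host byte order. */
+      #include <stdio.h>
+
+      void ipv4_uaddr(unsigned int addr, unsigned int port,
+                      char *buf, size_t buflen)
+      {
+          snprintf(buf, buflen, "%u.%u.%u.%u.%u.%u",
+                   (addr >> 24) & 0xff, (addr >> 16) & 0xff,
+                   (addr >>  8) & 0xff,  addr        & 0xff,
+                   (port >>  8) & 0xff,  port        & 0xff);
+      }
+
+      int main(void)
+      {
+          char uaddr[32];
+
+          /* The example above: address 0x0A010307, port 0x020F. */
+          ipv4_uaddr(0x0A010307, 0x020F, uaddr, sizeof uaddr);
+          printf("%s\n", uaddr);      /* prints 10.1.3.7.2.15 */
+          return 0;
+      }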
+
+   For the "tcp6" and "udp6" Network Identifiers the Universal Address
+   or r_addr (for IPv6) is a US-ASCII string and is of the form:
+
+      x1:x2:x3:x4:x5:x6:x7:x8.p1.p2
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 233]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The suffix "p1.p2" is the service port, and is computed the same way
+   as with universal addresses for "tcp" and "udp".  The prefix,
+   "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for
+   representing an IPv6 address as defined in Section 2.2 of [RFC2373].
+   Additionally, the two alternative forms specified in Section 2.2 of
+   [RFC2373] are also acceptable.
+
+   As mentioned, the registration of new Network Identifiers will
+   require the publication of an Informational RFC with similar detail
+   as listed above for the Network Identifier itself and corresponding
+   Universal Address.
+
+18.  RPC definition file
+
+   /*
+    *  Copyright (C) The Internet Society (1998,1999,2000,2001,2002).
+    *  All Rights Reserved.
+    */ + +   /* +    *      nfs4_prot.x +    * +    */ + +   %#pragma ident  "%W%" + +   /* +    * Basic typedefs for RFC 1832 data type definitions +    */ +   typedef int             int32_t; +   typedef unsigned int    uint32_t; +   typedef hyper           int64_t; +   typedef unsigned hyper  uint64_t; + +   /* +    * Sizes +    */ +   const NFS4_FHSIZE               = 128; +   const NFS4_VERIFIER_SIZE        = 8; +   const NFS4_OPAQUE_LIMIT         = 1024; + +   /* +    * File types +    */ +   enum nfs_ftype4 { +           NF4REG          = 1,    /* Regular File */ +           NF4DIR          = 2,    /* Directory */ +           NF4BLK          = 3,    /* Special File - block device */ + + + +Shepler, et al.             Standards Track                   [Page 234] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           NF4CHR          = 4,    /* Special File - character device */ +           NF4LNK          = 5,    /* Symbolic Link */ +           NF4SOCK         = 6,    /* Special File - socket */ +           NF4FIFO         = 7,    /* Special File - fifo */ +           NF4ATTRDIR      = 8,    /* Attribute Directory */ +           NF4NAMEDATTR    = 9     /* Named Attribute */ +   }; + +   /* +    * Error status +    */ +   enum nfsstat4 { +           NFS4_OK                 = 0,    /* everything is okay      */ +           NFS4ERR_PERM            = 1,    /* caller not privileged   */ +           NFS4ERR_NOENT           = 2,    /* no such file/directory  */ +           NFS4ERR_IO              = 5,    /* hard I/O error          */ +           NFS4ERR_NXIO            = 6,    /* no such device          */ +           NFS4ERR_ACCESS          = 13,   /* access denied           */ +           NFS4ERR_EXIST           = 17,   /* file already exists     */ +           NFS4ERR_XDEV            = 18,   /* different filesystems   */ +           /* Unused/reserved        19 */ +           NFS4ERR_NOTDIR          = 20,   /* should be a directory   */ +           NFS4ERR_ISDIR           = 21,   /* should not be directory */ +           NFS4ERR_INVAL           = 22,   /* invalid argument        */ +           NFS4ERR_FBIG            = 27,   /* file exceeds server max */ +           NFS4ERR_NOSPC           = 28,   /* no space on filesystem  */ +           NFS4ERR_ROFS            = 30,   /* read-only filesystem    */ +           NFS4ERR_MLINK           = 31,   /* too many hard links     */ +           NFS4ERR_NAMETOOLONG     = 63,   /* name exceeds server max */ +           NFS4ERR_NOTEMPTY        = 66,   /* directory not empty     */ +           NFS4ERR_DQUOT           = 69,   /* hard quota limit reached*/ +           NFS4ERR_STALE           = 70,   /* file no longer exists   */ +           NFS4ERR_BADHANDLE       = 10001,/* Illegal filehandle      */ +           NFS4ERR_BAD_COOKIE      = 10003,/* READDIR cookie is stale */ +           NFS4ERR_NOTSUPP         = 10004,/* operation not supported */ +           NFS4ERR_TOOSMALL        = 10005,/* response limit exceeded */ +           NFS4ERR_SERVERFAULT     = 10006,/* undefined server error  */ +           NFS4ERR_BADTYPE         = 10007,/* type invalid for CREATE */ +           NFS4ERR_DELAY           = 10008,/* file "busy" - retry     */ +           NFS4ERR_SAME            = 10009,/* nverify says attrs same */ +           NFS4ERR_DENIED          = 10010,/* lock unavailable        */ +           NFS4ERR_EXPIRED         = 10011,/* lock lease expired      */ +           NFS4ERR_LOCKED          = 10012,/* I/O failed due to 
lock  */ +           NFS4ERR_GRACE           = 10013,/* in grace period         */ +           NFS4ERR_FHEXPIRED       = 10014,/* filehandle expired      */ +           NFS4ERR_SHARE_DENIED    = 10015,/* share reserve denied    */ +           NFS4ERR_WRONGSEC        = 10016,/* wrong security flavor   */ +           NFS4ERR_CLID_INUSE      = 10017,/* clientid in use         */ + + + +Shepler, et al.             Standards Track                   [Page 235] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           NFS4ERR_RESOURCE        = 10018,/* resource exhaustion     */ +           NFS4ERR_MOVED           = 10019,/* filesystem relocated    */ +           NFS4ERR_NOFILEHANDLE    = 10020,/* current FH is not set   */ +           NFS4ERR_MINOR_VERS_MISMATCH = 10021,/* minor vers not supp */ +           NFS4ERR_STALE_CLIENTID  = 10022,/* server has rebooted     */ +           NFS4ERR_STALE_STATEID   = 10023,/* server has rebooted     */ +           NFS4ERR_OLD_STATEID     = 10024,/* state is out of sync    */ +           NFS4ERR_BAD_STATEID     = 10025,/* incorrect stateid       */ +           NFS4ERR_BAD_SEQID       = 10026,/* request is out of seq.  */ +           NFS4ERR_NOT_SAME        = 10027,/* verify - attrs not same */ +           NFS4ERR_LOCK_RANGE      = 10028,/* lock range not supported*/ +           NFS4ERR_SYMLINK         = 10029,/* should be file/directory*/ +           NFS4ERR_RESTOREFH       = 10030,/* no saved filehandle     */ +           NFS4ERR_LEASE_MOVED     = 10031,/* some filesystem moved   */ +           NFS4ERR_ATTRNOTSUPP     = 10032,/* recommended attr not sup*/ +           NFS4ERR_NO_GRACE        = 10033,/* reclaim outside of grace*/ +           NFS4ERR_RECLAIM_BAD     = 10034,/* reclaim error at server */ +           NFS4ERR_RECLAIM_CONFLICT = 10035,/* conflict on reclaim    */ +           NFS4ERR_BADXDR          = 10036,/* XDR decode failed       */ +           NFS4ERR_LOCKS_HELD      = 10037,/* file locks held at CLOSE*/ +           NFS4ERR_OPENMODE        = 10038,/* conflict in OPEN and I/O*/ +           NFS4ERR_BADOWNER        = 10039,/* owner translation bad   */ +           NFS4ERR_BADCHAR         = 10040,/* utf-8 char not supported*/ +           NFS4ERR_BADNAME         = 10041,/* name not supported      */ +           NFS4ERR_BAD_RANGE       = 10042,/* lock range not supported*/ +           NFS4ERR_LOCK_NOTSUPP    = 10043,/* no atomic up/downgrade  */ +           NFS4ERR_OP_ILLEGAL      = 10044,/* undefined operation     */ +           NFS4ERR_DEADLOCK        = 10045,/* file locking deadlock   */ +           NFS4ERR_FILE_OPEN       = 10046,/* open file blocks op.    */ +           NFS4ERR_ADMIN_REVOKED   = 10047,/* lockowner state revoked */ +           NFS4ERR_CB_PATH_DOWN    = 10048 /* callback path down      */ +   }; + +   /* +    * Basic data types +    */ +   typedef uint32_t        bitmap4<>; +   typedef uint64_t        offset4; +   typedef uint32_t        count4; +   typedef uint64_t        length4; +   typedef uint64_t        clientid4; +   typedef uint32_t        seqid4; +   typedef opaque          utf8string<>; +   typedef utf8string      utf8str_cis; +   typedef utf8string      utf8str_cs; +   typedef utf8string      utf8str_mixed; +   typedef utf8str_cs      component4; +   typedef component4      pathname4<>; + + + +Shepler, et al.             
Standards Track                   [Page 236] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   typedef uint64_t        nfs_lockid4; +   typedef uint64_t        nfs_cookie4; +   typedef utf8str_cs      linktext4; +   typedef opaque          sec_oid4<>; +   typedef uint32_t        qop4; +   typedef uint32_t        mode4; +   typedef uint64_t        changeid4; +   typedef opaque          verifier4[NFS4_VERIFIER_SIZE]; + +   /* +    * Timeval +    */ +   struct nfstime4 { +           int64_t         seconds; +           uint32_t        nseconds; +   }; + +   enum time_how4 { +           SET_TO_SERVER_TIME4 = 0, +           SET_TO_CLIENT_TIME4 = 1 +   }; + +   union settime4 switch (time_how4 set_it) { +    case SET_TO_CLIENT_TIME4: +            nfstime4       time; +    default: +            void; +   }; + +   /* +    * File access handle +    */ +   typedef opaque  nfs_fh4<NFS4_FHSIZE>; + + +   /* +    * File attribute definitions +    */ + +   /* +    * FSID structure for major/minor +    */ +   struct fsid4 { +           uint64_t        major; +           uint64_t        minor; +   }; + +   /* + + + +Shepler, et al.             Standards Track                   [Page 237] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +    * Filesystem locations attribute for relocation/migration +    */ +   struct fs_location4 { +           utf8str_cis     server<>; +           pathname4       rootpath; +   }; + +   struct fs_locations4 { +           pathname4       fs_root; +           fs_location4    locations<>; +   }; + +   /* +    * Various Access Control Entry definitions +    */ + +   /* +    * Mask that indicates which Access Control Entries are supported. +    * Values for the fattr4_aclsupport attribute. +    */ +   const ACL4_SUPPORT_ALLOW_ACL    = 0x00000001; +   const ACL4_SUPPORT_DENY_ACL     = 0x00000002; +   const ACL4_SUPPORT_AUDIT_ACL    = 0x00000004; +   const ACL4_SUPPORT_ALARM_ACL    = 0x00000008; + + +   typedef uint32_t        acetype4; +   /* +    * acetype4 values, others can be added as needed. +    */ +   const ACE4_ACCESS_ALLOWED_ACE_TYPE      = 0x00000000; +   const ACE4_ACCESS_DENIED_ACE_TYPE       = 0x00000001; +   const ACE4_SYSTEM_AUDIT_ACE_TYPE        = 0x00000002; +   const ACE4_SYSTEM_ALARM_ACE_TYPE        = 0x00000003; + + +   /* +    * ACE flag +    */ +   typedef uint32_t aceflag4; + +   /* +    * ACE flag values +    */ +   const ACE4_FILE_INHERIT_ACE             = 0x00000001; +   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002; +   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004; +   const ACE4_INHERIT_ONLY_ACE             = 0x00000008; + + + +Shepler, et al.             
Standards Track                   [Page 238] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010; +   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020; +   const ACE4_IDENTIFIER_GROUP             = 0x00000040; + + +   /* +    * ACE mask +    */ +   typedef uint32_t        acemask4; + +   /* +    * ACE mask values +    */ +   const ACE4_READ_DATA            = 0x00000001; +   const ACE4_LIST_DIRECTORY       = 0x00000001; +   const ACE4_WRITE_DATA           = 0x00000002; +   const ACE4_ADD_FILE             = 0x00000002; +   const ACE4_APPEND_DATA          = 0x00000004; +   const ACE4_ADD_SUBDIRECTORY     = 0x00000004; +   const ACE4_READ_NAMED_ATTRS     = 0x00000008; +   const ACE4_WRITE_NAMED_ATTRS    = 0x00000010; +   const ACE4_EXECUTE              = 0x00000020; +   const ACE4_DELETE_CHILD         = 0x00000040; +   const ACE4_READ_ATTRIBUTES      = 0x00000080; +   const ACE4_WRITE_ATTRIBUTES     = 0x00000100; + +   const ACE4_DELETE               = 0x00010000; +   const ACE4_READ_ACL             = 0x00020000; +   const ACE4_WRITE_ACL            = 0x00040000; +   const ACE4_WRITE_OWNER          = 0x00080000; +   const ACE4_SYNCHRONIZE          = 0x00100000; + +   /* +    * ACE4_GENERIC_READ -- defined as combination of +    *      ACE4_READ_ACL | +    *      ACE4_READ_DATA | +    *      ACE4_READ_ATTRIBUTES | +    *      ACE4_SYNCHRONIZE +    */ + +   const ACE4_GENERIC_READ = 0x00120081; + +   /* +    * ACE4_GENERIC_WRITE -- defined as combination of +    *      ACE4_READ_ACL | +    *      ACE4_WRITE_DATA | +    *      ACE4_WRITE_ATTRIBUTES | +    *      ACE4_WRITE_ACL | + + + +Shepler, et al.             Standards Track                   [Page 239] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +    *      ACE4_APPEND_DATA | +    *      ACE4_SYNCHRONIZE +    */ +   const ACE4_GENERIC_WRITE = 0x00160106; + + +   /* +    * ACE4_GENERIC_EXECUTE -- defined as combination of +    *      ACE4_READ_ACL +    *      ACE4_READ_ATTRIBUTES +    *      ACE4_EXECUTE +    *      ACE4_SYNCHRONIZE +    */ +   const ACE4_GENERIC_EXECUTE = 0x001200A0; + + +   /* +    * Access Control Entry definition +    */ +   struct nfsace4 { +           acetype4        type; +           aceflag4        flag; +           acemask4        access_mask; +           utf8str_mixed   who; +   }; + +   /* +    * Field definitions for the fattr4_mode attribute +    */ +   const MODE4_SUID = 0x800;  /* set user id on execution */ +   const MODE4_SGID = 0x400;  /* set group id on execution */ +   const MODE4_SVTX = 0x200;  /* save text even after use */ +   const MODE4_RUSR = 0x100;  /* read permission: owner */ +   const MODE4_WUSR = 0x080;  /* write permission: owner */ +   const MODE4_XUSR = 0x040;  /* execute permission: owner */ +   const MODE4_RGRP = 0x020;  /* read permission: group */ +   const MODE4_WGRP = 0x010;  /* write permission: group */ +   const MODE4_XGRP = 0x008;  /* execute permission: group */ +   const MODE4_ROTH = 0x004;  /* read permission: other */ +   const MODE4_WOTH = 0x002;  /* write permission: other */ +   const MODE4_XOTH = 0x001;  /* execute permission: other */ + +   /* +    * Special data/attribute associated with +    * file types NF4BLK and NF4CHR. +    */ +   struct specdata4 { +           uint32_t        specdata1;      /* major device number */ + + + +Shepler, et al.             
Standards Track                   [Page 240] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           uint32_t        specdata2;      /* minor device number */ +   }; + +   /* +    * Values for fattr4_fh_expire_type +    */ +   const   FH4_PERSISTENT          = 0x00000000; +   const   FH4_NOEXPIRE_WITH_OPEN  = 0x00000001; +   const   FH4_VOLATILE_ANY        = 0x00000002; +   const   FH4_VOL_MIGRATION       = 0x00000004; +   const   FH4_VOL_RENAME          = 0x00000008; + + +   typedef bitmap4         fattr4_supported_attrs; +   typedef nfs_ftype4      fattr4_type; +   typedef uint32_t        fattr4_fh_expire_type; +   typedef changeid4       fattr4_change; +   typedef uint64_t        fattr4_size; +   typedef bool            fattr4_link_support; +   typedef bool            fattr4_symlink_support; +   typedef bool            fattr4_named_attr; +   typedef fsid4           fattr4_fsid; +   typedef bool            fattr4_unique_handles; +   typedef uint32_t        fattr4_lease_time; +   typedef nfsstat4        fattr4_rdattr_error; + +   typedef nfsace4         fattr4_acl<>; +   typedef uint32_t        fattr4_aclsupport; +   typedef bool            fattr4_archive; +   typedef bool            fattr4_cansettime; +   typedef bool            fattr4_case_insensitive; +   typedef bool            fattr4_case_preserving; +   typedef bool            fattr4_chown_restricted; +   typedef uint64_t        fattr4_fileid; +   typedef uint64_t        fattr4_files_avail; +   typedef nfs_fh4         fattr4_filehandle; +   typedef uint64_t        fattr4_files_free; +   typedef uint64_t        fattr4_files_total; +   typedef fs_locations4   fattr4_fs_locations; +   typedef bool            fattr4_hidden; +   typedef bool            fattr4_homogeneous; +   typedef uint64_t        fattr4_maxfilesize; +   typedef uint32_t        fattr4_maxlink; +   typedef uint32_t        fattr4_maxname; +   typedef uint64_t        fattr4_maxread; +   typedef uint64_t        fattr4_maxwrite; +   typedef utf8str_cs      fattr4_mimetype; +   typedef mode4           fattr4_mode; + + + +Shepler, et al.             
Standards Track                   [Page 241] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   typedef uint64_t        fattr4_mounted_on_fileid; +   typedef bool            fattr4_no_trunc; +   typedef uint32_t        fattr4_numlinks; +   typedef utf8str_mixed   fattr4_owner; +   typedef utf8str_mixed   fattr4_owner_group; +   typedef uint64_t        fattr4_quota_avail_hard; +   typedef uint64_t        fattr4_quota_avail_soft; +   typedef uint64_t        fattr4_quota_used; +   typedef specdata4       fattr4_rawdev; +   typedef uint64_t        fattr4_space_avail; +   typedef uint64_t        fattr4_space_free; +   typedef uint64_t        fattr4_space_total; +   typedef uint64_t        fattr4_space_used; +   typedef bool            fattr4_system; +   typedef nfstime4        fattr4_time_access; +   typedef settime4        fattr4_time_access_set; +   typedef nfstime4        fattr4_time_backup; +   typedef nfstime4        fattr4_time_create; +   typedef nfstime4        fattr4_time_delta; +   typedef nfstime4        fattr4_time_metadata; +   typedef nfstime4        fattr4_time_modify; +   typedef settime4        fattr4_time_modify_set; + + +   /* +    * Mandatory Attributes +    */ +   const FATTR4_SUPPORTED_ATTRS    = 0; +   const FATTR4_TYPE               = 1; +   const FATTR4_FH_EXPIRE_TYPE     = 2; +   const FATTR4_CHANGE             = 3; +   const FATTR4_SIZE               = 4; +   const FATTR4_LINK_SUPPORT       = 5; +   const FATTR4_SYMLINK_SUPPORT    = 6; +   const FATTR4_NAMED_ATTR         = 7; +   const FATTR4_FSID               = 8; +   const FATTR4_UNIQUE_HANDLES     = 9; +   const FATTR4_LEASE_TIME         = 10; +   const FATTR4_RDATTR_ERROR       = 11; +   const FATTR4_FILEHANDLE         = 19; + +   /* +    * Recommended Attributes +    */ +   const FATTR4_ACL                = 12; +   const FATTR4_ACLSUPPORT         = 13; +   const FATTR4_ARCHIVE            = 14; +   const FATTR4_CANSETTIME         = 15; + + + +Shepler, et al.             
Standards Track                   [Page 242] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   const FATTR4_CASE_INSENSITIVE   = 16; +   const FATTR4_CASE_PRESERVING    = 17; +   const FATTR4_CHOWN_RESTRICTED   = 18; +   const FATTR4_FILEID             = 20; +   const FATTR4_FILES_AVAIL        = 21; +   const FATTR4_FILES_FREE         = 22; +   const FATTR4_FILES_TOTAL        = 23; +   const FATTR4_FS_LOCATIONS       = 24; +   const FATTR4_HIDDEN             = 25; +   const FATTR4_HOMOGENEOUS        = 26; +   const FATTR4_MAXFILESIZE        = 27; +   const FATTR4_MAXLINK            = 28; +   const FATTR4_MAXNAME            = 29; +   const FATTR4_MAXREAD            = 30; +   const FATTR4_MAXWRITE           = 31; +   const FATTR4_MIMETYPE           = 32; +   const FATTR4_MODE               = 33; +   const FATTR4_NO_TRUNC           = 34; +   const FATTR4_NUMLINKS           = 35; +   const FATTR4_OWNER              = 36; +   const FATTR4_OWNER_GROUP        = 37; +   const FATTR4_QUOTA_AVAIL_HARD   = 38; +   const FATTR4_QUOTA_AVAIL_SOFT   = 39; +   const FATTR4_QUOTA_USED         = 40; +   const FATTR4_RAWDEV             = 41; +   const FATTR4_SPACE_AVAIL        = 42; +   const FATTR4_SPACE_FREE         = 43; +   const FATTR4_SPACE_TOTAL        = 44; +   const FATTR4_SPACE_USED         = 45; +   const FATTR4_SYSTEM             = 46; +   const FATTR4_TIME_ACCESS        = 47; +   const FATTR4_TIME_ACCESS_SET    = 48; +   const FATTR4_TIME_BACKUP        = 49; +   const FATTR4_TIME_CREATE        = 50; +   const FATTR4_TIME_DELTA         = 51; +   const FATTR4_TIME_METADATA      = 52; +   const FATTR4_TIME_MODIFY        = 53; +   const FATTR4_TIME_MODIFY_SET    = 54; +   const FATTR4_MOUNTED_ON_FILEID  = 55; + +   typedef opaque  attrlist4<>; + +   /* +    * File attribute container +    */ +   struct fattr4 { +           bitmap4         attrmask; +           attrlist4       attr_vals; + + + +Shepler, et al.             Standards Track                   [Page 243] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   }; + +   /* +    * Change info for the client +    */ +   struct change_info4 { +           bool            atomic; +           changeid4       before; +           changeid4       after; +   }; + +   struct clientaddr4 { +           /* see struct rpcb in RFC 1833 */ +           string r_netid<>;               /* network id */ +           string r_addr<>;                /* universal address */ +   }; + +   /* +    * Callback program info as provided by the client +    */ +   struct cb_client4 { +           uint32_t        cb_program; +           clientaddr4     cb_location; +   }; + +   /* +    * Stateid +    */ +   struct stateid4 { +           uint32_t        seqid; +           opaque          other[12]; +   }; + +   /* +    * Client ID +    */ +   struct nfs_client_id4 { +           verifier4       verifier; +           opaque          id<NFS4_OPAQUE_LIMIT>; +   }; + +   struct open_owner4 { +           clientid4       clientid; +           opaque          owner<NFS4_OPAQUE_LIMIT>; +   }; + +   struct lock_owner4 { +           clientid4       clientid; + + + +Shepler, et al.             
Standards Track                   [Page 244] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           opaque          owner<NFS4_OPAQUE_LIMIT>; +   }; + +   enum nfs_lock_type4 { +           READ_LT         = 1, +           WRITE_LT        = 2, +           READW_LT        = 3,    /* blocking read */ +           WRITEW_LT       = 4     /* blocking write */ +   }; + +   /* +    * ACCESS: Check access permission +    */ +   const ACCESS4_READ      = 0x00000001; +   const ACCESS4_LOOKUP    = 0x00000002; +   const ACCESS4_MODIFY    = 0x00000004; +   const ACCESS4_EXTEND    = 0x00000008; +   const ACCESS4_DELETE    = 0x00000010; +   const ACCESS4_EXECUTE   = 0x00000020; + +   struct ACCESS4args { +           /* CURRENT_FH: object */ +           uint32_t        access; +   }; + +   struct ACCESS4resok { +           uint32_t        supported; +           uint32_t        access; +   }; + +   union ACCESS4res switch (nfsstat4 status) { +    case NFS4_OK: +            ACCESS4resok   resok4; +    default: +            void; +   }; + +   /* +    * CLOSE: Close a file and release share reservations +    */ +   struct CLOSE4args { +           /* CURRENT_FH: object */ +           seqid4          seqid; +           stateid4        open_stateid; +   }; + +   union CLOSE4res switch (nfsstat4 status) { +    case NFS4_OK: + + + +Shepler, et al.             Standards Track                   [Page 245] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +            stateid4       open_stateid; +    default: +            void; +   }; + +   /* +    * COMMIT: Commit cached data on server to stable storage +    */ +   struct COMMIT4args { +           /* CURRENT_FH: file */ +           offset4         offset; +           count4          count; +   }; + +   struct COMMIT4resok { +           verifier4       writeverf; +   }; + + +   union COMMIT4res switch (nfsstat4 status) { +    case NFS4_OK: +            COMMIT4resok   resok4; +    default: +            void; +   }; + +   /* +    * CREATE: Create a non-regular file +    */ +   union createtype4 switch (nfs_ftype4 type) { +    case NF4LNK: +            linktext4      linkdata; +    case NF4BLK: +    case NF4CHR: +            specdata4      devdata; +    case NF4SOCK: +    case NF4FIFO: +    case NF4DIR: +            void; +    default: +            void;          /* server should return NFS4ERR_BADTYPE */ +   }; + +   struct CREATE4args { +           /* CURRENT_FH: directory for creation */ +           createtype4     objtype; +           component4      objname; +           fattr4          createattrs; + + + +Shepler, et al.             
Standards Track                   [Page 246] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   }; + +   struct CREATE4resok { +           change_info4    cinfo; +           bitmap4         attrset;        /* attributes set */ +   }; + +   union CREATE4res switch (nfsstat4 status) { +    case NFS4_OK: +            CREATE4resok resok4; +    default: +            void; +   }; + +   /* +    * DELEGPURGE: Purge Delegations Awaiting Recovery +    */ +   struct DELEGPURGE4args { +           clientid4       clientid; +   }; + +   struct DELEGPURGE4res { +           nfsstat4        status; +   }; + +   /* +    * DELEGRETURN: Return a delegation +    */ +   struct DELEGRETURN4args { +           /* CURRENT_FH: delegated file */ +           stateid4        deleg_stateid; +   }; + +   struct DELEGRETURN4res { +           nfsstat4        status; +   }; + +   /* +    * GETATTR: Get file attributes +    */ +   struct GETATTR4args { +           /* CURRENT_FH: directory or file */ +           bitmap4         attr_request; +   }; + +   struct GETATTR4resok { +           fattr4          obj_attributes; +   }; + + + +Shepler, et al.             Standards Track                   [Page 247] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   union GETATTR4res switch (nfsstat4 status) { +    case NFS4_OK: +            GETATTR4resok  resok4; +    default: +            void; +   }; + +   /* +    * GETFH: Get current filehandle +    */ +   struct GETFH4resok { +           nfs_fh4         object; +   }; + +   union GETFH4res switch (nfsstat4 status) { +    case NFS4_OK: +           GETFH4resok     resok4; +    default: +           void; +   }; + +   /* +    * LINK: Create link to an object +    */ +   struct LINK4args { +           /* SAVED_FH: source object */ +           /* CURRENT_FH: target directory */ +           component4      newname; +   }; + +   struct LINK4resok { +           change_info4    cinfo; +   }; + +   union LINK4res switch (nfsstat4 status) { +    case NFS4_OK: +            LINK4resok resok4; +    default: +            void; +   }; + +   /* +    * For LOCK, transition from open_owner to new lock_owner +    */ +   struct open_to_lock_owner4 { +           seqid4          open_seqid; +           stateid4        open_stateid; +           seqid4          lock_seqid; + + + +Shepler, et al.             
Standards Track                   [Page 248] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           lock_owner4     lock_owner; +   }; + +   /* +    * For LOCK, existing lock_owner continues to request file locks +    */ +   struct exist_lock_owner4 { +           stateid4        lock_stateid; +           seqid4          lock_seqid; +   }; + +   union locker4 switch (bool new_lock_owner) { +    case TRUE: +           open_to_lock_owner4     open_owner; +    case FALSE: +           exist_lock_owner4       lock_owner; +   }; + +   /* +    * LOCK/LOCKT/LOCKU: Record lock management +    */ +   struct LOCK4args { +           /* CURRENT_FH: file */ +           nfs_lock_type4  locktype; +           bool            reclaim; +           offset4         offset; +           length4         length; +           locker4         locker; +   }; + +   struct LOCK4denied { +           offset4         offset; +           length4         length; +           nfs_lock_type4  locktype; +           lock_owner4     owner; +   }; + +   struct LOCK4resok { +           stateid4        lock_stateid; +   }; + +   union LOCK4res switch (nfsstat4 status) { +    case NFS4_OK: +            LOCK4resok     resok4; +    case NFS4ERR_DENIED: +            LOCK4denied    denied; +    default: +            void; + + + +Shepler, et al.             Standards Track                   [Page 249] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   }; + +   struct LOCKT4args { +           /* CURRENT_FH: file */ +           nfs_lock_type4  locktype; +           offset4         offset; +           length4         length; +           lock_owner4     owner; +   }; + +   union LOCKT4res switch (nfsstat4 status) { +    case NFS4ERR_DENIED: +            LOCK4denied    denied; +    case NFS4_OK: +            void; +    default: +            void; +   }; + +   struct LOCKU4args { +           /* CURRENT_FH: file */ +           nfs_lock_type4  locktype; +           seqid4          seqid; +           stateid4        lock_stateid; +           offset4         offset; +           length4         length; +   }; + +   union LOCKU4res switch (nfsstat4 status) { +    case   NFS4_OK: +            stateid4       lock_stateid; +    default: +            void; +   }; + +   /* +    * LOOKUP: Lookup filename +    */ +   struct LOOKUP4args { +           /* CURRENT_FH: directory */ +           component4      objname; +   }; + +   struct LOOKUP4res { +           /* CURRENT_FH: object */ +           nfsstat4        status; +   }; + + + + +Shepler, et al.             
Standards Track                   [Page 250] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   /* +    * LOOKUPP: Lookup parent directory +    */ +   struct LOOKUPP4res { +           /* CURRENT_FH: directory */ +           nfsstat4        status; +   }; + +   /* +    * NVERIFY: Verify attributes different +    */ +   struct NVERIFY4args { +           /* CURRENT_FH: object */ +           fattr4          obj_attributes; +   }; + +   struct NVERIFY4res { +           nfsstat4        status; +   }; + +   /* +    * Various definitions for OPEN +    */ +   enum createmode4 { +           UNCHECKED4      = 0, +           GUARDED4        = 1, +           EXCLUSIVE4      = 2 +   }; + +   union createhow4 switch (createmode4 mode) { +    case UNCHECKED4: +    case GUARDED4: +            fattr4         createattrs; +    case EXCLUSIVE4: +            verifier4      createverf; +   }; + +   enum opentype4 { +           OPEN4_NOCREATE  = 0, +           OPEN4_CREATE    = 1 +   }; + +   union openflag4 switch (opentype4 opentype) { +    case OPEN4_CREATE: +            createhow4     how; +    default: +            void; +   }; + + + +Shepler, et al.             Standards Track                   [Page 251] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   /* Next definitions used for OPEN delegation */ +   enum limit_by4 { +           NFS_LIMIT_SIZE          = 1, +           NFS_LIMIT_BLOCKS        = 2 +           /* others as needed */ +   }; + +   struct nfs_modified_limit4 { +           uint32_t        num_blocks; +           uint32_t        bytes_per_block; +   }; + +   union nfs_space_limit4 switch (limit_by4 limitby) { +    /* limit specified as file size */ +    case NFS_LIMIT_SIZE: +            uint64_t               filesize; +    /* limit specified by number of blocks */ +    case NFS_LIMIT_BLOCKS: +            nfs_modified_limit4    mod_blocks; +   } ; + +   /* +    * Share Access and Deny constants for open argument +    */ +   const OPEN4_SHARE_ACCESS_READ   = 0x00000001; +   const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002; +   const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003; + +   const OPEN4_SHARE_DENY_NONE     = 0x00000000; +   const OPEN4_SHARE_DENY_READ     = 0x00000001; +   const OPEN4_SHARE_DENY_WRITE    = 0x00000002; +   const OPEN4_SHARE_DENY_BOTH     = 0x00000003; + +   enum open_delegation_type4 { +           OPEN_DELEGATE_NONE      = 0, +           OPEN_DELEGATE_READ      = 1, +           OPEN_DELEGATE_WRITE     = 2 +   }; + +   enum open_claim_type4 { +           CLAIM_NULL              = 0, +           CLAIM_PREVIOUS          = 1, +           CLAIM_DELEGATE_CUR      = 2, +           CLAIM_DELEGATE_PREV     = 3 +   }; + +   struct open_claim_delegate_cur4 { +           stateid4        delegate_stateid; + + + +Shepler, et al.             Standards Track                   [Page 252] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           component4      file; +   }; + +   union open_claim4 switch (open_claim_type4 claim) { +    /* +     * No special rights to file. Ordinary OPEN of the specified file. +     */ +    case CLAIM_NULL: +           /* CURRENT_FH: directory */ +           component4      file; + +    /* +     * Right to the file established by an open previous to server +     * reboot.  File identified by filehandle obtained at that time +     * rather than by name. 
+     */ +    case CLAIM_PREVIOUS: +           /* CURRENT_FH: file being reclaimed */ +           open_delegation_type4   delegate_type; + +    /* +     * Right to file based on a delegation granted by the server. +     * File is specified by name. +     */ +    case CLAIM_DELEGATE_CUR: +           /* CURRENT_FH: directory */ +           open_claim_delegate_cur4        delegate_cur_info; + +    /* Right to file based on a delegation granted to a previous boot +     * instance of the client.  File is specified by name. +     */ +    case CLAIM_DELEGATE_PREV: +            /* CURRENT_FH: directory */ +           component4      file_delegate_prev; +   }; + +   /* +    * OPEN: Open a file, potentially receiving an open delegation +    */ +   struct OPEN4args { +           seqid4          seqid; +           uint32_t        share_access; +           uint32_t        share_deny; +           open_owner4     owner; +           openflag4       openhow; +           open_claim4     claim; +   }; + + + + +Shepler, et al.             Standards Track                   [Page 253] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   struct open_read_delegation4 { +           stateid4        stateid;        /* Stateid for delegation*/ +           bool            recall;         /* Pre-recalled flag for +                                              delegations obtained +                                              by reclaim +                                              (CLAIM_PREVIOUS) */ +           nfsace4         permissions;    /* Defines users who don't +                                              need an ACCESS call to +                                              open for read */ +   }; + +   struct open_write_delegation4 { +           stateid4        stateid;        /* Stateid for delegation */ +           bool            recall;         /* Pre-recalled flag for +                                              delegations obtained +                                              by reclaim +                                              (CLAIM_PREVIOUS) */ +           nfs_space_limit4 space_limit;   /* Defines condition that +                                              the client must check to +                                              determine whether the +                                              file needs to be flushed +                                              to the server on close. +                                              */ +           nfsace4         permissions;    /* Defines users who don't +                                              need an ACCESS call as +                                              part of a delegated +                                              open. */ +   }; + +   union open_delegation4 +   switch (open_delegation_type4 delegation_type) { +           case OPEN_DELEGATE_NONE: +                   void; +           case OPEN_DELEGATE_READ: +                   open_read_delegation4 read; +           case OPEN_DELEGATE_WRITE: +                   open_write_delegation4 write; +   }; +   /* +    * Result flags +    */ +   /* Client must confirm open */ +   const OPEN4_RESULT_CONFIRM      = 0x00000002; +   /* Type of file locking behavior at the server */ +   const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004; + +   struct OPEN4resok { +           stateid4        stateid;        /* Stateid for open */ + + + +Shepler, et al.             
Standards Track                   [Page 254] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +           change_info4    cinfo;          /* Directory Change Info */ +           uint32_t        rflags;         /* Result flags */ +           bitmap4         attrset;        /* attribute set for create*/ +           open_delegation4 delegation;    /* Info on any open +                                              delegation */ +   }; + +   union OPEN4res switch (nfsstat4 status) { +    case NFS4_OK: +           /* CURRENT_FH: opened file */ +           OPEN4resok      resok4; +    default: +           void; +   }; + +   /* +    * OPENATTR: open named attributes directory +    */ +   struct OPENATTR4args { +           /* CURRENT_FH: object */ +           bool    createdir; +   }; + +   struct OPENATTR4res { +           /* CURRENT_FH: named attr directory */ +           nfsstat4        status; +   }; + +   /* +    * OPEN_CONFIRM: confirm the open +    */ +   struct OPEN_CONFIRM4args { +           /* CURRENT_FH: opened file */ +           stateid4        open_stateid; +           seqid4          seqid; +   }; + +   struct OPEN_CONFIRM4resok { +           stateid4        open_stateid; +   }; + +   union OPEN_CONFIRM4res switch (nfsstat4 status) { +       case NFS4_OK: +               OPEN_CONFIRM4resok     resok4; +    default: +            void; +   }; + + + + +Shepler, et al.             Standards Track                   [Page 255] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   /* +    * OPEN_DOWNGRADE: downgrade the access/deny for a file +    */ +   struct OPEN_DOWNGRADE4args { +           /* CURRENT_FH: opened file */ +           stateid4        open_stateid; +           seqid4          seqid; +           uint32_t        share_access; +           uint32_t        share_deny; +   }; + +   struct OPEN_DOWNGRADE4resok { +           stateid4        open_stateid; +   }; + +   union OPEN_DOWNGRADE4res switch(nfsstat4 status) { +    case NFS4_OK: +           OPEN_DOWNGRADE4resok    resok4; +    default: +            void; +   }; + +   /* +    * PUTFH: Set current filehandle +    */ +   struct PUTFH4args { +           nfs_fh4         object; +   }; + +   struct PUTFH4res { +           /* CURRENT_FH: */ +           nfsstat4        status; +   }; + +   /* +    * PUTPUBFH: Set public filehandle +    */ +   struct PUTPUBFH4res { +           /* CURRENT_FH: public fh */ +           nfsstat4        status; +   }; + +   /* +    * PUTROOTFH: Set root filehandle +    */ +   struct PUTROOTFH4res { + +           /* CURRENT_FH: root fh */ + + + +Shepler, et al.             
+
+   /*
+    * OPENATTR: open named attributes directory
+    */
+   struct OPENATTR4args {
+           /* CURRENT_FH: object */
+           bool    createdir;
+   };
+
+   struct OPENATTR4res {
+           /* CURRENT_FH: named attr directory */
+           nfsstat4        status;
+   };
+
+   /*
+    * OPEN_CONFIRM: confirm the open
+    */
+   struct OPEN_CONFIRM4args {
+           /* CURRENT_FH: opened file */
+           stateid4        open_stateid;
+           seqid4          seqid;
+   };
+
+   struct OPEN_CONFIRM4resok {
+           stateid4        open_stateid;
+   };
+
+   union OPEN_CONFIRM4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            OPEN_CONFIRM4resok     resok4;
+    default:
+            void;
+   };
+
+
+
+Shepler, et al.             Standards Track                   [Page 255]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   /*
+    * OPEN_DOWNGRADE: downgrade the access/deny for a file
+    */
+   struct OPEN_DOWNGRADE4args {
+           /* CURRENT_FH: opened file */
+           stateid4        open_stateid;
+           seqid4          seqid;
+           uint32_t        share_access;
+           uint32_t        share_deny;
+   };
+
+   struct OPEN_DOWNGRADE4resok {
+           stateid4        open_stateid;
+   };
+
+   union OPEN_DOWNGRADE4res switch(nfsstat4 status) {
+    case NFS4_OK:
+           OPEN_DOWNGRADE4resok    resok4;
+    default:
+            void;
+   };
+
+   /*
+    * PUTFH: Set current filehandle
+    */
+   struct PUTFH4args {
+           nfs_fh4         object;
+   };
+
+   struct PUTFH4res {
+           /* CURRENT_FH: */
+           nfsstat4        status;
+   };
+
+   /*
+    * PUTPUBFH: Set public filehandle
+    */
+   struct PUTPUBFH4res {
+           /* CURRENT_FH: public fh */
+           nfsstat4        status;
+   };
+
+   /*
+    * PUTROOTFH: Set root filehandle
+    */
+   struct PUTROOTFH4res {
+           /* CURRENT_FH: root fh */
+
+
+
+Shepler, et al.             Standards Track                   [Page 256]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           nfsstat4        status;
+   };
+
+   /*
+    * READ: Read from file
+    */
+   struct READ4args {
+           /* CURRENT_FH: file */
+           stateid4        stateid;
+           offset4         offset;
+           count4          count;
+   };
+
+   struct READ4resok {
+           bool            eof;
+           opaque          data<>;
+   };
+
+   union READ4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            READ4resok     resok4;
+    default:
+            void;
+   };
+
+   /*
+    * READDIR: Read directory
+    */
+   struct READDIR4args {
+           /* CURRENT_FH: directory */
+           nfs_cookie4     cookie;
+           verifier4       cookieverf;
+           count4          dircount;
+           count4          maxcount;
+           bitmap4         attr_request;
+   };
+
+   struct entry4 {
+           nfs_cookie4     cookie;
+           component4      name;
+           fattr4          attrs;
+           entry4          *nextentry;
+   };
+
+   struct dirlist4 {
+           entry4          *entries;
+           bool            eof;
+   };
+
+
+
+Shepler, et al.             Standards Track                   [Page 257]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   struct READDIR4resok {
+           verifier4       cookieverf;
+           dirlist4        reply;
+   };
+
+   union READDIR4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            READDIR4resok  resok4;
+    default:
+            void;
+   };
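+
+   A non-normative illustration: because dirlist4 carries its entries
+   as an XDR optional-data chain, a decoded READDIR reply arrives in C
+   as a linked list of entry4 nodes.  The sketch below drains one
+   reply, again assuming rpcgen-shaped bindings; the process_entry
+   callback is invented for the example.
+
+      /* Sketch: walk one decoded READDIR reply.  Returns the cookie
+         of the last entry seen, to be sent in the next READDIR4args;
+         the caller loops until reply.eof is TRUE. */
+      static nfs_cookie4
+      walk_dirlist(const dirlist4 *dl,
+                   void (*process_entry)(const component4 *,
+                                         const fattr4 *))
+      {
+              nfs_cookie4 last_cookie = 0;
+              const entry4 *e;
+
+              for (e = dl->entries; e != NULL; e = e->nextentry) {
+                      process_entry(&e->name, &e->attrs);
+                      last_cookie = e->cookie;
+              }
+              return last_cookie;
+      }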
+
+   /*
+    * READLINK: Read symbolic link
+    */
+   struct READLINK4resok {
+           linktext4       link;
+   };
+
+   union READLINK4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            READLINK4resok resok4;
+    default:
+            void;
+   };
+
+   /*
+    * REMOVE: Remove filesystem object
+    */
+   struct REMOVE4args {
+           /* CURRENT_FH: directory */
+           component4      target;
+   };
+
+   struct REMOVE4resok {
+           change_info4    cinfo;
+   };
+
+   union REMOVE4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            REMOVE4resok   resok4;
+    default:
+            void;
+   };
+
+   /*
+
+
+
+Shepler, et al.             Standards Track                   [Page 258]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+    * RENAME: Rename directory entry
+    */
+   struct RENAME4args {
+           /* SAVED_FH: source directory */
+           component4      oldname;
+           /* CURRENT_FH: target directory */
+           component4      newname;
+   };
+
+   struct RENAME4resok {
+           change_info4    source_cinfo;
+           change_info4    target_cinfo;
+   };
+
+   union RENAME4res switch (nfsstat4 status) {
+    case NFS4_OK:
+           RENAME4resok    resok4;
+    default:
+           void;
+   };
+
+   /*
+    * RENEW: Renew a Lease
+    */
+   struct RENEW4args {
+           clientid4       clientid;
+   };
+
+   struct RENEW4res {
+           nfsstat4        status;
+   };
+
+   /*
+    * RESTOREFH: Restore saved filehandle
+    */
+   struct RESTOREFH4res {
+           /* CURRENT_FH: value of saved fh */
+           nfsstat4        status;
+   };
+
+   /*
+    * SAVEFH: Save current filehandle
+    */
+   struct SAVEFH4res {
+           /* SAVED_FH: value of current fh */
+           nfsstat4        status;
+
+
+
+Shepler, et al.             Standards Track                   [Page 259]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   };
+
+   /*
+    * SECINFO: Obtain Available Security Mechanisms
+    */
+   struct SECINFO4args {
+           /* CURRENT_FH: directory */
+           component4      name;
+   };
+
+   /*
+    * From RFC 2203
+    */
+   enum rpc_gss_svc_t {
+           RPC_GSS_SVC_NONE        = 1,
+           RPC_GSS_SVC_INTEGRITY   = 2,
+           RPC_GSS_SVC_PRIVACY     = 3
+   };
+
+   struct rpcsec_gss_info {
+           sec_oid4        oid;
+           qop4            qop;
+           rpc_gss_svc_t   service;
+   };
+
+   /* RPCSEC_GSS has a value of '6' - See RFC 2203 */
+   union secinfo4 switch (uint32_t flavor) {
+    case RPCSEC_GSS:
+            rpcsec_gss_info        flavor_info;
+    default:
+            void;
+   };
+
+   typedef secinfo4 SECINFO4resok<>;
+
+   union SECINFO4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            SECINFO4resok resok4;
+    default:
+            void;
+   };
+
+   /*
+    * SETATTR: Set attributes
+    */
+   struct SETATTR4args {
+           /* CURRENT_FH: target object */
+
+
+
+Shepler, et al.             Standards Track                   [Page 260]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           stateid4        stateid;
+           fattr4          obj_attributes;
+   };
+
+   struct SETATTR4res {
+           nfsstat4        status;
+           bitmap4         attrsset;
+   };
+
+   /*
+    * SETCLIENTID
+    */
+   struct SETCLIENTID4args {
+           nfs_client_id4  client;
+           cb_client4      callback;
+           uint32_t        callback_ident;
+   };
+
+   struct SETCLIENTID4resok {
+           clientid4       clientid;
+           verifier4       setclientid_confirm;
+   };
+
+   union SETCLIENTID4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            SETCLIENTID4resok      resok4;
+    case NFS4ERR_CLID_INUSE:
+            clientaddr4    client_using;
+    default:
+            void;
+   };
+
+   struct SETCLIENTID_CONFIRM4args {
+           clientid4       clientid;
+           verifier4       setclientid_confirm;
+   };
+
+   struct SETCLIENTID_CONFIRM4res {
+           nfsstat4        status;
+   };
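+
+   A non-normative illustration of the client ID handshake: the
+   (clientid, setclientid_confirm) pair returned by SETCLIENTID is
+   unusable until it is echoed back in SETCLIENTID_CONFIRM, so the
+   only plumbing a client needs between the two calls is a copy.  The
+   sketch assumes rpcgen-shaped bindings, where verifier4 is a
+   fixed-size opaque array of NFS4_VERIFIER_SIZE octets.
+
+      #include <string.h>
+
+      /* Sketch: build SETCLIENTID_CONFIRM args from the SETCLIENTID
+         result.  The surrounding RPC machinery is elided. */
+      static void
+      make_confirm_args(const SETCLIENTID4resok *ok,
+                        SETCLIENTID_CONFIRM4args *confirm)
+      {
+              confirm->clientid = ok->clientid;
+              memcpy(confirm->setclientid_confirm,
+                     ok->setclientid_confirm, NFS4_VERIFIER_SIZE);
+      }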
+
+   /*
+    * VERIFY: Verify attributes same
+    */
+   struct VERIFY4args {
+           /* CURRENT_FH: object */
+           fattr4          obj_attributes;
+
+
+
+Shepler, et al.             Standards Track                   [Page 261]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   };
+
+   struct VERIFY4res {
+           nfsstat4        status;
+   };
+
+   /*
+    * WRITE: Write to file
+    */
+   enum stable_how4 {
+           UNSTABLE4       = 0,
+           DATA_SYNC4      = 1,
+           FILE_SYNC4      = 2
+   };
+
+   struct WRITE4args {
+           /* CURRENT_FH: file */
+           stateid4        stateid;
+           offset4         offset;
+           stable_how4     stable;
+           opaque          data<>;
+   };
+
+   struct WRITE4resok {
+           count4          count;
+           stable_how4     committed;
+           verifier4       writeverf;
+   };
+
+   union WRITE4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            WRITE4resok    resok4;
+    default:
+            void;
+   };
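+
+   A non-normative illustration of the stable/committed/writeverf
+   interplay: data the server acknowledged at UNSTABLE4 is durable
+   only once a later COMMIT succeeds under the same write verifier, so
+   the client must remember the verifier from WRITE4resok and compare
+   it after COMMIT.  In the sketch, flush_and_rewrite_dirty() is an
+   invented recovery helper.
+
+      #include <string.h>
+
+      /* Sketch: after COMMIT, decide whether cached UNSTABLE4 writes
+         survived.  verifier4 is a fixed-size opaque array in C. */
+      static void
+      check_commit(const verifier4 saved_verf,
+                   const verifier4 commit_verf)
+      {
+              if (memcmp(saved_verf, commit_verf,
+                         NFS4_VERIFIER_SIZE) != 0) {
+                      /* Server rebooted between WRITE and COMMIT:
+                         the unstable data may be lost; re-send it. */
+                      flush_and_rewrite_dirty();
+              }
+              /* Otherwise the data is on stable storage. */
+      }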
+
+   /*
+    * RELEASE_LOCKOWNER: Notify server to release lockowner
+    */
+   struct RELEASE_LOCKOWNER4args {
+           lock_owner4     lock_owner;
+   };
+
+   struct RELEASE_LOCKOWNER4res {
+           nfsstat4        status;
+   };
+
+   /*
+
+
+
+Shepler, et al.             Standards Track                   [Page 262]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+    * ILLEGAL: Response for illegal operation numbers
+    */
+   struct ILLEGAL4res {
+           nfsstat4        status;
+   };
+
+   /*
+    * Operation arrays
+    */
+
+   enum nfs_opnum4 {
+           OP_ACCESS               = 3,
+           OP_CLOSE                = 4,
+           OP_COMMIT               = 5,
+           OP_CREATE               = 6,
+           OP_DELEGPURGE           = 7,
+           OP_DELEGRETURN          = 8,
+           OP_GETATTR              = 9,
+           OP_GETFH                = 10,
+           OP_LINK                 = 11,
+           OP_LOCK                 = 12,
+           OP_LOCKT                = 13,
+           OP_LOCKU                = 14,
+           OP_LOOKUP               = 15,
+           OP_LOOKUPP              = 16,
+           OP_NVERIFY              = 17,
+           OP_OPEN                 = 18,
+           OP_OPENATTR             = 19,
+           OP_OPEN_CONFIRM         = 20,
+           OP_OPEN_DOWNGRADE       = 21,
+           OP_PUTFH                = 22,
+           OP_PUTPUBFH             = 23,
+           OP_PUTROOTFH            = 24,
+           OP_READ                 = 25,
+           OP_READDIR              = 26,
+           OP_READLINK             = 27,
+           OP_REMOVE               = 28,
+           OP_RENAME               = 29,
+           OP_RENEW                = 30,
+           OP_RESTOREFH            = 31,
+           OP_SAVEFH               = 32,
+           OP_SECINFO              = 33,
+           OP_SETATTR              = 34,
+           OP_SETCLIENTID          = 35,
+           OP_SETCLIENTID_CONFIRM  = 36,
+           OP_VERIFY               = 37,
+           OP_WRITE                = 38,
+           OP_RELEASE_LOCKOWNER    = 39,
+
+
+
+Shepler, et al.             Standards Track                   [Page 263]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           OP_ILLEGAL              = 10044
+   };
+
+   union nfs_argop4 switch (nfs_opnum4 argop) {
+    case OP_ACCESS:        ACCESS4args opaccess;
+    case OP_CLOSE:         CLOSE4args opclose;
+    case OP_COMMIT:        COMMIT4args opcommit;
+    case OP_CREATE:        CREATE4args opcreate;
+    case OP_DELEGPURGE:    DELEGPURGE4args opdelegpurge;
+    case OP_DELEGRETURN:   DELEGRETURN4args opdelegreturn;
+    case OP_GETATTR:       GETATTR4args opgetattr;
+    case OP_GETFH:         void;
+    case OP_LINK:          LINK4args oplink;
+    case OP_LOCK:          LOCK4args oplock;
+    case OP_LOCKT:         LOCKT4args oplockt;
+    case OP_LOCKU:         LOCKU4args oplocku;
+    case OP_LOOKUP:        LOOKUP4args oplookup;
+    case OP_LOOKUPP:       void;
+    case OP_NVERIFY:       NVERIFY4args opnverify;
+    case OP_OPEN:          OPEN4args opopen;
+    case OP_OPENATTR:      OPENATTR4args opopenattr;
+    case OP_OPEN_CONFIRM:  OPEN_CONFIRM4args opopen_confirm;
+    case OP_OPEN_DOWNGRADE:        OPEN_DOWNGRADE4args opopen_downgrade;
+    case OP_PUTFH:         PUTFH4args opputfh;
+    case OP_PUTPUBFH:      void;
+    case OP_PUTROOTFH:     void;
+    case OP_READ:          READ4args opread;
+    case OP_READDIR:       READDIR4args opreaddir;
+    case OP_READLINK:      void;
+    case OP_REMOVE:        REMOVE4args opremove;
+    case OP_RENAME:        RENAME4args oprename;
+    case OP_RENEW:         RENEW4args oprenew;
+    case OP_RESTOREFH:     void;
+    case OP_SAVEFH:        void;
+    case OP_SECINFO:       SECINFO4args opsecinfo;
+    case OP_SETATTR:       SETATTR4args opsetattr;
+    case OP_SETCLIENTID:   SETCLIENTID4args opsetclientid;
+    case OP_SETCLIENTID_CONFIRM:   SETCLIENTID_CONFIRM4args
+                                           opsetclientid_confirm;
+    case OP_VERIFY:        VERIFY4args opverify;
+    case OP_WRITE:         WRITE4args opwrite;
+    case OP_RELEASE_LOCKOWNER:     RELEASE_LOCKOWNER4args
+                                       oprelease_lockowner;
+    case OP_ILLEGAL:       void;
+   };
+
+   union nfs_resop4 switch (nfs_opnum4 resop){
+    case OP_ACCESS:        ACCESS4res opaccess;
+
+
+
+Shepler, et al.             Standards Track                   [Page 264]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+    case OP_CLOSE:         CLOSE4res opclose;
+    case OP_COMMIT:        COMMIT4res opcommit;
+    case OP_CREATE:        CREATE4res opcreate;
+    case OP_DELEGPURGE:    DELEGPURGE4res opdelegpurge;
+    case OP_DELEGRETURN:   DELEGRETURN4res opdelegreturn;
+    case OP_GETATTR:       GETATTR4res opgetattr;
+    case OP_GETFH:         GETFH4res opgetfh;
+    case OP_LINK:          LINK4res oplink;
+    case OP_LOCK:          LOCK4res oplock;
+    case OP_LOCKT:         LOCKT4res oplockt;
+    case OP_LOCKU:         LOCKU4res oplocku;
+    case OP_LOOKUP:        LOOKUP4res oplookup;
+    case OP_LOOKUPP:       LOOKUPP4res oplookupp;
+    case OP_NVERIFY:       NVERIFY4res opnverify;
+    case OP_OPEN:          OPEN4res opopen;
+    case OP_OPENATTR:      OPENATTR4res opopenattr;
+    case OP_OPEN_CONFIRM:  OPEN_CONFIRM4res opopen_confirm;
+    case OP_OPEN_DOWNGRADE:        OPEN_DOWNGRADE4res opopen_downgrade;
+    case OP_PUTFH:         PUTFH4res opputfh;
+    case OP_PUTPUBFH:      PUTPUBFH4res opputpubfh;
+    case OP_PUTROOTFH:     PUTROOTFH4res opputrootfh;
+    case OP_READ:          READ4res opread;
+    case OP_READDIR:       READDIR4res opreaddir;
+    case OP_READLINK:      READLINK4res opreadlink;
+    case OP_REMOVE:        REMOVE4res opremove;
+    case OP_RENAME:        RENAME4res oprename;
+    case OP_RENEW:         RENEW4res oprenew;
+    case OP_RESTOREFH:     RESTOREFH4res oprestorefh;
+    case OP_SAVEFH:        SAVEFH4res opsavefh;
+    case OP_SECINFO:       SECINFO4res opsecinfo;
+    case OP_SETATTR:       SETATTR4res opsetattr;
+    case OP_SETCLIENTID:   SETCLIENTID4res opsetclientid;
+    case OP_SETCLIENTID_CONFIRM:   SETCLIENTID_CONFIRM4res
+                                           opsetclientid_confirm;
+    case OP_VERIFY:        VERIFY4res opverify;
+    case OP_WRITE:         WRITE4res opwrite;
+    case OP_RELEASE_LOCKOWNER:     RELEASE_LOCKOWNER4res
+                                       oprelease_lockowner;
+    case OP_ILLEGAL:       ILLEGAL4res opillegal;
+   };
+
+   struct COMPOUND4args {
+           utf8str_cs      tag;
+           uint32_t        minorversion;
+           nfs_argop4      argarray<>;
+   };
+
+   struct COMPOUND4res {
+
+
+
+Shepler, et al.             Standards Track                   [Page 265]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           nfsstat4 status;
+           utf8str_cs      tag;
+           nfs_resop4      resarray<>;
+   };
+
+   /*
+    * Remote file service routines
+    */
+   program NFS4_PROGRAM {
+           version NFS_V4 {
+                   void
+                           NFSPROC4_NULL(void) = 0;
+
+                   COMPOUND4res
+                           NFSPROC4_COMPOUND(COMPOUND4args) = 1;
+
+           } = 4;
+   } = 100003;
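+
+   A non-normative illustration: since everything except NULL travels
+   through the single COMPOUND procedure, "sending a request" reduces
+   to filling an nfs_argop4 array.  The sketch assembles a minimal
+   PUTFH+READ compound, assuming rpcgen-shaped bindings; the function
+   and its ops[] parameter are invented for the example, and error
+   handling is elided.
+
+      #include <stddef.h>
+
+      /* Sketch: set the current filehandle, then read from it. */
+      static void
+      build_putfh_read(COMPOUND4args *c, nfs_argop4 ops[2],
+                       nfs_fh4 fh, stateid4 sid,
+                       offset4 off, count4 len)
+      {
+              ops[0].argop = OP_PUTFH;
+              ops[0].nfs_argop4_u.opputfh.object = fh;
+
+              ops[1].argop = OP_READ;
+              ops[1].nfs_argop4_u.opread.stateid = sid;
+              ops[1].nfs_argop4_u.opread.offset  = off;
+              ops[1].nfs_argop4_u.opread.count   = len;
+
+              c->tag.utf8str_cs_len = 0;      /* empty tag */
+              c->tag.utf8str_cs_val = NULL;
+              c->minorversion = 0;            /* this document */
+              c->argarray.argarray_len = 2;
+              c->argarray.argarray_val = ops;
+      }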
+
+
+   /*
+    * NFS4 Callback Procedure Definitions and Program
+    */
+
+   /*
+    * CB_GETATTR: Get Current Attributes
+    */
+   struct CB_GETATTR4args {
+           nfs_fh4 fh;
+           bitmap4 attr_request;
+   };
+
+   struct CB_GETATTR4resok {
+           fattr4  obj_attributes;
+   };
+
+   union CB_GETATTR4res switch (nfsstat4 status) {
+    case NFS4_OK:
+            CB_GETATTR4resok       resok4;
+    default:
+            void;
+   };
+
+   /*
+    * CB_RECALL: Recall an Open Delegation
+    */
+   struct CB_RECALL4args {
+
+
+
+Shepler, et al.             Standards Track                   [Page 266]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           stateid4        stateid;
+           bool            truncate;
+           nfs_fh4         fh;
+   };
+
+   struct CB_RECALL4res {
+           nfsstat4        status;
+   };
+
+   /*
+    * CB_ILLEGAL: Response for illegal operation numbers
+    */
+   struct CB_ILLEGAL4res {
+           nfsstat4        status;
+   };
+
+   /*
+    * Various definitions for CB_COMPOUND
+    */
+   enum nfs_cb_opnum4 {
+           OP_CB_GETATTR           = 3,
+           OP_CB_RECALL            = 4,
+           OP_CB_ILLEGAL           = 10044
+   };
+
+   union nfs_cb_argop4 switch (unsigned argop) {
+    case OP_CB_GETATTR:    CB_GETATTR4args opcbgetattr;
+    case OP_CB_RECALL:     CB_RECALL4args  opcbrecall;
+    case OP_CB_ILLEGAL:    void;
+   };
+
+   union nfs_cb_resop4 switch (unsigned resop){
+    case OP_CB_GETATTR:    CB_GETATTR4res  opcbgetattr;
+    case OP_CB_RECALL:     CB_RECALL4res   opcbrecall;
+    case OP_CB_ILLEGAL:    CB_ILLEGAL4res  opcbillegal;
+   };
+
+   struct CB_COMPOUND4args {
+           utf8str_cs      tag;
+           uint32_t        minorversion;
+           uint32_t        callback_ident;
+           nfs_cb_argop4   argarray<>;
+   };
+
+   struct CB_COMPOUND4res {
+           nfsstat4 status;
+           utf8str_cs      tag;
+           nfs_cb_resop4   resarray<>;
+
+
+
+Shepler, et al.             Standards Track                   [Page 267]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   };
+
+   /*
+    * Program number is in the transient range since the client
+    * will assign the exact transient program number and provide
+    * that to the server via the SETCLIENTID operation.
+    */
+   program NFS4_CALLBACK {
+           version NFS_CB {
+                   void
+                           CB_NULL(void) = 0;
+                   CB_COMPOUND4res
+                           CB_COMPOUND(CB_COMPOUND4args) = 1;
+           } = 1;
+   } = 0x40000000;
+
+19.  Acknowledgements
+
+   The authors thank and acknowledge:
+
+   Neil Brown for his extensive review of and comments on various
+   documents.  Rick Macklem at the University of Guelph; Mike Frisch,
+   Sergey Klyushin, and Dan Trufasiu of Hummingbird Ltd.; and Andy
+   Adamson, Bruce Fields, Jim Rees, and Kendrick Smith from the CITI
+   organization at the University of Michigan, for their implementation
+   efforts and feedback on the protocol specification.  Mike Kupfer for
+   his review of the file locking and ACL mechanisms.  Alan Yoder for
+   his input on the ACL mechanisms.  Peter Astrand for his close review
+   of the protocol specification.  Ran Atkinson for his constant
+   reminder that users do matter.
+
+20.  Normative References
+
+   [ISO10646]                "ISO/IEC 10646-1:1993. International
+                             Standard -- Information technology --
+                             Universal Multiple-Octet Coded Character
+                             Set (UCS) -- Part 1: Architecture and Basic
+                             Multilingual Plane."
+
+   [RFC793]                  Postel, J., "Transmission Control
+                             Protocol", STD 7, RFC 793, September 1981.
+
+   [RFC1831]                 Srinivasan, R., "RPC: Remote Procedure Call
+                             Protocol Specification Version 2", RFC
+                             1831, August 1995.
+
+
+
+
+Shepler, et al.
Standards Track                   [Page 268] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   [RFC1832]                 Srinivasan, R., "XDR: External Data +                             Representation Standard", RFC 1832, August +                             1995. + +   [RFC2373]                 Hinden, R. and S. Deering, "IP Version 6 +                             Addressing Architecture", RFC 2373, July +                             1998. + +   [RFC1964]                 Linn, J., "The Kerberos Version 5 GSS-API +                             Mechanism", RFC 1964, June 1996. + +   [RFC2025]                 Adams, C., "The Simple Public-Key GSS-API +                             Mechanism (SPKM)", RFC 2025, October 1996. + +   [RFC2119]                 Bradner, S., "Key words for use in RFCs to +                             Indicate Requirement Levels", BCP 14, RFC +                             2119, March 1997. + +   [RFC2203]                 Eisler, M., Chiu, A. and L. Ling, +                             "RPCSEC_GSS Protocol Specification", RFC +                             2203, September 1997. + +   [RFC2277]                 Alvestrand, H., "IETF Policy on Character +                             Sets and Languages", BCP 19, RFC 2277, +                             January 1998. + +   [RFC2279]                 Yergeau, F., "UTF-8, a transformation +                             format of ISO 10646", RFC 2279, January +                             1998. + +   [RFC2623]                 Eisler, M., "NFS Version 2 and Version 3 +                             Security Issues and the NFS Protocol's Use +                             of RPCSEC_GSS and Kerberos V5", RFC 2623, +                             June 1999. + +   [RFC2743]                 Linn, J., "Generic Security Service +                             Application Program Interface, Version 2, +                             Update 1", RFC 2743, January 2000. + +   [RFC2847]                 Eisler, M., "LIPKEY - A Low Infrastructure +                             Public Key Mechanism Using SPKM", RFC 2847, +                             June 2000. + +   [RFC3010]                 Shepler, S., Callaghan, B., Robinson, D., +                             Thurlow, R., Beame, C., Eisler, M. and D. +                             Noveck, "NFS version 4 Protocol", RFC 3010, +                             December 2000. + + + + +Shepler, et al.             Standards Track                   [Page 269] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   [RFC3454]                 Hoffman, P. and P. Blanchet, "Preparation +                             of Internationalized Strings +                             ("stringprep")", RFC 3454, December 2002. + +   [Unicode1]                The Unicode Consortium, "The Unicode +                             Standard, Version 3.0", Addison-Wesley +                             Developers Press, Reading, MA, 2000. ISBN +                             0-201-61633-5. + +                             More information available at: +                             http://www.unicode.org/ + +   [Unicode2]                "Unsupported Scripts" Unicode, Inc., The +                             Unicode Consortium, P.O. Box 700519, San +                             Jose, CA 95710-0519 USA, September 1999. +                             http://www.unicode.org/unicode/standard/ +                             unsupported.html + +21.  
Informative References + +   [Floyd]                   S. Floyd, V. Jacobson, "The Synchronization +                             of Periodic Routing Messages," IEEE/ACM +                             Transactions on Networking, 2(2), pp. 122- +                             136, April 1994. + +   [Gray]                    C. Gray, D. Cheriton, "Leases: An Efficient +                             Fault-Tolerant Mechanism for Distributed +                             File Cache Consistency," Proceedings of the +                             Twelfth Symposium on Operating Systems +                             Principles, p. 202-210, December 1989. + +   [Juszczak]                Juszczak, Chet, "Improving the Performance +                             and Correctness of an NFS Server," USENIX +                             Conference Proceedings, USENIX Association, +                             Berkeley, CA, June 1990, pages 53-63. +                             Describes reply cache implementation that +                             avoids work in the server by handling +                             duplicate requests. More important, though +                             listed as a side-effect, the reply cache +                             aids in the avoidance of destructive non- +                             idempotent operation re-application -- +                             improving correctness. + + + + + + + + + +Shepler, et al.             Standards Track                   [Page 270] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +   [Kazar]                   Kazar, Michael Leon, "Synchronization and +                             Caching Issues in the Andrew File System," +                             USENIX Conference Proceedings, USENIX +                             Association, Berkeley, CA, Dallas Winter +                             1988, pages 27-36.  A description of the +                             cache consistency scheme in AFS. +                             Contrasted with other distributed file +                             systems. + +   [Macklem]                 Macklem, Rick, "Lessons Learned Tuning the +                             4.3BSD Reno Implementation of the NFS +                             Protocol," Winter USENIX Conference +                             Proceedings, USENIX Association, Berkeley, +                             CA, January 1991.  Describes performance +                             work in tuning the 4.3BSD Reno NFS +                             implementation. Describes performance +                             improvement (reduced CPU loading) through +                             elimination of data copies. + +   [Mogul]                   Mogul, Jeffrey C., "A Recovery Protocol for +                             Spritely NFS," USENIX File System Workshop +                             Proceedings, Ann Arbor, MI, USENIX +                             Association, Berkeley, CA, May 1992. +                             Second paper on Spritely NFS proposes a +                             lease-based scheme for recovering state of +                             consistency protocol. + +   [Nowicki]                 Nowicki, Bill, "Transport Issues in the +                             Network File System," ACM SIGCOMM +                             newsletter Computer Communication Review, +                             April 1989.  
A brief description of the
+                             basis for the dynamic retransmission work.
+
+   [Pawlowski]               Pawlowski, Brian, Ron Hixon, Mark Stein,
+                             Joseph Tumminaro, "Network Computing in the
+                             UNIX and IBM Mainframe Environment,"
+                             Uniforum `89 Conf. Proc., (1989).
+                             Description of an NFS server implementation
+                             for IBM's MVS operating system.
+
+   [RFC1094]                 Sun Microsystems, Inc., "NFS: Network File
+                             System Protocol Specification", RFC 1094,
+                             March 1989.
+
+   [RFC1345]                 Simonsen, K., "Character Mnemonics &
+                             Character Sets", RFC 1345, June 1992.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 271]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   [RFC1813]                 Callaghan, B., Pawlowski, B. and P.
+                             Staubach, "NFS Version 3 Protocol
+                             Specification", RFC 1813, June 1995.
+
+   [RFC3232]                 Reynolds, J., Editor, "Assigned Numbers:
+                             RFC 1700 is Replaced by an On-line
+                             Database", RFC 3232, January 2002.
+
+   [RFC1833]                 Srinivasan, R., "Binding Protocols for ONC
+                             RPC Version 2", RFC 1833, August 1995.
+
+   [RFC2054]                 Callaghan, B., "WebNFS Client
+                             Specification", RFC 2054, October 1996.
+
+   [RFC2055]                 Callaghan, B., "WebNFS Server
+                             Specification", RFC 2055, October 1996.
+
+   [RFC2152]                 Goldsmith, D. and M. Davis, "UTF-7 A Mail-
+                             Safe Transformation Format of Unicode", RFC
+                             2152, May 1997.
+
+   [RFC2224]                 Callaghan, B., "NFS URL Scheme", RFC 2224,
+                             October 1997.
+
+   [RFC2624]                 Shepler, S., "NFS Version 4 Design
+                             Considerations", RFC 2624, June 1999.
+
+   [RFC2755]                 Chiu, A., Eisler, M. and B. Callaghan,
+                             "Security Negotiation for WebNFS", RFC
+                             2755, June 2000.
+
+   [Sandberg]                Sandberg, R., D. Goldberg, S. Kleiman, D.
+                             Walsh, B. Lyon, "Design and Implementation
+                             of the Sun Network Filesystem," USENIX
+                             Conference Proceedings, USENIX Association,
+                             Berkeley, CA, Summer 1985.  The basic paper
+                             describing the SunOS implementation of the
+                             NFS version 2 protocol; it discusses the
+                             goals, protocol specification, and trade-
+                             offs.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 272]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
Mogul, "Spritely +                             NFS: Implementation and Performance of +                             Cache Consistency Protocols", WRL Research +                             Report 89/5, Digital Equipment Corporation +                             Western Research Laboratory, 100 Hamilton +                             Ave., Palo Alto, CA, 94301, May 1989.  This +                             paper analyzes the effect of applying a +                             Sprite-like consistency protocol applied to +                             standard NFS. The issues of recovery in a +                             stateful environment are covered in +                             [Mogul]. + +   [XNFS]                    The Open Group, Protocols for Interworking: +                             XNFS, Version 3W, The Open Group, 1010 El +                             Camino Real Suite 380, Menlo Park, CA +                             94025, ISBN 1-85912-184-5, February 1998. + +                             HTML version available: +                             http://www.opengroup.org + +22.  Authors' Information + +22.1.  Editor's Address + +   Spencer Shepler +   Sun Microsystems, Inc. +   7808 Moonflower Drive +   Austin, Texas  78750 + +   Phone: +1 512-349-9376 +   EMail: spencer.shepler@sun.com + + + + + + + + + + + + + + + + + + + + +Shepler, et al.             Standards Track                   [Page 273] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +22.2.  Authors' Addresses + +   Carl Beame +   Hummingbird Ltd. + +   EMail: beame@bws.com + +   Brent Callaghan +   Sun Microsystems, Inc. +   17 Network Circle +   Menlo Park, CA  94025 + +   Phone: +1 650-786-5067 +   EMail: brent.callaghan@sun.com + +   Mike Eisler +   5765 Chase Point Circle +   Colorado Springs, CO  80919 + +   Phone: +1 719-599-9026 +   EMail: mike@eisler.com + +   David Noveck +   Network Appliance +   375 Totten Pond Road +   Waltham, MA  02451 + +   Phone: +1 781-768-5347 +   EMail: dnoveck@netapp.com + +   David Robinson +   Sun Microsystems, Inc. +   5300 Riata Park Court +   Austin, TX  78727 + +   Phone: +1 650-786-5088 +   EMail: david.robinson@sun.com + +   Robert Thurlow +   Sun Microsystems, Inc. +   500 Eldorado Blvd. +   Broomfield, CO  80021 + +   Phone: +1 650-786-5096 +   EMail: robert.thurlow@sun.com + + + + + + +Shepler, et al.             Standards Track                   [Page 274] + +RFC 3530                 NFS version 4 Protocol               April 2003 + + +23.  Full Copyright Statement + +   Copyright (C) The Internet Society (2003).  All Rights Reserved. + +   This document and translations of it may be copied and furnished to +   others, and derivative works that comment on or otherwise explain it +   or assist in its implementation may be prepared, copied, published +   and distributed, in whole or in part, without restriction of any +   kind, provided that the above copyright notice and this paragraph are +   included on all such copies and derivative works.  However, this +   document itself may not be modified in any way, such as by removing +   the copyright notice or references to the Internet Society or other +   Internet organizations, except as needed for the purpose of +   developing Internet standards in which case the procedures for +   copyrights defined in the Internet Standards process must be +   followed, or as required to translate it into languages other than +   English. 
+ +   The limited permissions granted above are perpetual and will not be +   revoked by the Internet Society or its successors or assigns. + +   This document and the information contained herein is provided on an +   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING +   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING +   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION +   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF +   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + +   Funding for the RFC Editor function is currently provided by the +   Internet Society. + + + + + + + + + + + + + + + + + + + +Shepler, et al.             Standards Track                   [Page 275] + |