From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001
From: Thomas Voss
Date: Wed, 27 Nov 2024 20:54:24 +0100
Subject: doc: Add RFC documents

---
 doc/rfc/rfc3530.txt | 15403 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 15403 insertions(+)
 create mode 100644 doc/rfc/rfc3530.txt

diff --git a/doc/rfc/rfc3530.txt b/doc/rfc/rfc3530.txt
new file mode 100644
index 0000000..93422d3
--- /dev/null
+++ b/doc/rfc/rfc3530.txt
@@ -0,0 +1,15403 @@
+
+
+
+
+
+
+Network Working Group                                         S. Shepler
+Request for Comments: 3530                                   B. Callaghan
+Obsoletes: 3010                                               D. Robinson
+Category: Standards Track                                      R. Thurlow
+                                                   Sun Microsystems, Inc.
+                                                                 C. Beame
+                                                          Hummingbird Ltd.
+                                                                M. Eisler
+                                                                D. Noveck
+                                                   Network Appliance, Inc.
+                                                               April 2003
+
+
+              Network File System (NFS) version 4 Protocol
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (2003).  All Rights Reserved.
+
+Abstract
+
+   The Network File System (NFS) version 4 is a distributed filesystem
+   protocol which owes heritage to NFS protocol version 2, RFC 1094, and
+   version 3, RFC 1813.  Unlike earlier versions, the NFS version 4
+   protocol supports traditional file access while integrating support
+   for file locking and the mount protocol.  In addition, support for
+   strong security (and its negotiation), compound operations, client
+   caching, and internationalization have been added.  Of course,
+   attention has been applied to making NFS version 4 operate well in an
+   Internet environment.
+
+   This document replaces RFC 3010 as the definition of the NFS version
+   4 protocol.
+
+Key Words
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in [RFC2119].
+
+
+
+
+Shepler, et al.             Standards Track                     [Page 1]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+Table of Contents
+
+   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . .    8
+       1.1.  Changes since RFC 3010 . . . . . . . . . . . . . . .    8
+       1.2.  NFS version 4 Goals. . . . . . . . . . . . . . . . .    9
+       1.3.  Inconsistencies of this Document with Section 18 . .    9
+       1.4.  Overview of NFS version 4 Features . . . . . . . . .   10
+             1.4.1.  RPC and Security . . . . . . . . . . . . . .   10
+             1.4.2.  Procedure and Operation Structure. . . . . .   10
+             1.4.3.  Filesystem Model . . . . . . . . . . . . . .   11
+                     1.4.3.1.  Filehandle Types . . . . . . . . .   11
+                     1.4.3.2.  Attribute Types. . . . . . . . . .   12
+                     1.4.3.3.  Filesystem Replication and
+                               Migration. . . . . . . . . . . . .   13
+             1.4.4.  OPEN and CLOSE . . . . . . . . . . . . . . .   13
+             1.4.5.  File locking . . . . . . . . . . . . . . . .   13
+             1.4.6.  Client Caching and Delegation. . . . . . . .   13
+       1.5.  General Definitions. . . . . . . . . . . . . . . . .   14
+   2.  Protocol Data Types. . . . . . . . . . . . . . . . . . . .   16
+       2.1.  Basic Data Types . . . . . . . . . . . . . . . . . .   16
+       2.2.  Structured Data Types. . . . . . . . . . . . . . . .   18
+   3.  RPC and Security Flavor. . . . . . . . . . . . . . . . . .   23
+       3.1.  Ports and Transports . . . . . . . . . . . . . . . .   23
+             3.1.1. 
Client Retransmission Behavior . . . . . . . 24 + 3.2. Security Flavors . . . . . . . . . . . . . . . . . . 25 + 3.2.1. Security mechanisms for NFS version 4. . . . 25 + 3.2.1.1. Kerberos V5 as a security triple . 25 + 3.2.1.2. LIPKEY as a security triple. . . . 26 + 3.2.1.3. SPKM-3 as a security triple. . . . 27 + 3.3. Security Negotiation . . . . . . . . . . . . . . . . 27 + 3.3.1. SECINFO. . . . . . . . . . . . . . . . . . . 28 + 3.3.2. Security Error . . . . . . . . . . . . . . . 28 + 3.4. Callback RPC Authentication. . . . . . . . . . . . . 28 + 4. Filehandles . . . . . . . . . . . . . . . . . . . . . . . . 30 + 4.1. Obtaining the First Filehandle . . . . . . . . . . . 30 + 4.1.1. Root Filehandle. . . . . . . . . . . . . . . 31 + 4.1.2. Public Filehandle. . . . . . . . . . . . . . 31 + 4.2. Filehandle Types . . . . . . . . . . . . . . . . . . 31 + 4.2.1. General Properties of a Filehandle . . . . . 32 + 4.2.2. Persistent Filehandle. . . . . . . . . . . . 32 + 4.2.3. Volatile Filehandle. . . . . . . . . . . . . 33 + 4.2.4. One Method of Constructing a + Volatile Filehandle. . . . . . . . . . . . . 34 + 4.3. Client Recovery from Filehandle Expiration . . . . . 35 + 5. File Attributes. . . . . . . . . . . . . . . . . . . . . . 35 + 5.1. Mandatory Attributes . . . . . . . . . . . . . . . . 37 + 5.2. Recommended Attributes . . . . . . . . . . . . . . . 37 + 5.3. Named Attributes . . . . . . . . . . . . . . . . . . 37 + + + +Shepler, et al. Standards Track [Page 2] + +RFC 3530 NFS version 4 Protocol April 2003 + + + 5.4. Classification of Attributes . . . . . . . . . . . . 38 + 5.5. Mandatory Attributes - Definitions . . . . . . . . . 39 + 5.6. Recommended Attributes - Definitions . . . . . . . . 41 + 5.7. Time Access. . . . . . . . . . . . . . . . . . . . . 46 + 5.8. Interpreting owner and owner_group . . . . . . . . . 47 + 5.9. Character Case Attributes. . . . . . . . . . . . . . 49 + 5.10. Quota Attributes . . . . . . . . . . . . . . . . . . 49 + 5.11. Access Control Lists . . . . . . . . . . . . . . . . 50 + 5.11.1. ACE type . . . . . . . . . . . . . . . . . 51 + 5.11.2. ACE Access Mask. . . . . . . . . . . . . . 52 + 5.11.3. ACE flag . . . . . . . . . . . . . . . . . 54 + 5.11.4. ACE who . . . . . . . . . . . . . . . . . 55 + 5.11.5. Mode Attribute . . . . . . . . . . . . . . 56 + 5.11.6. Mode and ACL Attribute . . . . . . . . . . 57 + 5.11.7. mounted_on_fileid. . . . . . . . . . . . . 57 + 6. Filesystem Migration and Replication . . . . . . . . . . . 58 + 6.1. Replication. . . . . . . . . . . . . . . . . . . . . 58 + 6.2. Migration. . . . . . . . . . . . . . . . . . . . . . 59 + 6.3. Interpretation of the fs_locations Attribute . . . . 60 + 6.4. Filehandle Recovery for Migration or Replication . . 61 + 7. NFS Server Name Space . . . . . . . . . . . . . . . . . . . 61 + 7.1. Server Exports . . . . . . . . . . . . . . . . . . . 61 + 7.2. Browsing Exports . . . . . . . . . . . . . . . . . . 62 + 7.3. Server Pseudo Filesystem . . . . . . . . . . . . . . 62 + 7.4. Multiple Roots . . . . . . . . . . . . . . . . . . . 63 + 7.5. Filehandle Volatility. . . . . . . . . . . . . . . . 63 + 7.6. Exported Root. . . . . . . . . . . . . . . . . . . . 63 + 7.7. Mount Point Crossing . . . . . . . . . . . . . . . . 63 + 7.8. Security Policy and Name Space Presentation. . . . . 64 + 8. File Locking and Share Reservations. . . . . . . . . . . . 65 + 8.1. Locking. . . . . . . . . . . . . . . . . . . . . . . 65 + 8.1.1. Client ID. . . . . . . . . . . . . . . . . 66 + 8.1.2. Server Release of Clientid . . . 
. . . . . 69 + 8.1.3. lock_owner and stateid Definition. . . . . 69 + 8.1.4. Use of the stateid and Locking . . . . . . 71 + 8.1.5. Sequencing of Lock Requests. . . . . . . . 73 + 8.1.6. Recovery from Replayed Requests. . . . . . 74 + 8.1.7. Releasing lock_owner State . . . . . . . . 74 + 8.1.8. Use of Open Confirmation . . . . . . . . . 75 + 8.2. Lock Ranges. . . . . . . . . . . . . . . . . . . . . 76 + 8.3. Upgrading and Downgrading Locks. . . . . . . . . . . 76 + 8.4. Blocking Locks . . . . . . . . . . . . . . . . . . . 77 + 8.5. Lease Renewal. . . . . . . . . . . . . . . . . . . . 77 + 8.6. Crash Recovery . . . . . . . . . . . . . . . . . . . 78 + 8.6.1. Client Failure and Recovery. . . . . . . . 79 + 8.6.2. Server Failure and Recovery. . . . . . . . 79 + 8.6.3. Network Partitions and Recovery. . . . . . 81 + 8.7. Recovery from a Lock Request Timeout or Abort . . . 85 + + + +Shepler, et al. Standards Track [Page 3] + +RFC 3530 NFS version 4 Protocol April 2003 + + + 8.8. Server Revocation of Locks. . . . . . . . . . . . . 85 + 8.9. Share Reservations. . . . . . . . . . . . . . . . . 86 + 8.10. OPEN/CLOSE Operations . . . . . . . . . . . . . . . 87 + 8.10.1. Close and Retention of State + Information. . . . . . . . . . . . . . . . 88 + 8.11. Open Upgrade and Downgrade. . . . . . . . . . . . . 88 + 8.12. Short and Long Leases . . . . . . . . . . . . . . . 89 + 8.13. Clocks, Propagation Delay, and Calculating Lease + Expiration. . . . . . . . . . . . . . . . . . . . . 89 + 8.14. Migration, Replication and State. . . . . . . . . . 90 + 8.14.1. Migration and State. . . . . . . . . . . . 90 + 8.14.2. Replication and State. . . . . . . . . . . 91 + 8.14.3. Notification of Migrated Lease . . . . . . 92 + 8.14.4. Migration and the Lease_time Attribute . . 92 + 9. Client-Side Caching . . . . . . . . . . . . . . . . . . . . 93 + 9.1. Performance Challenges for Client-Side Caching. . . 93 + 9.2. Delegation and Callbacks. . . . . . . . . . . . . . 94 + 9.2.1. Delegation Recovery . . . . . . . . . . . . 96 + 9.3. Data Caching. . . . . . . . . . . . . . . . . . . . 98 + 9.3.1. Data Caching and OPENs . . . . . . . . . . 98 + 9.3.2. Data Caching and File Locking. . . . . . . 99 + 9.3.3. Data Caching and Mandatory File Locking. . 101 + 9.3.4. Data Caching and File Identity . . . . . . 101 + 9.4. Open Delegation . . . . . . . . . . . . . . . . . . 102 + 9.4.1. Open Delegation and Data Caching . . . . . 104 + 9.4.2. Open Delegation and File Locks . . . . . . 106 + 9.4.3. Handling of CB_GETATTR . . . . . . . . . . 106 + 9.4.4. Recall of Open Delegation. . . . . . . . . 109 + 9.4.5. Clients that Fail to Honor + Delegation Recalls . . . . . . . . . . . . 111 + 9.4.6. Delegation Revocation. . . . . . . . . . . 112 + 9.5. Data Caching and Revocation . . . . . . . . . . . . 112 + 9.5.1. Revocation Recovery for Write Open + Delegation . . . . . . . . . . . . . . . . 113 + 9.6. Attribute Caching . . . . . . . . . . . . . . . . . 113 + 9.7. Data and Metadata Caching and Memory Mapped Files . 115 + 9.8. Name Caching . . . . . . . . . . . . . . . . . . . 118 + 9.9. Directory Caching . . . . . . . . . . . . . . . . . 119 + 10. Minor Versioning . . . . . . . . . . . . . . . . . . . . . 120 + 11. Internationalization . . . . . . . . . . . . . . . . . . . 122 + 11.1. Stringprep profile for the utf8str_cs type. . . . . 123 + 11.1.1. Intended applicability of the + nfs4_cs_prep profile . . . . . . . . . . . 123 + 11.1.2. Character repertoire of nfs4_cs_prep . . . 124 + 11.1.3. Mapping used by nfs4_cs_prep . . . . . . . 
124 + 11.1.4. Normalization used by nfs4_cs_prep . . . . 124 + 11.1.5. Prohibited output for nfs4_cs_prep . . . . 125 + 11.1.6. Bidirectional output for nfs4_cs_prep. . . 125 + + + +Shepler, et al. Standards Track [Page 4] + +RFC 3530 NFS version 4 Protocol April 2003 + + + 11.2. Stringprep profile for the utf8str_cis type . . . . 125 + 11.2.1. Intended applicability of the + nfs4_cis_prep profile. . . . . . . . . . . 125 + 11.2.2. Character repertoire of nfs4_cis_prep . . 125 + 11.2.3. Mapping used by nfs4_cis_prep . . . . . . 125 + 11.2.4. Normalization used by nfs4_cis_prep . . . 125 + 11.2.5. Prohibited output for nfs4_cis_prep . . . 126 + 11.2.6. Bidirectional output for nfs4_cis_prep . . 126 + 11.3. Stringprep profile for the utf8str_mixed type . . . 126 + 11.3.1. Intended applicability of the + nfs4_mixed_prep profile. . . . . . . . . . 126 + 11.3.2. Character repertoire of nfs4_mixed_prep . 126 + 11.3.3. Mapping used by nfs4_cis_prep . . . . . . 126 + 11.3.4. Normalization used by nfs4_mixed_prep . . 127 + 11.3.5. Prohibited output for nfs4_mixed_prep . . 127 + 11.3.6. Bidirectional output for nfs4_mixed_prep . 127 + 11.4. UTF-8 Related Errors. . . . . . . . . . . . . . . . 127 + 12. Error Definitions . . . . . . . . . . . . . . . . . . . . 128 + 13. NFS version 4 Requests . . . . . . . . . . . . . . . . . . 134 + 13.1. Compound Procedure. . . . . . . . . . . . . . . . . 134 + 13.2. Evaluation of a Compound Request. . . . . . . . . . 135 + 13.3. Synchronous Modifying Operations. . . . . . . . . . 136 + 13.4. Operation Values. . . . . . . . . . . . . . . . . . 136 + 14. NFS version 4 Procedures . . . . . . . . . . . . . . . . . 136 + 14.1. Procedure 0: NULL - No Operation. . . . . . . . . . 136 + 14.2. Procedure 1: COMPOUND - Compound Operations . . . . 137 + 14.2.1. Operation 3: ACCESS - Check Access + Rights. . . . . . . . . . . . . . . . . . 140 + 14.2.2. Operation 4: CLOSE - Close File . . . . . 142 + 14.2.3. Operation 5: COMMIT - Commit + Cached Data . . . . . . . . . . . . . . . 144 + 14.2.4. Operation 6: CREATE - Create a + Non-Regular File Object . . . . . . . . . 147 + 14.2.5. Operation 7: DELEGPURGE - + Purge Delegations Awaiting Recovery . . . 150 + 14.2.6. Operation 8: DELEGRETURN - Return + Delegation. . . . . . . . . . . . . . . . 151 + 14.2.7. Operation 9: GETATTR - Get Attributes . . 152 + 14.2.8. Operation 10: GETFH - Get Current + Filehandle. . . . . . . . . . . . . . . . 153 + 14.2.9. Operation 11: LINK - Create Link to a + File. . . . . . . . . . . . . . . . . . . 154 + 14.2.10. Operation 12: LOCK - Create Lock . . . . 156 + 14.2.11. Operation 13: LOCKT - Test For Lock . . . 160 + 14.2.12. Operation 14: LOCKU - Unlock File . . . . 162 + 14.2.13. Operation 15: LOOKUP - Lookup Filename. . 163 + 14.2.14. Operation 16: LOOKUPP - Lookup + Parent Directory. . . . . . . . . . . . . 165 + + + +Shepler, et al. Standards Track [Page 5] + +RFC 3530 NFS version 4 Protocol April 2003 + + + 14.2.15. Operation 17: NVERIFY - Verify + Difference in Attributes . . . . . . . . 166 + 14.2.16. Operation 18: OPEN - Open a Regular + File. . . . . . . . . . . . . . . . . . . 168 + 14.2.17. Operation 19: OPENATTR - Open Named + Attribute Directory . . . . . . . . . . . 178 + 14.2.18. Operation 20: OPEN_CONFIRM - + Confirm Open . . . . . . . . . . . . . . 180 + 14.2.19. Operation 21: OPEN_DOWNGRADE - + Reduce Open File Access . . . . . . . . . 182 + 14.2.20. Operation 22: PUTFH - Set + Current Filehandle. . . . . . . . . . . . 184 + 14.2.21. 
Operation 23: PUTPUBFH -
+                     Set Public Filehandle . . . . . . . . . .  185
+            14.2.22.  Operation 24: PUTROOTFH -
+                      Set Root Filehandle . . . . . . . . . . .  186
+            14.2.23.  Operation 25: READ - Read from File . . .  187
+            14.2.24.  Operation 26: READDIR -
+                      Read Directory. . . . . . . . . . . . . .  190
+            14.2.25.  Operation 27: READLINK -
+                      Read Symbolic Link. . . . . . . . . . . .  193
+            14.2.26.  Operation 28: REMOVE -
+                      Remove Filesystem Object. . . . . . . . .  195
+            14.2.27.  Operation 29: RENAME -
+                      Rename Directory Entry. . . . . . . . . .  197
+            14.2.28.  Operation 30: RENEW - Renew a Lease . . .  200
+            14.2.29.  Operation 31: RESTOREFH -
+                      Restore Saved Filehandle. . . . . . . . .  201
+            14.2.30.  Operation 32: SAVEFH - Save
+                      Current Filehandle. . . . . . . . . . . .  202
+            14.2.31.  Operation 33: SECINFO - Obtain
+                      Available Security. . . . . . . . . . . .  203
+            14.2.32.  Operation 34: SETATTR - Set Attributes. .  206
+            14.2.33.  Operation 35: SETCLIENTID -
+                      Negotiate Clientid. . . . . . . . . . . .  209
+            14.2.34.  Operation 36: SETCLIENTID_CONFIRM -
+                      Confirm Clientid. . . . . . . . . . . . .  213
+            14.2.35.  Operation 37: VERIFY -
+                      Verify Same Attributes. . . . . . . . . .  217
+            14.2.36.  Operation 38: WRITE - Write to File . . .  218
+            14.2.37.  Operation 39: RELEASE_LOCKOWNER -
+                      Release Lockowner State . . . . . . . . .  223
+            14.2.38.  Operation 10044: ILLEGAL -
+                      Illegal operation . . . . . . . . . . . .  224
+   15.  NFS version 4 Callback Procedures . . . . . . . . . . . .  225
+        15.1.  Procedure 0: CB_NULL - No Operation . . . . . . . .  225
+        15.2.  Procedure 1: CB_COMPOUND - Compound
+               Operations. . . . . . . . . . . . . . . . . . . . .  226
+
+
+
+Shepler, et al.             Standards Track                     [Page 6]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+               15.2.1.  Operation 3: CB_GETATTR - Get
+                        Attributes . . . . . . . . . . . . . . . .  228
+               15.2.2.  Operation 4: CB_RECALL -
+                        Recall an Open Delegation. . . . . . . . .  229
+               15.2.3.  Operation 10044: CB_ILLEGAL -
+                        Illegal Callback Operation . . . . . . . .  230
+   16.  Security Considerations . . . . . . . . . . . . . . . . .  231
+   17.  IANA Considerations . . . . . . . . . . . . . . . . . . .  232
+        17.1.  Named Attribute Definition. . . . . . . . . . . . .  232
+        17.2.  ONC RPC Network Identifiers (netids). . . . . . . .  232
+   18.  RPC definition file . . . . . . . . . . . . . . . . . . .  234
+   19.  Acknowledgements . . . . . . . . . . . . . . . . . . . . .  268
+   20.  Normative References . . . . . . . . . . . . . . . . . . .  268
+   21.  Informative References . . . . . . . . . . . . . . . . . .  270
+   22.  Authors' Information . . . . . . . . . . . . . . . . . . .  273
+        22.1.  Editor's Address. . . . . . . . . . . . . . . . . .  273
+        22.2.  Authors' Addresses. . . . . . . . . . . . . . . . .  274
+   23.  Full Copyright Statement . . . . . . . . . . . . . . . . .  275
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                     [Page 7]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+1.  Introduction
+
+1.1.  Changes since RFC 3010
+
+   This definition of the NFS version 4 protocol replaces or obsoletes
+   the definition present in [RFC3010].  While portions of the two
+   documents have remained the same, there have been substantive changes
+   in others.  The changes made between [RFC3010] and this document
+   represent implementation experience and further review of the
+   protocol.  While some modifications were made for ease of
+   implementation or clarification, most updates represent errors or
+   situations where the [RFC3010] definition was untenable.
+
+   The following list is not inclusive of all changes, but presents
+   some of the most notable changes or additions:
+
+   o  The state model has added an open_owner4 identifier.  This was
+      done to accommodate Posix based clients and the model they use for
+      file locking.  For Posix clients, an open_owner4 would correspond
+      to a file descriptor potentially shared amongst a set of processes
+      and the lock_owner4 identifier would correspond to a process that
+      is locking a file.
+
+   o  Clarifications and error conditions were added for the handling of
+      the owner and group attributes.  Since these attributes are string
+      based (as opposed to the numeric uid/gid of previous versions of
+      NFS), translations may not be available, and hence these changes
+      were made.
+
+   o  Clarifications for the ACL and mode attributes to address
+      evaluation and partial support.
+
+   o  For identifiers that are defined as XDR opaque, limits were set on
+      their size.
+
+   o  Added the mounted_on_fileid attribute to allow Posix clients to
+      correctly construct local mounts.
+
+   o  Modified the SETCLIENTID/SETCLIENTID_CONFIRM operations to deal
+      correctly with confirmation details along with adding the ability
+      to specify new client callback information.  Also added
+      clarification of the callback information itself.
+
+   o  Added a new operation RELEASE_LOCKOWNER to enable notifying the
+      server that a lock_owner4 will no longer be used by the client.
+
+   o  RENEW operation changes to identify the client correctly and allow
+      for additional error returns.
+
+
+
+Shepler, et al.             Standards Track                     [Page 8]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  Verified error return possibilities for all operations.
+
+   o  Removed use of the pathname4 data type from LOOKUP and OPEN in
+      favor of having the client construct a sequence of LOOKUP
+      operations to achieve the same effect.
+
+   o  Clarification of the internationalization issues and adoption of
+      the new stringprep profile framework.
+
+1.2.  NFS Version 4 Goals
+
+   The NFS version 4 protocol is a further revision of the NFS protocol
+   defined already by versions 2 [RFC1094] and 3 [RFC1813].  It retains
+   the essential characteristics of previous versions: design for easy
+   recovery, independence from transport protocols, operating systems,
+   and filesystems, simplicity, and good performance.  The NFS version 4
+   revision has the following goals:
+
+   o  Improved access and good performance on the Internet.
+
+      The protocol is designed to transit firewalls easily, perform well
+      where latency is high and bandwidth is low, and scale to very
+      large numbers of clients per server.
+
+   o  Strong security with negotiation built into the protocol.
+
+      The protocol builds on the work of the ONCRPC working group in
+      supporting the RPCSEC_GSS protocol.  Additionally, the NFS version
+      4 protocol provides a mechanism that allows clients and servers to
+      negotiate security and requires clients and servers to support a
+      minimal set of security schemes.
+
+   o  Good cross-platform interoperability.
+
+      The protocol features a filesystem model that provides a useful,
+      common set of features that does not unduly favor one filesystem
+      or operating system over another.
+
+   o  Designed for protocol extensions.
+
+      The protocol is designed to accept standard extensions that do not
+      compromise backward compatibility.
+
+1.3. 
Inconsistencies of this Document with Section 18 + + Section 18, RPC Definition File, contains the definitions in XDR + description language of the constructs used by the protocol. Prior + to Section 18, several of the constructs are reproduced for purposes + + + +Shepler, et al. Standards Track [Page 9] + +RFC 3530 NFS version 4 Protocol April 2003 + + + of explanation. The reader is warned of the possibility of errors in + the reproduced constructs outside of Section 18. For any part of the + document that is inconsistent with Section 18, Section 18 is to be + considered authoritative. + +1.4. Overview of NFS version 4 Features + + To provide a reasonable context for the reader, the major features of + NFS version 4 protocol will be reviewed in brief. This will be done + to provide an appropriate context for both the reader who is familiar + with the previous versions of the NFS protocol and the reader that is + new to the NFS protocols. For the reader new to the NFS protocols, + there is still a fundamental knowledge that is expected. The reader + should be familiar with the XDR and RPC protocols as described in + [RFC1831] and [RFC1832]. A basic knowledge of filesystems and + distributed filesystems is expected as well. + +1.4.1. RPC and Security + + As with previous versions of NFS, the External Data Representation + (XDR) and Remote Procedure Call (RPC) mechanisms used for the NFS + version 4 protocol are those defined in [RFC1831] and [RFC1832]. To + meet end to end security requirements, the RPCSEC_GSS framework + [RFC2203] will be used to extend the basic RPC security. With the + use of RPCSEC_GSS, various mechanisms can be provided to offer + authentication, integrity, and privacy to the NFS version 4 protocol. + Kerberos V5 will be used as described in [RFC1964] to provide one + security framework. The LIPKEY GSS-API mechanism described in + [RFC2847] will be used to provide for the use of user password and + server public key by the NFS version 4 protocol. With the use of + RPCSEC_GSS, other mechanisms may also be specified and used for NFS + version 4 security. + + To enable in-band security negotiation, the NFS version 4 protocol + has added a new operation which provides the client a method of + querying the server about its policies regarding which security + mechanisms must be used for access to the server's filesystem + resources. With this, the client can securely match the security + mechanism that meets the policies specified at both the client and + server. + +1.4.2. Procedure and Operation Structure + + A significant departure from the previous versions of the NFS + protocol is the introduction of the COMPOUND procedure. For the NFS + version 4 protocol, there are two RPC procedures, NULL and COMPOUND. + The COMPOUND procedure is defined in terms of operations and these + operations correspond more closely to the traditional NFS procedures. + + + +Shepler, et al. Standards Track [Page 10] + +RFC 3530 NFS version 4 Protocol April 2003 + + + With the use of the COMPOUND procedure, the client is able to build + simple or complex requests. These COMPOUND requests allow for a + reduction in the number of RPCs needed for logical filesystem + operations. For example, without previous contact with a server a + client will be able to read data from a file in one request by + combining LOOKUP, OPEN, and READ operations in a single COMPOUND RPC. + With previous versions of the NFS protocol, this type of single + request was not possible. 
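+
+   As a non-normative illustration (an editor's sketch, not part of the
+   protocol definition), the following C fragment shows how such a
+   request might be assembled as an ordered list of operations.  All
+   type and constant names here are hypothetical, although the
+   operation numbers match those listed for each operation later in
+   this document.
+
+      /*
+       * Illustrative sketch only: composing the LOOKUP/OPEN/READ
+       * example above as one COMPOUND request.  A real client encodes
+       * each operation's arguments in XDR as defined in Section 18.
+       */
+      #include <stdio.h>
+
+      enum nfs_opnum {               /* selected operation numbers */
+          OP_LOOKUP    = 15,
+          OP_OPEN      = 18,
+          OP_PUTROOTFH = 24,
+          OP_READ      = 25
+      };
+
+      struct nfs_op {
+          enum nfs_opnum op;
+          const char    *arg;        /* simplified argument stand-in */
+      };
+
+      int main(void)
+      {
+          /* One COMPOUND: establish a filehandle, walk the path one
+             component at a time, open the file, and read from it. */
+          struct nfs_op compound[] = {
+              { OP_PUTROOTFH, NULL       },
+              { OP_LOOKUP,    "export"   },
+              { OP_LOOKUP,    "home"     },
+              { OP_OPEN,      "data.txt" },
+              { OP_READ,      "0-4095"   }
+          };
+          size_t i, n = sizeof compound / sizeof compound[0];
+
+          /* The server evaluates the operations strictly in order. */
+          for (i = 0; i < n; i++)
+              printf("op %d arg=%s\n", (int)compound[i].op,
+                     compound[i].arg ? compound[i].arg : "-");
+          return 0;
+      }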
+ + The model used for COMPOUND is very simple. There is no logical OR + or ANDing of operations. The operations combined within a COMPOUND + request are evaluated in order by the server. Once an operation + returns a failing result, the evaluation ends and the results of all + evaluated operations are returned to the client. + + The NFS version 4 protocol continues to have the client refer to a + file or directory at the server by a "filehandle". The COMPOUND + procedure has a method of passing a filehandle from one operation to + another within the sequence of operations. There is a concept of a + "current filehandle" and "saved filehandle". Most operations use the + "current filehandle" as the filesystem object to operate upon. The + "saved filehandle" is used as temporary filehandle storage within a + COMPOUND procedure as well as an additional operand for certain + operations. + +1.4.3. Filesystem Model + + The general filesystem model used for the NFS version 4 protocol is + the same as previous versions. The server filesystem is hierarchical + with the regular files contained within being treated as opaque byte + streams. In a slight departure, file and directory names are encoded + with UTF-8 to deal with the basics of internationalization. + + The NFS version 4 protocol does not require a separate protocol to + provide for the initial mapping between path name and filehandle. + Instead of using the older MOUNT protocol for this mapping, the + server provides a ROOT filehandle that represents the logical root or + top of the filesystem tree provided by the server. The server + provides multiple filesystems by gluing them together with pseudo + filesystems. These pseudo filesystems provide for potential gaps in + the path names between real filesystems. + +1.4.3.1. Filehandle Types + + In previous versions of the NFS protocol, the filehandle provided by + the server was guaranteed to be valid or persistent for the lifetime + of the filesystem object to which it referred. For some server + implementations, this persistence requirement has been difficult to + + + +Shepler, et al. Standards Track [Page 11] + +RFC 3530 NFS version 4 Protocol April 2003 + + + meet. For the NFS version 4 protocol, this requirement has been + relaxed by introducing another type of filehandle, volatile. With + persistent and volatile filehandle types, the server implementation + can match the abilities of the filesystem at the server along with + the operating environment. The client will have knowledge of the + type of filehandle being provided by the server and can be prepared + to deal with the semantics of each. + +1.4.3.2. Attribute Types + + The NFS version 4 protocol introduces three classes of filesystem or + file attributes. Like the additional filehandle type, the + classification of file attributes has been done to ease server + implementations along with extending the overall functionality of the + NFS protocol. This attribute model is structured to be extensible + such that new attributes can be introduced in minor revisions of the + protocol without requiring significant rework. + + The three classifications are: mandatory, recommended and named + attributes. This is a significant departure from the previous + attribute model used in the NFS protocol. Previously, the attributes + for the filesystem and file objects were a fixed set of mainly UNIX + attributes. If the server or client did not support a particular + attribute, it would have to simulate the attribute the best it could. 
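+
+   Whatever its class, each attribute is identified by a bit number and
+   requested or returned via a bitmap (see the bitmap4 and fattr4
+   definitions in Section 2.2, where the position of attribute bit n is
+   given as word n / 32, bit n mod 32).  The following C sketch (an
+   editor's non-normative illustration; all names are hypothetical)
+   shows that arithmetic:
+
+      /* Illustrative sketch only: set and test attribute bit n in a
+         bitmap4-style counted array of 32-bit words. */
+      #include <stdint.h>
+      #include <stdio.h>
+
+      static void bitmap_set(uint32_t *bm, unsigned n)
+      {
+          bm[n / 32] |= (uint32_t)1 << (n % 32);
+      }
+
+      static int bitmap_isset(const uint32_t *bm, unsigned n)
+      {
+          return (bm[n / 32] >> (n % 32)) & 1;
+      }
+
+      int main(void)
+      {
+          uint32_t attrmask[2] = { 0, 0 };   /* two 32-bit words */
+
+          bitmap_set(attrmask, 1);    /* a bit in the first word  */
+          bitmap_set(attrmask, 33);   /* a bit in the second word */
+          printf("bit 33: word %d, bit %d, set=%d\n",
+                 33 / 32, 33 % 32, bitmap_isset(attrmask, 33));
+          return 0;
+      }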
+
+   Mandatory attributes are the minimal set of file or filesystem
+   attributes that must be provided by the server and must be properly
+   represented by the server.  Recommended attributes represent
+   different filesystem types and operating environments.  The
+   recommended attributes will allow for better interoperability and the
+   inclusion of more operating environments.  The mandatory and
+   recommended attribute sets are traditional file or filesystem
+   attributes.  The third type of attribute is the named attribute.  A
+   named attribute is an opaque byte stream that is associated with a
+   directory or file and referred to by a string name.  Named attributes
+   are meant to be used by client applications as a method to associate
+   application specific data with a regular file or directory.
+
+   One significant addition to the recommended set of file attributes is
+   the Access Control List (ACL) attribute.  This attribute provides for
+   directory and file access control beyond the model used in previous
+   versions of the NFS protocol.  The ACL definition allows for
+   specification of user and group level access control.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 12]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+1.4.3.3.  Filesystem Replication and Migration
+
+   With the use of a special file attribute, the ability to migrate or
+   replicate server filesystems is enabled within the protocol.  The
+   filesystem locations attribute provides a method for the client to
+   probe the server about the location of a filesystem.  In the event of
+   a migration of a filesystem, the client will receive an error when
+   operating on the filesystem and it can then query the server for the
+   new filesystem location.  Similar steps are used for replication: the
+   client is able to query the server for the multiple available
+   locations of a particular filesystem.  From this information, the
+   client can use its own policies to access the appropriate filesystem
+   location.
+
+1.4.4.  OPEN and CLOSE
+
+   The NFS version 4 protocol introduces OPEN and CLOSE operations.  The
+   OPEN operation provides a single point where file lookup, creation,
+   and share semantics can be combined.  The CLOSE operation also
+   provides for the release of state accumulated by OPEN.
+
+1.4.5.  File locking
+
+   With the NFS version 4 protocol, the support for byte range file
+   locking is part of the NFS protocol.  The file locking support is
+   structured so that an RPC callback mechanism is not required.  This
+   is a departure from the previous versions of the NFS file locking
+   protocol, Network Lock Manager (NLM).  The state associated with file
+   locks is maintained at the server under a lease-based model.  The
+   server defines a single lease period for all state held by an NFS
+   client.  If the client does not renew its lease within the defined
+   period, all state associated with the client's lease may be released
+   by the server.  The client may renew its lease with use of the RENEW
+   operation or implicitly by use of other operations (primarily READ).
+
+1.4.6.  Client Caching and Delegation
+
+   The file, attribute, and directory caching for the NFS version 4
+   protocol is similar to previous versions.  Attributes and directory
+   information are cached for a duration determined by the client.  At
+   the end of a predefined timeout, the client will query the server to
+   see if the related filesystem object has been updated.
+
+   For file data, the client checks its cache validity when the file is
+   opened. 
A query is sent to the server to determine if the file has
+   been changed.  Based on this information, the client determines if
+   the data cache for the file should be kept or released.  Also, when
+   the file is closed, any modified data is written to the server.
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 13]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   If an application wants to serialize access to file data, file
+   locking of the file data ranges in question should be used.
+
+   The major addition to NFS version 4 in the area of caching is the
+   ability of the server to delegate certain responsibilities to the
+   client.  When the server grants a delegation for a file to a client,
+   the client is guaranteed certain semantics with respect to the
+   sharing of that file with other clients.  At OPEN, the server may
+   provide the client either a read or write delegation for the file.
+   If the client is granted a read delegation, it is assured that no
+   other client has the ability to write to the file for the duration of
+   the delegation.  If the client is granted a write delegation, the
+   client is assured that no other client has read or write access to
+   the file.
+
+   Delegations can be recalled by the server.  If another client
+   requests access to the file in such a way that the access conflicts
+   with the granted delegation, the server is able to notify the initial
+   client and recall the delegation.  This requires that a callback path
+   exist between the server and client.  If this callback path does not
+   exist, then delegations cannot be granted.  The essence of a
+   delegation is that it allows the client to locally service operations
+   such as OPEN, CLOSE, LOCK, LOCKU, READ, and WRITE without immediate
+   interaction with the server.
+
+1.5.  General Definitions
+
+   The following definitions are provided for the purpose of providing
+   an appropriate context for the reader.
+
+   Client      The "client" is the entity that accesses the NFS server's
+               resources.  The client may be an application which
+               contains the logic to access the NFS server directly.
+               The client may also be the traditional operating system
+               client that provides remote filesystem services for a set
+               of applications.
+
+               In the case of file locking, the client is the entity
+               that maintains a set of locks on behalf of one or more
+               applications.  This client is responsible for crash or
+               failure recovery for those locks it manages.
+
+               Note that multiple clients may share the same transport
+               and multiple clients may exist on the same network node.
+
+   Clientid    A 64-bit quantity used as a unique, short-hand reference
+               to a client supplied Verifier and ID.  The server is
+               responsible for supplying the Clientid.
+
+
+
+Shepler, et al.             Standards Track                    [Page 14]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   Lease       An interval of time defined by the server for which the
+               client is irrevocably granted a lock.  At the end of a
+               lease period the lock may be revoked if the lease has not
+               been extended.  The lock must be revoked if a conflicting
+               lock has been granted after the lease interval.
+
+               All leases granted by a server have the same fixed
+               interval.  Note that the fixed interval was chosen to
+               alleviate the expense a server would have in maintaining
+               state about variable length leases across server
+               failures.
+
+   Lock        The term "lock" is used to refer to both record (byte-
+               range) locks as well as share reservations unless
+               specifically stated otherwise. 
+ + Server The "Server" is the entity responsible for coordinating + client access to a set of filesystems. + + Stable Storage + NFS version 4 servers must be able to recover without data + loss from multiple power failures (including cascading + power failures, that is, several power failures in quick + succession), operating system failures, and hardware + failure of components other than the storage medium itself + (for example, disk, nonvolatile RAM). + + Some examples of stable storage that are allowable for an + NFS server include: + + 1. Media commit of data, that is, the modified data has + been successfully written to the disk media, for + example, the disk platter. + + 2. An immediate reply disk drive with battery-backed on- + drive intermediate storage or uninterruptible power + system (UPS). + + 3. Server commit of data with battery-backed intermediate + storage and recovery software. + + 4. Cache commit with uninterruptible power system (UPS) and + recovery software. + + Stateid A 128-bit quantity returned by a server that uniquely + defines the open and locking state provided by the server + for a specific open or lock owner for a specific file. + + + + + +Shepler, et al. Standards Track [Page 15] + +RFC 3530 NFS version 4 Protocol April 2003 + + + Stateids composed of all bits 0 or all bits 1 have special + meaning and are reserved values. + + Verifier A 64-bit quantity generated by the client that the server + can use to determine if the client has restarted and lost + all previous lock state. + +2. Protocol Data Types + + The syntax and semantics to describe the data types of the NFS + version 4 protocol are defined in the XDR [RFC1832] and RPC [RFC1831] + documents. The next sections build upon the XDR data types to define + types and structures specific to this protocol. + +2.1. Basic Data Types + + Data Type Definition + ____________________________________________________________________ + int32_t typedef int int32_t; + + uint32_t typedef unsigned int uint32_t; + + int64_t typedef hyper int64_t; + + uint64_t typedef unsigned hyper uint64_t; + + attrlist4 typedef opaque attrlist4<>; + Used for file/directory attributes + + bitmap4 typedef uint32_t bitmap4<>; + Used in attribute array encoding. + + changeid4 typedef uint64_t changeid4; + Used in definition of change_info + + clientid4 typedef uint64_t clientid4; + Shorthand reference to client identification + + component4 typedef utf8str_cs component4; + Represents path name components + + count4 typedef uint32_t count4; + Various count parameters (READ, WRITE, COMMIT) + + length4 typedef uint64_t length4; + Describes LOCK lengths + + + + + +Shepler, et al. 
Standards Track                    [Page 16]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   linktext4       typedef utf8str_cs linktext4;
+                   Symbolic link contents
+
+   mode4           typedef uint32_t mode4;
+                   Mode attribute data type
+
+   nfs_cookie4     typedef uint64_t nfs_cookie4;
+                   Opaque cookie value for READDIR
+
+   nfs_fh4         typedef opaque nfs_fh4<NFS4_FHSIZE>;
+                   Filehandle definition; NFS4_FHSIZE is defined as 128
+
+   nfs_ftype4      enum nfs_ftype4;
+                   Various defined file types
+
+   nfsstat4        enum nfsstat4;
+                   Return value for operations
+
+   offset4         typedef uint64_t offset4;
+                   Various offset designations (READ, WRITE,
+                   LOCK, COMMIT)
+
+   pathname4       typedef component4 pathname4<>;
+                   Represents path name for LOOKUP, OPEN and others
+
+   qop4            typedef uint32_t qop4;
+                   Quality of protection designation in SECINFO
+
+   sec_oid4        typedef opaque sec_oid4<>;
+                   Security Object Identifier
+                   The sec_oid4 data type is not really opaque.
+                   Instead, it contains an ASN.1 OBJECT IDENTIFIER as
+                   used by GSS-API in the mech_type argument to
+                   GSS_Init_sec_context.  See [RFC2743] for details.
+
+   seqid4          typedef uint32_t seqid4;
+                   Sequence identifier used for file locking
+
+   utf8string      typedef opaque utf8string<>;
+                   UTF-8 encoding for strings
+
+   utf8str_cis     typedef utf8string utf8str_cis;
+                   Case-insensitive UTF-8 string
+
+   utf8str_cs      typedef utf8string utf8str_cs;
+                   Case-sensitive UTF-8 string
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 17]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   utf8str_mixed   typedef utf8string utf8str_mixed;
+                   UTF-8 strings with a case sensitive prefix and
+                   a case insensitive suffix.
+
+   verifier4       typedef opaque verifier4[NFS4_VERIFIER_SIZE];
+                   Verifier used for various operations (COMMIT,
+                   CREATE, OPEN, READDIR, SETCLIENTID,
+                   SETCLIENTID_CONFIRM, WRITE) NFS4_VERIFIER_SIZE is
+                   defined as 8.
+
+2.2.  Structured Data Types
+
+   nfstime4
+
+      struct nfstime4 {
+              int64_t         seconds;
+              uint32_t        nseconds;
+      };
+
+   The nfstime4 structure gives the number of seconds and nanoseconds
+   since midnight or 0 hour January 1, 1970 Coordinated Universal Time
+   (UTC).  Values greater than zero for the seconds field denote dates
+   after the 0 hour January 1, 1970.  Values less than zero for the
+   seconds field denote dates before the 0 hour January 1, 1970.  In
+   both cases, the nseconds field is to be added to the seconds field
+   for the final time representation.  For example, if the time to be
+   represented is one-half second before 0 hour January 1, 1970, the
+   seconds field would have a value of negative one (-1) and the
+   nseconds field would have a value of one-half second (500000000).
+   Values greater than 999,999,999 for nseconds are considered invalid.
+
+   This data type is used to pass time and date information.  A server
+   converts to and from its local representation of time when processing
+   time values, preserving as much accuracy as possible.  If the
+   precision of timestamps stored for a filesystem object is less than
+   that defined here, loss of precision can occur.  An adjunct time
+   maintenance protocol is recommended to reduce client and server time
+   skew.
+
+   time_how4
+
+      enum time_how4 {
+              SET_TO_SERVER_TIME4 = 0,
+              SET_TO_CLIENT_TIME4 = 1
+      };
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 18]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   settime4
+
+      union settime4 switch (time_how4 set_it) {
+       case SET_TO_CLIENT_TIME4:
+               nfstime4        time;
+       default:
+               void;
+      };
+
+   The above definitions are used as the attribute definitions to set
+   time values. 
If set_it is SET_TO_SERVER_TIME4, then the server uses + its local representation of time for the time value. + + specdata4 + + struct specdata4 { + uint32_t specdata1; /* major device number */ + uint32_t specdata2; /* minor device number */ + }; + + This data type represents additional information for the device file + types NF4CHR and NF4BLK. + + fsid4 + + struct fsid4 { + uint64_t major; + uint64_t minor; + }; + + This type is the filesystem identifier that is used as a mandatory + attribute. + + fs_location4 + + struct fs_location4 { + utf8str_cis server<>; + pathname4 rootpath; + }; + + + fs_locations4 + + struct fs_locations4 { + pathname4 fs_root; + fs_location4 locations<>; + }; + + + + +Shepler, et al. Standards Track [Page 19] + +RFC 3530 NFS version 4 Protocol April 2003 + + + The fs_location4 and fs_locations4 data types are used for the + fs_locations recommended attribute which is used for migration and + replication support. + + fattr4 + + struct fattr4 { + bitmap4 attrmask; + attrlist4 attr_vals; + }; + + The fattr4 structure is used to represent file and directory + attributes. + + The bitmap is a counted array of 32 bit integers used to contain bit + values. The position of the integer in the array that contains bit n + can be computed from the expression (n / 32) and its bit within that + integer is (n mod 32). + + 0 1 + +-----------+-----------+-----------+-- + | count | 31 .. 0 | 63 .. 32 | + +-----------+-----------+-----------+-- + + change_info4 + + struct change_info4 { + bool atomic; + changeid4 before; + changeid4 after; + }; + + This structure is used with the CREATE, LINK, REMOVE, RENAME + operations to let the client know the value of the change attribute + for the directory in which the target filesystem object resides. + + clientaddr4 + + struct clientaddr4 { + /* see struct rpcb in RFC 1833 */ + string r_netid<>; /* network id */ + string r_addr<>; /* universal address */ + }; + + The clientaddr4 structure is used as part of the SETCLIENTID + operation to either specify the address of the client that is using a + clientid or as part of the callback registration. The + + + + +Shepler, et al. Standards Track [Page 20] + +RFC 3530 NFS version 4 Protocol April 2003 + + + r_netid and r_addr fields are specified in [RFC1833], but they are + underspecified in [RFC1833] as far as what they should look like for + specific protocols. + + For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the + US-ASCII string: + + h1.h2.h3.h4.p1.p2 + + The prefix, "h1.h2.h3.h4", is the standard textual form for + representing an IPv4 address, which is always four octets long. + Assuming big-endian ordering, h1, h2, h3, and h4, are respectively, + the first through fourth octets each converted to ASCII-decimal. + Assuming big-endian ordering, p1 and p2 are, respectively, the first + and second octets each converted to ASCII-decimal. For example, if a + host, in big-endian order, has an address of 0x0A010307 and there is + a service listening on, in big endian order, port 0x020F (decimal + 527), then the complete universal address is "10.1.3.7.2.15". + + For TCP over IPv4 the value of r_netid is the string "tcp". For UDP + over IPv4 the value of r_netid is the string "udp". + + For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the + US-ASCII string: + + x1:x2:x3:x4:x5:x6:x7:x8.p1.p2 + + The suffix "p1.p2" is the service port, and is computed the same way + as with universal addresses for TCP and UDP over IPv4. 
The prefix,
+   "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for
+   representing an IPv6 address as defined in Section 2.2 of [RFC2373].
+   Additionally, the two alternative forms specified in Section 2.2 of
+   [RFC2373] are also acceptable.
+
+   For TCP over IPv6 the value of r_netid is the string "tcp6".  For UDP
+   over IPv6 the value of r_netid is the string "udp6".
+
+   cb_client4
+
+      struct cb_client4 {
+              unsigned int    cb_program;
+              clientaddr4     cb_location;
+      };
+
+   This structure is used by the client to inform the server of its
+   callback address; it includes the program number and client address.
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 21]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   nfs_client_id4
+
+      struct nfs_client_id4 {
+              verifier4       verifier;
+              opaque          id<NFS4_OPAQUE_LIMIT>;
+      };
+
+   This structure is part of the arguments to the SETCLIENTID operation.
+   NFS4_OPAQUE_LIMIT is defined as 1024.
+
+   open_owner4
+
+      struct open_owner4 {
+              clientid4       clientid;
+              opaque          owner<NFS4_OPAQUE_LIMIT>;
+      };
+
+   This structure is used to identify the owner of open state.
+   NFS4_OPAQUE_LIMIT is defined as 1024.
+
+   lock_owner4
+
+      struct lock_owner4 {
+              clientid4       clientid;
+              opaque          owner<NFS4_OPAQUE_LIMIT>;
+      };
+
+   This structure is used to identify the owner of file locking state.
+   NFS4_OPAQUE_LIMIT is defined as 1024.
+
+   open_to_lock_owner4
+
+      struct open_to_lock_owner4 {
+              seqid4          open_seqid;
+              stateid4        open_stateid;
+              seqid4          lock_seqid;
+              lock_owner4     lock_owner;
+      };
+
+   This structure is used for the first LOCK operation done for an
+   open_owner4.  It provides both the open_stateid and lock_owner such
+   that the transition is made from a valid open_stateid sequence to
+   that of the new lock_stateid sequence.  Using this mechanism avoids
+   the confirmation of the lock_owner/lock_seqid pair since it is tied
+   to established state in the form of the open_stateid/open_seqid.
+
+
+
+
+
+Shepler, et al.             Standards Track                    [Page 22]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   stateid4
+
+      struct stateid4 {
+              uint32_t        seqid;
+              opaque          other[12];
+      };
+
+   This structure is used for the various state sharing mechanisms
+   between the client and server.  For the client, this data structure
+   is read-only.  The starting value of the seqid field is undefined.
+   The server is required to increment the seqid field monotonically at
+   each transition of the stateid.  This is important since the client
+   will inspect the seqid in OPEN stateids to determine the order of
+   OPEN processing done by the server.
+
+3.  RPC and Security Flavor
+
+   The NFS version 4 protocol is a Remote Procedure Call (RPC)
+   application that uses RPC version 2 and the corresponding eXternal
+   Data Representation (XDR) as defined in [RFC1831] and [RFC1832].  The
+   RPCSEC_GSS security flavor as defined in [RFC2203] MUST be used as
+   the mechanism to deliver stronger security for the NFS version 4
+   protocol.
+
+3.1.  Ports and Transports
+
+   Historically, NFS version 2 and version 3 servers have resided on
+   port 2049.  The registered port 2049 [RFC3232] for the NFS protocol
+   should be the default configuration.  Using the registered port for
+   NFS services means the NFS client will not need to use the RPC
+   binding protocols as described in [RFC1833]; this will allow NFS to
+   transit firewalls.
+
+   Where an NFS version 4 implementation supports operation over the IP
+   network protocol, the supported transports between NFS and IP MUST be
+   among the IETF-approved congestion control transport protocols, which
+   include TCP and SCTP. 
To enhance the possibilities for
+   interoperability, an NFS version 4 implementation MUST support
+   operation over the TCP transport protocol, at least until such time
+   as a standards track RFC revises this requirement to use a different
+   IETF-approved congestion control transport protocol.
+
+   If TCP is used as the transport, the client and server SHOULD use
+   persistent connections.  This will prevent the weakening of TCP's
+   congestion control via short-lived connections and will improve
+   performance for the WAN environment by eliminating the need for SYN
+   handshakes.
+
+
+
+Shepler, et al.             Standards Track                    [Page 23]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   As noted in the Security Considerations section, the authentication
+   model for NFS version 4 has moved from machine-based to principal-
+   based.  However, this modification of the authentication model does
+   not imply a technical requirement to move the TCP connection
+   management model from whole machine-based to one based on a per user
+   model.  In particular, NFS over TCP client implementations have
+   traditionally multiplexed traffic for multiple users over a common
+   TCP connection between an NFS client and server.  This has been true,
+   regardless of whether the NFS client is using AUTH_SYS, AUTH_DH,
+   RPCSEC_GSS or any other flavor.  Similarly, NFS over TCP server
+   implementations have assumed such a model and thus scale the
+   implementation of TCP connection management in proportion to the
+   number of expected client machines.  It is intended that NFS version
+   4 will not modify this connection management model.  NFS version 4
+   clients that violate this assumption can expect scaling issues on the
+   server and hence reduced service.
+
+   Note that for various timers, the client and server should avoid
+   inadvertent synchronization of those timers.  For further discussion
+   of the general issue refer to [Floyd].
+
+3.1.1.  Client Retransmission Behavior
+
+   When processing a request received over a reliable transport such as
+   TCP, the NFS version 4 server MUST NOT silently drop the request,
+   except if the transport connection has been broken.  Given such a
+   contract between NFS version 4 clients and servers, clients MUST NOT
+   retry a request unless one or both of the following are true:
+
+   o The transport connection has been broken
+
+   o The procedure being retried is the NULL procedure
+
+   Since reliable transports, such as TCP, do not always synchronously
+   inform a peer when the other peer has broken the connection (for
+   example, when an NFS server reboots), the NFS version 4 client may
+   want to actively "probe" the connection to see if it has been broken.
+   Use of the NULL procedure is one recommended way to do so.  So, when
+   a client experiences a remote procedure call timeout (of some
+   arbitrary implementation-specific amount), rather than retrying the
+   remote procedure call, it could instead issue a NULL procedure call
+   to the server.  If the server has died, the transport connection
+   break will eventually be indicated to the NFS version 4 client.  The
+   client can then reconnect and retry the original request.  If the
+   NULL procedure call gets a response, the connection has not broken.
+   The client can decide to wait longer for the original request's
+   response, or it can break the transport connection and reconnect
+   before re-sending the original request.
+
+
+
+Shepler, et al. 
Standards Track [Page 24] + +RFC 3530 NFS version 4 Protocol April 2003 + + + For callbacks from the server to the client, the same rules apply, + but the server doing the callback becomes the client, and the client + receiving the callback becomes the server. + +3.2. Security Flavors + + Traditional RPC implementations have included AUTH_NONE, AUTH_SYS, + AUTH_DH, and AUTH_KRB4 as security flavors. With [RFC2203] an + additional security flavor of RPCSEC_GSS has been introduced which + uses the functionality of GSS-API [RFC2743]. This allows for the use + of various security mechanisms by the RPC layer without the + additional implementation overhead of adding RPC security flavors. + For NFS version 4, the RPCSEC_GSS security flavor MUST be used to + enable the mandatory security mechanism. Other flavors, such as, + AUTH_NONE, AUTH_SYS, and AUTH_DH MAY be implemented as well. + +3.2.1. Security mechanisms for NFS version 4 + + The use of RPCSEC_GSS requires selection of: mechanism, quality of + protection, and service (authentication, integrity, privacy). The + remainder of this document will refer to these three parameters of + the RPCSEC_GSS security as the security triple. + +3.2.1.1. Kerberos V5 as a security triple + + The Kerberos V5 GSS-API mechanism as described in [RFC1964] MUST be + implemented and provide the following security triples. + + column descriptions: + + 1 == number of pseudo flavor + 2 == name of pseudo flavor + 3 == mechanism's OID + 4 == mechanism's algorithm(s) + 5 == RPCSEC_GSS service + + 1 2 3 4 5 + -------------------------------------------------------------------- + 390003 krb5 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_none + 390004 krb5i 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_integrity + 390005 krb5p 1.2.840.113554.1.2.2 DES MAC MD5 rpc_gss_svc_privacy + for integrity, + and 56 bit DES + for privacy. + + Note that the pseudo flavor is presented here as a mapping aid to the + implementor. Because this NFS protocol includes a method to + negotiate security and it understands the GSS-API mechanism, the + + + +Shepler, et al. Standards Track [Page 25] + +RFC 3530 NFS version 4 Protocol April 2003 + + + pseudo flavor is not needed. The pseudo flavor is needed for NFS + version 3 since the security negotiation is done via the MOUNT + protocol. + + For a discussion of NFS' use of RPCSEC_GSS and Kerberos V5, please + see [RFC2623]. + + Users and implementors are warned that 56 bit DES is no longer + considered state of the art in terms of resistance to brute force + attacks. Once a revision to [RFC1964] is available that adds support + for AES, implementors are urged to incorporate AES into their NFSv4 + over Kerberos V5 protocol stacks, and users are similarly urged to + migrate to the use of AES. + +3.2.1.2. LIPKEY as a security triple + + The LIPKEY GSS-API mechanism as described in [RFC2847] MUST be + implemented and provide the following security triples. The + definition of the columns matches the previous subsection "Kerberos + V5 as security triple" + + 1 2 3 4 5 + -------------------------------------------------------------------- + 390006 lipkey 1.3.6.1.5.5.9 negotiated rpc_gss_svc_none + 390007 lipkey-i 1.3.6.1.5.5.9 negotiated rpc_gss_svc_integrity + 390008 lipkey-p 1.3.6.1.5.5.9 negotiated rpc_gss_svc_privacy + + The mechanism algorithm is listed as "negotiated". This is because + LIPKEY is layered on SPKM-3 and in SPKM-3 [RFC2847] the + confidentiality and integrity algorithms are negotiated. 
Since + SPKM-3 specifies HMAC-MD5 for integrity as MANDATORY, 128 bit + cast5CBC for confidentiality for privacy as MANDATORY, and further + specifies that HMAC-MD5 and cast5CBC MUST be listed first before + weaker algorithms, specifying "negotiated" in column 4 does not + impair interoperability. In the event an SPKM-3 peer does not + support the mandatory algorithms, the other peer is free to accept or + reject the GSS-API context creation. + + Because SPKM-3 negotiates the algorithms, subsequent calls to + LIPKEY's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality + of protection value of 0 (zero). See section 5.2 of [RFC2025] for an + explanation. + + LIPKEY uses SPKM-3 to create a secure channel in which to pass a user + name and password from the client to the server. Once the user name + and password have been accepted by the server, calls to the LIPKEY + context are redirected to the SPKM-3 context. See [RFC2847] for more + details. + + + +Shepler, et al. Standards Track [Page 26] + +RFC 3530 NFS version 4 Protocol April 2003 + + +3.2.1.3. SPKM-3 as a security triple + + The SPKM-3 GSS-API mechanism as described in [RFC2847] MUST be + implemented and provide the following security triples. The + definition of the columns matches the previous subsection "Kerberos + V5 as security triple". + + 1 2 3 4 5 + -------------------------------------------------------------------- + 390009 spkm3 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_none + 390010 spkm3i 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_integrity + 390011 spkm3p 1.3.6.1.5.5.1.3 negotiated rpc_gss_svc_privacy + + For a discussion as to why the mechanism algorithm is listed as + "negotiated", see the previous section "LIPKEY as a security triple." + + Because SPKM-3 negotiates the algorithms, subsequent calls to SPKM- + 3's GSS_Wrap() and GSS_GetMIC() by RPCSEC_GSS will use a quality of + protection value of 0 (zero). See section 5.2 of [RFC2025] for an + explanation. + + Even though LIPKEY is layered over SPKM-3, SPKM-3 is specified as a + mandatory set of triples to handle the situations where the initiator + (the client) is anonymous or where the initiator has its own + certificate. If the initiator is anonymous, there will not be a user + name and password to send to the target (the server). If the + initiator has its own certificate, then using passwords is + superfluous. + +3.3. Security Negotiation + + With the NFS version 4 server potentially offering multiple security + mechanisms, the client needs a method to determine or negotiate which + mechanism is to be used for its communication with the server. The + NFS server may have multiple points within its filesystem name space + that are available for use by NFS clients. In turn the NFS server + may be configured such that each of these entry points may have + different or multiple security mechanisms in use. + + The security negotiation between client and server must be done with + a secure channel to eliminate the possibility of a third party + intercepting the negotiation sequence and forcing the client and + server to choose a lower level of security than required or desired. + See the section "Security Considerations" for further discussion. + + + + + + + +Shepler, et al. Standards Track [Page 27] + +RFC 3530 NFS version 4 Protocol April 2003 + + +3.3.1. SECINFO + + The new SECINFO operation will allow the client to determine, on a + per filehandle basis, what security triple is to be used for server + access. 
In general, the client will not have to use the SECINFO + operation except during initial communication with the server or when + the client crosses policy boundaries at the server. It is possible + that the server's policies change during the client's interaction + therefore forcing the client to negotiate a new security triple. + +3.3.2. Security Error + + Based on the assumption that each NFS version 4 client and server + must support a minimum set of security (i.e., LIPKEY, SPKM-3, and + Kerberos-V5 all under RPCSEC_GSS), the NFS client will start its + communication with the server with one of the minimal security + triples. During communication with the server, the client may + receive an NFS error of NFS4ERR_WRONGSEC. This error allows the + server to notify the client that the security triple currently being + used is not appropriate for access to the server's filesystem + resources. The client is then responsible for determining what + security triples are available at the server and choose one which is + appropriate for the client. See the section for the "SECINFO" + operation for further discussion of how the client will respond to + the NFS4ERR_WRONGSEC error and use SECINFO. + +3.4. Callback RPC Authentication + + Except as noted elsewhere in this section, the callback RPC + (described later) MUST mutually authenticate the NFS server to the + principal that acquired the clientid (also described later), using + the security flavor the original SETCLIENTID operation used. + + For AUTH_NONE, there are no principals, so this is a non-issue. + + AUTH_SYS has no notions of mutual authentication or a server + principal, so the callback from the server simply uses the AUTH_SYS + credential that the user used when he set up the delegation. + + For AUTH_DH, one commonly used convention is that the server uses the + credential corresponding to this AUTH_DH principal: + + unix.host@domain + + where host and domain are variables corresponding to the name of + server host and directory services domain in which it lives such as a + Network Information System domain or a DNS domain. + + + + +Shepler, et al. Standards Track [Page 28] + +RFC 3530 NFS version 4 Protocol April 2003 + + + Because LIPKEY is layered over SPKM-3, it is permissible for the + server to use SPKM-3 and not LIPKEY for the callback even if the + client used LIPKEY for SETCLIENTID. + + Regardless of what security mechanism under RPCSEC_GSS is being used, + the NFS server, MUST identify itself in GSS-API via a + GSS_C_NT_HOSTBASED_SERVICE name type. GSS_C_NT_HOSTBASED_SERVICE + names are of the form: + + service@hostname + + For NFS, the "service" element is + + nfs + + Implementations of security mechanisms will convert nfs@hostname to + various different forms. For Kerberos V5 and LIPKEY, the following + form is RECOMMENDED: + + nfs/hostname + + For Kerberos V5, nfs/hostname would be a server principal in the + Kerberos Key Distribution Center database. This is the same + principal the client acquired a GSS-API context for when it issued + the SETCLIENTID operation, therefore, the realm name for the server + principal must be the same for the callback as it was for the + SETCLIENTID. + + For LIPKEY, this would be the username passed to the target (the NFS + version 4 client that receives the callback). + + It should be noted that LIPKEY may not work for callbacks, since the + LIPKEY client uses a user id/password. 
If the NFS client receiving + the callback can authenticate the NFS server's user name/password + pair, and if the user that the NFS server is authenticating to has a + public key certificate, then it works. + + In situations where the NFS client uses LIPKEY and uses a per-host + principal for the SETCLIENTID operation, instead of using LIPKEY for + SETCLIENTID, it is RECOMMENDED that SPKM-3 with mutual authentication + be used. This effectively means that the client will use a + certificate to authenticate and identify the initiator to the target + on the NFS server. Using SPKM-3 and not LIPKEY has the following + advantages: + + o When the server does a callback, it must authenticate to the + principal used in the SETCLIENTID. Even if LIPKEY is used, + because LIPKEY is layered over SPKM-3, the NFS client will need to + + + +Shepler, et al. Standards Track [Page 29] + +RFC 3530 NFS version 4 Protocol April 2003 + + + have a certificate that corresponds to the principal used in the + SETCLIENTID operation. From an administrative perspective, having + a user name, password, and certificate for both the client and + server is redundant. + + o LIPKEY was intended to minimize additional infrastructure + requirements beyond a certificate for the target, and the + expectation is that existing password infrastructure can be + leveraged for the initiator. In some environments, a per-host + password does not exist yet. If certificates are used for any + per-host principals, then additional password infrastructure is + not needed. + + o In cases when a host is both an NFS client and server, it can + share the same per-host certificate. + +4. Filehandles + + The filehandle in the NFS protocol is a per server unique identifier + for a filesystem object. The contents of the filehandle are opaque + to the client. Therefore, the server is responsible for translating + the filehandle to an internal representation of the filesystem + object. + +4.1. Obtaining the First Filehandle + + The operations of the NFS protocol are defined in terms of one or + more filehandles. Therefore, the client needs a filehandle to + initiate communication with the server. With the NFS version 2 + protocol [RFC1094] and the NFS version 3 protocol [RFC1813], there + exists an ancillary protocol to obtain this first filehandle. The + MOUNT protocol, RPC program number 100005, provides the mechanism of + translating a string based filesystem path name to a filehandle which + can then be used by the NFS protocols. + + The MOUNT protocol has deficiencies in the area of security and use + via firewalls. This is one reason that the use of the public + filehandle was introduced in [RFC2054] and [RFC2055]. With the use + of the public filehandle in combination with the LOOKUP operation in + the NFS version 2 and 3 protocols, it has been demonstrated that the + MOUNT protocol is unnecessary for viable interaction between NFS + client and server. + + Therefore, the NFS version 4 protocol will not use an ancillary + protocol for translation from string based path names to a + filehandle. Two special filehandles will be used as starting points + for the NFS client. + + + + +Shepler, et al. Standards Track [Page 30] + +RFC 3530 NFS version 4 Protocol April 2003 + + +4.1.1. Root Filehandle + + The first of the special filehandles is the ROOT filehandle. The + ROOT filehandle is the "conceptual" root of the filesystem name space + at the NFS server. The client uses or starts with the ROOT + filehandle by employing the PUTROOTFH operation. 
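   For example, a client's first compound request after establishing
   contact with the server might be the following (the component name
   "export" is illustrative only):

     PUTROOTFH           ; current filehandle becomes the root
     LOOKUP "export"     ; traverse to a directory of interest
     GETFH               ; return that filehandle to the client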
The PUTROOTFH + operation instructs the server to set the "current" filehandle to the + ROOT of the server's file tree. Once this PUTROOTFH operation is + used, the client can then traverse the entirety of the server's file + tree with the LOOKUP operation. A complete discussion of the server + name space is in the section "NFS Server Name Space". + +4.1.2. Public Filehandle + + The second special filehandle is the PUBLIC filehandle. Unlike the + ROOT filehandle, the PUBLIC filehandle may be bound or represent an + arbitrary filesystem object at the server. The server is responsible + for this binding. It may be that the PUBLIC filehandle and the ROOT + filehandle refer to the same filesystem object. However, it is up to + the administrative software at the server and the policies of the + server administrator to define the binding of the PUBLIC filehandle + and server filesystem object. The client may not make any + assumptions about this binding. The client uses the PUBLIC + filehandle via the PUTPUBFH operation. + +4.2. Filehandle Types + + In the NFS version 2 and 3 protocols, there was one type of + filehandle with a single set of semantics. This type of filehandle + is termed "persistent" in NFS Version 4. The semantics of a + persistent filehandle remain the same as before. A new type of + filehandle introduced in NFS Version 4 is the "volatile" filehandle, + which attempts to accommodate certain server environments. + + The volatile filehandle type was introduced to address server + functionality or implementation issues which make correct + implementation of a persistent filehandle infeasible. Some server + environments do not provide a filesystem level invariant that can be + used to construct a persistent filehandle. The underlying server + filesystem may not provide the invariant or the server's filesystem + programming interfaces may not provide access to the needed + invariant. Volatile filehandles may ease the implementation of + server functionality such as hierarchical storage management or + filesystem reorganization or migration. However, the volatile + filehandle increases the implementation burden for the client. + + + + + + +Shepler, et al. Standards Track [Page 31] + +RFC 3530 NFS version 4 Protocol April 2003 + + + Since the client will need to handle persistent and volatile + filehandles differently, a file attribute is defined which may be + used by the client to determine the filehandle types being returned + by the server. + +4.2.1. General Properties of a Filehandle + + The filehandle contains all the information the server needs to + distinguish an individual file. To the client, the filehandle is + opaque. The client stores filehandles for use in a later request and + can compare two filehandles from the same server for equality by + doing a byte-by-byte comparison. However, the client MUST NOT + otherwise interpret the contents of filehandles. If two filehandles + from the same server are equal, they MUST refer to the same file. + Servers SHOULD try to maintain a one-to-one correspondence between + filehandles and files but this is not required. Clients MUST use + filehandle comparisons only to improve performance, not for correct + behavior. All clients need to be prepared for situations in which it + cannot be determined whether two filehandles denote the same object + and in such cases, avoid making invalid assumptions which might cause + incorrect behavior. 
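   The comparison rule above can be stated compactly in code.  The
   following is a minimal, non-normative sketch; the structure layout
   is hypothetical, and only the opaque byte-by-byte comparison is
   prescribed by the protocol:

     #include <string.h>

     /* Client-side holder for an opaque NFS version 4 filehandle of
      * up to NFS4_FHSIZE (128) bytes. */
     struct nfs_fh4 {
         unsigned int  len;
         unsigned char data[128];
     };

     /* Equal filehandles from the same server MUST denote the same
      * object.  Unequal filehandles prove nothing, since the server
      * need not keep a one-to-one filehandle-to-object mapping. */
     static int
     fh_same_object(const struct nfs_fh4 *a, const struct nfs_fh4 *b)
     {
         return a->len == b->len &&
                memcmp(a->data, b->data, a->len) == 0;
     }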
Further discussion of filehandle and attribute comparison in the
   context of data caching is presented in the section "Data Caching
   and File Identity".

   As an example, in the case that two different path names when
   traversed at the server terminate at the same filesystem object, the
   server SHOULD return the same filehandle for each path.  This can
   occur if a hard link is used to create two file names which refer to
   the same underlying file object and associated data.  For example,
   if paths /a/b/c and /a/d/c refer to the same file, the server SHOULD
   return the same filehandle for both path name traversals.

4.2.2.  Persistent Filehandle

   A persistent filehandle is defined as having a fixed value for the
   lifetime of the filesystem object to which it refers.  Once the
   server creates the filehandle for a filesystem object, the server
   MUST accept the same filehandle for the object for the lifetime of
   the object.  If the server restarts or reboots, the NFS server must
   honor the same filehandle value as it did in the server's previous
   instantiation.  Similarly, if the filesystem is migrated, the new
   NFS server must honor the same filehandle as the old NFS server.

   The persistent filehandle will become stale or invalid when the
   filesystem object is removed.  When the server is presented with a
   persistent filehandle that refers to a deleted object, it MUST
   return an error of NFS4ERR_STALE.  A filehandle may become stale
   when the filesystem containing the object is no longer available.

Shepler, et al.             Standards Track                    [Page 32]

RFC 3530                 NFS version 4 Protocol               April 2003


   The filesystem may become unavailable if it exists on removable
   media and the media is no longer available at the server, or the
   filesystem in whole has been destroyed, or the filesystem has simply
   been removed from the server's name space (i.e., unmounted in a UNIX
   environment).

4.2.3.  Volatile Filehandle

   A volatile filehandle does not share the same longevity
   characteristics of a persistent filehandle.  The server may
   determine that a volatile filehandle is no longer valid at many
   different points in time.  If the server can definitively determine
   that a volatile filehandle refers to an object that has been
   removed, the server should return NFS4ERR_STALE to the client (as is
   the case for persistent filehandles).  In all other cases where the
   server determines that a volatile filehandle can no longer be used,
   it should return an error of NFS4ERR_FHEXPIRED.

   The mandatory attribute "fh_expire_type" is used by the client to
   determine what type of filehandle the server is providing for a
   particular filesystem.  This attribute is a bitmask with the
   following values:

   FH4_PERSISTENT
      The value of FH4_PERSISTENT is used to indicate a persistent
      filehandle, which is valid until the object is removed from the
      filesystem.  The server will not return NFS4ERR_FHEXPIRED for
      this filehandle.  FH4_PERSISTENT is defined as a value in which
      none of the bits specified below are set.

   FH4_VOLATILE_ANY
      The filehandle may expire at any time, except as specifically
      excluded (i.e., FH4_NOEXPIRE_WITH_OPEN).

   FH4_NOEXPIRE_WITH_OPEN
      May only be set when FH4_VOLATILE_ANY is set.  If this bit is
      set, then the meaning of FH4_VOLATILE_ANY is qualified to exclude
      any expiration of the filehandle when it is open.

   FH4_VOL_MIGRATION
      The filehandle will expire as a result of migration.  If
      FH4_VOLATILE_ANY is set, FH4_VOL_MIGRATION is redundant.

Shepler, et al.             Standards Track                    [Page 33]

RFC 3530                 NFS version 4 Protocol               April 2003


   FH4_VOL_RENAME
      The filehandle will expire during rename.  This includes a rename
      by the requesting client or a rename by any other client.  If
      FH4_VOLATILE_ANY is set, FH4_VOL_RENAME is redundant.

   Servers which provide volatile filehandles that may expire while
   open (i.e., if FH4_VOL_MIGRATION or FH4_VOL_RENAME is set, or if
   FH4_VOLATILE_ANY is set and FH4_NOEXPIRE_WITH_OPEN is not set),
   should deny a RENAME or REMOVE that would affect an OPEN file or any
   of the components leading to the OPEN file.  In addition, the server
   should deny all RENAME or REMOVE requests during the grace period
   upon server restart.

   Note that the bits FH4_VOL_MIGRATION and FH4_VOL_RENAME allow the
   client to determine that expiration has occurred whenever a specific
   event occurs, without an explicit filehandle expiration error from
   the server.  FH4_VOLATILE_ANY does not provide this form of
   information.  In situations where the server will expire many, but
   not all filehandles upon migration (e.g., all but those that are
   open), FH4_VOLATILE_ANY (in this case with FH4_NOEXPIRE_WITH_OPEN)
   is a better choice since the client may not assume that all
   filehandles will expire when migration occurs, and it is likely that
   additional expirations will occur (as a result of file CLOSE) that
   are separated in time from the migration event itself.

4.2.4.  One Method of Constructing a Volatile Filehandle

   A volatile filehandle, while opaque to the client, could contain:

   [volatile bit = 1 | server boot time | slot | generation number]

   o  slot is an index in the server volatile filehandle table

   o  generation number is the generation number for the table
      entry/slot

   When the client presents a volatile filehandle, the server makes the
   following checks, which assume that the check for the volatile bit
   has passed.  If the boot time in the filehandle is less than the
   current server boot time, return NFS4ERR_FHEXPIRED.  If slot is out
   of range, return NFS4ERR_BADHANDLE.  If the generation number does
   not match, return NFS4ERR_FHEXPIRED.

   When the server reboots, the table is gone (it is volatile).

   If the volatile bit is 0, then it is a persistent filehandle with a
   different structure following it.

Shepler, et al.             Standards Track                    [Page 34]

RFC 3530                 NFS version 4 Protocol               April 2003


4.3.  Client Recovery from Filehandle Expiration

   If possible, the client SHOULD recover from the receipt of an
   NFS4ERR_FHEXPIRED error.  The client must take on additional
   responsibility so that it may prepare itself to recover from the
   expiration of a volatile filehandle.  If the server returns
   persistent filehandles, the client does not need these additional
   steps.

   For volatile filehandles, most commonly the client will need to
   store the component names leading up to and including the filesystem
   object in question.  With these names, the client should be able to
   recover by finding a filehandle in the name space that is still
   available or by starting at the root of the server's filesystem name
   space.

   If the expired filehandle refers to an object that has been removed
   from the filesystem, obviously the client will not be able to
   recover from the expired filehandle.

   It is also possible that the expired filehandle refers to a file
   that has been renamed.
If the file was renamed by another client, again it is possible that
   the original client will not be able to recover.  However, in the
   case that the client itself is renaming the file and the file is
   open, it is possible that the client may be able to recover.  The
   client can determine the new path name based on the processing of
   the rename request.  The client can then regenerate the new
   filehandle based on the new path name.  The client could also use
   the compound operation mechanism to construct a set of operations
   like:

     RENAME A B
     LOOKUP B
     GETFH

   Note that the COMPOUND procedure does not provide atomicity.  This
   example only reduces the overhead of recovering from an expired
   filehandle.

5.  File Attributes

   To meet the requirements of extensibility and increased
   interoperability with non-UNIX platforms, attributes must be handled
   in a flexible manner.  The NFS version 3 fattr3 structure contains a
   fixed list of attributes that not all clients and servers are able
   to support or care about.  The fattr3 structure cannot be extended
   as new needs arise and it provides no way to indicate non-support.
   With the NFS version 4 protocol, the client is able to query what
   attributes the server supports and construct requests with only
   those supported attributes (or a subset thereof).

Shepler, et al.             Standards Track                    [Page 35]

RFC 3530                 NFS version 4 Protocol               April 2003


   To this end, attributes are divided into three groups: mandatory,
   recommended, and named.  Both mandatory and recommended attributes
   are supported in the NFS version 4 protocol by a specific and well-
   defined encoding and are identified by number.  They are requested
   by setting a bit in the bit vector sent in the GETATTR request; the
   server response includes a bit vector to list what attributes were
   returned in the response.  New mandatory or recommended attributes
   may be added to the NFS protocol between major revisions by
   publishing a standards-track RFC which allocates a new attribute
   number value and defines the encoding for the attribute.  See the
   section "Minor Versioning" for further discussion.

   Named attributes are accessed by the new OPENATTR operation, which
   accesses a hidden directory of attributes associated with a
   filesystem object.  OPENATTR takes a filehandle for the object and
   returns the filehandle for the attribute hierarchy.  The filehandle
   for the named attributes is a directory object accessible by LOOKUP
   or READDIR and contains files whose names represent the named
   attributes and whose data bytes are the value of the attribute.  For
   example:

     LOOKUP "foo"       ; look up file
     GETATTR attrbits
     OPENATTR           ; access foo's named attributes
     LOOKUP "x11icon"   ; look up specific attribute
     READ 0,4096        ; read stream of bytes

   Named attributes are intended for data needed by applications rather
   than by an NFS client implementation.  NFS implementors are strongly
   encouraged to define their new attributes as recommended attributes
   by bringing them to the IETF standards-track process.

   The set of attributes which are classified as mandatory is
   deliberately small since servers must do whatever it takes to
   support them.  A server should support as many of the recommended
   attributes as possible but, by their definition, the server is not
   required to support all of them.
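   A common client strategy, shown here as a non-normative sketch (the
   helper and its word-count parameter are illustrative), is to fetch
   the mandatory supp_attr attribute once per fsid and intersect it
   with the set of attributes the client would like to use, so that
   later GETATTR requests never ask for unsupported attributes:

     #include <stdint.h>

     /* Attribute bitmaps are variable-length arrays of 32-bit words
      * in which bit N of the vector selects attribute number N. */
     static void
     mask_to_supported(uint32_t want[], const uint32_t supp[],
                       unsigned int nwords)
     {
         for (unsigned int i = 0; i < nwords; i++)
             want[i] &= supp[i];   /* drop unsupported attributes */
     }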
   Attributes are deemed mandatory if the data is both needed by a
   large number of clients and is not otherwise reasonably computable
   by the client when support is not provided on the server.

   Note that the hidden directory returned by OPENATTR is a convenience
   for protocol processing.  The client should not make any assumptions
   about the server's implementation of named attributes and whether
   the underlying filesystem at the server has a named attribute
   directory or not.  Therefore, operations such as SETATTR and GETATTR
   on the named attribute directory are undefined.

Shepler, et al.             Standards Track                    [Page 36]

RFC 3530                 NFS version 4 Protocol               April 2003


5.1.  Mandatory Attributes

   These MUST be supported by every NFS version 4 client and server in
   order to ensure a minimum level of interoperability.  The server
   must store and return these attributes and the client must be able
   to function with an attribute set limited to these attributes.  With
   just the mandatory attributes some client functionality may be
   impaired or limited in some ways.  A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request,
   and the server must return their value.

5.2.  Recommended Attributes

   These attributes are understood well enough to warrant support in
   the NFS version 4 protocol.  However, they may not be supported on
   all clients and servers.  A client may ask for any of these
   attributes to be returned by setting a bit in the GETATTR request
   but must handle the case where the server does not return them.  A
   client may ask for the set of attributes the server supports and
   should not request attributes the server does not support.  A server
   should be tolerant of requests for unsupported attributes and simply
   not return them rather than considering the request an error.  It is
   expected that servers will support all attributes they comfortably
   can and only fail to support attributes which are difficult to
   support in their operating environments.  A server should provide
   attributes whenever it doesn't have to "tell lies" to the client.
   For example, a file modification time should be either an accurate
   time or should not be supported by the server.  This will not always
   be comfortable to clients, but the client is better positioned to
   decide whether and how to fabricate or construct an attribute or
   whether to do without the attribute.

5.3.  Named Attributes

   These attributes are not supported by direct encoding in the NFS
   version 4 protocol but are accessed by string names rather than
   numbers and correspond to an uninterpreted stream of bytes which are
   stored with the filesystem object.  The name space for these
   attributes may be accessed by using the OPENATTR operation.  The
   OPENATTR operation returns a filehandle for a virtual "attribute
   directory" and further perusal of the name space may be done using
   READDIR and LOOKUP operations on this filehandle.  Named attributes
   may then be examined or changed by normal READ and WRITE and CREATE
   operations on the filehandles returned from READDIR and LOOKUP.
   Named attributes may have attributes.

Shepler, et al.             Standards Track                    [Page 37]

RFC 3530                 NFS version 4 Protocol               April 2003


   It is recommended that servers support arbitrary named attributes.
   A client should not depend on the ability to store any named
   attributes in the server's filesystem.
If a server does support named + attributes, a client which is also able to handle them should be able + to copy a file's data and meta-data with complete transparency from + one location to another; this would imply that names allowed for + regular directory entries are valid for named attribute names as + well. + + Names of attributes will not be controlled by this document or other + IETF standards track documents. See the section "IANA + Considerations" for further discussion. + +5.4. Classification of Attributes + + Each of the Mandatory and Recommended attributes can be classified in + one of three categories: per server, per filesystem, or per + filesystem object. Note that it is possible that some per filesystem + attributes may vary within the filesystem. See the "homogeneous" + attribute for its definition. Note that the attributes + time_access_set and time_modify_set are not listed in this section + because they are write-only attributes corresponding to time_access + and time_modify, and are used in a special instance of SETATTR. + + o The per server attribute is: + + lease_time + + o The per filesystem attributes are: + + supp_attr, fh_expire_type, link_support, symlink_support, + unique_handles, aclsupport, cansettime, case_insensitive, + case_preserving, chown_restricted, files_avail, files_free, + files_total, fs_locations, homogeneous, maxfilesize, maxname, + maxread, maxwrite, no_trunc, space_avail, space_free, space_total, + time_delta + + o The per filesystem object attributes are: + + type, change, size, named_attr, fsid, rdattr_error, filehandle, + ACL, archive, fileid, hidden, maxlink, mimetype, mode, numlinks, + owner, owner_group, rawdev, space_used, system, time_access, + time_backup, time_create, time_metadata, time_modify, + mounted_on_fileid + + For quota_avail_hard, quota_avail_soft, and quota_used see their + definitions below for the appropriate classification. + + + + +Shepler, et al. Standards Track [Page 38] + +RFC 3530 NFS version 4 Protocol April 2003 + + +5.5. Mandatory Attributes - Definitions + + Name # DataType Access Description + ___________________________________________________________________ + supp_attr 0 bitmap READ The bit vector which + would retrieve all + mandatory and + recommended attributes + that are supported for + this object. The + scope of this + attribute applies to + all objects with a + matching fsid. + + type 1 nfs4_ftype READ The type of the object + (file, directory, + symlink, etc.) + + fh_expire_type 2 uint32 READ Server uses this to + specify filehandle + expiration behavior to + the client. See the + section "Filehandles" + for additional + description. + + change 3 uint64 READ A value created by the + server that the client + can use to determine + if file data, + directory contents or + attributes of the + object have been + modified. The server + may return the + object's time_metadata + attribute for this + attribute's value but + only if the filesystem + object can not be + updated more + frequently than the + resolution of + time_metadata. + + size 4 uint64 R/W The size of the object + in bytes. + + + +Shepler, et al. Standards Track [Page 39] + +RFC 3530 NFS version 4 Protocol April 2003 + + + link_support 5 bool READ True, if the object's + filesystem supports + hard links. + + symlink_support 6 bool READ True, if the object's + filesystem supports + symbolic links. + + named_attr 7 bool READ True, if this object + has named attributes. + In other words, object + has a non-empty named + attribute directory. 
+ + fsid 8 fsid4 READ Unique filesystem + identifier for the + filesystem holding + this object. fsid + contains major and + minor components each + of which are uint64. + + unique_handles 9 bool READ True, if two distinct + filehandles guaranteed + to refer to two + different filesystem + objects. + + lease_time 10 nfs_lease4 READ Duration of leases at + server in seconds. + + rdattr_error 11 enum READ Error returned from + getattr during + readdir. + + filehandle 19 nfs_fh4 READ The filehandle of this + object (primarily for + readdir requests). + + + + + + + + + + + + + +Shepler, et al. Standards Track [Page 40] + +RFC 3530 NFS version 4 Protocol April 2003 + + +5.6. Recommended Attributes - Definitions + + Name # Data Type Access Description + _____________________________________________________________________ + ACL 12 nfsace4<> R/W The access control + list for the object. + + aclsupport 13 uint32 READ Indicates what types + of ACLs are + supported on the + current filesystem. + + archive 14 bool R/W True, if this file + has been archived + since the time of + last modification + (deprecated in favor + of time_backup). + + cansettime 15 bool READ True, if the server + is able to change + the times for a + filesystem object as + specified in a + SETATTR operation. + + case_insensitive 16 bool READ True, if filename + comparisons on this + filesystem are case + insensitive. + + case_preserving 17 bool READ True, if filename + case on this + filesystem are + preserved. + + chown_restricted 18 bool READ If TRUE, the server + will reject any + request to change + either the owner or + the group associated + with a file if the + caller is not a + privileged user (for + example, "root" in + UNIX operating + environments or in + Windows 2000 the + + + +Shepler, et al. Standards Track [Page 41] + +RFC 3530 NFS version 4 Protocol April 2003 + + + "Take Ownership" + privilege). + + fileid 20 uint64 READ A number uniquely + identifying the file + within the + filesystem. + + files_avail 21 uint64 READ File slots available + to this user on the + filesystem + containing this + object - this should + be the smallest + relevant limit. + + files_free 22 uint64 READ Free file slots on + the filesystem + containing this + object - this should + be the smallest + relevant limit. + + files_total 23 uint64 READ Total file slots on + the filesystem + containing this + object. + + fs_locations 24 fs_locations READ Locations where this + filesystem may be + found. If the + server returns + NFS4ERR_MOVED + as an error, this + attribute MUST be + supported. + + hidden 25 bool R/W True, if the file is + considered hidden + with respect to the + Windows API. + + homogeneous 26 bool READ True, if this + object's filesystem + is homogeneous, + i.e., are per + filesystem + attributes the same + + + +Shepler, et al. Standards Track [Page 42] + +RFC 3530 NFS version 4 Protocol April 2003 + + + for all filesystem's + objects? + + maxfilesize 27 uint64 READ Maximum supported + file size for the + filesystem of this + object. + + maxlink 28 uint32 READ Maximum number of + links for this + object. + + maxname 29 uint32 READ Maximum filename + size supported for + this object. + + maxread 30 uint64 READ Maximum read size + supported for this + object. + + maxwrite 31 uint64 READ Maximum write size + supported for this + object. This + attribute SHOULD be + supported if the + file is writable. + Lack of this + attribute can + lead to the client + either wasting + bandwidth or not + receiving the best + performance. 
+ + mimetype 32 utf8<> R/W MIME body + type/subtype of this + object. + + mode 33 mode4 R/W UNIX-style mode and + permission bits for + this object. + + no_trunc 34 bool READ True, if a name + longer than name_max + is used, an error be + returned and name is + not truncated. + + + + +Shepler, et al. Standards Track [Page 43] + +RFC 3530 NFS version 4 Protocol April 2003 + + + numlinks 35 uint32 READ Number of hard links + to this object. + + owner 36 utf8<> R/W The string name of + the owner of this + object. + + owner_group 37 utf8<> R/W The string name of + the group ownership + of this object. + + quota_avail_hard 38 uint64 READ For definition see + "Quota Attributes" + section below. + + quota_avail_soft 39 uint64 READ For definition see + "Quota Attributes" + section below. + + quota_used 40 uint64 READ For definition see + "Quota Attributes" + section below. + + rawdev 41 specdata4 READ Raw device + identifier. UNIX + device major/minor + node information. + If the value of + type is not + NF4BLK or NF4CHR, + the value return + SHOULD NOT be + considered useful. + + space_avail 42 uint64 READ Disk space in bytes + available to this + user on the + filesystem + containing this + object - this should + be the smallest + relevant limit. + + space_free 43 uint64 READ Free disk space in + bytes on the + filesystem + containing this + object - this should + + + +Shepler, et al. Standards Track [Page 44] + +RFC 3530 NFS version 4 Protocol April 2003 + + + be the smallest + relevant limit. + + space_total 44 uint64 READ Total disk space in + bytes on the + filesystem + containing this + object. + + space_used 45 uint64 READ Number of filesystem + bytes allocated to + this object. + + system 46 bool R/W True, if this file + is a "system" file + with respect to the + Windows API. + + time_access 47 nfstime4 READ The time of last + access to the object + by a read that was + satisfied by the + server. + + time_access_set 48 settime4 WRITE Set the time of last + access to the + object. SETATTR + use only. + + time_backup 49 nfstime4 R/W The time of last + backup of the + object. + + time_create 50 nfstime4 R/W The time of creation + of the object. This + attribute does not + have any relation to + the traditional UNIX + file attribute + "ctime" or "change + time". + + time_delta 51 nfstime4 READ Smallest useful + server time + granularity. + + + + + + +Shepler, et al. Standards Track [Page 45] + +RFC 3530 NFS version 4 Protocol April 2003 + + + time_metadata 52 nfstime4 READ The time of last + meta-data + modification of the + object. + + time_modify 53 nfstime4 READ The time of last + modification to the + object. + + time_modify_set 54 settime4 WRITE Set the time of last + modification to the + object. SETATTR use + only. + + mounted_on_fileid 55 uint64 READ Like fileid, but if + the target + filehandle is the + root of a filesystem + return the fileid of + the underlying + directory. + +5.7. Time Access + + As defined above, the time_access attribute represents the time of + last access to the object by a read that was satisfied by the server. + The notion of what is an "access" depends on server's operating + environment and/or the server's filesystem semantics. For example, + for servers obeying POSIX semantics, time_access would be updated + only by the READLINK, READ, and READDIR operations and not any of the + operations that modify the content of the object. Of course, setting + the corresponding time_access_set attribute is another way to modify + the time_access attribute. 
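   As a non-normative sketch of how such an attribute is requested (the
   helper below is hypothetical), a client asking for time_access,
   attribute number 47, would set the corresponding bit in the GETATTR
   bitmap:

     #include <stdint.h>

     #define FATTR4_TIME_ACCESS 47   /* attribute number from 5.6 */

     /* Set the bit for attribute number 'attr' in a bitmap4: word
      * attr / 32, bit attr % 32. */
     static void
     attr_request(uint32_t bitmap[], unsigned int nwords,
                  unsigned int attr)
     {
         if (attr / 32 < nwords)
             bitmap[attr / 32] |= (uint32_t)1 << (attr % 32);
     }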
+ + Whenever the file object resides on a writable filesystem, the server + should make best efforts to record time_access into stable storage. + However, to mitigate the performance effects of doing so, and most + especially whenever the server is satisfying the read of the object's + content from its cache, the server MAY cache access time updates and + lazily write them to stable storage. It is also acceptable to give + administrators of the server the option to disable time_access + updates. + + + + + + + + + +Shepler, et al. Standards Track [Page 46] + +RFC 3530 NFS version 4 Protocol April 2003 + + +5.8. Interpreting owner and owner_group + + The recommended attributes "owner" and "owner_group" (and also users + and groups within the "acl" attribute) are represented in terms of a + UTF-8 string. To avoid a representation that is tied to a particular + underlying implementation at the client or server, the use of the + UTF-8 string has been chosen. Note that section 6.1 of [RFC2624] + provides additional rationale. It is expected that the client and + server will have their own local representation of owner and + owner_group that is used for local storage or presentation to the end + user. Therefore, it is expected that when these attributes are + transferred between the client and server that the local + representation is translated to a syntax of the form + "user@dns_domain". This will allow for a client and server that do + not use the same local representation the ability to translate to a + common syntax that can be interpreted by both. + + Similarly, security principals may be represented in different ways + by different security mechanisms. Servers normally translate these + representations into a common format, generally that used by local + storage, to serve as a means of identifying the users corresponding + to these security principals. When these local identifiers are + translated to the form of the owner attribute, associated with files + created by such principals they identify, in a common format, the + users associated with each corresponding set of security principals. + + The translation used to interpret owner and group strings is not + specified as part of the protocol. This allows various solutions to + be employed. For example, a local translation table may be consulted + that maps between a numeric id to the user@dns_domain syntax. A name + service may also be used to accomplish the translation. A server may + provide a more general service, not limited by any particular + translation (which would only translate a limited set of possible + strings) by storing the owner and owner_group attributes in local + storage without any translation or it may augment a translation + method by storing the entire string for attributes for which no + translation is available while using the local representation for + those cases in which a translation is available. + + Servers that do not provide support for all possible values of the + owner and owner_group attributes, should return an error + (NFS4ERR_BADOWNER) when a string is presented that has no + translation, as the value to be set for a SETATTR of the owner, + owner_group, or acl attributes. When a server does accept an owner + or owner_group value as valid on a SETATTR (and similarly for the + owner and group strings in an acl), it is promising to return that + same string when a corresponding GETATTR is done. Configuration + changes and ill-constructed name translations (those that contain + + + +Shepler, et al. 
Standards Track [Page 47] + +RFC 3530 NFS version 4 Protocol April 2003 + + + aliasing) may make that promise impossible to honor. Servers should + make appropriate efforts to avoid a situation in which these + attributes have their values changed when no real change to ownership + has occurred. + + The "dns_domain" portion of the owner string is meant to be a DNS + domain name. For example, user@ietf.org. Servers should accept as + valid a set of users for at least one domain. A server may treat + other domains as having no valid translations. A more general + service is provided when a server is capable of accepting users for + multiple domains, or for all domains, subject to security + constraints. + + In the case where there is no translation available to the client or + server, the attribute value must be constructed without the "@". + Therefore, the absence of the @ from the owner or owner_group + attribute signifies that no translation was available at the sender + and that the receiver of the attribute should not use that string as + a basis for translation into its own internal format. Even though + the attribute value can not be translated, it may still be useful. + In the case of a client, the attribute string may be used for local + display of ownership. + + To provide a greater degree of compatibility with previous versions + of NFS (i.e., v2 and v3), which identified users and groups by 32-bit + unsigned uid's and gid's, owner and group strings that consist of + decimal numeric values with no leading zeros can be given a special + interpretation by clients and servers which choose to provide such + support. The receiver may treat such a user or group string as + representing the same user as would be represented by a v2/v3 uid or + gid having the corresponding numeric value. A server is not + obligated to accept such a string, but may return an NFS4ERR_BADOWNER + instead. To avoid this mechanism being used to subvert user and + group translation, so that a client might pass all of the owners and + groups in numeric form, a server SHOULD return an NFS4ERR_BADOWNER + error when there is a valid translation for the user or owner + designated in this way. In that case, the client must use the + appropriate name@domain string and not the special form for + compatibility. + + The owner string "nobody" may be used to designate an anonymous user, + which will be associated with a file created by a security principal + that cannot be mapped through normal means to the owner attribute. + + + + + + + + +Shepler, et al. Standards Track [Page 48] + +RFC 3530 NFS version 4 Protocol April 2003 + + +5.9. Character Case Attributes + + With respect to the case_insensitive and case_preserving attributes, + each UCS-4 character (which UTF-8 encodes) has a "long descriptive + name" [RFC1345] which may or may not included the word "CAPITAL" or + "SMALL". The presence of SMALL or CAPITAL allows an NFS server to + implement unambiguous and efficient table driven mappings for case + insensitive comparisons, and non-case-preserving storage. For + general character handling and internationalization issues, see the + section "Internationalization". + +5.10. Quota Attributes + + For the attributes related to filesystem quotas, the following + definitions apply: + + quota_avail_soft + The value in bytes which represents the amount of additional + disk space that can be allocated to this file or directory + before the user may reasonably be warned. 
It is understood
      that this space may be consumed by allocations to other files or
      directories, though there is a rule as to which other files or
      directories.

   quota_avail_hard
      The value in bytes which represents the amount of additional disk
      space beyond the current allocation that can be allocated to this
      file or directory before further allocations will be refused.  It
      is understood that this space may be consumed by allocations to
      other files or directories.

   quota_used
      The value in bytes which represents the amount of disk space used
      by this file or directory and possibly a number of other similar
      files or directories, where the set of "similar" meets at least
      the criterion that allocating space to any file or directory in
      the set will reduce the "quota_avail_hard" of every other file or
      directory in the set.

      Note that there may be a number of distinct but overlapping sets
      of files or directories for which a quota_used value is
      maintained (e.g., "all files with a given owner", "all files with
      a given group owner", etc.).

      The server is at liberty to choose any of those sets but should
      do so in a repeatable way.  The rule may be configured per-
      filesystem or may be "choose the set with the smallest quota".

Shepler, et al.             Standards Track                    [Page 49]

RFC 3530                 NFS version 4 Protocol               April 2003


5.11.  Access Control Lists

   The NFS version 4 ACL attribute is an array of access control
   entries (ACEs).  Although the client can read and write the ACL
   attribute, the NFSv4 model is that the server does all access
   control based on the server's interpretation of the ACL.  If at any
   point the client wants to check access without issuing an operation
   that modifies or reads data or metadata, the client can use the OPEN
   and ACCESS operations to do so.  There are various access control
   entry types, as defined in the section "ACE type".  The server is
   able to communicate which ACE types are supported by returning the
   appropriate value within the aclsupport attribute.  Each ACE covers
   one or more operations on a file or directory as described in the
   section "ACE Access Mask".  It may also contain one or more flags
   that modify the semantics of the ACE as defined in the section "ACE
   flag".

   The NFS ACE attribute is defined as follows:

   typedef uint32_t        acetype4;
   typedef uint32_t        aceflag4;
   typedef uint32_t        acemask4;

   struct nfsace4 {
       acetype4        type;
       aceflag4        flag;
       acemask4        access_mask;
       utf8str_mixed   who;
   };

   To determine if a request succeeds, each nfsace4 entry is processed
   in order by the server.  Only ACEs which have a "who" that matches
   the requester are considered.  Each ACE is processed until all of
   the bits of the requester's access have been ALLOWED.  Once a bit
   (see below) has been ALLOWED by an ACCESS_ALLOWED_ACE, it is no
   longer considered in the processing of later ACEs.  If an
   ACCESS_DENIED_ACE is encountered where the requester's access still
   has unALLOWED bits in common with the "access_mask" of the ACE, the
   request is denied.  However, unlike the ALLOWED and DENIED ACE
   types, the ALARM and AUDIT ACE types do not affect a requester's
   access, and instead are for triggering events as a result of a
   requester's access attempt.

   Therefore, all AUDIT and ALARM ACEs are processed until the end of
   the ACL.  When the ACL is fully processed, if there are bits in the
   requester's mask that have not been considered, whether the server
   allows or denies the access is undefined.
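   The evaluation algorithm just described can be summarized in the
   following non-normative sketch; the types are simplified and
   who_matches() is a hypothetical helper applying the "who" matching
   rules:

     #include <stdint.h>

     #define ACE4_ACCESS_ALLOWED_ACE_TYPE  0x00000000
     #define ACE4_ACCESS_DENIED_ACE_TYPE   0x00000001

     struct ace {
         uint32_t    type;
         uint32_t    flag;
         uint32_t    access_mask;
         const char *who;
     };

     int who_matches(const char *ace_who, const char *requester);

     /* Returns 1 if all requested bits are ALLOWED, 0 if an
      * applicable DENY ACE covers a still-unALLOWED bit, and -1 if
      * the ACL was exhausted with bits never considered. */
     static int
     acl_check(const struct ace *acl, unsigned int n,
               const char *requester, uint32_t wanted)
     {
         uint32_t remaining = wanted;

         for (unsigned int i = 0; i < n && remaining != 0; i++) {
             if (!who_matches(acl[i].who, requester))
                 continue;            /* only matching "who" counts */
             if (acl[i].type == ACE4_ACCESS_ALLOWED_ACE_TYPE)
                 remaining &= ~acl[i].access_mask;  /* bits ALLOWED */
             else if (acl[i].type == ACE4_ACCESS_DENIED_ACE_TYPE &&
                      (remaining & acl[i].access_mask) != 0)
                 return 0;     /* denied while bits were unALLOWED */
         }
         /* A full server would also scan to the end of the ACL for
          * AUDIT and ALARM ACEs, which never grant or deny access.
          * If bits remain that were never considered, the allow or
          * deny outcome is undefined. */
         return remaining == 0 ? 1 : -1;
     }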
If there is a mode + attribute on the file, then this cannot happen, since the mode's + MODE4_*OTH bits will map to EVERYONE@ ACEs that unambiguously specify + the requester's access. + + + +Shepler, et al. Standards Track [Page 50] + +RFC 3530 NFS version 4 Protocol April 2003 + + + The NFS version 4 ACL model is quite rich. Some server platforms may + provide access control functionality that goes beyond the UNIX-style + mode attribute, but which is not as rich as the NFS ACL model. So + that users can take advantage of this more limited functionality, the + server may indicate that it supports ACLs as long as it follows the + guidelines for mapping between its ACL model and the NFS version 4 + ACL model. + + The situation is complicated by the fact that a server may have + multiple modules that enforce ACLs. For example, the enforcement for + NFS version 4 access may be different from the enforcement for local + access, and both may be different from the enforcement for access + through other protocols such as SMB. So it may be useful for a + server to accept an ACL even if not all of its modules are able to + support it. + + The guiding principle in all cases is that the server must not accept + ACLs that appear to make the file more secure than it really is. + +5.11.1. ACE type + + Type Description + _____________________________________________________ + ALLOW Explicitly grants the access defined in + acemask4 to the file or directory. + + DENY Explicitly denies the access defined in + acemask4 to the file or directory. + + AUDIT LOG (system dependent) any access + attempt to a file or directory which + uses any of the access methods specified + in acemask4. + + ALARM Generate a system ALARM (system + dependent) when any access attempt is + made to a file or directory for the + access methods specified in acemask4. + + A server need not support all of the above ACE types. The bitmask + constants used to represent the above definitions within the + + aclsupport attribute are as follows: + + const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; + const ACL4_SUPPORT_DENY_ACL = 0x00000002; + const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; + const ACL4_SUPPORT_ALARM_ACL = 0x00000008; + + + +Shepler, et al. Standards Track [Page 51] + +RFC 3530 NFS version 4 Protocol April 2003 + + + The semantics of the "type" field follow the descriptions provided + above. + + The constants used for the type field (acetype4) are as follows: + + const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; + const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; + const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; + const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; + + Clients should not attempt to set an ACE unless the server claims + support for that ACE type. If the server receives a request to set + an ACE that it cannot store, it MUST reject the request with + NFS4ERR_ATTRNOTSUPP. If the server receives a request to set an ACE + that it can store but cannot enforce, the server SHOULD reject the + request with NFS4ERR_ATTRNOTSUPP. + + Example: suppose a server can enforce NFS ACLs for NFS access but + cannot enforce ACLs for local access. If arbitrary processes can run + on the server, then the server SHOULD NOT indicate ACL support. On + the other hand, if only trusted administrative programs run locally, + then the server may indicate ACL support. + +5.11.2. 
ACE Access Mask + + The access_mask field contains values based on the following: + + Access Description + _______________________________________________________________ + READ_DATA Permission to read the data of the file + LIST_DIRECTORY Permission to list the contents of a + directory + WRITE_DATA Permission to modify the file's data + ADD_FILE Permission to add a new file to a + directory + APPEND_DATA Permission to append data to a file + ADD_SUBDIRECTORY Permission to create a subdirectory to a + directory + READ_NAMED_ATTRS Permission to read the named attributes + of a file + WRITE_NAMED_ATTRS Permission to write the named attributes + of a file + EXECUTE Permission to execute a file + DELETE_CHILD Permission to delete a file or directory + within a directory + READ_ATTRIBUTES The ability to read basic attributes + (non-acls) of a file + WRITE_ATTRIBUTES Permission to change basic attributes + + + +Shepler, et al. Standards Track [Page 52] + +RFC 3530 NFS version 4 Protocol April 2003 + + + (non-acls) of a file + DELETE Permission to Delete the file + READ_ACL Permission to Read the ACL + WRITE_ACL Permission to Write the ACL + WRITE_OWNER Permission to change the owner + SYNCHRONIZE Permission to access file locally at the + server with synchronous reads and writes + + The bitmask constants used for the access mask field are as follows: + + const ACE4_READ_DATA = 0x00000001; + const ACE4_LIST_DIRECTORY = 0x00000001; + const ACE4_WRITE_DATA = 0x00000002; + const ACE4_ADD_FILE = 0x00000002; + const ACE4_APPEND_DATA = 0x00000004; + const ACE4_ADD_SUBDIRECTORY = 0x00000004; + const ACE4_READ_NAMED_ATTRS = 0x00000008; + const ACE4_WRITE_NAMED_ATTRS = 0x00000010; + const ACE4_EXECUTE = 0x00000020; + const ACE4_DELETE_CHILD = 0x00000040; + const ACE4_READ_ATTRIBUTES = 0x00000080; + const ACE4_WRITE_ATTRIBUTES = 0x00000100; + const ACE4_DELETE = 0x00010000; + const ACE4_READ_ACL = 0x00020000; + const ACE4_WRITE_ACL = 0x00040000; + const ACE4_WRITE_OWNER = 0x00080000; + const ACE4_SYNCHRONIZE = 0x00100000; + + Server implementations need not provide the granularity of control + that is implied by this list of masks. For example, POSIX-based + systems might not distinguish APPEND_DATA (the ability to append to a + file) from WRITE_DATA (the ability to modify existing contents); both + masks would be tied to a single "write" permission. When such a + server returns attributes to the client, it would show both + APPEND_DATA and WRITE_DATA if and only if the write permission is + enabled. + + If a server receives a SETATTR request that it cannot accurately + implement, it should error in the direction of more restricted + access. For example, suppose a server cannot distinguish overwriting + data from appending new data, as described in the previous paragraph. + If a client submits an ACE where APPEND_DATA is set but WRITE_DATA is + not (or vice versa), the server should reject the request with + NFS4ERR_ATTRNOTSUPP. Nonetheless, if the ACE has type DENY, the + server may silently turn on the other bit, so that both APPEND_DATA + and WRITE_DATA are denied. + + + + + +Shepler, et al. Standards Track [Page 53] + +RFC 3530 NFS version 4 Protocol April 2003 + + +5.11.3. ACE flag + + The "flag" field contains values based on the following descriptions. + + ACE4_FILE_INHERIT_ACE + Can be placed on a directory and indicates that this ACE should be + added to each new non-directory file created. 

   ACE4_DIRECTORY_INHERIT_ACE
      Can be placed on a directory and indicates that this ACE should
      be added to each new directory created.

   ACE4_INHERIT_ONLY_ACE
      Can be placed on a directory but does not apply to the directory,
      only to newly created files/directories as specified by the above
      two flags.

   ACE4_NO_PROPAGATE_INHERIT_ACE
      Can be placed on a directory.  Normally when a new directory is
      created and an ACE exists on the parent directory which is marked
      ACE4_DIRECTORY_INHERIT_ACE, two ACEs are placed on the new
      directory: one for the directory itself and one which is an
      inheritable ACE for newly created directories.  This flag tells
      the server to not place an ACE on the newly created directory
      which is inheritable by subdirectories of the created directory.

   ACE4_SUCCESSFUL_ACCESS_ACE_FLAG

   ACE4_FAILED_ACCESS_ACE_FLAG
      The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
      ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits relate only to
      ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
      (ALARM) ACE types.  If during the processing of the file's ACL,
      the server encounters an AUDIT or ALARM ACE that matches the
      principal attempting the OPEN, the server notes that fact, and
      the presence, if any, of the SUCCESS and FAILED flags encountered
      in the AUDIT or ALARM ACE.  Once the server completes the ACL
      processing, the share reservation processing, and the OPEN call,
      it then notes if the OPEN succeeded or failed.  If the OPEN
      succeeded, and if the SUCCESS flag was set for a matching AUDIT
      or ALARM ACE, then the appropriate AUDIT or ALARM event occurs.
      If the OPEN failed, and if the FAILED flag was set for the
      matching AUDIT or ALARM ACE, then the appropriate AUDIT or ALARM
      event occurs.  Either or both of the SUCCESS or FAILED flags can
      be set, but if neither is set, the AUDIT or ALARM ACE is not
      useful.

Shepler, et al.             Standards Track                    [Page 54]

RFC 3530                 NFS version 4 Protocol               April 2003


      The previously described processing applies to the ACCESS
      operation as well.  The difference is that "success" or "failure"
      does not mean whether ACCESS returns NFS4_OK or not.  Success
      means whether ACCESS returns all requested and supported bits.
      Failure means whether ACCESS failed to return a bit that was
      requested and supported.

   ACE4_IDENTIFIER_GROUP
      Indicates that the "who" refers to a GROUP as defined under UNIX.

   The bitmask constants used for the flag field are as follows:

   const ACE4_FILE_INHERIT_ACE             = 0x00000001;
   const ACE4_DIRECTORY_INHERIT_ACE        = 0x00000002;
   const ACE4_NO_PROPAGATE_INHERIT_ACE     = 0x00000004;
   const ACE4_INHERIT_ONLY_ACE             = 0x00000008;
   const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG   = 0x00000010;
   const ACE4_FAILED_ACCESS_ACE_FLAG       = 0x00000020;
   const ACE4_IDENTIFIER_GROUP             = 0x00000040;

   A server need not support any of these flags.  If the server
   supports flags that are similar to, but not exactly the same as,
   these flags, the implementation may define a mapping between the
   protocol-defined flags and the implementation-defined flags.  Again,
   the guiding principle is that the file not appear to be more secure
   than it really is.

   For example, suppose a client tries to set an ACE with
   ACE4_FILE_INHERIT_ACE set but not ACE4_DIRECTORY_INHERIT_ACE.  If
   the server does not support any form of ACL inheritance, the server
   should reject the request with NFS4ERR_ATTRNOTSUPP.
If the server
   supports a single "inherit ACE" flag that applies to both files and
   directories, the server may reject the request (i.e., requiring the
   client to set both the file and directory inheritance flags).  The
   server may also accept the request and silently turn on the
   ACE4_DIRECTORY_INHERIT_ACE flag.

5.11.4.  ACE who

   There are several special identifiers ("who") which need to be
   understood universally, rather than in the context of a particular
   DNS domain.  Some of these identifiers cannot be understood when an
   NFS client accesses the server, but have meaning when a local
   process accesses the file.  The ability to display and modify these
   permissions is permitted over NFS, even if none of the access
   methods on the server understands the identifiers.

Shepler, et al.             Standards Track                    [Page 55]

RFC 3530                 NFS version 4 Protocol               April 2003


   Who                    Description
   _______________________________________________________________
   "OWNER"                The owner of the file.
   "GROUP"                The group associated with the file.
   "EVERYONE"             The world.
   "INTERACTIVE"          Accessed from an interactive terminal.
   "NETWORK"              Accessed via the network.
   "DIALUP"               Accessed as a dialup user to the server.
   "BATCH"                Accessed from a batch job.
   "ANONYMOUS"            Accessed without any authentication.
   "AUTHENTICATED"        Any authenticated user (opposite of
                          ANONYMOUS).
   "SERVICE"              Access from a system service.

   To avoid conflict, these special identifiers are distinguished by an
   appended "@" and should appear in the form "xxxx@" (note: no domain
   name after the "@").  For example: ANONYMOUS@.

5.11.5.  Mode Attribute

   The NFS version 4 mode attribute is based on the UNIX mode bits.
   The following bits are defined:

   const MODE4_SUID = 0x800;  /* set user id on execution */
   const MODE4_SGID = 0x400;  /* set group id on execution */
   const MODE4_SVTX = 0x200;  /* save text even after use */
   const MODE4_RUSR = 0x100;  /* read permission: owner */
   const MODE4_WUSR = 0x080;  /* write permission: owner */
   const MODE4_XUSR = 0x040;  /* execute permission: owner */
   const MODE4_RGRP = 0x020;  /* read permission: group */
   const MODE4_WGRP = 0x010;  /* write permission: group */
   const MODE4_XGRP = 0x008;  /* execute permission: group */
   const MODE4_ROTH = 0x004;  /* read permission: other */
   const MODE4_WOTH = 0x002;  /* write permission: other */
   const MODE4_XOTH = 0x001;  /* execute permission: other */

   Bits MODE4_RUSR, MODE4_WUSR, and MODE4_XUSR apply to the principal
   identified in the owner attribute.  Bits MODE4_RGRP, MODE4_WGRP, and
   MODE4_XGRP apply to the principals identified in the owner_group
   attribute.  Bits MODE4_ROTH, MODE4_WOTH, and MODE4_XOTH apply to any
   principal that does not match that in the owner attribute, and does
   not have a group matching that of the owner_group attribute.

   The remaining bits are not defined by this protocol and MUST NOT be
   used.  The minor version mechanism must be used to define further
   bit usage.

Shepler, et al.             Standards Track                    [Page 56]

RFC 3530                 NFS version 4 Protocol               April 2003


   Note that in UNIX, if a file has the MODE4_SGID bit set and no
   MODE4_XGRP bit set, then READ and WRITE must use mandatory file
   locking.

5.11.6.
+
+ A server that supports both mode and ACL must take care to
+ synchronize the MODE4_*USR, MODE4_*GRP, and MODE4_*OTH bits with the
+ ACEs which have respective who fields of "OWNER@", "GROUP@", and
+ "EVERYONE@" so that the client can see that semantically equivalent
+ access permissions exist whether the client asks for owner,
+ owner_group and mode attributes, or for just the ACL.
+
+ Because the mode attribute includes bits (e.g., MODE4_SVTX) that have
+ nothing to do with ACL semantics, it is permitted for clients to
+ specify both the ACL attribute and mode in the same SETATTR
+ operation. However, because there is no prescribed order for
+ processing the attributes in a SETATTR, the client must ensure that
+ the ACL attribute, if specified without mode, would produce the
+ desired mode bits, and conversely, that the mode attribute, if
+ specified without the ACL, would produce the desired "OWNER@",
+ "GROUP@", and "EVERYONE@" ACEs.
+
+5.11.7. mounted_on_fileid
+
+ UNIX-based operating environments connect a filesystem into the
+ namespace by connecting (mounting) the filesystem onto the existing
+ file object (the mount point, usually a directory) of an existing
+ filesystem. When the mount point's parent directory is read via an
+ API like readdir(), the return results are directory entries, each
+ with a component name and a fileid. The fileid of the mount point's
+ directory entry will be different from the fileid that the stat()
+ system call returns. The stat() system call is returning the fileid
+ of the root of the mounted filesystem, whereas readdir() is returning
+ the fileid stat() would have returned before any filesystems were
+ mounted on the mount point.
+
+ Unlike NFS version 3, NFS version 4 allows a client's LOOKUP request
+ to cross other filesystems. The client detects the filesystem
+ crossing whenever the filehandle argument of LOOKUP has an fsid
+ attribute different from that of the filehandle returned by LOOKUP.
+ A UNIX-based client will consider this a "mount point crossing".
+ UNIX has a legacy scheme for allowing a process to determine its
+ current working directory. This relies on readdir() of a mount
+ point's parent and stat() of the mount point returning fileids as
+ previously described. The mounted_on_fileid attribute corresponds to
+ the fileid that readdir() would have returned as described
+ previously.
+
+
+
+Shepler, et al. Standards Track [Page 57]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ While the NFS version 4 client could simply fabricate a fileid
+ corresponding to what mounted_on_fileid provides (and if the server
+ does not support mounted_on_fileid, the client has no choice), there
+ is a risk that the client will generate a fileid that conflicts with
+ one that is already assigned to another object in the filesystem.
+ Instead, if the server can provide the mounted_on_fileid, the
+ potential for client operational problems in this area is eliminated.
+
+ If the server detects that there is no mount point at the target
+ file object, then the value for mounted_on_fileid that it returns is
+ the same as that of the fileid attribute.
+
+ The mounted_on_fileid attribute is RECOMMENDED, so the server SHOULD
+ provide it if possible, and for a UNIX-based server, this is
+ straightforward.
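+
+ As an illustration, the attribute reduces to a walk down to the base
+ mount point. The following C sketch is illustrative only; the fsobj
+ structure and its mounted_on field are hypothetical server
+ internals, not protocol elements.
+
+    #include <stdint.h>
+
+    typedef uint64_t fileid4;
+
+    struct fsobj {                 /* hypothetical server object   */
+        fileid4       fileid;      /* the fileid attribute         */
+        struct fsobj *mounted_on;  /* directory this filesystem is
+                                      mounted on, or NULL          */
+    };
+
+    /*
+     * Sketch: several filesystems may be stacked on one mount
+     * point, so walk down to the base mount point and return the
+     * fileid that readdir() of the parent directory would have
+     * shown; for an ordinary object, mounted_on_fileid simply
+     * equals fileid.
+     */
+    fileid4 mounted_on_fileid(const struct fsobj *obj)
+    {
+        while (obj->mounted_on != NULL)
+            obj = obj->mounted_on;
+        return obj->fileid;
+    }
+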
Usually, mounted_on_fileid will be requested during + a READDIR operation, in which case it is trivial (at least for UNIX- + based servers) to return mounted_on_fileid since it is equal to the + fileid of a directory entry returned by readdir(). If + mounted_on_fileid is requested in a GETATTR operation, the server + should obey an invariant that has it returning a value that is equal + to the file object's entry in the object's parent directory, i.e., + what readdir() would have returned. Some operating environments + allow a series of two or more filesystems to be mounted onto a single + mount point. In this case, for the server to obey the aforementioned + invariant, it will need to find the base mount point, and not the + intermediate mount points. + +6. Filesystem Migration and Replication + + With the use of the recommended attribute "fs_locations", the NFS + version 4 server has a method of providing filesystem migration or + replication services. For the purposes of migration and replication, + a filesystem will be defined as all files that share a given fsid + (both major and minor values are the same). + + The fs_locations attribute provides a list of filesystem locations. + These locations are specified by providing the server name (either + DNS domain or IP address) and the path name representing the root of + the filesystem. Depending on the type of service being provided, the + list will provide a new location or a set of alternate locations for + the filesystem. The client will use this information to redirect its + requests to the new server. + +6.1. Replication + + It is expected that filesystem replication will be used in the case + of read-only data. Typically, the filesystem will be replicated on + two or more servers. The fs_locations attribute will provide the + + + +Shepler, et al. Standards Track [Page 58] + +RFC 3530 NFS version 4 Protocol April 2003 + + + list of these locations to the client. On first access of the + filesystem, the client should obtain the value of the fs_locations + attribute. If, in the future, the client finds the server + unresponsive, the client may attempt to use another server specified + by fs_locations. + + If applicable, the client must take the appropriate steps to recover + valid filehandles from the new server. This is described in more + detail in the following sections. + +6.2. Migration + + Filesystem migration is used to move a filesystem from one server to + another. Migration is typically used for a filesystem that is + writable and has a single copy. The expected use of migration is for + load balancing or general resource reallocation. The protocol does + not specify how the filesystem will be moved between servers. This + server-to-server transfer mechanism is left to the server + implementor. However, the method used to communicate the migration + event between client and server is specified here. + + Once the servers participating in the migration have completed the + move of the filesystem, the error NFS4ERR_MOVED will be returned for + subsequent requests received by the original server. The + NFS4ERR_MOVED error is returned for all operations except PUTFH and + GETATTR. Upon receiving the NFS4ERR_MOVED error, the client will + obtain the value of the fs_locations attribute. The client will then + use the contents of the attribute to redirect its requests to the + specified server. 
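+
+ In rough client-side terms, the redirection reduces to fetching
+ fs_locations and retrying. The sketch below is illustrative only:
+ every name in it (nfs4_call, fetch_fs_locations, pick_server,
+ filehandle_recovery, and the structures) is a hypothetical client
+ internal, not part of the protocol.
+
+    #define NFS4ERR_MOVED 10019        /* from the protocol */
+
+    struct server;                     /* opaque, hypothetical types */
+    struct fs_locations;
+    struct nfs4_op;
+    struct nfs4_mount { struct server *server; void *root_fh; };
+
+    /* Hypothetical client internals, declared for the sketch: */
+    int  nfs4_call(struct server *, struct nfs4_op *);
+    int  fetch_fs_locations(struct server *, void *,
+                            struct fs_locations **);
+    struct server *pick_server(struct fs_locations *);
+    void filehandle_recovery(struct nfs4_mount *);
+
+    /* Redirect after NFS4ERR_MOVED: the original server still
+     * accepts PUTFH and GETATTR of fs_locations for the migrated
+     * filesystem's filehandles. */
+    int call_with_migration(struct nfs4_mount *m, struct nfs4_op *op)
+    {
+        int status = nfs4_call(m->server, op);
+
+        if (status == NFS4ERR_MOVED) {
+            struct fs_locations *locs;
+
+            if (fetch_fs_locations(m->server, m->root_fh, &locs) != 0)
+                return status;
+            m->server = pick_server(locs);
+            filehandle_recovery(m);    /* see section 6.4 */
+            status = nfs4_call(m->server, op);
+        }
+        return status;
+    }
+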
To facilitate the use of GETATTR, operations such + as PUTFH must also be accepted by the server for the migrated file + system's filehandles. Note that if the server returns NFS4ERR_MOVED, + the server MUST support the fs_locations attribute. + + If the client requests more attributes than just fs_locations, the + server may return fs_locations only. This is to be expected since + the server has migrated the filesystem and may not have a method of + obtaining additional attribute data. + + The server implementor needs to be careful in developing a migration + solution. The server must consider all of the state information + clients may have outstanding at the server. This includes but is not + limited to locking/share state, delegation state, and asynchronous + file writes which are represented by WRITE and COMMIT verifiers. The + server should strive to minimize the impact on its clients during and + after the migration process. + + + + + + +Shepler, et al. Standards Track [Page 59] + +RFC 3530 NFS version 4 Protocol April 2003 + + +6.3. Interpretation of the fs_locations Attribute + + The fs_location attribute is structured in the following way: + + struct fs_location { + utf8str_cis server<>; + pathname4 rootpath; + }; + + struct fs_locations { + pathname4 fs_root; + fs_location locations<>; + }; + + The fs_location struct is used to represent the location of a + filesystem by providing a server name and the path to the root of the + filesystem. For a multi-homed server or a set of servers that use + the same rootpath, an array of server names may be provided. An + entry in the server array is an UTF8 string and represents one of a + traditional DNS host name, IPv4 address, or IPv6 address. It is not + a requirement that all servers that share the same rootpath be listed + in one fs_location struct. The array of server names is provided for + convenience. Servers that share the same rootpath may also be listed + in separate fs_location entries in the fs_locations attribute. + + The fs_locations struct and attribute then contains an array of + locations. Since the name space of each server may be constructed + differently, the "fs_root" field is provided. The path represented + by fs_root represents the location of the filesystem in the server's + name space. Therefore, the fs_root path is only associated with the + server from which the fs_locations attribute was obtained. The + fs_root path is meant to aid the client in locating the filesystem at + the various servers listed. + + As an example, there is a replicated filesystem located at two + servers (servA and servB). At servA the filesystem is located at + path "/a/b/c". At servB the filesystem is located at path "/x/y/z". + In this example the client accesses the filesystem first at servA + with a multi-component lookup path of "/a/b/c/d". Since the client + used a multi-component lookup to obtain the filehandle at "/a/b/c/d", + it is unaware that the filesystem's root is located in servA's name + space at "/a/b/c". When the client switches to servB, it will need + to determine that the directory it first referenced at servA is now + represented by the path "/x/y/z/d" on servB. To facilitate this, the + fs_locations attribute provided by servA would have a fs_root value + of "/a/b/c" and two entries in fs_location. One entry in fs_location + will be for itself (servA) and the other will be for servB with a + + + + +Shepler, et al. Standards Track [Page 60] + +RFC 3530 NFS version 4 Protocol April 2003 + + + path of "/x/y/z". 
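+
+ The path adjustment at the heart of this example is a prefix
+ substitution. The following C sketch is illustrative; it assumes
+ the pathname4 component arrays have already been flattened into
+ slash-separated strings, which is not how the protocol encodes them.
+
+    #include <stdio.h>
+    #include <string.h>
+
+    /* Replace the fs_root prefix of path with the rootpath of the
+     * chosen fs_location.  Returns 0 on success, -1 if fs_root is
+     * not a prefix of path or the buffer is too small. */
+    static int substitute_fs_root(const char *path,
+                                  const char *fs_root,
+                                  const char *rootpath,
+                                  char *out, size_t outlen)
+    {
+        size_t n = strlen(fs_root);
+
+        if (strncmp(path, fs_root, n) != 0)
+            return -1;
+        if ((size_t)snprintf(out, outlen, "%s%s",
+                             rootpath, path + n) >= outlen)
+            return -1;
+        return 0;
+    }
+
+    int main(void)
+    {
+        char buf[256];
+
+        /* The servA/servB example: "/a/b/c/d" -> "/x/y/z/d". */
+        if (substitute_fs_root("/a/b/c/d", "/a/b/c", "/x/y/z",
+                               buf, sizeof(buf)) == 0)
+            printf("%s\n", buf);
+        return 0;
+    }
+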
With this information, the client is able to + substitute "/x/y/z" for the "/a/b/c" at the beginning of its access + path and construct "/x/y/z/d" to use for the new server. + + See the section "Security Considerations" for a discussion on the + recommendations for the security flavor to be used by any GETATTR + operation that requests the "fs_locations" attribute. + +6.4. Filehandle Recovery for Migration or Replication + + Filehandles for filesystems that are replicated or migrated generally + have the same semantics as for filesystems that are not replicated or + migrated. For example, if a filesystem has persistent filehandles + and it is migrated to another server, the filehandle values for the + filesystem will be valid at the new server. + + For volatile filehandles, the servers involved likely do not have a + mechanism to transfer filehandle format and content between + themselves. Therefore, a server may have difficulty in determining + if a volatile filehandle from an old server should return an error of + NFS4ERR_FHEXPIRED. Therefore, the client is informed, with the use + of the fh_expire_type attribute, whether volatile filehandles will + expire at the migration or replication event. If the bit + FH4_VOL_MIGRATION is set in the fh_expire_type attribute, the client + must treat the volatile filehandle as if the server had returned the + NFS4ERR_FHEXPIRED error. At the migration or replication event in + the presence of the FH4_VOL_MIGRATION bit, the client will not + present the original or old volatile filehandle to the new server. + The client will start its communication with the new server by + recovering its filehandles using the saved file names. + +7. NFS Server Name Space + +7.1. Server Exports + + On a UNIX server the name space describes all the files reachable by + pathnames under the root directory or "/". On a Windows NT server + the name space constitutes all the files on disks named by mapped + disk letters. NFS server administrators rarely make the entire + server's filesystem name space available to NFS clients. More often + portions of the name space are made available via an "export" + feature. In previous versions of the NFS protocol, the root + filehandle for each export is obtained through the MOUNT protocol; + the client sends a string that identifies the export of name space + and the server returns the root filehandle for it. The MOUNT + protocol supports an EXPORTS procedure that will enumerate the + server's exports. + + + + +Shepler, et al. Standards Track [Page 61] + +RFC 3530 NFS version 4 Protocol April 2003 + + +7.2. Browsing Exports + + The NFS version 4 protocol provides a root filehandle that clients + can use to obtain filehandles for these exports via a multi-component + LOOKUP. A common user experience is to use a graphical user + interface (perhaps a file "Open" dialog window) to find a file via + progressive browsing through a directory tree. The client must be + able to move from one export to another export via single-component, + progressive LOOKUP operations. + + This style of browsing is not well supported by the NFS version 2 and + 3 protocols. The client expects all LOOKUP operations to remain + within a single server filesystem. For example, the device attribute + will not change. This prevents a client from taking name space paths + that span exports. + + An automounter on the client can obtain a snapshot of the server's + name space using the EXPORTS procedure of the MOUNT protocol. 
If it
+ understands the server's pathname syntax, it can create an image of
+ the server's name space on the client. The parts of the name space
+ that are not exported by the server are filled in with a "pseudo
+ filesystem" that allows the user to browse from one mounted
+ filesystem to another. There is a drawback to this representation of
+ the server's name space on the client: it is static. If the server
+ administrator adds a new export, the client will be unaware of it.
+
+7.3. Server Pseudo Filesystem
+
+ NFS version 4 servers avoid this name space inconsistency by
+ presenting all the exports within the framework of a single server
+ name space. An NFS version 4 client uses LOOKUP and READDIR
+ operations to browse seamlessly from one export to another. Portions
+ of the server name space that are not exported are bridged via a
+ "pseudo filesystem" that provides a view of exported directories
+ only. A pseudo filesystem has a unique fsid and behaves like a
+ normal, read-only filesystem.
+
+ Based on the construction of the server's name space, it is possible
+ that multiple pseudo filesystems may exist. For example,
+
+ /a pseudo filesystem
+ /a/b real filesystem
+ /a/b/c pseudo filesystem
+ /a/b/c/d real filesystem
+
+ Each of the pseudo filesystems is considered a separate entity and
+ therefore will have a unique fsid.
+
+
+
+Shepler, et al. 
Standards Track [Page 63] + +RFC 3530 NFS version 4 Protocol April 2003 + + + The pseudo filesystem for this server may be constructed to look + like: + + / (place holder/not exported) + /a/b (filesystem 1) + /a/b/c/d (filesystem 2) + + It is the server's responsibility to present the pseudo filesystem + that is complete to the client. If the client sends a lookup request + for the path "/a/b/c/d", the server's response is the filehandle of + the filesystem "/a/b/c/d". In previous versions of the NFS protocol, + the server would respond with the filehandle of directory "/a/b/c/d" + within the filesystem "/a/b". + + The NFS client will be able to determine if it crosses a server mount + point by a change in the value of the "fsid" attribute. + +7.8. Security Policy and Name Space Presentation + + The application of the server's security policy needs to be carefully + considered by the implementor. One may choose to limit the + viewability of portions of the pseudo filesystem based on the + server's perception of the client's ability to authenticate itself + properly. However, with the support of multiple security mechanisms + and the ability to negotiate the appropriate use of these mechanisms, + the server is unable to properly determine if a client will be able + to authenticate itself. If, based on its policies, the server + chooses to limit the contents of the pseudo filesystem, the server + may effectively hide filesystems from a client that may otherwise + have legitimate access. + + As suggested practice, the server should apply the security policy of + a shared resource in the server's namespace to the components of the + resource's ancestors. For example: + + / + /a/b + /a/b/c + + The /a/b/c directory is a real filesystem and is the shared resource. + The security policy for /a/b/c is Kerberos with integrity. The + server should apply the same security policy to /, /a, and /a/b. + This allows for the extension of the protection of the server's + namespace to the ancestors of the real shared resource. + + + + + + + +Shepler, et al. Standards Track [Page 64] + +RFC 3530 NFS version 4 Protocol April 2003 + + + For the case of the use of multiple, disjoint security mechanisms in + the server's resources, the security for a particular object in the + server's namespace should be the union of all security mechanisms of + all direct descendants. + +8. File Locking and Share Reservations + + Integrating locking into the NFS protocol necessarily causes it to be + stateful. With the inclusion of share reservations the protocol + becomes substantially more dependent on state than the traditional + combination of NFS and NLM [XNFS]. There are three components to + making this state manageable: + + o Clear division between client and server + + o Ability to reliably detect inconsistency in state between client + and server + + o Simple and robust recovery mechanisms + + In this model, the server owns the state information. The client + communicates its view of this state to the server as needed. The + client is also able to detect inconsistent state before modifying a + file. + + To support Win32 share reservations it is necessary to atomically + OPEN or CREATE files. Having a separate share/unshare operation + would not allow correct implementation of the Win32 OpenFile API. In + order to correctly implement share semantics, the previous NFS + protocol mechanisms used when a file is opened or created (LOOKUP, + CREATE, ACCESS) need to be replaced. 
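+
+ The share-reservation check that such an atomic OPEN must perform
+ can be sketched briefly. The open_state list below is a
+ hypothetical server structure, and the sketch ignores the locking of
+ the server's own data; the constants are the protocol's
+ OPEN4_SHARE_* values.
+
+    #define OPEN4_SHARE_ACCESS_READ  0x00000001
+    #define OPEN4_SHARE_ACCESS_WRITE 0x00000002
+    #define OPEN4_SHARE_DENY_READ    0x00000001
+    #define OPEN4_SHARE_DENY_WRITE   0x00000002
+
+    struct open_state {                /* hypothetical server state */
+        unsigned access, deny;
+        struct open_state *next;
+    };
+
+    /* Returns nonzero if a new OPEN (access, deny) conflicts with
+     * share reservations already held on the file: the new access
+     * must not be denied by an existing OPEN, and the new deny must
+     * not cover an existing access.  This check must be atomic with
+     * any CREATE. */
+    int share_conflict(const struct open_state *opens,
+                       unsigned access, unsigned deny)
+    {
+        for (; opens != NULL; opens = opens->next)
+            if ((opens->deny & access) || (deny & opens->access))
+                return 1;
+        return 0;
+    }
+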
+ The NFS version 4 protocol has an OPEN operation that subsumes the
+ NFS version 3 methodology of LOOKUP, CREATE, and ACCESS. However,
+ because many operations require a filehandle, the traditional LOOKUP
+ is preserved to map a file name to filehandle without establishing
+ state on the server. The policy of granting access or modifying
+ files is managed by the server based on the client's state. These
+ mechanisms can implement policy ranging from advisory-only locking to
+ full mandatory locking.
+
+8.1. Locking
+
+ It is assumed that manipulating a lock is rare when compared to READ
+ and WRITE operations. It is also assumed that crashes and network
+ partitions are relatively rare. Therefore it is important that the
+ READ and WRITE operations have a lightweight mechanism to indicate if
+ they possess a held lock. A lock request contains the heavyweight
+ information required to establish a lock and uniquely define the lock
+ owner.
+
+
+
+Shepler, et al. Standards Track [Page 65]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ The following sections describe the transition from the heavyweight
+ information to the eventual stateid used for most client and server
+ locking and lease interactions.
+
+8.1.1. Client ID
+
+ For each LOCK request, the client must identify itself to the server.
+ This is done in such a way as to allow for correct lock
+ identification and crash recovery. A sequence of a SETCLIENTID
+ operation followed by a SETCLIENTID_CONFIRM operation is required to
+ establish the identification with the server. Establishment of
+ identification by a new incarnation of the client also has the effect
+ of immediately breaking any leased state that a previous incarnation
+ of the client might have had on the server, as opposed to forcing the
+ new client incarnation to wait for the leases to expire. Breaking
+ the lease state amounts to the server removing all lock, share
+ reservation, and, where the server is not supporting the
+ CLAIM_DELEGATE_PREV claim type, all delegation state associated with
+ the same client with the same identity. For discussion of delegation
+ state recovery, see the section "Delegation Recovery".
+
+ Client identification is encapsulated in the following structure:
+
+ struct nfs_client_id4 {
+ verifier4 verifier;
+ opaque id;
+ };
+
+ The first field, verifier, is a client incarnation verifier that is
+ used to detect client reboots. Only if the verifier is different
+ from that which the server has previously recorded for the client (as
+ identified by the second field of the structure, id) does the server
+ start the process of canceling the client's leased state.
+
+ The second field, id, is a variable-length string that uniquely
+ defines the client.
+
+ There are several considerations for how the client generates the id
+ string:
+
+ o The string should be unique so that multiple clients do not
+ present the same string. The consequences of two clients
+ presenting the same string range from one client getting an error
+ to one client having its leased state abruptly and unexpectedly
+ canceled.
+
+
+
+
+Shepler, et al. Standards Track [Page 66]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ o The string should be selected so that subsequent incarnations
+ (e.g., reboots) of the same client cause the client to present the
+ same string.
The implementor is cautioned
+ against an approach that requires the string to be recorded in a
+ local file because this precludes the use of the implementation in an
+ environment where there is no local disk and all file access is from
+ an NFS version 4 server.
+
+ o The string should be different for each server network address
+ that the client accesses, rather than common to all server network
+ addresses. The reason is that it may not be possible for the
+ client to tell if the same server is listening on multiple network
+ addresses. If the client issues SETCLIENTID with the same id
+ string to each network address of such a server, the server will
+ think it is the same client, and each successive SETCLIENTID will
+ cause the server to begin the process of removing the client's
+ previous leased state.
+
+ o The algorithm for generating the string should not assume that the
+ client's network address won't change. This includes changes
+ between client incarnations and even changes while the client is
+ still running in its current incarnation. This means that if the
+ client includes just the client's and server's network address in
+ the id string, there is a real risk, after the client gives up the
+ network address, that another client, using a similar algorithm
+ for generating the id string, will generate a conflicting id
+ string.
+
+ Given the above considerations, an example of a well-generated id
+ string is one that includes:
+
+ o The server's network address.
+
+ o The client's network address.
+
+ o For a user-level NFS version 4 client, it should contain
+ additional information to distinguish the client from other user-
+ level clients running on the same host, such as a process id or
+ other unique sequence.
+
+ o Additional information that tends to be unique, such as one or
+ more of:
+
+ - The client machine's serial number (for privacy reasons, it is
+ best to perform some one-way function on the serial number).
+
+ - A MAC address.
+
+
+
+Shepler, et al. Standards Track [Page 67]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ - The timestamp of when the NFS version 4 software was first
+ installed on the client (though this is subject to the
+ previously mentioned caution about using information that is
+ stored in a file, because the file might only be accessible
+ over NFS version 4).
+
+ - A true random number. However, since this number ought to be
+ the same between client incarnations, this shares the same
+ problem as that of using the timestamp of the software
+ installation.
+
+ As a security measure, the server MUST NOT cancel a client's leased
+ state if the principal that established the state for a given id
+ string is not the same as the principal issuing the SETCLIENTID.
+
+ Note that SETCLIENTID and SETCLIENTID_CONFIRM have a secondary
+ purpose of establishing the information the server needs to make
+ callbacks to the client for the purpose of supporting delegations.
+ It is permitted to change this information via SETCLIENTID and
+ SETCLIENTID_CONFIRM within the same incarnation of the client without
+ removing the client's leased state.
+
+ Once a SETCLIENTID and SETCLIENTID_CONFIRM sequence has successfully
+ completed, the client uses the shorthand client identifier, of type
+ clientid4, instead of the longer and less compact nfs_client_id4
+ structure. This shorthand client identifier (a clientid) is assigned
+ by the server and should be chosen so that it will not conflict with
+ a clientid previously assigned by the server.
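+
+ As a purely hypothetical illustration of the considerations above, a
+ user-level client might assemble its id string as follows; every
+ ingredient shown is an example, and any combination with the
+ uniqueness and stability properties described above would serve.
+
+    #include <stdio.h>
+
+    /* Sketch: build an nfs_client_id4 id string.  Returns 0 on
+     * success, -1 if the buffer is too small. */
+    int make_id_string(char *buf, size_t len,
+                       const char *client_addr, /* e.g., "192.0.2.25" */
+                       const char *server_addr, /* e.g., "192.0.2.1"  */
+                       unsigned long pid,       /* user-level client  */
+                       unsigned long instance)  /* e.g., a one-way
+                                                   hash of the MAC    */
+    {
+        int n = snprintf(buf, len, "example-nfs4 %s/%s pid=%lu i=%lu",
+                         client_addr, server_addr, pid, instance);
+        return (n < 0 || (size_t)n >= len) ? -1 : 0;
+    }
+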
+ This non-conflict requirement applies across server restarts or
+ reboots. When a clientid is presented to a server and that clientid
+ is not recognized, as would happen after a server reboot, the server
+ will reject the request with the error NFS4ERR_STALE_CLIENTID. When
+ this happens, the client must obtain a new clientid by use of the
+ SETCLIENTID operation and then proceed to any other necessary
+ recovery for the server reboot case (see the section "Server Failure
+ and Recovery").
+
+ The client must also employ the SETCLIENTID operation when it
+ receives an NFS4ERR_STALE_STATEID error using a stateid derived from
+ its current clientid, since this also indicates a server reboot which
+ has invalidated the existing clientid (see the next section
+ "lock_owner and stateid Definition" for details).
+
+ See the detailed descriptions of SETCLIENTID and SETCLIENTID_CONFIRM
+ for a complete specification of the operations.
+
+
+
+
+
+
+
+Shepler, et al. Standards Track [Page 68]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+8.1.2. Server Release of Clientid
+
+ If the server determines that the client holds no associated state
+ for its clientid, the server may choose to release the clientid. The
+ server may make this choice for an inactive client so that resources
+ are not consumed by those intermittently active clients. If the
+ client contacts the server after this release, the server must ensure
+ the client receives the appropriate error so that it will use the
+ SETCLIENTID/SETCLIENTID_CONFIRM sequence to establish a new identity.
+ It should be clear that the server must be very hesitant to release a
+ clientid since the resulting work on the client to recover from such
+ an event will be the same burden as if the server had failed and
+ restarted. Typically a server would not release a clientid unless
+ there had been no activity from that client for many minutes.
+
+ Note that if the id string in a SETCLIENTID request is properly
+ constructed, and if the client takes care to use the same principal
+ for each successive use of SETCLIENTID, then, barring an active
+ denial-of-service attack, NFS4ERR_CLID_INUSE should never be
+ returned.
+
+ However, client bugs, server bugs, or perhaps a deliberate change of
+ the principal owner of the id string (such as the case of a client
+ that changes security flavors, and under the new flavor, there is no
+ mapping to the previous owner) will in rare cases result in
+ NFS4ERR_CLID_INUSE.
+
+ In that event, when the server gets a SETCLIENTID for a client id
+ that currently has no state, or it has state, but the lease has
+ expired, rather than returning NFS4ERR_CLID_INUSE, the server MUST
+ allow the SETCLIENTID, and confirm the new clientid if followed by
+ the appropriate SETCLIENTID_CONFIRM.
+
+8.1.3. lock_owner and stateid Definition
+
+ When requesting a lock, the client must present to the server the
+ clientid and an identifier for the owner of the requested lock.
+ These two fields are referred to as the lock_owner, and the
+ definitions of those fields are:
+
+ o A clientid returned by the server as part of the client's use of
+ the SETCLIENTID operation.
+
+ o A variable-length opaque array used to uniquely define the owner
+ of a lock managed by the client.
+
+ This may be a thread id, process id, or other unique value.
+
+
+
+Shepler, et al. Standards Track [Page 69]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ When the server grants the lock, it responds with a unique stateid.
+ The stateid is used as a shorthand reference to the lock_owner, since
+ the server will be maintaining the correspondence between them.
+
+ The server is free to form the stateid in any manner that it chooses
+ as long as it is able to recognize invalid and out-of-date stateids.
+ This requirement includes those stateids generated by earlier
+ instances of the server. From this, the client can be properly
+ notified of a server restart. This notification will occur when the
+ client presents a stateid to the server from a previous
+ instantiation.
+
+ The server must be able to distinguish the following situations and
+ return the error as specified:
+
+ o The stateid was generated by an earlier server instance (i.e.,
+ before a server reboot). The error NFS4ERR_STALE_STATEID should
+ be returned.
+
+ o The stateid was generated by the current server instance but the
+ stateid no longer designates the current locking state for the
+ lockowner-file pair in question (i.e., one or more locking
+ operations have occurred). The error NFS4ERR_OLD_STATEID should
+ be returned.
+
+ This error condition will only occur when the client issues a
+ locking request which changes a stateid while an I/O request that
+ uses that stateid is outstanding.
+
+ o The stateid was generated by the current server instance but the
+ stateid does not designate a locking state for any active
+ lockowner-file pair. The error NFS4ERR_BAD_STATEID should be
+ returned.
+
+ This error condition will occur when there has been a logic error
+ on the part of the client or server. This should not happen.
+
+ One mechanism that may be used to satisfy these requirements is for
+ the server to:
+
+ o divide the "other" field of each stateid into two fields:
+
+ - A server verifier which uniquely designates a particular server
+ instantiation.
+
+ - An index into a table of locking-state structures.
+
+
+
+
+
+Shepler, et al. Standards Track [Page 70]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ o utilize the "seqid" field of each stateid, such that seqid is
+ monotonically incremented for each stateid that is associated with
+ the same index into the locking-state table.
+
+ By matching the incoming stateid and its field values with the state
+ held at the server, the server is able to easily determine if a
+ stateid is valid for its current instantiation and state. If the
+ stateid is not valid, the appropriate error can be supplied to the
+ client.
+
+8.1.4. Use of the stateid and Locking
+
+ All READ, WRITE, and SETATTR operations contain a stateid. For the
+ purposes of this section, SETATTR operations which change the size
+ attribute of a file are treated as if they are writing the area
+ between the old and new size (i.e., the range truncated or added to
+ the file by means of the SETATTR), even where SETATTR is not
+ explicitly mentioned in the text.
+
+ If the lock_owner performs a READ or WRITE in a situation in which it
+ has established a lock or share reservation on the server (any OPEN
+ constitutes a share reservation), the stateid (previously returned by
+ the server) must be used to indicate what locks, including both
+ record locks and share reservations, are held by the lockowner. If
+ no state is established by the client, either record lock or share
+ reservation, a stateid of all bits 0 is used.
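+
+ The table-based construction suggested above can be sketched in C.
+ The layout of the "other" field and the state table are hypothetical
+ server choices (the protocol only fixes the 4-byte seqid plus
+ 12-byte other split), and a real server would distinguish an old
+ seqid from a corrupt one.
+
+    #include <stdint.h>
+    #include <string.h>
+
+    #define NFS4_OK               0     /* values from the        */
+    #define NFS4ERR_STALE_STATEID 10023 /* protocol's error       */
+    #define NFS4ERR_OLD_STATEID   10024 /* enumeration            */
+    #define NFS4ERR_BAD_STATEID   10025
+
+    struct stateid4 {                   /* per the protocol's XDR */
+        uint32_t seqid;
+        uint8_t  other[12];
+    };
+
+    struct lock_state { uint32_t cur_seqid; int in_use; };
+
+    extern uint8_t           boot_verifier[8];  /* this instance  */
+    extern struct lock_state state_table[];
+    extern uint32_t          state_table_size;
+
+    int check_stateid(const struct stateid4 *sid)
+    {
+        uint32_t idx;
+
+        if (memcmp(sid->other, boot_verifier, 8) != 0)
+            return NFS4ERR_STALE_STATEID;  /* earlier instance    */
+
+        memcpy(&idx, sid->other + 8, 4);
+        if (idx >= state_table_size || !state_table[idx].in_use)
+            return NFS4ERR_BAD_STATEID;    /* no such state       */
+
+        if (sid->seqid != state_table[idx].cur_seqid)
+            return NFS4ERR_OLD_STATEID;    /* superseded stateid  */
+
+        return NFS4_OK;
+    }
+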
+ Regardless of whether a stateid of all bits 0 or a stateid returned
+ by the server is used, if there is a conflicting share reservation or
+ mandatory record lock held on the file, the server MUST refuse to
+ service the READ or WRITE operation.
+
+ Share reservations are established by OPEN operations and by their
+ nature are mandatory in that when the OPEN denies READ or WRITE
+ operations, that denial results in such operations being rejected
+ with error NFS4ERR_LOCKED. Record locks may be implemented by the
+ server as either mandatory or advisory, or the choice of mandatory or
+ advisory behavior may be determined by the server on the basis of the
+ file being accessed (for example, some UNIX-based servers support a
+ "mandatory lock bit" on the mode attribute such that if set, record
+ locks are required on the file before I/O is possible). When record
+ locks are advisory, they only prevent the granting of conflicting
+ lock requests and have no effect on READs or WRITEs. Mandatory
+ record locks, however, prevent conflicting I/O operations. When such
+ I/O operations are attempted, they are rejected with NFS4ERR_LOCKED.
+ When the client gets NFS4ERR_LOCKED on a file it knows it has the
+ proper share reservation for, it will need to issue a LOCK request on
+ the region
+
+
+
+
+
+Shepler, et al. Standards Track [Page 71]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ of the file that includes the region the I/O was to be performed on,
+ with an appropriate locktype (i.e., READ*_LT for a READ operation,
+ WRITE*_LT for a WRITE operation).
+
+ With NFS version 3, there was no notion of a stateid so there was no
+ way to tell if the application process of the client sending the READ
+ or WRITE operation had also acquired the appropriate record lock on
+ the file. Thus there was no way to implement mandatory locking.
+ With the stateid construct, this barrier has been removed.
+
+ Note that for UNIX environments that support mandatory file locking,
+ the distinction between advisory and mandatory locking is subtle. In
+ fact, advisory and mandatory record locks are exactly the same
+ insofar as the APIs and requirements on implementation are concerned.
+ If the mandatory lock attribute is set on the file, the server checks
+ to see if the lockowner has an appropriate shared (read) or exclusive
+ (write) record lock on the region it wishes to read or write to. If
+ there is no appropriate lock, the server checks if there is a
+ conflicting lock (which can be done by attempting to acquire the
+ conflicting lock on behalf of the lockowner and, if successful,
+ releasing the lock after the READ or WRITE is done), and if there is,
+ the server returns NFS4ERR_LOCKED.
+
+ For Windows environments, there are no advisory record locks, so the
+ server always checks for record locks during I/O requests.
+
+ Thus, the NFS version 4 LOCK operation does not need to distinguish
+ between advisory and mandatory record locks. It is the NFS version 4
+ server's processing of the READ and WRITE operations that introduces
+ the distinction.
+
+ Every stateid other than the special stateid values noted in this
+ section, whether returned by an OPEN-type operation (i.e., OPEN,
+ OPEN_DOWNGRADE), or by a LOCK-type operation (i.e., LOCK or LOCKU),
+ defines an access mode for the file (i.e., READ, WRITE, or READ-
+ WRITE) as established by the original OPEN which began the stateid
+ sequence, and as modified by subsequent OPENs and OPEN_DOWNGRADEs
+ within that stateid sequence.
+ When a READ, WRITE, or SETATTR that specifies the size attribute is
+ done, the operation is subject to checking against the access mode
+ to verify that the operation is appropriate given the OPEN with which
+ the operation is associated.
+
+ In the case of WRITE-type operations (i.e., WRITEs and SETATTRs which
+ set size), the server must verify that the access mode allows writing
+ and return an NFS4ERR_OPENMODE error if it does not. In the case of
+ READ, the server may perform the corresponding check on the access
+ mode, or it may choose to allow READ on opens for WRITE only, to
+ accommodate clients whose write implementation may unavoidably do
+
+
+
+
+Shepler, et al. Standards Track [Page 72]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ reads (e.g., due to buffer cache constraints). However, even if
+ READs are allowed in these circumstances, the server MUST still check
+ for locks that conflict with the READ (e.g., another OPEN specifying
+ denial of READs). Note that a server which does enforce the access
+ mode check on READs need not explicitly check for conflicting share
+ reservations since the existence of OPEN for read access guarantees
+ that no conflicting share reservation can exist.
+
+ A stateid of all bits 1 (one) MAY allow READ operations to bypass
+ locking checks at the server. However, WRITE operations with a
+ stateid of all bits 1 (one) MUST NOT bypass locking checks and are
+ treated exactly the same as if a stateid of all bits 0 were used.
+
+ A lock may not be granted while a READ or WRITE operation using one
+ of the special stateids is being performed and the range of the lock
+ request conflicts with the range of the READ or WRITE operation. For
+ the purposes of this paragraph, a conflict occurs when a shared lock
+ is requested and a WRITE operation is being performed, or an
+ exclusive lock is requested and either a READ or a WRITE operation is
+ being performed. A SETATTR that sets size is treated similarly to a
+ WRITE as discussed above.
+
+8.1.5. Sequencing of Lock Requests
+
+ Locking is different from most NFS operations as it requires "at-
+ most-one" semantics that are not provided by ONCRPC. ONCRPC over a
+ reliable transport is not sufficient because a sequence of locking
+ requests may span multiple TCP connections. In the face of
+ retransmission or reordering, lock or unlock requests must have a
+ well-defined and consistent behavior. To accomplish this, each lock
+ request contains a sequence number that is a consecutively increasing
+ integer. Different lock_owners have different sequences. The server
+ maintains the last sequence number (L) received and the response that
+ was returned. The first request issued for any given lock_owner is
+ issued with a sequence number of zero.
+
+ Note that for requests that contain a sequence number, for each
+ lock_owner, there should be no more than one outstanding request.
+
+ If a request (r) with a previous sequence number (r < L) is received,
+ it is rejected with the return of error NFS4ERR_BAD_SEQID. Given a
+ properly-functioning client, the response to (r) must have been
+ received before the last request (L) was sent. If a duplicate of the
+ last request (r == L) is received, the stored response is returned.
+ If a request beyond the next sequence (e.g., r == L + 2) is received,
+ it is rejected with the return of error NFS4ERR_BAD_SEQID. Sequence
+ history is reinitialized whenever the SETCLIENTID/SETCLIENTID_CONFIRM
+ sequence changes the client verifier.
+
+
+
+Shepler, et al. 
Standards Track [Page 73] + +RFC 3530 NFS version 4 Protocol April 2003 + + + Since the sequence number is represented with an unsigned 32-bit + integer, the arithmetic involved with the sequence number is mod + 2^32. For an example of modulo arithmetic involving sequence numbers + see [RFC793]. + + It is critical the server maintain the last response sent to the + client to provide a more reliable cache of duplicate non-idempotent + requests than that of the traditional cache described in [Juszczak]. + The traditional duplicate request cache uses a least recently used + algorithm for removing unneeded requests. However, the last lock + request and response on a given lock_owner must be cached as long as + the lock state exists on the server. + + The client MUST monotonically increment the sequence number for the + CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE + operations. This is true even in the event that the previous + operation that used the sequence number received an error. The only + exception to this rule is if the previous operation received one of + the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID, + NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR, + NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE. + +8.1.6. Recovery from Replayed Requests + + As described above, the sequence number is per lock_owner. As long + as the server maintains the last sequence number received and follows + the methods described above, there are no risks of a Byzantine router + re-sending old requests. The server need only maintain the + (lock_owner, sequence number) state as long as there are open files + or closed files with locks outstanding. + + LOCK, LOCKU, OPEN, OPEN_DOWNGRADE, and CLOSE each contain a sequence + number and therefore the risk of the replay of these operations + resulting in undesired effects is non-existent while the server + maintains the lock_owner state. + +8.1.7. Releasing lock_owner State + + When a particular lock_owner no longer holds open or file locking + state at the server, the server may choose to release the sequence + number state associated with the lock_owner. The server may make + this choice based on lease expiration, for the reclamation of server + memory, or other implementation specific details. In any event, the + server is able to do this safely only when the lock_owner no longer + is being utilized by the client. The server may choose to hold the + lock_owner state in the event that retransmitted requests are + received. However, the period to hold this state is implementation + specific. + + + +Shepler, et al. Standards Track [Page 74] + +RFC 3530 NFS version 4 Protocol April 2003 + + + In the case that a LOCK, LOCKU, OPEN_DOWNGRADE, or CLOSE is + retransmitted after the server has previously released the lock_owner + state, the server will find that the lock_owner has no files open and + an error will be returned to the client. If the lock_owner does have + a file open, the stateid will not match and again an error is + returned to the client. + +8.1.8. Use of Open Confirmation + + In the case that an OPEN is retransmitted and the lock_owner is being + used for the first time or the lock_owner state has been previously + released by the server, the use of the OPEN_CONFIRM operation will + prevent incorrect behavior. When the server observes the use of the + lock_owner for the first time, it will direct the client to perform + the OPEN_CONFIRM for the corresponding OPEN. 
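+
+ The at-most-once discipline of sections 8.1.5 through 8.1.8 can be
+ condensed into a sketch. The seqid_state structure is hypothetical,
+ and a real server would also cache and return the full stored
+ response on replay.
+
+    #include <stdint.h>
+
+    struct seqid_state {            /* hypothetical, per lock_owner */
+        uint32_t last_seqid;        /* L in the text                */
+        void    *cached_reply;      /* stored response to L         */
+    };
+
+    enum seq_action { SEQ_REPLAY, SEQ_PROCESS, SEQ_BAD_SEQID };
+
+    /* Classify an incoming seqid r against the per-lock_owner
+     * state.  All arithmetic is modulo 2^32, which uint32_t
+     * provides naturally. */
+    enum seq_action check_seqid(const struct seqid_state *s,
+                                uint32_t r)
+    {
+        if (r == s->last_seqid)         /* duplicate of last request */
+            return SEQ_REPLAY;
+        if (r == s->last_seqid + 1u)    /* next request in order     */
+            return SEQ_PROCESS;
+        return SEQ_BAD_SEQID;           /* reject: NFS4ERR_BAD_SEQID */
+    }
+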
+ This OPEN/OPEN_CONFIRM sequence establishes the use of a lock_owner
+ and its associated sequence number. Since the OPEN_CONFIRM sequence
+ connects a new open_owner on the server with an existing open_owner
+ on a client, the sequence number may have any value. The
+ OPEN_CONFIRM step assures the server that the value received is the
+ correct one. See the section "OPEN_CONFIRM - Confirm Open" for
+ further details.
+
+ There are a number of situations in which the requirement to confirm
+ an OPEN would pose difficulties for the client and server, in that
+ they would be prevented from acting in a timely fashion on
+ information received, because that information would be provisional,
+ subject to deletion upon non-confirmation. Fortunately, these are
+ situations in which the server can avoid the need for confirmation
+ when responding to open requests. The two constraints are:
+
+ o The server must not bestow a delegation for any open which would
+ require confirmation.
+
+ o The server MUST NOT require confirmation on a reclaim-type open
+ (i.e., one specifying claim type CLAIM_PREVIOUS or
+ CLAIM_DELEGATE_PREV).
+
+ These constraints are related in that reclaim-type opens are the only
+ ones in which the server may be required to send a delegation. For
+ CLAIM_NULL, sending the delegation is optional, while for
+ CLAIM_DELEGATE_CUR, no delegation is sent.
+
+ Delegations being sent with an open requiring confirmation are
+ troublesome because recovering from non-confirmation adds undue
+ complexity to the protocol, while requiring confirmation on reclaim-
+ type opens poses difficulties in that the inability to resolve
+
+
+
+
+
+Shepler, et al. Standards Track [Page 75]
+
+RFC 3530 NFS version 4 Protocol April 2003
+
+
+ the status of the reclaim until lease expiration may make it
+ difficult to have timely determination of the set of locks being
+ reclaimed (since the grace period may expire).
+
+ Requiring open confirmation on reclaim-type opens is avoidable
+ because of the nature of the environments in which such opens are
+ done. For CLAIM_PREVIOUS opens, this is immediately after server
+ reboot, so there should be no time for lockowners to be created,
+ found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we
+ are dealing with a client reboot situation. A server which supports
+ delegation can be sure that no lockowners for that client have been
+ recycled since client initialization and thus can ensure that
+ confirmation will not be required.
+
+8.2. Lock Ranges
+
+ The protocol allows a lock owner to request a lock with a byte range
+ and then either upgrade or unlock a sub-range of the initial lock.
+ It is expected that this will be an uncommon type of request. In any
+ case, servers or server filesystems may not be able to support sub-
+ range lock semantics. In the event that a server receives a locking
+ request that represents a sub-range of current locking state for the
+ lock owner, the server is allowed to return the error
+ NFS4ERR_LOCK_RANGE to signify that it does not support sub-range lock
+ operations. Therefore, the client should be prepared to receive this
+ error and, if appropriate, report the error to the requesting
+ application.
+
+ The client is discouraged from combining multiple independent locking
+ ranges that happen to be adjacent into a single request since the
+ server may not support sub-range requests and for reasons related to
+ the recovery of file locking state in the event of server failure.
+ As discussed in the section "Server Failure and Recovery" below, the + server may employ certain optimizations during recovery that work + effectively only when the client's behavior during lock recovery is + similar to the client's locking behavior prior to server failure. + +8.3. Upgrading and Downgrading Locks + + If a client has a write lock on a record, it can request an atomic + downgrade of the lock to a read lock via the LOCK request, by setting + the type to READ_LT. If the server supports atomic downgrade, the + request will succeed. If not, it will return NFS4ERR_LOCK_NOTSUPP. + The client should be prepared to receive this error, and if + appropriate, report the error to the requesting application. + + + + + + +Shepler, et al. Standards Track [Page 76] + +RFC 3530 NFS version 4 Protocol April 2003 + + + If a client has a read lock on a record, it can request an atomic + upgrade of the lock to a write lock via the LOCK request by setting + the type to WRITE_LT or WRITEW_LT. If the server does not support + atomic upgrade, it will return NFS4ERR_LOCK_NOTSUPP. If the upgrade + can be achieved without an existing conflict, the request will + succeed. Otherwise, the server will return either NFS4ERR_DENIED or + NFS4ERR_DEADLOCK. The error NFS4ERR_DEADLOCK is returned if the + client issued the LOCK request with the type set to WRITEW_LT and the + server has detected a deadlock. The client should be prepared to + receive such errors and if appropriate, report the error to the + requesting application. + +8.4. Blocking Locks + + Some clients require the support of blocking locks. The NFS version + 4 protocol must not rely on a callback mechanism and therefore is + unable to notify a client when a previously denied lock has been + granted. Clients have no choice but to continually poll for the + lock. This presents a fairness problem. Two new lock types are + added, READW and WRITEW, and are used to indicate to the server that + the client is requesting a blocking lock. The server should maintain + an ordered list of pending blocking locks. When the conflicting lock + is released, the server may wait the lease period for the first + waiting client to re-request the lock. After the lease period + expires the next waiting client request is allowed the lock. Clients + are required to poll at an interval sufficiently small that it is + likely to acquire the lock in a timely manner. The server is not + required to maintain a list of pending blocked locks as it is used to + increase fairness and not correct operation. Because of the + unordered nature of crash recovery, storing of lock state to stable + storage would be required to guarantee ordered granting of blocking + locks. + + Servers may also note the lock types and delay returning denial of + the request to allow extra time for a conflicting lock to be + released, allowing a successful return. In this way, clients can + avoid the burden of needlessly frequent polling for blocking locks. + The server should take care in the length of delay in the event the + client retransmits the request. + +8.5. Lease Renewal + + The purpose of a lease is to allow a server to remove stale locks + that are held by a client that has crashed or is otherwise + unreachable. It is not a mechanism for cache consistency and lease + renewals may not be denied if the lease interval has not expired. + + + + + +Shepler, et al. 
Standards Track [Page 77] + +RFC 3530 NFS version 4 Protocol April 2003 + + + The following events cause implicit renewal of all of the leases for + a given client (i.e., all those sharing a given clientid). Each of + these is a positive indication that the client is still active and + that the associated state held at the server, for the client, is + still valid. + + o An OPEN with a valid clientid. + + o Any operation made with a valid stateid (CLOSE, DELEGPURGE, + DELEGRETURN, LOCK, LOCKU, OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, + READ, RENEW, SETATTR, WRITE). This does not include the special + stateids of all bits 0 or all bits 1. + + Note that if the client had restarted or rebooted, the client + would not be making these requests without issuing the + SETCLIENTID/SETCLIENTID_CONFIRM sequence. The use of the + SETCLIENTID/SETCLIENTID_CONFIRM sequence (one that changes the + client verifier) notifies the server to drop the locking state + associated with the client. SETCLIENTID/SETCLIENTID_CONFIRM never + renews a lease. + + If the server has rebooted, the stateids (NFS4ERR_STALE_STATEID + error) or the clientid (NFS4ERR_STALE_CLIENTID error) will not be + valid hence preventing spurious renewals. + + This approach allows for low overhead lease renewal which scales + well. In the typical case no extra RPC calls are required for lease + renewal and in the worst case one RPC is required every lease period + (i.e., a RENEW operation). The number of locks held by the client is + not a factor since all state for the client is involved with the + lease renewal action. + + Since all operations that create a new lease also renew existing + leases, the server must maintain a common lease expiration time for + all valid leases for a given client. This lease time can then be + easily updated upon implicit lease renewal actions. + +8.6. Crash Recovery + + The important requirement in crash recovery is that both the client + and the server know when the other has failed. Additionally, it is + required that a client sees a consistent view of data across server + restarts or reboots. All READ and WRITE operations that may have + been queued within the client or network buffers must wait until the + client has successfully recovered the locks protecting the READ and + WRITE operations. + + + + + +Shepler, et al. Standards Track [Page 78] + +RFC 3530 NFS version 4 Protocol April 2003 + + +8.6.1. Client Failure and Recovery + + In the event that a client fails, the server may recover the client's + locks when the associated leases have expired. Conflicting locks + from another client may only be granted after this lease expiration. + If the client is able to restart or reinitialize within the lease + period the client may be forced to wait the remainder of the lease + period before obtaining new locks. + + To minimize client delay upon restart, lock requests are associated + with an instance of the client by a client supplied verifier. This + verifier is part of the initial SETCLIENTID call made by the client. + The server returns a clientid as a result of the SETCLIENTID + operation. The client then confirms the use of the clientid with + SETCLIENTID_CONFIRM. The clientid in combination with an opaque + owner field is then used by the client to identify the lock owner for + OPEN. This chain of associations is then used to identify all locks + for a particular client. 
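+
+ On the server, the implicit renewal described in section 8.5 above
+ reduces to refreshing a single per-client timestamp; the structure
+ below is a hypothetical sketch of that bookkeeping.
+
+    #include <time.h>
+
+    struct client_rec {             /* hypothetical server state */
+        time_t lease_expiry;        /* one common expiry for ALL
+                                       of the client's leases     */
+    };
+
+    /* Called for any operation carrying a valid clientid or a
+     * valid, non-special stateid.  Because all leases of a client
+     * share one expiration time, renewal is O(1) no matter how
+     * many locks the client holds. */
+    void renew_lease(struct client_rec *c, time_t lease_period)
+    {
+        c->lease_expiry = time(NULL) + lease_period;
+    }
+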
+ + Since the verifier will be changed by the client upon each + initialization, the server can compare a new verifier to the verifier + associated with currently held locks and determine that they do not + match. This signifies the client's new instantiation and subsequent + loss of locking state. As a result, the server is free to release + all locks held which are associated with the old clientid which was + derived from the old verifier. + + Note that the verifier must have the same uniqueness properties of + the verifier for the COMMIT operation. + +8.6.2. Server Failure and Recovery + + If the server loses locking state (usually as a result of a restart + or reboot), it must allow clients time to discover this fact and re- + establish the lost locking state. The client must be able to re- + establish the locking state without having the server deny valid + requests because the server has granted conflicting access to another + client. Likewise, if there is the possibility that clients have not + yet re-established their locking state for a file, the server must + disallow READ and WRITE operations for that file. The duration of + this recovery period is equal to the duration of the lease period. + + A client can determine that server failure (and thus loss of locking + state) has occurred, when it receives one of two errors. The + NFS4ERR_STALE_STATEID error indicates a stateid invalidated by a + reboot or restart. The NFS4ERR_STALE_CLIENTID error indicates a + + + + + +Shepler, et al. Standards Track [Page 79] + +RFC 3530 NFS version 4 Protocol April 2003 + + + clientid invalidated by reboot or restart. When either of these are + received, the client must establish a new clientid (See the section + "Client ID") and re-establish the locking state as discussed below. + + The period of special handling of locking and READs and WRITEs, equal + in duration to the lease period, is referred to as the "grace + period". During the grace period, clients recover locks and the + associated state by reclaim-type locking requests (i.e., LOCK + requests with reclaim set to true and OPEN operations with a claim + type of CLAIM_PREVIOUS). During the grace period, the server must + reject READ and WRITE operations and non-reclaim locking requests + (i.e., other LOCK and OPEN operations) with an error of + NFS4ERR_GRACE. + + If the server can reliably determine that granting a non-reclaim + request will not conflict with reclamation of locks by other clients, + the NFS4ERR_GRACE error does not have to be returned and the non- + reclaim client request can be serviced. For the server to be able to + service READ and WRITE operations during the grace period, it must + again be able to guarantee that no possible conflict could arise + between an impending reclaim locking request and the READ or WRITE + operation. If the server is unable to offer that guarantee, the + NFS4ERR_GRACE error must be returned to the client. + + For a server to provide simple, valid handling during the grace + period, the easiest method is to simply reject all non-reclaim + locking requests and READ and WRITE operations by returning the + NFS4ERR_GRACE error. However, a server may keep information about + granted locks in stable storage. With this information, the server + could determine if a regular lock or READ or WRITE operation can be + safely processed. 
For example, if a count of locks on a given file is available in stable storage, the server can track reclaimed locks for the file, and when all reclaims have been processed, non-reclaim locking requests may be processed. This way the server can ensure that non-reclaim locking requests will not conflict with potential reclaim requests. With respect to I/O requests, if the server is able to determine from stable storage or another similar mechanism that there are no outstanding reclaim requests for a file, the processing of I/O requests for that file may proceed normally.

To reiterate, a server that allows non-reclaim lock and I/O requests to be processed during the grace period MUST determine that no lock subsequently reclaimed will be rejected and that no lock subsequently reclaimed would have prevented any I/O operation processed during the grace period.

Clients should be prepared for the return of NFS4ERR_GRACE errors for non-reclaim lock and I/O requests. In this case the client should employ a retry mechanism for the request. A delay (on the order of several seconds) between retries should be used to avoid overwhelming the server. Further discussion of the general issue is included in [Floyd]. The client must account both for servers that are able to perform I/O and non-reclaim locking requests within the grace period and for those that cannot.

A reclaim-type locking request outside the server's grace period can only succeed if the server can guarantee that no conflicting lock or I/O request has been granted since reboot or restart.

A server may, upon restart, establish a new value for the lease period. Therefore, clients should, once a new clientid is established, refetch the lease_time attribute and use it as the basis for lease renewal for the lease associated with that server. However, the server must establish, for this restart event, a grace period at least as long as the lease period for the previous server instantiation. This allows the client state obtained during the previous server instance to be reliably re-established.

8.6.3. Network Partitions and Recovery

If the duration of a network partition is greater than the lease period provided by the server, the server will not have received a lease renewal from the client. If this occurs, the server may free all locks held for the client. As a result, all stateids held by the client will become invalid or stale. Once the client is able to reach the server after such a network partition, all I/O submitted by the client with the now invalid stateids will fail, with the server returning the error NFS4ERR_EXPIRED. Once this error is received, the client will suitably notify the application that held the lock.

As a courtesy to the client or as an optimization, the server may continue to hold locks on behalf of a client for which recent communication has extended beyond the lease period. If the server receives a lock or I/O request that conflicts with one of these courtesy locks, the server must free the courtesy lock and grant the new request.
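A minimal sketch of the courtesy-lock rule above, with hypothetical structures and helpers:

   /* Courtesy locks: locks retained past lease expiry must yield to
    * any conflicting request.  Illustrative only.
    */
   #include <stdbool.h>

   struct nfs4_lock {
       bool courtesy;             /* lease expired, lock retained     */
       /* ... range, type, owner ... */
   };

   extern void free_lock(struct nfs4_lock *lk);   /* hypothetical     */

   /* Called when a new lock or I/O request conflicts with lk.
    * Returns true if the new request may be granted.
    */
   static bool resolve_conflict(struct nfs4_lock *lk)
   {
       if (lk->courtesy) {
           free_lock(lk);         /* drop the courtesy lock           */
           return true;           /* and grant the new request        */
       }
       return false;              /* genuine conflict: deny           */
   }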
When a network partition is combined with a server reboot, there are edge conditions that place requirements on the server in order to avoid silent data corruption following the server reboot. Two of these edge conditions are known and are discussed below.

The first edge condition has the following scenario:

1. Client A acquires a lock.

2. Client A and the server experience a mutual network partition, such that client A is unable to renew its lease.

3. Client A's lease expires, so the server releases the lock.

4. Client B acquires a lock that would have conflicted with that of Client A.

5. Client B releases the lock.

6. The server reboots.

7. The network partition between client A and the server heals.

8. Client A issues a RENEW operation and gets back NFS4ERR_STALE_CLIENTID.

9. Client A reclaims its lock within the server's grace period.

Thus, at the final step, the server has erroneously granted client A's lock reclaim. If client B modified the object the lock was protecting, client A will experience object corruption.

The second known edge condition follows:

1. Client A acquires a lock.

2. The server reboots.

3. Client A and the server experience a mutual network partition, such that client A is unable to reclaim its lock within the grace period.

4. The server's reclaim grace period ends. Client A has no locks recorded on the server.

5. Client B acquires a lock that would have conflicted with that of Client A.

6. Client B releases the lock.

7. The server reboots a second time.

8. The network partition between client A and the server heals.

9. Client A issues a RENEW operation and gets back NFS4ERR_STALE_CLIENTID.

10. Client A reclaims its lock within the server's grace period.

As with the first edge condition, the final step of the second scenario has the server erroneously granting client A's lock reclaim.

Solving the first and second edge conditions requires that the server either assume after it reboots that an edge condition has occurred, and thus return NFS4ERR_NO_GRACE for all reclaim attempts, or that the server record some information in stable storage. The amount of information the server records in stable storage is in inverse proportion to how harsh the server wants to be whenever the edge conditions occur. A server that is completely tolerant of all edge conditions will record in stable storage every lock that is acquired, removing the lock record from stable storage only when the lock is unlocked by the client and the lock's lockowner advances the sequence number such that the lock release is not the last stateful event for the lockowner's sequence. For the two aforementioned edge conditions, the harshest a server can be, and still support a grace period for reclaims, requires that the server record some minimal information in stable storage. For example, a server implementation could, for each client, save in stable storage a record containing (a C-style sketch of such a record follows this list):

o the client's id string

o a boolean that indicates if the client's lease expired or if there was administrative intervention (see the section "Server Revocation of Locks") to revoke a record lock, share reservation, or delegation

o a timestamp that is updated the first time, after a server boot or reboot, that the client acquires record locking, share reservation, or delegation state on the server. The timestamp need not be updated on subsequent lock requests until the server reboots.
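The per-client record just described might look like the following C sketch; the field names are hypothetical, and the on-disk encoding is an implementation choice:

   /* Illustrative stable-storage record, one per client. */
   #include <stdbool.h>
   #include <time.h>

   struct client_stable_record {
       char   *id_string;          /* the client's id string          */
       bool    expired_or_revoked; /* lease expired, or a lock, share */
                                   /* reservation, or delegation was  */
                                   /* administratively revoked        */
       time_t  first_state_time;   /* set when the client first       */
                                   /* acquires locking state after a  */
                                   /* server boot; not updated again  */
                                   /* until the next reboot           */
   };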
The server implementation would also record in stable storage the timestamps from the two most recent server reboots.

Assuming the above record keeping, for the first edge condition, after the server reboots, the record that client A's lease expired means that another client could have acquired a conflicting record lock, share reservation, or delegation. Hence, the server must reject a reclaim from client A with the error NFS4ERR_NO_GRACE.

For the second edge condition, after the server reboots for a second time, the record that the client had an unexpired record lock, share reservation, or delegation established before the server's previous incarnation means that the server must reject a reclaim from client A with the error NFS4ERR_NO_GRACE.

Regardless of the level and approach to record keeping, the server MUST implement one of the following strategies (which apply to reclaims of share reservations, record locks, and delegations):

1. Reject all reclaims with NFS4ERR_NO_GRACE. This is extremely harsh, but necessary if the server does not want to record lock state in stable storage.

2. Record sufficient state in stable storage such that all known edge conditions involving server reboot, including the two noted in this section, are detected. False positives are acceptable. Note that at this time, it is not known if there are other edge conditions.

If, after a server reboot, the server determines that there is unrecoverable damage or corruption to the stable storage, then for all clients and/or locks affected, the server MUST return NFS4ERR_NO_GRACE.

A mandate for the client's handling of the NFS4ERR_NO_GRACE error is outside the scope of this specification, since the strategies for such handling are very dependent on the client's operating environment. However, one potential approach is described below.

When the client receives NFS4ERR_NO_GRACE, it could examine the change attribute of the objects for which it is trying to reclaim state, and use that to determine whether to re-establish the state via normal OPEN or LOCK requests. This is acceptable provided the client's operating environment allows it. In other words, the client implementor is advised to document this behavior for users. The client could also inform the application that its record lock or share reservations (whether they were delegated or not) have been lost, such as via a UNIX signal, a GUI pop-up window, etc. See the section "Data Caching and Revocation" for a discussion of what the client should do for dealing with unreclaimed delegations on client state.

For further discussion of revocation of locks see the section "Server Revocation of Locks".
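Tying the pieces together, a reclaim check over the record described above might read as follows. The error code values come from this document's error definitions; everything else is an illustrative assumption:

   /* Illustrative reclaim gate for the two known edge conditions.
    * boot_time[0] is the most recent server reboot and boot_time[1]
    * the one before it, both read from stable storage.
    */
   #include <stdbool.h>
   #include <time.h>

   #define NFS4_OK           0
   #define NFS4ERR_NO_GRACE  10033

   struct client_stable_record {
       bool   expired_or_revoked;
       time_t first_state_time;
   };

   static int check_reclaim(const struct client_stable_record *rec,
                            const time_t boot_time[2])
   {
       /* First edge condition: the lease expired (or state was
        * revoked) while the client was partitioned, so a conflicting
        * lock may have been granted and released in the interim.
        */
       if (rec->expired_or_revoked)
           return NFS4ERR_NO_GRACE;

       /* Second edge condition: the client's state predates the
        * previous server incarnation, so a whole grace period was
        * missed and a conflicting lock may have come and gone.
        */
       if (rec->first_state_time < boot_time[1])
           return NFS4ERR_NO_GRACE;

       return NFS4_OK;             /* the reclaim may be processed    */
   }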
8.7. Recovery from a Lock Request Timeout or Abort

In the event a lock request times out, a client may decide not to retry the request. The client may also abort the request when the process for which it was issued is terminated (e.g., in UNIX, due to a signal). It is possible, though, that the server received the request and acted upon it. This would change the state on the server without the client being aware of the change. It is paramount that the client re-synchronize state with the server before it attempts any other operation that takes a seqid and/or a stateid with the same lock_owner. This is straightforward to do without a special re-synchronize operation.

Since the server maintains, for each lock_owner, the last lock request and response received, the client should cache the last lock request it sent for which it did not receive a response. From this, the next time the client does a lock operation for the lock_owner, it can send the cached request, if there is one, and if the request was one that established state (e.g., a LOCK or OPEN operation), the server will return the cached result or, if it never saw the request, perform it. The client can follow up with a request to remove the state (e.g., a LOCKU or CLOSE operation). With this approach, the sequencing and stateid information on the client and server for the given lock_owner will re-synchronize, and in turn the lock state will re-synchronize.
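A client-side sketch of this re-synchronization follows; the types and helper functions are hypothetical:

   /* Illustrative replay of the last unanswered lock-state request. */
   #include <stddef.h>

   struct lock_request;                       /* opaque request       */

   struct lock_owner_ctx {
       struct lock_request *unanswered;       /* last request sent    */
   };                                         /* with no reply seen   */

   extern void send_and_wait(struct lock_request *req);
   extern int  establishes_state(const struct lock_request *req);
   extern void undo_request(struct lock_request *req); /* LOCKU/CLOSE */

   /* Call before any new operation that takes a seqid or stateid for
    * this lock_owner.
    */
   static void resync(struct lock_owner_ctx *lo)
   {
       if (lo->unanswered == NULL)
           return;                     /* already in sync             */

       /* Replay: the server either returns its cached reply or, if
        * it never saw the request, performs it now.
        */
       send_and_wait(lo->unanswered);

       /* If the replayed request established state (LOCK or OPEN),
        * remove that state again, since the application gave up on
        * the original request.
        */
       if (establishes_state(lo->unanswered))
           undo_request(lo->unanswered);

       lo->unanswered = NULL;          /* seqid/stateid now in sync   */
   }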
8.8. Server Revocation of Locks

At any point, the server can revoke locks held by a client, and the client must be prepared for this event. When the client detects that its locks have been or may have been revoked, the client is responsible for validating the state information between itself and the server. Validating locking state for the client means that it must verify or reclaim state for each lock currently held.

The first instance of lock revocation is upon server reboot or re-initialization. In this instance the client will receive an error (NFS4ERR_STALE_STATEID or NFS4ERR_STALE_CLIENTID) and the client will proceed with normal crash recovery as described in the previous section.

The second lock revocation event is the inability to renew the lease before expiration. While this is considered a rare or unusual event, the client must be prepared to recover. Both the server and client will be able to detect the failure to renew the lease and are capable of recovering without data corruption. For the server, it tracks the last renewal event serviced for the client and knows when the lease will expire. Similarly, the client must track operations which will renew the lease period. Using the time that each such request was sent and the time that the corresponding reply was received, the client should bound the time that the corresponding renewal could have occurred on the server and thus determine if it is possible that a lease period expiration could have occurred.

The third lock revocation event can occur as a result of administrative intervention within the lease period. While this is considered a rare event, it is possible that the server's administrator has decided to release or revoke a particular lock held by the client. As a result of revocation, the client will receive an error of NFS4ERR_ADMIN_REVOKED. In this instance the client may assume that only the lock_owner's locks have been lost. The client notifies the lock holder appropriately. The client may not assume the lease period has been renewed as a result of a failed operation.

When the client determines the lease period may have expired, the client must mark all locks held for the associated lease as "unvalidated". This means the client has been unable to re-establish or confirm the appropriate lock state with the server. As described in the previous section on crash recovery, there are scenarios in which the server may grant conflicting locks after the lease period has expired for a client. When it is possible that the lease period has expired, the client must validate each lock currently held to ensure that a conflicting lock has not been granted. The client may accomplish this task by issuing an I/O request, either a pending I/O or a zero-length read, specifying the stateid associated with the lock in question. If the response to the request is success, the client has validated all of the locks governed by that stateid and re-established the appropriate state between itself and the server.

If the I/O request is not successful, then one or more of the locks associated with the stateid was revoked by the server and the client must notify the owner.

8.9. Share Reservations

A share reservation is a mechanism to control access to a file. It is a separate and independent mechanism from record locking. When a client opens a file, it issues an OPEN operation to the server specifying the type of access required (READ, WRITE, or BOTH) and the type of access to deny others (deny NONE, READ, WRITE, or BOTH). If the OPEN fails, the client will fail the application's open request.

Pseudo-code definition of the semantics:

   if (request.access == 0)
           return (NFS4ERR_INVAL)
   else if ((request.access & file_state.deny) ||
            (request.deny & file_state.access))
           return (NFS4ERR_DENIED)

This checking of share reservations on OPEN is done with no exception for an existing OPEN for the same open_owner.

The constants used for the OPEN and OPEN_DOWNGRADE operations for the access and deny fields are as follows:

   const OPEN4_SHARE_ACCESS_READ   = 0x00000001;
   const OPEN4_SHARE_ACCESS_WRITE  = 0x00000002;
   const OPEN4_SHARE_ACCESS_BOTH   = 0x00000003;

   const OPEN4_SHARE_DENY_NONE     = 0x00000000;
   const OPEN4_SHARE_DENY_READ     = 0x00000001;
   const OPEN4_SHARE_DENY_WRITE    = 0x00000002;
   const OPEN4_SHARE_DENY_BOTH     = 0x00000003;
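The semantics above can be exercised with a small self-contained example. Only the OPEN4_SHARE_* constants and the error codes below come from the protocol; the helper and its types are illustrative:

   /* Illustrative share-reservation check using the constants above.
    * file_state holds the union of the access and deny bits of all
    * current opens of the file.
    */
   #include <stdint.h>

   #define OPEN4_SHARE_ACCESS_READ   0x00000001
   #define OPEN4_SHARE_ACCESS_WRITE  0x00000002
   #define OPEN4_SHARE_DENY_NONE     0x00000000
   #define OPEN4_SHARE_DENY_WRITE    0x00000002

   #define NFS4_OK         0          /* values from this document's  */
   #define NFS4ERR_INVAL   22         /* error definitions            */
   #define NFS4ERR_DENIED  10010

   struct share_state { uint32_t access, deny; };

   static int share_check(const struct share_state *file_state,
                          uint32_t access, uint32_t deny)
   {
       if (access == 0)
           return NFS4ERR_INVAL;
       if ((access & file_state->deny) || (deny & file_state->access))
           return NFS4ERR_DENIED;
       return NFS4_OK;
   }

For instance, if one client already holds an OPEN with ACCESS_WRITE and DENY_NONE, a second OPEN requesting ACCESS_READ with DENY_WRITE fails with NFS4ERR_DENIED, because the deny bits of the new request overlap the access bits already granted.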
8.10. OPEN/CLOSE Operations

To provide correct share semantics, a client MUST use the OPEN operation to obtain the initial filehandle and indicate the desired access and what, if any, access to deny. Even if the client intends to use a stateid of all 0's or all 1's, it must still obtain the filehandle for the regular file with the OPEN operation so the appropriate share semantics can be applied. For clients that do not have a deny mode built into their open programming interfaces, deny equal to NONE should be used.

The OPEN operation with the CREATE flag also subsumes the CREATE operation for regular files as used in previous versions of the NFS protocol. This allows a create with a share to be done atomically.

The CLOSE operation removes all share reservations held by the lock_owner on that file. If record locks are held, the client SHOULD release all locks before issuing a CLOSE. The server MAY free all outstanding locks on CLOSE, but some servers may not support the CLOSE of a file that still has record locks held. The server MUST return failure, NFS4ERR_LOCKS_HELD, if any locks would exist after the CLOSE.

The LOOKUP operation will return a filehandle without establishing any lock state on the server. Without a valid stateid, the server will assume the client has the least access. For example, a file opened with deny READ/WRITE cannot be accessed using a filehandle obtained through LOOKUP because it would not have a valid stateid (i.e., it would be using a stateid of all bits 0 or all bits 1).

8.10.1. Close and Retention of State Information

Since a CLOSE operation requests deallocation of a stateid, dealing with retransmission of the CLOSE may pose special difficulties, since the state information, which normally would be used to determine the state of the open file being designated, might be deallocated, resulting in an NFS4ERR_BAD_STATEID error.

Servers may deal with this problem in a number of ways. To provide the greatest degree of assurance that the protocol is being used properly, a server should, rather than deallocate the stateid, mark it as close-pending and retain the stateid with this status until later deallocation. In this way, a retransmitted CLOSE can be recognized since the stateid points to state information with this distinctive status, so that it can be handled without error.

When adopting this strategy, a server should retain the state information until the earliest of:

o Another validly sequenced request for the same lockowner that is not a retransmission.

o The time that a lockowner is freed by the server due to a period with no activity.

o All locks for the client being freed as a result of a SETCLIENTID.

Servers may avoid this complexity, at the cost of less complete protocol error checking, by simply responding with NFS4_OK in the event of a CLOSE for a deallocated stateid, on the assumption that this case must be caused by a retransmitted CLOSE. When adopting this approach, it is desirable to at least log an error when returning a no-error indication in this situation. If the server maintains a reply-cache mechanism, it can verify that the CLOSE is indeed a retransmission and avoid error logging in most cases.
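A sketch of the close-pending strategy described above, with hypothetical structures (the error code values come from this document's error definitions):

   /* Illustrative handling of a (possibly retransmitted) CLOSE. */
   #include <stddef.h>

   #define NFS4_OK              0
   #define NFS4ERR_BAD_STATEID  10025

   enum sid_status { SID_ACTIVE, SID_CLOSE_PENDING };

   struct stateid_rec {
       enum sid_status status;
       /* ... open-file state ... */
   };

   static int do_close(struct stateid_rec *sid)
   {
       if (sid == NULL)                  /* stateid truly unknown     */
           return NFS4ERR_BAD_STATEID;

       if (sid->status == SID_CLOSE_PENDING)
           return NFS4_OK;               /* a retransmitted CLOSE     */

       sid->status = SID_CLOSE_PENDING;  /* retain until a validly    */
                                         /* sequenced new request,    */
                                         /* lockowner expiry, or a    */
                                         /* SETCLIENTID frees it      */
       return NFS4_OK;
   }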
8.11. Open Upgrade and Downgrade

When an OPEN is done for a file and the lockowner for which the open is being done already has the file open, the result is to upgrade the open file status maintained on the server to include the access and deny bits specified by the new OPEN as well as those for the existing OPEN. The result is that there is one open file, as far as the protocol is concerned, and it includes the union of the access and deny bits for all of the OPEN requests completed. Only a single CLOSE will be done to reset the effects of both OPENs. Note that the client, when issuing the OPEN, may not know that the same file is in fact being opened. The above only applies if both OPENs result in the OPENed object being designated by the same filehandle.

When the server chooses to export multiple filehandles corresponding to the same file object and returns different filehandles on two different OPENs of the same file object, the server MUST NOT "OR" together the access and deny bits and coalesce the two open files. Instead, the server must maintain separate OPENs with separate stateids and will require separate CLOSEs to free them.

When multiple open files on the client are merged into a single open file object on the server, the close of one of the open files (on the client) may necessitate a change of the access and deny status of the open file on the server. This is because the union of the access and deny bits for the remaining opens may be smaller (i.e., a proper subset) than previously. The OPEN_DOWNGRADE operation is used to make the necessary change, and the client should use it to update the server so that share reservation requests by other clients are handled properly.

8.12. Short and Long Leases

When determining the time period for the server lease, the usual lease tradeoffs apply. Short leases are good for fast server recovery at a cost of increased RENEW or READ (with zero length) requests. Longer leases are certainly kinder and gentler to servers trying to handle very large numbers of clients. The number of RENEW requests drops in proportion to the lease time. The disadvantages of long leases are slower recovery after server failure (the server must wait for the leases to expire and the grace period to elapse before granting new lock requests) and increased file contention (if the client fails to transmit an unlock request, the server must wait for lease expiration before granting new locks).

Long leases are usable if the server is able to store lease state in non-volatile memory. Upon recovery, the server can reconstruct the lease state from its non-volatile memory and continue operation with its clients; therefore, long leases would not be an issue.

8.13. Clocks, Propagation Delay, and Calculating Lease Expiration

To avoid the need for synchronized clocks, lease times are granted by the server as a time delta. However, there is a requirement that the client and server clocks do not drift excessively over the duration of the lock. There is also the issue of propagation delay across the network, which could easily be several hundred milliseconds, as well as the possibility that requests will be lost and need to be retransmitted.

To take propagation delay into account, the client should subtract it from lease times (e.g., if the client estimates the one-way propagation delay as 200 msec, then it can assume that the lease is already 200 msec old when it gets it). In addition, it will take another 200 msec to get a response back to the server. So the client must send a lock renewal or write data back to the server 400 msec before the lease would expire.

The server's lease period configuration should take into account the network distance of the clients that will be accessing the server's resources. It is expected that the lease period will take into account the network propagation delays and other network delay factors for the client population. Since the protocol does not allow for an automatic method to determine an appropriate lease period, the server's administrator may have to tune the lease period.
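The 400 msec example above generalizes to a small calculation. A client-side sketch, with hypothetical names:

   /* When must a renewal be sent?  The lease is roughly one one-way
    * delay old by the time the grant arrives, and the renewal takes
    * another one-way delay to reach the server, so both are
    * subtracted.  Illustrative only.
    */
   double renew_by(double t_reply_received,  /* local clock, seconds  */
                   double lease_seconds,     /* server's lease delta  */
                   double oneway_delay)      /* estimated, seconds    */
   {
       return t_reply_received + lease_seconds - 2.0 * oneway_delay;
   }

   /* Example: a 90 second lease with an estimated 200 msec one-way
    * delay must be renewed within 89.6 seconds of receiving the
    * grant.
    */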
8.14. Migration, Replication and State

When responsibility for handling a given filesystem is transferred to a new server (migration) or the client chooses to use an alternate server (e.g., in response to server unresponsiveness) in the context of filesystem replication, the appropriate handling of state shared between the client and server (i.e., locks, leases, stateids, and clientids) is as described below. The handling differs between migration and replication. For related discussion of file server state and recovery of such state, see the sections under "File Locking and Share Reservations".

If a server replica or a server immigrating a filesystem agrees to, or is expected to, accept opaque values from the client that originated from another server, then it is a wise implementation practice for the servers to encode the "opaque" values in network byte order. This way, servers acting as replicas or immigrating filesystems will be able to parse values like stateids, directory cookies, filehandles, etc., even if their native byte order is different from that of other servers cooperating in the replication and migration of the filesystem.

8.14.1. Migration and State

In the case of migration, the servers involved in the migration of a filesystem SHOULD transfer all server state from the original to the new server. This must be done in a way that is transparent to the client. This state transfer will ease the client's transition when a filesystem migration occurs. If the servers are successful in transferring all state, the client will continue to use stateids assigned by the original server. Therefore, the new server must recognize these stateids as valid. This holds true for the clientid as well. Since responsibility for an entire filesystem is transferred with a migration event, there is no possibility that conflicts will arise on the new server as a result of the transfer of locks.

As part of the transfer of information between servers, leases would be transferred as well. The leases being transferred to the new server will typically have a different expiration time from those for the same client, previously on the old server. To maintain the property that all leases on a given server for a given client expire at the same time, the server should advance the expiration time to the later of the leases being transferred or the leases already present (see the sketch at the end of this subsection). This allows the client to maintain lease renewal of both classes without special effort.

The servers may choose not to transfer the state information upon migration. However, this choice is discouraged. In this case, when the client presents state information from the original server, the client must be prepared to receive either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server. The client should then recover its state information as it normally would in response to a server failure. The new server must take care to allow for the recovery of state information as it would in the event of server restart.
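The expiration-advancing rule above can be stated compactly; the function name is hypothetical:

   #include <time.h>

   /* On migration, fold a transferred lease into the client's
    * existing lease on the destination server, preserving the
    * property that all of a client's leases expire together.
    */
   time_t merge_lease_expiry(time_t existing_expiry,
                             time_t transferred_expiry)
   {
       return transferred_expiry > existing_expiry
                ? transferred_expiry    /* advance to the later time  */
                : existing_expiry;
   }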
8.14.2. Replication and State

Since client switch-over in the case of replication is not under server control, the handling of state is different. In this case, leases, stateids, and clientids do not have validity across a transition from one server to another. The client must re-establish its locks on the new server. This can be compared to the re-establishment of locks by means of reclaim-type requests after a server reboot. The difference is that the server has no provision to distinguish requests reclaiming locks from those obtaining new locks or to defer the latter. Thus, a client re-establishing a lock on the new server (by means of a LOCK or OPEN request) may have the request denied due to a conflicting lock. Since replication is intended for read-only use of filesystems, such denial of locks should not pose large difficulties in practice. When an attempt to re-establish a lock on a new server is denied, the client should treat the situation as if its original lock had been revoked.

8.14.3. Notification of Migrated Lease

In the case of lease renewal, the client may not be submitting requests for a filesystem that has been migrated to another server. This can occur because of the implicit lease renewal mechanism. The client renews leases for all filesystems when submitting a request to any one filesystem at the server.

In order for the client to schedule renewal of leases that may have been relocated to the new server, the client must find out about lease relocation before those leases expire. To accomplish this, all operations which implicitly renew leases for a client (i.e., OPEN, CLOSE, READ, WRITE, RENEW, LOCK, LOCKT, LOCKU) will return the error NFS4ERR_LEASE_MOVED if responsibility for any of the leases to be renewed has been transferred to a new server. This condition will continue until the client receives an NFS4ERR_MOVED error and the server receives the subsequent GETATTR(fs_locations) for an access to each filesystem for which a lease has been moved to a new server.

When a client receives an NFS4ERR_LEASE_MOVED error, it should perform an operation on each filesystem associated with the server in question. When the client receives an NFS4ERR_MOVED error, the client can follow the normal process to obtain the new server information (through the fs_locations attribute) and perform renewal of those leases on the new server. If the server has not had state transferred to it transparently, the client will receive either NFS4ERR_STALE_CLIENTID or NFS4ERR_STALE_STATEID from the new server, as described above, and the client can then recover state information as it does in the event of server failure.

8.14.4. Migration and the Lease_time Attribute

In order that the client may appropriately manage its leases in the case of migration, the destination server must establish proper values for the lease_time attribute.

When state is transferred transparently, that state should include the correct value of the lease_time attribute. The lease_time attribute on the destination server must never be less than that on the source, since this would result in premature expiration of leases granted by the source server. Upon migration in which state is transferred transparently, the client is under no obligation to re-fetch the lease_time attribute and may continue to use the value previously fetched (on the source server).

If state has not been transferred transparently (i.e., the client sees a real or simulated server reboot), the client should fetch the value of lease_time on the new (i.e., destination) server, and use it

Shepler, et al.
Standards Track [Page 92] + +RFC 3530 NFS version 4 Protocol April 2003 + + + for subsequent locking requests. However the server must respect a + grace period at least as long as the lease_time on the source server, + in order to ensure that clients have ample time to reclaim their + locks before potentially conflicting non-reclaimed locks are granted. + The means by which the new server obtains the value of lease_time on + the old server is left to the server implementations. It is not + specified by the NFS version 4 protocol. + +9. Client-Side Caching + + Client-side caching of data, of file attributes, and of file names is + essential to providing good performance with the NFS protocol. + Providing distributed cache coherence is a difficult problem and + previous versions of the NFS protocol have not attempted it. + Instead, several NFS client implementation techniques have been used + to reduce the problems that a lack of coherence poses for users. + These techniques have not been clearly defined by earlier protocol + specifications and it is often unclear what is valid or invalid + client behavior. + + The NFS version 4 protocol uses many techniques similar to those that + have been used in previous protocol versions. The NFS version 4 + protocol does not provide distributed cache coherence. However, it + defines a more limited set of caching guarantees to allow locks and + share reservations to be used without destructive interference from + client side caching. + + In addition, the NFS version 4 protocol introduces a delegation + mechanism which allows many decisions normally made by the server to + be made locally by clients. This mechanism provides efficient + support of the common cases where sharing is infrequent or where + sharing is read-only. + +9.1. Performance Challenges for Client-Side Caching + + Caching techniques used in previous versions of the NFS protocol have + been successful in providing good performance. However, several + scalability challenges can arise when those techniques are used with + very large numbers of clients. This is particularly true when + clients are geographically distributed which classically increases + the latency for cache revalidation requests. + + The previous versions of the NFS protocol repeat their file data + cache validation requests at the time the file is opened. This + behavior can have serious performance drawbacks. A common case is + one in which a file is only accessed by a single client. Therefore, + sharing is infrequent. + + + + +Shepler, et al. Standards Track [Page 93] + +RFC 3530 NFS version 4 Protocol April 2003 + + + In this case, repeated reference to the server to find that no + conflicts exist is expensive. A better option with regards to + performance is to allow a client that repeatedly opens a file to do + so without reference to the server. This is done until potentially + conflicting operations from another client actually occur. + + A similar situation arises in connection with file locking. Sending + file lock and unlock requests to the server as well as the read and + write requests necessary to make data caching consistent with the + locking semantics (see the section "Data Caching and File Locking") + can severely limit performance. When locking is used to provide + protection against infrequent conflicts, a large penalty is incurred. + This penalty may discourage the use of file locking by applications. 
+ + The NFS version 4 protocol provides more aggressive caching + strategies with the following design goals: + + o Compatibility with a large range of server semantics. + + o Provide the same caching benefits as previous versions of the NFS + protocol when unable to provide the more aggressive model. + + o Requirements for aggressive caching are organized so that a large + portion of the benefit can be obtained even when not all of the + requirements can be met. + + The appropriate requirements for the server are discussed in later + sections in which specific forms of caching are covered. (see the + section "Open Delegation"). + +9.2. Delegation and Callbacks + + Recallable delegation of server responsibilities for a file to a + client improves performance by avoiding repeated requests to the + server in the absence of inter-client conflict. With the use of a + "callback" RPC from server to client, a server recalls delegated + responsibilities when another client engages in sharing of a + delegated file. + + A delegation is passed from the server to the client, specifying the + object of the delegation and the type of delegation. There are + different types of delegations but each type contains a stateid to be + used to represent the delegation when performing operations that + depend on the delegation. This stateid is similar to those + associated with locks and share reservations but differs in that the + stateid for a delegation is associated with a clientid and may be + + + + + +Shepler, et al. Standards Track [Page 94] + +RFC 3530 NFS version 4 Protocol April 2003 + + + used on behalf of all the open_owners for the given client. A + delegation is made to the client as a whole and not to any specific + process or thread of control within it. + + Because callback RPCs may not work in all environments (due to + firewalls, for example), correct protocol operation does not depend + on them. Preliminary testing of callback functionality by means of a + CB_NULL procedure determines whether callbacks can be supported. The + CB_NULL procedure checks the continuity of the callback path. A + server makes a preliminary assessment of callback availability to a + given client and avoids delegating responsibilities until it has + determined that callbacks are supported. Because the granting of a + delegation is always conditional upon the absence of conflicting + access, clients must not assume that a delegation will be granted and + they must always be prepared for OPENs to be processed without any + delegations being granted. + + Once granted, a delegation behaves in most ways like a lock. There + is an associated lease that is subject to renewal together with all + of the other leases held by that client. + + Unlike locks, an operation by a second client to a delegated file + will cause the server to recall a delegation through a callback. + + On recall, the client holding the delegation must flush modified + state (such as modified data) to the server and return the + delegation. The conflicting request will not receive a response + until the recall is complete. The recall is considered complete when + the client returns the delegation or the server times out on the + recall and revokes the delegation as a result of the timeout. + Following the resolution of the recall, the server has the + information necessary to grant or deny the second client's request. + + At the time the client receives a delegation recall, it may have + substantial state that needs to be flushed to the server. 
Therefore, + the server should allow sufficient time for the delegation to be + returned since it may involve numerous RPCs to the server. If the + server is able to determine that the client is diligently flushing + state to the server as a result of the recall, the server may extend + the usual time allowed for a recall. However, the time allowed for + recall completion should not be unbounded. + + An example of this is when responsibility to mediate opens on a given + file is delegated to a client (see the section "Open Delegation"). + The server will not know what opens are in effect on the client. + Without this knowledge the server will be unable to determine if the + access and deny state for the file allows any particular open until + the delegation for the file has been returned. + + + +Shepler, et al. Standards Track [Page 95] + +RFC 3530 NFS version 4 Protocol April 2003 + + + A client failure or a network partition can result in failure to + respond to a recall callback. In this case, the server will revoke + the delegation which in turn will render useless any modified state + still on the client. + +9.2.1. Delegation Recovery + + There are three situations that delegation recovery must deal with: + + o Client reboot or restart + + o Server reboot or restart + + o Network partition (full or callback-only) + + In the event the client reboots or restarts, the failure to renew + leases will result in the revocation of record locks and share + reservations. Delegations, however, may be treated a bit + differently. + + There will be situations in which delegations will need to be + reestablished after a client reboots or restarts. The reason for + this is the client may have file data stored locally and this data + was associated with the previously held delegations. The client will + need to reestablish the appropriate file state on the server. + + To allow for this type of client recovery, the server MAY extend the + period for delegation recovery beyond the typical lease expiration + period. This implies that requests from other clients that conflict + with these delegations will need to wait. Because the normal recall + process may require significant time for the client to flush changed + state to the server, other clients need be prepared for delays that + occur because of a conflicting delegation. This longer interval + would increase the window for clients to reboot and consult stable + storage so that the delegations can be reclaimed. For open + delegations, such delegations are reclaimed using OPEN with a claim + type of CLAIM_DELEGATE_PREV. (See the sections on "Data Caching and + Revocation" and "Operation 18: OPEN" for discussion of open + delegation and the details of OPEN respectively). + + A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it + does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and + instead MUST, for a period of time no less than that of the value of + the lease_time attribute, maintain the client's delegations to allow + time for the client to issue CLAIM_DELEGATE_PREV requests. The + server that supports CLAIM_DELEGATE_PREV MUST support the DELEGPURGE + operation. + + + + +Shepler, et al. Standards Track [Page 96] + +RFC 3530 NFS version 4 Protocol April 2003 + + + When the server reboots or restarts, delegations are reclaimed (using + the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to + record locks and share reservations. However, there is a slight + semantic difference. 
In the normal case if the server decides that a + delegation should not be granted, it performs the requested action + (e.g., OPEN) without granting any delegation. For reclaim, the + server grants the delegation but a special designation is applied so + that the client treats the delegation as having been granted but + recalled by the server. Because of this, the client has the duty to + write all modified state to the server and then return the + delegation. This process of handling delegation reclaim reconciles + three principles of the NFS version 4 protocol: + + o Upon reclaim, a client reporting resources assigned to it by an + earlier server instance must be granted those resources. + + o The server has unquestionable authority to determine whether + delegations are to be granted and, once granted, whether they are + to be continued. + + o The use of callbacks is not to be depended upon until the client + has proven its ability to receive them. + + When a network partition occurs, delegations are subject to freeing + by the server when the lease renewal period expires. This is similar + to the behavior for locks and share reservations. For delegations, + however, the server may extend the period in which conflicting + requests are held off. Eventually the occurrence of a conflicting + request from another client will cause revocation of the delegation. + A loss of the callback path (e.g., by later network configuration + change) will have the same effect. A recall request will fail and + revocation of the delegation will result. + + A client normally finds out about revocation of a delegation when it + uses a stateid associated with a delegation and receives the error + NFS4ERR_EXPIRED. It also may find out about delegation revocation + after a client reboot when it attempts to reclaim a delegation and + receives that same error. Note that in the case of a revoked write + open delegation, there are issues because data may have been modified + by the client whose delegation is revoked and separately by other + clients. See the section "Revocation Recovery for Write Open + Delegation" for a discussion of such issues. Note also that when + delegations are revoked, information about the revoked delegation + will be written by the server to stable storage (as described in the + section "Crash Recovery"). This is done to deal with the case in + which a server reboots after revoking a delegation but before the + client holding the revoked delegation is notified about the + revocation. + + + +Shepler, et al. Standards Track [Page 97] + +RFC 3530 NFS version 4 Protocol April 2003 + + +9.3. Data Caching + + When applications share access to a set of files, they need to be + implemented so as to take account of the possibility of conflicting + access by another application. This is true whether the applications + in question execute on different clients or reside on the same + client. + + Share reservations and record locks are the facilities the NFS + version 4 protocol provides to allow applications to coordinate + access by providing mutual exclusion facilities. The NFS version 4 + protocol's data caching must be implemented such that it does not + invalidate the assumptions that those using these facilities depend + upon. + +9.3.1. 
Data Caching and OPENs

In order to avoid invalidating the sharing assumptions that applications rely on, NFS version 4 clients should not provide cached data to applications or modify it on behalf of an application when it would not be valid to obtain or modify that same data via a READ or WRITE operation.

Furthermore, in the absence of open delegation (see the section "Open Delegation"), two additional rules apply. Note that these rules are obeyed in practice by many NFS version 2 and version 3 clients.

o First, cached data present on a client must be revalidated after doing an OPEN. Revalidating means that the client fetches the change attribute from the server, compares it with the cached change attribute, and if different, declares the cached data (as well as the cached attributes) as invalid. This is to ensure that the data for the OPENed file is still correctly reflected in the client's cache. This validation must be done at least when the client's OPEN operation includes DENY=WRITE or BOTH, thus terminating a period in which other clients may have had the opportunity to open the file with WRITE access. Clients may choose to do the revalidation more often (i.e., at OPENs specifying DENY=NONE) to parallel the NFS version 3 protocol's practice for the benefit of users assuming this degree of cache revalidation.

Since the change attribute is updated for data and metadata modifications, some client implementors may be tempted to use the time_modify attribute and not change to validate cached data, so that metadata changes do not spuriously invalidate clean data. The implementor is cautioned against this approach. The change attribute is guaranteed to change for each update to the file, whereas time_modify is guaranteed to change only at the granularity of the time_delta attribute. If the client's data cache validation logic uses time_modify and not change, it runs the risk of incorrectly marking stale data as valid.

o Second, modified data must be flushed to the server before closing a file OPENed for write. This is complementary to the first rule. If the data is not flushed at CLOSE, the revalidation done after the client OPENs a file is unable to achieve its purpose. The other aspect to flushing the data before close is that the data must be committed to stable storage, at the server, before the CLOSE operation is requested by the client. In the case of a server reboot or restart and a CLOSEd file, it may not be possible to retransmit the data to be written to the file. Hence, this requirement.

9.3.2. Data Caching and File Locking

For those applications that choose to use file locking instead of share reservations to exclude inconsistent file access, there is an analogous set of constraints that apply to client side data caching. These rules are effective only if the file locking is used in a way that matches in an equivalent way the actual READ and WRITE operations executed. This is as opposed to file locking that is based on pure convention. For example, it is possible to manipulate a two-megabyte file by dividing the file into two one-megabyte regions and protecting access to the two regions by file locks on bytes zero and one. A lock for write on byte zero of the file would represent the right to do READ and WRITE operations on the first region.
A lock for write on byte one of the file would represent the + right to do READ and WRITE operations on the second region. As long + as all applications manipulating the file obey this convention, they + will work on a local filesystem. However, they may not work with the + NFS version 4 protocol unless clients refrain from data caching. + + The rules for data caching in the file locking environment are: + + o First, when a client obtains a file lock for a particular region, + the data cache corresponding to that region (if any cached data + exists) must be revalidated. If the change attribute indicates + that the file may have been updated since the cached data was + obtained, the client must flush or invalidate the cached data for + the newly locked region. A client might choose to invalidate all + of non-modified cached data that it has for the file but the only + requirement for correct operation is to invalidate all of the data + in the newly locked region. + + + + + +Shepler, et al. Standards Track [Page 99] + +RFC 3530 NFS version 4 Protocol April 2003 + + + o Second, before releasing a write lock for a region, all modified + data for that region must be flushed to the server. The modified + data must also be written to stable storage. + + Note that flushing data to the server and the invalidation of cached + data must reflect the actual byte ranges locked or unlocked. + Rounding these up or down to reflect client cache block boundaries + will cause problems if not carefully done. For example, writing a + modified block when only half of that block is within an area being + unlocked may cause invalid modification to the region outside the + unlocked area. This, in turn, may be part of a region locked by + another client. Clients can avoid this situation by synchronously + performing portions of write operations that overlap that portion + (initial or final) that is not a full block. Similarly, invalidating + a locked area which is not an integral number of full buffer blocks + would require the client to read one or two partial blocks from the + server if the revalidation procedure shows that the data which the + client possesses may not be valid. + + The data that is written to the server as a prerequisite to the + unlocking of a region must be written, at the server, to stable + storage. The client may accomplish this either with synchronous + writes or by following asynchronous writes with a COMMIT operation. + This is required because retransmission of the modified data after a + server reboot might conflict with a lock held by another client. + + A client implementation may choose to accommodate applications which + use record locking in non-standard ways (e.g., using a record lock as + a global semaphore) by flushing to the server more data upon an LOCKU + than is covered by the locked range. This may include modified data + within files other than the one for which the unlocks are being done. + In such cases, the client must not interfere with applications whose + READs and WRITEs are being done only within the bounds of record + locks which the application holds. For example, an application locks + a single byte of a file and proceeds to write that single byte. A + client that chose to handle a LOCKU by flushing all modified data to + the server could validly write that single byte in response to an + unrelated unlock. 
However, it would not be valid to write the entire + block in which that single written byte was located since it includes + an area that is not locked and might be locked by another client. + Client implementations can avoid this problem by dividing files with + modified data into those for which all modifications are done to + areas covered by an appropriate record lock and those for which there + are modifications not covered by a record lock. Any writes done for + the former class of files must not include areas not locked and thus + not modified on the client. + + + + + +Shepler, et al. Standards Track [Page 100] + +RFC 3530 NFS version 4 Protocol April 2003 + + +9.3.3. Data Caching and Mandatory File Locking + + Client side data caching needs to respect mandatory file locking when + it is in effect. The presence of mandatory file locking for a given + file is indicated when the client gets back NFS4ERR_LOCKED from a + READ or WRITE on a file it has an appropriate share reservation for. + When mandatory locking is in effect for a file, the client must check + for an appropriate file lock for data being read or written. If a + lock exists for the range being read or written, the client may + satisfy the request using the client's validated cache. If an + appropriate file lock is not held for the range of the read or write, + the read or write request must not be satisfied by the client's cache + and the request must be sent to the server for processing. When a + read or write request partially overlaps a locked region, the request + should be subdivided into multiple pieces with each region (locked or + not) treated appropriately. + +9.3.4. Data Caching and File Identity + + When clients cache data, the file data needs to be organized + according to the filesystem object to which the data belongs. For + NFS version 3 clients, the typical practice has been to assume for + the purpose of caching that distinct filehandles represent distinct + filesystem objects. The client then has the choice to organize and + maintain the data cache on this basis. + + In the NFS version 4 protocol, there is now the possibility to have + significant deviations from a "one filehandle per object" model + because a filehandle may be constructed on the basis of the object's + pathname. Therefore, clients need a reliable method to determine if + two filehandles designate the same filesystem object. If clients + were simply to assume that all distinct filehandles denote distinct + objects and proceed to do data caching on this basis, caching + inconsistencies would arise between the distinct client side objects + which mapped to the same server side object. + + By providing a method to differentiate filehandles, the NFS version 4 + protocol alleviates a potential functional regression in comparison + with the NFS version 3 protocol. Without this method, caching + inconsistencies within the same client could occur and this has not + been present in previous versions of the NFS protocol. Note that it + is possible to have such inconsistencies with applications executing + on multiple clients but that is not the issue being addressed here. + + For the purposes of data caching, the following steps allow an NFS + version 4 client to determine whether two distinct filehandles denote + the same server side object: + + + + +Shepler, et al. 
Standards Track                                                [Page 101]

RFC 3530                 NFS version 4 Protocol               April 2003

o If GETATTR directed to two filehandles returns different values of the fsid attribute, then the filehandles represent distinct objects.

o If GETATTR for any file with an fsid that matches the fsid of the two filehandles in question returns a unique_handles attribute with a value of TRUE, then the two objects are distinct.

o If GETATTR directed to the two filehandles does not return the fileid attribute for both of the handles, then it cannot be determined whether the two objects are the same. Therefore, operations which depend on that knowledge (e.g., client side data caching) cannot be done reliably.

o If GETATTR directed to the two filehandles returns different values for the fileid attribute, then they are distinct objects.

o Otherwise, they are the same object.

9.4. Open Delegation

When a file is being OPENed, the server may delegate further handling of opens and closes for that file to the opening client. Any such delegation is recallable, since the circumstances that allowed for the delegation are subject to change. In particular, if the server receives a conflicting OPEN from another client, the server must recall the delegation before deciding whether the OPEN from the other client may be granted. Making a delegation is up to the server, and clients should not assume that any particular OPEN either will or will not result in an open delegation. The following is a typical set of conditions that servers might use in deciding whether an OPEN should be delegated:

o The client must be able to respond to the server's callback requests. The server will use the CB_NULL procedure for a test of callback ability.

o The client must have responded properly to previous recalls.

o There must be no current open conflicting with the requested delegation.

o There should be no current delegation that conflicts with the delegation being requested.

o The probability of future conflicting open requests should be low based on the recent history of the file.

o The existence of any server-specific semantics of OPEN/CLOSE that would make the required handling incompatible with the prescribed handling that the delegated client would apply (see below).

There are two types of open delegations, read and write. A read open delegation allows a client to handle, on its own, requests to open a file for reading that do not deny read access to others. Multiple read open delegations may be outstanding simultaneously and do not conflict. A write open delegation allows the client to handle, on its own, all opens. Only one write open delegation may exist for a given file at a given time, and it is inconsistent with any read open delegations.

When a client has a read open delegation, it may not make any changes to the contents or attributes of the file, but it is assured that no other client may do so. When a client has a write open delegation, it may modify the file data since no other client will be accessing the file's data. The client holding a write delegation may only affect file attributes which are intimately connected with the file data: size, time_modify, change.

When a client has an open delegation, it does not send OPENs or CLOSEs to the server but updates the appropriate status internally.
+ For a read open delegation, opens that cannot be handled locally + (opens for write or that deny read access) must be sent to the + server. + + When an open delegation is made, the response to the OPEN contains an + open delegation structure which specifies the following: + + o the type of delegation (read or write) + + o space limitation information to control flushing of data on close + (write open delegation only, see the section "Open Delegation and + Data Caching") + + o an nfsace4 specifying read and write permissions + + o a stateid to represent the delegation for READ and WRITE + + The delegation stateid is separate and distinct from the stateid for + the OPEN proper. The standard stateid, unlike the delegation + stateid, is associated with a particular lock_owner and will continue + to be valid after the delegation is recalled and the file remains + open. + + + + + + +Shepler, et al. Standards Track [Page 103] + +RFC 3530 NFS version 4 Protocol April 2003 + + + When a request internal to the client is made to open a file and open + delegation is in effect, it will be accepted or rejected solely on + the basis of the following conditions. Any requirement for other + checks to be made by the delegate should result in open delegation + being denied so that the checks can be made by the server itself. + + o The access and deny bits for the request and the file as described + in the section "Share Reservations". + + o The read and write permissions as determined below. + + The nfsace4 passed with delegation can be used to avoid frequent + ACCESS calls. The permission check should be as follows: + + o If the nfsace4 indicates that the open may be done, then it should + be granted without reference to the server. + + o If the nfsace4 indicates that the open may not be done, then an + ACCESS request must be sent to the server to obtain the definitive + answer. + + The server may return an nfsace4 that is more restrictive than the + actual ACL of the file. This includes an nfsace4 that specifies + denial of all access. Note that some common practices such as + mapping the traditional user "root" to the user "nobody" may make it + incorrect to return the actual ACL of the file in the delegation + response. + + The use of delegation together with various other forms of caching + creates the possibility that no server authentication will ever be + performed for a given user since all of the user's requests might be + satisfied locally. Where the client is depending on the server for + authentication, the client should be sure authentication occurs for + each user by use of the ACCESS operation. This should be the case + even if an ACCESS operation would not be required otherwise. As + mentioned before, the server may enforce frequent authentication by + returning an nfsace4 denying all access with every open delegation. + +9.4.1. Open Delegation and Data Caching + + OPEN delegation allows much of the message overhead associated with + the opening and closing files to be eliminated. An open when an open + delegation is in effect does not require that a validation message be + sent to the server. The continued endurance of the "read open + delegation" provides a guarantee that no OPEN for write and thus no + write has occurred. Similarly, when closing a file opened for write + and if write open delegation is in effect, the data written does not + have to be flushed to the server until the open delegation is + + + +Shepler, et al. 
Standards Track                   [Page 104]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   recalled.  The continued endurance of the open delegation provides a
+   guarantee that no open and thus no read or write has been done by
+   another client.
+
+   For the purposes of open delegation, READs and WRITEs done without an
+   OPEN are treated as the functional equivalents of a corresponding
+   type of OPEN.  This refers to the READs and WRITEs that use the
+   special stateids consisting of all zero bits or all one bits.
+   Therefore, READs or WRITEs with a special stateid done by another
+   client will force the server to recall a write open delegation.  A
+   WRITE with a special stateid done by another client will force a
+   recall of read open delegations.
+
+   With delegations, a client is able to avoid writing data to the
+   server when the CLOSE of a file is serviced.  The file close system
+   call is the usual point at which the client is notified of a lack of
+   stable storage for the modified file data generated by the
+   application.  At the close, file data is written to the server and
+   through normal accounting the server is able to determine if the
+   available filesystem space for the data has been exceeded (i.e.,
+   server returns NFS4ERR_NOSPC or NFS4ERR_DQUOT).  This accounting
+   includes quotas.  The introduction of delegations requires that an
+   alternative method be in place for the same type of communication to
+   occur between client and server.
+
+   In the delegation response, the server provides either the limit of
+   the size of the file or the number of modified blocks and associated
+   block size.  The server must ensure that the client will be able to
+   flush data to the server of a size equal to that provided in the
+   original delegation.  The server must make this assurance for all
+   outstanding delegations.  Therefore, the server must be careful in
+   its management of available space for new or modified data taking
+   into account available filesystem space and any applicable quotas.
+   The server can recall delegations as a result of managing the
+   available filesystem space.  The client should abide by the server's
+   state space limits for delegations.  If the client exceeds the stated
+   limits for the delegation, the server's behavior is undefined.
+
+   Based on server conditions, quotas or available filesystem space, the
+   server may grant write open delegations with very restrictive space
+   limitations.  The limitations may be defined in a way that will
+   always force modified data to be flushed to the server on close.
+
+   With respect to authentication, flushing modified data to the server
+   after a CLOSE has occurred may be problematic.  For example, the user
+   of the application may have logged off the client and unexpired
+   authentication credentials may not be present.  In this case, the
+   client may need to take special care to ensure that local unexpired
+
+
+
+Shepler, et al.             Standards Track                   [Page 105]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   credentials will in fact be available.  This may be accomplished by
+   tracking the expiration time of credentials and flushing data well in
+   advance of their expiration or by making private copies of
+   credentials to assure their availability when needed.
+
+9.4.2.  Open Delegation and File Locks
+
+   When a client holds a write open delegation, lock operations may be
+   performed locally.  This includes those required for mandatory file
+   locking.  This can be done since the delegation implies that there
+   can be no conflicting locks.  Similarly, all of the revalidations
+   that would normally be associated with obtaining locks and the
+   flushing of data associated with the releasing of locks need not be
+   done.
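+
+   A non-normative sketch of this rule, together with the complementary
+   rule for read open delegations that follows, is given below.  All
+   names are hypothetical.
+
+       # Sketch only: route a record lock request based on the type of
+       # open delegation held for the file.
+       def lock_record(file_state, lock_request, send_to_server):
+           if file_state.delegation == "write":
+               # No other client can hold a conflicting lock, so the
+               # lock (even a mandatory one) is granted locally, with
+               # no revalidation or data flush.
+               file_state.local_locks.append(lock_request)
+               return True
+           # Under a read delegation (or no delegation), every lock
+           # request, including non-exclusive ones, goes to the server.
+           return send_to_server(lock_request)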
+
+   When a client holds a read open delegation, lock operations are not
+   performed locally.  All lock operations, including those requesting
+   non-exclusive locks, are sent to the server for resolution.
+
+9.4.3.  Handling of CB_GETATTR
+
+   The server needs to employ special handling for a GETATTR where the
+   target is a file that has a write open delegation in effect.  The
+   reason for this is that the client holding the write delegation may
+   have modified the data and the server needs to reflect this change to
+   the second client that submitted the GETATTR.  Therefore, the client
+   holding the write delegation needs to be interrogated.  The server
+   will use the CB_GETATTR operation.  The only attributes that the
+   server can reliably query via CB_GETATTR are size and change.
+
+   Since CB_GETATTR is being used to satisfy another client's GETATTR
+   request, the server only needs to know if the client holding the
+   delegation has a modified version of the file.  If the client's copy
+   of the delegated file is not modified (data or size), the server can
+   satisfy the second client's GETATTR request from the attributes
+   stored locally at the server.  If the file is modified, the server
+   only needs to know about this modified state.  If the server
+   determines that the file is currently modified, it will respond to
+   the second client's GETATTR as if the file had been modified locally
+   at the server.
+
+   Since the form of the change attribute is determined by the server
+   and is opaque to the client, the client and server need to agree on a
+   method of communicating the modified state of the file.  For the size
+   attribute, the client will report its current view of the file size.
+
+   For the change attribute, the handling is more involved.
+
+
+
+Shepler, et al.             Standards Track                   [Page 106]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   For the client, the following steps will be taken when receiving a
+   write delegation:
+
+   o The value of the change attribute will be obtained from the server
+     and cached.  Let this value be represented by c.
+
+   o The client will create a value greater than c that will be used
+     for communicating that modified data is held at the client.  Let
+     this value be represented by d.
+
+   o When the client is queried via CB_GETATTR for the change
+     attribute, it checks to see if it holds modified data.  If the
+     file is modified, the value d is returned for the change attribute
+     value.  If this file is not currently modified, the client returns
+     the value c for the change attribute.
+
+   For simplicity of implementation, the client MAY for each CB_GETATTR
+   return the same value d.  This is true even if, between successive
+   CB_GETATTR operations, the client again modifies the file's data or
+   metadata in its cache.  The client can return the same value because
+   the only requirement is that the client be able to indicate to the
+   server that the client holds modified data.  Therefore, the value of
+   d may always be c + 1.
+
+   While the change attribute is opaque to the client in the sense that
+   it has no idea what units of time, if any, the server is counting
+   change with, it is not opaque in that the client has to treat it as
+   an unsigned integer, and the server has to be able to see the results
+   of the client's changes to that integer.  Therefore, the server MUST
+   encode the change attribute in network order when sending it to the
+   client.  The client MUST decode it from network order to its native
+   order when receiving it and the client MUST encode it in network
+   order when sending it to the server.  For this reason, change is
+   defined as an unsigned integer rather than an opaque array of octets.
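+
+   A non-normative sketch of the client side of this scheme, using
+   d = c + 1, follows.  The class and method names are hypothetical.
+
+       # Sketch only: client state for a write-delegated file and its
+       # CB_GETATTR responder for the change and size attributes.
+       class DelegatedFile:
+           def __init__(self, change_at_grant, size_at_grant):
+               self.c = change_at_grant    # change attr cached at grant
+               self.size = size_at_grant   # client's view of the size
+               self.dirty = False          # holds modified data?
+
+           def modify(self, new_size):
+               self.dirty = True
+               self.size = new_size
+
+           def cb_getattr(self):
+               # Report d > c while modified data is held, c otherwise.
+               # The same d may be returned on every call; the server
+               # only learns "modified since the delegation was given".
+               d = self.c + 1
+               return {"change": d if self.dirty else self.c,
+                       "size": self.size}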
+
+   For the server, the following steps will be taken when providing a
+   write delegation:
+
+   o Upon providing a write delegation, the server will cache a copy of
+     the change attribute in the data structure it uses to record the
+     delegation.  Let this value be represented by sc.
+
+   o When a second client sends a GETATTR operation on the same file to
+     the server, the server obtains the change attribute from the first
+     client.  Let this value be cc.
+
+
+
+Shepler, et al.             Standards Track                   [Page 107]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o If the value cc is equal to sc, the file is not modified and the
+     server returns the current values for change, time_metadata, and
+     time_modify (for example) to the second client.
+
+   o If the value cc is NOT equal to sc, the file is currently modified
+     at the first client and most likely will be modified at the server
+     at a future time.  The server then uses its current time to
+     construct attribute values for time_metadata and time_modify.  A
+     new value of sc, which we will call nsc, is computed by the
+     server, such that nsc >= sc + 1.  The server then returns the
+     constructed time_metadata, time_modify, and nsc values to the
+     requester.  The server replaces sc in the delegation record with
+     nsc.  To prevent the possibility of time_modify, time_metadata,
+     and change from appearing to go backward (which would happen if
+     the client holding the delegation fails to write its modified data
+     to the server before the delegation is revoked or returned), the
+     server SHOULD update the file's metadata record with the
+     constructed attribute values.  For reasons of reasonable
+     performance, committing the constructed attribute values to stable
+     storage is OPTIONAL.
+
+     As discussed earlier in this section, the client MAY return the
+     same cc value on subsequent CB_GETATTR calls, even if the file was
+     modified in the client's cache yet again between successive
+     CB_GETATTR calls.  Therefore, the server must assume that the file
+     has been modified yet again, and MUST take care to ensure that the
+     new nsc it constructs and returns is greater than the previous nsc
+     it returned.  An example implementation's delegation record would
+     satisfy this mandate by including a boolean field (let us call it
+     "modified") that is set to false when the delegation is granted,
+     and an sc value set at the time of grant to the change attribute
+     value.  The modified field would be set to true the first time cc
+     != sc, and would stay true until the delegation is returned or
+     revoked.  The processing for constructing nsc, time_modify, and
+     time_metadata would use this pseudo code:
+
+     if (!modified) {
+         do CB_GETATTR for change and size;
+
+         if (cc != sc)
+             modified = TRUE;
+     } else {
+         do CB_GETATTR for size;
+     }
+
+     if (modified) {
+         sc = sc + 1;
+         time_modify = time_metadata = current_time;
+
+
+
+Shepler, et al.             Standards Track                   [Page 108]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+         update sc, time_modify, time_metadata into file's metadata;
+     }
+
+     return to client (that sent GETATTR) the attributes
+     it requested, but make sure size comes from what
+     CB_GETATTR returned.  Do not update the file's metadata
+     with the client's modified size.
+
+   o In the case that the file attribute size is different from the
+     server's current value, the server treats this as a modification
+     regardless of the value of the change attribute retrieved via
+     CB_GETATTR and responds to the second client as in the last step.
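+
+   A non-normative, runnable rendition of the pseudo code above is
+   given below.  The delegation record rec (assumed to cache the
+   file's current change value sc, size, and times), the cb_getattr
+   callback, and the now argument are placeholders for server-specific
+   machinery.
+
+       # Sketch only: construct the attributes returned to the second
+       # client's GETATTR using CB_GETATTR results.
+       def build_getattr_reply(rec, cb_getattr, now):
+           if not rec["modified"]:
+               attrs = cb_getattr(["change", "size"])
+               if attrs["change"] != rec["sc"]:
+                   rec["modified"] = True
+           else:
+               attrs = cb_getattr(["size"])
+           if attrs["size"] != rec["size"]:
+               # A changed size is treated as a modification as well.
+               rec["modified"] = True
+
+           if rec["modified"]:
+               rec["sc"] += 1              # nsc >= sc + 1
+               rec["time_modify"] = rec["time_metadata"] = now
+               # The constructed values SHOULD also be written to the
+               # file's metadata record (stable storage is OPTIONAL).
+
+           # size comes from what CB_GETATTR returned, but is never
+           # written into the file's metadata.
+           return {"change": rec["sc"],
+                   "time_modify": rec["time_modify"],
+                   "time_metadata": rec["time_metadata"],
+                   "size": attrs["size"]}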
+
+   This methodology resolves issues of clock differences between client
+   and server and other scenarios where the use of CB_GETATTR breaks
+   down.
+
+   It should be noted that the server is under no obligation to use
+   CB_GETATTR and therefore the server MAY simply recall the delegation
+   to avoid its use.
+
+9.4.4.  Recall of Open Delegation
+
+   The following events necessitate recall of an open delegation:
+
+   o Potentially conflicting OPEN request (or READ/WRITE done with
+     "special" stateid)
+
+   o SETATTR issued by another client
+
+   o REMOVE request for the file
+
+   o RENAME request for the file as either source or target of the
+     RENAME
+
+   Whether a RENAME of a directory in the path leading to the file
+   results in recall of an open delegation depends on the semantics of
+   the server filesystem.  If that filesystem denies such RENAMEs when a
+   file is open, the recall must be performed to determine whether the
+   file in question is, in fact, open.
+
+   In addition to the situations above, the server may choose to recall
+   open delegations at any time if resource constraints make it
+   advisable to do so.  Clients should always be prepared for the
+   possibility of recall.
+
+
+
+Shepler, et al.             Standards Track                   [Page 109]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   When a client receives a recall for an open delegation, it needs to
+   update state on the server before returning the delegation.  These
+   same updates must be done whenever a client chooses to return a
+   delegation voluntarily.  The following items of state need to be
+   dealt with:
+
+   o If the file associated with the delegation is no longer open and
+     no previous CLOSE operation has been sent to the server, a CLOSE
+     operation must be sent to the server.
+
+   o If a file has other open references at the client, then OPEN
+     operations must be sent to the server.  The appropriate stateids
+     will be provided by the server for subsequent use by the client
+     since the delegation stateid will no longer be valid.  These OPEN
+     requests are done with the claim type of CLAIM_DELEGATE_CUR.  This
+     will allow the presentation of the delegation stateid so that the
+     client can establish the appropriate rights to perform the OPEN.
+     (see the section "Operation 18: OPEN" for details.)
+
+   o If there are granted file locks, the corresponding LOCK operations
+     need to be performed.  This applies to the write open delegation
+     case only.
+
+   o For a write open delegation, if at the time of recall the file is
+     not open for write, all modified data for the file must be flushed
+     to the server.  If the delegation had not existed, the client
+     would have done this data flush before the CLOSE operation.
+
+   o For a write open delegation when a file is still open at the time
+     of recall, any modified data for the file needs to be flushed to
+     the server.
+
+   o With the write open delegation in place, it is possible that the
+     file was truncated during the duration of the delegation.  For
+     example, the truncation could have occurred as a result of an OPEN
+     UNCHECKED with a size attribute value of zero.
Therefore, if a + truncation of the file has occurred and this operation has not + been propagated to the server, the truncation must occur before + any modified data is written to the server. + + In the case of write open delegation, file locking imposes some + additional requirements. To precisely maintain the associated + invariant, it is required to flush any modified data in any region + for which a write lock was released while the write delegation was in + effect. However, because the write open delegation implies no other + locking by other clients, a simpler implementation is to flush all + modified data for the file (as described just above) if any write + lock has been released while the write open delegation was in effect. + + + +Shepler, et al. Standards Track [Page 110] + +RFC 3530 NFS version 4 Protocol April 2003 + + + An implementation need not wait until delegation recall (or deciding + to voluntarily return a delegation) to perform any of the above + actions, if implementation considerations (e.g., resource + availability constraints) make that desirable. Generally, however, + the fact that the actual open state of the file may continue to + change makes it not worthwhile to send information about opens and + closes to the server, except as part of delegation return. Only in + the case of closing the open that resulted in obtaining the + delegation would clients be likely to do this early, since, in that + case, the close once done will not be undone. Regardless of the + client's choices on scheduling these actions, all must be performed + before the delegation is returned, including (when applicable) the + close that corresponds to the open that resulted in the delegation. + These actions can be performed either in previous requests or in + previous operations in the same COMPOUND request. + +9.4.5. Clients that Fail to Honor Delegation Recalls + + A client may fail to respond to a recall for various reasons, such as + a failure of the callback path from server to the client. The client + may be unaware of a failure in the callback path. This lack of + awareness could result in the client finding out long after the + failure that its delegation has been revoked, and another client has + modified the data for which the client had a delegation. This is + especially a problem for the client that held a write delegation. + + The server also has a dilemma in that the client that fails to + respond to the recall might also be sending other NFS requests, + including those that renew the lease before the lease expires. + Without returning an error for those lease renewing operations, the + server leads the client to believe that the delegation it has is in + force. + + This difficulty is solved by the following rules: + + o When the callback path is down, the server MUST NOT revoke the + delegation if one of the following occurs: + + - The client has issued a RENEW operation and the server has + returned an NFS4ERR_CB_PATH_DOWN error. The server MUST renew + the lease for any record locks and share reservations the + client has that the server has known about (as opposed to those + locks and share reservations the client has established but not + yet sent to the server, due to the delegation). The server + SHOULD give the client a reasonable time to return its + delegations to the server before revoking the client's + delegations. + + + + +Shepler, et al. 
Standards Track [Page 111] + +RFC 3530 NFS version 4 Protocol April 2003 + + + - The client has not issued a RENEW operation for some period of + time after the server attempted to recall the delegation. This + period of time MUST NOT be less than the value of the + lease_time attribute. + + o When the client holds a delegation, it can not rely on operations, + except for RENEW, that take a stateid, to renew delegation leases + across callback path failures. The client that wants to keep + delegations in force across callback path failures must use RENEW + to do so. + +9.4.6. Delegation Revocation + + At the point a delegation is revoked, if there are associated opens + on the client, the applications holding these opens need to be + notified. This notification usually occurs by returning errors for + READ/WRITE operations or when a close is attempted for the open file. + + If no opens exist for the file at the point the delegation is + revoked, then notification of the revocation is unnecessary. + However, if there is modified data present at the client for the + file, the user of the application should be notified. Unfortunately, + it may not be possible to notify the user since active applications + may not be present at the client. See the section "Revocation + Recovery for Write Open Delegation" for additional details. + +9.5. Data Caching and Revocation + + When locks and delegations are revoked, the assumptions upon which + successful caching depend are no longer guaranteed. For any locks or + share reservations that have been revoked, the corresponding owner + needs to be notified. This notification includes applications with a + file open that has a corresponding delegation which has been revoked. + Cached data associated with the revocation must be removed from the + client. In the case of modified data existing in the client's cache, + that data must be removed from the client without it being written to + the server. As mentioned, the assumptions made by the client are no + longer valid at the point when a lock or delegation has been revoked. + For example, another client may have been granted a conflicting lock + after the revocation of the lock at the first client. Therefore, the + data within the lock range may have been modified by the other + client. Obviously, the first client is unable to guarantee to the + application what has occurred to the file in the case of revocation. + + Notification to a lock owner will in many cases consist of simply + returning an error on the next and all subsequent READs/WRITEs to the + open file or on the close. Where the methods available to a client + make such notification impossible because errors for certain + + + +Shepler, et al. Standards Track [Page 112] + +RFC 3530 NFS version 4 Protocol April 2003 + + + operations may not be returned, more drastic action such as signals + or process termination may be appropriate. The justification for + this is that an invariant for which an application depends on may be + violated. Depending on how errors are typically treated for the + client operating environment, further levels of notification + including logging, console messages, and GUI pop-ups may be + appropriate. + +9.5.1. Revocation Recovery for Write Open Delegation + + Revocation recovery for a write open delegation poses the special + issue of modified data in the client cache while the file is not + open. 
In this situation, any client which does not flush modified
+   data to the server on each close must ensure that the user receives
+   appropriate notification of the failure as a result of the
+   revocation.  Since such situations may require human action to
+   correct problems, notification schemes in which the appropriate user
+   or administrator is notified may be necessary.  Logging and console
+   messages are typical examples.
+
+   If there is modified data on the client, it must not be flushed
+   normally to the server.  A client may attempt to provide a copy of
+   the file data as modified during the delegation under a different
+   name in the filesystem name space to ease recovery.  Note that when
+   the client can determine that the file has not been modified by any
+   other client, or when the client has a complete cached copy of the
+   file in question, such a saved copy of the client's view of the file
+   may be of particular value for recovery.  In other cases, recovery
+   using a copy of the file based partially on the client's cached data
+   and partially on the server copy as modified by other clients, will
+   be anything but straightforward, so clients may avoid saving file
+   contents in these situations or mark the results specially to warn
+   users of possible problems.
+
+   Saving of such modified data in delegation revocation situations may
+   be limited to files of a certain size or might be used only when
+   sufficient disk space is available within the target filesystem.
+   Such saving may also be restricted to situations when the client has
+   sufficient buffering resources to keep the cached copy available
+   until it is properly stored to the target filesystem.
+
+9.6.  Attribute Caching
+
+   The attributes discussed in this section do not include named
+   attributes.  Individual named attributes are analogous to files and
+   caching of the data for these needs to be handled just as data
+
+
+
+Shepler, et al.             Standards Track                   [Page 113]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   caching is for ordinary files.  Similarly, LOOKUP results from an
+   OPENATTR directory are to be cached on the same basis as any other
+   pathnames and similarly for directory contents.
+
+   Clients may cache file attributes obtained from the server and use
+   them to avoid subsequent GETATTR requests.  Such caching is write
+   through in that modification to file attributes is always done by
+   means of requests to the server and should not be done locally and
+   cached.  The exceptions to this are modifications to attributes that
+   are intimately connected with data caching.  Therefore, extending a
+   file by writing data to the local data cache is reflected immediately
+   in the size as seen on the client without this change being
+   immediately reflected on the server.  Normally such changes are not
+   propagated directly to the server but when the modified data is
+   flushed to the server, analogous attribute changes are made on the
+   server.  When open delegation is in effect, the modified attributes
+   may be returned to the server in the response to a CB_RECALL call.
+
+   The result of local caching of attributes is that the attribute
+   caches maintained on individual clients will not be coherent.
+   Changes made in one order on the server may be seen in a different
+   order on one client and in a third order on a different client.
+
+   The typical filesystem application programming interfaces do not
+   provide means to atomically modify or interrogate attributes for
+   multiple files at the same time.  The following rules provide an
+   environment where the potential incoherences mentioned above can be
+   reasonably managed.  These rules are derived from the practice of
+   previous NFS protocols.
+
+   o All attributes for a given file (per-fsid attributes excepted) are
+     cached as a unit at the client so that no non-serializability can
+     arise within the context of a single file.
+
+   o An upper time boundary is maintained on how long a client cache
+     entry can be kept without being refreshed from the server.
+
+   o When operations are performed that change attributes at the
+     server, the updated attribute set is requested as part of the
+     containing RPC.  This includes directory operations that update
+     attributes indirectly.  This is accomplished by following the
+     modifying operation with a GETATTR operation and then using the
+     results of the GETATTR to update the client's cached attributes,
+     as shown in the sketch below.
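+
+   A non-normative sketch of the third rule follows.  The compound
+   helper, which is assumed to send a COMPOUND request and return the
+   per-operation results, and the other names are hypothetical.
+
+       # Sketch only: write-through attribute update.  The modifying
+       # operation and a GETATTR are sent in one COMPOUND, and the
+       # post-operation attributes refresh the client's cache.
+       def setattr_write_through(compound, fh, new_attrs, cache,
+                                 cached_attr_names):
+           results = compound([("PUTFH", fh),
+                               ("SETATTR", new_attrs),
+                               ("GETATTR", cached_attr_names)])
+           cache.update(results["GETATTR"])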
+
+   Note that if the full set of attributes to be cached is requested by
+   READDIR, the results can be cached by the client on the same basis as
+   attributes obtained via GETATTR.
+
+
+
+Shepler, et al.             Standards Track                   [Page 114]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   A client may validate its cached version of attributes for a file by
+   fetching just the change and time_access attributes and assuming
+   that if the change attribute has the same value as it did when the
+   attributes were cached, then no attributes other than time_access
+   have changed.  The reason time_access is also fetched is that many
+   servers operate in environments where the operation that updates
+   change does not update time_access.  For example, POSIX file
+   semantics do not update access time when a file is modified by the
+   write system call.  Therefore, the client that wants a current
+   time_access value should fetch it with change during the attribute
+   cache validation processing and update its cached time_access.
+
+   The client may maintain a cache of modified attributes for those
+   attributes intimately connected with data of modified regular files
+   (size, time_modify, and change).  Other than those three attributes,
+   the client MUST NOT maintain a cache of modified attributes.
+   Instead, attribute changes are immediately sent to the server.
+
+   In some operating environments, the equivalent to time_access is
+   expected to be implicitly updated by each read of the content of the
+   file object.  If an NFS client is caching the content of a file
+   object, whether it is a regular file, directory, or symbolic link,
+   the client SHOULD NOT update the time_access attribute (via SETATTR
+   or a small READ or READDIR request) on the server with each read that
+   is satisfied from cache.  The reason is that this can defeat the
+   performance benefits of caching content, especially since an explicit
+   SETATTR of time_access may alter the change attribute on the server.
+   If the change attribute changes, clients that are caching the content
+   will think the content has changed, and will re-read unmodified data
+   from the server.  Nor is the client encouraged to maintain a modified
+   version of time_access in its cache, since this would mean that the
+   client will either eventually have to write the access time to the
+   server with bad performance effects, or it would never update the
+   server's time_access, thereby resulting in a situation where an
+   application that caches access time between a close and open of the
+   same file observes the access time oscillating between the past and
+   present.  The time_access attribute always means the time of last
+   access to a file by a read that was satisfied by the server.  This
+   way clients will tend to see only time_access changes that go forward
+   in time.
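+
+   A non-normative sketch of the change-based validation procedure
+   described earlier in this section follows.  The names are
+   hypothetical; getattr is assumed to fetch the listed attributes
+   from the server.
+
+       # Sketch only: revalidate cached attributes using change, and
+       # keep time_access current while doing so.
+       def revalidate_attr_cache(cache, getattr, all_attr_names):
+           attrs = getattr(["change", "time_access"])
+           if attrs["change"] == cache["change"]:
+               # Only time_access can have moved; refresh it.
+               cache["time_access"] = attrs["time_access"]
+           else:
+               # Something else changed: refetch the full set.
+               cache.clear()
+               cache.update(getattr(all_attr_names))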
+
+9.7.  Data and Metadata Caching and Memory Mapped Files
+
+   Some operating environments include the capability for an application
+   to map a file's content into the application's address space.  Each
+   time the application accesses a memory location that corresponds to a
+   block that has not been loaded into the address space, a page fault
+   occurs and the file is read (or if the block does not exist in the
+
+
+
+Shepler, et al.             Standards Track                   [Page 115]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   file, the block is allocated and then instantiated in the
+   application's address space).
+
+   As long as each memory mapped access to the file requires a page
+   fault, the relevant attributes of the file that are used to detect
+   access and modification (time_access, time_metadata, time_modify, and
+   change) will be updated.  However, in many operating environments,
+   when page faults are not required these attributes will not be
+   updated on reads or updates to the file via memory access (regardless
+   of whether the file is a local file or is being accessed remotely).
+   A client or server MAY fail to update attributes of a file that is
+   being accessed via memory mapped I/O.  This has several implications:
+
+   o If there is an application on the server that has memory mapped a
+     file that a client is also accessing, the client may not be able
+     to get a consistent value of the change attribute to determine
+     whether its cache is stale or not.  A server that knows that the
+     file is memory mapped could always pessimistically return updated
+     values for change so as to force the application to always get the
+     most up to date data and metadata for the file.  However, due to
+     the negative performance implications of this, such behavior is
+     OPTIONAL.
+
+   o If the memory mapped file is not being modified on the server, and
+     instead is just being read by an application via the memory mapped
+     interface, the client will not see an updated time_access
+     attribute.  However, in many operating environments, neither will
+     any process running on the server.  Thus NFS clients are at no
+     disadvantage with respect to local processes.
+
+   o If there is another client that is memory mapping the file, and if
+     that client is holding a write delegation, the same set of issues
+     as discussed in the previous two bullet items apply.  So, when a
+     server does a CB_GETATTR to a file that the client has modified in
+     its cache, the response from CB_GETATTR will not necessarily be
+     accurate.  As discussed earlier, the client's obligation is to
+     report that the file has been modified since the delegation was
+     granted, not whether it has been modified again between successive
+     CB_GETATTR calls, and the server MUST assume that any file the
+     client has modified in cache has been modified again between
+     successive CB_GETATTR calls.  Depending on the nature of the
+     client's memory management system, this weak obligation may not be
+     possible.  A client MAY return stale information in CB_GETATTR
+     whenever the file is memory mapped.
+
+   o The mixture of memory mapping and file locking on the same file is
+     problematic.  Consider the following scenario, where the page size
+     on each client is 8192 bytes.
+
+
+
+Shepler, et al.
Standards Track [Page 116] + +RFC 3530 NFS version 4 Protocol April 2003 + + + - Client A memory maps first page (8192 bytes) of file X + + - Client B memory maps first page (8192 bytes) of file X + + - Client A write locks first 4096 bytes + + - Client B write locks second 4096 bytes + + - Client A, via a STORE instruction modifies part of its locked + region. + + - Simultaneous to client A, client B issues a STORE on part of + its locked region. + + Here the challenge is for each client to resynchronize to get a + correct view of the first page. In many operating environments, the + virtual memory management systems on each client only know a page is + modified, not that a subset of the page corresponding to the + respective lock regions has been modified. So it is not possible for + each client to do the right thing, which is to only write to the + server that portion of the page that is locked. For example, if + client A simply writes out the page, and then client B writes out the + page, client A's data is lost. + + Moreover, if mandatory locking is enabled on the file, then we have a + different problem. When clients A and B issue the STORE + instructions, the resulting page faults require a record lock on the + entire page. Each client then tries to extend their locked range to + the entire page, which results in a deadlock. + + Communicating the NFS4ERR_DEADLOCK error to a STORE instruction is + difficult at best. + + If a client is locking the entire memory mapped file, there is no + problem with advisory or mandatory record locking, at least until the + client unlocks a region in the middle of the file. + + Given the above issues the following are permitted: + + - Clients and servers MAY deny memory mapping a file they know there + are record locks for. + + - Clients and servers MAY deny a record lock on a file they know is + memory mapped. + + + + + + + +Shepler, et al. Standards Track [Page 117] + +RFC 3530 NFS version 4 Protocol April 2003 + + + - A client MAY deny memory mapping a file that it knows requires + mandatory locking for I/O. If mandatory locking is enabled after + the file is opened and mapped, the client MAY deny the application + further access to its mapped file. + +9.8. Name Caching + + The results of LOOKUP and READDIR operations may be cached to avoid + the cost of subsequent LOOKUP operations. Just as in the case of + attribute caching, inconsistencies may arise among the various client + caches. To mitigate the effects of these inconsistencies and given + the context of typical filesystem APIs, an upper time boundary is + maintained on how long a client name cache entry can be kept without + verifying that the entry has not been made invalid by a directory + change operation performed by another client. + + When a client is not making changes to a directory for which there + exist name cache entries, the client needs to periodically fetch + attributes for that directory to ensure that it is not being + modified. After determining that no modification has occurred, the + expiration time for the associated name cache entries may be updated + to be the current time plus the name cache staleness bound. + + When a client is making changes to a given directory, it needs to + determine whether there have been changes made to the directory by + other clients. It does this by using the change attribute as + reported before and after the directory operation in the associated + change_info4 value returned for the operation. 
The server is able to + communicate to the client whether the change_info4 data is provided + atomically with respect to the directory operation. If the change + values are provided atomically, the client is then able to compare + the pre-operation change value with the change value in the client's + name cache. If the comparison indicates that the directory was + updated by another client, the name cache associated with the + modified directory is purged from the client. If the comparison + indicates no modification, the name cache can be updated on the + client to reflect the directory operation and the associated timeout + extended. The post-operation change value needs to be saved as the + basis for future change_info4 comparisons. + + As demonstrated by the scenario above, name caching requires that the + client revalidate name cache data by inspecting the change attribute + of a directory at the point when the name cache item was cached. + This requires that the server update the change attribute for + directories when the contents of the corresponding directory is + modified. For a client to use the change_info4 information + appropriately and correctly, the server must report the pre and post + operation change attribute values atomically. When the server is + + + +Shepler, et al. Standards Track [Page 118] + +RFC 3530 NFS version 4 Protocol April 2003 + + + unable to report the before and after values atomically with respect + to the directory operation, the server must indicate that fact in the + change_info4 return value. When the information is not atomically + reported, the client should not assume that other clients have not + changed the directory. + +9.9. Directory Caching + + The results of READDIR operations may be used to avoid subsequent + READDIR operations. Just as in the cases of attribute and name + caching, inconsistencies may arise among the various client caches. + To mitigate the effects of these inconsistencies, and given the + context of typical filesystem APIs, the following rules should be + followed: + + o Cached READDIR information for a directory which is not obtained + in a single READDIR operation must always be a consistent snapshot + of directory contents. This is determined by using a GETATTR + before the first READDIR and after the last of READDIR that + contributes to the cache. + + o An upper time boundary is maintained to indicate the length of + time a directory cache entry is considered valid before the client + must revalidate the cached information. + + The revalidation technique parallels that discussed in the case of + name caching. When the client is not changing the directory in + question, checking the change attribute of the directory with GETATTR + is adequate. The lifetime of the cache entry can be extended at + these checkpoints. When a client is modifying the directory, the + client needs to use the change_info4 data to determine whether there + are other clients modifying the directory. If it is determined that + no other client modifications are occurring, the client may update + its directory cache to reflect its own changes. + + As demonstrated previously, directory caching requires that the + client revalidate directory cache data by inspecting the change + attribute of a directory at the point when the directory was cached. + This requires that the server update the change attribute for + directories when the contents of the corresponding directory is + modified. 
For a client to use the change_info4 information + appropriately and correctly, the server must report the pre and post + operation change attribute values atomically. When the server is + unable to report the before and after values atomically with respect + to the directory operation, the server must indicate that fact in the + change_info4 return value. When the information is not atomically + reported, the client should not assume that other clients have not + changed the directory. + + + +Shepler, et al. Standards Track [Page 119] + +RFC 3530 NFS version 4 Protocol April 2003 + + +10. Minor Versioning + + To address the requirement of an NFS protocol that can evolve as the + need arises, the NFS version 4 protocol contains the rules and + framework to allow for future minor changes or versioning. + + The base assumption with respect to minor versioning is that any + future accepted minor version must follow the IETF process and be + documented in a standards track RFC. Therefore, each minor version + number will correspond to an RFC. Minor version zero of the NFS + version 4 protocol is represented by this RFC. The COMPOUND + procedure will support the encoding of the minor version being + requested by the client. + + The following items represent the basic rules for the development of + minor versions. Note that a future minor version may decide to + modify or add to the following rules as part of the minor version + definition. + + 1. Procedures are not added or deleted + + To maintain the general RPC model, NFS version 4 minor versions + will not add to or delete procedures from the NFS program. + + 2. Minor versions may add operations to the COMPOUND and + CB_COMPOUND procedures. + + The addition of operations to the COMPOUND and CB_COMPOUND + procedures does not affect the RPC model. + + 2.1 Minor versions may append attributes to GETATTR4args, bitmap4, + and GETATTR4res. + + This allows for the expansion of the attribute model to allow + for future growth or adaptation. + + 2.2 Minor version X must append any new attributes after the last + documented attribute. + + Since attribute results are specified as an opaque array of + per-attribute XDR encoded results, the complexity of adding new + attributes in the midst of the current definitions will be too + burdensome. + + 3. Minor versions must not modify the structure of an existing + operation's arguments or results. + + + + + +Shepler, et al. Standards Track [Page 120] + +RFC 3530 NFS version 4 Protocol April 2003 + + + Again the complexity of handling multiple structure definitions + for a single operation is too burdensome. New operations should + be added instead of modifying existing structures for a minor + version. + + This rule does not preclude the following adaptations in a minor + version. + + o adding bits to flag fields such as new attributes to GETATTR's + bitmap4 data type + + o adding bits to existing attributes like ACLs that have flag + words + + o extending enumerated types (including NFS4ERR_*) with new + values + + 4. Minor versions may not modify the structure of existing + attributes. + + 5. Minor versions may not delete operations. + + This prevents the potential reuse of a particular operation + "slot" in a future minor version. + + 6. Minor versions may not delete attributes. + + 7. Minor versions may not delete flag bits or enumeration values. + + 8. Minor versions may declare an operation as mandatory to NOT + implement. 
+ + Specifying an operation as "mandatory to not implement" is + equivalent to obsoleting an operation. For the client, it means + that the operation should not be sent to the server. For the + server, an NFS error can be returned as opposed to "dropping" + the request as an XDR decode error. This approach allows for + the obsolescence of an operation while maintaining its structure + so that a future minor version can reintroduce the operation. + + 8.1 Minor versions may declare attributes mandatory to NOT + implement. + + 8.2 Minor versions may declare flag bits or enumeration values as + mandatory to NOT implement. + + 9. Minor versions may downgrade features from mandatory to + recommended, or recommended to optional. + + + +Shepler, et al. Standards Track [Page 121] + +RFC 3530 NFS version 4 Protocol April 2003 + + + 10. Minor versions may upgrade features from optional to recommended + or recommended to mandatory. + + 11. A client and server that support minor version X must support + minor versions 0 (zero) through X-1 as well. + + 12. No new features may be introduced as mandatory in a minor + version. + + This rule allows for the introduction of new functionality and + forces the use of implementation experience before designating a + feature as mandatory. + + 13. A client MUST NOT attempt to use a stateid, filehandle, or + similar returned object from the COMPOUND procedure with minor + version X for another COMPOUND procedure with minor version Y, + where X != Y. + +11. Internationalization + + The primary issue in which NFS version 4 needs to deal with + internationalization, or I18N, is with respect to file names and + other strings as used within the protocol. The choice of string + representation must allow reasonable name/string access to clients + which use various languages. The UTF-8 encoding of the UCS as + defined by [ISO10646] allows for this type of access and follows the + policy described in "IETF Policy on Character Sets and Languages", + [RFC2277]. + + [RFC3454], otherwise know as "stringprep", documents a framework for + using Unicode/UTF-8 in networking protocols, so as "to increase the + likelihood that string input and string comparison work in ways that + make sense for typical users throughout the world." A protocol must + define a profile of stringprep "in order to fully specify the + processing options." The remainder of this Internationalization + section defines the NFS version 4 stringprep profiles. Much of + terminology used for the remainder of this section comes from + stringprep. + + There are three UTF-8 string types defined for NFS version 4: + utf8str_cs, utf8str_cis, and utf8str_mixed. Separate profiles are + defined for each. Each profile defines the following, as required by + stringprep: + + o The intended applicability of the profile + + + + + + +Shepler, et al. 
Standards Track [Page 122] + +RFC 3530 NFS version 4 Protocol April 2003 + + + o The character repertoire that is the input and output to + stringprep (which is Unicode 3.2 for referenced version of + stringprep) + + o The mapping tables from stringprep used (as described in section 3 + of stringprep) + + o Any additional mapping tables specific to the profile + + o The Unicode normalization used, if any (as described in section 4 + of stringprep) + + o The tables from stringprep listing of characters that are + prohibited as output (as described in section 5 of stringprep) + + o The bidirectional string testing used, if any (as described in + section 6 of stringprep) + + o Any additional characters that are prohibited as output specific + to the profile + + Stringprep discusses Unicode characters, whereas NFS version 4 + renders UTF-8 characters. Since there is a one to one mapping from + UTF-8 to Unicode, where ever the remainder of this document refers to + to Unicode, the reader should assume UTF-8. + + Much of the text for the profiles comes from [RFC3454]. + +11.1. Stringprep profile for the utf8str_cs type + + Every use of the utf8str_cs type definition in the NFS version 4 + protocol specification follows the profile named nfs4_cs_prep. + +11.1.1. Intended applicability of the nfs4_cs_prep profile + + The utf8str_cs type is a case sensitive string of UTF-8 characters. + Its primary use in NFS Version 4 is for naming components and + pathnames. Components and pathnames are stored on the server's + filesystem. Two valid distinct UTF-8 strings might be the same after + processing via the utf8str_cs profile. If the strings are two names + inside a directory, the NFS version 4 server will need to either: + + o disallow the creation of a second name if it's post processed form + collides with that of an existing name, or + + o allow the creation of the second name, but arrange so that after + post processing, the second name is different than the post + processed form of the first name. + + + +Shepler, et al. Standards Track [Page 123] + +RFC 3530 NFS version 4 Protocol April 2003 + + +11.1.2. Character repertoire of nfs4_cs_prep + + The nfs4_cs_prep profile uses Unicode 3.2, as defined in stringprep's + Appendix A.1 + +11.1.3. Mapping used by nfs4_cs_prep + + The nfs4_cs_prep profile specifies mapping using the following tables + from stringprep: + + Table B.1 + + Table B.2 is normally not part of the nfs4_cs_prep profile as it is + primarily for dealing with case-insensitive comparisons. However, if + the NFS version 4 file server supports the case_insensitive + filesystem attribute, and if case_insensitive is true, the NFS + version 4 server MUST use Table B.2 (in addition to Table B1) when + processing utf8str_cs strings, and the NFS version 4 client MUST + assume Table B.2 (in addition to Table B.1) are being used. + + If the case_preserving attribute is present and set to false, then + the NFS version 4 server MUST use table B.2 to map case when + processing utf8str_cs strings. Whether the server maps from lower to + upper case or the upper to lower case is an implementation + dependency. + +11.1.4. Normalization used by nfs4_cs_prep + + The nfs4_cs_prep profile does not specify a normalization form. A + later revision of this specification may specify a particular + normalization form. Therefore, the server and client can expect that + they may receive unnormalized characters within protocol requests and + responses. 
If the operating environment requires normalization, then
+   the implementation must normalize utf8str_cs strings within the
+   protocol before presenting the information to an application (at the
+   client) or local filesystem (at the server).
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 124]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+11.1.5. Prohibited output for nfs4_cs_prep
+
+   The nfs4_cs_prep profile specifies prohibiting using the following
+   tables from stringprep:
+
+      Table C.3
+      Table C.4
+      Table C.5
+      Table C.6
+      Table C.7
+      Table C.8
+      Table C.9
+
+11.1.6. Bidirectional output for nfs4_cs_prep
+
+   The nfs4_cs_prep profile does not specify any checking of
+   bidirectional strings.
+
+11.2. Stringprep profile for the utf8str_cis type
+
+   Every use of the utf8str_cis type definition in the NFS version 4
+   protocol specification follows the profile named nfs4_cis_prep.
+
+11.2.1. Intended applicability of the nfs4_cis_prep profile
+
+   The utf8str_cis type is a case insensitive string of UTF-8
+   characters.  Its primary use in NFS Version 4 is for naming NFS
+   servers.
+
+11.2.2. Character repertoire of nfs4_cis_prep
+
+   The nfs4_cis_prep profile uses Unicode 3.2, as defined in
+   stringprep's Appendix A.1.
+
+11.2.3. Mapping used by nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies mapping using the following
+   tables from stringprep:
+
+      Table B.1
+      Table B.2
+
+11.2.4. Normalization used by nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies using Unicode normalization form
+   KC, as described in stringprep.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 125]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+11.2.5. Prohibited output for nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies prohibiting using the following
+   tables from stringprep:
+
+      Table C.1.2
+      Table C.2.2
+      Table C.3
+      Table C.4
+      Table C.5
+      Table C.6
+      Table C.7
+      Table C.8
+      Table C.9
+
+11.2.6. Bidirectional output for nfs4_cis_prep
+
+   The nfs4_cis_prep profile specifies checking bidirectional strings as
+   described in stringprep's section 6.
+
+11.3. Stringprep profile for the utf8str_mixed type
+
+   Every use of the utf8str_mixed type definition in the NFS version 4
+   protocol specification follows the profile named nfs4_mixed_prep.
+
+11.3.1. Intended applicability of the nfs4_mixed_prep profile
+
+   The utf8str_mixed type is a string of UTF-8 characters, with a prefix
+   that is case sensitive, a separator equal to '@', and a suffix that
+   is a fully qualified domain name.  Its primary use in NFS Version 4
+   is for naming principals identified in an Access Control Entry.
+
+11.3.2. Character repertoire of nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile uses Unicode 3.2, as defined in
+   stringprep's Appendix A.1.
+
+11.3.3. Mapping used by nfs4_mixed_prep
+
+   For the prefix and the separator of a utf8str_mixed string, the
+   nfs4_mixed_prep profile specifies mapping using the following table
+   from stringprep:
+
+      Table B.1
+
+   For the suffix of a utf8str_mixed string, the nfs4_mixed_prep profile
+   specifies mapping using the following tables from stringprep:
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 126]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+      Table B.1
+      Table B.2
+
+11.3.4. Normalization used by nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile specifies using Unicode normalization
+   form KC, as described in stringprep.
+
+11.3.5. Prohibited output for nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile specifies prohibiting using the following
+   tables from stringprep:
+
+      Table C.1.2
+      Table C.2.2
+      Table C.3
+      Table C.4
+      Table C.5
+      Table C.6
+      Table C.7
+      Table C.8
+      Table C.9
+
+11.3.6. Bidirectional output for nfs4_mixed_prep
+
+   The nfs4_mixed_prep profile specifies checking bidirectional strings
+   as described in stringprep's section 6.
+
+11.4. UTF-8 Related Errors
+
+   Where the client sends an invalid UTF-8 string, the server should
+   return an NFS4ERR_INVAL error.  This includes cases in which
+   inappropriate prefixes are detected and where the count includes
+   trailing bytes that do not constitute a full UCS character.
+
+   Where the client-supplied string is valid UTF-8 but contains
+   characters that are not supported by the server as a value for that
+   string (e.g., names containing characters that have more than two
+   octets on a filesystem that supports Unicode characters only), the
+   server should return an NFS4ERR_BADCHAR error.
+
+   Where a UTF-8 string is used as a file name, and the filesystem,
+   while supporting all of the characters within the name, does not
+   allow that particular name to be used, the server should return the
+   error NFS4ERR_BADNAME.  This includes situations in which the server
+   filesystem imposes a normalization constraint on name strings, but
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 127]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   will also include such situations as filesystem prohibitions of "."
+   and ".." as file names for certain operations, and other such
+   constraints.
+
+12. Error Definitions
+
+   NFS error numbers are assigned to failed operations within a compound
+   request.  A compound request contains a number of NFS operations that
+   have their results encoded in sequence in a compound reply.  The
+   results of successful operations will consist of an NFS4_OK status
+   followed by the encoded results of the operation.  If an NFS
+   operation fails, an error status will be entered in the reply and the
+   compound request will be terminated.
+
+   A description of each defined error follows:
+
+   NFS4_OK               Indicates the operation completed successfully.
+
+   NFS4ERR_ACCESS        Permission denied.  The caller does not have
+                         the correct permission to perform the requested
+                         operation.  Contrast this with NFS4ERR_PERM,
+                         which restricts itself to owner or privileged
+                         user permission failures.
+
+   NFS4ERR_ATTRNOTSUPP   An attribute specified is not supported by the
+                         server.  Does not apply to the GETATTR
+                         operation.
+
+   NFS4ERR_ADMIN_REVOKED Due to administrator intervention, the
+                         lockowner's record locks, share reservations,
+                         and delegations have been revoked by the
+                         server.
+
+   NFS4ERR_BADCHAR       A UTF-8 string contains a character which is
+                         not supported by the server in the context in
+                         which it is being used.
+
+   NFS4ERR_BAD_COOKIE    READDIR cookie is stale.
+
+   NFS4ERR_BADHANDLE     Illegal NFS filehandle.  The filehandle failed
+                         internal consistency checks.
+
+   NFS4ERR_BADNAME       A name string in a request consists of valid
+                         UTF-8 characters supported by the server but
+                         the name is not supported by the server as a
+                         valid name for the current operation.
+
+
+
+Shepler, et al.             Standards Track                   [Page 128]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   NFS4ERR_BADOWNER      An owner, owner_group, or ACL attribute value
+                         can not be translated to local representation.
+
+   NFS4ERR_BADTYPE       An attempt was made to create an object of a
+                         type not supported by the server.
+
+   NFS4ERR_BAD_RANGE     The range for a LOCK, LOCKT, or LOCKU operation
+                         is not appropriate to the allowable range of
+                         offsets for the server.
+
+   NFS4ERR_BAD_SEQID     The sequence number in a locking request is
+                         neither the next expected number nor the last
+                         number processed.
+
+   NFS4ERR_BAD_STATEID   A stateid generated by the current server
+                         instance, but which does not designate any
+                         locking state (either current or superseded)
+                         for a current lockowner-file pair, was used.
+
+   NFS4ERR_BADXDR        The server encountered an XDR decoding error
+                         while processing an operation.
+
+   NFS4ERR_CLID_INUSE    The SETCLIENTID operation has found that a
+                         client id is already in use by another client.
+
+   NFS4ERR_DEADLOCK      The server has been able to determine a file
+                         locking deadlock condition for a blocking lock
+                         request.
+
+   NFS4ERR_DELAY         The server initiated the request, but was not
+                         able to complete it in a timely fashion.  The
+                         client should wait and then try the request
+                         with a new RPC transaction ID.  For example,
+                         this error should be returned from a server
+                         that supports hierarchical storage and receives
+                         a request to process a file that has been
+                         migrated.  In this case, the server should
+                         start the immigration process and respond to
+                         the client with this error.  This error may
+                         also occur when a necessary delegation recall
+                         makes processing a request in a timely fashion
+                         impossible.
+
+   NFS4ERR_DENIED        An attempt to lock a file is denied.  Since
+                         this may be a temporary condition, the client
+                         is encouraged to retry the lock request until
+                         the lock is accepted.
+
+
+
+Shepler, et al.             Standards Track                   [Page 129]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   NFS4ERR_DQUOT         Resource (quota) hard limit exceeded.  The
+                         user's resource limit on the server has been
+                         exceeded.
+
+   NFS4ERR_EXIST         File exists.  The file specified already
+                         exists.
+
+   NFS4ERR_EXPIRED       A lease has expired that is being used in the
+                         current operation.
+
+   NFS4ERR_FBIG          File too large.  The operation would have
+                         caused a file to grow beyond the server's
+                         limit.
+
+   NFS4ERR_FHEXPIRED     The filehandle provided is volatile and has
+                         expired at the server.
+
+   NFS4ERR_FILE_OPEN     The operation can not be successfully processed
+                         because a file involved in the operation is
+                         currently open.
+
+   NFS4ERR_GRACE         The server is in its recovery or grace period
+                         which should match the lease period of the
+                         server.
+
+   NFS4ERR_INVAL         Invalid argument or unsupported argument for an
+                         operation.  Two examples are attempting a
+                         READLINK on an object other than a symbolic
+                         link or specifying a value for an enum field
+                         that is not defined in the protocol (e.g.,
+                         nfs_ftype4).
+
+   NFS4ERR_IO            I/O error.  A hard error (for example, a disk
+                         error) occurred while processing the requested
+                         operation.
+
+   NFS4ERR_ISDIR         Is a directory.  The caller specified a
+                         directory in a non-directory operation.
+
+   NFS4ERR_LEASE_MOVED   A lease being renewed is associated with a
+                         filesystem that has been migrated to a new
+                         server.
+
+   NFS4ERR_LOCKED        A read or write operation was attempted on a
+                         locked file.
+
+   NFS4ERR_LOCK_NOTSUPP  Server does not support atomic upgrade or
+                         downgrade of locks.
+
+
+
+Shepler, et al.             Standards Track                   [Page 130]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   NFS4ERR_LOCK_RANGE    A lock request is operating on a sub-range of a
+                         current lock for the lock owner and the server
+                         does not support this type of request.
+
+   NFS4ERR_LOCKS_HELD    A CLOSE was attempted and file locks would
+                         exist after the CLOSE.
+ + NFS4ERR_MINOR_VERS_MISMATCH + The server has received a request that + specifies an unsupported minor version. The + server must return a COMPOUND4res with a zero + length operations result array. + + NFS4ERR_MLINK Too many hard links. + + NFS4ERR_MOVED The filesystem which contains the current + filehandle object has been relocated or + migrated to another server. The client may + obtain the new filesystem location by obtaining + the "fs_locations" attribute for the current + filehandle. For further discussion, refer to + the section "Filesystem Migration or + Relocation". + + NFS4ERR_NAMETOOLONG The filename in an operation was too long. + + NFS4ERR_NOENT No such file or directory. The file or + directory name specified does not exist. + + NFS4ERR_NOFILEHANDLE The logical current filehandle value (or, in + the case of RESTOREFH, the saved filehandle + value) has not been set properly. This may be + a result of a malformed COMPOUND operation + (i.e., no PUTFH or PUTROOTFH before an + operation that requires the current filehandle + be set). + + NFS4ERR_NO_GRACE A reclaim of client state has fallen outside of + the grace period of the server. As a result, + the server can not guarantee that conflicting + state has not been provided to another client. + + NFS4ERR_NOSPC No space left on device. The operation would + have caused the server's filesystem to exceed + its limit. + + NFS4ERR_NOTDIR Not a directory. The caller specified a non- + directory in a directory operation. + + + +Shepler, et al. Standards Track [Page 131] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_NOTEMPTY An attempt was made to remove a directory that + was not empty. + + NFS4ERR_NOTSUPP Operation is not supported. + + NFS4ERR_NOT_SAME This error is returned by the VERIFY operation + to signify that the attributes compared were + not the same as provided in the client's + request. + + NFS4ERR_NXIO I/O error. No such device or address. + + NFS4ERR_OLD_STATEID A stateid which designates the locking state + for a lockowner-file at an earlier time was + used. + + NFS4ERR_OPENMODE The client attempted a READ, WRITE, LOCK or + SETATTR operation not sanctioned by the stateid + passed (e.g., writing to a file opened only for + read). + + NFS4ERR_OP_ILLEGAL An illegal operation value has been specified + in the argop field of a COMPOUND or CB_COMPOUND + procedure. + + NFS4ERR_PERM Not owner. The operation was not allowed + because the caller is either not a privileged + user (root) or not the owner of the target of + the operation. + + NFS4ERR_RECLAIM_BAD The reclaim provided by the client does not + match any of the server's state consistency + checks and is bad. + + NFS4ERR_RECLAIM_CONFLICT + The reclaim provided by the client has + encountered a conflict and can not be provided. + Potentially indicates a misbehaving client. + + NFS4ERR_RESOURCE For the processing of the COMPOUND procedure, + the server may exhaust available resources and + can not continue processing operations within + the COMPOUND procedure. This error will be + returned from the server in those instances of + resource exhaustion related to the processing + of the COMPOUND procedure. + + + + + +Shepler, et al. Standards Track [Page 132] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_RESTOREFH The RESTOREFH operation does not have a saved + filehandle (identified by SAVEFH) to operate + upon. + + NFS4ERR_ROFS Read-only filesystem. A modifying operation was + attempted on a read-only filesystem. 
+ + NFS4ERR_SAME This error is returned by the NVERIFY operation + to signify that the attributes compared were + the same as provided in the client's request. + + NFS4ERR_SERVERFAULT An error occurred on the server which does not + map to any of the legal NFS version 4 protocol + error values. The client should translate this + into an appropriate error. UNIX clients may + choose to translate this to EIO. + + NFS4ERR_SHARE_DENIED An attempt to OPEN a file with a share + reservation has failed because of a share + conflict. + + NFS4ERR_STALE Invalid filehandle. The filehandle given in the + arguments was invalid. The file referred to by + that filehandle no longer exists or access to + it has been revoked. + + NFS4ERR_STALE_CLIENTID A clientid not recognized by the server was + used in a locking or SETCLIENTID_CONFIRM + request. + + NFS4ERR_STALE_STATEID A stateid generated by an earlier server + instance was used. + + NFS4ERR_SYMLINK The current filehandle provided for a LOOKUP is + not a directory but a symbolic link. Also used + if the final component of the OPEN path is a + symbolic link. + + NFS4ERR_TOOSMALL The encoded response to a READDIR request + exceeds the size limit set by the initial + request. + + NFS4ERR_WRONGSEC The security mechanism being used by the client + for the operation does not match the server's + security policy. The client should change the + security mechanism being used and retry the + operation. + + + + +Shepler, et al. Standards Track [Page 133] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_XDEV Attempt to do an operation between different + fsids. + +13. NFS version 4 Requests + + For the NFS version 4 RPC program, there are two traditional RPC + procedures: NULL and COMPOUND. All other functionality is defined as + a set of operations and these operations are defined in normal + XDR/RPC syntax and semantics. However, these operations are + encapsulated within the COMPOUND procedure. This requires that the + client combine one or more of the NFS version 4 operations into a + single request. + + The NFS4_CALLBACK program is used to provide server to client + signaling and is constructed in a similar fashion as the NFS version + 4 program. The procedures CB_NULL and CB_COMPOUND are defined in the + same way as NULL and COMPOUND are within the NFS program. The + CB_COMPOUND request also encapsulates the remaining operations of the + NFS4_CALLBACK program. There is no predefined RPC program number for + the NFS4_CALLBACK program. It is up to the client to specify a + program number in the "transient" program range. The program and + port number of the NFS4_CALLBACK program are provided by the client + as part of the SETCLIENTID/SETCLIENTID_CONFIRM sequence. The program + and port can be changed by another SETCLIENTID/SETCLIENTID_CONFIRM + sequence, and it is possible to use the sequence to change them + within a client incarnation without removing relevant leased client + state. + +13.1. Compound Procedure + + The COMPOUND procedure provides the opportunity for better + performance within high latency networks. The client can avoid + cumulative latency of multiple RPCs by combining multiple dependent + operations into a single COMPOUND procedure. A compound operation + may provide for protocol simplification by allowing the client to + combine basic procedures into a single request that is customized for + the client's environment. + + The CB_COMPOUND procedure precisely parallels the features of + COMPOUND as described above. 
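+
+   As a non-normative illustration of the latency argument above, the
+   following self-contained C sketch compares the cost of issuing four
+   dependent operations as four separate RPCs against sending them in
+   one COMPOUND.  The operation names and the assumed round-trip time
+   are hypothetical; nothing here is part of the protocol itself.
+
+      #include <stdio.h>
+
+      static const char *ops[] = { "PUTFH", "LOOKUP", "GETFH", "READ" };
+      #define NUMOPS (sizeof(ops) / sizeof(ops[0]))
+      #define RTT_MS 40.0      /* assumed network round-trip time */
+
+      int main(void)
+      {
+          /* One RPC per operation: each pays a full round trip. */
+          printf("separate RPCs: %zu x %.0f ms = %.0f ms\n",
+                 NUMOPS, RTT_MS, (double)NUMOPS * RTT_MS);
+
+          /* The same operations in a single COMPOUND: one round trip. */
+          printf("one COMPOUND (");
+          for (size_t i = 0; i < NUMOPS; i++)
+              printf("%s%s", ops[i], i + 1 < NUMOPS ? " " : "");
+          printf("): %.0f ms\n", RTT_MS);
+          return 0;
+      }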
+
+   The basic structure of the COMPOUND procedure is:
+
+     +-----+--------------+--------+-----------+-----------+-----------+--
+     | tag | minorversion | numops | op + args | op + args | op + args |
+     +-----+--------------+--------+-----------+-----------+-----------+--
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 134]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   and the reply's structure is:
+
+     +------------+-----+--------+-----------------------+--
+     |last status | tag | numres | status + op + results |
+     +------------+-----+--------+-----------------------+--
+
+   The numops and numres fields, used in the depiction above, represent
+   the count for the counted array encoding used to signify the number
+   of arguments or results encoded in the request and response.  As per
+   the XDR encoding, these counts must match exactly the number of
+   operation arguments or results encoded.
+
+13.2. Evaluation of a Compound Request
+
+   The server will process the COMPOUND procedure by evaluating each of
+   the operations within the COMPOUND procedure in order.  Each
+   component operation consists of a 32 bit operation code, followed by
+   the argument of length determined by the type of operation.  The
+   results of each operation are encoded in sequence into a reply
+   buffer.  The results of each operation are preceded by the opcode and
+   a status code (normally zero).  If an operation results in a non-zero
+   status code, the status will be encoded and evaluation of the
+   compound sequence will halt and the reply will be returned.  Note
+   that evaluation stops even in the event of "non error" conditions
+   such as NFS4ERR_SAME.
+
+   There are no atomicity requirements for the operations contained
+   within the COMPOUND procedure.  The operations being evaluated as
+   part of a COMPOUND request may be evaluated simultaneously with other
+   COMPOUND requests that the server receives.
+
+   It is the client's responsibility to recover from any partially
+   completed COMPOUND procedure.  Partially completed COMPOUND
+   procedures may occur at any point due to errors such as
+   NFS4ERR_RESOURCE and NFS4ERR_DELAY.  This may occur even given an
+   otherwise valid operation string.  Further, a server reboot which
+   occurs in the middle of processing a COMPOUND procedure may leave the
+   client with the difficult task of determining how far COMPOUND
+   processing has proceeded.  Therefore, the client should avoid overly
+   complex COMPOUND procedures in the event of the failure of an
+   operation within the procedure.
+
+   Each operation assumes a "current" and "saved" filehandle that is
+   available as part of the execution context of the compound request.
+   Operations may set, change, or return the current filehandle.  The
+   "saved" filehandle is used for temporary storage of a filehandle
+   value and as operands for the RENAME and LINK operations.
+
+
+
+Shepler, et al.             Standards Track                   [Page 135]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+13.3. Synchronous Modifying Operations
+
+   NFS version 4 operations that modify the filesystem are synchronous.
+   When an operation is successfully completed at the server, the client
+   can depend on the fact that any data associated with the request is
+   now on stable storage (the one exception is in the case of the file
+   data in a WRITE operation with the UNSTABLE option specified).
+
+   This implies that any previous operations within the same compound
+   request are also reflected in stable storage.  This behavior enables
+   the client to recover from a partially executed compound request
+   which may have resulted from the failure of the server.  For example,
+   if a compound request contains operations A and B and the server is
+   unable to send a response to the client, depending on the progress
+   the server made in servicing the request, the result of both
+   operations may be reflected in stable storage or just operation A may
+   be reflected.  The server must not have just the results of operation
+   B in stable storage.
+
+13.4. Operation Values
+
+   The operations encoded in the COMPOUND procedure are identified by
+   operation values.  To avoid overlap with the RPC procedure numbers,
+   operations 0 (zero) and 1 are not defined.  Operation 2 is not
+   defined but reserved for future use with minor versioning.
+
+14. NFS version 4 Procedures
+
+14.1. Procedure 0: NULL - No Operation
+
+   SYNOPSIS
+
+
+
+   ARGUMENT
+
+      void;
+
+   RESULT
+
+      void;
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 136]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   DESCRIPTION
+
+   Standard NULL procedure.  Void argument, void response.  This
+   procedure has no functionality associated with it.  Because of this
+   it is sometimes used to measure the overhead of processing a service
+   request.  Therefore, the server should ensure that no unnecessary
+   work is done in servicing this procedure.
+
+   ERRORS
+
+      None.
+
+14.2. Procedure 1: COMPOUND - Compound Operations
+
+   SYNOPSIS
+
+      compoundargs -> compoundres
+
+   ARGUMENT
+
+      union nfs_argop4 switch (nfs_opnum4 argop) {
+       case <OPCODE>: <argument>;
+       ...
+      };
+
+      struct COMPOUND4args {
+              utf8str_cs      tag;
+              uint32_t        minorversion;
+              nfs_argop4      argarray<>;
+      };
+
+   RESULT
+
+      union nfs_resop4 switch (nfs_opnum4 resop){
+       case <OPCODE>: <result>;
+       ...
+      };
+
+      struct COMPOUND4res {
+              nfsstat4        status;
+              utf8str_cs      tag;
+              nfs_resop4      resarray<>;
+      };
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 137]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   DESCRIPTION
+
+   The COMPOUND procedure is used to combine one or more of the NFS
+   operations into a single RPC request.  The main NFS RPC program has
+   two main procedures: NULL and COMPOUND.  All other operations use the
+   COMPOUND procedure as a wrapper.
+
+   The COMPOUND procedure is used to combine individual operations into
+   a single RPC request.  The server interprets each of the operations
+   in turn.  If an operation is executed by the server and the status of
+   that operation is NFS4_OK, then the next operation in the COMPOUND
+   procedure is executed.  The server continues this process until there
+   are no more operations to be executed or one of the operations has a
+   status value other than NFS4_OK.
+
+   In the processing of the COMPOUND procedure, the server may find that
+   it does not have the available resources to execute any or all of the
+   operations within the COMPOUND sequence.  In this case, the error
+   NFS4ERR_RESOURCE will be returned for the particular operation within
+   the COMPOUND procedure where the resource exhaustion occurred.  This
+   assumes that all previous operations within the COMPOUND sequence
+   have been evaluated successfully.  The results for all of the
+   evaluated operations must be returned to the client.
+
+   The server will generally choose between two methods of decoding the
+   client's request.  The first would be the traditional one-pass XDR
+   decode, in which decoding of the entire COMPOUND precedes execution
+   of any operation within it.  If there is an XDR decoding error in
+   this case, an RPC XDR decode error would be returned.  The second
+   method would be to make an initial pass to decode the basic COMPOUND
+   request and then to XDR decode each of the individual operations, as
+   the server is ready to execute it.  In this case, the server may
+   encounter an XDR decode error during such an operation decode, after
+   previous operations within the COMPOUND have been executed.  In this
+   case, the server would return the error NFS4ERR_BADXDR to signify the
+   decode error.
+
+   The COMPOUND arguments contain a "minorversion" field.  The initial
+   and default value for this field is 0 (zero).  This field will be
+   used by future minor versions such that the client can communicate to
+   the server what minor version is being requested.  If the server
+   receives a COMPOUND procedure with a minorversion field value that it
+   does not support, the server MUST return an error of
+   NFS4ERR_MINOR_VERS_MISMATCH and a zero length resultdata array.
+
+   Contained within the COMPOUND results is a "status" field.  If the
+   results array length is non-zero, this status must be equivalent to
+   the status of the last operation that was executed within the
+
+
+
+Shepler, et al.             Standards Track                   [Page 138]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   COMPOUND procedure.  Therefore, if an operation incurred an error
+   then the "status" value will be the same error value as is being
+   returned for the operation that failed.
+
+   Note that operations 0 (zero) and 1 (one) are not defined for the
+   COMPOUND procedure.  Operation 2 is not defined but reserved for
+   future definition and use with minor versioning.  If the server
+   receives an operation array that contains operation 2 and the
+   minorversion field has a value of 0 (zero), an error of
+   NFS4ERR_OP_ILLEGAL, as described in the next paragraph, is returned
+   to the client.  If an operation array contains an operation 2 and the
+   minorversion field is non-zero and the server does not support the
+   minor version, the server returns an error of
+   NFS4ERR_MINOR_VERS_MISMATCH.  Therefore, the
+   NFS4ERR_MINOR_VERS_MISMATCH error takes precedence over all other
+   errors.
+
+   It is possible that the server receives a request that contains an
+   operation that is less than the first legal operation (OP_ACCESS) or
+   greater than the last legal operation (OP_RELEASE_LOCKOWNER).
+
+   In this case, the server's response will encode the opcode OP_ILLEGAL
+   rather than the illegal opcode of the request.  The status field in
+   the ILLEGAL return results will be set to NFS4ERR_OP_ILLEGAL.  The
+   COMPOUND procedure's return results will also be NFS4ERR_OP_ILLEGAL.
+
+   The definition of the "tag" in the request is left to the
+   implementor.  It may be used to summarize the content of the compound
+   request for the benefit of packet sniffers and engineers debugging
+   implementations.  However, the value of "tag" in the response SHOULD
+   be the same value as provided in the request.  This applies to the
+   tag field of the CB_COMPOUND procedure as well.
+
+   IMPLEMENTATION
+
+   Since an error of any type may occur after only a portion of the
+   operations have been evaluated, the client must be prepared to
+   recover from any failure.  If the source of an NFS4ERR_RESOURCE error
+   was a complex or lengthy set of operations, it is likely that if the
+   number of operations were reduced the server would be able to
+   evaluate them successfully.  Therefore, the client is responsible for
+   dealing with this type of complexity in recovery.
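+
+   As a non-normative illustration of the evaluation rules above, the
+   following self-contained C sketch models a server's COMPOUND loop:
+   operations are evaluated in order, each result carries the
+   operation's status, evaluation halts at the first non-NFS4_OK
+   status, and the overall status equals the status of the last
+   operation executed.  The handlers and error values are placeholders,
+   not the protocol's XDR or a real server implementation.
+
+      #include <stdio.h>
+
+      enum { NFS4_OK = 0, NFS4ERR_NOENT = 2 };  /* placeholder values */
+
+      struct op  { const char *name; int (*run)(void); };
+      struct res { const char *name; int status; };
+
+      static int op_putrootfh(void) { return NFS4_OK; }
+      static int op_lookup(void)    { return NFS4ERR_NOENT; }
+      static int op_getfh(void)     { return NFS4_OK; }
+
+      /* Evaluate ops in order; stop at the first non-NFS4_OK status. */
+      static int eval_compound(const struct op *ops, int numops,
+                               struct res *resarray, int *numres)
+      {
+          int status = NFS4_OK;
+
+          *numres = 0;
+          for (int i = 0; i < numops; i++) {
+              status = ops[i].run();
+              resarray[i].name = ops[i].name;
+              resarray[i].status = status;
+              (*numres)++;
+              if (status != NFS4_OK)
+                  break;        /* remaining ops are not evaluated */
+          }
+          return status;        /* status of the last op executed */
+      }
+
+      int main(void)
+      {
+          struct op ops[] = {
+              { "PUTROOTFH", op_putrootfh },
+              { "LOOKUP",    op_lookup },
+              { "GETFH",     op_getfh },   /* never reached */
+          };
+          struct res results[3];
+          int numres;
+          int status = eval_compound(ops, 3, results, &numres);
+
+          for (int i = 0; i < numres; i++)
+              printf("%s -> %d\n", results[i].name, results[i].status);
+          printf("COMPOUND status = %d (numres = %d)\n", status, numres);
+          return 0;
+      }
+
+   Note how numres (two in this example) counts only the operations
+   actually evaluated, mirroring the counted-array encoding of the
+   reply.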
+
+   ERRORS
+
+      All errors defined in the protocol
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 139]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+14.2.1. Operation 3: ACCESS - Check Access Rights
+
+   SYNOPSIS
+
+      (cfh), accessreq -> supported, accessrights
+
+   ARGUMENT
+
+      const ACCESS4_READ      = 0x00000001;
+      const ACCESS4_LOOKUP    = 0x00000002;
+      const ACCESS4_MODIFY    = 0x00000004;
+      const ACCESS4_EXTEND    = 0x00000008;
+      const ACCESS4_DELETE    = 0x00000010;
+      const ACCESS4_EXECUTE   = 0x00000020;
+
+      struct ACCESS4args {
+              /* CURRENT_FH: object */
+              uint32_t        access;
+      };
+
+   RESULT
+
+      struct ACCESS4resok {
+              uint32_t        supported;
+              uint32_t        access;
+      };
+
+      union ACCESS4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              ACCESS4resok    resok4;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   ACCESS determines the access rights that a user, as identified by the
+   credentials in the RPC request, has with respect to the file system
+   object specified by the current filehandle.  The client encodes the
+   set of access rights that are to be checked in the bit mask "access".
+   The server checks the permissions encoded in the bit mask.  If a
+   status of NFS4_OK is returned, two bit masks are included in the
+   response.  The first, "supported", represents the access rights that
+   the server can verify reliably.  The second, "access", represents the
+   access rights available to the user for the filehandle provided.  On
+   success, the current filehandle retains its value.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 140]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   Note that the supported field will contain only as many values as
+   were originally sent in the arguments.  For example, if the client
+   sends an ACCESS operation with only the ACCESS4_READ value set and
+   the server supports this value, the server will return only
+   ACCESS4_READ even if it could have reliably checked other values.
+
+   The results of this operation are necessarily advisory in nature.  A
+   return status of NFS4_OK and the appropriate bit set in the bit mask
+   does not imply that such access will be allowed to the file system
+   object in the future.  This is because access rights can be revoked
+   by the server at any time.
+
+   The following access permissions may be requested:
+
+      ACCESS4_READ    Read data from file or read a directory.
+
+      ACCESS4_LOOKUP  Look up a name in a directory (no meaning for non-
+                      directory objects).
+
+      ACCESS4_MODIFY  Rewrite existing file data or modify existing
+                      directory entries.
+
+      ACCESS4_EXTEND  Write new data or add directory entries.
+
+      ACCESS4_DELETE  Delete an existing directory entry.
+
+      ACCESS4_EXECUTE Execute file (no meaning for a directory).
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   In general, it is not sufficient for the client to attempt to deduce
+   access permissions by inspecting the uid, gid, and mode fields in the
+   file attributes or by attempting to interpret the contents of the ACL
+   attribute.  This is because the server may perform uid or gid mapping
+   or enforce additional access control restrictions.  It is also
+   possible that the server may not be in the same ID space as the
+   client.  In these cases (and perhaps others), the client can not
+   reliably perform an access check with only current file attributes.
+
+   In the NFS version 2 protocol, the only reliable way to determine
+   whether an operation was allowed was to try it and see if it
+   succeeded or failed.  Using the ACCESS operation in the NFS version 4
+   protocol, the client can ask the server to indicate whether or not
+   one or more classes of operations are permitted.  The ACCESS
+   operation is provided to allow clients to check before doing a series
+   of operations which will result in an access failure.  The OPEN
+
+
+
+Shepler, et al.             Standards Track                   [Page 141]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   operation provides a point where the server can verify access to the
+   file object and a method to return that information to the client.
+   The ACCESS operation is still useful for directory operations or for
+   use in the case where the UNIX API "access" is used on the client.
+
+   The information returned by the server in response to an ACCESS call
+   is not permanent.  It was correct at the exact time that the server
+   performed the checks, but not necessarily afterwards.  The server can
+   revoke access permission at any time.
+
+   The client should use the effective credentials of the user to build
+   the authentication information in the ACCESS request used to
+   determine access rights.  It is the effective user and group
+   credentials that are used in subsequent read and write operations.
+
+   Many implementations do not directly support the ACCESS4_DELETE
+   permission.  Operating systems like UNIX will ignore the
+   ACCESS4_DELETE bit if set on an access request on a non-directory
+   object.  In these systems, delete permission on a file is determined
+   by the access permissions on the directory in which the file resides,
+   instead of being determined by the permissions of the file itself.
+   Therefore, the mask returned enumerating which access rights can be
+   determined will have the ACCESS4_DELETE value set to 0.  This
+   indicates to the client that the server was unable to check that
+   particular access right.  The ACCESS4_DELETE bit in the access mask
+   returned will then be ignored by the client.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.2. Operation 4: CLOSE - Close File
+
+   SYNOPSIS
+
+      (cfh), seqid, open_stateid -> open_stateid
+
+
+
+Shepler, et al.             Standards Track                   [Page 142]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   ARGUMENT
+
+      struct CLOSE4args {
+              /* CURRENT_FH: object */
+              seqid4          seqid;
+              stateid4        open_stateid;
+      };
+
+   RESULT
+
+      union CLOSE4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              stateid4        open_stateid;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   The CLOSE operation releases share reservations for the regular or
+   named attribute file as specified by the current filehandle.  The
+   share reservations and other state information released at the server
+   as a result of this CLOSE is only associated with the supplied
+   stateid.  The sequence id provides for the correct ordering.  State
+   associated with other OPENs is not affected.
+
+   If record locks are held, the client SHOULD release all locks before
+   issuing a CLOSE.  The server MAY free all outstanding locks on CLOSE
+   but some servers may not support the CLOSE of a file that still has
+   record locks held.  The server MUST return failure if any locks would
+   exist after the CLOSE.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   Even though CLOSE returns a stateid, this stateid is not useful to
+   the client and should be treated as deprecated.  CLOSE "shuts down"
+   the state associated with all OPENs for the file by a single
+   open_owner.  As noted above, CLOSE will either release all file
+   locking state or return an error.  Therefore, the stateid returned by
+   CLOSE is not useful for operations that follow.
+
+   ERRORS
+
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_SEQID
+
+
+
+Shepler, et al.             Standards Track                   [Page 143]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKS_HELD
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.3. Operation 5: COMMIT - Commit Cached Data
+
+   SYNOPSIS
+
+      (cfh), offset, count -> verifier
+
+   ARGUMENT
+
+      struct COMMIT4args {
+              /* CURRENT_FH: file */
+              offset4         offset;
+              count4          count;
+      };
+
+   RESULT
+
+      struct COMMIT4resok {
+              verifier4       writeverf;
+      };
+
+      union COMMIT4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              COMMIT4resok    resok4;
+       default:
+              void;
+      };
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 144]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   DESCRIPTION
+
+   The COMMIT operation forces or flushes data to stable storage for the
+   file specified by the current filehandle.  The flushed data is that
+   which was previously written with a WRITE operation which had the
+   stable field set to UNSTABLE4.
+
+   The offset specifies the position within the file where the flush is
+   to begin.  An offset value of 0 (zero) means to flush data starting
+   at the beginning of the file.  The count specifies the number of
+   bytes of data to flush.  If count is 0 (zero), a flush from offset to
+   the end of the file is done.
+
+   The server returns a write verifier upon successful completion of the
+   COMMIT.  The write verifier is used by the client to determine if the
+   server has restarted or rebooted between the initial WRITE(s) and the
+   COMMIT.  The client does this by comparing the write verifier
+   returned from the initial writes and the verifier returned by the
+   COMMIT operation.  The server must vary the value of the write
+   verifier at each server event or instantiation that may lead to a
+   loss of uncommitted data.  Most commonly this occurs when the server
+   is rebooted; however, other events at the server may result in
+   uncommitted data loss as well.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   The COMMIT operation is similar in operation and semantics to the
+   POSIX fsync(2) system call that synchronizes a file's state with the
+   disk (file data and metadata is flushed to disk or stable storage).
+   COMMIT performs the same operation for a client, flushing any
+   unsynchronized data and metadata on the server to the server's disk
+   or stable storage for the specified file.  Like fsync(2), it may be
+   that there is some modified data or no modified data to synchronize.
+   The data may have been synchronized by the server's normal periodic
+   buffer synchronization activity.  COMMIT should return NFS4_OK,
+   unless there has been an unexpected error.
+
+   COMMIT differs from fsync(2) in that it is possible for the client to
+   flush a range of the file (most likely triggered by a buffer-
+   reclamation scheme on the client before the file has been completely
+   written).
+
+   The server implementation of COMMIT is reasonably simple.
If the + server receives a full file COMMIT request, that is starting at + offset 0 and count 0, it should do the equivalent of fsync()'ing the + file. Otherwise, it should arrange to have the cached data in the + + + +Shepler, et al. Standards Track [Page 145] + +RFC 3530 NFS version 4 Protocol April 2003 + + + range specified by offset and count to be flushed to stable storage. + In both cases, any metadata associated with the file must be flushed + to stable storage before returning. It is not an error for there to + be nothing to flush on the server. This means that the data and + metadata that needed to be flushed have already been flushed or lost + during the last server failure. + + The client implementation of COMMIT is a little more complex. There + are two reasons for wanting to commit a client buffer to stable + storage. The first is that the client wants to reuse a buffer. In + this case, the offset and count of the buffer are sent to the server + in the COMMIT request. The server then flushes any cached data based + on the offset and count, and flushes any metadata associated with the + file. It then returns the status of the flush and the write + verifier. The other reason for the client to generate a COMMIT is + for a full file flush, such as may be done at close. In this case, + the client would gather all of the buffers for this file that contain + uncommitted data, do the COMMIT operation with an offset of 0 and + count of 0, and then free all of those buffers. Any other dirty + buffers would be sent to the server in the normal fashion. + + After a buffer is written by the client with the stable parameter set + to UNSTABLE4, the buffer must be considered as modified by the client + until the buffer has either been flushed via a COMMIT operation or + written via a WRITE operation with stable parameter set to FILE_SYNC4 + or DATA_SYNC4. This is done to prevent the buffer from being freed + and reused before the data can be flushed to stable storage on the + server. + + When a response is returned from either a WRITE or a COMMIT operation + and it contains a write verifier that is different than previously + returned by the server, the client will need to retransmit all of the + buffers containing uncommitted cached data to the server. How this + is to be done is up to the implementor. If there is only one buffer + of interest, then it should probably be sent back over in a WRITE + request with the appropriate stable parameter. If there is more than + one buffer, it might be worthwhile retransmitting all of the buffers + in WRITE requests with the stable parameter set to UNSTABLE4 and then + retransmitting the COMMIT operation to flush all of the data on the + server to stable storage. The timing of these retransmissions is + left to the implementor. + + The above description applies to page-cache-based systems as well as + buffer-cache-based systems. In those systems, the virtual memory + system will need to be modified instead of the buffer cache. + + + + + + +Shepler, et al. Standards Track [Page 146] + +RFC 3530 NFS version 4 Protocol April 2003 + + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADHANDLE + NFS4ERR_BADXDR + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_ISDIR + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_RESOURCE + NFS4ERR_ROFS + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.4. 
Operation 6: CREATE - Create a Non-Regular File Object
+
+   SYNOPSIS
+
+      (cfh), name, type, attrs -> (cfh), change_info, attrs_set
+
+   ARGUMENT
+
+      union createtype4 switch (nfs_ftype4 type) {
+       case NF4LNK:
+              linktext4       linkdata;
+       case NF4BLK:
+       case NF4CHR:
+              specdata4       devdata;
+       case NF4SOCK:
+       case NF4FIFO:
+       case NF4DIR:
+              void;
+      };
+
+      struct CREATE4args {
+              /* CURRENT_FH: directory for creation */
+              createtype4     objtype;
+              component4      objname;
+              fattr4          createattrs;
+      };
+
+   RESULT
+
+      struct CREATE4resok {
+              change_info4    cinfo;
+              bitmap4         attrset;        /* attributes set */
+
+
+
+Shepler, et al.             Standards Track                   [Page 147]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+      };
+
+      union CREATE4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              CREATE4resok resok4;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   The CREATE operation creates a non-regular file object in a directory
+   with a given name.  The OPEN operation MUST be used to create a
+   regular file.
+
+   The objname specifies the name for the new object.  The objtype
+   determines the type of object to be created: directory, symlink, etc.
+
+   If an object of the same name already exists in the directory, the
+   server will return the error NFS4ERR_EXIST.
+
+   For the directory where the new file object was created, the server
+   returns change_info4 information in cinfo.  With the atomic field of
+   the change_info4 struct, the server will indicate if the before and
+   after change attributes were obtained atomically with respect to the
+   file object creation.
+
+   If the objname has a length of 0 (zero), or if objname does not obey
+   the UTF-8 definition, the error NFS4ERR_INVAL will be returned.
+
+   The current filehandle is replaced by that of the new object.
+
+   The createattrs specifies the initial set of attributes for the
+   object.  The set of attributes may include any writable attribute
+   valid for the object type.  When the operation is successful, the
+   server will return to the client an attribute mask signifying which
+   attributes were successfully set for the object.
+
+   If createattrs includes neither the owner attribute nor an ACL with
+   an ACE for the owner, and if the server's filesystem both supports
+   and requires an owner attribute (or an owner ACE) then the server
+   MUST derive the owner (or the owner ACE).  This would typically be
+   from the principal indicated in the RPC credentials of the call, but
+   the server's operating environment or filesystem semantics may
+   dictate other methods of derivation.  Similarly, if createattrs
+   includes neither the group attribute nor a group ACE, and if the
+   server's filesystem both supports and requires the notion of a group
+   attribute (or group ACE), the server MUST derive the group attribute
+
+
+
+Shepler, et al.             Standards Track                   [Page 148]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   (or the corresponding group ACE) for the file.  This could be from
+   the RPC call's credentials, such as the group principal if the
+   credentials include it (such as with AUTH_SYS), from the group
+   identifier associated with the principal in the credentials (e.g.,
+   POSIX systems have a passwd database that has the group identifier
+   for every user identifier), inherited from the directory the object
+   is created in, or whatever else the server's operating environment or
+   filesystem semantics dictate.  This applies to the OPEN operation
+   too.
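+
+   As a non-normative sketch of that derivation rule, the following
+   self-contained C fragment shows a server falling back to the RPC
+   credential when createattrs carries neither an owner nor a group.
+   The structures and field names are hypothetical simplifications; a
+   real server might instead inherit the group from the parent
+   directory or apply other filesystem semantics, as described above.
+
+      #include <stdio.h>
+
+      /* Hypothetical, much-simplified view of createattrs. */
+      struct createattrs {
+          const char *owner;        /* NULL if the client did not set it */
+          const char *owner_group;  /* NULL if the client did not set it */
+      };
+
+      /* Hypothetical view of the RPC credential of the call. */
+      struct cred {
+          const char *principal;        /* e.g., "alice@example.net" */
+          const char *group_principal;  /* e.g., "staff@example.net" */
+      };
+
+      /* Derive missing ownership attributes, typically from the
+         credential of the CREATE (or OPEN) call. */
+      static void derive_ownership(struct createattrs *a,
+                                   const struct cred *c)
+      {
+          if (a->owner == NULL)
+              a->owner = c->principal;
+          if (a->owner_group == NULL)
+              a->owner_group = c->group_principal;
+      }
+
+      int main(void)
+      {
+          struct createattrs attrs = { NULL, NULL };  /* client set neither */
+          struct cred who = { "alice@example.net", "staff@example.net" };
+
+          derive_ownership(&attrs, &who);
+          printf("owner=%s owner_group=%s\n", attrs.owner, attrs.owner_group);
+          return 0;
+      }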
+
+   Conversely, it is possible the client will specify in createattrs an
+   owner attribute or group attribute or ACL that the principal
+   indicated in the RPC call's credentials does not have permission to
+   create files for.  The error to be returned in this instance is
+   NFS4ERR_PERM.  This applies to the OPEN operation too.
+
+   IMPLEMENTATION
+
+   If the client desires to set attribute values after the create, a
+   SETATTR operation can be added to the COMPOUND request so that the
+   appropriate attributes will be set.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ATTRNOTSUPP
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADNAME
+      NFS4ERR_BADOWNER
+      NFS4ERR_BADTYPE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXIST
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NOTDIR
+      NFS4ERR_PERM
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+
+
+Shepler, et al.             Standards Track                   [Page 149]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+14.2.5. Operation 7: DELEGPURGE - Purge Delegations Awaiting Recovery
+
+   SYNOPSIS
+
+      clientid ->
+
+   ARGUMENT
+
+      struct DELEGPURGE4args {
+              clientid4       clientid;
+      };
+
+   RESULT
+
+      struct DELEGPURGE4res {
+              nfsstat4        status;
+      };
+
+   DESCRIPTION
+
+   Purges all of the delegations awaiting recovery for a given client.
+   This is useful for clients which do not commit delegation information
+   to stable storage to indicate that conflicting requests need not be
+   delayed by the server awaiting recovery of delegation information.
+
+   This operation should be used by clients that record delegation
+   information on stable storage on the client.  In this case,
+   DELEGPURGE should be issued immediately after doing delegation
+   recovery on all delegations known to the client.  Doing so will
+   notify the server that no additional delegations for the client will
+   be recovered allowing it to free resources, and avoid delaying other
+   clients who make requests that conflict with the unrecovered
+   delegations.  The set of delegations known to the server and the
+   client may be different.  The reason for this is that a client may
+   fail after making a request which resulted in delegation but before
+   it received the results and committed them to the client's stable
+   storage.
+
+   The server MAY support DELEGPURGE, but if it does not, it MUST NOT
+   support CLAIM_DELEGATE_PREV.
+
+   ERRORS
+
+      NFS4ERR_BADXDR
+      NFS4ERR_NOTSUPP
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_MOVED
+      NFS4ERR_RESOURCE
+
+
+
+Shepler, et al.             Standards Track                   [Page 150]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE_CLIENTID
+
+14.2.6. Operation 8: DELEGRETURN - Return Delegation
+
+   SYNOPSIS
+
+      (cfh), stateid ->
+
+   ARGUMENT
+
+      struct DELEGRETURN4args {
+              /* CURRENT_FH: delegated file */
+              stateid4        stateid;
+      };
+
+   RESULT
+
+      struct DELEGRETURN4res {
+              nfsstat4        status;
+      };
+
+   DESCRIPTION
+
+   Returns the delegation represented by the current filehandle and
+   stateid.
+
+   Delegations may be returned when recalled or voluntarily (i.e.,
+   before the server has recalled them).  In either case the client must
+   properly propagate state changed under the context of the delegation
+   to the server before returning the delegation.
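+
+   As a non-normative illustration of that requirement, the following
+   self-contained C sketch shows the order of events for returning a
+   delegation: locally changed data and attributes are propagated to
+   the server first, and only then is DELEGRETURN sent.  The structures
+   and the send_* functions are hypothetical stand-ins for issuing the
+   corresponding NFS operations, not a real client implementation.
+
+      #include <stdio.h>
+      #include <stdbool.h>
+
+      /* Hypothetical per-file delegation bookkeeping on the client. */
+      struct delegated_file {
+          const char *name;
+          bool dirty_data;    /* cached writes not yet on the server */
+          bool dirty_attrs;   /* locally changed attributes */
+      };
+
+      /* Stand-ins for issuing the corresponding NFS operations. */
+      static void send_write_and_commit(struct delegated_file *f)
+      { printf("WRITE+COMMIT for %s\n", f->name); f->dirty_data = false; }
+
+      static void send_setattr(struct delegated_file *f)
+      { printf("SETATTR for %s\n", f->name); f->dirty_attrs = false; }
+
+      static void send_delegreturn(struct delegated_file *f)
+      { printf("PUTFH+DELEGRETURN for %s\n", f->name); }
+
+      /* Propagate all state changed under the delegation, then return
+         it.  The same sequence applies whether the return is voluntary
+         or in response to a recall. */
+      static void return_delegation(struct delegated_file *f)
+      {
+          if (f->dirty_data)
+              send_write_and_commit(f);
+          if (f->dirty_attrs)
+              send_setattr(f);
+          send_delegreturn(f);
+      }
+
+      int main(void)
+      {
+          struct delegated_file f = { "projects/report", true, true };
+          return_delegation(&f);
+          return 0;
+      }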
+
+   ERRORS
+
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_EXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOTSUPP
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+
+
+Shepler, et al.             Standards Track                   [Page 151]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+14.2.7. Operation 9: GETATTR - Get Attributes
+
+   SYNOPSIS
+
+      (cfh), attrbits -> attrbits, attrvals
+
+   ARGUMENT
+
+      struct GETATTR4args {
+              /* CURRENT_FH: directory or file */
+              bitmap4         attr_request;
+      };
+
+   RESULT
+
+      struct GETATTR4resok {
+              fattr4          obj_attributes;
+      };
+
+      union GETATTR4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              GETATTR4resok   resok4;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   The GETATTR operation will obtain attributes for the filesystem
+   object specified by the current filehandle.  The client sets a bit in
+   the bitmap argument for each attribute value that it would like the
+   server to return.  The server returns an attribute bitmap that
+   indicates the attribute values that it was able to return, followed
+   by the attribute values ordered lowest attribute number first.
+
+   The server must return a value for each attribute that the client
+   requests if the attribute is supported by the server.  If the server
+   does not support an attribute or cannot approximate a useful value
+   then it must not return the attribute value and must not set the
+   attribute bit in the result bitmap.  The server must return an error
+   if it supports an attribute but cannot obtain its value.  In that
+   case no attribute values will be returned.
+
+   All servers must support the mandatory attributes as specified in the
+   section "File Attributes".
+
+   On success, the current filehandle retains its value.
+
+
+
+Shepler, et al.             Standards Track                   [Page 152]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   IMPLEMENTATION
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_INVAL
+      NFS4ERR_IO
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.8. Operation 10: GETFH - Get Current Filehandle
+
+   SYNOPSIS
+
+      (cfh) -> filehandle
+
+   ARGUMENT
+
+      /* CURRENT_FH: */
+      void;
+
+   RESULT
+
+      struct GETFH4resok {
+              nfs_fh4         object;
+      };
+
+      union GETFH4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              GETFH4resok     resok4;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   This operation returns the current filehandle value.
+
+   On success, the current filehandle retains its value.
+
+
+
+Shepler, et al.             Standards Track                   [Page 153]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   IMPLEMENTATION
+
+   Operations that change the current filehandle like LOOKUP or CREATE
+   do not automatically return the new filehandle as a result.  For
+   instance, if a client needs to lookup a directory entry and obtain
+   its filehandle then the following request is needed.
+
+      PUTFH  (directory filehandle)
+      LOOKUP (entry name)
+      GETFH
+
+   ERRORS
+
+      NFS4ERR_BADHANDLE
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+
+14.2.9.
Operation 11: LINK - Create Link to a File + + SYNOPSIS + + (sfh), (cfh), newname -> (cfh), change_info + + ARGUMENT + + struct LINK4args { + /* SAVED_FH: source object */ + /* CURRENT_FH: target directory */ + component4 newname; + }; + + RESULT + + struct LINK4resok { + change_info4 cinfo; + }; + + union LINK4res switch (nfsstat4 status) { + case NFS4_OK: + LINK4resok resok4; + default: + void; + }; + + + + +Shepler, et al. Standards Track [Page 154] + +RFC 3530 NFS version 4 Protocol April 2003 + + + DESCRIPTION + + The LINK operation creates an additional newname for the file + represented by the saved filehandle, as set by the SAVEFH operation, + in the directory represented by the current filehandle. The existing + file and the target directory must reside within the same filesystem + on the server. On success, the current filehandle will continue to + be the target directory. If an object exists in the target directory + with the same name as newname, the server must return NFS4ERR_EXIST. + + For the target directory, the server returns change_info4 information + in cinfo. With the atomic field of the change_info4 struct, the + server will indicate if the before and after change attributes were + obtained atomically with respect to the link creation. + + If the newname has a length of 0 (zero), or if newname does not obey + the UTF-8 definition, the error NFS4ERR_INVAL will be returned. + + IMPLEMENTATION + + Changes to any property of the "hard" linked files are reflected in + all of the linked files. When a link is made to a file, the + attributes for the file should have a value for numlinks that is one + greater than the value before the LINK operation. + + The statement "file and the target directory must reside within the + same filesystem on the server" means that the fsid fields in the + attributes for the objects are the same. If they reside on different + filesystems, the error, NFS4ERR_XDEV, is returned. On some servers, + the filenames, "." and "..", are illegal as newname. + + In the case that newname is already linked to the file represented by + the saved filehandle, the server will return NFS4ERR_EXIST. + + Note that symbolic links are created with the CREATE operation. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADNAME + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_DQUOT + NFS4ERR_EXIST + NFS4ERR_FHEXPIRED + NFS4ERR_FILE_OPEN + + + +Shepler, et al. Standards Track [Page 155] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_ISDIR + NFS4ERR_MLINK + NFS4ERR_MOVED + NFS4ERR_NAMETOOLONG + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOSPC + NFS4ERR_NOTDIR + NFS4ERR_NOTSUPP + NFS4ERR_RESOURCE + NFS4ERR_ROFS + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_WRONGSEC + NFS4ERR_XDEV + +14.2.10. Operation 12: LOCK - Create Lock + + SYNOPSIS + + (cfh) locktype, reclaim, offset, length, locker -> stateid + + ARGUMENT + + struct open_to_lock_owner4 { + seqid4 open_seqid; + stateid4 open_stateid; + seqid4 lock_seqid; + lock_owner4 lock_owner; + }; + + struct exist_lock_owner4 { + stateid4 lock_stateid; + seqid4 lock_seqid; + }; + + union locker4 switch (bool new_lock_owner) { + case TRUE: + open_to_lock_owner4 open_owner; + case FALSE: + exist_lock_owner4 lock_owner; + }; + + enum nfs_lock_type4 { + READ_LT = 1, + WRITE_LT = 2, + + + +Shepler, et al. 
Standards Track                   [Page 156]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+              READW_LT        = 3,    /* blocking read */
+              WRITEW_LT       = 4     /* blocking write */
+      };
+
+      struct LOCK4args {
+              /* CURRENT_FH: file */
+              nfs_lock_type4  locktype;
+              bool            reclaim;
+              offset4         offset;
+              length4         length;
+              locker4         locker;
+      };
+
+   RESULT
+
+      struct LOCK4denied {
+              offset4         offset;
+              length4         length;
+              nfs_lock_type4  locktype;
+              lock_owner4     owner;
+      };
+
+      struct LOCK4resok {
+              stateid4        lock_stateid;
+      };
+
+      union LOCK4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              LOCK4resok     resok4;
+       case NFS4ERR_DENIED:
+              LOCK4denied    denied;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   The LOCK operation requests a record lock for the byte range
+   specified by the offset and length parameters.  The lock type is also
+   specified to be one of the nfs_lock_type4s.  If this is a reclaim
+   request, the reclaim parameter will be TRUE.
+
+   Bytes in a file may be locked even if those bytes are not currently
+   allocated to the file.  To lock the file from a specific offset
+   through the end-of-file (no matter how long the file actually is) use
+   a length field with all bits set to 1 (one).  If the length is zero,
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 157]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   or if a length which is not all bits set to one is specified, and
+   length when added to the offset exceeds the maximum 64-bit unsigned
+   integer value, the error NFS4ERR_INVAL will result.
+
+   Some servers may only support locking for byte offsets that fit
+   within 32 bits.  If the client specifies a range that includes a byte
+   beyond the last byte offset of the 32-bit range, but does not include
+   the last byte offset of the 32-bit and all of the byte offsets beyond
+   it, up to the end of the valid 64-bit range, such a 32-bit server
+   MUST return the error NFS4ERR_BAD_RANGE.
+
+   In the case that the lock is denied, the owner, offset, and length of
+   a conflicting lock are returned.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   If the server is unable to determine the exact offset and length of
+   the conflicting lock, the same offset and length that were provided
+   in the arguments should be returned in the denied results.  The File
+   Locking section contains a full description of this and the other
+   file locking operations.
+
+   LOCK operations are subject to permission checks and to checks
+   against the access type of the associated file.  However, the
+   specific rights and modes required for various types of locks reflect
+   the semantics of the server-exported filesystem, and are not
+   specified by the protocol.  For example, Windows 2000 allows a write
+   lock of a file open for READ, while a POSIX-compliant system does
+   not.
+
+   When the client makes a lock request that corresponds to a range that
+   the lockowner has locked already (with the same or different lock
+   type), or to a sub-region of such a range, or to a region which
+   includes multiple locks already granted to that lockowner, in whole
+   or in part, and the server does not support such locking operations
+   (i.e., does not support POSIX locking semantics), the server will
+   return the error NFS4ERR_LOCK_RANGE.  In that case, the client may
+   return an error, or it may emulate the required operations, using
+   only LOCK for ranges that do not include any bytes already locked by
+   that lock_owner and LOCKU of locks held by that lock_owner
+   (specifying an exactly-matching range and type).
Similarly, when the + client makes a lock request that amounts to upgrading (changing from + a read lock to a write lock) or downgrading (changing from write lock + to a read lock) an existing record lock, and the server does not + + + + + +Shepler, et al. Standards Track [Page 158] + +RFC 3530 NFS version 4 Protocol April 2003 + + + support such a lock, the server will return NFS4ERR_LOCK_NOTSUPP. + Such operations may not perfectly reflect the required semantics in + the face of conflicting lock requests from other clients. + + The locker argument specifies the lock_owner that is associated with + the LOCK request. The locker4 structure is a switched union that + indicates whether the lock_owner is known to the server or if the + lock_owner is new to the server. In the case that the lock_owner is + known to the server and has an established lock_seqid, the argument + is just the lock_owner and lock_seqid. In the case that the + lock_owner is not known to the server, the argument contains not only + the lock_owner and lock_seqid but also the open_stateid and + open_seqid. The new lock_owner case covers the very first lock done + by the lock_owner and offers a method to use the established state of + the open_stateid to transition to the use of the lock_owner. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_ADMIN_REVOKED + NFS4ERR_BADHANDLE + NFS4ERR_BAD_RANGE + NFS4ERR_BAD_SEQID + NFS4ERR_BAD_STATEID + NFS4ERR_BADXDR + NFS4ERR_DEADLOCK + NFS4ERR_DELAY + NFS4ERR_DENIED + NFS4ERR_EXPIRED + NFS4ERR_FHEXPIRED + NFS4ERR_GRACE + NFS4ERR_INVAL + NFS4ERR_ISDIR + NFS4ERR_LEASE_MOVED + NFS4ERR_LOCK_NOTSUPP + NFS4ERR_LOCK_RANGE + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_NO_GRACE + NFS4ERR_OLD_STATEID + NFS4ERR_OPENMODE + NFS4ERR_RECLAIM_BAD + NFS4ERR_RECLAIM_CONFLICT + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_STALE_CLIENTID + NFS4ERR_STALE_STATEID + + + +Shepler, et al. Standards Track [Page 159] + +RFC 3530 NFS version 4 Protocol April 2003 + + +14.2.11. Operation 13: LOCKT - Test For Lock + + SYNOPSIS + + (cfh) locktype, offset, length owner -> {void, NFS4ERR_DENIED -> + owner} + + ARGUMENT + + struct LOCKT4args { + /* CURRENT_FH: file */ + nfs_lock_type4 locktype; + offset4 offset; + length4 length; + lock_owner4 owner; + }; + + RESULT + + struct LOCK4denied { + offset4 offset; + length4 length; + nfs_lock_type4 locktype; + lock_owner4 owner; + }; + + union LOCKT4res switch (nfsstat4 status) { + case NFS4ERR_DENIED: + LOCK4denied denied; + case NFS4_OK: + void; + default: + void; + }; + + DESCRIPTION + + The LOCKT operation tests the lock as specified in the arguments. If + a conflicting lock exists, the owner, offset, length, and type of the + conflicting lock are returned; if no lock is held, nothing other than + NFS4_OK is returned. Lock types READ_LT and READW_LT are processed + in the same way in that a conflicting lock test is done without + regard to blocking or non-blocking. The same is true for WRITE_LT + and WRITEW_LT. + + The ranges are specified as for LOCK. The NFS4ERR_INVAL and + NFS4ERR_BAD_RANGE errors are returned under the same circumstances as + for LOCK. + + + +Shepler, et al. Standards Track [Page 160] + +RFC 3530 NFS version 4 Protocol April 2003 + + + On success, the current filehandle retains its value. + + IMPLEMENTATION + + If the server is unable to determine the exact offset and length of + the conflicting lock, the same offset and length that were provided + in the arguments should be returned in the denied results. 
The File
+   Locking section contains further discussion of the file locking
+   mechanisms.
+
+   LOCKT uses a lock_owner4 rather than a stateid4, as is used in LOCK,
+   to identify the owner.  This is because the client does not have to
+   open the file to test for the existence of a lock, so a stateid may
+   not be available.
+
+   The test for conflicting locks should exclude locks for the current
+   lockowner.  Note that since such locks are not examined, the possible
+   existence of overlapping ranges may not affect the results of LOCKT.
+   If the server does examine locks that match the lockowner for the
+   purpose of range checking, NFS4ERR_LOCK_RANGE may be returned.  In
+   the event that it returns NFS4_OK, clients may do a LOCK and receive
+   NFS4ERR_LOCK_RANGE on the LOCK request because of the flexibility
+   provided to the server.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_RANGE
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DENIED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCK_RANGE
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_CLIENTID
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 161]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+14.2.12. Operation 14: LOCKU - Unlock File
+
+   SYNOPSIS
+
+      (cfh) type, seqid, stateid, offset, length -> stateid
+
+   ARGUMENT
+
+      struct LOCKU4args {
+              /* CURRENT_FH: file */
+              nfs_lock_type4  locktype;
+              seqid4          seqid;
+              stateid4        stateid;
+              offset4         offset;
+              length4         length;
+      };
+
+   RESULT
+
+      union LOCKU4res switch (nfsstat4 status) {
+       case NFS4_OK:
+              stateid4        stateid;
+       default:
+              void;
+      };
+
+   DESCRIPTION
+
+   The LOCKU operation unlocks the record lock specified by the
+   parameters.  The client may set the locktype field to any value that
+   is legal for the nfs_lock_type4 enumerated type, and the server MUST
+   accept any legal value for locktype.  Any legal value for locktype
+   has no effect on the success or failure of the LOCKU operation.
+
+   The ranges are specified as for LOCK.  The NFS4ERR_INVAL and
+   NFS4ERR_BAD_RANGE errors are returned under the same circumstances as
+   for LOCK.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   If the area to be unlocked does not correspond exactly to a lock
+   actually held by the lockowner the server may return the error
+   NFS4ERR_LOCK_RANGE.  This includes the case in which the area is not
+   locked, where the area is a sub-range of the area locked, where it
+   overlaps the area locked without matching exactly or the area
+   specified includes multiple locks held by the lockowner.  In all of
+
+
+
+Shepler, et al.             Standards Track                   [Page 162]
+
+RFC 3530                 NFS version 4 Protocol                April 2003
+
+
+   these cases, allowed by POSIX locking semantics, a client receiving
+   this error should, if it desires support for such operations,
+   simulate the operation using LOCKU on ranges corresponding to locks
+   it actually holds, possibly followed by LOCK requests for the sub-
+   ranges not being unlocked.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_RANGE
+      NFS4ERR_BAD_SEQID
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCK_RANGE
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+14.2.13.
Operation 15: LOOKUP - Lookup Filename + + SYNOPSIS + + (cfh), component -> (cfh) + + ARGUMENT + + struct LOOKUP4args { + /* CURRENT_FH: directory */ + component4 objname; + }; + + + + + + + + + +Shepler, et al. Standards Track [Page 163] + +RFC 3530 NFS version 4 Protocol April 2003 + + + RESULT + + struct LOOKUP4res { + /* CURRENT_FH: object */ + nfsstat4 status; + }; + + DESCRIPTION + + This operation LOOKUPs or finds a filesystem object using the + directory specified by the current filehandle. LOOKUP evaluates the + component and if the object exists the current filehandle is replaced + with the component's filehandle. + + If the component cannot be evaluated either because it does not exist + or because the client does not have permission to evaluate the + component, then an error will be returned and the current filehandle + will be unchanged. + + If the component is a zero length string or if any component does not + obey the UTF-8 definition, the error NFS4ERR_INVAL will be returned. + + IMPLEMENTATION + + If the client wants to achieve the effect of a multi-component + lookup, it may construct a COMPOUND request such as (and obtain each + filehandle): + + PUTFH (directory filehandle) + LOOKUP "pub" + GETFH + LOOKUP "foo" + GETFH + LOOKUP "bar" + GETFH + + NFS version 4 servers depart from the semantics of previous NFS + versions in allowing LOOKUP requests to cross mountpoints on the + server. The client can detect a mountpoint crossing by comparing the + fsid attribute of the directory with the fsid attribute of the + directory looked up. If the fsids are different then the new + directory is a server mountpoint. UNIX clients that detect a + mountpoint crossing will need to mount the server's filesystem. This + needs to be done to maintain the file object identity checking + mechanisms common to UNIX clients. + + + + + + +Shepler, et al. Standards Track [Page 164] + +RFC 3530 NFS version 4 Protocol April 2003 + + + Servers that limit NFS access to "shares" or "exported" filesystems + should provide a pseudo-filesystem into which the exported + filesystems can be integrated, so that clients can browse the + server's name space. The clients' view of a pseudo filesystem will + be limited to paths that lead to exported filesystems. + + Note: previous versions of the protocol assigned special semantics to + the names "." and "..". NFS version 4 assigns no special semantics + to these names. The LOOKUPP operator must be used to lookup a parent + directory. + + Note that this operation does not follow symbolic links. The client + is responsible for all parsing of filenames including filenames that + are modified by symbolic links encountered during the lookup process. + + If the current filehandle supplied is not a directory but a symbolic + link, the error NFS4ERR_SYMLINK is returned as the error. For all + other non-directory file types, the error NFS4ERR_NOTDIR is returned. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADNAME + NFS4ERR_BADXDR + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NAMETOOLONG + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOTDIR + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_SYMLINK + NFS4ERR_WRONGSEC + +14.2.14. Operation 16: LOOKUPP - Lookup Parent Directory + + SYNOPSIS + + (cfh) -> (cfh) + + + + + + +Shepler, et al. 
Standards Track [Page 165] + +RFC 3530 NFS version 4 Protocol April 2003 + + + ARGUMENT + + /* CURRENT_FH: object */ + void; + + RESULT + + struct LOOKUPP4res { + /* CURRENT_FH: directory */ + nfsstat4 status; + }; + + DESCRIPTION + + The current filehandle is assumed to refer to a regular directory + or a named attribute directory. LOOKUPP assigns the filehandle for + its parent directory to be the current filehandle. If there is no + parent directory an NFS4ERR_NOENT error must be returned. + Therefore, NFS4ERR_NOENT will be returned by the server when the + current filehandle is at the root or top of the server's file tree. + + IMPLEMENTATION + + As for LOOKUP, LOOKUPP will also cross mountpoints. + + If the current filehandle is not a directory or named attribute + directory, the error NFS4ERR_NOTDIR is returned. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADHANDLE + NFS4ERR_FHEXPIRED + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOTDIR + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.15. Operation 17: NVERIFY - Verify Difference in Attributes + + SYNOPSIS + + (cfh), fattr -> - + + + + +Shepler, et al. Standards Track [Page 166] + +RFC 3530 NFS version 4 Protocol April 2003 + + + ARGUMENT + + struct NVERIFY4args { + /* CURRENT_FH: object */ + fattr4 obj_attributes; + }; + + RESULT + + struct NVERIFY4res { + nfsstat4 status; + }; + + DESCRIPTION + + This operation is used to prefix a sequence of operations to be + performed if one or more attributes have changed on some filesystem + object. If all the attributes match then the error NFS4ERR_SAME must + be returned. + + On success, the current filehandle retains its value. + + IMPLEMENTATION + + This operation is useful as a cache validation operator. If the + object to which the attributes belong has changed then the following + operations may obtain new data associated with that object. For + instance, to check if a file has been changed and obtain new data if + it has: + + PUTFH (public) + LOOKUP "foobar" + NVERIFY attrbits attrs + READ 0 32767 + + In the case that a recommended attribute is specified in the NVERIFY + operation and the server does not support that attribute for the + filesystem object, the error NFS4ERR_ATTRNOTSUPP is returned to the + client. + + When the attribute rdattr_error or any write-only attribute (e.g., + time_modify_set) is specified, the error NFS4ERR_INVAL is returned to + the client. + + + + + + + + +Shepler, et al. Standards Track [Page 167] + +RFC 3530 NFS version 4 Protocol April 2003 + + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_ATTRNOTSUPP + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_RESOURCE + NFS4ERR_SAME + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.16. Operation 18: OPEN - Open a Regular File + + SYNOPSIS + + (cfh), seqid, share_access, share_deny, owner, openhow, claim -> + (cfh), stateid, cinfo, rflags, open_confirm, attrset delegation + + ARGUMENT + + struct OPEN4args { + seqid4 seqid; + uint32_t share_access; + uint32_t share_deny; + open_owner4 owner; + openflag4 openhow; + open_claim4 claim; + }; + + enum createmode4 { + UNCHECKED4 = 0, + GUARDED4 = 1, + EXCLUSIVE4 = 2 + }; + + union createhow4 switch (createmode4 mode) { + case UNCHECKED4: + case GUARDED4: + fattr4 createattrs; + case EXCLUSIVE4: + verifier4 createverf; + + + +Shepler, et al. 
Standards Track [Page 168] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + enum opentype4 { + OPEN4_NOCREATE = 0, + OPEN4_CREATE = 1 + }; + + union openflag4 switch (opentype4 opentype) { + case OPEN4_CREATE: + createhow4 how; + default: + void; + }; + + /* Next definitions used for OPEN delegation */ + enum limit_by4 { + NFS_LIMIT_SIZE = 1, + NFS_LIMIT_BLOCKS = 2 + /* others as needed */ + }; + + struct nfs_modified_limit4 { + uint32_t num_blocks; + uint32_t bytes_per_block; + }; + + union nfs_space_limit4 switch (limit_by4 limitby) { + /* limit specified as file size */ + case NFS_LIMIT_SIZE: + uint64_t filesize; + /* limit specified by number of blocks */ + case NFS_LIMIT_BLOCKS: + nfs_modified_limit4 mod_blocks; + } ; + + enum open_delegation_type4 { + OPEN_DELEGATE_NONE = 0, + OPEN_DELEGATE_READ = 1, + OPEN_DELEGATE_WRITE = 2 + }; + + enum open_claim_type4 { + CLAIM_NULL = 0, + CLAIM_PREVIOUS = 1, + CLAIM_DELEGATE_CUR = 2, + CLAIM_DELEGATE_PREV = 3 + }; + + + + +Shepler, et al. Standards Track [Page 169] + +RFC 3530 NFS version 4 Protocol April 2003 + + + struct open_claim_delegate_cur4 { + stateid4 delegate_stateid; + component4 file; + }; + + union open_claim4 switch (open_claim_type4 claim) { + /* + * No special rights to file. Ordinary OPEN of the specified file. + */ + case CLAIM_NULL: + /* CURRENT_FH: directory */ + component4 file; + + /* + * Right to the file established by an open previous to server + * reboot. File identified by filehandle obtained at that time + * rather than by name. + */ + case CLAIM_PREVIOUS: + /* CURRENT_FH: file being reclaimed */ + open_delegation_type4 delegate_type; + + /* + * Right to file based on a delegation granted by the server. + * File is specified by name. + */ + case CLAIM_DELEGATE_CUR: + /* CURRENT_FH: directory */ + open_claim_delegate_cur4 delegate_cur_info; + + /* Right to file based on a delegation granted to a previous boot + * instance of the client. File is specified by name. + */ + case CLAIM_DELEGATE_PREV: + /* CURRENT_FH: directory */ + component4 file_delegate_prev; + }; + + RESULT + + struct open_read_delegation4 { + stateid4 stateid; /* Stateid for delegation*/ + bool recall; /* Pre-recalled flag for + delegations obtained + by reclaim + (CLAIM_PREVIOUS) */ + nfsace4 permissions; /* Defines users who don't + need an ACCESS call to + + + +Shepler, et al. Standards Track [Page 170] + +RFC 3530 NFS version 4 Protocol April 2003 + + + open for read */ + }; + + struct open_write_delegation4 { + stateid4 stateid; /* Stateid for delegation*/ + bool recall; /* Pre-recalled flag for + delegations obtained + by reclaim + (CLAIM_PREVIOUS) */ + nfs_space_limit4 space_limit; /* Defines condition that + the client must check to + determine whether the + file needs to be flushed + to the server on close. + */ + nfsace4 permissions; /* Defines users who don't + need an ACCESS call as + part of a delegated + open. 
+                                        */
+        };
+
+        union open_delegation4
+        switch (open_delegation_type4 delegation_type) {
+                case OPEN_DELEGATE_NONE:
+                        void;
+                case OPEN_DELEGATE_READ:
+                        open_read_delegation4 read;
+                case OPEN_DELEGATE_WRITE:
+                        open_write_delegation4 write;
+        };
+
+        const OPEN4_RESULT_CONFIRM      = 0x00000002;
+        const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004;
+
+        struct OPEN4resok {
+                stateid4       stateid;      /* Stateid for open */
+                change_info4   cinfo;        /* Directory Change Info */
+                uint32_t       rflags;       /* Result flags */
+                bitmap4        attrset;      /* attributes on create */
+                open_delegation4 delegation; /* Info on any open
+                                                delegation */
+        };
+
+        union OPEN4res switch (nfsstat4 status) {
+         case NFS4_OK:
+                /* CURRENT_FH: opened file */
+                OPEN4resok      resok4;
+         default:
+
+
+
+Shepler, et al.             Standards Track                  [Page 171]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+                void;
+        };
+
+   WARNING TO CLIENT IMPLEMENTORS
+
+   OPEN resembles LOOKUP in that it generates a filehandle for the
+   client to use.  Unlike LOOKUP though, OPEN creates server state on
+   the filehandle.  In normal circumstances, the client can only
+   release this state with a CLOSE operation.  CLOSE uses the current
+   filehandle to determine which file to close.  Therefore the client
+   MUST follow every OPEN operation with a GETFH operation in the same
+   COMPOUND procedure.  This will supply the client with the
+   filehandle such that CLOSE can be used appropriately.
+
+   Simply waiting for the lease on the file to expire is insufficient
+   because the server may maintain the state indefinitely as long as
+   another client does not attempt to make a conflicting access to the
+   same file.
+
+   DESCRIPTION
+
+   The OPEN operation creates and/or opens a regular file in a
+   directory with the provided name.  If the file does not exist at
+   the server and creation is desired, specification of the method of
+   creation is provided by the openhow parameter.  The client has the
+   choice of three creation methods: UNCHECKED, GUARDED, or EXCLUSIVE.
+
+   If the current filehandle is a named attribute directory, OPEN will
+   then create or open a named attribute file.  Note that exclusive
+   create of a named attribute is not supported.  If the createmode is
+   EXCLUSIVE4 and the current filehandle is a named attribute
+   directory, the server will return NFS4ERR_INVAL.
+
+   UNCHECKED means that the file should be created if a file of that
+   name does not exist and encountering an existing regular file of
+   that name is not an error.  For this type of create, createattrs
+   specifies the initial set of attributes for the file.  The set of
+   attributes may include any writable attribute valid for regular
+   files.  When an UNCHECKED create encounters an existing file, the
+   attributes specified by createattrs are not used, except that when
+   a size of zero is specified, the existing file is truncated.  If
+   GUARDED is specified, the server checks for the presence of a
+   duplicate object by name before performing the create.  If a
+   duplicate exists, an error of NFS4ERR_EXIST is returned as the
+   status.  If the object does not exist, the request is performed as
+   described for UNCHECKED.  For
+
+
+
+Shepler, et al.             Standards Track                  [Page 172]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   each of these cases (UNCHECKED and GUARDED) where the operation is
+   successful, the server will return to the client an attribute mask
+   signifying which attributes were successfully set for the object.
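+
+   As an illustration (the directory filehandle, the name "foo", and
+   the attributes are hypothetical), a client creating a file with the
+   GUARDED method, and following the warning above by retrieving the
+   new filehandle in the same COMPOUND, might issue:
+
+      PUTFH (directory filehandle)
+      OPEN "foo" (opentype = OPEN4_CREATE, mode = GUARDED4,
+                  createattrs)
+      GETFH
+
+   On success, the attrset in the OPEN results reports which of
+   createattrs were actually set, and the GETFH supplies the
+   filehandle that a later CLOSE will need.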
+
+   EXCLUSIVE specifies that the server is to follow exclusive creation
+   semantics, using the verifier to ensure exclusive creation of the
+   target.  The server should check for the presence of a duplicate
+   object by name.  If the object does not exist, the server creates
+   the object and stores the verifier with the object.  If the object
+   does exist and the stored verifier matches the client provided
+   verifier, the server uses the existing object as the newly created
+   object.  If the stored verifier does not match, then an error of
+   NFS4ERR_EXIST is returned.  No attributes may be provided in this
+   case, since the server may use an attribute of the target object to
+   store the verifier.  If the server uses an attribute to store the
+   exclusive create verifier, it will signify which attribute by
+   setting the appropriate bit in the attribute mask that is returned
+   in the results.
+
+   For the target directory, the server returns change_info4
+   information in cinfo.  With the atomic field of the change_info4
+   struct, the server will indicate if the before and after change
+   attributes were obtained atomically with respect to the file
+   creation.
+
+   Upon successful creation, the current filehandle is replaced by
+   that of the new object.
+
+   The OPEN operation provides for Windows share reservation
+   capability with the use of the share_access and share_deny fields
+   of the OPEN arguments.  The client specifies at OPEN the required
+   share_access and share_deny modes.  For clients that do not
+   directly support SHAREs (i.e., UNIX), the expected deny value is
+   DENY_NONE.  In the case that there is an existing SHARE reservation
+   that conflicts with the OPEN request, the server returns the error
+   NFS4ERR_SHARE_DENIED.  For a complete SHARE request, the client
+   must provide values for the owner and seqid fields for the OPEN
+   argument.  For additional discussion of SHARE semantics see the
+   section on 'Share Reservations'.
+
+   In the case that the client is recovering state from a server
+   failure, the claim field of the OPEN argument is used to signify
+   that the request is meant to reclaim state previously held.
+
+   The "claim" field of the OPEN argument is used to specify the file
+   to be opened and the state information which the client claims to
+   possess.  There are four basic claim types which cover the various
+   situations for an OPEN.  They are as follows:
+
+
+
+Shepler, et al.             Standards Track                  [Page 173]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      CLAIM_NULL
+                     For the client, this is a new OPEN
+                     request and there is no previous state
+                     associated with the file for the client.
+
+      CLAIM_PREVIOUS
+                     The client is claiming basic OPEN state
+                     for a file that was held previous to a
+                     server reboot.  Generally used when a
+                     server is returning persistent
+                     filehandles; the client may not have the
+                     file name to reclaim the OPEN.
+
+      CLAIM_DELEGATE_CUR
+                     The client is claiming a delegation for
+                     OPEN as granted by the server.
+                     Generally this is done as part of
+                     recalling a delegation.
+
+      CLAIM_DELEGATE_PREV
+                     The client is claiming a delegation
+                     granted to a previous client instance;
+                     used after the client reboots.  The
+                     server MAY support CLAIM_DELEGATE_PREV.
+                     If it does support CLAIM_DELEGATE_PREV,
+                     SETCLIENTID_CONFIRM MUST NOT remove the
+                     client's delegation state, and the
+                     server MUST support the DELEGPURGE
+                     operation.
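+
+   As an illustration of the claim types above (the filehandles and
+   the name "foo" are hypothetical), an ordinary first open of a file
+   by name and a reclaim after a server reboot might be issued as:
+
+      Ordinary open (CLAIM_NULL):
+
+         PUTFH (directory filehandle)
+         OPEN "foo" (claim = CLAIM_NULL)
+         GETFH
+
+      Reclaim after server reboot (CLAIM_PREVIOUS):
+
+         PUTFH (file filehandle obtained before the reboot)
+         OPEN (claim = CLAIM_PREVIOUS, delegate_type =
+               OPEN_DELEGATE_NONE)
+
+   Note the difference in how the file is identified: by name relative
+   to the current filehandle for CLAIM_NULL, and by the filehandle
+   itself for CLAIM_PREVIOUS.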
+
+   For OPEN requests whose claim type is other than CLAIM_PREVIOUS
+   (i.e., requests other than those devoted to reclaiming opens after
+   a server reboot) that reach the server during its grace or lease
+   expiration period, the server returns an error of NFS4ERR_GRACE.
+
+   For any OPEN request, the server may return an open delegation,
+   which allows further opens and closes to be handled locally on the
+   client as described in the section "Open Delegation".  Note that
+   delegation is up to the server to decide.  The client should never
+   assume that delegation will or will not be granted in a particular
+   instance.  It should always be prepared for either case.  A partial
+   exception is the reclaim (CLAIM_PREVIOUS) case, in which a
+   delegation type is claimed.  In this case, delegation will always
+   be granted, although the server may specify an immediate recall in
+   the delegation structure.
+
+   The rflags returned by a successful OPEN allow the server to return
+   information governing how the open file is to be handled.
+
+
+
+Shepler, et al.             Standards Track                  [Page 174]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   OPEN4_RESULT_CONFIRM indicates that the client MUST execute an
+   OPEN_CONFIRM operation before using the open file.
+   OPEN4_RESULT_LOCKTYPE_POSIX indicates the server's file locking
+   behavior supports the complete set of POSIX locking techniques.
+   From this, the client can choose how to manage its file locking
+   state in order to handle any mismatch in file locking management.
+
+   If the component is of zero length, NFS4ERR_INVAL will be returned.
+   The component is also subject to the normal UTF-8, character
+   support, and name checks.  See the section "UTF-8 Related Errors"
+   for further discussion.
+
+   When an OPEN is done and the specified lockowner already has the
+   resulting filehandle open, the result is to "OR" together the new
+   share and deny status with the existing status.  In this case, only
+   a single CLOSE need be done, even though multiple OPENs were
+   completed.  When such an OPEN is done, checking of share
+   reservations for the new OPEN proceeds normally, with no exception
+   for the existing OPEN held by the same lockowner.
+
+   If the underlying filesystem at the server is only accessible in a
+   read-only mode and the OPEN request has specified ACCESS_WRITE or
+   ACCESS_BOTH, the server will return NFS4ERR_ROFS to indicate a
+   read-only filesystem.
+
+   As with the CREATE operation, the server MUST derive the owner,
+   owner ACE, group, or group ACE if any of the four attributes are
+   required and supported by the server's filesystem.  For an OPEN
+   with the EXCLUSIVE4 createmode, the server has no choice, since
+   such OPEN calls do not include the createattrs field.  Conversely,
+   if createattrs is specified, and includes owner or group (or
+   corresponding ACEs) that the principal in the RPC call's
+   credentials does not have authorization to create files for, then
+   the server may return NFS4ERR_PERM.
+
+   In the case of an OPEN which specifies a size of zero (e.g.,
+   truncation) and the file has named attributes, the named attributes
+   are left as is.  They are not removed.
+
+   IMPLEMENTATION
+
+   The OPEN operation contains support for EXCLUSIVE create.  The
+   mechanism is similar to the support in NFS version 3 [RFC1813].  As
+   in NFS version 3, this mechanism provides reliable exclusive
+   creation.  Exclusive create is invoked when the how parameter is
+   EXCLUSIVE.  In this case, the client provides a verifier that can
+   reasonably be expected to be unique.
A combination of a client
+
+
+
+Shepler, et al.             Standards Track                  [Page 175]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   identifier, perhaps the client network address, and a unique number
+   generated by the client, perhaps the RPC transaction identifier,
+   may be appropriate.
+
+   If the object does not exist, the server creates the object and
+   stores the verifier in stable storage.  For filesystems that do not
+   provide a mechanism for the storage of arbitrary file attributes,
+   the server may use one or more elements of the object meta-data to
+   store the verifier.  The verifier must be stored in stable storage
+   to prevent erroneous failure on retransmission of the request.  It
+   is assumed that an exclusive create is being performed because
+   exclusive semantics are critical to the application.  Because of
+   the expected usage, exclusive CREATE does not rely solely on the
+   normally volatile duplicate request cache for storage of the
+   verifier.  The duplicate request cache in volatile storage does not
+   survive a crash and may actually flush on a long network partition,
+   opening failure windows.  In the UNIX local filesystem environment,
+   the expected storage location for the verifier on creation is the
+   meta-data (time stamps) of the object.  For this reason, an
+   exclusive object create may not include initial attributes because
+   the server would have nowhere to store the verifier.
+
+   If the server cannot support these exclusive create semantics,
+   possibly because of the requirement to commit the verifier to
+   stable storage, it should fail the OPEN request with the error,
+   NFS4ERR_NOTSUPP.
+
+   During an exclusive CREATE request, if the object already exists,
+   the server reconstructs the object's verifier and compares it with
+   the verifier in the request.  If they match, the server treats the
+   request as a success.  The request is presumed to be a duplicate of
+   an earlier, successful request for which the reply was lost and
+   that the server duplicate request cache mechanism did not detect.
+   If the verifiers do not match, the request is rejected with the
+   status, NFS4ERR_EXIST.
+
+   Once the client has performed a successful exclusive create, it
+   must issue a SETATTR to set the correct object attributes.  Until
+   it does so, it should not rely upon any of the object attributes,
+   since the server implementation may need to overload object
+   meta-data to store the verifier.  The subsequent SETATTR must not
+   occur in the same COMPOUND request as the OPEN.  This separation
+   will guarantee that the exclusive create mechanism will continue to
+   function properly in the face of retransmission of the request.
+
+   Use of the GUARDED attribute does not provide exactly-once
+   semantics.  In particular, if a reply is lost and the server does
+   not detect the retransmission of the request, the operation can
+   fail with
+
+
+
+Shepler, et al.             Standards Track                  [Page 176]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   NFS4ERR_EXIST, even though the create was performed successfully.
+   The client would use this behavior in the case that the application
+   has not requested an exclusive create but has asked to have the
+   file truncated when the file is opened.  In the case of the client
+   timing out and retransmitting the create request, the client can
+   use GUARDED to prevent a sequence like create, write, create
+   (retransmitted) from occurring.
+
+   For SHARE reservations, the client must specify a value for
+   share_access that is one of READ, WRITE, or BOTH.
For share_deny,
+   the client must specify one of NONE, READ, WRITE, or BOTH.  If the
+   client fails to do this, the server must return NFS4ERR_INVAL.
+
+   Based on the share_access value (READ, WRITE, or BOTH) the server
+   should check that the requester has the proper access rights to
+   perform the specified operation.  This would generally be the
+   result of applying the ACL access rules to the file for the current
+   requester.  However, just as with the ACCESS operation, the client
+   should not attempt to second-guess the server's decisions, as
+   access rights may change and may be subject to server
+   administrative controls outside the ACL framework.  If the
+   requester is not authorized to READ or WRITE (depending on the
+   share_access value), the server must return NFS4ERR_ACCESS.  Note
+   that since the NFS version 4 protocol does not impose any
+   requirement that READs and WRITEs issued for an open file have the
+   same credentials as the OPEN itself, the server still must do
+   appropriate access checking on the READs and WRITEs themselves.
+
+   If the component provided to OPEN is a symbolic link, the error
+   NFS4ERR_SYMLINK will be returned to the client.  If the current
+   filehandle is not a directory, the error NFS4ERR_NOTDIR will be
+   returned.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_ATTRNOTSUPP
+      NFS4ERR_BADCHAR
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BADNAME
+      NFS4ERR_BADOWNER
+      NFS4ERR_BAD_SEQID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_DQUOT
+      NFS4ERR_EXIST
+      NFS4ERR_EXPIRED
+
+
+
+Shepler, et al.             Standards Track                  [Page 177]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_IO
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_MOVED
+      NFS4ERR_NAMETOOLONG
+      NFS4ERR_NOENT
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NOSPC
+      NFS4ERR_NOTDIR
+      NFS4ERR_NOTSUPP
+      NFS4ERR_NO_GRACE
+      NFS4ERR_PERM
+      NFS4ERR_RECLAIM_BAD
+      NFS4ERR_RECLAIM_CONFLICT
+      NFS4ERR_RESOURCE
+      NFS4ERR_ROFS
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_SHARE_DENIED
+      NFS4ERR_STALE
+      NFS4ERR_STALE_CLIENTID
+      NFS4ERR_SYMLINK
+      NFS4ERR_WRONGSEC
+
+14.2.17. Operation 19: OPENATTR - Open Named Attribute Directory
+
+   SYNOPSIS
+
+     (cfh) createdir -> (cfh)
+
+   ARGUMENT
+
+     struct OPENATTR4args {
+             /* CURRENT_FH: object */
+             bool    createdir;
+     };
+
+   RESULT
+
+     struct OPENATTR4res {
+             /* CURRENT_FH: named attr directory */
+             nfsstat4        status;
+     };
+
+
+
+
+
+Shepler, et al.             Standards Track                  [Page 178]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   The OPENATTR operation is used to obtain the filehandle of the
+   named attribute directory associated with the current filehandle.
+   The result of the OPENATTR will be a filehandle to an object of
+   type NF4ATTRDIR.  From this filehandle, READDIR and LOOKUP
+   operations can be used to obtain filehandles for the various named
+   attributes associated with the original filesystem object.
+   Filehandles returned within the named attribute directory will have
+   a type of NF4NAMEDATTR.
+
+   The createdir argument allows the client to signify if a named
+   attribute directory should be created as a result of the OPENATTR
+   operation.  Some clients may use the OPENATTR operation with a
+   value of FALSE for createdir to determine if any named attributes
+   exist for the object.  If none exist, then NFS4ERR_NOENT will be
+   returned.  If createdir has a value of TRUE and no named attribute
+   directory exists, one is created.
The creation of a named attribute directory + assumes that the server has implemented named attribute support in + this fashion and is not required to do so by this definition. + + IMPLEMENTATION + + If the server does not support named attributes for the current + filehandle, an error of NFS4ERR_NOTSUPP will be returned to the + client. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADHANDLE + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_DQUOT + NFS4ERR_FHEXPIRED + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOSPC + NFS4ERR_NOTSUPP + NFS4ERR_RESOURCE + NFS4ERR_ROFS + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + + + + + + +Shepler, et al. Standards Track [Page 179] + +RFC 3530 NFS version 4 Protocol April 2003 + + +14.2.18. Operation 20: OPEN_CONFIRM - Confirm Open + + SYNOPSIS + + (cfh), seqid, stateid-> stateid + + ARGUMENT + + struct OPEN_CONFIRM4args { + /* CURRENT_FH: opened file */ + stateid4 open_stateid; + seqid4 seqid; + }; + + RESULT + + struct OPEN_CONFIRM4resok { + stateid4 open_stateid; + }; + + union OPEN_CONFIRM4res switch (nfsstat4 status) { + case NFS4_OK: + OPEN_CONFIRM4resok resok4; + default: + void; + }; + + DESCRIPTION + + This operation is used to confirm the sequence id usage for the first + time that a open_owner is used by a client. The stateid returned + from the OPEN operation is used as the argument for this operation + along with the next sequence id for the open_owner. The sequence id + passed to the OPEN_CONFIRM must be 1 (one) greater than the seqid + passed to the OPEN operation from which the open_confirm value was + obtained. If the server receives an unexpected sequence id with + respect to the original open, then the server assumes that the client + will not confirm the original OPEN and all state associated with the + original OPEN is released by the server. + + On success, the current filehandle retains its value. + + IMPLEMENTATION + + A given client might generate many open_owner4 data structures for a + given clientid. The client will periodically either dispose of its + open_owner4s or stop using them for indefinite periods of time. The + latter situation is why the NFS version 4 protocol does not have an + + + +Shepler, et al. Standards Track [Page 180] + +RFC 3530 NFS version 4 Protocol April 2003 + + + explicit operation to exit an open_owner4: such an operation is of no + use in that situation. Instead, to avoid unbounded memory use, the + server needs to implement a strategy for disposing of open_owner4s + that have no current lock, open, or delegation state for any files + and have not been used recently. The time period used to determine + when to dispose of open_owner4s is an implementation choice. The + time period should certainly be no less than the lease time plus any + grace period the server wishes to implement beyond a lease time. The + OPEN_CONFIRM operation allows the server to safely dispose of unused + open_owner4 data structures. + + In the case that a client issues an OPEN operation and the server no + longer has a record of the open_owner4, the server needs to ensure + that this is a new OPEN and not a replay or retransmission. + + Servers must not require confirmation on OPENs that grant delegations + or are doing reclaim operations. See section "Use of Open + Confirmation" for details. The server can easily avoid this by + noting whether it has disposed of one open_owner4 for the given + clientid. 
If the server does not support delegation, it might simply + maintain a single bit that notes whether any open_owner4 (for any + client) has been disposed of. + + The server must hold unconfirmed OPEN state until one of three events + occur. First, the client sends an OPEN_CONFIRM request with the + appropriate sequence id and stateid within the lease period. In this + case, the OPEN state on the server goes to confirmed, and the + open_owner4 on the server is fully established. + + Second, the client sends another OPEN request with a sequence id that + is incorrect for the open_owner4 (out of sequence). In this case, + the server assumes the second OPEN request is valid and the first one + is a replay. The server cancels the OPEN state of the first OPEN + request, establishes an unconfirmed OPEN state for the second OPEN + request, and responds to the second OPEN request with an indication + that an OPEN_CONFIRM is needed. The process then repeats itself. + While there is a potential for a denial of service attack on the + client, it is mitigated if the client and server require the use of a + security flavor based on Kerberos V5, LIPKEY, or some other flavor + that uses cryptography. + + What if the server is in the unconfirmed OPEN state for a given + open_owner4, and it receives an operation on the open_owner4 that has + a stateid but the operation is not OPEN, or it is OPEN_CONFIRM but + with the wrong stateid? Then, even if the seqid is correct, the + + + + + + +Shepler, et al. Standards Track [Page 181] + +RFC 3530 NFS version 4 Protocol April 2003 + + + server returns NFS4ERR_BAD_STATEID, because the server assumes the + operation is a replay: if the server has no established OPEN state, + then there is no way, for example, a LOCK operation could be valid. + + Third, neither of the two aforementioned events occur for the + open_owner4 within the lease period. In this case, the OPEN state is + canceled and disposal of the open_owner4 can occur. + + ERRORS + + NFS4ERR_ADMIN_REVOKED + NFS4ERR_BADHANDLE + NFS4ERR_BAD_SEQID + NFS4ERR_BAD_STATEID + NFS4ERR_BADXDR + NFS4ERR_EXPIRED + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_ISDIR + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_OLD_STATEID + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_STALE_STATEID + +14.2.19. Operation 21: OPEN_DOWNGRADE - Reduce Open File Access + + SYNOPSIS + + (cfh), stateid, seqid, access, deny -> stateid + + ARGUMENT + + struct OPEN_DOWNGRADE4args { + /* CURRENT_FH: opened file */ + stateid4 open_stateid; + seqid4 seqid; + uint32_t share_access; + uint32_t share_deny; + }; + + RESULT + + struct OPEN_DOWNGRADE4resok { + stateid4 open_stateid; + }; + + + +Shepler, et al. Standards Track [Page 182] + +RFC 3530 NFS version 4 Protocol April 2003 + + + union OPEN_DOWNGRADE4res switch(nfsstat4 status) { + case NFS4_OK: + OPEN_DOWNGRADE4resok resok4; + default: + void; + }; + + DESCRIPTION + + This operation is used to adjust the share_access and share_deny bits + for a given open. This is necessary when a given openowner opens the + same file multiple times with different share_access and share_deny + flags. In this situation, a close of one of the opens may change the + appropriate share_access and share_deny flags to remove bits + associated with opens no longer in effect. + + The share_access and share_deny bits specified in this operation + replace the current ones for the specified open file. 
The + share_access and share_deny bits specified must be exactly equal to + the union of the share_access and share_deny bits specified for some + subset of the OPENs in effect for current openowner on the current + file. If that constraint is not respected, the error NFS4ERR_INVAL + should be returned. Since share_access and share_deny bits are + subsets of those already granted, it is not possible for this request + to be denied because of conflicting share reservations. + + On success, the current filehandle retains its value. + + ERRORS + + NFS4ERR_ADMIN_REVOKED + NFS4ERR_BADHANDLE + NFS4ERR_BAD_SEQID + NFS4ERR_BAD_STATEID + NFS4ERR_BADXDR + NFS4ERR_EXPIRED + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_OLD_STATEID + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_STALE_STATEID + + + + + + +Shepler, et al. Standards Track [Page 183] + +RFC 3530 NFS version 4 Protocol April 2003 + + +14.2.20. Operation 22: PUTFH - Set Current Filehandle + + SYNOPSIS + + filehandle -> (cfh) + + ARGUMENT + + struct PUTFH4args { + nfs_fh4 object; + }; + + RESULT + + struct PUTFH4res { + /* CURRENT_FH: */ + nfsstat4 status; + }; + + DESCRIPTION + + Replaces the current filehandle with the filehandle provided as an + argument. + + If the security mechanism used by the requester does not meet the + requirements of the filehandle provided to this operation, the server + MUST return NFS4ERR_WRONGSEC. + + IMPLEMENTATION + + Commonly used as the first operator in an NFS request to set the + context for following operations. + + ERRORS + + NFS4ERR_BADHANDLE + NFS4ERR_BADXDR + NFS4ERR_FHEXPIRED + NFS4ERR_MOVED + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_WRONGSEC + + + + + + + + +Shepler, et al. Standards Track [Page 184] + +RFC 3530 NFS version 4 Protocol April 2003 + + +14.2.21. Operation 23: PUTPUBFH - Set Public Filehandle + + SYNOPSIS + + - -> (cfh) + + ARGUMENT + + void; + + RESULT + + struct PUTPUBFH4res { + /* CURRENT_FH: public fh */ + nfsstat4 status; + }; + + DESCRIPTION + + Replaces the current filehandle with the filehandle that represents + the public filehandle of the server's name space. This filehandle + may be different from the "root" filehandle which may be associated + with some other directory on the server. + + The public filehandle represents the concepts embodied in [RFC2054], + [RFC2055], [RFC2224]. The intent for NFS version 4 is that the + public filehandle (represented by the PUTPUBFH operation) be used as + a method of providing WebNFS server compatibility with NFS versions 2 + and 3. + + The public filehandle and the root filehandle (represented by the + PUTROOTFH operation) should be equivalent. If the public and root + filehandles are not equivalent, then the public filehandle MUST be a + descendant of the root filehandle. + + IMPLEMENTATION + + Used as the first operator in an NFS request to set the context for + following operations. + + With the NFS version 2 and 3 public filehandle, the client is able to + specify whether the path name provided in the LOOKUP should be + evaluated as either an absolute path relative to the server's root or + relative to the public filehandle. [RFC2224] contains further + discussion of the functionality. With NFS version 4, that type of + specification is not directly available in the LOOKUP operation. The + reason for this is because the component separators needed to specify + absolute vs. relative are not allowed in NFS version 4. Therefore, + + + +Shepler, et al. 
Standards Track                  [Page 185]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   the client is responsible for constructing its request such that
+   either PUTROOTFH or PUTPUBFH is used to signify absolute or
+   relative evaluation of an NFS URL, respectively.
+
+   Note that there are warnings mentioned in [RFC2224] with respect to
+   the use of absolute evaluation and the restrictions the server may
+   place on that evaluation with respect to how much of its namespace
+   has been made available.  These same warnings apply to NFS version
+   4.  It is likely, therefore, that because of server implementation
+   details, an NFS version 3 absolute public filehandle lookup may
+   behave differently than an NFS version 4 absolute resolution.
+
+   There is a form of security negotiation as described in [RFC2755]
+   that uses the public filehandle as a method of employing SNEGO.
+   This method is not available with NFS version 4 as filehandles are
+   not overloaded with special meaning and therefore do not provide
+   the same framework as NFS versions 2 and 3.  Clients should
+   therefore use the security negotiation mechanisms described in this
+   RFC.
+
+   ERRORS
+
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_WRONGSEC
+
+14.2.22. Operation 24: PUTROOTFH - Set Root Filehandle
+
+   SYNOPSIS
+
+     - -> (cfh)
+
+   ARGUMENT
+
+     void;
+
+   RESULT
+
+     struct PUTROOTFH4res {
+             /* CURRENT_FH: root fh */
+             nfsstat4        status;
+     };
+
+
+
+
+
+Shepler, et al.             Standards Track                  [Page 186]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   Replaces the current filehandle with the filehandle that represents
+   the root of the server's name space.  From this filehandle a LOOKUP
+   operation can locate any other filehandle on the server.  This
+   filehandle may be different from the "public" filehandle which may
+   be associated with some other directory on the server.
+
+   IMPLEMENTATION
+
+   Commonly used as the first operator in an NFS request to set the
+   context for following operations.
+
+   ERRORS
+
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_WRONGSEC
+
+14.2.23. Operation 25: READ - Read from File
+
+   SYNOPSIS
+
+     (cfh), stateid, offset, count -> eof, data
+
+   ARGUMENT
+
+     struct READ4args {
+             /* CURRENT_FH: file */
+             stateid4        stateid;
+             offset4         offset;
+             count4          count;
+     };
+
+   RESULT
+
+     struct READ4resok {
+             bool            eof;
+             opaque          data<>;
+     };
+
+     union READ4res switch (nfsstat4 status) {
+      case NFS4_OK:
+             READ4resok     resok4;
+      default:
+             void;
+     };
+
+
+
+Shepler, et al.             Standards Track                  [Page 187]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+   The READ operation reads data from the regular file identified by
+   the current filehandle.
+
+   The client provides an offset of where the READ is to start and a
+   count of how many bytes are to be read.  An offset of 0 (zero)
+   means to read data starting at the beginning of the file.  If
+   offset is greater than or equal to the size of the file, the
+   status, NFS4_OK, is returned with a data length set to 0 (zero) and
+   eof is set to TRUE.  The READ is subject to access permissions
+   checking.
+
+   If the client specifies a count value of 0 (zero), the READ
+   succeeds and returns 0 (zero) bytes of data, again subject to
+   access permissions checking.  The server may choose to return fewer
+   bytes than specified by the client.  The client needs to check for
+   this condition and handle the condition appropriately.
+
+   The stateid value for a READ request represents a value returned
+   from a previous record lock or share reservation request.  The
+   stateid is used by the server to verify that the associated share
+   reservation and any record locks are still valid and to update
+   lease timeouts for the client.
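+
+   As an illustration (the filehandle, offset, and count are
+   hypothetical), a client reading the first 4096 bytes of an open
+   file might issue:
+
+      PUTFH (file filehandle)
+      READ stateid, offset=0, count=4096
+
+   If the server returns fewer than 4096 bytes with eof set to FALSE,
+   the client would issue a further READ with the offset advanced by
+   the number of bytes already returned, as described under
+   IMPLEMENTATION below.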
+
+   If the read ended at the end-of-file (formally, in a correctly
+   formed READ request, if offset + count is equal to the size of the
+   file), or the read request extends beyond the size of the file (if
+   offset + count is greater than the size of the file), eof is
+   returned as TRUE; otherwise it is FALSE.  A successful READ of an
+   empty file will always return eof as TRUE.
+
+   If the current filehandle is not a regular file, an error will be
+   returned to the client.  In the case that the current filehandle
+   represents a directory, NFS4ERR_ISDIR is returned; otherwise,
+   NFS4ERR_INVAL is returned.
+
+   For a READ with a stateid value of all bits 0, the server MAY allow
+   the READ to be serviced subject to mandatory file locks or the
+   current share deny modes for the file.  For a READ with a stateid
+   value of all bits 1, the server MAY allow READ operations to bypass
+   locking checks at the server.
+
+   On success, the current filehandle retains its value.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                  [Page 188]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   IMPLEMENTATION
+
+   It is possible for the server to return fewer than count bytes of
+   data.  If the server returns less than the count requested and eof
+   is set to FALSE, the client should issue another READ to get the
+   remaining data.  A server may return less data than requested under
+   several circumstances.  The file may have been truncated by another
+   client or perhaps on the server itself, changing the file size from
+   what the requesting client believes to be the case.  This would
+   reduce the actual amount of data available to the client.  It is
+   possible that the server may back off the transfer size and reduce
+   the read request return.  Server resource exhaustion may also
+   occur, necessitating a smaller read return.
+
+   If mandatory file locking is on for the file, and if the region
+   corresponding to the data to be read from the file is write locked
+   by an owner not associated with the stateid, the server will return
+   the NFS4ERR_LOCKED error.  The client should try to get the
+   appropriate read record lock via the LOCK operation before
+   re-attempting the READ.  When the READ completes, the client should
+   release the record lock via LOCKU.
+
+   ERRORS
+
+      NFS4ERR_ACCESS
+      NFS4ERR_ADMIN_REVOKED
+      NFS4ERR_BADHANDLE
+      NFS4ERR_BAD_STATEID
+      NFS4ERR_BADXDR
+      NFS4ERR_DELAY
+      NFS4ERR_EXPIRED
+      NFS4ERR_FHEXPIRED
+      NFS4ERR_GRACE
+      NFS4ERR_IO
+      NFS4ERR_INVAL
+      NFS4ERR_ISDIR
+      NFS4ERR_LEASE_MOVED
+      NFS4ERR_LOCKED
+      NFS4ERR_MOVED
+      NFS4ERR_NOFILEHANDLE
+      NFS4ERR_NXIO
+      NFS4ERR_OLD_STATEID
+      NFS4ERR_OPENMODE
+      NFS4ERR_RESOURCE
+      NFS4ERR_SERVERFAULT
+      NFS4ERR_STALE
+      NFS4ERR_STALE_STATEID
+
+
+
+Shepler, et al.             Standards Track                  [Page 189]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+14.2.24.
Operation 26: READDIR - Read Directory + + SYNOPSIS + (cfh), cookie, cookieverf, dircount, maxcount, attr_request -> + cookieverf { cookie, name, attrs } + + ARGUMENT + + struct READDIR4args { + /* CURRENT_FH: directory */ + nfs_cookie4 cookie; + verifier4 cookieverf; + count4 dircount; + count4 maxcount; + bitmap4 attr_request; + }; + + RESULT + + struct entry4 { + nfs_cookie4 cookie; + component4 name; + fattr4 attrs; + entry4 *nextentry; + }; + + struct dirlist4 { + entry4 *entries; + bool eof; + }; + + struct READDIR4resok { + verifier4 cookieverf; + dirlist4 reply; + }; + + + union READDIR4res switch (nfsstat4 status) { + case NFS4_OK: + READDIR4resok resok4; + default: + void; + }; + + + + + + + + +Shepler, et al. Standards Track [Page 190] + +RFC 3530 NFS version 4 Protocol April 2003 + + + DESCRIPTION + + The READDIR operation retrieves a variable number of entries from a + filesystem directory and returns client requested attributes for each + entry along with information to allow the client to request + additional directory entries in a subsequent READDIR. + + The arguments contain a cookie value that represents where the + READDIR should start within the directory. A value of 0 (zero) for + the cookie is used to start reading at the beginning of the + directory. For subsequent READDIR requests, the client specifies a + cookie value that is provided by the server on a previous READDIR + request. + + The cookieverf value should be set to 0 (zero) when the cookie value + is 0 (zero) (first directory read). On subsequent requests, it + should be a cookieverf as returned by the server. The cookieverf + must match that returned by the READDIR in which the cookie was + acquired. If the server determines that the cookieverf is no longer + valid for the directory, the error NFS4ERR_NOT_SAME must be returned. + + The dircount portion of the argument is a hint of the maximum number + of bytes of directory information that should be returned. This + value represents the length of the names of the directory entries and + the cookie value for these entries. This length represents the XDR + encoding of the data (names and cookies) and not the length in the + native format of the server. + + The maxcount value of the argument is the maximum number of bytes for + the result. This maximum size represents all of the data being + returned within the READDIR4resok structure and includes the XDR + overhead. The server may return less data. If the server is unable + to return a single directory entry within the maxcount limit, the + error NFS4ERR_TOOSMALL will be returned to the client. + + Finally, attr_request represents the list of attributes to be + returned for each directory entry supplied by the server. + + On successful return, the server's response will provide a list of + directory entries. Each of these entries contains the name of the + directory entry, a cookie value for that entry, and the associated + attributes as requested. The "eof" flag has a value of TRUE if there + are no more entries in the directory. + + The cookie value is only meaningful to the server and is used as a + "bookmark" for the directory entry. As mentioned, this cookie is + used by the client for subsequent READDIR operations so that it may + continue reading a directory. The cookie is similar in concept to a + + + +Shepler, et al. Standards Track [Page 191] + +RFC 3530 NFS version 4 Protocol April 2003 + + + READ offset but should not be interpreted as such by the client. 
+   Ideally, the cookie value should not change if the directory is
+   modified, since the client may be caching these values.
+
+   In some cases, the server may encounter an error while obtaining
+   the attributes for a directory entry.  Instead of returning an
+   error for the entire READDIR operation, the server can return the
+   attribute 'fattr4_rdattr_error'.  With this, the server is able to
+   communicate the failure to the client and not fail the entire
+   operation in the instance of what might be a transient failure.
+   Obviously, the client must request the fattr4_rdattr_error
+   attribute for this method to work properly.  If the client does not
+   request the attribute, the server has no choice but to return
+   failure for the entire READDIR operation.
+
+   For some filesystem environments, the directory entries "." and
+   ".." have special meaning and in other environments, they may not.
+   If the server supports these special entries within a directory,
+   they should not be returned to the client as part of the READDIR
+   response.  To enable some client environments, the cookie values of
+   0, 1, and 2 are to be considered reserved.  Note that the UNIX
+   client will use these values when combining the server's response
+   and local representations to enable a fully formed UNIX directory
+   presentation to the application.
+
+   For READDIR arguments, cookie values of 1 and 2 should not be used
+   and for READDIR results cookie values of 0, 1, and 2 should not be
+   returned.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+   The server's filesystem directory representations can differ
+   greatly.  A client's programming interfaces may also be bound to
+   the local operating environment in a way that does not translate
+   well into the NFS protocol.  Therefore the dircount and maxcount
+   fields are provided to allow the client to provide guidelines to
+   the server.  If the client is aggressive about attribute collection
+   during a READDIR, the server has an idea of how to limit the
+   encoded response.  The dircount field provides a hint on the number
+   of entries based solely on the names of the directory entries.
+   Since it is a hint, it may be possible that a dircount value is
+   zero.  In this case, the server is free to ignore the dircount
+   value and return directory information based on the specified
+   maxcount value.
+
+
+
+
+
+Shepler, et al.             Standards Track                  [Page 192]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The cookieverf may be used by the server to help manage cookie
+   values that may become stale.  It should be a rare occurrence that
+   a server is unable to continue properly reading a directory with
+   the provided cookie/cookieverf pair.  The server should make every
+   effort to avoid this condition since the application at the client
+   may not be able to properly handle this type of failure.
+
+   The use of the cookieverf will also protect the client from using
+   READDIR cookie values that may be stale.  For example, if the file
+   system has been migrated, the server may or may not be able to use
+   the same cookie values to service READDIR as the previous server
+   used.  With the client providing the cookieverf, the server is able
+   to provide the appropriate response to the client.  This prevents
+   the case where the server may accept a cookie value but the
+   underlying directory has changed and the response is invalid from
+   the client's context of its previous READDIR.
+
+   Since some servers will not be returning "." and ".."
entries as has + been done with previous versions of the NFS protocol, the client that + requires these entries be present in READDIR responses must fabricate + them. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADHANDLE + NFS4ERR_BAD_COOKIE + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOTDIR + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_TOOSMALL + +14.2.25. Operation 27: READLINK - Read Symbolic Link + + SYNOPSIS + + (cfh) -> linktext + + + + + + +Shepler, et al. Standards Track [Page 193] + +RFC 3530 NFS version 4 Protocol April 2003 + + + ARGUMENT + + /* CURRENT_FH: symlink */ + void; + + RESULT + + struct READLINK4resok { + linktext4 link; + }; + + union READLINK4res switch (nfsstat4 status) { + case NFS4_OK: + READLINK4resok resok4; + default: + void; + }; + + DESCRIPTION + + READLINK reads the data associated with a symbolic link. The data is + a UTF-8 string that is opaque to the server. That is, whether + created by an NFS client or created locally on the server, the data + in a symbolic link is not interpreted when created, but is simply + stored. + + On success, the current filehandle retains its value. + + IMPLEMENTATION + + A symbolic link is nominally a pointer to another file. The data is + not necessarily interpreted by the server, just stored in the file. + It is possible for a client implementation to store a path name that + is not meaningful to the server operating system in a symbolic link. + A READLINK operation returns the data to the client for + interpretation. If different implementations want to share access to + symbolic links, then they must agree on the interpretation of the + data in the symbolic link. + + The READLINK operation is only allowed on objects of type NF4LNK. + The server should return the error, NFS4ERR_INVAL, if the object is + not of type, NF4LNK. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADHANDLE + NFS4ERR_DELAY + + + +Shepler, et al. Standards Track [Page 194] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_ISDIR + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOTSUPP + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.26. Operation 28: REMOVE - Remove Filesystem Object + + SYNOPSIS + + (cfh), filename -> change_info + + ARGUMENT + + struct REMOVE4args { + /* CURRENT_FH: directory */ + component4 target; + }; + + RESULT + + struct REMOVE4resok { + change_info4 cinfo; + } + + union REMOVE4res switch (nfsstat4 status) { + case NFS4_OK: + REMOVE4resok resok4; + default: + void; + } + + DESCRIPTION + + The REMOVE operation removes (deletes) a directory entry named by + filename from the directory corresponding to the current filehandle. + If the entry in the directory was the last reference to the + corresponding filesystem object, the object may be destroyed. + + + + + + + + +Shepler, et al. Standards Track [Page 195] + +RFC 3530 NFS version 4 Protocol April 2003 + + + For the directory where the filename was removed, the server returns + change_info4 information in cinfo. With the atomic field of the + change_info4 struct, the server will indicate if the before and after + change attributes were obtained atomically with respect to the + removal. + + If the target has a length of 0 (zero), or if target does not obey + the UTF-8 definition, the error NFS4ERR_INVAL will be returned. + + On success, the current filehandle retains its value. 
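+
+   As an illustration (the directory filehandle and the name "foo" are
+   hypothetical), removing a directory entry might be done with:
+
+      PUTFH (directory filehandle)
+      REMOVE "foo"
+
+   The change_info4 returned in cinfo then tells the client whether
+   the directory's before and after change attributes were obtained
+   atomically with respect to the removal, and thus whether its cached
+   view of the directory can be updated or must be refetched.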
+ + IMPLEMENTATION + + NFS versions 2 and 3 required a different operator RMDIR for + directory removal and REMOVE for non-directory removal. This allowed + clients to skip checking the file type when being passed a non- + directory delete system call (e.g., unlink() in POSIX) to remove a + directory, as well as the converse (e.g., a rmdir() on a non- + directory) because they knew the server would check the file type. + NFS version 4 REMOVE can be used to delete any directory entry + independent of its file type. The implementor of an NFS version 4 + client's entry points from the unlink() and rmdir() system calls + should first check the file type against the types the system call is + allowed to remove before issuing a REMOVE. Alternatively, the + implementor can produce a COMPOUND call that includes a LOOKUP/VERIFY + sequence to verify the file type before a REMOVE operation in the + same COMPOUND call. + + The concept of last reference is server specific. However, if the + numlinks field in the previous attributes of the object had the value + 1, the client should not rely on referring to the object via a + filehandle. Likewise, the client should not rely on the resources + (disk space, directory entry, and so on) formerly associated with the + object becoming immediately available. Thus, if a client needs to be + able to continue to access a file after using REMOVE to remove it, + the client should take steps to make sure that the file will still be + accessible. The usual mechanism used is to RENAME the file from its + old name to a new hidden name. + + If the server finds that the file is still open when the REMOVE + arrives: + + o The server SHOULD NOT delete the file's directory entry if the + file was opened with OPEN4_SHARE_DENY_WRITE or + OPEN4_SHARE_DENY_BOTH. + + + + + + +Shepler, et al. Standards Track [Page 196] + +RFC 3530 NFS version 4 Protocol April 2003 + + + o If the file was not opened with OPEN4_SHARE_DENY_WRITE or + OPEN4_SHARE_DENY_BOTH, the server SHOULD delete the file's + directory entry. However, until last CLOSE of the file, the + server MAY continue to allow access to the file via its + filehandle. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADNAME + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_FHEXPIRED + NFS4ERR_FILE_OPEN + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NAMETOOLONG + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOTDIR + NFS4ERR_NOTEMPTY + NFS4ERR_RESOURCE + NFS4ERR_ROFS + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.27. Operation 29: RENAME - Rename Directory Entry + + SYNOPSIS + + (sfh), oldname, (cfh), newname -> source_change_info, + target_change_info + + ARGUMENT + + struct RENAME4args { + /* SAVED_FH: source directory */ + component4 oldname; + /* CURRENT_FH: target directory */ + component4 newname; + }; + + + + + + + +Shepler, et al. Standards Track [Page 197] + +RFC 3530 NFS version 4 Protocol April 2003 + + + RESULT + + struct RENAME4resok { + change_info4 source_cinfo; + change_info4 target_cinfo; + }; + + union RENAME4res switch (nfsstat4 status) { + case NFS4_OK: + RENAME4resok resok4; + default: + void; + }; + + DESCRIPTION + + The RENAME operation renames the object identified by oldname in the + source directory corresponding to the saved filehandle, as set by the + SAVEFH operation, to newname in the target directory corresponding to + the current filehandle. The operation is required to be atomic to + the client. 
Source and target directories must reside on the same + filesystem on the server. On success, the current filehandle will + continue to be the target directory. + + If the target directory already contains an entry with the name, + newname, the source object must be compatible with the target: + either both are non-directories or both are directories and the + target must be empty. If compatible, the existing target is removed + before the rename occurs (See the IMPLEMENTATION subsection of the + section "Operation 28: REMOVE - Remove Filesystem Object" for client + and server actions whenever a target is removed). If they are not + compatible or if the target is a directory but not empty, the server + will return the error, NFS4ERR_EXIST. + + If oldname and newname both refer to the same file (they might be + hard links of each other), then RENAME should perform no action and + return success. + + For both directories involved in the RENAME, the server returns + change_info4 information. With the atomic field of the change_info4 + struct, the server will indicate if the before and after change + attributes were obtained atomically with respect to the rename. + + If the oldname refers to a named attribute and the saved and current + filehandles refer to different filesystem objects, the server will + return NFS4ERR_XDEV just as if the saved and current filehandles + represented directories on different filesystems. + + + + +Shepler, et al. Standards Track [Page 198] + +RFC 3530 NFS version 4 Protocol April 2003 + + + If the oldname or newname has a length of 0 (zero), or if oldname or + newname does not obey the UTF-8 definition, the error NFS4ERR_INVAL + will be returned. + + IMPLEMENTATION + + The RENAME operation must be atomic to the client. The statement + "source and target directories must reside on the same filesystem on + the server" means that the fsid fields in the attributes for the + directories are the same. If they reside on different filesystems, + the error, NFS4ERR_XDEV, is returned. + + Based on the value of the fh_expire_type attribute for the object, + the filehandle may or may not expire on a RENAME. However, server + implementors are strongly encouraged to attempt to keep filehandles + from expiring in this fashion. + + On some servers, the file names "." and ".." are illegal as either + oldname or newname, and will result in the error NFS4ERR_BADNAME. In + addition, on many servers the case of oldname or newname being an + alias for the source directory will be checked for. Such servers + will return the error NFS4ERR_INVAL in these cases. + + If either of the source or target filehandles are not directories, + the server will return NFS4ERR_NOTDIR. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADNAME + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_DQUOT + NFS4ERR_EXIST + NFS4ERR_FHEXPIRED + NFS4ERR_FILE_OPEN + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_MOVED + NFS4ERR_NAMETOOLONG + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOSPC + NFS4ERR_NOTDIR + NFS4ERR_NOTEMPTY + NFS4ERR_RESOURCE + + + +Shepler, et al. Standards Track [Page 199] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_ROFS + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_WRONGSEC + NFS4ERR_XDEV + +14.2.28. 
Operation 30: RENEW - Renew a Lease
+
+   SYNOPSIS
+
+     clientid -> ()
+
+   ARGUMENT
+
+     struct RENEW4args {
+             clientid4       clientid;
+     };
+
+   RESULT
+
+     struct RENEW4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+     The RENEW operation is used by the client to renew leases which it
+     currently holds at a server. In processing the RENEW request, the
+     server renews all leases associated with the client. The
+     associated leases are determined by the clientid provided via the
+     SETCLIENTID operation.
+
+   IMPLEMENTATION
+
+     When the client holds delegations, it needs to use RENEW to detect
+     when the server has determined that the callback path is down.
+     When the server has made such a determination, only the RENEW
+     operation will renew the lease on delegations. If the server
+     determines the callback path is down, it returns
+     NFS4ERR_CB_PATH_DOWN. Even though it returns NFS4ERR_CB_PATH_DOWN,
+     the server MUST renew the lease on the record locks and share
+     reservations that the client has established on the server. If
+     for some reason the lock and share reservation lease cannot be
+     renewed, then the server MUST return an error other than
+     NFS4ERR_CB_PATH_DOWN, even if the callback path is also down.
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 200]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     The client that issues RENEW MUST choose the principal, RPC
+     security flavor, and if applicable, GSS-API mechanism and service
+     via one of the following algorithms:
+
+     o  The client uses the same principal, RPC security flavor -- and
+        if the flavor was RPCSEC_GSS -- the same mechanism and service
+        that was used when the client id was established via
+        SETCLIENTID_CONFIRM.
+
+     o  The client uses any principal, RPC security flavor, mechanism,
+        and service combination that currently has an OPEN file on the
+        server. I.e., the same principal had a successful OPEN
+        operation, the file is still open by that principal, and the
+        flavor, mechanism, and service of RENEW match that of the
+        previous OPEN.
+
+     The server MUST reject a RENEW that does not use one of the
+     aforementioned algorithms, with the error NFS4ERR_ACCESS.
+
+   ERRORS
+
+     NFS4ERR_ACCESS
+     NFS4ERR_ADMIN_REVOKED
+     NFS4ERR_BADXDR
+     NFS4ERR_CB_PATH_DOWN
+     NFS4ERR_EXPIRED
+     NFS4ERR_LEASE_MOVED
+     NFS4ERR_RESOURCE
+     NFS4ERR_SERVERFAULT
+     NFS4ERR_STALE_CLIENTID
+
+14.2.29. Operation 31: RESTOREFH - Restore Saved Filehandle
+
+   SYNOPSIS
+
+     (sfh) -> (cfh)
+
+   ARGUMENT
+
+     /* SAVED_FH: */
+     void;
+
+   RESULT
+
+     struct RESTOREFH4res {
+             /* CURRENT_FH: value of saved fh */
+             nfsstat4        status;
+     };
+
+
+
+Shepler, et al.             Standards Track                   [Page 201]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   DESCRIPTION
+
+     Set the current filehandle to the value in the saved filehandle.
+     If there is no saved filehandle then return the error
+     NFS4ERR_RESTOREFH.
+
+   IMPLEMENTATION
+
+     Operations like OPEN and LOOKUP use the current filehandle to
+     represent a directory and replace it with a new filehandle.
+     Assuming the previous filehandle was saved with a SAVEFH operator,
+     the previous filehandle can be restored as the current filehandle.
This + is commonly used to obtain post-operation attributes for the + directory, e.g., + + PUTFH (directory filehandle) + SAVEFH + GETATTR attrbits (pre-op dir attrs) + CREATE optbits "foo" attrs + GETATTR attrbits (file attributes) + RESTOREFH + GETATTR attrbits (post-op dir attrs) + + ERRORS + + NFS4ERR_BADHANDLE + NFS4ERR_FHEXPIRED + NFS4ERR_MOVED + NFS4ERR_RESOURCE + NFS4ERR_RESTOREFH + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_WRONGSEC + +14.2.30. Operation 32: SAVEFH - Save Current Filehandle + + SYNOPSIS + + (cfh) -> (sfh) + + ARGUMENT + + /* CURRENT_FH: */ + void; + + + + + + + + +Shepler, et al. Standards Track [Page 202] + +RFC 3530 NFS version 4 Protocol April 2003 + + + RESULT + + struct SAVEFH4res { + /* SAVED_FH: value of current fh */ + nfsstat4 status; + }; + + DESCRIPTION + + Save the current filehandle. If a previous filehandle was saved then + it is no longer accessible. The saved filehandle can be restored as + the current filehandle with the RESTOREFH operator. + + On success, the current filehandle retains its value. + + IMPLEMENTATION + + ERRORS + + NFS4ERR_BADHANDLE + NFS4ERR_FHEXPIRED + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.31. Operation 33: SECINFO - Obtain Available Security + + SYNOPSIS + + (cfh), name -> { secinfo } + + ARGUMENT + + struct SECINFO4args { + /* CURRENT_FH: directory */ + component4 name; + }; + + RESULT + + enum rpc_gss_svc_t {/* From RFC 2203 */ + RPC_GSS_SVC_NONE = 1, + RPC_GSS_SVC_INTEGRITY = 2, + RPC_GSS_SVC_PRIVACY = 3 + }; + + + + +Shepler, et al. Standards Track [Page 203] + +RFC 3530 NFS version 4 Protocol April 2003 + + + struct rpcsec_gss_info { + sec_oid4 oid; + qop4 qop; + rpc_gss_svc_t service; + }; + + union secinfo4 switch (uint32_t flavor) { + case RPCSEC_GSS: + rpcsec_gss_info flavor_info; + default: + void; + }; + + typedef secinfo4 SECINFO4resok<>; + + union SECINFO4res switch (nfsstat4 status) { + case NFS4_OK: + SECINFO4resok resok4; + default: + void; + }; + + DESCRIPTION + + The SECINFO operation is used by the client to obtain a list of valid + RPC authentication flavors for a specific directory filehandle, file + name pair. SECINFO should apply the same access methodology used for + LOOKUP when evaluating the name. Therefore, if the requester does + not have the appropriate access to LOOKUP the name then SECINFO must + behave the same way and return NFS4ERR_ACCESS. + + The result will contain an array which represents the security + mechanisms available, with an order corresponding to server's + preferences, the most preferred being first in the array. The client + is free to pick whatever security mechanism it both desires and + supports, or to pick in the server's preference order the first one + it supports. The array entries are represented by the secinfo4 + structure. The field 'flavor' will contain a value of AUTH_NONE, + AUTH_SYS (as defined in [RFC1831]), or RPCSEC_GSS (as defined in + [RFC2203]). + + For the flavors AUTH_NONE and AUTH_SYS, no additional security + information is returned. For a return value of RPCSEC_GSS, a + security triple is returned that contains the mechanism object id (as + defined in [RFC2743]), the quality of protection (as defined in + [RFC2743]) and the service type (as defined in [RFC2203]). It is + possible for SECINFO to return multiple entries with flavor equal to + RPCSEC_GSS with different security triple values. + + + +Shepler, et al. 
Standards Track                   [Page 204]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   On success, the current filehandle retains its value.
+
+   If the name has a length of 0 (zero), or if name does not obey the
+   UTF-8 definition, the error NFS4ERR_INVAL will be returned.
+
+   IMPLEMENTATION
+
+     The SECINFO operation is expected to be used by the NFS client
+     when the error value of NFS4ERR_WRONGSEC is returned from another
+     NFS operation. This signifies to the client that the server's
+     security policy is different from what the client is currently
+     using. At this point, the client is expected to obtain a list of
+     possible security flavors and choose what best suits its policies.
+
+     As mentioned, the server's security policies will determine when a
+     client request receives NFS4ERR_WRONGSEC. The operations which
+     may receive this error are: LINK, LOOKUP, OPEN, PUTFH, PUTPUBFH,
+     PUTROOTFH, RESTOREFH, RENAME, and indirectly READDIR. LINK and
+     RENAME will only receive this error if the security used for the
+     operation is inappropriate for the saved filehandle. With the
+     exception of READDIR, these operations represent the point at
+     which the client can instantiate a filehandle into the "current
+     filehandle" at the server. The filehandle is either provided by
+     the client (PUTFH, PUTPUBFH, PUTROOTFH) or generated as a result
+     of a name to filehandle translation (LOOKUP and OPEN). RESTOREFH
+     is different because the filehandle is a result of a previous
+     SAVEFH. Even though the filehandle, for RESTOREFH, might have
+     previously passed the server's inspection for a security match,
+     the server will check it again on RESTOREFH to ensure that the
+     security policy has not changed.
+
+     If the client wants to resolve an error return of
+     NFS4ERR_WRONGSEC, the following will occur:
+
+     o  For LOOKUP and OPEN, the client will use SECINFO with the same
+        current filehandle and name as provided in the original LOOKUP
+        or OPEN to enumerate the available security triples.
+
+     o  For LINK, PUTFH, RENAME, and RESTOREFH, the client will use
+        SECINFO and provide the parent directory filehandle and object
+        name which corresponds to the filehandle originally provided by
+        the PUTFH or RESTOREFH, or, for LINK and RENAME, the SAVEFH.
+
+     o  For PUTROOTFH and PUTPUBFH, the client will be unable to use
+        the SECINFO operation since SECINFO requires a current
+        filehandle and none exist for these two operations. Therefore,
+        the client must iterate through the security triples available
+        at the client and reattempt the PUTROOTFH or PUTPUBFH
+        operation. In the unfortunate event that none of the MANDATORY
+        security triples are supported by the
+
+
+
+Shepler, et al.             Standards Track                   [Page 205]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+        client and server, the client SHOULD try using others that
+        support integrity. Failing that, the client can try using
+        AUTH_NONE, but because such forms lack integrity checks, this
+        puts the client at risk. Nonetheless, the server SHOULD allow
+        the client to use whatever security form the client requests
+        and the server supports, since the risks of doing so are on the
+        client.
+
+     The READDIR operation will not directly return the
+     NFS4ERR_WRONGSEC error. However, if the READDIR request included
+     a request for attributes, it is possible that the READDIR
+     request's security triple does not match that of a directory
+     entry. If this is the case and the client has requested the
+     rdattr_error attribute, the server will return the
+     NFS4ERR_WRONGSEC error in rdattr_error for the entry.
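+
+     To illustrate the recovery sequence described above, a client
+     whose LOOKUP fails with NFS4ERR_WRONGSEC might proceed as follows
+     (an illustrative flow; "foo" is a hypothetical name):
+
+       PUTFH (directory filehandle)
+       LOOKUP "foo"              --> NFS4ERR_WRONGSEC
+
+       PUTFH (directory filehandle)
+       SECINFO "foo"             --> array of secinfo4 entries
+
+     The client then selects a flavor (or RPCSEC_GSS triple) from the
+     returned array that it supports and reissues the original request
+     using that flavor.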
+ + See the section "Security Considerations" for a discussion on the + recommendations for security flavor used by SECINFO. + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADNAME + NFS4ERR_BADXDR + NFS4ERR_FHEXPIRED + NFS4ERR_INVAL + NFS4ERR_MOVED + NFS4ERR_NAMETOOLONG + NFS4ERR_NOENT + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOTDIR + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + +14.2.32. Operation 34: SETATTR - Set Attributes + + SYNOPSIS + + (cfh), stateid, attrmask, attr_vals -> attrsset + + ARGUMENT + + struct SETATTR4args { + /* CURRENT_FH: target object */ + stateid4 stateid; + fattr4 obj_attributes; + }; + + + +Shepler, et al. Standards Track [Page 206] + +RFC 3530 NFS version 4 Protocol April 2003 + + + RESULT + + struct SETATTR4res { + nfsstat4 status; + bitmap4 attrsset; + }; + + DESCRIPTION + + The SETATTR operation changes one or more of the attributes of a + filesystem object. The new attributes are specified with a bitmap + and the attributes that follow the bitmap in bit order. + + The stateid argument for SETATTR is used to provide file locking + context that is necessary for SETATTR requests that set the size + attribute. Since setting the size attribute modifies the file's + data, it has the same locking requirements as a corresponding WRITE. + Any SETATTR that sets the size attribute is incompatible with a share + reservation that specifies DENY_WRITE. The area between the old + end-of-file and the new end-of-file is considered to be modified just + as would have been the case had the area in question been specified + as the target of WRITE, for the purpose of checking conflicts with + record locks, for those cases in which a server is implementing + mandatory record locking behavior. A valid stateid should always be + specified. When the file size attribute is not set, the special + stateid consisting of all bits zero should be passed. + + On either success or failure of the operation, the server will return + the attrsset bitmask to represent what (if any) attributes were + successfully set. The attrsset in the response is a subset of the + bitmap4 that is part of the obj_attributes in the argument. + + On success, the current filehandle retains its value. + + IMPLEMENTATION + + If the request specifies the owner attribute to be set, the server + should allow the operation to succeed if the current owner of the + object matches the value specified in the request. Some servers may + be implemented in a way as to prohibit the setting of the owner + attribute unless the requester has privilege to do so. If the server + is lenient in this one case of matching owner values, the client + implementation may be simplified in cases of creation of an object + followed by a SETATTR. + + The file size attribute is used to request changes to the size of a + file. A value of 0 (zero) causes the file to be truncated, a value + less than the current size of the file causes data from new size to + + + +Shepler, et al. Standards Track [Page 207] + +RFC 3530 NFS version 4 Protocol April 2003 + + + the end of the file to be discarded, and a size greater than the + current size of the file causes logically zeroed data bytes to be + added to the end of the file. Servers are free to implement this + using holes or actual zero data bytes. Clients should not make any + assumptions regarding a server's implementation of this feature, + beyond that the bytes returned will be zeroed. Servers must support + extending the file size via SETATTR. 
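+
+   As an illustration (not a normative requirement), a file might be
+   truncated and its resulting attributes fetched in one COMPOUND,
+   where the stateid is one returned by a previous OPEN of the file:
+
+       PUTFH (file filehandle)
+       SETATTR stateid, {size = 0}
+       GETATTR attrbits (post-op file attrs)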
+ + SETATTR is not guaranteed atomic. A failed SETATTR may partially + change a file's attributes. + + Changing the size of a file with SETATTR indirectly changes the + time_modify. A client must account for this as size changes can + result in data deletion. + + The attributes time_access_set and time_modify_set are write-only + attributes constructed as a switched union so the client can direct + the server in setting the time values. If the switched union + specifies SET_TO_CLIENT_TIME4, the client has provided an nfstime4 to + be used for the operation. If the switch union does not specify + SET_TO_CLIENT_TIME4, the server is to use its current time for the + SETATTR operation. + + If server and client times differ, programs that compare client time + to file times can break. A time maintenance protocol should be used + to limit client/server time skew. + + Use of a COMPOUND containing a VERIFY operation specifying only the + change attribute, immediately followed by a SETATTR, provides a means + whereby a client may specify a request that emulates the + functionality of the SETATTR guard mechanism of NFS version 3. Since + the function of the guard mechanism is to avoid changes to the file + attributes based on stale information, delays between checking of the + guard condition and the setting of the attributes have the potential + to compromise this function, as would the corresponding delay in the + NFS version 4 emulation. Therefore, NFS version 4 servers should + take care to avoid such delays, to the degree possible, when + executing such a request. + + If the server does not support an attribute as requested by the + client, the server should return NFS4ERR_ATTRNOTSUPP. + + A mask of the attributes actually set is returned by SETATTR in all + cases. That mask must not include attributes bits not requested to + be set by the client, and must be equal to the mask of attributes + requested to be set only if the SETATTR completes without error. + + + + + +Shepler, et al. Standards Track [Page 208] + +RFC 3530 NFS version 4 Protocol April 2003 + + + ERRORS + + NFS4ERR_ACCESS + NFS4ERR_ADMIN_REVOKED + NFS4ERR_ATTRNOTSUPP + NFS4ERR_BADCHAR + NFS4ERR_BADHANDLE + NFS4ERR_BADOWNER + NFS4ERR_BAD_STATEID + NFS4ERR_BADXDR + NFS4ERR_DELAY + NFS4ERR_DQUOT + NFS4ERR_EXPIRED + NFS4ERR_FBIG + NFS4ERR_FHEXPIRED + NFS4ERR_GRACE + NFS4ERR_INVAL + NFS4ERR_IO + NFS4ERR_ISDIR + NFS4ERR_LOCKED + NFS4ERR_MOVED + NFS4ERR_NOFILEHANDLE + NFS4ERR_NOSPC + NFS4ERR_OLD_STATEID + NFS4ERR_OPENMODE + NFS4ERR_PERM + NFS4ERR_RESOURCE + NFS4ERR_ROFS + NFS4ERR_SERVERFAULT + NFS4ERR_STALE + NFS4ERR_STALE_STATEID + +14.2.33. Operation 35: SETCLIENTID - Negotiate Clientid + + SYNOPSIS + + client, callback, callback_ident -> clientid, setclientid_confirm + + ARGUMENT + + struct SETCLIENTID4args { + nfs_client_id4 client; + cb_client4 callback; + uint32_t callback_ident; + }; + + + + + + +Shepler, et al. 
Standards Track [Page 209] + +RFC 3530 NFS version 4 Protocol April 2003 + + + RESULT + + struct SETCLIENTID4resok { + clientid4 clientid; + verifier4 setclientid_confirm; + }; + + union SETCLIENTID4res switch (nfsstat4 status) { + case NFS4_OK: + SETCLIENTID4resok resok4; + case NFS4ERR_CLID_INUSE: + clientaddr4 client_using; + default: + void; + }; + + DESCRIPTION + + The client uses the SETCLIENTID operation to notify the server of its + intention to use a particular client identifier, callback, and + callback_ident for subsequent requests that entail creating lock, + share reservation, and delegation state on the server. Upon + successful completion the server will return a shorthand clientid + which, if confirmed via a separate step, will be used in subsequent + file locking and file open requests. Confirmation of the clientid + must be done via the SETCLIENTID_CONFIRM operation to return the + clientid and setclientid_confirm values, as verifiers, to the server. + The reason why two verifiers are necessary is that it is possible to + use SETCLIENTID and SETCLIENTID_CONFIRM to modify the callback and + callback_ident information but not the shorthand clientid. In that + event, the setclientid_confirm value is effectively the only + verifier. + + The callback information provided in this operation will be used if + the client is provided an open delegation at a future point. + Therefore, the client must correctly reflect the program and port + numbers for the callback program at the time SETCLIENTID is used. + + The callback_ident value is used by the server on the callback. The + client can leverage the callback_ident to eliminate the need for more + than one callback RPC program number, while still being able to + determine which server is initiating the callback. + + + + + + + + + +Shepler, et al. Standards Track [Page 210] + +RFC 3530 NFS version 4 Protocol April 2003 + + + IMPLEMENTATION + + To understand how to implement SETCLIENTID, make the following + notations. Let: + + x be the value of the client.id subfield of the SETCLIENTID4args + structure. + + v be the value of the client.verifier subfield of the + SETCLIENTID4args structure. + + c be the value of the clientid field returned in the + SETCLIENTID4resok structure. + + k represent the value combination of the fields callback and + callback_ident fields of the SETCLIENTID4args structure. + + s be the setclientid_confirm value returned in the + SETCLIENTID4resok structure. + + { v, x, c, k, s } + be a quintuple for a client record. A client record is + confirmed if there has been a SETCLIENTID_CONFIRM operation to + confirm it. Otherwise it is unconfirmed. An unconfirmed + record is established by a SETCLIENTID call. + + Since SETCLIENTID is a non-idempotent operation, let us assume that + the server is implementing the duplicate request cache (DRC). + + When the server gets a SETCLIENTID { v, x, k } request, it processes + it in the following manner. + + o It first looks up the request in the DRC. If there is a hit, it + returns the result cached in the DRC. The server does NOT remove + client state (locks, shares, delegations) nor does it modify any + recorded callback and callback_ident information for client { x }. + + For any DRC miss, the server takes the client id string x, and + searches for client records for x that the server may have + recorded from previous SETCLIENTID calls. 
For any confirmed record
+     with the same id string x, if the recorded principal does not
+     match that of the SETCLIENTID call, then the server returns a
+     NFS4ERR_CLID_INUSE error.
+
+     For brevity of discussion, the remaining description of the
+     processing assumes that there was a DRC miss, and that where the
+     server has previously recorded a confirmed record for client x,
+     the aforementioned principal check has successfully passed.
+
+
+
+Shepler, et al.             Standards Track                   [Page 211]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   o  The server checks if it has recorded a confirmed record for { v,
+      x, c, l, s }, where l may or may not equal k. If so, and since
+      the id verifier v of the request matches that which is confirmed
+      and recorded, the server treats this as a probable callback
+      information update and records an unconfirmed { v, x, c, k, t }
+      and leaves the confirmed { v, x, c, l, s } in place, such that t
+      != s. It does not matter if k equals l or not. Any pre-existing
+      unconfirmed { v, x, c, *, * } is removed.
+
+      The server returns { c, t }. It is indeed returning the old
+      clientid4 value c, because the client apparently only wants to
+      update the callback value from l to k. It's possible this
+      request is one from the Byzantine router that has stale callback
+      information, but this is not a problem. The callback information
+      update is only confirmed if followed up by a SETCLIENTID_CONFIRM
+      { c, t }.
+
+      The server awaits confirmation of k via
+      SETCLIENTID_CONFIRM { c, t }.
+
+      The server does NOT remove client (lock/share/delegation) state
+      for x.
+
+   o  The server has previously recorded a confirmed { u, x, c, l, s }
+      record such that v != u, l may or may not equal k, and has not
+      recorded any unconfirmed { *, x, *, *, * } record for x. The
+      server records an unconfirmed { v, x, d, k, t } (d != c, t != s).
+
+      The server returns { d, t }.
+
+      The server awaits confirmation of { d, k } via
+      SETCLIENTID_CONFIRM { d, t }.
+
+      The server does NOT remove client (lock/share/delegation) state
+      for x.
+
+   o  The server has previously recorded a confirmed { u, x, c, l, s }
+      record such that v != u, l may or may not equal k, and recorded
+      an unconfirmed { w, x, d, m, t } record such that c != d, t != s,
+      m may or may not equal k, m may or may not equal l, and k may or
+      may not equal l. Whether w == v or w != v makes no difference.
+      The server simply removes the unconfirmed { w, x, d, m, t }
+      record and replaces it with an unconfirmed { v, x, e, k, r }
+      record, such that e != d, e != c, r != t, r != s.
+
+      The server returns { e, r }.
+
+      The server awaits confirmation of { e, k } via
+      SETCLIENTID_CONFIRM { e, r }.
+
+
+
+Shepler, et al.             Standards Track                   [Page 212]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      The server does NOT remove client (lock/share/delegation) state
+      for x.
+
+   o  The server has no confirmed { *, x, *, *, * } for x. It may or
+      may not have recorded an unconfirmed { u, x, c, l, s }, where l
+      may or may not equal k, and u may or may not equal v. Any
+      unconfirmed record { u, x, c, l, * }, regardless whether u == v
+      or l == k, is replaced with an unconfirmed record { v, x, d, k,
+      t } where d != c, t != s.
+
+      The server returns { d, t }.
+
+      The server awaits confirmation of { d, k } via
+      SETCLIENTID_CONFIRM { d, t }. The server does NOT remove client
+      (lock/share/delegation) state for x.
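+
+   In the common case of a client establishing a fresh client ID (no
+   confirmed or unconfirmed record for x exists), the exchange reduces
+   to the following sketch, using the notation above:
+
+       SETCLIENTID { v, x, k }        --> { c, s }
+       SETCLIENTID_CONFIRM { c, s }   --> NFS4_OK
+
+   The client may then use the clientid c in subsequent OPEN and lock
+   requests, and the server treats the record as confirmed.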
+ + The server generates the clientid and setclientid_confirm values and + must take care to ensure that these values are extremely unlikely to + ever be regenerated. + + ERRORS + + NFS4ERR_BADXDR + NFS4ERR_CLID_INUSE + NFS4ERR_INVAL + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + +14.2.34. Operation 36: SETCLIENTID_CONFIRM - Confirm Clientid + + SYNOPSIS + + clientid, verifier -> - + + ARGUMENT + + struct SETCLIENTID_CONFIRM4args { + clientid4 clientid; + verifier4 setclientid_confirm; + }; + + RESULT + + struct SETCLIENTID_CONFIRM4res { + nfsstat4 status; + }; + + + + + +Shepler, et al. Standards Track [Page 213] + +RFC 3530 NFS version 4 Protocol April 2003 + + + DESCRIPTION + + This operation is used by the client to confirm the results from a + previous call to SETCLIENTID. The client provides the server + supplied (from a SETCLIENTID response) clientid. The server responds + with a simple status of success or failure. + + IMPLEMENTATION + + The client must use the SETCLIENTID_CONFIRM operation to confirm the + following two distinct cases: + + o The client's use of a new shorthand client identifier (as returned + from the server in the response to SETCLIENTID), a new callback + value (as specified in the arguments to SETCLIENTID) and a new + callback_ident (as specified in the arguments to SETCLIENTID) + value. The client's use of SETCLIENTID_CONFIRM in this case also + confirms the removal of any of the client's previous relevant + leased state. Relevant leased client state includes record locks, + share reservations, and where the server does not support the + CLAIM_DELEGATE_PREV claim type, delegations. If the server + supports CLAIM_DELEGATE_PREV, then SETCLIENTID_CONFIRM MUST NOT + remove delegations for this client; relevant leased client state + would then just include record locks and share reservations. + + o The client's re-use of an old, previously confirmed, shorthand + client identifier, a new callback value, and a new callback_ident + value. The client's use of SETCLIENTID_CONFIRM in this case MUST + NOT result in the removal of any previous leased state (locks, + share reservations, and delegations) + + We use the same notation and definitions for v, x, c, k, s, and + unconfirmed and confirmed client records as introduced in the + description of the SETCLIENTID operation. The arguments to + SETCLIENTID_CONFIRM are indicated by the notation { c, s }, where c + is a value of type clientid4, and s is a value of type verifier4 + corresponding to the setclientid_confirm field. + + As with SETCLIENTID, SETCLIENTID_CONFIRM is a non-idempotent + operation, and we assume that the server is implementing the + duplicate request cache (DRC). + + When the server gets a SETCLIENTID_CONFIRM { c, s } request, it + processes it in the following manner. + + + + + + + +Shepler, et al. Standards Track [Page 214] + +RFC 3530 NFS version 4 Protocol April 2003 + + + o It first looks up the request in the DRC. If there is a hit, it + returns the result cached in the DRC. The server does not remove + any relevant leased client state nor does it modify any recorded + callback and callback_ident information for client { x } as + represented by the shorthand value c. + + For a DRC miss, the server checks for client records that match the + shorthand value c. The processing cases are as follows: + + o The server has recorded an unconfirmed { v, x, c, k, s } record + and a confirmed { v, x, c, l, t } record, such that s != t. 
If
+     the principals of the records do not match that of the
+     SETCLIENTID_CONFIRM, the server returns NFS4ERR_CLID_INUSE, and no
+     relevant leased client state is removed and no recorded callback
+     and callback_ident information for client { x } is changed.
+     Otherwise, the confirmed { v, x, c, l, t } record is removed and
+     the unconfirmed { v, x, c, k, s } is marked as confirmed, thereby
+     modifying recorded and confirmed callback and callback_ident
+     information for client { x }.
+
+     The server does not remove any relevant leased client state.
+
+     The server returns NFS4_OK.
+
+   o  The server has not recorded an unconfirmed { v, x, c, *, * } and
+      has recorded a confirmed { v, x, c, *, s }. If the principals of
+      the record and of SETCLIENTID_CONFIRM do not match, the server
+      returns NFS4ERR_CLID_INUSE without removing any relevant leased
+      client state and without changing recorded callback and
+      callback_ident values for client { x }.
+
+      If the principals match, then what has likely happened is that
+      the client never got the response from the SETCLIENTID_CONFIRM,
+      and the DRC entry has been purged. Whatever the scenario, since
+      the principals match, as well as { c, s } matching a confirmed
+      record, the server leaves client x's relevant leased client state
+      intact, leaves its callback and callback_ident values unmodified,
+      and returns NFS4_OK.
+
+   o  The server has not recorded a confirmed { *, *, c, *, * }, and
+      has recorded an unconfirmed { *, x, c, k, s }. Even if this is a
+      retry from the client, nonetheless the client's first
+      SETCLIENTID_CONFIRM attempt was not received by the server.
+      Retry or not, the server doesn't know, but it processes it as if
+      it were a first try. If the principal of the unconfirmed { *, x,
+      c, k, s } record mismatches that of the SETCLIENTID_CONFIRM
+      request the server returns NFS4ERR_CLID_INUSE without removing
+      any relevant leased client state.
+
+
+
+Shepler, et al.             Standards Track                   [Page 215]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      Otherwise, the server records a confirmed { *, x, c, k, s }. If
+      there is also a confirmed { *, x, d, *, t }, the server MUST
+      remove the client x's relevant leased client state, and overwrite
+      the callback state with k. The confirmed record { *, x, d, *, t }
+      is removed.
+
+      Server returns NFS4_OK.
+
+   o  The server has no record of a confirmed or unconfirmed { *, *, c,
+      *, s }. The server returns NFS4ERR_STALE_CLIENTID. The server
+      does not remove any relevant leased client state, nor does it
+      modify any recorded callback and callback_ident information for
+      any client.
+
+   The server needs to cache unconfirmed { v, x, c, k, s } client
+   records and await their confirmation for some time. As should be
+   clear from the record processing discussions for SETCLIENTID and
+   SETCLIENTID_CONFIRM, there are cases where the server does not
+   deterministically remove unconfirmed client records. To avoid
+   running out of resources, the server is not required to hold
+   unconfirmed records indefinitely. One strategy the server might use
+   is to set a limit on how many unconfirmed client records it will
+   maintain, and then when the limit would be exceeded, remove the
+   oldest record. Another strategy might be to remove an unconfirmed
+   record when some amount of time has elapsed. The choice of the
+   amount of time is fairly arbitrary but it is surely no higher than
+   the server's lease time period. Consider that leases need to be
+   renewed before the lease time expires via an operation from the
+   client.
If
+   the client cannot issue a SETCLIENTID_CONFIRM after a SETCLIENTID
+   before a period of time equal to that of a lease expires, then the
+   client is unlikely to be able to maintain state on the server during
+   steady state operation.
+
+   If the client does send a SETCLIENTID_CONFIRM for an unconfirmed
+   record that the server has already deleted, the client will get
+   NFS4ERR_STALE_CLIENTID back. If so, the client should then start
+   over, and send SETCLIENTID to reestablish an unconfirmed client
+   record and get back an unconfirmed clientid and setclientid_confirm
+   verifier. The client should then send the SETCLIENTID_CONFIRM to
+   confirm the clientid.
+
+   SETCLIENTID_CONFIRM does not establish or renew a lease. However,
+   if SETCLIENTID_CONFIRM removes relevant leased client state, and
+   that state does not include existing delegations, the server MUST
+   allow the client a period of time no less than the value of the
+   lease_time attribute, to reclaim (via the CLAIM_DELEGATE_PREV claim
+   type of the OPEN operation) its delegations before removing
+   unreclaimed delegations.
+
+
+
+Shepler, et al.             Standards Track                   [Page 216]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ERRORS
+
+     NFS4ERR_BADXDR
+     NFS4ERR_CLID_INUSE
+     NFS4ERR_RESOURCE
+     NFS4ERR_SERVERFAULT
+     NFS4ERR_STALE_CLIENTID
+
+14.2.35. Operation 37: VERIFY - Verify Same Attributes
+
+   SYNOPSIS
+
+     (cfh), fattr -> -
+
+   ARGUMENT
+
+     struct VERIFY4args {
+             /* CURRENT_FH: object */
+             fattr4          obj_attributes;
+     };
+
+   RESULT
+
+     struct VERIFY4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+     The VERIFY operation is used to verify that attributes have a
+     value assumed by the client before proceeding with the following
+     operations in the compound request. If any of the attributes do
+     not match then the error NFS4ERR_NOT_SAME must be returned. The
+     current filehandle retains its value after successful completion
+     of the operation.
+
+   IMPLEMENTATION
+
+     One possible use of the VERIFY operation is the following compound
+     sequence. With this sequence, the client is attempting to verify
+     that the file being removed will match what the client expects to
+     be removed. This sequence can help prevent the unintended
+     deletion of a file.
+
+       PUTFH (directory filehandle)
+       LOOKUP (file name)
+       VERIFY (filehandle == fh)
+       PUTFH (directory filehandle)
+       REMOVE (file name)
+
+
+
+Shepler, et al.             Standards Track                   [Page 217]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     This sequence does not prevent a second client from removing and
+     creating a new file in the middle of this sequence but it does
+     help avoid the unintended result.
+
+     In the case that a recommended attribute is specified in the
+     VERIFY operation and the server does not support that attribute
+     for the filesystem object, the error NFS4ERR_ATTRNOTSUPP is
+     returned to the client.
+
+     When the attribute rdattr_error or any write-only attribute (e.g.,
+     time_modify_set) is specified, the error NFS4ERR_INVAL is returned
+     to the client.
+
+   ERRORS
+
+     NFS4ERR_ACCESS
+     NFS4ERR_ATTRNOTSUPP
+     NFS4ERR_BADCHAR
+     NFS4ERR_BADHANDLE
+     NFS4ERR_BADXDR
+     NFS4ERR_DELAY
+     NFS4ERR_FHEXPIRED
+     NFS4ERR_INVAL
+     NFS4ERR_MOVED
+     NFS4ERR_NOFILEHANDLE
+     NFS4ERR_NOT_SAME
+     NFS4ERR_RESOURCE
+     NFS4ERR_SERVERFAULT
+     NFS4ERR_STALE
+
+14.2.36.
Operation 38: WRITE - Write to File + + SYNOPSIS + + (cfh), stateid, offset, stable, data -> count, committed, writeverf + + ARGUMENT + + enum stable_how4 { + UNSTABLE4 = 0, + DATA_SYNC4 = 1, + FILE_SYNC4 = 2 + }; + + struct WRITE4args { + /* CURRENT_FH: file */ + stateid4 stateid; + offset4 offset; + + + +Shepler, et al. Standards Track [Page 218] + +RFC 3530 NFS version 4 Protocol April 2003 + + + stable_how4 stable; + opaque data<>; + }; + + RESULT + + struct WRITE4resok { + count4 count; + stable_how4 committed; + verifier4 writeverf; + }; + + union WRITE4res switch (nfsstat4 status) { + case NFS4_OK: + WRITE4resok resok4; + default: + void; + }; + + DESCRIPTION + + The WRITE operation is used to write data to a regular file. The + target file is specified by the current filehandle. The offset + specifies the offset where the data should be written. An offset of + 0 (zero) specifies that the write should start at the beginning of + the file. The count, as encoded as part of the opaque data + parameter, represents the number of bytes of data that are to be + written. If the count is 0 (zero), the WRITE will succeed and return + a count of 0 (zero) subject to permissions checking. The server may + choose to write fewer bytes than requested by the client. + + Part of the write request is a specification of how the write is to + be performed. The client specifies with the stable parameter the + method of how the data is to be processed by the server. If stable + is FILE_SYNC4, the server must commit the data written plus all + filesystem metadata to stable storage before returning results. This + corresponds to the NFS version 2 protocol semantics. Any other + behavior constitutes a protocol violation. If stable is DATA_SYNC4, + then the server must commit all of the data to stable storage and + enough of the metadata to retrieve the data before returning. The + server implementor is free to implement DATA_SYNC4 in the same + fashion as FILE_SYNC4, but with a possible performance drop. If + stable is UNSTABLE4, the server is free to commit any part of the + data and the metadata to stable storage, including all or none, + before returning a reply to the client. There is no guarantee whether + or when any uncommitted data will subsequently be committed to stable + storage. The only guarantees made by the server are that it will not + + + + +Shepler, et al. Standards Track [Page 219] + +RFC 3530 NFS version 4 Protocol April 2003 + + + destroy any data without changing the value of verf and that it will + not commit the data and metadata at a level less than that requested + by the client. + + The stateid value for a WRITE request represents a value returned + from a previous record lock or share reservation request. The + stateid is used by the server to verify that the associated share + reservation and any record locks are still valid and to update lease + timeouts for the client. + + Upon successful completion, the following results are returned. The + count result is the number of bytes of data written to the file. The + server may write fewer bytes than requested. If so, the actual number + of bytes written starting at location, offset, is returned. + + The server also returns an indication of the level of commitment of + the data and metadata via committed. If the server committed all data + and metadata to stable storage, committed should be set to + FILE_SYNC4. If the level of commitment was at least as strong as + DATA_SYNC4, then committed should be set to DATA_SYNC4. 
Otherwise,
+   committed must be returned as UNSTABLE4. If stable was FILE_SYNC4,
+   then committed must also be FILE_SYNC4: anything else constitutes a
+   protocol violation. If stable was DATA_SYNC4, then committed may be
+   FILE_SYNC4 or DATA_SYNC4: anything else constitutes a protocol
+   violation. If stable was UNSTABLE4, then committed may be either
+   FILE_SYNC4, DATA_SYNC4, or UNSTABLE4.
+
+   The final portion of the result is the write verifier. The write
+   verifier is a cookie that the client can use to determine whether
+   the server has changed instance (boot) state between a call to WRITE
+   and a subsequent call to either WRITE or COMMIT. This cookie must
+   be consistent during a single instance of the NFS version 4 protocol
+   service and must be unique between instances of the NFS version 4
+   protocol server, where uncommitted data may be lost.
+
+   If a client writes data to the server with the stable argument set
+   to UNSTABLE4 and the reply yields a committed response of DATA_SYNC4
+   or UNSTABLE4, the client will follow up some time in the future with
+   a COMMIT operation to synchronize outstanding asynchronous data and
+   metadata with the server's stable storage, barring client error. It
+   is possible that due to client crash or other error that a
+   subsequent COMMIT will not be received by the server.
+
+   For a WRITE with a stateid value of all bits 0, the server MAY allow
+   the WRITE to be serviced subject to mandatory file locks or the
+   current share deny modes for the file. For a WRITE with a stateid
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 220]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   value of all bits 1, the server MUST NOT allow the WRITE operation
+   to bypass locking checks at the server; such a WRITE is treated
+   exactly the same as if a stateid of all bits 0 were used.
+
+   On success, the current filehandle retains its value.
+
+   IMPLEMENTATION
+
+     It is possible for the server to write fewer bytes of data than
+     requested by the client. In this case, the server should not
+     return an error unless no data was written at all. If the server
+     writes less than the number of bytes specified, the client should
+     issue another WRITE to write the remaining data.
+
+     It is assumed that the act of writing data to a file will cause
+     the time_modify of the file to be updated. However, the
+     time_modify of the file should not be changed unless the contents
+     of the file are changed. Thus, a WRITE request with count set to
+     0 should not cause the time_modify of the file to be updated.
+
+     The definition of stable storage has been historically a point of
+     contention. The following expected properties of stable storage
+     may help in resolving design issues in the implementation. Stable
+     storage is persistent storage that survives:
+
+     1. Repeated power failures.
+     2. Hardware failures (of any board, power supply, etc.).
+     3. Repeated software crashes, including reboot cycle.
+
+     This definition does not address failure of the stable storage
+     module itself.
+
+     The verifier is defined to allow a client to detect different
+     instances of an NFS version 4 protocol server over which cached,
+     uncommitted data may be lost. In the most likely case, the
+     verifier allows the client to detect server reboots. This
+     information is required so that the client can safely determine
+     whether the server could have lost cached data.
If the server fails unexpectedly and
+     the client has uncommitted data from previous WRITE requests (done
+     with the stable argument set to UNSTABLE4 and in which the result
+     committed was returned as UNSTABLE4 as well) it may not have
+     flushed cached data to stable storage. The burden of recovery is
+     on the client and the client will need to retransmit the data to
+     the server.
+
+     A suggested verifier would be to use the time that the server was
+     booted or the time the server was last started (if restarting the
+     server without a reboot results in lost buffers).
+
+
+
+Shepler, et al.             Standards Track                   [Page 221]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     The committed field in the results allows the client to do more
+     effective caching. If the server is committing all WRITE requests
+     to stable storage, then it should return with committed set to
+     FILE_SYNC4, regardless of the value of the stable field in the
+     arguments. A server that uses an NVRAM accelerator may choose to
+     implement this policy. The client can use this to increase the
+     effectiveness of the cache by discarding cached data that has
+     already been committed on the server.
+
+     Some implementations may return NFS4ERR_NOSPC instead of
+     NFS4ERR_DQUOT when a user's quota is exceeded. In the case that
+     the current filehandle is a directory, the server will return
+     NFS4ERR_ISDIR. If the current filehandle is not a regular file or
+     a directory, the server will return NFS4ERR_INVAL.
+
+     If mandatory file locking is on for the file, and the
+     corresponding record of the data to be written is read or write
+     locked by an owner that is not associated with the stateid, the
+     server will return NFS4ERR_LOCKED. If so, the client must check
+     if the owner corresponding to the stateid used with the WRITE
+     operation has a conflicting read lock that overlaps with the
+     region that was to be written. If the stateid's owner has no
+     conflicting read lock, then the client should try to get the
+     appropriate write record lock via the LOCK operation before
+     re-attempting the WRITE. When the WRITE completes, the client
+     should release the record lock via LOCKU.
+
+     If the stateid's owner had a conflicting read lock, then the
+     client has no choice but to return an error to the application
+     that attempted the WRITE. The reason is that since the stateid's
+     owner had a read lock, the server either attempted to temporarily
+     effectively upgrade this read lock to a write lock, or the server
+     has no upgrade capability. If the server attempted to upgrade the
+     read lock and failed, it is pointless for the client to re-attempt
+     the upgrade via the LOCK operation, because there might be another
+     client also trying to upgrade. If two clients are blocked trying
+     to upgrade the same lock, the clients deadlock. If the server has
+     no upgrade capability, then it is pointless to try a LOCK
+     operation to upgrade.
+
+   ERRORS
+
+     NFS4ERR_ACCESS
+     NFS4ERR_ADMIN_REVOKED
+     NFS4ERR_BADHANDLE
+     NFS4ERR_BAD_STATEID
+     NFS4ERR_BADXDR
+     NFS4ERR_DELAY
+     NFS4ERR_DQUOT
+     NFS4ERR_EXPIRED
+
+
+
+Shepler, et al.             Standards Track                   [Page 222]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     NFS4ERR_FBIG
+     NFS4ERR_FHEXPIRED
+     NFS4ERR_GRACE
+     NFS4ERR_INVAL
+     NFS4ERR_IO
+     NFS4ERR_ISDIR
+     NFS4ERR_LEASE_MOVED
+     NFS4ERR_LOCKED
+     NFS4ERR_MOVED
+     NFS4ERR_NOFILEHANDLE
+     NFS4ERR_NOSPC
+     NFS4ERR_NXIO
+     NFS4ERR_OLD_STATEID
+     NFS4ERR_OPENMODE
+     NFS4ERR_RESOURCE
+     NFS4ERR_ROFS
+     NFS4ERR_SERVERFAULT
+     NFS4ERR_STALE
+     NFS4ERR_STALE_STATEID
+
+14.2.37.
Operation 39: RELEASE_LOCKOWNER - Release Lockowner State + + SYNOPSIS + + lockowner -> () + + ARGUMENT + + struct RELEASE_LOCKOWNER4args { + lock_owner4 lock_owner; + }; + + RESULT + + struct RELEASE_LOCKOWNER4res { + nfsstat4 status; + }; + + DESCRIPTION + + This operation is used to notify the server that the lock_owner is no + longer in use by the client. This allows the server to release + cached state related to the specified lock_owner. If file locks, + associated with the lock_owner, are held at the server, the error + NFS4ERR_LOCKS_HELD will be returned and no further action will be + taken. + + + + + +Shepler, et al. Standards Track [Page 223] + +RFC 3530 NFS version 4 Protocol April 2003 + + + IMPLEMENTATION + + The client may choose to use this operation to ease the amount of + server state that is held. Depending on behavior of applications at + the client, it may be important for the client to use this operation + since the server has certain obligations with respect to holding a + reference to a lock_owner as long as the associated file is open. + Therefore, if the client knows for certain that the lock_owner will + no longer be used under the context of the associated open_owner4, it + should use RELEASE_LOCKOWNER. + + ERRORS + + NFS4ERR_ADMIN_REVOKED + NFS4ERR_BADXDR + NFS4ERR_EXPIRED + NFS4ERR_LEASE_MOVED + NFS4ERR_LOCKS_HELD + NFS4ERR_RESOURCE + NFS4ERR_SERVERFAULT + NFS4ERR_STALE_CLIENTID + +14.2.38. Operation 10044: ILLEGAL - Illegal operation + + SYNOPSIS + + -> () + + ARGUMENT + + void; + + RESULT + + struct ILLEGAL4res { + nfsstat4 status; + }; + + DESCRIPTION + + This operation is a placeholder for encoding a result to handle the + case of the client sending an operation code within COMPOUND that is + not supported. See the COMPOUND procedure description for more + details. + + The status field of ILLEGAL4res MUST be set to NFS4ERR_OP_ILLEGAL. + + + + + +Shepler, et al. Standards Track [Page 224] + +RFC 3530 NFS version 4 Protocol April 2003 + + + IMPLEMENTATION + + A client will probably not send an operation with code OP_ILLEGAL but + if it does, the response will be ILLEGAL4res just as it would be with + any other invalid operation code. Note that if the server gets an + illegal operation code that is not OP_ILLEGAL, and if the server + checks for legal operation codes during the XDR decode phase, then + the ILLEGAL4res would not be returned. + + ERRORS + + NFS4ERR_OP_ILLEGAL + +15. NFS version 4 Callback Procedures + + The procedures used for callbacks are defined in the following + sections. In the interest of clarity, the terms "client" and + "server" refer to NFS clients and servers, despite the fact that for + an individual callback RPC, the sense of these terms would be + precisely the opposite. + +15.1. Procedure 0: CB_NULL - No Operation + + SYNOPSIS + + + + ARGUMENT + + void; + + RESULT + + void; + + DESCRIPTION + + Standard NULL procedure. Void argument, void response. Even though + there is no direct functionality associated with this procedure, the + server will use CB_NULL to confirm the existence of a path for RPCs + from server to client. + + ERRORS + + None. + + + + + + +Shepler, et al. Standards Track [Page 225] + +RFC 3530 NFS version 4 Protocol April 2003 + + +15.2. 
Procedure 1: CB_COMPOUND - Compound Operations
+
+   SYNOPSIS
+
+     compoundargs -> compoundres
+
+   ARGUMENT
+
+     enum nfs_cb_opnum4 {
+             OP_CB_GETATTR           = 3,
+             OP_CB_RECALL            = 4,
+             OP_CB_ILLEGAL           = 10044
+     };
+
+     union nfs_cb_argop4 switch (unsigned argop) {
+      case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr;
+      case OP_CB_RECALL:  CB_RECALL4args  opcbrecall;
+      case OP_CB_ILLEGAL: void            opcbillegal;
+     };
+
+     struct CB_COMPOUND4args {
+             utf8str_cs      tag;
+             uint32_t        minorversion;
+             uint32_t        callback_ident;
+             nfs_cb_argop4   argarray<>;
+     };
+
+   RESULT
+
+     union nfs_cb_resop4 switch (unsigned resop) {
+      case OP_CB_GETATTR: CB_GETATTR4res  opcbgetattr;
+      case OP_CB_RECALL:  CB_RECALL4res   opcbrecall;
+      case OP_CB_ILLEGAL: CB_ILLEGAL4res  opcbillegal;
+     };
+
+     struct CB_COMPOUND4res {
+             nfsstat4        status;
+             utf8str_cs      tag;
+             nfs_cb_resop4   resarray<>;
+     };
+
+   DESCRIPTION
+
+     The CB_COMPOUND procedure is used to combine one or more of the
+     callback procedures into a single RPC request. The callback RPC
+     program has two main procedures: CB_NULL and CB_COMPOUND. All
+     other operations use the CB_COMPOUND procedure as a wrapper.
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 226]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+     In the processing of the CB_COMPOUND procedure, the client may
+     find that it does not have the available resources to execute any
+     or all of the operations within the CB_COMPOUND sequence. In this
+     case, the error NFS4ERR_RESOURCE will be returned for the
+     particular operation within the CB_COMPOUND procedure where the
+     resource exhaustion occurred. This assumes that all previous
+     operations within the CB_COMPOUND sequence have been evaluated
+     successfully.
+
+     Contained within the CB_COMPOUND results is a 'status' field.
+     This status must be equivalent to the status of the last operation
+     that was executed within the CB_COMPOUND procedure. Therefore, if
+     an operation incurred an error then the 'status' value will be the
+     same error value as is being returned for the operation that
+     failed.
+
+     For the definition of the "tag" field, see the section "Procedure
+     1: COMPOUND - Compound Operations".
+
+     The value of callback_ident is supplied by the client during
+     SETCLIENTID. The server must use the client-supplied
+     callback_ident during the CB_COMPOUND to allow the client to
+     properly identify the server.
+
+     Illegal operation codes are handled in the same way as they are
+     handled for the COMPOUND procedure.
+
+   IMPLEMENTATION
+
+     The CB_COMPOUND procedure is used to combine individual operations
+     into a single RPC request. The client interprets each of the
+     operations in turn. If an operation is executed by the client and
+     the status of that operation is NFS4_OK, then the next operation
+     in the CB_COMPOUND procedure is executed. The client continues
+     this process until there are no more operations to be executed or
+     one of the operations has a status value other than NFS4_OK.
+
+   ERRORS
+
+     NFS4ERR_BADHANDLE
+     NFS4ERR_BAD_STATEID
+     NFS4ERR_BADXDR
+     NFS4ERR_OP_ILLEGAL
+     NFS4ERR_RESOURCE
+     NFS4ERR_SERVERFAULT
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 227]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+15.2.1.
Operation 3: CB_GETATTR - Get Attributes
+
+   SYNOPSIS
+
+     fh, attr_request -> attrmask, attr_vals
+
+   ARGUMENT
+
+     struct CB_GETATTR4args {
+             nfs_fh4 fh;
+             bitmap4 attr_request;
+     };
+
+   RESULT
+
+     struct CB_GETATTR4resok {
+             fattr4  obj_attributes;
+     };
+
+     union CB_GETATTR4res switch (nfsstat4 status) {
+      case NFS4_OK:
+              CB_GETATTR4resok       resok4;
+      default:
+              void;
+     };
+
+   DESCRIPTION
+
+     The CB_GETATTR operation is used by the server to obtain the
+     current modified state of a file that has been write delegated.
+     The attributes size and change are the only ones guaranteed to be
+     serviced by the client. See the section "Handling of CB_GETATTR"
+     for a full description of how the client and server are to
+     interact with the use of CB_GETATTR.
+
+     If the filehandle specified is not one for which the client holds
+     a write open delegation, an NFS4ERR_BADHANDLE error is returned.
+
+   IMPLEMENTATION
+
+     The client returns attrmask bits and the associated attribute
+     values only for the change attribute, and attributes that it may
+     change (time_modify, and size).
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 228]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   ERRORS
+
+     NFS4ERR_BADHANDLE
+     NFS4ERR_BADXDR
+     NFS4ERR_RESOURCE
+     NFS4ERR_SERVERFAULT
+
+15.2.2. Operation 4: CB_RECALL - Recall an Open Delegation
+
+   SYNOPSIS
+
+     stateid, truncate, fh -> ()
+
+   ARGUMENT
+
+     struct CB_RECALL4args {
+             stateid4        stateid;
+             bool            truncate;
+             nfs_fh4         fh;
+     };
+
+   RESULT
+
+     struct CB_RECALL4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+     The CB_RECALL operation is used to begin the process of recalling
+     an open delegation and returning it to the server.
+
+     The truncate flag is used to optimize recall for a file which is
+     about to be truncated to zero. When it is set, the client is
+     freed of the obligation to propagate modified data for the file to
+     the server, since this data is irrelevant.
+
+     If the handle specified is not one for which the client holds an
+     open delegation, an NFS4ERR_BADHANDLE error is returned.
+
+     If the stateid specified is not one corresponding to an open
+     delegation for the file specified by the filehandle, an
+     NFS4ERR_BAD_STATEID is returned.
+
+
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 229]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   IMPLEMENTATION
+
+     The client should reply to the callback immediately. Replying
+     does not complete the recall except when an error was returned.
+     The recall is not complete until the delegation is returned using
+     a DELEGRETURN.
+
+   ERRORS
+
+     NFS4ERR_BADHANDLE
+     NFS4ERR_BAD_STATEID
+     NFS4ERR_BADXDR
+     NFS4ERR_RESOURCE
+     NFS4ERR_SERVERFAULT
+
+15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback Operation
+
+   SYNOPSIS
+
+     -> ()
+
+   ARGUMENT
+
+     void;
+
+   RESULT
+
+     struct CB_ILLEGAL4res {
+             nfsstat4        status;
+     };
+
+   DESCRIPTION
+
+     This operation is a placeholder for encoding a result to handle
+     the case of the server sending an operation code within
+     CB_COMPOUND that is not supported. See the CB_COMPOUND procedure
+     description for more details.
+
+     The status field of CB_ILLEGAL4res MUST be set to
+     NFS4ERR_OP_ILLEGAL.
+
+   IMPLEMENTATION
+
+     A server will probably not send an operation with code
+     OP_CB_ILLEGAL but if it does, the response will be CB_ILLEGAL4res
+     just as it would be with any other invalid operation code. Note
+     that if the client
+
+
+
+
+
+Shepler, et al.             Standards Track                   [Page 230]
+15.2.3. Operation 10044: CB_ILLEGAL - Illegal Callback Operation
+
+   SYNOPSIS
+
+      -> ()
+
+   ARGUMENT
+
+      void;
+
+   RESULT
+
+      struct CB_ILLEGAL4res {
+          nfsstat4        status;
+      };
+
+   DESCRIPTION
+
+      This operation is a placeholder for encoding a result to handle
+      the case of the server sending an operation code within
+      CB_COMPOUND that is not supported.  See the COMPOUND procedure
+      description for more details.
+
+      The status field of CB_ILLEGAL4res MUST be set to
+      NFS4ERR_OP_ILLEGAL.
+
+   IMPLEMENTATION
+
+      A server will probably not send an operation with code
+      OP_CB_ILLEGAL, but if it does, the response will be
+      CB_ILLEGAL4res, just as it would be with any other invalid
+      operation code.  Note that if the client
+
+Shepler, et al.             Standards Track                  [Page 230]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+      gets an illegal operation code that is not OP_CB_ILLEGAL, and if
+      the client checks for legal operation codes during the XDR decode
+      phase, then the CB_ILLEGAL4res would not be returned.
+
+   ERRORS
+
+      NFS4ERR_OP_ILLEGAL
+
+16. Security Considerations
+
+   NFS has historically used a model where, from an authentication
+   perspective, the client was the entire machine, or at least the
+   source IP address of the machine.  The NFS server relied on the NFS
+   client to make the proper authentication of the end-user.  The NFS
+   server in turn shared its files only to specific clients, as
+   identified by the client's source IP address.  Given this model, the
+   AUTH_SYS RPC security flavor simply identified the end-user using
+   the client to the NFS server.  When processing NFS responses, the
+   client ensured that the responses came from the same IP address and
+   port number to which the request was sent.  While such a model is
+   easy to implement and simple to deploy and use, it is certainly not
+   a safe model.  Thus, NFSv4 mandates that implementations support a
+   security model that uses end to end authentication, where an
+   end-user on a client mutually authenticates (via cryptographic
+   schemes that do not expose passwords or keys in the clear on the
+   network) to a principal on an NFS server.  Consideration should also
+   be given to the integrity and privacy of NFS requests and responses.
+   The issues of end to end mutual authentication, integrity, and
+   privacy are discussed as part of the section on "RPC and Security
+   Flavor".
+
+   Note that while NFSv4 mandates an end to end mutual authentication
+   model, the "classic" model of machine authentication via IP address
+   checking and AUTH_SYS identification can still be supported, with
+   the caveat that the AUTH_SYS flavor is neither MANDATORY nor
+   RECOMMENDED by this specification, and so interoperability via
+   AUTH_SYS is not assured.
+
+   For reasons of reduced administration overhead, better performance,
+   and/or reduction of CPU utilization, users of NFS version 4
+   implementations may choose to not use security mechanisms that
+   enable integrity protection on each remote procedure call and
+   response.  The use of mechanisms without integrity leaves the user
+   vulnerable to an attacker in between the NFS client and server that
+   modifies the RPC request and/or the response.  While implementations
+   are free to provide the option to use weaker security mechanisms,
+   there are two operations in particular that warrant the
+   implementation overriding user choices.
+
+Shepler, et al.             Standards Track                  [Page 231]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The first such operation is SECINFO.  It is recommended that the
+   client issue the SECINFO call such that it is protected with a
+   security flavor that has integrity protection, such as RPCSEC_GSS
+   with a security triple that uses either the rpc_gss_svc_integrity or
+   the rpc_gss_svc_privacy service (rpc_gss_svc_privacy includes
+   integrity protection).  Without integrity protection encapsulating
+   SECINFO and therefore its results, an attacker in the middle could
+   modify the results such that the client might select a weaker
+   algorithm in the set allowed by the server, making the client and/or
+   server vulnerable to further attacks.
+
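+   As a non-normative illustration, the following C sketch shows one
+   way a client might rank the entries of a SECINFO result, preferring
+   triples that provide privacy or integrity.  The secinfo stand-in
+   type and the selection policy are illustrative assumptions, not part
+   of the protocol; the service values follow RFC 2203.
+
+      #include <stdio.h>
+
+      enum svc { SVC_NONE = 1, SVC_INTEGRITY = 2, SVC_PRIVACY = 3 };
+
+      struct secinfo {                 /* stand-in for secinfo4 */
+          int      is_rpcsec_gss;
+          enum svc service;
+      };
+
+      static int pick_flavor(const struct secinfo *list, int n)
+      {
+          int best = -1;
+          for (int i = 0; i < n; i++) {
+              if (!list[i].is_rpcsec_gss)
+                  continue;
+              /* larger enum value means stronger service here */
+              if (best < 0 || list[i].service > list[best].service)
+                  best = i;
+          }
+          return best;                 /* preferred index, or -1 */
+      }
+
+      int main(void)
+      {
+          struct secinfo res[] = {
+              { 1, SVC_NONE }, { 1, SVC_INTEGRITY }, { 1, SVC_PRIVACY }
+          };
+          printf("preferred entry: %d\n", pick_flavor(res, 3));
+          return 0;
+      }
+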
+   The second operation that should definitely use integrity protection
+   is any GETATTR for the fs_locations attribute.  The attack has two
+   steps.  First, the attacker modifies the unprotected results of some
+   operation to return NFS4ERR_MOVED.  Second, when the client follows
+   up with a GETATTR for the fs_locations attribute, the attacker
+   modifies the results to cause the client to migrate its traffic to a
+   server controlled by the attacker.
+
+   Because the operations SETCLIENTID/SETCLIENTID_CONFIRM are
+   responsible for the release of client state, it is imperative that
+   the principal used for these operations be checked against, and
+   match, the principal used for previous invocations of these
+   operations.  See the section "Client ID" for further discussion.
+
+17. IANA Considerations
+
+17.1. Named Attribute Definition
+
+   The NFS version 4 protocol provides for the association of named
+   attributes to files.  The name space identifiers for these
+   attributes are defined as string names.  The protocol does not
+   define the specific assignment of the name space for these file
+   attributes.  Even though the name space is not specifically
+   controlled to prevent collisions, an IANA registry has been created
+   for the registration of NFS version 4 named attributes.
+   Registration will be achieved through the publication of an
+   Informational RFC and will require not only the name of the
+   attribute but also the syntax and semantics of the named attribute
+   contents; the intent is to promote interoperability where common
+   interests exist.  While application developers are allowed to define
+   and use attributes as needed, they are encouraged to register the
+   attributes with IANA.
+
+17.2. ONC RPC Network Identifiers (netids)
+
+   The section "Structured Data Types" discussed the r_netid field and
+   the corresponding r_addr field of a clientaddr4 structure.  The NFS
+   version 4 protocol depends on the syntax and semantics of these
+
+Shepler, et al.             Standards Track                  [Page 232]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   fields to effectively communicate callback information between
+   client and server.  Therefore, an IANA registry has been created to
+   include the values defined in this document and to allow for future
+   expansion based on transport usage/availability.  Additions to this
+   ONC RPC Network Identifier registry must be done with the
+   publication of an RFC.
+
+   The initial values for this registry are as follows (some of this
+   text is replicated from section 2.2 for clarity):
+
+   The Network Identifier (or r_netid for short) is used to specify a
+   transport protocol and associated universal address (or r_addr for
+   short).  The syntax of the Network Identifier is a US-ASCII string.
+   The initial definitions for r_netid are:
+
+      "tcp"  - TCP over IP version 4
+
+      "udp"  - UDP over IP version 4
+
+      "tcp6" - TCP over IP version 6
+
+      "udp6" - UDP over IP version 6
+
+   Note: the '"' marks are used for delimiting the strings for this
+   document and are not part of the Network Identifier string.
+
+   For the "tcp" and "udp" Network Identifiers the Universal Address or
+   r_addr (for IPv4) is a US-ASCII string and is of the form:
+
+      h1.h2.h3.h4.p1.p2
+
+   The prefix, "h1.h2.h3.h4", is the standard textual form for
+   representing an IPv4 address, which is always four octets long.
+   Assuming big-endian ordering, h1, h2, h3, and h4 are, respectively,
+   the first through fourth octets, each converted to ASCII-decimal.
+   The suffix, "p1.p2", represents the two-octet service port;
+   assuming big-endian ordering, p1 and p2 are, respectively, the
+   first and second octets of the port, each converted to
+   ASCII-decimal.
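+
+   As a non-normative illustration, the following C sketch derives this
+   universal address form from an IPv4 address and port given in host
+   byte order (the function name uaddr4 and the buffer handling are
+   illustrative assumptions).  The worked example after the sketch
+   performs the same computation by hand.
+
+      #include <stdio.h>
+      #include <stdint.h>
+
+      static void uaddr4(uint32_t addr, uint16_t port,
+                         char *buf, size_t len)
+      {
+          /* split the address and port into their octets */
+          unsigned h1 = (addr >> 24) & 0xff, h2 = (addr >> 16) & 0xff;
+          unsigned h3 = (addr >>  8) & 0xff, h4 = addr & 0xff;
+          unsigned p1 = (port >>  8) & 0xff, p2 = port & 0xff;
+
+          snprintf(buf, len, "%u.%u.%u.%u.%u.%u",
+                   h1, h2, h3, h4, p1, p2);
+      }
+
+      int main(void)
+      {
+          char buf[32];
+
+          uaddr4(0x0A010307, 0x020F, buf, sizeof(buf));
+          printf("%s\n", buf);         /* prints 10.1.3.7.2.15 */
+          return 0;
+      }
+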
+   For example, if a host, in big-endian order, has an address of
+   0x0A010307 and there is a service listening on, in big-endian order,
+   port 0x020F (decimal 527), then the complete universal address is
+   "10.1.3.7.2.15".
+
+   For the "tcp6" and "udp6" Network Identifiers the Universal Address
+   or r_addr (for IPv6) is a US-ASCII string and is of the form:
+
+      x1:x2:x3:x4:x5:x6:x7:x8.p1.p2
+
+Shepler, et al.             Standards Track                  [Page 233]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+   The suffix "p1.p2" is the service port, and is computed the same way
+   as with universal addresses for "tcp" and "udp".  The prefix,
+   "x1:x2:x3:x4:x5:x6:x7:x8", is the standard textual form for
+   representing an IPv6 address as defined in Section 2.2 of [RFC2373].
+   Additionally, the two alternative forms specified in Section 2.2 of
+   [RFC2373] are also acceptable.
+
+   As mentioned, the registration of new Network Identifiers will
+   require the publication of an Informational RFC with similar detail
+   as listed above for the Network Identifier itself and the
+   corresponding Universal Address.
+
+18. RPC definition file
+
+   /*
+    * Copyright (C) The Internet Society (1998,1999,2000,2001,2002).
+    * All Rights Reserved.
+    */
+
+   /*
+    * nfs4_prot.x
+    *
+    */
+
+   %#pragma ident "%W%"
+
+   /*
+    * Basic typedefs for RFC 1832 data type definitions
+    */
+   typedef int             int32_t;
+   typedef unsigned int    uint32_t;
+   typedef hyper           int64_t;
+   typedef unsigned hyper  uint64_t;
+
+   /*
+    * Sizes
+    */
+   const NFS4_FHSIZE               = 128;
+   const NFS4_VERIFIER_SIZE        = 8;
+   const NFS4_OPAQUE_LIMIT         = 1024;
+
+   /*
+    * File types
+    */
+   enum nfs_ftype4 {
+           NF4REG       = 1,       /* Regular File */
+           NF4DIR       = 2,       /* Directory */
+           NF4BLK       = 3,       /* Special File - block device */
+
+Shepler, et al.             Standards Track                  [Page 234]
+
+RFC 3530                 NFS version 4 Protocol               April 2003
+
+
+           NF4CHR       = 4,       /* Special File - character device */
+           NF4LNK       = 5,       /* Symbolic Link */
+           NF4SOCK      = 6,       /* Special File - socket */
+           NF4FIFO      = 7,       /* Special File - fifo */
+           NF4ATTRDIR   = 8,       /* Attribute Directory */
+           NF4NAMEDATTR = 9        /* Named Attribute */
+   };
+
+   /*
+    * Error status
+    */
+   enum nfsstat4 {
+       NFS4_OK                = 0,    /* everything is okay      */
+       NFS4ERR_PERM           = 1,    /* caller not privileged   */
+       NFS4ERR_NOENT          = 2,    /* no such file/directory  */
+       NFS4ERR_IO             = 5,    /* hard I/O error          */
+       NFS4ERR_NXIO           = 6,    /* no such device          */
+       NFS4ERR_ACCESS         = 13,   /* access denied           */
+       NFS4ERR_EXIST          = 17,   /* file already exists     */
+       NFS4ERR_XDEV           = 18,   /* different filesystems   */
+       /* Unused/reserved        19 */
+       NFS4ERR_NOTDIR         = 20,   /* should be a directory   */
+       NFS4ERR_ISDIR          = 21,   /* should not be directory */
+       NFS4ERR_INVAL          = 22,   /* invalid argument        */
+       NFS4ERR_FBIG           = 27,   /* file exceeds server max */
+       NFS4ERR_NOSPC          = 28,   /* no space on filesystem  */
+       NFS4ERR_ROFS           = 30,   /* read-only filesystem    */
+       NFS4ERR_MLINK          = 31,   /* too many hard links     */
+       NFS4ERR_NAMETOOLONG    = 63,   /* name exceeds server max */
+       NFS4ERR_NOTEMPTY       = 66,   /* directory not empty     */
+       NFS4ERR_DQUOT          = 69,   /* hard quota limit reached*/
+       NFS4ERR_STALE          = 70,   /* file no longer exists   */
+       NFS4ERR_BADHANDLE      = 10001,/* Illegal filehandle      */
+       NFS4ERR_BAD_COOKIE     = 10003,/* READDIR cookie is stale */
+       NFS4ERR_NOTSUPP        = 10004,/* operation not supported */
+       NFS4ERR_TOOSMALL       = 10005,/* response limit exceeded */
+       NFS4ERR_SERVERFAULT    = 10006,/* undefined server error  */
+       NFS4ERR_BADTYPE        = 10007,/* type invalid for CREATE */
+       NFS4ERR_DELAY          = 10008,/* file "busy" - retry     */
+       NFS4ERR_SAME           = 10009,/* nverify says attrs same */
+       NFS4ERR_DENIED         = 10010,/* 
lock unavailable */ + NFS4ERR_EXPIRED = 10011,/* lock lease expired */ + NFS4ERR_LOCKED = 10012,/* I/O failed due to lock */ + NFS4ERR_GRACE = 10013,/* in grace period */ + NFS4ERR_FHEXPIRED = 10014,/* filehandle expired */ + NFS4ERR_SHARE_DENIED = 10015,/* share reserve denied */ + NFS4ERR_WRONGSEC = 10016,/* wrong security flavor */ + NFS4ERR_CLID_INUSE = 10017,/* clientid in use */ + + + +Shepler, et al. Standards Track [Page 235] + +RFC 3530 NFS version 4 Protocol April 2003 + + + NFS4ERR_RESOURCE = 10018,/* resource exhaustion */ + NFS4ERR_MOVED = 10019,/* filesystem relocated */ + NFS4ERR_NOFILEHANDLE = 10020,/* current FH is not set */ + NFS4ERR_MINOR_VERS_MISMATCH = 10021,/* minor vers not supp */ + NFS4ERR_STALE_CLIENTID = 10022,/* server has rebooted */ + NFS4ERR_STALE_STATEID = 10023,/* server has rebooted */ + NFS4ERR_OLD_STATEID = 10024,/* state is out of sync */ + NFS4ERR_BAD_STATEID = 10025,/* incorrect stateid */ + NFS4ERR_BAD_SEQID = 10026,/* request is out of seq. */ + NFS4ERR_NOT_SAME = 10027,/* verify - attrs not same */ + NFS4ERR_LOCK_RANGE = 10028,/* lock range not supported*/ + NFS4ERR_SYMLINK = 10029,/* should be file/directory*/ + NFS4ERR_RESTOREFH = 10030,/* no saved filehandle */ + NFS4ERR_LEASE_MOVED = 10031,/* some filesystem moved */ + NFS4ERR_ATTRNOTSUPP = 10032,/* recommended attr not sup*/ + NFS4ERR_NO_GRACE = 10033,/* reclaim outside of grace*/ + NFS4ERR_RECLAIM_BAD = 10034,/* reclaim error at server */ + NFS4ERR_RECLAIM_CONFLICT = 10035,/* conflict on reclaim */ + NFS4ERR_BADXDR = 10036,/* XDR decode failed */ + NFS4ERR_LOCKS_HELD = 10037,/* file locks held at CLOSE*/ + NFS4ERR_OPENMODE = 10038,/* conflict in OPEN and I/O*/ + NFS4ERR_BADOWNER = 10039,/* owner translation bad */ + NFS4ERR_BADCHAR = 10040,/* utf-8 char not supported*/ + NFS4ERR_BADNAME = 10041,/* name not supported */ + NFS4ERR_BAD_RANGE = 10042,/* lock range not supported*/ + NFS4ERR_LOCK_NOTSUPP = 10043,/* no atomic up/downgrade */ + NFS4ERR_OP_ILLEGAL = 10044,/* undefined operation */ + NFS4ERR_DEADLOCK = 10045,/* file locking deadlock */ + NFS4ERR_FILE_OPEN = 10046,/* open file blocks op. */ + NFS4ERR_ADMIN_REVOKED = 10047,/* lockowner state revoked */ + NFS4ERR_CB_PATH_DOWN = 10048 /* callback path down */ + }; + + /* + * Basic data types + */ + typedef uint32_t bitmap4<>; + typedef uint64_t offset4; + typedef uint32_t count4; + typedef uint64_t length4; + typedef uint64_t clientid4; + typedef uint32_t seqid4; + typedef opaque utf8string<>; + typedef utf8string utf8str_cis; + typedef utf8string utf8str_cs; + typedef utf8string utf8str_mixed; + typedef utf8str_cs component4; + typedef component4 pathname4<>; + + + +Shepler, et al. Standards Track [Page 236] + +RFC 3530 NFS version 4 Protocol April 2003 + + + typedef uint64_t nfs_lockid4; + typedef uint64_t nfs_cookie4; + typedef utf8str_cs linktext4; + typedef opaque sec_oid4<>; + typedef uint32_t qop4; + typedef uint32_t mode4; + typedef uint64_t changeid4; + typedef opaque verifier4[NFS4_VERIFIER_SIZE]; + + /* + * Timeval + */ + struct nfstime4 { + int64_t seconds; + uint32_t nseconds; + }; + + enum time_how4 { + SET_TO_SERVER_TIME4 = 0, + SET_TO_CLIENT_TIME4 = 1 + }; + + union settime4 switch (time_how4 set_it) { + case SET_TO_CLIENT_TIME4: + nfstime4 time; + default: + void; + }; + + /* + * File access handle + */ + typedef opaque nfs_fh4; + + + /* + * File attribute definitions + */ + + /* + * FSID structure for major/minor + */ + struct fsid4 { + uint64_t major; + uint64_t minor; + }; + + /* + + + +Shepler, et al. 
Standards Track [Page 237] + +RFC 3530 NFS version 4 Protocol April 2003 + + + * Filesystem locations attribute for relocation/migration + */ + struct fs_location4 { + utf8str_cis server<>; + pathname4 rootpath; + }; + + struct fs_locations4 { + pathname4 fs_root; + fs_location4 locations<>; + }; + + /* + * Various Access Control Entry definitions + */ + + /* + * Mask that indicates which Access Control Entries are supported. + * Values for the fattr4_aclsupport attribute. + */ + const ACL4_SUPPORT_ALLOW_ACL = 0x00000001; + const ACL4_SUPPORT_DENY_ACL = 0x00000002; + const ACL4_SUPPORT_AUDIT_ACL = 0x00000004; + const ACL4_SUPPORT_ALARM_ACL = 0x00000008; + + + typedef uint32_t acetype4; + /* + * acetype4 values, others can be added as needed. + */ + const ACE4_ACCESS_ALLOWED_ACE_TYPE = 0x00000000; + const ACE4_ACCESS_DENIED_ACE_TYPE = 0x00000001; + const ACE4_SYSTEM_AUDIT_ACE_TYPE = 0x00000002; + const ACE4_SYSTEM_ALARM_ACE_TYPE = 0x00000003; + + + /* + * ACE flag + */ + typedef uint32_t aceflag4; + + /* + * ACE flag values + */ + const ACE4_FILE_INHERIT_ACE = 0x00000001; + const ACE4_DIRECTORY_INHERIT_ACE = 0x00000002; + const ACE4_NO_PROPAGATE_INHERIT_ACE = 0x00000004; + const ACE4_INHERIT_ONLY_ACE = 0x00000008; + + + +Shepler, et al. Standards Track [Page 238] + +RFC 3530 NFS version 4 Protocol April 2003 + + + const ACE4_SUCCESSFUL_ACCESS_ACE_FLAG = 0x00000010; + const ACE4_FAILED_ACCESS_ACE_FLAG = 0x00000020; + const ACE4_IDENTIFIER_GROUP = 0x00000040; + + + /* + * ACE mask + */ + typedef uint32_t acemask4; + + /* + * ACE mask values + */ + const ACE4_READ_DATA = 0x00000001; + const ACE4_LIST_DIRECTORY = 0x00000001; + const ACE4_WRITE_DATA = 0x00000002; + const ACE4_ADD_FILE = 0x00000002; + const ACE4_APPEND_DATA = 0x00000004; + const ACE4_ADD_SUBDIRECTORY = 0x00000004; + const ACE4_READ_NAMED_ATTRS = 0x00000008; + const ACE4_WRITE_NAMED_ATTRS = 0x00000010; + const ACE4_EXECUTE = 0x00000020; + const ACE4_DELETE_CHILD = 0x00000040; + const ACE4_READ_ATTRIBUTES = 0x00000080; + const ACE4_WRITE_ATTRIBUTES = 0x00000100; + + const ACE4_DELETE = 0x00010000; + const ACE4_READ_ACL = 0x00020000; + const ACE4_WRITE_ACL = 0x00040000; + const ACE4_WRITE_OWNER = 0x00080000; + const ACE4_SYNCHRONIZE = 0x00100000; + + /* + * ACE4_GENERIC_READ -- defined as combination of + * ACE4_READ_ACL | + * ACE4_READ_DATA | + * ACE4_READ_ATTRIBUTES | + * ACE4_SYNCHRONIZE + */ + + const ACE4_GENERIC_READ = 0x00120081; + + /* + * ACE4_GENERIC_WRITE -- defined as combination of + * ACE4_READ_ACL | + * ACE4_WRITE_DATA | + * ACE4_WRITE_ATTRIBUTES | + * ACE4_WRITE_ACL | + + + +Shepler, et al. 
Standards Track [Page 239] + +RFC 3530 NFS version 4 Protocol April 2003 + + + * ACE4_APPEND_DATA | + * ACE4_SYNCHRONIZE + */ + const ACE4_GENERIC_WRITE = 0x00160106; + + + /* + * ACE4_GENERIC_EXECUTE -- defined as combination of + * ACE4_READ_ACL + * ACE4_READ_ATTRIBUTES + * ACE4_EXECUTE + * ACE4_SYNCHRONIZE + */ + const ACE4_GENERIC_EXECUTE = 0x001200A0; + + + /* + * Access Control Entry definition + */ + struct nfsace4 { + acetype4 type; + aceflag4 flag; + acemask4 access_mask; + utf8str_mixed who; + }; + + /* + * Field definitions for the fattr4_mode attribute + */ + const MODE4_SUID = 0x800; /* set user id on execution */ + const MODE4_SGID = 0x400; /* set group id on execution */ + const MODE4_SVTX = 0x200; /* save text even after use */ + const MODE4_RUSR = 0x100; /* read permission: owner */ + const MODE4_WUSR = 0x080; /* write permission: owner */ + const MODE4_XUSR = 0x040; /* execute permission: owner */ + const MODE4_RGRP = 0x020; /* read permission: group */ + const MODE4_WGRP = 0x010; /* write permission: group */ + const MODE4_XGRP = 0x008; /* execute permission: group */ + const MODE4_ROTH = 0x004; /* read permission: other */ + const MODE4_WOTH = 0x002; /* write permission: other */ + const MODE4_XOTH = 0x001; /* execute permission: other */ + + /* + * Special data/attribute associated with + * file types NF4BLK and NF4CHR. + */ + struct specdata4 { + uint32_t specdata1; /* major device number */ + + + +Shepler, et al. Standards Track [Page 240] + +RFC 3530 NFS version 4 Protocol April 2003 + + + uint32_t specdata2; /* minor device number */ + }; + + /* + * Values for fattr4_fh_expire_type + */ + const FH4_PERSISTENT = 0x00000000; + const FH4_NOEXPIRE_WITH_OPEN = 0x00000001; + const FH4_VOLATILE_ANY = 0x00000002; + const FH4_VOL_MIGRATION = 0x00000004; + const FH4_VOL_RENAME = 0x00000008; + + + typedef bitmap4 fattr4_supported_attrs; + typedef nfs_ftype4 fattr4_type; + typedef uint32_t fattr4_fh_expire_type; + typedef changeid4 fattr4_change; + typedef uint64_t fattr4_size; + typedef bool fattr4_link_support; + typedef bool fattr4_symlink_support; + typedef bool fattr4_named_attr; + typedef fsid4 fattr4_fsid; + typedef bool fattr4_unique_handles; + typedef uint32_t fattr4_lease_time; + typedef nfsstat4 fattr4_rdattr_error; + + typedef nfsace4 fattr4_acl<>; + typedef uint32_t fattr4_aclsupport; + typedef bool fattr4_archive; + typedef bool fattr4_cansettime; + typedef bool fattr4_case_insensitive; + typedef bool fattr4_case_preserving; + typedef bool fattr4_chown_restricted; + typedef uint64_t fattr4_fileid; + typedef uint64_t fattr4_files_avail; + typedef nfs_fh4 fattr4_filehandle; + typedef uint64_t fattr4_files_free; + typedef uint64_t fattr4_files_total; + typedef fs_locations4 fattr4_fs_locations; + typedef bool fattr4_hidden; + typedef bool fattr4_homogeneous; + typedef uint64_t fattr4_maxfilesize; + typedef uint32_t fattr4_maxlink; + typedef uint32_t fattr4_maxname; + typedef uint64_t fattr4_maxread; + typedef uint64_t fattr4_maxwrite; + typedef utf8str_cs fattr4_mimetype; + typedef mode4 fattr4_mode; + + + +Shepler, et al. 
Standards Track [Page 241] + +RFC 3530 NFS version 4 Protocol April 2003 + + + typedef uint64_t fattr4_mounted_on_fileid; + typedef bool fattr4_no_trunc; + typedef uint32_t fattr4_numlinks; + typedef utf8str_mixed fattr4_owner; + typedef utf8str_mixed fattr4_owner_group; + typedef uint64_t fattr4_quota_avail_hard; + typedef uint64_t fattr4_quota_avail_soft; + typedef uint64_t fattr4_quota_used; + typedef specdata4 fattr4_rawdev; + typedef uint64_t fattr4_space_avail; + typedef uint64_t fattr4_space_free; + typedef uint64_t fattr4_space_total; + typedef uint64_t fattr4_space_used; + typedef bool fattr4_system; + typedef nfstime4 fattr4_time_access; + typedef settime4 fattr4_time_access_set; + typedef nfstime4 fattr4_time_backup; + typedef nfstime4 fattr4_time_create; + typedef nfstime4 fattr4_time_delta; + typedef nfstime4 fattr4_time_metadata; + typedef nfstime4 fattr4_time_modify; + typedef settime4 fattr4_time_modify_set; + + + /* + * Mandatory Attributes + */ + const FATTR4_SUPPORTED_ATTRS = 0; + const FATTR4_TYPE = 1; + const FATTR4_FH_EXPIRE_TYPE = 2; + const FATTR4_CHANGE = 3; + const FATTR4_SIZE = 4; + const FATTR4_LINK_SUPPORT = 5; + const FATTR4_SYMLINK_SUPPORT = 6; + const FATTR4_NAMED_ATTR = 7; + const FATTR4_FSID = 8; + const FATTR4_UNIQUE_HANDLES = 9; + const FATTR4_LEASE_TIME = 10; + const FATTR4_RDATTR_ERROR = 11; + const FATTR4_FILEHANDLE = 19; + + /* + * Recommended Attributes + */ + const FATTR4_ACL = 12; + const FATTR4_ACLSUPPORT = 13; + const FATTR4_ARCHIVE = 14; + const FATTR4_CANSETTIME = 15; + + + +Shepler, et al. Standards Track [Page 242] + +RFC 3530 NFS version 4 Protocol April 2003 + + + const FATTR4_CASE_INSENSITIVE = 16; + const FATTR4_CASE_PRESERVING = 17; + const FATTR4_CHOWN_RESTRICTED = 18; + const FATTR4_FILEID = 20; + const FATTR4_FILES_AVAIL = 21; + const FATTR4_FILES_FREE = 22; + const FATTR4_FILES_TOTAL = 23; + const FATTR4_FS_LOCATIONS = 24; + const FATTR4_HIDDEN = 25; + const FATTR4_HOMOGENEOUS = 26; + const FATTR4_MAXFILESIZE = 27; + const FATTR4_MAXLINK = 28; + const FATTR4_MAXNAME = 29; + const FATTR4_MAXREAD = 30; + const FATTR4_MAXWRITE = 31; + const FATTR4_MIMETYPE = 32; + const FATTR4_MODE = 33; + const FATTR4_NO_TRUNC = 34; + const FATTR4_NUMLINKS = 35; + const FATTR4_OWNER = 36; + const FATTR4_OWNER_GROUP = 37; + const FATTR4_QUOTA_AVAIL_HARD = 38; + const FATTR4_QUOTA_AVAIL_SOFT = 39; + const FATTR4_QUOTA_USED = 40; + const FATTR4_RAWDEV = 41; + const FATTR4_SPACE_AVAIL = 42; + const FATTR4_SPACE_FREE = 43; + const FATTR4_SPACE_TOTAL = 44; + const FATTR4_SPACE_USED = 45; + const FATTR4_SYSTEM = 46; + const FATTR4_TIME_ACCESS = 47; + const FATTR4_TIME_ACCESS_SET = 48; + const FATTR4_TIME_BACKUP = 49; + const FATTR4_TIME_CREATE = 50; + const FATTR4_TIME_DELTA = 51; + const FATTR4_TIME_METADATA = 52; + const FATTR4_TIME_MODIFY = 53; + const FATTR4_TIME_MODIFY_SET = 54; + const FATTR4_MOUNTED_ON_FILEID = 55; + + typedef opaque attrlist4<>; + + /* + * File attribute container + */ + struct fattr4 { + bitmap4 attrmask; + attrlist4 attr_vals; + + + +Shepler, et al. 
Standards Track [Page 243] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + /* + * Change info for the client + */ + struct change_info4 { + bool atomic; + changeid4 before; + changeid4 after; + }; + + struct clientaddr4 { + /* see struct rpcb in RFC 1833 */ + string r_netid<>; /* network id */ + string r_addr<>; /* universal address */ + }; + + /* + * Callback program info as provided by the client + */ + struct cb_client4 { + uint32_t cb_program; + clientaddr4 cb_location; + }; + + /* + * Stateid + */ + struct stateid4 { + uint32_t seqid; + opaque other[12]; + }; + + /* + * Client ID + */ + struct nfs_client_id4 { + verifier4 verifier; + opaque id; + }; + + struct open_owner4 { + clientid4 clientid; + opaque owner; + }; + + struct lock_owner4 { + clientid4 clientid; + + + +Shepler, et al. Standards Track [Page 244] + +RFC 3530 NFS version 4 Protocol April 2003 + + + opaque owner; + }; + + enum nfs_lock_type4 { + READ_LT = 1, + WRITE_LT = 2, + READW_LT = 3, /* blocking read */ + WRITEW_LT = 4 /* blocking write */ + }; + + /* + * ACCESS: Check access permission + */ + const ACCESS4_READ = 0x00000001; + const ACCESS4_LOOKUP = 0x00000002; + const ACCESS4_MODIFY = 0x00000004; + const ACCESS4_EXTEND = 0x00000008; + const ACCESS4_DELETE = 0x00000010; + const ACCESS4_EXECUTE = 0x00000020; + + struct ACCESS4args { + /* CURRENT_FH: object */ + uint32_t access; + }; + + struct ACCESS4resok { + uint32_t supported; + uint32_t access; + }; + + union ACCESS4res switch (nfsstat4 status) { + case NFS4_OK: + ACCESS4resok resok4; + default: + void; + }; + + /* + * CLOSE: Close a file and release share reservations + */ + struct CLOSE4args { + /* CURRENT_FH: object */ + seqid4 seqid; + stateid4 open_stateid; + }; + + union CLOSE4res switch (nfsstat4 status) { + case NFS4_OK: + + + +Shepler, et al. Standards Track [Page 245] + +RFC 3530 NFS version 4 Protocol April 2003 + + + stateid4 open_stateid; + default: + void; + }; + + /* + * COMMIT: Commit cached data on server to stable storage + */ + struct COMMIT4args { + /* CURRENT_FH: file */ + offset4 offset; + count4 count; + }; + + struct COMMIT4resok { + verifier4 writeverf; + }; + + + union COMMIT4res switch (nfsstat4 status) { + case NFS4_OK: + COMMIT4resok resok4; + default: + void; + }; + + /* + * CREATE: Create a non-regular file + */ + union createtype4 switch (nfs_ftype4 type) { + case NF4LNK: + linktext4 linkdata; + case NF4BLK: + case NF4CHR: + specdata4 devdata; + case NF4SOCK: + case NF4FIFO: + case NF4DIR: + void; + default: + void; /* server should return NFS4ERR_BADTYPE */ + }; + + struct CREATE4args { + /* CURRENT_FH: directory for creation */ + createtype4 objtype; + component4 objname; + fattr4 createattrs; + + + +Shepler, et al. 
Standards Track [Page 246] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + struct CREATE4resok { + change_info4 cinfo; + bitmap4 attrset; /* attributes set */ + }; + + union CREATE4res switch (nfsstat4 status) { + case NFS4_OK: + CREATE4resok resok4; + default: + void; + }; + + /* + * DELEGPURGE: Purge Delegations Awaiting Recovery + */ + struct DELEGPURGE4args { + clientid4 clientid; + }; + + struct DELEGPURGE4res { + nfsstat4 status; + }; + + /* + * DELEGRETURN: Return a delegation + */ + struct DELEGRETURN4args { + /* CURRENT_FH: delegated file */ + stateid4 deleg_stateid; + }; + + struct DELEGRETURN4res { + nfsstat4 status; + }; + + /* + * GETATTR: Get file attributes + */ + struct GETATTR4args { + /* CURRENT_FH: directory or file */ + bitmap4 attr_request; + }; + + struct GETATTR4resok { + fattr4 obj_attributes; + }; + + + +Shepler, et al. Standards Track [Page 247] + +RFC 3530 NFS version 4 Protocol April 2003 + + + union GETATTR4res switch (nfsstat4 status) { + case NFS4_OK: + GETATTR4resok resok4; + default: + void; + }; + + /* + * GETFH: Get current filehandle + */ + struct GETFH4resok { + nfs_fh4 object; + }; + + union GETFH4res switch (nfsstat4 status) { + case NFS4_OK: + GETFH4resok resok4; + default: + void; + }; + + /* + * LINK: Create link to an object + */ + struct LINK4args { + /* SAVED_FH: source object */ + /* CURRENT_FH: target directory */ + component4 newname; + }; + + struct LINK4resok { + change_info4 cinfo; + }; + + union LINK4res switch (nfsstat4 status) { + case NFS4_OK: + LINK4resok resok4; + default: + void; + }; + + /* + * For LOCK, transition from open_owner to new lock_owner + */ + struct open_to_lock_owner4 { + seqid4 open_seqid; + stateid4 open_stateid; + seqid4 lock_seqid; + + + +Shepler, et al. Standards Track [Page 248] + +RFC 3530 NFS version 4 Protocol April 2003 + + + lock_owner4 lock_owner; + }; + + /* + * For LOCK, existing lock_owner continues to request file locks + */ + struct exist_lock_owner4 { + stateid4 lock_stateid; + seqid4 lock_seqid; + }; + + union locker4 switch (bool new_lock_owner) { + case TRUE: + open_to_lock_owner4 open_owner; + case FALSE: + exist_lock_owner4 lock_owner; + }; + + /* + * LOCK/LOCKT/LOCKU: Record lock management + */ + struct LOCK4args { + /* CURRENT_FH: file */ + nfs_lock_type4 locktype; + bool reclaim; + offset4 offset; + length4 length; + locker4 locker; + }; + + struct LOCK4denied { + offset4 offset; + length4 length; + nfs_lock_type4 locktype; + lock_owner4 owner; + }; + + struct LOCK4resok { + stateid4 lock_stateid; + }; + + union LOCK4res switch (nfsstat4 status) { + case NFS4_OK: + LOCK4resok resok4; + case NFS4ERR_DENIED: + LOCK4denied denied; + default: + void; + + + +Shepler, et al. 
Standards Track [Page 249] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + struct LOCKT4args { + /* CURRENT_FH: file */ + nfs_lock_type4 locktype; + offset4 offset; + length4 length; + lock_owner4 owner; + }; + + union LOCKT4res switch (nfsstat4 status) { + case NFS4ERR_DENIED: + LOCK4denied denied; + case NFS4_OK: + void; + default: + void; + }; + + struct LOCKU4args { + /* CURRENT_FH: file */ + nfs_lock_type4 locktype; + seqid4 seqid; + stateid4 lock_stateid; + offset4 offset; + length4 length; + }; + + union LOCKU4res switch (nfsstat4 status) { + case NFS4_OK: + stateid4 lock_stateid; + default: + void; + }; + + /* + * LOOKUP: Lookup filename + */ + struct LOOKUP4args { + /* CURRENT_FH: directory */ + component4 objname; + }; + + struct LOOKUP4res { + /* CURRENT_FH: object */ + nfsstat4 status; + }; + + + + +Shepler, et al. Standards Track [Page 250] + +RFC 3530 NFS version 4 Protocol April 2003 + + + /* + * LOOKUPP: Lookup parent directory + */ + struct LOOKUPP4res { + /* CURRENT_FH: directory */ + nfsstat4 status; + }; + + /* + * NVERIFY: Verify attributes different + */ + struct NVERIFY4args { + /* CURRENT_FH: object */ + fattr4 obj_attributes; + }; + + struct NVERIFY4res { + nfsstat4 status; + }; + + /* + * Various definitions for OPEN + */ + enum createmode4 { + UNCHECKED4 = 0, + GUARDED4 = 1, + EXCLUSIVE4 = 2 + }; + + union createhow4 switch (createmode4 mode) { + case UNCHECKED4: + case GUARDED4: + fattr4 createattrs; + case EXCLUSIVE4: + verifier4 createverf; + }; + + enum opentype4 { + OPEN4_NOCREATE = 0, + OPEN4_CREATE = 1 + }; + + union openflag4 switch (opentype4 opentype) { + case OPEN4_CREATE: + createhow4 how; + default: + void; + }; + + + +Shepler, et al. Standards Track [Page 251] + +RFC 3530 NFS version 4 Protocol April 2003 + + + /* Next definitions used for OPEN delegation */ + enum limit_by4 { + NFS_LIMIT_SIZE = 1, + NFS_LIMIT_BLOCKS = 2 + /* others as needed */ + }; + + struct nfs_modified_limit4 { + uint32_t num_blocks; + uint32_t bytes_per_block; + }; + + union nfs_space_limit4 switch (limit_by4 limitby) { + /* limit specified as file size */ + case NFS_LIMIT_SIZE: + uint64_t filesize; + /* limit specified by number of blocks */ + case NFS_LIMIT_BLOCKS: + nfs_modified_limit4 mod_blocks; + } ; + + /* + * Share Access and Deny constants for open argument + */ + const OPEN4_SHARE_ACCESS_READ = 0x00000001; + const OPEN4_SHARE_ACCESS_WRITE = 0x00000002; + const OPEN4_SHARE_ACCESS_BOTH = 0x00000003; + + const OPEN4_SHARE_DENY_NONE = 0x00000000; + const OPEN4_SHARE_DENY_READ = 0x00000001; + const OPEN4_SHARE_DENY_WRITE = 0x00000002; + const OPEN4_SHARE_DENY_BOTH = 0x00000003; + + enum open_delegation_type4 { + OPEN_DELEGATE_NONE = 0, + OPEN_DELEGATE_READ = 1, + OPEN_DELEGATE_WRITE = 2 + }; + + enum open_claim_type4 { + CLAIM_NULL = 0, + CLAIM_PREVIOUS = 1, + CLAIM_DELEGATE_CUR = 2, + CLAIM_DELEGATE_PREV = 3 + }; + + struct open_claim_delegate_cur4 { + stateid4 delegate_stateid; + + + +Shepler, et al. Standards Track [Page 252] + +RFC 3530 NFS version 4 Protocol April 2003 + + + component4 file; + }; + + union open_claim4 switch (open_claim_type4 claim) { + /* + * No special rights to file. Ordinary OPEN of the specified file. + */ + case CLAIM_NULL: + /* CURRENT_FH: directory */ + component4 file; + + /* + * Right to the file established by an open previous to server + * reboot. File identified by filehandle obtained at that time + * rather than by name. 
+ */ + case CLAIM_PREVIOUS: + /* CURRENT_FH: file being reclaimed */ + open_delegation_type4 delegate_type; + + /* + * Right to file based on a delegation granted by the server. + * File is specified by name. + */ + case CLAIM_DELEGATE_CUR: + /* CURRENT_FH: directory */ + open_claim_delegate_cur4 delegate_cur_info; + + /* Right to file based on a delegation granted to a previous boot + * instance of the client. File is specified by name. + */ + case CLAIM_DELEGATE_PREV: + /* CURRENT_FH: directory */ + component4 file_delegate_prev; + }; + + /* + * OPEN: Open a file, potentially receiving an open delegation + */ + struct OPEN4args { + seqid4 seqid; + uint32_t share_access; + uint32_t share_deny; + open_owner4 owner; + openflag4 openhow; + open_claim4 claim; + }; + + + + +Shepler, et al. Standards Track [Page 253] + +RFC 3530 NFS version 4 Protocol April 2003 + + + struct open_read_delegation4 { + stateid4 stateid; /* Stateid for delegation*/ + bool recall; /* Pre-recalled flag for + delegations obtained + by reclaim + (CLAIM_PREVIOUS) */ + nfsace4 permissions; /* Defines users who don't + need an ACCESS call to + open for read */ + }; + + struct open_write_delegation4 { + stateid4 stateid; /* Stateid for delegation */ + bool recall; /* Pre-recalled flag for + delegations obtained + by reclaim + (CLAIM_PREVIOUS) */ + nfs_space_limit4 space_limit; /* Defines condition that + the client must check to + determine whether the + file needs to be flushed + to the server on close. + */ + nfsace4 permissions; /* Defines users who don't + need an ACCESS call as + part of a delegated + open. */ + }; + + union open_delegation4 + switch (open_delegation_type4 delegation_type) { + case OPEN_DELEGATE_NONE: + void; + case OPEN_DELEGATE_READ: + open_read_delegation4 read; + case OPEN_DELEGATE_WRITE: + open_write_delegation4 write; + }; + /* + * Result flags + */ + /* Client must confirm open */ + const OPEN4_RESULT_CONFIRM = 0x00000002; + /* Type of file locking behavior at the server */ + const OPEN4_RESULT_LOCKTYPE_POSIX = 0x00000004; + + struct OPEN4resok { + stateid4 stateid; /* Stateid for open */ + + + +Shepler, et al. Standards Track [Page 254] + +RFC 3530 NFS version 4 Protocol April 2003 + + + change_info4 cinfo; /* Directory Change Info */ + uint32_t rflags; /* Result flags */ + bitmap4 attrset; /* attribute set for create*/ + open_delegation4 delegation; /* Info on any open + delegation */ + }; + + union OPEN4res switch (nfsstat4 status) { + case NFS4_OK: + /* CURRENT_FH: opened file */ + OPEN4resok resok4; + default: + void; + }; + + /* + * OPENATTR: open named attributes directory + */ + struct OPENATTR4args { + /* CURRENT_FH: object */ + bool createdir; + }; + + struct OPENATTR4res { + /* CURRENT_FH: named attr directory */ + nfsstat4 status; + }; + + /* + * OPEN_CONFIRM: confirm the open + */ + struct OPEN_CONFIRM4args { + /* CURRENT_FH: opened file */ + stateid4 open_stateid; + seqid4 seqid; + }; + + struct OPEN_CONFIRM4resok { + stateid4 open_stateid; + }; + + union OPEN_CONFIRM4res switch (nfsstat4 status) { + case NFS4_OK: + OPEN_CONFIRM4resok resok4; + default: + void; + }; + + + + +Shepler, et al. 
Standards Track [Page 255] + +RFC 3530 NFS version 4 Protocol April 2003 + + + /* + * OPEN_DOWNGRADE: downgrade the access/deny for a file + */ + struct OPEN_DOWNGRADE4args { + /* CURRENT_FH: opened file */ + stateid4 open_stateid; + seqid4 seqid; + uint32_t share_access; + uint32_t share_deny; + }; + + struct OPEN_DOWNGRADE4resok { + stateid4 open_stateid; + }; + + union OPEN_DOWNGRADE4res switch(nfsstat4 status) { + case NFS4_OK: + OPEN_DOWNGRADE4resok resok4; + default: + void; + }; + + /* + * PUTFH: Set current filehandle + */ + struct PUTFH4args { + nfs_fh4 object; + }; + + struct PUTFH4res { + /* CURRENT_FH: */ + nfsstat4 status; + }; + + /* + * PUTPUBFH: Set public filehandle + */ + struct PUTPUBFH4res { + /* CURRENT_FH: public fh */ + nfsstat4 status; + }; + + /* + * PUTROOTFH: Set root filehandle + */ + struct PUTROOTFH4res { + + /* CURRENT_FH: root fh */ + + + +Shepler, et al. Standards Track [Page 256] + +RFC 3530 NFS version 4 Protocol April 2003 + + + nfsstat4 status; + }; + + /* + * READ: Read from file + */ + struct READ4args { + /* CURRENT_FH: file */ + stateid4 stateid; + offset4 offset; + count4 count; + }; + + struct READ4resok { + bool eof; + opaque data<>; + }; + + union READ4res switch (nfsstat4 status) { + case NFS4_OK: + READ4resok resok4; + default: + void; + }; + + /* + * READDIR: Read directory + */ + struct READDIR4args { + /* CURRENT_FH: directory */ + nfs_cookie4 cookie; + verifier4 cookieverf; + count4 dircount; + count4 maxcount; + bitmap4 attr_request; + }; + + struct entry4 { + nfs_cookie4 cookie; + component4 name; + fattr4 attrs; + entry4 *nextentry; + }; + + struct dirlist4 { + entry4 *entries; + bool eof; + }; + + + +Shepler, et al. Standards Track [Page 257] + +RFC 3530 NFS version 4 Protocol April 2003 + + + struct READDIR4resok { + verifier4 cookieverf; + dirlist4 reply; + }; + + + union READDIR4res switch (nfsstat4 status) { + case NFS4_OK: + READDIR4resok resok4; + default: + void; + }; + + + /* + * READLINK: Read symbolic link + */ + struct READLINK4resok { + linktext4 link; + }; + + union READLINK4res switch (nfsstat4 status) { + case NFS4_OK: + READLINK4resok resok4; + default: + void; + }; + + /* + * REMOVE: Remove filesystem object + */ + struct REMOVE4args { + /* CURRENT_FH: directory */ + component4 target; + }; + + struct REMOVE4resok { + change_info4 cinfo; + }; + + union REMOVE4res switch (nfsstat4 status) { + case NFS4_OK: + REMOVE4resok resok4; + default: + void; + }; + + /* + + + +Shepler, et al. Standards Track [Page 258] + +RFC 3530 NFS version 4 Protocol April 2003 + + + * RENAME: Rename directory entry + */ + struct RENAME4args { + /* SAVED_FH: source directory */ + component4 oldname; + /* CURRENT_FH: target directory */ + + component4 newname; + }; + + struct RENAME4resok { + change_info4 source_cinfo; + change_info4 target_cinfo; + }; + + union RENAME4res switch (nfsstat4 status) { + case NFS4_OK: + RENAME4resok resok4; + default: + void; + }; + + /* + * RENEW: Renew a Lease + */ + struct RENEW4args { + clientid4 clientid; + }; + + struct RENEW4res { + nfsstat4 status; + }; + + /* + * RESTOREFH: Restore saved filehandle + */ + + struct RESTOREFH4res { + /* CURRENT_FH: value of saved fh */ + nfsstat4 status; + }; + + /* + * SAVEFH: Save current filehandle + */ + struct SAVEFH4res { + /* SAVED_FH: value of current fh */ + nfsstat4 status; + + + +Shepler, et al. 
Standards Track [Page 259] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + /* + * SECINFO: Obtain Available Security Mechanisms + */ + struct SECINFO4args { + /* CURRENT_FH: directory */ + component4 name; + }; + + /* + + * From RFC 2203 + */ + enum rpc_gss_svc_t { + RPC_GSS_SVC_NONE = 1, + RPC_GSS_SVC_INTEGRITY = 2, + RPC_GSS_SVC_PRIVACY = 3 + }; + + struct rpcsec_gss_info { + sec_oid4 oid; + qop4 qop; + rpc_gss_svc_t service; + }; + + /* RPCSEC_GSS has a value of '6' - See RFC 2203 */ + union secinfo4 switch (uint32_t flavor) { + case RPCSEC_GSS: + rpcsec_gss_info flavor_info; + default: + void; + }; + + typedef secinfo4 SECINFO4resok<>; + + union SECINFO4res switch (nfsstat4 status) { + case NFS4_OK: + SECINFO4resok resok4; + default: + void; + }; + + /* + * SETATTR: Set attributes + */ + struct SETATTR4args { + /* CURRENT_FH: target object */ + + + +Shepler, et al. Standards Track [Page 260] + +RFC 3530 NFS version 4 Protocol April 2003 + + + stateid4 stateid; + fattr4 obj_attributes; + }; + + struct SETATTR4res { + nfsstat4 status; + bitmap4 attrsset; + }; + + /* + * SETCLIENTID + */ + struct SETCLIENTID4args { + nfs_client_id4 client; + cb_client4 callback; + uint32_t callback_ident; + + }; + + struct SETCLIENTID4resok { + clientid4 clientid; + verifier4 setclientid_confirm; + }; + + union SETCLIENTID4res switch (nfsstat4 status) { + case NFS4_OK: + SETCLIENTID4resok resok4; + case NFS4ERR_CLID_INUSE: + clientaddr4 client_using; + default: + void; + }; + + struct SETCLIENTID_CONFIRM4args { + clientid4 clientid; + verifier4 setclientid_confirm; + }; + + struct SETCLIENTID_CONFIRM4res { + nfsstat4 status; + }; + + /* + * VERIFY: Verify attributes same + */ + struct VERIFY4args { + /* CURRENT_FH: object */ + fattr4 obj_attributes; + + + +Shepler, et al. Standards Track [Page 261] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + struct VERIFY4res { + nfsstat4 status; + }; + + /* + * WRITE: Write to file + */ + enum stable_how4 { + UNSTABLE4 = 0, + DATA_SYNC4 = 1, + FILE_SYNC4 = 2 + }; + + struct WRITE4args { + /* CURRENT_FH: file */ + stateid4 stateid; + offset4 offset; + stable_how4 stable; + opaque data<>; + }; + + struct WRITE4resok { + count4 count; + stable_how4 committed; + verifier4 writeverf; + }; + + union WRITE4res switch (nfsstat4 status) { + case NFS4_OK: + WRITE4resok resok4; + default: + void; + }; + + /* + * RELEASE_LOCKOWNER: Notify server to release lockowner + */ + struct RELEASE_LOCKOWNER4args { + lock_owner4 lock_owner; + }; + + struct RELEASE_LOCKOWNER4res { + nfsstat4 status; + }; + + /* + + + +Shepler, et al. 
Standards Track [Page 262] + +RFC 3530 NFS version 4 Protocol April 2003 + + + * ILLEGAL: Response for illegal operation numbers + */ + struct ILLEGAL4res { + nfsstat4 status; + }; + + /* + * Operation arrays + */ + + enum nfs_opnum4 { + OP_ACCESS = 3, + OP_CLOSE = 4, + OP_COMMIT = 5, + OP_CREATE = 6, + OP_DELEGPURGE = 7, + OP_DELEGRETURN = 8, + OP_GETATTR = 9, + OP_GETFH = 10, + OP_LINK = 11, + OP_LOCK = 12, + OP_LOCKT = 13, + OP_LOCKU = 14, + OP_LOOKUP = 15, + OP_LOOKUPP = 16, + OP_NVERIFY = 17, + OP_OPEN = 18, + OP_OPENATTR = 19, + OP_OPEN_CONFIRM = 20, + OP_OPEN_DOWNGRADE = 21, + OP_PUTFH = 22, + OP_PUTPUBFH = 23, + OP_PUTROOTFH = 24, + OP_READ = 25, + OP_READDIR = 26, + OP_READLINK = 27, + OP_REMOVE = 28, + OP_RENAME = 29, + OP_RENEW = 30, + OP_RESTOREFH = 31, + OP_SAVEFH = 32, + OP_SECINFO = 33, + OP_SETATTR = 34, + OP_SETCLIENTID = 35, + OP_SETCLIENTID_CONFIRM = 36, + OP_VERIFY = 37, + OP_WRITE = 38, + OP_RELEASE_LOCKOWNER = 39, + + + +Shepler, et al. Standards Track [Page 263] + +RFC 3530 NFS version 4 Protocol April 2003 + + + OP_ILLEGAL = 10044 + }; + + union nfs_argop4 switch (nfs_opnum4 argop) { + case OP_ACCESS: ACCESS4args opaccess; + case OP_CLOSE: CLOSE4args opclose; + case OP_COMMIT: COMMIT4args opcommit; + case OP_CREATE: CREATE4args opcreate; + case OP_DELEGPURGE: DELEGPURGE4args opdelegpurge; + case OP_DELEGRETURN: DELEGRETURN4args opdelegreturn; + case OP_GETATTR: GETATTR4args opgetattr; + case OP_GETFH: void; + case OP_LINK: LINK4args oplink; + case OP_LOCK: LOCK4args oplock; + case OP_LOCKT: LOCKT4args oplockt; + case OP_LOCKU: LOCKU4args oplocku; + case OP_LOOKUP: LOOKUP4args oplookup; + case OP_LOOKUPP: void; + case OP_NVERIFY: NVERIFY4args opnverify; + case OP_OPEN: OPEN4args opopen; + case OP_OPENATTR: OPENATTR4args opopenattr; + case OP_OPEN_CONFIRM: OPEN_CONFIRM4args opopen_confirm; + case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4args opopen_downgrade; + case OP_PUTFH: PUTFH4args opputfh; + case OP_PUTPUBFH: void; + case OP_PUTROOTFH: void; + case OP_READ: READ4args opread; + case OP_READDIR: READDIR4args opreaddir; + case OP_READLINK: void; + case OP_REMOVE: REMOVE4args opremove; + case OP_RENAME: RENAME4args oprename; + case OP_RENEW: RENEW4args oprenew; + case OP_RESTOREFH: void; + case OP_SAVEFH: void; + case OP_SECINFO: SECINFO4args opsecinfo; + case OP_SETATTR: SETATTR4args opsetattr; + case OP_SETCLIENTID: SETCLIENTID4args opsetclientid; + case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4args + opsetclientid_confirm; + case OP_VERIFY: VERIFY4args opverify; + case OP_WRITE: WRITE4args opwrite; + case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4args + oprelease_lockowner; + case OP_ILLEGAL: void; + }; + + union nfs_resop4 switch (nfs_opnum4 resop){ + case OP_ACCESS: ACCESS4res opaccess; + + + +Shepler, et al. 
Standards Track [Page 264] + +RFC 3530 NFS version 4 Protocol April 2003 + + + case OP_CLOSE: CLOSE4res opclose; + case OP_COMMIT: COMMIT4res opcommit; + case OP_CREATE: CREATE4res opcreate; + case OP_DELEGPURGE: DELEGPURGE4res opdelegpurge; + case OP_DELEGRETURN: DELEGRETURN4res opdelegreturn; + case OP_GETATTR: GETATTR4res opgetattr; + case OP_GETFH: GETFH4res opgetfh; + case OP_LINK: LINK4res oplink; + case OP_LOCK: LOCK4res oplock; + case OP_LOCKT: LOCKT4res oplockt; + case OP_LOCKU: LOCKU4res oplocku; + case OP_LOOKUP: LOOKUP4res oplookup; + case OP_LOOKUPP: LOOKUPP4res oplookupp; + case OP_NVERIFY: NVERIFY4res opnverify; + case OP_OPEN: OPEN4res opopen; + case OP_OPENATTR: OPENATTR4res opopenattr; + case OP_OPEN_CONFIRM: OPEN_CONFIRM4res opopen_confirm; + case OP_OPEN_DOWNGRADE: OPEN_DOWNGRADE4res opopen_downgrade; + case OP_PUTFH: PUTFH4res opputfh; + case OP_PUTPUBFH: PUTPUBFH4res opputpubfh; + case OP_PUTROOTFH: PUTROOTFH4res opputrootfh; + case OP_READ: READ4res opread; + case OP_READDIR: READDIR4res opreaddir; + case OP_READLINK: READLINK4res opreadlink; + case OP_REMOVE: REMOVE4res opremove; + case OP_RENAME: RENAME4res oprename; + case OP_RENEW: RENEW4res oprenew; + case OP_RESTOREFH: RESTOREFH4res oprestorefh; + case OP_SAVEFH: SAVEFH4res opsavefh; + case OP_SECINFO: SECINFO4res opsecinfo; + case OP_SETATTR: SETATTR4res opsetattr; + case OP_SETCLIENTID: SETCLIENTID4res opsetclientid; + case OP_SETCLIENTID_CONFIRM: SETCLIENTID_CONFIRM4res + opsetclientid_confirm; + case OP_VERIFY: VERIFY4res opverify; + case OP_WRITE: WRITE4res opwrite; + case OP_RELEASE_LOCKOWNER: RELEASE_LOCKOWNER4res + oprelease_lockowner; + case OP_ILLEGAL: ILLEGAL4res opillegal; + }; + + struct COMPOUND4args { + utf8str_cs tag; + uint32_t minorversion; + nfs_argop4 argarray<>; + }; + + struct COMPOUND4res { + + + +Shepler, et al. Standards Track [Page 265] + +RFC 3530 NFS version 4 Protocol April 2003 + + + nfsstat4 status; + utf8str_cs tag; + nfs_resop4 resarray<>; + }; + + /* + * Remote file service routines + */ + program NFS4_PROGRAM { + version NFS_V4 { + void + NFSPROC4_NULL(void) = 0; + + COMPOUND4res + NFSPROC4_COMPOUND(COMPOUND4args) = 1; + + } = 4; + } = 100003; + + + + /* + * NFS4 Callback Procedure Definitions and Program + */ + + /* + * CB_GETATTR: Get Current Attributes + */ + struct CB_GETATTR4args { + nfs_fh4 fh; + bitmap4 attr_request; + }; + + struct CB_GETATTR4resok { + fattr4 obj_attributes; + }; + + union CB_GETATTR4res switch (nfsstat4 status) { + case NFS4_OK: + CB_GETATTR4resok resok4; + default: + void; + }; + + /* + * CB_RECALL: Recall an Open Delegation + */ + struct CB_RECALL4args { + + + +Shepler, et al. 
Standards Track [Page 266] + +RFC 3530 NFS version 4 Protocol April 2003 + + + stateid4 stateid; + bool truncate; + nfs_fh4 fh; + }; + + struct CB_RECALL4res { + nfsstat4 status; + }; + + /* + * CB_ILLEGAL: Response for illegal operation numbers + */ + struct CB_ILLEGAL4res { + nfsstat4 status; + }; + + /* + * Various definitions for CB_COMPOUND + */ + enum nfs_cb_opnum4 { + OP_CB_GETATTR = 3, + OP_CB_RECALL = 4, + OP_CB_ILLEGAL = 10044 + }; + + union nfs_cb_argop4 switch (unsigned argop) { + case OP_CB_GETATTR: CB_GETATTR4args opcbgetattr; + case OP_CB_RECALL: CB_RECALL4args opcbrecall; + case OP_CB_ILLEGAL: void; + }; + + union nfs_cb_resop4 switch (unsigned resop){ + case OP_CB_GETATTR: CB_GETATTR4res opcbgetattr; + case OP_CB_RECALL: CB_RECALL4res opcbrecall; + case OP_CB_ILLEGAL: CB_ILLEGAL4res opcbillegal; + }; + + struct CB_COMPOUND4args { + utf8str_cs tag; + uint32_t minorversion; + uint32_t callback_ident; + nfs_cb_argop4 argarray<>; + }; + + struct CB_COMPOUND4res { + nfsstat4 status; + utf8str_cs tag; + nfs_cb_resop4 resarray<>; + + + +Shepler, et al. Standards Track [Page 267] + +RFC 3530 NFS version 4 Protocol April 2003 + + + }; + + + /* + * Program number is in the transient range since the client + * will assign the exact transient program number and provide + * that to the server via the SETCLIENTID operation. + */ + program NFS4_CALLBACK { + version NFS_CB { + void + CB_NULL(void) = 0; + CB_COMPOUND4res + CB_COMPOUND(CB_COMPOUND4args) = 1; + } = 1; + } = 0x40000000; + +19. Acknowledgements + + The authors thank and acknowledge: + + Neil Brown for his extensive review and comments of various + documents. Rick Macklem at the University of Guelph, Mike Frisch, + Sergey Klyushin, and Dan Trufasiu of Hummingbird Ltd., and Andy + Adamson, Bruce Fields, Jim Rees, and Kendrick Smith from the CITI + organization at the University of Michigan, for their implementation + efforts and feedback on the protocol specification. Mike Kupfer for + his review of the file locking and ACL mechanisms. Alan Yoder for + his input to ACL mechanisms. Peter Astrand for his close review of + the protocol specification. Ran Atkinson for his constant reminder + that users do matter. + +20. Normative References + + [ISO10646] "ISO/IEC 10646-1:1993. International + Standard -- Information technology -- + Universal Multiple-Octet Coded Character + Set (UCS) -- Part 1: Architecture and Basic + Multilingual Plane." + + [RFC793] Postel, J., "Transmission Control + Protocol", STD 7, RFC 793, September 1981. + + [RFC1831] Srinivasan, R., "RPC: Remote Procedure Call + Protocol Specification Version 2", RFC + 1831, August 1995. + + + + + +Shepler, et al. Standards Track [Page 268] + +RFC 3530 NFS version 4 Protocol April 2003 + + + [RFC1832] Srinivasan, R., "XDR: External Data + Representation Standard", RFC 1832, August + 1995. + + [RFC2373] Hinden, R. and S. Deering, "IP Version 6 + Addressing Architecture", RFC 2373, July + 1998. + + [RFC1964] Linn, J., "The Kerberos Version 5 GSS-API + Mechanism", RFC 1964, June 1996. + + [RFC2025] Adams, C., "The Simple Public-Key GSS-API + Mechanism (SPKM)", RFC 2025, October 1996. + + [RFC2119] Bradner, S., "Key words for use in RFCs to + Indicate Requirement Levels", BCP 14, RFC + 2119, March 1997. + + [RFC2203] Eisler, M., Chiu, A. and L. Ling, + "RPCSEC_GSS Protocol Specification", RFC + 2203, September 1997. + + [RFC2277] Alvestrand, H., "IETF Policy on Character + Sets and Languages", BCP 19, RFC 2277, + January 1998. 
+ + [RFC2279] Yergeau, F., "UTF-8, a transformation + format of ISO 10646", RFC 2279, January + 1998. + + [RFC2623] Eisler, M., "NFS Version 2 and Version 3 + Security Issues and the NFS Protocol's Use + of RPCSEC_GSS and Kerberos V5", RFC 2623, + June 1999. + + [RFC2743] Linn, J., "Generic Security Service + Application Program Interface, Version 2, + Update 1", RFC 2743, January 2000. + + [RFC2847] Eisler, M., "LIPKEY - A Low Infrastructure + Public Key Mechanism Using SPKM", RFC 2847, + June 2000. + + [RFC3010] Shepler, S., Callaghan, B., Robinson, D., + Thurlow, R., Beame, C., Eisler, M. and D. + Noveck, "NFS version 4 Protocol", RFC 3010, + December 2000. + + + + +Shepler, et al. Standards Track [Page 269] + +RFC 3530 NFS version 4 Protocol April 2003 + + + [RFC3454] Hoffman, P. and P. Blanchet, "Preparation + of Internationalized Strings + ("stringprep")", RFC 3454, December 2002. + + [Unicode1] The Unicode Consortium, "The Unicode + Standard, Version 3.0", Addison-Wesley + Developers Press, Reading, MA, 2000. ISBN + 0-201-61633-5. + + More information available at: + http://www.unicode.org/ + + [Unicode2] "Unsupported Scripts" Unicode, Inc., The + Unicode Consortium, P.O. Box 700519, San + Jose, CA 95710-0519 USA, September 1999. + http://www.unicode.org/unicode/standard/ + unsupported.html + +21. Informative References + + [Floyd] S. Floyd, V. Jacobson, "The Synchronization + of Periodic Routing Messages," IEEE/ACM + Transactions on Networking, 2(2), pp. 122- + 136, April 1994. + + [Gray] C. Gray, D. Cheriton, "Leases: An Efficient + Fault-Tolerant Mechanism for Distributed + File Cache Consistency," Proceedings of the + Twelfth Symposium on Operating Systems + Principles, p. 202-210, December 1989. + + [Juszczak] Juszczak, Chet, "Improving the Performance + and Correctness of an NFS Server," USENIX + Conference Proceedings, USENIX Association, + Berkeley, CA, June 1990, pages 53-63. + Describes reply cache implementation that + avoids work in the server by handling + duplicate requests. More important, though + listed as a side-effect, the reply cache + aids in the avoidance of destructive non- + idempotent operation re-application -- + improving correctness. + + + + + + + + + +Shepler, et al. Standards Track [Page 270] + +RFC 3530 NFS version 4 Protocol April 2003 + + + [Kazar] Kazar, Michael Leon, "Synchronization and + Caching Issues in the Andrew File System," + USENIX Conference Proceedings, USENIX + Association, Berkeley, CA, Dallas Winter + 1988, pages 27-36. A description of the + cache consistency scheme in AFS. + Contrasted with other distributed file + systems. + + [Macklem] Macklem, Rick, "Lessons Learned Tuning the + 4.3BSD Reno Implementation of the NFS + Protocol," Winter USENIX Conference + Proceedings, USENIX Association, Berkeley, + CA, January 1991. Describes performance + work in tuning the 4.3BSD Reno NFS + implementation. Describes performance + improvement (reduced CPU loading) through + elimination of data copies. + + [Mogul] Mogul, Jeffrey C., "A Recovery Protocol for + Spritely NFS," USENIX File System Workshop + Proceedings, Ann Arbor, MI, USENIX + Association, Berkeley, CA, May 1992. + Second paper on Spritely NFS proposes a + lease-based scheme for recovering state of + consistency protocol. + + [Nowicki] Nowicki, Bill, "Transport Issues in the + Network File System," ACM SIGCOMM + newsletter Computer Communication Review, + April 1989. A brief description of the + basis for the dynamic retransmission work. 
+ + [Pawlowski] Pawlowski, Brian, Ron Hixon, Mark Stein, + Joseph Tumminaro, "Network Computing in the + UNIX and IBM Mainframe Environment," + Uniforum `89 Conf. Proc., (1989) + Description of an NFS server implementation + for IBM's MVS operating system. + + [RFC1094] Sun Microsystems, Inc., "NFS: Network File + System Protocol Specification", RFC 1094, + March 1989. + + [RFC1345] Simonsen, K., "Character Mnemonics & + Character Sets", RFC 1345, June 1992. + + + + + +Shepler, et al. Standards Track [Page 271] + +RFC 3530 NFS version 4 Protocol April 2003 + + + [RFC1813] Callaghan, B., Pawlowski, B. and P. + Staubach, "NFS Version 3 Protocol + Specification", RFC 1813, June 1995. + + [RFC3232] Reynolds, J., Editor, "Assigned Numbers: + RFC 1700 is Replaced by an On-line + Database", RFC 3232, January 2002. + + [RFC1833] Srinivasan, R., "Binding Protocols for ONC + RPC Version 2", RFC 1833, August 1995. + + [RFC2054] Callaghan, B., "WebNFS Client + Specification", RFC 2054, October 1996. + + [RFC2055] Callaghan, B., "WebNFS Server + Specification", RFC 2055, October 1996. + + [RFC2152] Goldsmith, D. and M. Davis, "UTF-7 A Mail- + Safe Transformation Format of Unicode", RFC + 2152, May 1997. + + [RFC2224] Callaghan, B., "NFS URL Scheme", RFC 2224, + October 1997. + + [RFC2624] Shepler, S., "NFS Version 4 Design + Considerations", RFC 2624, June 1999. + + [RFC2755] Chiu, A., Eisler, M. and B. Callaghan, + "Security Negotiation for WebNFS" , RFC + 2755, June 2000. + + [Sandberg] Sandberg, R., D. Goldberg, S. Kleiman, D. + Walsh, B. Lyon, "Design and Implementation + of the Sun Network Filesystem," USENIX + Conference Proceedings, USENIX Association, + Berkeley, CA, Summer 1985. The basic paper + describing the SunOS implementation of the + NFS version 2 protocol, and discusses the + goals, protocol specification and trade- + offs. + + + + + + + + + + + +Shepler, et al. Standards Track [Page 272] + +RFC 3530 NFS version 4 Protocol April 2003 + + + [Srinivasan] Srinivasan, V., Jeffrey C. Mogul, "Spritely + NFS: Implementation and Performance of + Cache Consistency Protocols", WRL Research + Report 89/5, Digital Equipment Corporation + Western Research Laboratory, 100 Hamilton + Ave., Palo Alto, CA, 94301, May 1989. This + paper analyzes the effect of applying a + Sprite-like consistency protocol applied to + standard NFS. The issues of recovery in a + stateful environment are covered in + [Mogul]. + + [XNFS] The Open Group, Protocols for Interworking: + XNFS, Version 3W, The Open Group, 1010 El + Camino Real Suite 380, Menlo Park, CA + 94025, ISBN 1-85912-184-5, February 1998. + + HTML version available: + http://www.opengroup.org + +22. Authors' Information + +22.1. Editor's Address + + Spencer Shepler + Sun Microsystems, Inc. + 7808 Moonflower Drive + Austin, Texas 78750 + + Phone: +1 512-349-9376 + EMail: spencer.shepler@sun.com + + + + + + + + + + + + + + + + + + + + +Shepler, et al. Standards Track [Page 273] + +RFC 3530 NFS version 4 Protocol April 2003 + + +22.2. Authors' Addresses + + Carl Beame + Hummingbird Ltd. + + EMail: beame@bws.com + + Brent Callaghan + Sun Microsystems, Inc. + 17 Network Circle + Menlo Park, CA 94025 + + Phone: +1 650-786-5067 + EMail: brent.callaghan@sun.com + + Mike Eisler + 5765 Chase Point Circle + Colorado Springs, CO 80919 + + Phone: +1 719-599-9026 + EMail: mike@eisler.com + + David Noveck + Network Appliance + 375 Totten Pond Road + Waltham, MA 02451 + + Phone: +1 781-768-5347 + EMail: dnoveck@netapp.com + + David Robinson + Sun Microsystems, Inc. 
+ 5300 Riata Park Court + Austin, TX 78727 + + Phone: +1 650-786-5088 + EMail: david.robinson@sun.com + + Robert Thurlow + Sun Microsystems, Inc. + 500 Eldorado Blvd. + Broomfield, CO 80021 + + Phone: +1 650-786-5096 + EMail: robert.thurlow@sun.com + + + + + + +Shepler, et al. Standards Track [Page 274] + +RFC 3530 NFS version 4 Protocol April 2003 + + +23. Full Copyright Statement + + Copyright (C) The Internet Society (2003). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Shepler, et al. Standards Track [Page 275] + -- cgit v1.2.3