Network Working Group                                     Robert Thurlow
Internet Draft                                            February 2005
Document: draft-thurlow-nfsv4-namespace-01.txt



                     A Namespace For NFS Version 4



Status of this Memo

   By submitting this Internet-Draft, I certify that any applicable
   patent or other IPR claims of which I am aware have been disclosed,
   and any of which I become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   Discussion and suggestions for improvement are requested.  This
   document will expire in August, 2005. Distribution of this draft is
   unlimited.

Abstract

   Recent work on Replication and Migration for NFSv4 has reminded us of
   a more fundamental problem:  NFS currently lacks a coherent
   enterprise or global namespace.  With changes in a minor revision of
   NFS Version 4, this can be addressed to make services like
   replication and migration of filesystems fully functional.

   This draft is considered obsolete, but is refreshed for the
   convenience of those reviewing Charles Fan's [NS_PROBLEM] document.




Expires: August 2005                                            [Page 1]

Title                 A Namespace For NFS Version 4        February 2005


Table of Contents

   1.  Introduction
   2.  Problem Statement
   3.  Requirements
   4.  Implementation Options
   5.  Minor Revision Client-Server Changes
   5.1.  Finding The Global Root
   5.2.  Junction Nodes
   5.3.  New error number: NFS4ERR_REFERRAL
   5.4.  Enhanced Lookup Procedure
   5.5.  An Example
   6.  Server-Namespace Interaction
   6.1.  Use The Directory
   6.2.  Use Replication
   7.  Appendix A: XDR Protocol Definition File
   8.  Full Copyright Statement
   9.  Intellectual property
   10.  Normative References
   11.  Informative References
   12.  Author's Address

1.  Introduction

   Unlike some other distributed filesystems, NFS has never had a
   unified, Internet- or enterprise-level namespace.  At best, an NFS
   namespace is confined to a set of machines within one administrative
   domain, which is often much smaller than the company-wide intranet.
   This has become a larger issue with the end-of-life (EOL) of systems
   such as AFS and DCE/DFS, each of which provides a strong, unified
   namespace at the enterprise level that can be extended to a global
   namespace for the Internet.

   To create a large-scale namespace, we have to reconfigure the way the
   NFS client discovers resources.  In general, the client specially
   handles certain "junction" points in its view of the namespace; when
   one of these junction points is manipulated, the client consults some
   kind of distributed database to find distributed filesystems and
   attributes for them, and then mounts and uses them.  This makes the
   namespace a composite construction accessed by different protocols at
   different times.  This is neither necessary nor desirable; with an
   extensible distributed filesystem protocol such as NFS Version 4,
   these junctions can be embedded into the NFS virtual filesystem and
   all necessary information can be discovered by NFSv4 protocol
   operations.  The server could also support mechanisms to work with
   other distributed filesystems.

   Once clients understand junctions and how to get referrals to actual
   locations, we can support generic servers to provide clients with
   easy access to the namespace.  These servers need not store any data
   files, but could just store a replica of the top level of the global
   namespace.  They could advertise the global root via SLP [RFC2608]
   and let clients find the locations of useful filesystems via
   referrals. This easily supports an enterprise-level consistent
   namespace, which can be made global with industry agreement regarding
   the management of a global root directory and the servers to present
   it.

   A namespace solves some problems for [REPLMIG] as well.  An unsolved
   problem in replication is how to inform the client of the locations
   of the replicas; so far, non-standard extensions to automounter maps
   and manual mount command syntax must be used.  With an extended
   lookup, attributes and several or all locations can be returned
   in-band.

   This document does not currently define a syntax for the top levels
   of the filesystem.  Any syntax should define the names used for the
   top two levels (e.g. "/nfs/sun.com"), and should also define a
   shorthand for accessing the enterprise-level root without the need to
   go through the top level unnecessarily.

2.  Problem Statement

   Customers have seen a gap in NFS with respect to such distributed
   filesystems as AFS and DCE/DFS.  They want to be able to build a
   namespace which is seen consistently by all clients in their
   enterprise, and some hope for a truly global, Internet-wide hierarchy
   of files in which each user is granted only the access they are
   entitled to.
   They want to be able to delegate authority to match the data; I may
   not be able to dictate where my home directory is in the namespace,
   but I should be able to construct my home directory as multiple
   filesystems and be able to "publish" those parts.  Because the best
   we have been able to do involves a highly-configurable Automounter
   daemon with variance across platforms, combined with non-standard
   databases with non-standard filesystem location information, the NFS
   industry can't currently offer this functionality.

3.  Requirements

   Customer requirements include the following:

   o    Permit me to build (at least) enterprise-wide namespaces

   o    Permit me to delegate management of parts of the namespace to
        owners of data

   o    Make this manageable from almost anywhere

   o    Don't make me deploy a new naming service

   o    Permit reasonable backwards compatibility


4.  Implementation Options

   The problem boils down to a couple of issues:

   o    How does the client find a server for a relevant root vnode?

   o    How does the client detect and navigate a "junction point" where
        it must transition from a higher-level to a lower-level
        filesystem?

   The first issue can be solved by configuring the root location into
   the client or by having the client do a network transaction to find a
   suitable global root server.  Hard-coding this information does not
   scale to many clients, so a network transaction is in order.  The
   most suitable deployed service to find an instance of a highly
   replicated object is the Service Location Protocol [RFC2608].  By
   doing an
   SLP request, a client should be able to find a nearby server which
   knows how to find the global root and can thus see all the data in
   that one filesystem.  Typically, though, that filesystem would
   consist almost entirely of some locations of more interesting
   resources.

   The "junction" point, where virtual filesystems meet, is inherently
   an abstraction, since the separate virtual filesystems must never
   completely look like a single one.  Since the junction is abstract,
   there are different ways to construct it.  The client can construct
   it based on information from a distributed service such as LDAP or
   the server can construct it and make it visible through the NFS
   protocol.  If the server constructs it, it could again base the
   construction on a service like LDAP, or it could hold copies in
   actual filesystem objects, with the filesystems managed as replicas.

   The best choice at this time seems to be to have the server make an
   abstraction visible to the client via NFS Version 4 minor revision
   protocol elements.  The server would be able to construct symbolic
   links for NFS Version 2 and Version 3 clients and to construct other
   types of referrals to other distributed filesystems (e.g. Microsoft
   DFS referrals for CIFS clients).

   How the server gets its data is not so clear at present and is not
   specified by this draft.  Further ideas are discussed in Section 6.

5.  Minor Revision Client-Server Changes


5.1.  Finding The Global Root

   The NFS client should begin navigation of the global namespace by
   issuing an SLP call to look for a service named "Global_NFS".  It
   should then attempt to ask that server for information about the
   global or enterprise root.
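
   As an illustration only, the following Python sketch shows how a
   client might turn a service reply into candidate root locations.  It
   assumes the reply is the comma-separated "server:/path" list used in
   the example of Section 5.5; a real SLP reply carries service URLs as
   defined in [RFC2608], and no SLP library is used here.  The function
   name parse_root_candidates is hypothetical.

```python
from typing import List, Tuple


def parse_root_candidates(reply: str) -> List[Tuple[str, str]]:
    """Split a reply such as "master1:/, master2:/" into (server, path) pairs.

    The "server:/path" form is an assumption taken from the example in
    Section 5.5; it is not an SLP wire format.
    """
    candidates = []
    for item in reply.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        server, sep, path = item.partition(":")
        if not sep or not server or not path.startswith("/"):
            raise ValueError(f"malformed root location: {item!r}")
        candidates.append((server, path))
    return candidates


roots = parse_root_candidates("master1:/, master2:/")
print(roots)
```

   A client would then try the candidates in order until one answers.
   This simple split does not handle literal IPv6 addresses, which
   contain colons of their own.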

5.2.  Junction Nodes

   Junction nodes to other distributed filesystems could be represented
   by the following XDR definition:

   enum nodetype4 {
           NAME_NFS_URL    = 1,
           NAME_NFS_IP     = 2,
           NAME_SMB        = 3
   };

   enum ipaddrtype {
           NAME_IPV4       = 1,
           NAME_IPV6       = 2
   };

   struct nameipnode4 {
           ipaddrtype type;
           opaque addr<NAME_MAX_ADDR>;
           opaque path<>;
   };

   union namenode4 switch (nodetype4 type) {
    case NAME_NFS_URL:
           /* nfs://server.domain.com/export/dir1/dir2 */
           opaque nfs_url<NAME_MAX_URL>;
    case NAME_NFS_IP:
           /* 10.0.0.2 + /export/dir1/dir2 */
           nameipnode4 nfs_ip;
    case NAME_SMB:
           /* As defined by:
   http://www.ietf.org/internet-drafts/draft-crhertel-smb-url-04.txt */
           /* smb://server.domain.com/export/dir1/dir2 */
           opaque smb_url<NAME_MAX_URL>;
   };

   enum opttype4 {
           NAME_ACCESS     = 1,
           NAME_MASTER     = 2,
           NAME_VOLID      = 3,
           NAME_STRING     = 4
   };

   enum access4 {
           NAME_RO         = 0,
           NAME_RW         = 1
   };

   struct keyvalue {
           opaque key<>;
           opaque value<>;
   };

   union options4 switch (opttype4 type) {
    case NAME_ACCESS:      /* ro/rw */
           access4 acc;
    case NAME_MASTER:      /* master true/false */
           bool master;
    case NAME_VOLID:       /* volume ID if multiple paths to master */
           int64_t volid;
    case NAME_STRING:      /* generic string=value option */
           keyvalue kv;
   };

   struct location4 {
           namenode4 loc;
           options4 opts<>;
   };

   This definition would permit servers to send referrals containing an
   NFS URL, which requires a name service lookup; a combination of an
   IPv4 or IPv6 address and a path name, suitable for immediate use; or
   even an SMB URL for Samba servers.
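
   To make the wire format concrete, the following Python sketch
   encodes a namenode4 by hand using the XDR rules of [RFC1832]: enums
   are 4-byte big-endian integers, and variable-length opaque data is a
   4-byte length followed by the bytes, zero-padded to a multiple of
   four.  Two things are assumptions, not part of this draft: the value
   of NAME_MAX_ADDR (16 here) and the choice to carry the address as raw
   network-order bytes in addr.  The helper names are hypothetical.

```python
import ipaddress
import struct

NAME_NFS_URL, NAME_NFS_IP, NAME_SMB = 1, 2, 3  # nodetype4
NAME_IPV4, NAME_IPV6 = 1, 2                    # ipaddrtype
NAME_MAX_ADDR = 16  # assumption: the draft does not give a value


def xdr_enum(value: int) -> bytes:
    """XDR enum: a signed 32-bit big-endian integer."""
    return struct.pack(">i", value)


def xdr_opaque(data: bytes) -> bytes:
    """XDR variable-length opaque: length, data, zero padding to 4 bytes."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_nfs_ip(ip: str, path: str) -> bytes:
    """Encode a namenode4 carrying the NAME_NFS_IP arm (a nameipnode4)."""
    parsed = ipaddress.ip_address(ip)
    addr_type = NAME_IPV4 if parsed.version == 4 else NAME_IPV6
    addr = parsed.packed  # assumption: raw address bytes, not text
    if len(addr) > NAME_MAX_ADDR:
        raise ValueError("address too long")
    return (xdr_enum(NAME_NFS_IP)      # union discriminant
            + xdr_enum(addr_type)      # nameipnode4.type
            + xdr_opaque(addr)         # nameipnode4.addr
            + xdr_opaque(path.encode("utf-8")))  # nameipnode4.path


def encode_nfs_url(url: str) -> bytes:
    """Encode a namenode4 carrying the NAME_NFS_URL arm."""
    return xdr_enum(NAME_NFS_URL) + xdr_opaque(url.encode("utf-8"))


wire = encode_nfs_ip("10.0.0.2", "/export/dir1/dir2")
print(len(wire), wire.hex())
```

   The IP form lets a client connect without a name service lookup, at
   the cost of embedding an address that may change.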

5.3.  New error number: NFS4ERR_REFERRAL

   A new error number should be added to those in [RFC3530], defined as
   follows:

   NFS4ERR_REFERRAL        The name being looked up is valid, but
                           refers to an object on another NFS server.
                           The RLOOKUP operation will provide more
                           information about this node.

5.4.  Enhanced Lookup Procedure

   An enhanced RLOOKUP operation is proposed for a future NFS Version 4
   minor revision.  It will act like the current LOOKUP operation in
   [RFC3530] in most cases, but will return much richer data in the
   operation response when a node is a junction.  This extra information
   makes it possible to begin using a referred filesystem without an
   extra round trip.  The definition is:

   struct RLOOKUP4args {
           /* CURRENT_FH: directory */
           component4      objname;
   };

   union referral4 switch (nfsstat4 status) {
    case NFS4ERR_REFERRAL:
           location4 locarray<>;
    default:
           void;
   };

   struct RLOOKUP4res {
           /* CURRENT_FH: object */
           referral4 refer;
   };
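
   The behaviour described above can be modelled in a few lines of
   Python.  This is a sketch of the semantics only, not a server
   implementation: Directory, Location and RLookupResult are
   hypothetical stand-ins for the filehandle, location4 and RLOOKUP4res,
   and status codes are kept symbolic because this draft assigns no
   numeric value to NFS4ERR_REFERRAL.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

NFS4_OK = "NFS4_OK"
NFS4ERR_NOENT = "NFS4ERR_NOENT"
NFS4ERR_REFERRAL = "NFS4ERR_REFERRAL"  # symbolic: no number assigned here


@dataclass
class Location:
    """Simplified stand-in for location4: a target plus options."""
    target: str
    options: Dict[str, str] = field(default_factory=dict)


@dataclass
class Directory:
    """Entries are either subdirectories or junctions (location lists)."""
    entries: Dict[str, Union["Directory", List[Location]]] = field(
        default_factory=dict)


@dataclass
class RLookupResult:
    status: str
    current_fh: Optional[Directory] = None                   # on NFS4_OK
    locarray: List[Location] = field(default_factory=list)   # on referral


def rlookup(current_fh: Directory, objname: str) -> RLookupResult:
    """Behave like LOOKUP, except that a junction yields its locations."""
    entry = current_fh.entries.get(objname)
    if entry is None:
        return RLookupResult(NFS4ERR_NOENT)
    if isinstance(entry, list):
        # The name is valid but refers to an object on another server.
        return RLookupResult(NFS4ERR_REFERRAL, locarray=entry)
    return RLookupResult(NFS4_OK, current_fh=entry)


junction = [Location("nfs://corp/stuff", {"access": "rw"})]
root = Directory({"sun.com": Directory({"corp": junction})})
step = rlookup(root, "sun.com")
print(step.status)
print(rlookup(step.current_fh, "corp").status)
```

   Because the locations travel in the same response as the error, the
   client needs no second request to learn where to go next.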


5.5.  An Example

   The client calls:

   Fd = open("/nfs/sun.com/corp/data/spreadsheet.pdf", ...);

   The following traffic would result (EREFER abbreviates
   NFS4ERR_REFERRAL):

   SLP SrvRqst "Global_NFS" --> Broadcast

   SLP SrvRply "master1:/, master2:/" <-- SLP server

   NFS COMPOUND {putrootfh rlookup nfs rlookup sun.com rlookup corp
       rlookup data open spreadsheet.pdf} --> master1

   NFS { putrootfh OK rlookup OK rlookup OK rlookup OK
       rlookup EREFER corp:/stuff} <-- master1

   NFS COMPOUND {putrootfh rlookup stuff rlookup data
       open spreadsheet.pdf } --> corp

   NFS { putrootfh OK rlookup OK rlookup OK
       rlookup EREFER cdata:/finance } <-- corp

   NFS COMPOUND {putrootfh rlookup finance rlookup data
       open spreadsheet.pdf } --> cdata

   NFS { putrootfh OK rlookup OK rlookup OK open OK } <-- cdata
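
   The client-side logic behind such an exchange can be sketched as a
   loop that restarts the walk on the referred server whenever a
   referral is returned.  The Python below simulates this in memory.
   The layout of the three servers is hypothetical and simplified; it
   is not meant to reproduce the trace above message for message.  The
   names SERVERS, Referral, compound_lookup and resolve, and the limit
   MAX_REFERRALS, are assumptions introduced for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Referral(NamedTuple):
    server: str
    path: str


# Hypothetical exports: dicts are directories, Referral marks a junction.
SERVERS: Dict[str, dict] = {
    "master1": {"nfs": {"sun.com": {"corp": Referral("corp", "/stuff")}}},
    "corp": {"stuff": {"data": Referral("cdata", "/finance")}},
    "cdata": {"finance": {"spreadsheet.pdf": "FILE"}},
}
MAX_REFERRALS = 8  # assumption: a guard against referral loops


def split(path: str) -> List[str]:
    return [c for c in path.split("/") if c]


def compound_lookup(server: str, components: List[str]):
    """One simulated COMPOUND: walk from the root until done or referred.

    Returns (object, None, []) on success, or
    (None, referral, remaining components) when a junction is hit.
    """
    node = SERVERS[server]
    for i, name in enumerate(components):
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(f"{server}:/{'/'.join(components[:i + 1])}")
        node = node[name]
        if isinstance(node, Referral):
            return None, node, components[i + 1:]
    return node, None, []


def resolve(server: str, path: str) -> Tuple[str, object, List[str]]:
    """Follow referrals; return (final server, object, servers visited)."""
    components = split(path)
    visited = [server]
    for _ in range(MAX_REFERRALS):
        obj, referral, rest = compound_lookup(server, components)
        if referral is None:
            return server, obj, visited
        # Restart on the referred server: target path, then what is left.
        server = referral.server
        components = split(referral.path) + rest
        visited.append(server)
    raise RuntimeError("too many referrals")


print(resolve("master1", "/nfs/sun.com/corp/data/spreadsheet.pdf"))
```

   Each referral costs one extra request to a new server, so the depth
   of junction nesting bounds the number of round trips for a cold
   lookup.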

6.  Server-Namespace Interaction

   Though this document intends to specify the client-server
   interactions of the namespace, it is interesting to speculate on how
   servers will construct the namespace abstraction for the client.
   There are two main ways to do this, which differ in where the "real"
   data lives.

6.1.  Use The Directory

   In this scenario, the real home of the data is in a set of
   interrelated nodes in an LDAP directory.  The server enumerates a
   list of junction points from the directory and marks those nodes as
   requiring special handling, and accesses to these nodes result in an
   LDAP lookup to find the latest data to return to the client.  This
   group would standardize an LDAP schema and management would be via
   LDAP tools.  This has the benefit that an LDAP schema would be a
   well-understood concept and that tools should be available to manage
   it.  A disadvantage is that NFS server implementations are usually
   embedded in the operating system kernel, requiring LDAP lookups to
   involve a user-level daemon.  Also, unavailability of the LDAP
   service will cause issues for the server.
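
   A rough Python sketch of this arrangement follows.  The directory is
   represented by two plain callables rather than a real LDAP client,
   and no schema is implied; JunctionTable and DirectoryUnavailable are
   hypothetical names.  The sketch marks junction paths once at start-up
   and consults the directory on every access, so a directory outage
   surfaces directly to the caller.

```python
from typing import Callable, Dict, List, Set


class DirectoryUnavailable(Exception):
    """Raised by the stand-in directory when it cannot be reached."""


class JunctionTable:
    def __init__(self,
                 enumerate_junctions: Callable[[], List[str]],
                 lookup_locations: Callable[[str], List[str]]) -> None:
        self._lookup = lookup_locations
        # Mark the nodes that need special handling, once, at start-up.
        self.junctions: Set[str] = set(enumerate_junctions())

    def is_junction(self, path: str) -> bool:
        return path in self.junctions

    def locations(self, path: str) -> List[str]:
        """Fetch the latest locations; a directory failure propagates."""
        if path not in self.junctions:
            raise KeyError(path)
        return self._lookup(path)


# A dict stands in for the directory service in this example.
directory: Dict[str, List[str]] = {"/nfs/sun.com/corp": ["nfs://corp/stuff"]}
table = JunctionTable(lambda: list(directory), lambda p: directory[p])
print(table.is_junction("/nfs/sun.com/corp"))
print(table.locations("/nfs/sun.com/corp"))
```

   Looking up locations at access time keeps referrals current, but it
   also ties every junction traversal to the availability of the
   directory.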

6.2.  Use Replication

   In this scenario, the new virtual junction becomes an actual
   filesystem object, and contains the data needed by the client.  The
   junction object could be created on the master filesystem and
   propagated by filesystem replication as defined in [REPLMIG].  For
   manageability, an SNMP MIB could be defined to enumerate all junction
   points in a particular filesystem and to manipulate their properties.
   A management tool would construct an image of the namespace by
   consulting the root of the global filesystem and walking down as
   needed.  This has the advantage that servers would always have data
   to give to clients, and that changes in the linkage of filesystems
   would be identical to other changes to the linkage of directories in
   the filesystem as far as the client could see.
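
   The management-tool walk mentioned above might look like the
   following Python sketch.  It works on an in-memory model rather than
   on SNMP or NFS, and the filesystem layout is hypothetical: dicts are
   directories and ("server", "/path") tuples are junction objects.  The
   names FILESYSTEMS, descend and map_namespace are assumptions.

```python
from collections import deque
from typing import List, Tuple

# Hypothetical filesystems keyed by server name.
FILESYSTEMS = {
    "master1": {"nfs": {"sun.com": {"corp": ("corp", "/stuff")}}},
    "corp": {"stuff": {"data": ("cdata", "/finance")}},
    "cdata": {"finance": {}},
}


def descend(tree, path: str):
    """Return the node found at path inside one server's tree."""
    node = tree
    for name in [c for c in path.split("/") if c]:
        node = node[name]
    return node


def map_namespace(root_server: str) -> List[Tuple[str, str, str]]:
    """Return (server, junction path, target) triples seen from the root."""
    found = []
    seen = set()
    queue = deque([(root_server, "/")])
    while queue:
        server, start = queue.popleft()
        if (server, start) in seen:
            continue  # already walked; also guards against cycles
        seen.add((server, start))
        top = descend(FILESYSTEMS[server], start)
        if not isinstance(top, dict):
            continue  # the target is itself not a directory
        stack = [(start.rstrip("/"), top)]
        while stack:
            prefix, node = stack.pop()
            for name, child in sorted(node.items()):
                path = f"{prefix}/{name}"
                if isinstance(child, tuple):
                    found.append((server, path, f"{child[0]}:{child[1]}"))
                    queue.append(child)  # walk the referred filesystem later
                else:
                    stack.append((path, child))
    return found


for entry in map_namespace("master1"):
    print(entry)
```

   The walk only ever reads junction objects from the filesystems
   themselves, which mirrors the point that servers always have the
   data locally.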

7.  Appendix A: XDR Protocol Definition File

   /*
    * Copyright (C) The Internet Society (2003).
    *  All Rights Reserved.
    */

   /*
    * node.x
    */

   %#pragma ident  "@(#)node.x     1.2    03/05/21"

   enum nodetype4 {
           NAME_NFS_URL    = 1,
           NAME_NFS_IP     = 2,
           NAME_SMB        = 3
   };

   enum ipaddrtype {
           NAME_IPV4       = 1,
           NAME_IPV6       = 2
   };

   struct nameipnode4 {
           ipaddrtype type;
           opaque addr<NAME_MAX_ADDR>;
           opaque path<>;
   };

   union namenode4 switch (nodetype4 type) {
    case NAME_NFS_URL:
           /* nfs://server.domain.com/export/dir1/dir2 */
           opaque nfs_url<NAME_MAX_URL>;
    case NAME_NFS_IP:
           /* 10.0.0.2 + /export/dir1/dir2 */
           nameipnode4 nfs_ip;
    case NAME_SMB:
           /* As defined by:
   http://www.ietf.org/internet-drafts/draft-crhertel-smb-url-04.txt */
           /* smb://server.domain.com/export/dir1/dir2 */
           opaque smb_url<NAME_MAX_URL>;
   };

   enum opttype4 {
           NAME_ACCESS     = 1,
           NAME_MASTER     = 2,
           NAME_VOLID      = 3,
           NAME_STRING     = 4
   };

   enum access4 {
           NAME_RO         = 0,
           NAME_RW         = 1
   };

   struct keyvalue {
           opaque key<>;
           opaque value<>;
   };

   union options4 switch (opttype4 type) {
    case NAME_ACCESS:      /* ro/rw */
           access4 acc;
    case NAME_MASTER:      /* master true/false */
           bool master;
    case NAME_VOLID:       /* volume ID if multiple paths to master */
           int64_t volid;
    case NAME_STRING:      /* generic string=value option */
           keyvalue kv;
   };

   struct location4 {
           namenode4 loc;
           options4 opts<>;
   };

   struct RLOOKUP4args {
           /* CURRENT_FH: directory */
           component4      objname;
   };

   union referral4 switch (nfsstat4 status) {
    case NFS4ERR_REFERRAL:
           location4 locarray<>;
    default:
           void;
   };

   struct RLOOKUP4res {
           /* CURRENT_FH: object */
           referral4 refer;
   };

8.  Full Copyright Statement

   Copyright (C) The Internet Society (2005).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

9.  Intellectual property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at ietf-
   ipr@ietf.org.

10.  Normative References


   [RFC1831]
   R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification
   Version 2", RFC1831, August 1995.


   [RFC1832]
   R. Srinivasan, "XDR: External Data Representation Standard", RFC1832,
   August 1995.


   [RFC2165]
   J. Veizades, E. Guttman, C. Perkins, S. Kaplan, "Service Location
   Protocol", RFC2165, June 1997.


   [RFC3530]
   S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M.
   Eisler, D. Noveck, "Network File System (NFS) Version 4 Protocol",
   RFC3530, April 2003.


   [RFC2608]
   E. Guttman, C. Perkins, J. Veizades, M. Day, "Service Location
   Protocol, Version 2", RFC2608, June 1999.


11.  Informative References


   [RFC3010]
   S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M.
   Eisler, D. Noveck, "NFS version 4 Protocol", RFC3010, December 2000.


   [NS_PROBLEM]
   C. Charles Fan, "NFSv4 Global Namespace Problem Statement", draft-
   fan-nfsv4-global-namespace-problem-statement-00.txt, February 2005.

12.  Author's Address

   Address comments related to this memorandum to:

        nfsv4-wg@sunroof.eng.sun.com

   Robert Thurlow
   Sun Microsystems, Inc.
   500 Eldorado Boulevard, UBRM05-171
   Broomfield, CO 80021

   Phone: 877-718-3419
   E-mail: robert.thurlow@sun.com
