Internet DRAFT - draft-thurlow-nfsv4-namespace
Network Working Group Robert Thurlow
Internet Draft February 2005
Document: draft-thurlow-nfsv4-namespace-01.txt
A Namespace For NFS Version 4
Status of this Memo
By submitting this Internet-Draft, I certify that any applicable
patent or other IPR claims of which I am aware have been disclosed,
and any of which I become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
Discussion and suggestions for improvement are requested. This
document will expire in August, 2005. Distribution of this draft is
unlimited.
Abstract
Recent work on Replication and Migration for NFSv4 has reminded us of
a more fundamental problem: NFS currently lacks a coherent
enterprise or global namespace. With changes in a minor revision of
NFS Version 4, this can be addressed to make services like
replication and migration of filesystems fully functional.
This draft is considered obsolete, but is refreshed for the
convenience of those reviewing Charles Fan's [NS_PROBLEM] document.
Expires: August 2005 [Page 1]
Title A Namespace For NFS Version 4 February 2005
Table of Contents
1. Introduction
2. Problem Statement
3. Requirements
4. Implementation Options
5. Minor Revision Client-Server Changes
5.1. Finding The Global Root
5.2. Junction Nodes
5.3. New error number: NFS4ERR_REFERRAL
5.4. Enhanced Lookup Procedure
5.5. An Example
6. Server-Namespace Interaction
6.1. Use The Directory
6.2. Use Replication
7. Appendix A: XDR Protocol Definition File
8. Full Copyright Statement
9. Intellectual property
10. Normative References
11. Informative References
12. Author's Address
1. Introduction
Unlike some other distributed filesystems, NFS has never had a
unified, Internet- or enterprise-level namespace. At best, an NFS
namespace is usually confined to a set of machines within an
administrative domain that is often much smaller than the
company-wide intranet. This has become a larger issue with the
end-of-life (EOL) of such systems as AFS and DCE/DFS, each of which
provides a strong and unified namespace at the enterprise level that
can be extended to a global namespace for the Internet.
To create a large-scale namespace, we have to reconfigure the way the
NFS client discovers resources. In general, the client specially
handles certain "junction" points in its view of the namespace; when
one of these junction points is manipulated, the client consults some
kind of distributed database to find distributed filesystems and
attributes for them, and then mounts and uses them. This makes the
namespace a composite construction accessed by different protocols at
different times. This is neither necessary nor desirable; with an
extensible distributed filesystem protocol such as NFS Version 4,
these junctions can be embedded into the NFS virtual filesystem and
all necessary information can be discovered by NFSv4 protocol
operations. The server could also support mechanisms to work with
other distributed filesystems.
Once clients understand junctions and how to get referrals to actual
locations, we can support generic servers to provide clients with
easy access to the namespace. These servers need not store any data
files, but could just store a replica of the top level of the global
namespace. They could advertise the global root via SLP [RFC2608]
and let clients find the locations of useful filesystems via
referrals. This easily supports an enterprise-level consistent
namespace, which can be made global with industry agreement regarding
the management of a global root directory and the servers to present
it.
A namespace solves some problems for [REPLMIG] as well. An unsolved
problem in replication is how to inform the client of the locations
of the replicas; so far, non-standard extensions to automounter maps
and manual mount command syntax must be used. With an extended
lookup, attributes and several or all locations can be returned
in-band.
This document does not currently define a syntax for the top levels
of the filesystem. Any syntax should define the names used for the
top two levels (e.g. "/nfs/sun.com"), and should also define a
shorthand for accessing the enterprise-level root without the need to
go through the top level unnecessarily.
2. Problem Statement
Customers have seen a gap in NFS with respect to such distributed
filesystems as AFS and DCE/DFS. They want to be able to build a
namespace which is seen consistently by all clients in their
enterprise, and some hope for a truly global Internet-wide hierarchy
of files, to which you would gain only the access level you deserved.
They want to be able to delegate authority to match the data; I may
not be able to dictate where my home directory is in the namespace,
but I should be able to construct my home directory as multiple
filesystems and be able to "publish" those parts. Because the best
we have been able to do involves a highly-configurable Automounter
daemon with variance across platforms, combined with non-standard
databases with non-standard filesystem location information, the NFS
industry can't currently offer this functionality.
3. Requirements
Customer requirements include the following:
o Permit me to build (at least) enterprise-wide namespaces
o Permit me to delegate management of parts of the namespace to
owners of data
o Make this manageable from almost anywhere
o Don't make me deploy a new naming service
o Permit reasonable backwards compatibility
4. Implementation Options
The problem boils down to a couple of issues:
o How does the client find a server for a relevant root vnode?
o How does the client detect and navigate a "junction point" where
it must transition from a higher-level to a lower-level
filesystem?
The first issue can be solved by configuring the root location into
the client or by having the client do a network transaction to find a
suitable global root server. Hard-coding this information does not
scale to many clients, so a network transaction is in order. The
most suitable deployed service for finding an instance of a highly
replicated object is the Service Location Protocol (SLP) [RFC2608].
By doing an
SLP request, a client should be able to find a nearby server which
knows how to find the global root and can thus see all the data in
that one filesystem. Typically, though, that filesystem would
consist almost entirely of referrals to the locations of more
interesting resources.
The "junction" point, where virtual filesystems meet, is inherently
an abstraction, since the separate virtual filesystems must never
completely look like a single one. Since the junction is abstract,
there are different ways to construct it. The client can construct
it based on information from a distributed service such as LDAP or
the server can construct it and make it visible through the NFS
protocol. If the server constructs it, it could again base the
construction on a service like LDAP, or it could hold copies in
actual filesystem objects, with the filesystems managed as replicas.
The best choice at this time seems to be to have the server make an
abstraction visible to the client via NFS Version 4 minor revision
protocol elements. The server would be able to construct symbolic
links for NFS Version 2 and Version 3 clients and to construct other
types of referrals to other distributed filesystems (e.g. Microsoft
DFS referrals for CIFS clients).
How the server gets its data is not so clear at present and is not
specified by this draft. Further ideas are discussed in Section 6.
5. Minor Revision Client-Server Changes
5.1. Finding The Global Root
The NFS client should begin navigation of the global namespace by
issuing an SLP service request for a service named "Global_NFS". It
should then ask one of the servers named in the reply for
information about the global or enterprise root.
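The discovery step might be sketched as follows. This is only an
illustration under stated assumptions: it assumes the SLP reply has
already been received as the comma-separated "host:path" list shown
in the example of Section 5.5, it handles host names only (not
literal addresses containing colons), and the function names are
hypothetical. No real SLP library is called.

```python
from typing import List, Tuple

SERVICE_NAME = "Global_NFS"  # service name proposed in this section


def parse_root_candidates(reply: str) -> List[Tuple[str, str]]:
    """Split a reply such as "master1:/, master2:/" into (host, path) pairs."""
    candidates = []
    for item in reply.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, path = item.partition(":")
        if not sep or not host:
            raise ValueError(f"malformed root location: {item!r}")
        candidates.append((host, path or "/"))
    return candidates


def pick_root_server(reply: str) -> Tuple[str, str]:
    """Return the first advertised root server (hypothetical policy)."""
    candidates = parse_root_candidates(reply)
    if not candidates:
        raise LookupError(f"no {SERVICE_NAME} servers advertised")
    return candidates[0]
```

A real client would probably prefer the nearest or least loaded
server rather than the first one listed; that policy is outside the
scope of this sketch.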
5.2. Junction Nodes
Junction nodes to other distributed filesystems could be represented
by the following XDR definition:
enum nodetype4 {
    NAME_NFS_URL = 1,
    NAME_NFS_IP = 2,
    NAME_SMB = 3
};
enum ipaddrtype {
    NAME_IPV4 = 1,
    NAME_IPV6 = 2
};

struct nameipnode4 {
    ipaddrtype type;
    opaque addr<NAME_MAX_ADDR>;
    opaque path<>;
};

union namenode4 switch (nodetype4 type) {
    case NAME_NFS_URL:
        /* nfs://server.domain.com/export/dir1/dir2 */
        opaque nfs_url<NAME_MAX_URL>;
    case NAME_NFS_IP:
        /* 10.0.0.2 + /export/dir1/dir2 */
        nameipnode4 nfs_ip;
    case NAME_SMB:
        /* As defined by:
           http://www.ietf.org/internet-drafts/draft-crhertel-smb-url-04.txt */
        /* smb://server.domain.com/export/dir1/dir2 */
        opaque smb_url<NAME_MAX_URL>;
};

enum opttype4 {
    NAME_ACCESS = 1,
    NAME_MASTER = 2,
    NAME_VOLID = 3,
    NAME_STRING = 4
};

enum access4 {
    NAME_RO = 0,
    NAME_RW = 1
};

struct keyvalue {
    opaque key<>;
    opaque value<>;
};
union options4 switch (opttype4 type) {
    case NAME_ACCESS:   /* ro/rw */
        access4 acc;
    case NAME_MASTER:   /* master true/false */
        bool master;
    case NAME_VOLID:    /* volume ID if multiple paths to master */
        int64 volid;
    case NAME_STRING:   /* generic string=value option */
        keyvalue kv;
};

struct location4 {
    namenode4 loc;
    options4 opts<>;
};
This definition would permit servers to send referrals containing an
NFS URL, which requires a name service lookup; a combination of an
IPv4 or IPv6 address and a path name, suitable for immediate use; or
even an SMB URL for Samba servers.
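To make the wire format concrete, the following sketch encodes two
arms of the namenode4 union using the standard XDR rules [RFC1832]:
enums and union discriminants are 4-byte big-endian integers, and
variable-length opaque data is a 4-byte length followed by the bytes,
zero-padded to a multiple of four. It is an illustration only. The
function names are hypothetical, and it assumes that the addr field
carries the raw network-order address bytes, which this draft does
not actually specify.

```python
import struct

NAME_NFS_URL, NAME_NFS_IP, NAME_SMB = 1, 2, 3   # nodetype4
NAME_IPV4, NAME_IPV6 = 1, 2                      # ipaddrtype


def xdr_int(value: int) -> bytes:
    """XDR enums and ints are 4-byte big-endian signed integers."""
    return struct.pack(">i", value)


def xdr_opaque(data: bytes) -> bytes:
    """Variable-length opaque: length, bytes, zero padding to a multiple of 4."""
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_nfs_url(url: str) -> bytes:
    """namenode4, NAME_NFS_URL arm: discriminant, then the URL as opaque."""
    return xdr_int(NAME_NFS_URL) + xdr_opaque(url.encode("utf-8"))


def encode_nfs_ip(addr: bytes, path: str) -> bytes:
    """namenode4, NAME_NFS_IP arm: discriminant, then a nameipnode4 struct."""
    if len(addr) == 4:
        addr_type = NAME_IPV4
    elif len(addr) == 16:
        addr_type = NAME_IPV6
    else:
        raise ValueError("address must be 4 (IPv4) or 16 (IPv6) bytes")
    # Assumption: addr is the raw binary address, not its text form.
    body = xdr_int(addr_type) + xdr_opaque(addr) + xdr_opaque(path.encode("utf-8"))
    return xdr_int(NAME_NFS_IP) + body
```

For example, encode_nfs_ip(bytes([10, 0, 0, 2]), "/export/dir1/dir2")
would produce the NAME_NFS_IP form of the location shown in the
comment above.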
5.3. New error number: NFS4ERR_REFERRAL
A new error number should be added to those defined in [RFC3530],
defined this way:
NFS4ERR_REFERRAL The name being looked up is valid, but
refers to an object on another NFS server.
The RLOOKUP operation will provide more
information about this node.
5.4. Enhanced Lookup Procedure
An enhanced RLOOKUP operation is proposed for a future NFS Version 4
minor revision. It will act like the current LOOKUP operation in
[RFC3530] in most cases, but will return much richer data in the
operation response when a node is a junction. This extra information
makes it possible to begin use of a referred filesystem without an
extra round-trip. The definition is:
struct RLOOKUP4args {
    /* CURRENT_FH: directory */
    component4 objname;
};
union referral4 switch (nfsstat4 status) {
    case NFS4ERR_REFERRAL:
        location4 locarray<>;
    default:
        void;
};

struct RLOOKUP4res {
    /* CURRENT_FH: object */
    referral4 refer;
};
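A client receiving an RLOOKUP result has to decide whether to
continue normally or to follow one of the returned locations. The
sketch below shows one possible way to make that decision. It is not
part of the proposal: the classes are simplified stand-ins for
location4 and RLOOKUP4res, the numeric value given to
NFS4ERR_REFERRAL is a placeholder because this draft does not assign
one, and the selection policy (prefer the master, otherwise the first
usable entry) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NFS4_OK = 0
NFS4ERR_REFERRAL = -1  # placeholder only; no value is assigned in this draft


@dataclass
class Location:
    """Simplified stand-in for location4 with its options already decoded."""
    target: str                 # e.g. an NFS URL
    read_write: bool = False    # NAME_ACCESS option set to NAME_RW
    master: bool = False        # NAME_MASTER option


@dataclass
class RLookupResult:
    """Simplified stand-in for RLOOKUP4res."""
    status: int
    locations: List[Location] = field(default_factory=list)


def choose_location(res: RLookupResult, need_write: bool) -> Optional[Location]:
    """Return None for an ordinary lookup, else the location to follow."""
    if res.status != NFS4ERR_REFERRAL:
        return None
    usable = [loc for loc in res.locations if loc.read_write or not need_write]
    if not usable:
        raise LookupError("referral offers no location with the required access")
    # Hypothetical policy: prefer the master copy, otherwise the first listed.
    for loc in usable:
        if loc.master:
            return loc
    return usable[0]
```

Because all locations arrive in the same response, the client can
make this choice without a further round-trip to the referring
server.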
5.5. An Example
The client calls:
Fd = open("/nfs/sun.com/corp/data/spreadsheet.pdf", ...);
The following traffic would result:
SLP SrvRqst "Global_NFS" --> Broadcast
SLP SrvRply "master1:/, master2:/" <-- SLP server
NFS COMPOUND {putrootfh rlookup nfs rlookup sun.com rlookup corp
rlookup data open spreadsheet.pdf} --> master1
NFS { putrootfh OK rlookup OK rlookup OK rlookup OK
rlookup EREFER corp:/stuff} <-- master1
NFS COMPOUND {putrootfh rlookup stuff rlookup data
open spreadsheet.pdf } --> corp
NFS { putrootfh OK rlookup OK rlookup OK
rlookup EREFER cdata:/finance } <-- corp
NFS COMPOUND {putrootfh rlookup finance rlookup data
open spreadsheet.pdf } --> cdata
NFS { putrootfh OK rlookup OK rlookup OK open OK } <-- cdata
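The referral-chasing behaviour in this exchange can be simulated
without any network code. The sketch below is one plausible reading
of the client logic, not a definitive implementation: servers are
modelled as nested dictionaries, a junction is modelled as a
(server, path) tuple that replaces the component being looked up, and
the server contents, the MAX_REFERRALS limit and all function names
are hypothetical.

```python
from typing import Dict, List, Tuple

MAX_REFERRALS = 8  # hypothetical guard against referral loops


def split_path(path: str) -> List[str]:
    return [c for c in path.split("/") if c]


def resolve(servers: Dict[str, dict], server: str, path: str) -> Tuple[str, str]:
    """Return the (server, path) on which the named object finally lives."""
    components = split_path(path)
    hops = 0
    while True:
        node = servers[server]                 # PUTROOTFH on the current server
        walked: List[str] = []
        for i, name in enumerate(components):
            if not isinstance(node, dict) or name not in node:
                raise FileNotFoundError("/" + "/".join(walked + [name]))
            child = node[name]                 # RLOOKUP name
            if isinstance(child, tuple):       # junction: NFS4ERR_REFERRAL
                hops += 1
                if hops > MAX_REFERRALS:
                    raise RuntimeError("too many referrals")
                server, target = child
                components = split_path(target) + components[i + 1:]
                break                          # restart the walk on the new server
            node = child
            walked.append(name)
        else:
            return server, "/" + "/".join(walked)


# Hypothetical server contents, loosely following the names used above.
EXAMPLE_SERVERS = {
    "master1": {"nfs": {"sun.com": {"corp": ("corp", "/stuff")}}},
    "corp": {"stuff": {"data": ("cdata", "/finance")}},
    "cdata": {"finance": {"spreadsheet.pdf": None}},   # None marks a file
}
```

With this data, resolving "/nfs/sun.com/corp/data/spreadsheet.pdf"
starting at master1 crosses two junctions and ends on cdata.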
6. Server-Namespace Interaction
Though this document intends to specify the client-server
interactions of the namespace, it is interesting to speculate on how
servers will construct the namespace abstraction for the client.
There are two main ways to do this, which differ in where the "real"
data lives.
6.1. Use The Directory
In this scenario, the real home of the data is in a set of
interrelated nodes in an LDAP directory. The server enumerates a
list of junction points from the directory and marks those nodes as
requiring special handling, and accesses to these nodes result in an
LDAP lookup to find the latest data to return to the client. This
group would standardize an LDAP schema and management would be via
LDAP tools. This has the benefit that an LDAP schema would be a
well-understood concept and that tools should be available to manage
it. A disadvantage is that NFS server implementations are usually
embedded in the operating system kernel, requiring LDAP lookups to
involve a user-level daemon. Also, unavailability of the LDAP
service will cause issues for the server.
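The directory-backed approach and its availability concern can be
sketched as follows. This is an illustration only: the directory is
represented by an arbitrary lookup callable rather than a real LDAP
client, the class and exception names are hypothetical, and the
fallback to the last known answer is an assumed mitigation that this
draft does not propose.

```python
from typing import Callable, Dict, List


class DirectoryUnavailable(Exception):
    """Raised by the lookup callable when the directory cannot be reached."""


class JunctionTable:
    def __init__(self, directory_lookup: Callable[[str], List[str]]):
        self._lookup = directory_lookup
        self._last_known: Dict[str, List[str]] = {}

    def locations(self, junction: str) -> List[str]:
        """Consult the directory on each access to a junction node."""
        try:
            locs = self._lookup(junction)
        except DirectoryUnavailable:
            # Assumed mitigation: reuse the last answer seen, if there is one.
            if junction in self._last_known:
                return self._last_known[junction]
            raise
        self._last_known[junction] = locs
        return locs
```

In a kernel-resident server the lookup callable would correspond to
the upcall to a user-level daemon mentioned above.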
6.2. Use Replication
In this scenario, the new virtual junction becomes an actual
filesystem object, and contains the data needed by the client. The
junction object could be created on the master filesystem and
propagated by filesystem replication as defined in [REPLMIG]. For
manageability, an SNMP MIB could be defined to enumerate all junction
points in a particular filesystem and to manipulate their properties.
A management tool would construct an image of the namespace by
consulting the root of the global filesystem and walking down as
needed. This has the advantage that servers would always have data
to give to clients, and that changes in the linkage of filesystems
would be identical to other changes to the linkage of directories in
the filesystem as far as the client could see.
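A management tool of the kind described here could build its image of
the namespace with a simple recursive walk. The sketch below is a
minimal illustration under the same simplifying assumptions as a toy
model: each server is a nested dictionary, a junction object is a
(server, path) tuple, and the data and function names are
hypothetical.

```python
from typing import Dict, Iterator, Optional, Set, Tuple

Junction = Tuple[str, str, str]   # (namespace path, target server, target path)


def _subtree(tree: dict, path: str):
    node = tree
    for name in (c for c in path.split("/") if c):
        node = node[name]
    return node


def _walk(servers, node: dict, prefix: str, seen) -> Iterator[Junction]:
    for name in sorted(node):
        child = node[name]
        here = f"{prefix}/{name}"
        if isinstance(child, tuple):           # junction object
            target_server, target_path = child
            yield here, target_server, target_path
            yield from walk_namespace(servers, target_server, target_path, here, seen)
        elif isinstance(child, dict):          # ordinary directory
            yield from _walk(servers, child, here, seen)


def walk_namespace(servers: Dict[str, dict], server: str, path: str = "/",
                   prefix: str = "",
                   seen: Optional[Set[Tuple[str, str]]] = None) -> Iterator[Junction]:
    """Yield every junction reachable from the given root, depth first."""
    seen = set() if seen is None else seen
    if (server, path) in seen:                 # guard against referral loops
        return
    seen.add((server, path))
    node = _subtree(servers[server], path)
    if isinstance(node, dict):
        yield from _walk(servers, node, prefix, seen)


# Hypothetical namespace contents for illustration.
EXAMPLE_NAMESPACE = {
    "master1": {"nfs": {"sun.com": {"corp": ("corp", "/stuff")}}},
    "corp": {"stuff": {"data": ("cdata", "/finance")}},
    "cdata": {"finance": {"spreadsheet.pdf": None}},
}
```

Calling list(walk_namespace(EXAMPLE_NAMESPACE, "master1")) lists the
two junctions in this toy namespace together with the filesystems
they refer to.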
7. Appendix A: XDR Protocol Definition File
/*
* Copyright (C) The Internet Society (2003).
* All Rights Reserved.
*/
/*
* node.x
*/
%#pragma ident "@(#)node.x 1.2 03/05/21"
enum nodetype4 {
    NAME_NFS_URL = 1,
    NAME_NFS_IP = 2,
    NAME_SMB = 3
};

enum ipaddrtype {
    NAME_IPV4 = 1,
    NAME_IPV6 = 2
};

struct nameipnode4 {
    ipaddrtype type;
    opaque addr<NAME_MAX_ADDR>;
    opaque path<>;
};

union namenode4 switch (nodetype4 type) {
    case NAME_NFS_URL:
        /* nfs://server.domain.com/export/dir1/dir2 */
        opaque nfs_url<NAME_MAX_URL>;
    case NAME_NFS_IP:
        /* 10.0.0.2 + /export/dir1/dir2 */
        nameipnode4 nfs_ip;
    case NAME_SMB:
        /* As defined by:
           http://www.ietf.org/internet-drafts/draft-crhertel-smb-url-04.txt */
        /* smb://server.domain.com/export/dir1/dir2 */
        opaque smb_url<NAME_MAX_URL>;
};

enum opttype4 {
    NAME_ACCESS = 1,
    NAME_MASTER = 2,
    NAME_VOLID = 3,
    NAME_STRING = 4
};

enum access4 {
    NAME_RO = 0,
    NAME_RW = 1
};

struct keyvalue {
    opaque key<>;
    opaque value<>;
};

union options4 switch (opttype4 type) {
    case NAME_ACCESS:   /* ro/rw */
        access4 acc;
    case NAME_MASTER:   /* master true/false */
        bool master;
    case NAME_VOLID:    /* volume ID if multiple paths to master */
        int64 volid;
    case NAME_STRING:   /* generic string=value option */
        keyvalue kv;
};

struct location4 {
    namenode4 loc;
    options4 opts<>;
};

struct RLOOKUP4args {
    /* CURRENT_FH: directory */
    component4 objname;
};

union referral4 switch (nfsstat4 status) {
    case NFS4ERR_REFERRAL:
        location4 locarray<>;
    default:
        void;
};

struct RLOOKUP4res {
    /* CURRENT_FH: object */
    referral4 refer;
};
8. Full Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
9. Intellectual property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.
10. Normative References
[RFC1831]
R. Srinivasan, "RPC: Remote Procedure Call Protocol Specification
Version 2", RFC1831, August 1995.
[RFC1832]
R. Srinivasan, "XDR: External Data Representation Standard", RFC1832,
August 1995.
[RFC2165]
J. Veizades, E. Guttman, C. Perkins, S. Kaplan, "Service Location
Protocol", RFC2165, June 1997.
[RFC3530]
S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M.
Eisler, D. Noveck, "Network File System (NFS) Version 4 Protocol",
RFC3530, April 2003.
[RFC2608]
E. Guttman, C. Perkins, J. Veizades, M. Day, "Service Location
Protocol, Version 2", RFC2608, June 1999.
11. Informative References
[RFC3010]
S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M.
Eisler, D. Noveck, "NFS version 4 Protocol", RFC3010, December 2000.
[NS_PROBLEM]
C. Charles Fan, "NFSv4 Global Namespace Problem Statement", draft-
fan-nfsv4-global-namespace-problem-statement-00.txt, February 2005.
12. Author's Address
Address comments related to this memorandum to:
nfsv4-wg@sunroof.eng.sun.com
Robert Thurlow
Sun Microsystems, Inc.
500 Eldorado Boulevard, UBRM05-171
Broomfield, CO 80021
Phone: 877-718-3419
E-mail: robert.thurlow@sun.com