Hi-
(Sorry for the length; it's a pretty wonky problem.)
I'm working with the IETF NFSv4 working group on a schema for storing file system
referral information in LDAP, as part of the FedFS standard (RFC 5716). I'm looking
for some opinions about certain details of the schema.
A "referral" is a file server response that conveys a table of (file server hostname,
export pathname) pairs, each naming a replica of the local file set on that file
server. When a referral is encountered, a file system client chooses a row from this
table and mounts that export automatically as it continues to traverse the file system
name space. Referrals are a standard part of both the NFSv4 and SMB protocols.
FedFS provides a standard way to store these rows in an LDAP database. Each row is
contained in a single LDAP record, called a File Set Location record. A group of these
records that live under the same parent record is retrieved by a file server to generate
the table in a referral response.
Today, an FSL record for an NFS referral contains, among other things, a server hostname
stored as a UTF-8 string, an integer port number, and a binary blob containing an
XDR-marshalled representation of the export pathname. Note that both the pathname components and
hostname are represented in UTF-8 in the NFS protocol, which is why they are stored as
UTF-8 in LDAP.
XDR was chosen because the file server doesn't have to alter the pathname data it
reads from the LDAP server; it can just turn it around and send it immediately on to NFS
clients. The pathname's components are UTF-8 strings. The pathname is expressed as
an ordered variable-length list of these strings.
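To make the blob layout concrete, here is a minimal Python sketch. It assumes the blob uses the usual XDR encoding for a variable-length array of variable-length opaque items: a 4-byte big-endian count, then for each component a 4-byte length and the UTF-8 bytes zero-padded to a 4-byte boundary. The function names are made up for illustration and are not part of any FedFS specification.

```python
import struct

def xdr_encode_path(components):
    """Encode a list of str components as an XDR array of opaque items."""
    out = [struct.pack(">I", len(components))]        # number of components
    for comp in components:
        data = comp.encode("utf-8")
        pad = (4 - len(data) % 4) % 4                  # XDR pads to 4 bytes
        out.append(struct.pack(">I", len(data)) + data + b"\x00" * pad)
    return b"".join(out)

def xdr_decode_path(blob):
    """Decode the blob produced above back into a list of str components."""
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    components = []
    for _ in range(count):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        components.append(blob[offset:offset + length].decode("utf-8"))
        offset += length + (4 - length % 4) % 4        # skip data and padding
    return components
```

Note that no separator character appears anywhere in the encoded bytes; the component boundaries are carried entirely by the length fields.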
The pathname separator is not stored in the XDR blob, since physical file systems can use
different characters for this purpose (HFS+ on Mac OS uses ":", POSIX uses
"/", and Windows uses "\"). NFS typically performs single component
lookups on the wire, so NFS clients are never concerned with how a file server might
separate its pathname components.
The downside of using a binary XDR blob is that it's not observable or editable via
typical LDAP tools. Plus, ewww.
It's been suggested that we use a file URL to represent export pathnames. A file URL
is expressed in US-ASCII with escaping, and the pathname separator is stored in the
string. A file URL also has the ability to store a hostname.
"file://hostname/path/to/some/file"
I'm not sure this is the best fit for our purpose. We're especially concerned
about some of the complexities of converting escaped US-ASCII to UTF-8, and the use of a
fixed pathname separator character. Can a US-ASCII file URL represent the full range of
characters that UTF-8 can encode?
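As a rough way to explore that question, here is a Python sketch that percent-encodes each component's UTF-8 bytes separately, so that a "/" embedded in a component becomes %2F and stays distinct from the URL's own separator. It assumes generic URI percent-escaping and uses the standard urllib.parse functions; the helper names are hypothetical, and the hostname part of the URL is deliberately left out.

```python
from urllib.parse import quote, unquote

def components_to_url_path(components):
    """Percent-encode each component (as UTF-8) and join with '/'."""
    # safe="" forces '/' inside a component to be escaped as %2F.
    return "/" + "/".join(quote(c, safe="") for c in components)

def url_path_to_components(path):
    """Split on the literal '/' first, then undo the percent-encoding."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    if path == "/":
        return []
    return [unquote(part) for part in path[1:].split("/")]
```

In this sketch the order of operations matters on the way back: splitting before unescaping is what keeps an escaped separator inside its component.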
We could also use an NFS URL, which would allow us to express the server hostname, a port
number, and the pathname in a single string. But both the hostname and pathname are
encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator
character.
An alternative we have considered would store the pathname in a single-valued UTF-8 string
attribute, including pathname separators, but also store the pathname separator character
in a separate attribute. A simple escaping mechanism would be used to represent a
separator character embedded in a component.
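A minimal Python sketch of how that alternative might work follows. The escape character, its default value, and the function names are all assumptions for illustration; nothing here has been agreed on. The separator is passed in as it would be read from the separate attribute, and the escape character must differ from it (so a server whose separator is "\" would need a different escape character). Components are assumed to be non-empty.

```python
def join_path(components, sep, esc="\\"):
    """Build the single-string form, escaping sep and esc inside components."""
    if len(sep) != 1 or len(esc) != 1 or sep == esc:
        raise ValueError("sep and esc must be distinct single characters")
    def escape(comp):
        # Escape the escape character first, then the separator.
        return comp.replace(esc, esc + esc).replace(sep, esc + sep)
    return sep + sep.join(escape(c) for c in components)

def split_path(path, sep, esc="\\"):
    """Recover the ordered component list from the single-string form."""
    if len(sep) != 1 or len(esc) != 1 or sep == esc:
        raise ValueError("sep and esc must be distinct single characters")
    if not path.startswith(sep):
        raise ValueError("path must start with the separator")
    if path == sep:
        return []
    components, current = [], []
    chars = iter(path[1:])
    for ch in chars:
        if ch == esc:
            nxt = next(chars, None)        # take the escaped character literally
            if nxt is None:
                raise ValueError("dangling escape character")
            current.append(nxt)
        elif ch == sep:
            components.append("".join(current))
            current = []
        else:
            current.append(ch)
    components.append("".join(current))
    return components
```

For example, with "/" as the stored separator, the components "export" and "a/b" would be stored as /export/a\/b, which remains readable and editable as ordinary text.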
We'd like to have a schema that represents referral data in a way that is considered
natural for LDAP, can store the full richness that an NFS referral is capable of, and is
easy to access and update with typical LDAP client tools like ldapmodify.
Are there other ideas we haven't considered? What is a practical way to store an
ordered variable-length list of strings in an LDAP attribute? Is there a similar CIFS URL
format that might be used to store SMB share information?
Thanks very much for your consideration.
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com