Hi-
(Sorry for the length, it's a pretty wonky problem).
I'm working with the IETF NFSv4 working group on a schema for storing file system referral information in LDAP, as part of the FedFS standard (RFC 5716). I'm looking for some opinions about certain details of the schema.
A "referral" is a file server response that conveys a table of file server hostname and export pathname pairs that are replicas of the local file set on that file server. When a referral is encountered, a file system client chooses a row from this table and mounts that export automatically as it continues to traverse the file system name space. Referrals are a standard part of both the NFSv4 and SMB protocols.
FedFS provides a standard way to store these rows in an LDAP database. Each row is contained in a single LDAP record, called a File Set Location record. A group of these records that live under the same parent record is retrieved by a file server to generate the table in a referral response.
Today an FSL record for an NFS referral contains among other things a UTF-8 string server hostname, an integer port number, and a binary blob containing an XDR-marshalled representation of the export pathname. Note that both the pathname components and hostname are represented in UTF-8 in the NFS protocol, which is why they are stored as UTF-8 in LDAP.
XDR was chosen because the file server doesn't have to alter the pathname data it reads from the LDAP server; it can just turn it around and send it immediately on to NFS clients. The pathname's components are UTF-8 strings. The pathname is expressed as an ordered variable-length list of these strings.
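To make the layout concrete, here is a minimal Python sketch of how such a blob could be built and parsed. It assumes the pathname is marshalled as a standard XDR variable-length array of variable-length opaques: a 4-byte big-endian count, then each component as a 4-byte length followed by its bytes, zero-padded to a 4-byte boundary. The function names are made up for illustration and are not taken from any FedFS implementation.

```python
import struct


def xdr_encode_path(components):
    """Marshal a list of str components as an XDR array of opaques (assumed layout)."""
    out = [struct.pack(">I", len(components))]        # element count
    for comp in components:
        data = comp.encode("utf-8")
        pad = (4 - len(data) % 4) % 4                 # XDR pads to 4-byte boundary
        out.append(struct.pack(">I", len(data)) + data + b"\x00" * pad)
    return b"".join(out)


def xdr_decode_path(blob):
    """Inverse of xdr_encode_path: return the list of str components."""
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    components = []
    for _ in range(count):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        components.append(blob[offset:offset + length].decode("utf-8"))
        offset += length + (4 - length % 4) % 4       # skip data and padding
    return components
```

Note that no separator character appears anywhere in the blob; the component boundaries are carried by the length fields alone.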
The pathname separator is not stored in the XDR blob, since physical file systems can use different characters for this purpose (HFS+ on Mac OS uses ":", POSIX uses "/", and Windows uses "\"). NFS typically performs single component lookups on the wire, so NFS clients are never concerned with how a file server might separate its pathname components.
The downside of using a binary XDR blob is that it's not observable or editable via typical LDAP tools. Plus, ewww.
It's been suggested that we use a file URL to represent export pathnames. A file URL is expressed in US-ASCII with escaping, and the pathname separator is stored in the string. A file URL also has the ability to store a hostname.
"file://hostname/path/to/some/file"
I'm not sure this is the best fit for our purpose. We're especially concerned about some of the complexities of converting escaped US-ASCII to UTF-8, and the use of a fixed pathname separator character. Can we represent the full range of the UTF-8 code set with a US-ASCII file URL?
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
An alternative we have considered would store the pathname in a single-valued UTF-8 string attribute, including pathname separators, but also store the pathname separator character in a separate attribute. A simple escaping mechanism would be used to represent a separator character embedded in a component.
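As a rough illustration of this alternative, here is a Python sketch that joins and splits components around a configurable separator. The escaping scheme is purely an assumption for the example (a backslash escapes the next character); no particular mechanism has been chosen. The function names are hypothetical, and the sketch assumes the separator is a single character different from the escape character.

```python
def join_path(components, sep="/", esc="\\"):
    """Join components with sep, escaping any embedded sep or esc (assumed scheme)."""
    escaped = []
    for comp in components:
        # Escape the escape character first, then the separator.
        escaped.append(comp.replace(esc, esc + esc).replace(sep, esc + sep))
    return sep.join(escaped)


def split_path(path, sep="/", esc="\\"):
    """Split a string produced by join_path back into its components."""
    components, current = [], []
    chars = iter(path)
    for ch in chars:
        if ch == esc:
            current.append(next(chars, ""))   # take the escaped character literally
        elif ch == sep:
            components.append("".join(current))
            current = []
        else:
            current.append(ch)
    components.append("".join(current))
    return components
```

One wrinkle this exposes: a separator of "\" would collide with the escape character used here, so a real scheme would need an escape that cannot be a separator.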
We'd like to have a schema that represents referral data in a way that is considered natural for LDAP, can store the full richness that an NFS referral is capable of, and is easy to access and update with typical LDAP client tools like ldapmodify.
Are there other ideas we haven't considered? What is a practical way to store an ordered variable-length list of strings in an LDAP attribute? Is there a similar CIFS URL format that might be used to store SMB share information?
Thanks very much for your consideration.
Chuck Lever wrote:
The downside of using a binary XDR blob is that it's not observable or editable via typical LDAP tools.
Some tools allow implementing plugins for certain attributes. But of course it's rather cumbersome.
It's been suggested that we use a file URL to represent export pathnames. A file URL is expressed in US-ASCII with escaping, [..] Can we represent the full range of the UTF-8 code set with a US-ASCII file URL?
Yes, of course, just as HTTP URLs can contain non-ASCII chars in a URL-quoted form. You first encode to UTF-8 and then URL-quote. Decoding means URL-unquote and then decode UTF-8 to Unicode char entities.
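The two steps described here can be sketched in Python with the standard urllib.parse module, whose quote() encodes to UTF-8 before percent-escaping by default. The helper names are hypothetical. Quoting each component separately with safe="" is a choice made for this sketch, so that a "/" embedded in a component stays distinct from the fixed separator.

```python
from urllib.parse import quote, unquote


def components_to_url_path(components):
    """Encode each component to UTF-8, percent-escape it, and join with '/'."""
    return "/" + "/".join(quote(c, safe="") for c in components)


def url_path_to_components(path):
    """Split on '/', then URL-unquote and decode UTF-8 for each component."""
    return [unquote(p) for p in path.split("/")[1:]]
```

For example, the component "müller" becomes "m%C3%BCller", and a component "a/b" becomes "a%2Fb", so the result is plain US-ASCII and still round-trips.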
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
That's what I would prefer. Think of file browsers which can open the NFS mount point just by clicking on it. Same encoding steps as with file URLs.
An alternative we have considered would store the pathname in a single-valued UTF-8 string attribute, including pathname separators, but also store the pathname separator character in a separate attribute. A simple escaping mechanism would be used to represent a separator character embedded in a component.
I would not do this.
Ciao, Michael.
Michael Ströder wrote:
Chuck Lever wrote:
Can we represent the full range of the UTF-8 code set with a US-ASCII file URL?
Yes, of course, just as HTTP URLs can contain non-ASCII chars in a URL-quoted form. You first encode to UTF-8 and then URL-quote. Decoding means URL-unquote and then decode UTF-8 to Unicode char entities.
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
That's what I would prefer. Think of file browsers which can open the NFS mount point just by clicking on it. Same encoding steps as with file URLs.
This seems the most obvious and natural solution (NFS URL). After all, you are specifying an NFS resource...
On Aug 9, 2012, at 2:46 PM, Howard Chu wrote:
Michael Ströder wrote:
Chuck Lever wrote:
Can we represent the full range of the UTF-8 code set with a US-ASCII file URL?
Yes, of course, just as HTTP URLs can contain non-ASCII chars in a URL-quoted form. You first encode to UTF-8 and then URL-quote. Decoding means URL-unquote and then decode UTF-8 to Unicode char entities.
OK, that makes sense.
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
That's what I would prefer. Think of file browsers which can open the NFS mount point just by clicking on it. Same encoding steps as with file URLs.
This seems the most obvious and natural solution (NFS URL). After all, you are specifying an NFS resource...
I've looked more closely at this idea. While it's got some surface appeal, NFS URLs (RFC 2224) don't specify a generic NFS resource. They specify a WebDAV-like resource that can be accessed with NFS, called WebNFS (RFC 2054, RFC 2055), which gives clients access via a so-called "public file handle," which is a degenerate NFS FH.
WebNFS is defined only for legacy versions of NFS, not for NFSv4. Referrals are supported only in NFSv4. In fact, section 4 of RFC 2224 specifies that clients try version 3 then version 2. NFS version 4 is not discussed.
Thus, the form of an NFS URL might be rich enough, but the existing semantics are not equivalent.
There may still be value, however, in understanding any issues around expressing a single pathname string with a fixed separator character and how we can use that to express a list of UTF-8 pathname components.
Otherwise I think we would have to specify a new NFS URL format to get what we need.
Chuck Lever wrote:
On Aug 9, 2012, at 2:46 PM, Howard Chu wrote:
Michael Ströder wrote:
Chuck Lever wrote:
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
That's what I would prefer. Think of file browsers which can open the NFS mount point just by clicking on it. Same encoding steps as with file URLs.
This seems the most obvious and natural solution (NFS URL). After all, you are specifying an NFS resource...
I've looked more closely at this idea. While it's got some surface appeal, NFS URLs (RFC 2224) don't specify a generic NFS resource. They specify a WebDAV-like resource that can be accessed with NFS, called WebNFS (RFC 2054, RFC 2055), which gives clients access via a so-called "public file handle," which is a degenerate NFS FH.
WebNFS is defined only for legacy versions of NFS, not for NFSv4. Referrals are supported only in NFSv4. In fact, section 4 of RFC 2224 specifies that clients try version 3 then version 2. NFS version 4 is not discussed.
Thus, the form of an NFS URL might be rich enough, but the existing semantics are not equivalent.
I see no reason why it should not be possible to use NFS URLs and define their exact usage for NFSv4. Maybe I'm overlooking something though.
Ciao, Michael.
On Aug 9, 2012, at 1:40 PM, Michael Ströder wrote:
Chuck Lever wrote:
It's been suggested that we use a file URL to represent export pathnames. A file URL is expressed in US-ASCII with escaping, [..] Can we represent the full range of the UTF-8 code set with a US-ASCII file URL?
Yes, of course, just as HTTP URLs can contain non-ASCII chars in a URL-quoted form. You first encode to UTF-8 and then URL-quote. Decoding means URL-unquote and then decode UTF-8 to Unicode char entities.
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
That's what I would prefer. Think of file browsers which can open the NFS mount point just by clicking on it. Same encoding steps as with file URLs.
One final question about this.
NFS fs_locations data can contain, as the file server's hostname: an i18n DNS label, an IPv4 presentation address, or an IPv6 presentation address. Any of these can be specified with or without a port number.
The problem lies with IPv6 presentation addresses with a port specified. For a URL the form is generally:
"nfs://" presentation-address ":" port "/" pathname
But as we all know, IPv6 addresses have a variable number of colons in them.
For NFS administrative interfaces, we generally just escape an IPv6 presentation address by surrounding it with square brackets. Then it's easy to recognize and pick off the ":port".
What is the appropriate URL syntax for specifying an IPv6 presentation address with a port?
Chuck Lever wrote:
On Aug 9, 2012, at 1:40 PM, Michael Ströder wrote:
Chuck Lever wrote:
It's been suggested that we use a file URL to represent export pathnames. A file URL is expressed in US-ASCII with escaping, [..] Can we represent the full range of the UTF-8 code set with a US-ASCII file URL?
Yes, of course, just as HTTP URLs can contain non-ASCII chars in a URL-quoted form. You first encode to UTF-8 and then URL-quote. Decoding means URL-unquote and then decode UTF-8 to Unicode char entities.
We could also use an NFS URL, which would allow us to express the server hostname, a port number, and the pathname in a single string. But both the hostname and pathname are encoded in US-ASCII, not UTF-8, and the NFS URL format employs a fixed pathname separator character.
That's what I would prefer. Think of file browsers which can open the NFS mount point just by clicking on it. Same encoding steps as with file URLs.
One final question about this.
NFS fs_locations data can contain, as the file server's hostname: an i18n DNS label, an IPv4 presentation address, or an IPv6 presentation address. Any of these can be specified with or without a port number.
The problem lies with IPv6 presentation addresses with a port specified. For a URL the form is generally:
"nfs://" presentation-address ":" port "/" pathname
But as we all know, IPv6 addresses have a variable number of colons in them.
For NFS administrative interfaces, we generally just escape an IPv6 presentation address by surrounding it with square brackets. Then it's easy to recognize and pick off the ":port".
What is the appropriate URL syntax for specifying an IPv6 presentation address with a port?
Come on, Chuck. Do your own homework. RFC2732, from 1999. There's nothing OpenLDAP- or even LDAP-specific about that question.
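For readers who land on this thread later: RFC 2732 puts the IPv6 literal in square brackets inside the URL, the same convention mentioned above for NFS administrative interfaces (that syntax was later carried over into RFC 3986). A minimal Python sketch follows, using the standard urllib.parse module. The function names are hypothetical, and the address comes from the 2001:db8::/32 documentation prefix.

```python
from urllib.parse import urlsplit


def format_nfs_url(host, port=None, path="/"):
    """Build an nfs:// URL, bracketing the host if it looks like an IPv6 literal."""
    if ":" in host:                       # only IPv6 presentation addresses contain ':'
        host = "[" + host + "]"
    netloc = host if port is None else f"{host}:{port}"
    return f"nfs://{netloc}{path}"


def parse_host_port(url):
    """Return (host, port); brackets are stripped and port is None if absent."""
    parts = urlsplit(url)
    return parts.hostname, parts.port
```

With the brackets in place, the last colon outside them unambiguously introduces the port, so "nfs://[2001:db8::1]:2049/export/home" parses to host "2001:db8::1" and port 2049.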
openldap-technical@openldap.org