New subject: Dynamic syntax support vs. slapo-constraint vs. schema declaration vs. whatever

31 Oct 2007


Here's a project for someone fairly ambitious:
The goal is to be able to dynamically add support for new syntaxes to slapd 
simply by feeding in their specifications. Typically this would mean feeding 
in an ABNF description from an IETF document, though it may also mean feeding 
in ASN.1 documents. E.g., we want to be able to insert (using ldapmodify) the 
ABNF into an LDAP entry somewhere and have it automatically converted into an 
executable parser.
There are a couple of tools out there for working with ABNF. The IETF actually 
uses this one
http://www.coasttocoastresearch.com/Apg5_0/Apg5_0_description.htm
for validating IETF drafts.
There's also this one
http://www.2p.cz/en/abnf_gen
I don't think either of these is suitable as it exists, because they don't 
generate output that is embeddable and directly executable. I guess you could 
feed the generated C code into an embedded C interpreter, but that seems like 
a ridiculously wasteful solution. Most of these existing tools will analyze a 
grammar and spit out source code to recognize it, but we don't want the source 
code; we just want the recognition action.
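To make the "recognition action, not source code" point concrete, here is a rough sketch in Python (not the proposed Forth engine, and nothing taken from slapd) that builds a recognizer in memory directly from the structure of a grammar. It does not parse ABNF text; the rule is transcribed by hand from the numericoid production in RFC 4512. The helper names char_range, seq, alt, repeat and validate are made up for this illustration. Each matcher returns the set of positions where a match could end, so alternation is treated as unordered, which is an assumption about how one would want ABNF handled.

```python
from typing import Callable, FrozenSet

# A matcher takes (text, start) and returns every position where a match
# beginning at `start` could end. An empty set means "no match".
Matcher = Callable[[str, int], FrozenSet[int]]


def char_range(lo: str, hi: str) -> Matcher:
    """Match a single character in the inclusive range lo..hi."""
    def match(text: str, i: int) -> FrozenSet[int]:
        if i < len(text) and lo <= text[i] <= hi:
            return frozenset({i + 1})
        return frozenset()
    return match


def seq(*parts: Matcher) -> Matcher:
    """Match each part in order, tracking all possible end positions."""
    def match(text: str, i: int) -> FrozenSet[int]:
        ends = {i}
        for part in parts:
            ends = {e for start in ends for e in part(text, start)}
        return frozenset(ends)
    return match


def alt(*choices: Matcher) -> Matcher:
    """Match any one of the choices (unordered alternation)."""
    def match(text: str, i: int) -> FrozenSet[int]:
        return frozenset().union(*(c(text, i) for c in choices))
    return match


def repeat(part: Matcher, minimum: int = 0) -> Matcher:
    """Match `part` at least `minimum` times (ABNF's n*element)."""
    def match(text: str, i: int) -> FrozenSet[int]:
        results = {i} if minimum == 0 else set()
        frontier, count = {i}, 0
        # The bound keeps a part that matches the empty string from looping.
        while frontier and count <= len(text):
            frontier = {e for start in frontier for e in part(text, start)}
            count += 1
            if count >= minimum:
                results |= frontier
        return frozenset(results)
    return match


def validate(rule: Matcher, text: str) -> bool:
    """A value is valid only if the rule can consume the whole string."""
    return len(text) in rule(text, 0)


# Hand transcription of two RFC 4512 rules:
#   number     = DIGIT / ( LDIGIT 1*DIGIT )
#   numericoid = number 1*( DOT number )
DIGIT = char_range("0", "9")
LDIGIT = char_range("1", "9")
DOT = char_range(".", ".")
number = alt(DIGIT, seq(LDIGIT, repeat(DIGIT, 1)))
numericoid = seq(number, repeat(seq(DOT, number), 1))

print(validate(numericoid, "2.5.4.3"))  # True
print(validate(numericoid, "1.03"))     # False (leading zero)
```

Nothing here is compiled or written out as source; the "parser" is just a tree of closures that exists only at run time. A Forth engine would do the same job with compiled words instead of closures.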
I think a more practical route will involve embedding a Forth engine into the 
OpenLDAP code. There's already been a fair amount of work done with BNF in 
Forth, so it shouldn't be too hard to get started:
http://www.zetetics.com/bj/papers/bnfparse.htm
http://www-personal.umich.edu/~williams/archive/forth/strings/index.html
http://www-personal.umich.edu/~williams/archive/forth/strings/expr.html
The nice thing about Forth is that the core language is tiny, it's extremely 
portable, it's extremely extensible, and any implementation is inherently both 
an interpreter and a compiler. As interpreted languages go, it beats pretty 
much every other common scripting language for memory and CPU efficiency.
It ought to be straightforward to convert a grammar into a set of Forth words 
to recognize the grammar, for a slapd syntax validator. I'm not sure yet what 
we would do for Prettifiers or Normalizers, nor what to do for Matching rules. 
We could simply accept them as Forth source code. Personally I haven't written 
Forth in over 20 years, but it seems to me that a threaded interpretive language 
(TIL) is the most suitable type of language for the job.
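As a rough illustration of the "set of words to recognize the grammar" idea, here is a toy engine in Python. It is not Forth and not a real TIL, just a stand-in for the shape of the thing: a few primitives are built in, and new words arrive as text at run time, the way a definition might arrive in an LDAP entry. The ": name ... ;" notation, the "|" separator (ordered choice, first match wins), the "+" suffix for one-or-more, and the names Engine, define and validate are all invented for this sketch. The example rule is NumericString = 1*(DIGIT / SPACE) from RFC 4517.

```python
class Engine:
    """Toy word dictionary: primitives plus words defined from text at run time."""

    def __init__(self):
        self.text = ""
        self.pos = 0
        # Built-in primitive words. Each returns True and advances on a match,
        # and leaves the position untouched on failure.
        self.words = {
            "digit": lambda: self._char("0123456789"),
            "space": lambda: self._char(" "),
        }

    def _char(self, allowed):
        if self.pos < len(self.text) and self.text[self.pos] in allowed:
            self.pos += 1
            return True
        return False

    def define(self, source):
        """Add a word from text of the (hypothetical) form ': name body ;'."""
        tokens = source.split()
        if len(tokens) < 3 or tokens[0] != ":" or tokens[-1] != ";":
            raise ValueError("expected ': name body ;'")
        name, body = tokens[1], tokens[2:-1]
        alternatives = [[]]
        for tok in body:
            if tok == "|":
                alternatives.append([])
            else:
                alternatives[-1].append(tok)
        self.words[name] = lambda: self._run(alternatives)

    def _run(self, alternatives):
        start = self.pos
        for sequence in alternatives:
            if all(self._call(tok) for tok in sequence):
                return True
            self.pos = start  # undo partial progress before the next alternative
        return False

    def _call(self, tok):
        if not tok.endswith("+"):
            return self.words[tok]()
        word = self.words[tok[:-1]]  # "name+" means one or more of "name"
        if not word():
            return False
        while True:
            before = self.pos
            # Stop when the word fails or matches without consuming input.
            if not word() or self.pos == before:
                return True

    def validate(self, name, text):
        """True if the named word consumes the entire text."""
        self.text, self.pos = text, 0
        return self.words[name]() and self.pos == len(text)


engine = Engine()
# These two definitions could just as well have been read from an entry.
engine.define(": numchar digit | space ;")
engine.define(": numeric-string numchar+ ;")

print(engine.validate("numeric-string", "123 456"))  # True
print(engine.validate("numeric-string", "12a"))      # False
```

Words are looked up by name when they run, so the validator for a new syntax is just a handful of extra dictionary entries and the engine itself never changes. Prettifiers, normalizers and matching rules would need words that produce output rather than a yes/no answer, which this sketch does not attempt.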
On the whole it doesn't seem like new syntaxes pop up in LDAP specs very 
often. Some may say this is because they're not really needed. I believe 
they've been sorely needed, and people have just avoided them because they 
required too much work to properly support in existing implementations. We can 
fix that.
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/