Re: LMDB Fixed memory address mapping documentation and example(s)? - openldap-devel

13 Jun 2018


John Daly wrote:
...
Hi Howard --
(cc'ing openldap-devel for general interest)
Hi John,
thanks for your interest. Unfortunately, this is a feature that we never 
fully implemented. I can give you a rundown of how it was intended to be used, 
though. You have the overall concept: you must make sure that any object 
you want to store is marshalled into a single contiguous blob.
In OpenLDAP, the back-mdb backend already does this for us when storing LDAP 
entries, but it only goes halfway.
Follow along in slap.h:
http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=blob;f=servers/sla...
We have a struct Entry with a couple of struct berval names and then a linked 
list of Attributes.
http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=blob;f=servers/sla...
Attributes point to an AttributeDescription and then have two arrays of struct 
bervals for values, and a next pointer.
So to serialize all of this into an LMDB blob we would malloc a single blob to 
hold an Entry struct contiguous with all of its Attribute structs and followed 
by all of its value arrays. The various struct fields would then point to 
addresses within this contiguous blob. This would be passed into mdb_put or 
mdb_cursor_put. Laid out appropriately, the object could be used as-is on 
read, with no deserialization needed.
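
As an illustration of the layout described above, here is a minimal sketch in C. The Val, Attr, Entry and AttrSpec types and the entry_pack() function are hypothetical, simplified stand-ins, not the real slap.h definitions or the back-mdb code. The sketch assumes pointers and size_t share the same alignment, so placing all fixed-size structs first and all string bytes last needs no padding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for struct berval, Attribute and Entry. */
typedef struct Val { size_t len; char *data; } Val;
typedef struct Attr {
    char        *name;   /* points into the blob */
    Val         *vals;   /* array of nvals values, inside the blob */
    size_t       nvals;
    struct Attr *next;   /* next attribute inside the blob, or NULL */
} Attr;
typedef struct Entry {
    Val    name;         /* name.data points into the blob */
    Attr  *attrs;        /* first attribute inside the blob, or NULL */
    size_t blob_size;    /* total size of the contiguous blob */
} Entry;

/* Input description used only by this sketch. */
typedef struct AttrSpec { const char *name; const char **vals; size_t nvals; } AttrSpec;

/* Copy a NUL-terminated string to *cursor and advance the cursor past it. */
static char *put_str(char **cursor, const char *src)
{
    size_t n = strlen(src) + 1;
    char *dst = *cursor;
    memcpy(dst, src, n);
    *cursor += n;
    return dst;
}

/* Build one blob laid out as: [Entry][Attr array][Val array][string bytes]. */
Entry *entry_pack(const char *dn, const AttrSpec *specs, size_t nattrs)
{
    size_t nvals = 0, nbytes = strlen(dn) + 1;
    for (size_t i = 0; i < nattrs; i++) {
        nbytes += strlen(specs[i].name) + 1;
        nvals += specs[i].nvals;
        for (size_t j = 0; j < specs[i].nvals; j++)
            nbytes += strlen(specs[i].vals[j]) + 1;
    }
    size_t size = sizeof(Entry) + nattrs * sizeof(Attr) + nvals * sizeof(Val) + nbytes;
    char *blob = malloc(size);
    if (blob == NULL)
        return NULL;

    Entry *e = (Entry *)blob;
    Attr  *a = (Attr *)(e + 1);
    Val   *v = (Val *)(a + nattrs);
    char  *s = (char *)(v + nvals);

    e->name.len  = strlen(dn);
    e->name.data = put_str(&s, dn);
    e->attrs     = nattrs ? a : NULL;
    e->blob_size = size;
    for (size_t i = 0; i < nattrs; i++) {
        a[i].name  = put_str(&s, specs[i].name);
        a[i].vals  = v;
        a[i].nvals = specs[i].nvals;
        a[i].next  = (i + 1 < nattrs) ? &a[i + 1] : NULL;
        for (size_t j = 0; j < specs[i].nvals; j++, v++) {
            v->len  = strlen(specs[i].vals[j]);
            v->data = put_str(&s, specs[i].vals[j]);
        }
    }
    assert((size_t)(s - blob) == size);  /* every byte of the blob is accounted for */
    return e;
}
```

The resulting blob would be handed to mdb_put as an MDB_val with mv_size set to e->blob_size and mv_data set to e. Note that the internal pointers are only valid at the address where the blob was built, which is why a relocation step is needed whenever the bytes are copied elsewhere.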
The MDB_rel_func callback on the DB would have previously been set to a 
function that knows the layout of this Entry blob, and would increment or 
decrement all of these blob-internal pointers whenever the blob itself was 
moved by an LMDB operation. The instances where this occurs are when copying a 
read-only page to make a writable copy, when moving items in a page to fill 
the gap from deleting a node, and when splitting a page to insert a new node.
This relocation part has never been implemented, because it turned out we never 
really needed it. Most LDAP entries are larger than half a page, so they wind 
up being written to an overflow page. Once on an overflow page, they are never 
moved or relocated.
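
Since the relocation hook was never implemented, the following is only a hypothetical sketch of what such a callback might look like. The parameter list mirrors the MDB_rel_func typedef in lmdb.h (item, oldptr, newptr, relctx), but the Item struct is a local stand-in for MDB_val so that the sketch compiles without lmdb.h, and the Val/Attr/Entry types are simplified inventions. It also assumes the callback runs after the bytes have already been copied to newptr, so the copy still holds pointers into the old location; the real calling convention is not defined anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified layout; every pointer targets the same blob. */
typedef struct Val   { size_t len; char *data; } Val;
typedef struct Attr  { char *name; Val *vals; size_t nvals; struct Attr *next; } Attr;
typedef struct Entry { Val name; Attr *attrs; } Entry;

/* Local stand-in for LMDB's MDB_val (same field names). */
typedef struct Item { size_t mv_size; void *mv_data; } Item;

/* Shift one blob-internal pointer by the distance the blob moved. */
static void *shift(void *p, uintptr_t delta)
{
    return p ? (void *)((uintptr_t)p + delta) : NULL;
}

/* Assumed to be called after the blob's bytes were copied from oldptr
 * to newptr; rewrites every internal pointer in the new copy. */
void entry_relocate(Item *item, void *oldptr, void *newptr, void *relctx)
{
    (void)item;
    (void)relctx;
    assert(oldptr != NULL && newptr != NULL);

    /* Unsigned arithmetic wraps, so this works in either direction. */
    uintptr_t delta = (uintptr_t)newptr - (uintptr_t)oldptr;
    Entry *e = newptr;

    e->name.data = shift(e->name.data, delta);
    e->attrs     = shift(e->attrs, delta);
    for (Attr *a = e->attrs; a != NULL; a = a->next) {
        a->name = shift(a->name, delta);
        a->vals = shift(a->vals, delta);
        a->next = shift(a->next, delta);   /* fix the link before following it */
        for (size_t i = 0; i < a->nvals; i++)
            a->vals[i].data = shift(a->vals[i].data, delta);
    }
}
```

In lmdb.h the registration call is mdb_set_relfunc(txn, dbi, rel), with mdb_set_relctx supplying the context pointer, but as noted above LMDB does not currently invoke the callback.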
In my experience, C++ doesn't give you much control over object storage 
layout, particularly not if you're using standard templates and other such 
stuff. I'm not sure how well you could leverage this capability from C++.
If you're looking at our implementation in back-mdb/id2entry.c to see how we 
actually store the complete Entries, you'll notice that it's more complicated 
than I just described. The main reason here is that the AttributeDescription 
is a pointer into the slapd schema, and schema elements aren't (currently) 
stored in LMDB so we can't guarantee constant pointer values for those 
objects. Instead we had to use a mapping table from 16-bit integers to schema 
instances. Because of this, back-mdb still has to do some minor processing to 
turn an on-disk entry into a slapd in-memory entry. Doing the complete 
transition to persistent schema was deferred to OpenLDAP 2.6 or later.
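
To make the mapping-table idea concrete, here is a minimal sketch of a table that translates between schema pointers and small 16-bit IDs. AttrDesc, AdTable, ad_to_id and id_to_ad are hypothetical names for illustration only and are not the back-mdb implementation; the capacity and the linear search are arbitrary choices for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for slapd's AttributeDescription. */
typedef struct AttrDesc { const char *name; } AttrDesc;

#define AD_MAX 1024  /* arbitrary capacity for this sketch */

typedef struct AdTable {
    const AttrDesc *ads[AD_MAX];  /* slot i holds the schema object for ID i */
    uint16_t        count;        /* IDs 1..count are in use; 0 means "none" */
} AdTable;

/* Return the 16-bit ID for ad, assigning the next free one if needed.
 * Returns 0 if the table is full. */
uint16_t ad_to_id(AdTable *t, const AttrDesc *ad)
{
    for (uint16_t i = 1; i <= t->count; i++)
        if (t->ads[i] == ad)
            return i;
    if (t->count + 1 >= AD_MAX)
        return 0;
    t->ads[++t->count] = ad;
    return t->count;
}

/* Translate a stored ID back into the in-memory schema pointer. */
const AttrDesc *id_to_ad(const AdTable *t, uint16_t id)
{
    assert(id <= t->count);
    return id ? t->ads[id] : NULL;
}
```

With a table like this, the stored entry carries only the ID, and the reader swaps it for the current pointer when loading. That swap is the kind of minor processing that keeps an on-disk entry from being usable completely as-is.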
...
We're investigating database (persistence engines) for use in an embedded 
environment and stumbled across your LMDB database.  The features of LMDB map 
quite nicely onto our needs, and the fact that it's very small, very fast, and 
highly configurable is also extremely attractive.
Our current (home-grown) OO persistence solution/framework has a long list of 
problems (complicated, large, slow, etc.) that I won't bore you with. One 
feature/mode of your LMDB that piqued our interest is the use of 'fixed 
address memory mapping'.  If we understand the feature correctly, we could 
effectively eliminate all our OO serialization/deserialization transformations 
by using a custom memory allocator to ensure objects-to-be-persisted are 
mapped into LMDB memory-mapped page space, thereby allowing LMDB (and the 
underlying OS virtual memory management system) to handle all our persistence 
needs.  Is this correct?
I've watched several talks you've given on LMDB, read through the documentation, 
plus many articles & blogs on LMDB, but I haven't come across much information 
(or explicit examples) on using the 'fixed memory mapped feature', which leads 
me to my questions:
Are we interpreting this feature correctly?  Can LMDB be used as a persistence 
engine as described? If so, can you point us to documentation and/or examples 
that illustrate this use case?  If not, any pointers on LMDB's application as 
an OO persistence engine would be much appreciated.
Thanks in advance,
-John
P.S.  Our application is currently written in a combination of C# and C++.  
We're in the process of rearchitecting to address a number of issues and will 
be moving to a completely native (C++) implementation in the future.
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/