Re: LMDB encryption support - openldap-devel

10 Aug 2017

      Greg Hudson wrote:
...
On 08/10/2017 11:55 AM, Howard Chu wrote:
...
Thoughts? Hardcode 1 algorithm, or leave it pluggable?

Some thoughts, without advocating for either option:

If support isn't built-in, then generic LMDB tools (including
mdb_copy/dump/load/stat) can't operate on encrypted databases, if they
need plaintext pages to work.

Yeah, already thought about that. We can add an option to the generic tools to 
dynamically load a user-supplied module for such cases. I always wanted this 
for BerkeleyDB as well, to safely operate on DBs with custom comparators.
...

Built-in support doesn't necessarily mean hardcoding an algorithm for
all time, if the meta pages can include an algorithm selector.  One of
the selector values could even mean "use application callbacks".
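As a rough illustration of the selector idea above, the sketch below maps a selector value read from a meta page to a page cipher, with one reserved value meaning "use application callbacks". Everything in it is hypothetical: LMDB defines no such field in this thread, and the names CipherSelector, resolve_cipher and xor_placeholder are invented for the example. The XOR placeholder is not encryption; it only stands in for a length-preserving built-in.

```python
from enum import IntEnum
from typing import Callable, Dict, Optional

# (key, iv, page) -> page of the same length
PageCipher = Callable[[bytes, bytes, bytes], bytes]


class CipherSelector(IntEnum):
    """Hypothetical selector values stored in a meta page."""
    NONE = 0            # pages are stored as plaintext
    BUILTIN_A = 1       # placeholder for some built-in algorithm
    APP_CALLBACK = 255  # "use application callbacks"


def identity(key: bytes, iv: bytes, page: bytes) -> bytes:
    """No encryption: return the page unchanged."""
    return page


def xor_placeholder(key: bytes, iv: bytes, page: bytes) -> bytes:
    """Stand-in for a real built-in cipher. NOT secure; it only preserves length."""
    pad = (key + iv) or b"\x00"
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(page))


BUILTINS: Dict[CipherSelector, PageCipher] = {
    CipherSelector.NONE: identity,
    CipherSelector.BUILTIN_A: xor_placeholder,
}


def resolve_cipher(selector: int,
                   app_callback: Optional[PageCipher] = None) -> PageCipher:
    """Pick the page cipher named by the selector, or the application's own."""
    sel = CipherSelector(selector)  # raises ValueError for unknown values
    if sel is CipherSelector.APP_CALLBACK:
        if app_callback is None:
            raise ValueError("selector requests application callbacks, none supplied")
        return app_callback
    return BUILTINS[sel]
```

With a scheme along these lines, a generic tool could open any database whose selector names a built-in, and would need outside help only for the callback value.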

Is the page size guaranteed to be a multiple of 16 bytes?  32 bytes?
I would assume yes to both; documenting that would make it easier to use
block ciphers since ciphertext expansion isn't allowed.

Yes, page sizes are always large powers of 2. 4096 bytes is typical (but on 
the small side). SPARC uses 8192, some MIPS systems use 32768 or 65536.
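The arithmetic behind that answer can be checked directly: any power of two that is at least 32 divides evenly into 16-byte and 32-byte blocks, so a block cipher can cover a whole page with no padding and no ciphertext expansion. The snippet below is only a sanity check over the page sizes mentioned above; the helper names are made up for the example.

```python
BLOCK_SIZES = (16, 32)  # cipher block sizes asked about, in bytes
PAGE_SIZES = (4096, 8192, 32768, 65536)  # page sizes named in the reply


def is_power_of_two(n: int) -> bool:
    """True when n is a positive power of two."""
    return n > 0 and n & (n - 1) == 0


def blocks_per_page(page_size: int, block_size: int) -> int:
    """Number of whole cipher blocks in a page; refuse sizes that need padding."""
    if page_size % block_size != 0:
        raise ValueError("page would need padding, which expands the ciphertext")
    return page_size // block_size


for page_size in PAGE_SIZES:
    assert is_power_of_two(page_size)
    counts = [blocks_per_page(page_size, b) for b in BLOCK_SIZES]
    print(page_size, counts)
# 4096 [256, 128]
# 8192 [512, 256]
# 32768 [2048, 1024]
# 65536 [4096, 2048]
```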
...

Application writers are more likely to get encryption callbacks wrong
than Howard is.  They could ignore the IV (making it easy to detect
duplicate initial blocks within a page) or even do pure ECB encryption
(making it easy to detect duplicate blocks anywhere).  Less egregiously,
applications might not make the ideal choice of cipher mode.  I would
personally have to think about the best choice to use.  If I were using
a block cipher, CBC with the provided ivec seems like it should be okay,
but assuming 128-bit cipher blocks, after around 2^64 blocks one would
expect to experience a block collision which reveals the XOR of the
plaintexts of the preceding two blocks[1].  Deriving a key with
HKDF(key, ivec) and using counter mode might be safer, unless I'm
missing something, which I easily could be.  If I were using a stream
cipher, I would have to do research to figure out how to incorporate the
ivec.
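To make the HKDF(key, ivec) suggestion concrete, here is a minimal sketch of the key-derivation half only, written against RFC 5869 with HMAC-SHA256 from the Python standard library. Several choices are assumptions, not anything stated in the thread: SHA-256 as the hash, an empty salt, the ivec passed as the HKDF "info" input, and the name derive_page_key. The counter-mode encryption itself is left out because it needs a block cipher from a crypto library.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 step 1: condense the input key material into a pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)  # the RFC's default salt is HashLen zero bytes
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 step 2: stretch the pseudorandom key to `length` output bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_page_key(master_key: bytes, ivec: bytes, length: int = 32) -> bytes:
    """Hypothetical helper: one key per ivec, with the ivec used as HKDF context."""
    prk = hkdf_extract(b"", master_key)
    return hkdf_expand(prk, ivec, length)
```

Because each ivec yields its own key, a counter-mode keystream could start from zero for every page without repeating across pages, which is the property the suggestion seems to be after.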
The user-supplied IV is really just a seed; it will be hashed with some other 
uniqifiers (pageID,txnID) before being passed to the cipher. I suppose we 
could make some recommendations on ciphers and modes, but really I think it's 
up to the user to determine what kind of strength/speed tradeoffs they'll accept.
I would expect stream ciphers to be used, in general.
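One possible shape for that hashing step is sketched below: the seed is combined with the page ID and transaction ID and hashed down to a per-page IV. The thread does not say which hash or encoding would be used, so SHA-256, the 64-bit little-endian packing, the 16-byte IV length, and the name page_iv are all assumptions made for illustration.

```python
import hashlib
import struct


def page_iv(seed: bytes, page_id: int, txn_id: int, iv_len: int = 16) -> bytes:
    """Hypothetical per-page IV: hash of the user seed plus (pageID, txnID)."""
    # Length-prefix the seed so different (seed, IDs) inputs cannot collide
    # simply because their concatenated bytes happen to line up.
    material = (
        struct.pack("<I", len(seed))
        + seed
        + struct.pack("<QQ", page_id, txn_id)  # two unsigned 64-bit integers
    )
    return hashlib.sha256(material).digest()[:iv_len]


# The same page rewritten in a later transaction gets a fresh IV.
iv_old = page_iv(b"user-seed", page_id=42, txn_id=7)
iv_new = page_iv(b"user-seed", page_id=42, txn_id=8)
assert iv_old != iv_new
```

Mixing in the transaction ID matters for a copy-on-write store: a page that is rewritten never reuses the IV of its earlier version.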
...

Not wanting to depend on crypto libraries seems like a valid concern.
Teaching the LMDB code how to dynamically load encryption plugins
doesn't necessarily seem attractive either.

We'll probably do the dynamic loading anyway, as noted above.
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/