I've recently added support for page-level encryption to LMDB 1.x using user-supplied callbacks:
/** @brief A callback function used to encrypt/decrypt pages in the env.
 *
 * Encrypt or decrypt the data in src and store the result in dst using the
 * provided key. The result must be the same number of bytes as the input.
 * The input size will always be a multiple of the page size.
 * @param[in] src The input data to be transformed.
 * @param[out] dst Storage for the result.
 * @param[in] key An array of two values: key[0] is the encryption key,
 * and key[1] is the initialization vector.
 * @param[in] encdec 1 to encrypt, 0 to decrypt.
 */
typedef void (MDB_enc_func)(const MDB_val *src, MDB_val *dst, const MDB_val *key, int encdec);
/** @brief Set encryption on an environment.
 *
 * This must be called before #mdb_env_open().
 * It implicitly sets #MDB_REMAP_CHUNKS on the env.
 * @param[in] env An environment handle returned by #mdb_env_create().
 * @param[in] func An #MDB_enc_func function.
 * @param[in] key An array of two values: key[0] is the encryption key,
 * and key[1] is the initialization vector.
 * @return A non-zero error value on failure and 0 on success.
 */
int mdb_env_set_encrypt(MDB_env *env, MDB_enc_func *func, const MDB_val *key);
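As a rough illustration of the callback contract only, a function with this shape could look like the sketch below. It assumes the caller supplies dst->mv_data with room for at least src->mv_size bytes, and it uses a stand-in MDB_val definition so it compiles without lmdb.h. The names toy_xor_func and toy_roundtrip_ok are hypothetical, and the XOR transform is a placeholder, not real encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so this sketch builds without lmdb.h. */
typedef struct MDB_val {
    size_t mv_size;  /* size of the data item */
    void  *mv_data;  /* address of the data item */
} MDB_val;

/* Same shape as the MDB_enc_func typedef above. */
typedef void (MDB_enc_func)(const MDB_val *src, MDB_val *dst,
                            const MDB_val *key, int encdec);

/* Hypothetical toy callback: XORs every byte with bytes taken from
 * key[0] (the key) and key[1] (the IV). This is NOT encryption; it only
 * shows the contract: output length equals input length, and the same
 * function serves both directions. */
static void toy_xor_func(const MDB_val *src, MDB_val *dst,
                         const MDB_val *key, int encdec)
{
    const unsigned char *in = src->mv_data;
    unsigned char *out = dst->mv_data;
    const unsigned char *k = key[0].mv_data;
    const unsigned char *iv = key[1].mv_data;
    size_t i;

    (void)encdec;  /* XOR is its own inverse, so the direction is unused */
    assert(key[0].mv_size > 0 && key[1].mv_size > 0);
    for (i = 0; i < src->mv_size; i++)
        out[i] = in[i] ^ k[i % key[0].mv_size] ^ iv[i % key[1].mv_size];
}

/* Push one 4096-byte "page" through the callback and back.
 * Returns 1 if the round trip restores the original bytes. */
static int toy_roundtrip_ok(void)
{
    static unsigned char plain[4096], enc[4096], dec[4096];
    unsigned char kbytes[16], ivbytes[16];
    MDB_enc_func *func = toy_xor_func;
    MDB_val key[2], src, dst;
    size_t i;

    for (i = 0; i < sizeof plain; i++)
        plain[i] = (unsigned char)(i & 0xff);
    for (i = 0; i < sizeof kbytes; i++) {
        kbytes[i] = (unsigned char)(0xA5 ^ i);
        ivbytes[i] = (unsigned char)(i * 7 + 1);
    }
    key[0].mv_size = sizeof kbytes;  key[0].mv_data = kbytes;
    key[1].mv_size = sizeof ivbytes; key[1].mv_data = ivbytes;

    src.mv_size = sizeof plain; src.mv_data = plain;
    dst.mv_size = sizeof enc;   dst.mv_data = enc;
    func(&src, &dst, key, 1);   /* "encrypt" */

    src.mv_data = enc;
    dst.mv_data = dec;
    func(&src, &dst, key, 0);   /* "decrypt" */

    return memcmp(plain, enc, sizeof plain) != 0 &&
           memcmp(plain, dec, sizeof plain) == 0;
}
```

With the patched library, a function like this would be passed to mdb_env_set_encrypt() together with the two-element key array, after mdb_env_create() and before mdb_env_open().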
I intend to extend this a bit further to support authenticated encryption instead, so that each page also yields a signature that can be used to detect tampering or corruption.
One question is whether we should actually make this pluggable like this, or whether we should just hardcode support for a specific algorithm and leave it at that. For comparison, BerkeleyDB supported AES128 with HMAC-SHA1, for a 20-byte per-page signature. In some ways it simplifies use if the DB just takes care of everything and there's only one hardcoded mechanism. Also, a user-supplied callback relies on the app developer knowing what they're doing with the crypto code.
From the LMDB maintenance perspective, it's simpler for us not to have hardcoded dependencies on crypto libraries. Also, leaving it pluggable keeps the door open for 3rd party hardware-accelerated crypto engines. One complication is that if the algorithm is actually user-selectable, we need to dynamically adjust DB page layouts to accommodate different nonce/IV and signature sizes. (Currently MDB_page metadata is a statically defined structure. A dynamic size element here will make processing slower.)
Thoughts? Hardcode 1 algorithm, or leave it pluggable?
Howard Chu wrote:
> I've recently added support for page-level encryption to LMDB 1.x using user-supplied callbacks:
Interesting.
> Thoughts? Hardcode 1 algorithm, or leave it pluggable?
"Cryptographic algorithms age; they become weaker with time." [1]
Ciao, Michael.
On Thu, Aug 10, 2017 at 11:55 AM, Howard Chu hyc@symas.com wrote:
> Thoughts? Hardcode 1 algorithm, or leave it pluggable?
Make the core library pluggable and ship a side example implementation library like: libmdb-enc-(aes128|whatever)
-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Hi,
> I've recently added support for page-level encryption to LMDB 1.x using user-supplied callbacks
That does sound cool. :)
> One question is whether we should actually make this pluggable like this, or we should just hardcode support for a specific algorithm and leave it at that.
I vote for keeping it pluggable, so every cryptography nut out there can use their favourite mechanism.
> One complication is that if the algorithm is actually user-selectable, we need to dynamically adjust DB page layouts to accommodate different nonce/IV and signature sizes. (Currently MDB_page metadata is a statically defined structure. A dynamic size element here will make processing slower.)
What if page size would still be static, but that static size would be user-defined on a per-environment basis?
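One way to read this suggestion, as a rough sketch only: the nonce and signature sizes are chosen once per environment, and the resulting offsets are computed a single time, so per-page code reads precomputed values instead of working out a variable-size layout on every access. Everything here is hypothetical and not LMDB code: the toy_env struct, toy_env_set_layout(), the 16-byte TOY_PAGE_HDR, and the assumed page order of header, nonce, payload, signature.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fixed header size, for illustration only. */
#define TOY_PAGE_HDR 16

/* Hypothetical per-environment layout, filled in once at setup time. */
typedef struct toy_env {
    size_t page_size;   /* total bytes per page */
    size_t nonce_size;  /* bytes reserved for the nonce/IV */
    size_t sig_size;    /* bytes reserved for the signature */
    size_t data_off;    /* offset of the payload within a page */
    size_t data_len;    /* payload bytes available per page */
    size_t sig_off;     /* offset of the signature within a page */
} toy_env;

/* Compute the layout for the assumed page order:
 *   [header][nonce][payload][signature]
 * Returns 0 on success, -1 if the reserved areas leave no payload room. */
static int toy_env_set_layout(toy_env *env, size_t page_size,
                              size_t nonce_size, size_t sig_size)
{
    size_t overhead;

    /* Reject oversized fields first so the sum below cannot wrap. */
    if (nonce_size >= page_size || sig_size >= page_size)
        return -1;
    overhead = TOY_PAGE_HDR + nonce_size + sig_size;
    if (overhead >= page_size)
        return -1;

    env->page_size = page_size;
    env->nonce_size = nonce_size;
    env->sig_size = sig_size;
    env->data_off = TOY_PAGE_HDR + nonce_size;
    env->data_len = page_size - overhead;
    env->sig_off = page_size - sig_size;
    return 0;
}
```

For example, a 4096-byte page with an assumed 16-byte nonce and the 20-byte HMAC-SHA1 signature mentioned earlier in the thread would leave 4044 payload bytes under these assumptions, while an unencrypted environment (both sizes 0) would keep all 4080 bytes after the header.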
Question: will this affect performance on non-encrypted databases?
Cheers, Timur