Ah, I overlooked that flag. This means I can return the pointer to Java and write directly into LMDB for the duration of the current transaction. Sadly, after trying it, there is no noticeable effect on latency. Commits still take up the majority of the total time.
Could it be some kind of overhead from calling JNI repeatedly in a tight loop? So instead I tried passing all 1 million addresses over to JNI in one go and managing begin, puts and commit from the C side. Again, this did not have a notable effect.
What's even weirder is that when I time the batch from C I get around 1.7 - 1.8 sec, but the total execution time is actually more like 6 sec! Clearly there is something I don't understand here.
For reads I already pass the value pointer to Java and read directly from it (like you suggest), and this part is actually ~1.5 times faster than the buffer copy approach in lmdbjni. Just to be sure, are the value pointers managed by LMDB or do I need to free them manually per transaction? Right now I just read data from them.
BTW, forgot to say thanks for a great product Howard!
On Sun, Jun 29, 2014 at 4:56 PM, Howard Chu hyc@symas.com wrote:
Kristoffer Sjögren wrote:
Hi
I'm experimenting a bit with developing a JNI library for LMDB that utilizes zero-copy buffers. As you may know, there is already a project called lmdbjni [1], but it only does buffer copies.
I must admit that my C and JNI skills are not really that great, so please forgive any stupidity on my part.
Can't offer much advice on java or JNI. But your test doesn't really leverage LMDB's zero-copy writes. To do that you have to use the MDB_RESERVE write flag before generating the output value. Otherwise there's still a copy from the mdb_put argument into the actual DB.
I don't know what other buffer copies occur before you finally reach mdb_put, but I don't see why you need to do anything special to pass a user value into mdb_put. zero-copy really only has significant benefit for readers, and that's where you have to play games with non-GC'd pointers.
Perhaps other java users on this list can offer more advice.
Anyway, I'm taking the sun.misc.Unsafe (no bounds checking) approach for memory allocation and passing memory addresses through JNI to LMDB for both writes and reads. At the moment I have implemented put, get, begin and commit [2].
I suspect that something is wrong because the performance difference between buffer copy and zero copy isn't really that big. lmdbjni is even faster sometimes; write commits specifically are almost twice as fast.
The test writes 1M entries with a 4-byte key (1-1M) and a 128-byte value (random), committed in one go. LMDB is configured identically in both implementations, with a 4GB map size and MDB_WRITEMAP. I run Linux 3.2.0 on an Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz.
JProfiler shows no signs of bottlenecks in Java in my implementation; most time is spent in the native methods put and commit. The opposite is true for lmdbjni, where most time is spent creating and writing Java byte buffers, while native put and commit take just a fraction of that time.
I'm not sure how significant the gcc compiler flags are, but here is how I build it.
gcc -g -Wall -O2 -Wbad-function-cast -Wno-write-strings -fPIC -shared -I$JAVA_HOME/include -I$JAVA_HOME/include/linux -Isrc/main/native src/main/native/jlmdb.c src/main/native/liblmdb.a -o target/linux64/libjlmdb.so
- What could be the reason for "slow(er)" commits?
- How much faster can I expect properly implemented zero copying to be?
- Maybe lmdbjni has de facto standard optimizations that I, as a C newbie, might have overlooked?
- Are there any performance counters, tracepoints or similar that might be of interest for finding where the latency is spent?
Grateful for any tips or pointers on how to track the problem down.
Cheers, -Kristoffer
[1] https://github.com/chirino/lmdbjni
[2] JNI
#include <stdint.h>
#include <jni.h>
#include "lmdb.h"

/* Process-wide environment and database handle, set up in open(). */
MDB_env *mdb_env;
MDB_dbi dbi;

/* Store the key/value found at the given raw addresses; returns the LMDB return code. */
JNIEXPORT jlong JNICALL Java_NativeLmdb_put
  (JNIEnv *env, jobject obj, jlong tx, jlong keyAddress, jlong keySize,
   jlong valAddress, jlong valSize)
{
    MDB_val mdb_key, mdb_val;

    mdb_key.mv_data = (void *)(intptr_t) keyAddress;
    mdb_key.mv_size = (size_t) keySize;
    mdb_val.mv_data = (void *)(intptr_t) valAddress;
    mdb_val.mv_size = (size_t) valSize;
    int rc = mdb_put((MDB_txn *)(intptr_t) tx, dbi, &mdb_key, &mdb_val, 0);
    return rc;
}

/* Look up the key at address a (size s); returns the value address, or -1 on failure. */
JNIEXPORT jlong JNICALL Java_NativeLmdb_get
  (JNIEnv *env, jobject o, jlong tx, jlong a, jlong s)
{
    MDB_val mdb_key, mdb_val;

    mdb_key.mv_data = (void *)(intptr_t) a;
    mdb_key.mv_size = (size_t) s;
    int rc = mdb_get((MDB_txn *)(intptr_t) tx, dbi, &mdb_key, &mdb_val);
    if (rc == 0) {
        return (intptr_t) mdb_val.mv_data;
    }
    return -1;
}
/* Begin a write transaction and store its handle in element 0 of the array. */
JNIEXPORT void JNICALL Java_org_deephacks_lmdb_NativeLmdb_mdb_1txn_1begin
  (JNIEnv *env, jobject obj, jlongArray array)
{
    jlong *nArray = (*env)->GetLongArrayElements(env, array, NULL);
    MDB_txn *txn;

    mdb_txn_begin(mdb_env, NULL, 0, &txn);
    nArray[0] = (jlong)(intptr_t) txn;
    (*env)->ReleaseLongArrayElements(env, array, nArray, 0);
}

/* Same as above, without the package prefix in the JNI name. */
JNIEXPORT void JNICALL Java_NativeLmdb_mdb_1txn_1begin
  (JNIEnv *env, jobject obj, jlongArray tx)
{
    jlong *nArray = (*env)->GetLongArrayElements(env, tx, NULL);
    MDB_txn *txn;

    mdb_txn_begin(mdb_env, NULL, 0, &txn);
    nArray[0] = (jlong)(intptr_t) txn;
    (*env)->ReleaseLongArrayElements(env, tx, nArray, 0);
}
/* Commit the transaction; returns the LMDB return code. */
JNIEXPORT jint JNICALL Java_NativeLmdb_mdb_1txn_1commit
  (JNIEnv *env, jobject obj, jlong tx)
{
    return (jint) mdb_txn_commit((MDB_txn *)(intptr_t) tx);
}

/* Create the environment (4 GB map, MDB_WRITEMAP) and open the default database. */
JNIEXPORT void JNICALL Java_NativeLmdb_open
  (JNIEnv *env, jobject obj)
{
    MDB_txn *txn;

    mdb_env_create(&mdb_env);
    mdb_env_set_mapsize(mdb_env, 4294967296);
    mdb_env_open(mdb_env, "/tmp/testdb", MDB_WRITEMAP, 0664);
    mdb_txn_begin(mdb_env, NULL, 0, &txn);
    mdb_open(txn, NULL, 0, &dbi);
    mdb_txn_commit(txn);
}
--
Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/