romange@gmail.com wrote:
Hi, I extracted a small dataset that shows the problem. You can download it from here: https://docs.google.com/file/d/0B6o29pwkWoERdnFSaUtMNDljemc/edit?usp=sharing
I modified mdb_copy.c to demonstrate the difference. Copy it to the source dir from here: https://docs.google.com/file/d/0B6o29pwkWoERd3VuUm1DN0FpcUU/edit?usp=sharing
Build and run "time ./mdb_copy foo foo2". After this, change the flag at line 64 and run it again. On my computer the difference is 17s vs 1.7s for 3 million items.
This test doesn't prove the existence of a bug. You're running on a little-endian machine, so data that is in sorted order as a byte string is effectively in hashed order when interpreted as a native integer. In this case your insert turns into a worst-case insert order, causing the worst possible random-access strides through memory. Assuming the two orders to be equivalent is a pretty common mistake for DB programmers. Microsoft has done the same thing in Active Directory; I mentioned it here a few years ago: http://www.openldap.org/lists/openldap-devel/200711/msg00002.html
If you had run this test on a big-endian machine, like SPARC, the insert order would be identical either way, and the INTEGERKEY result would have been faster.
Closing this ITS, no bug.
On Tue, Aug 20, 2013 at 9:04 PM, Quanah Gibson-Mount quanah@zimbra.com wrote:
--On Sunday, August 18, 2013 11:46 AM +0000 romange@gmail.com wrote:
Full_Name: Roman Gershman
Version:
OS: linux 3.8.0-25-generic
URL:
Submission from: (NULL) (212.150.97.210)
Please provide further information, specifically:
The size of values
Insert order
Sample code if possible
Thanks, Quanah
--
Quanah Gibson-Mount
Lead Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration