Hi, I extracted a small dataset that shows the problem.
You can download it from here:
I modified mdb_copy.c to demonstrate the difference. Copy it to the source
directory and run "time ./mdb_copy foo foo2".
After this, change the flag at line 64 and run it again.
On my computer the difference is 17s vs 1.7s for 3 million items.
This test doesn't prove the existence of a bug. You're running on a
Little-Endian machine, therefore data that is in sorted order as a string is
in hashed order when used as an integer. Your data insert turns into a
worst-case insert order in this case, causing the worst possible random access
strides through memory. Assuming the two orders to be equivalent is a pretty
common mistake for DB programmers. Microsoft has done the same thing in
ActiveDirectory; I mentioned it here a few years ago.
If you had run this test on a Big-Endian machine, like SPARC, the insert order
would be identical either way, and the INTEGERKEY result would have been faster.
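
A minimal sketch of the byte-order effect described above, not taken from the
modified mdb_copy.c. It assumes 4-byte keys generated in byte-wise (string)
sorted order; the key size and the helper names encode_be, load_be, load_le
and count_descents are hypothetical and chosen only for illustration. The two
load functions decode the bytes explicitly, so the result does not depend on
the machine the sketch is compiled on.

```c
#include <stddef.h>
#include <stdint.h>

/* Store v as 4 big-endian bytes. memcmp() order on these bytes matches
 * numeric order, so counting v upward yields string-sorted keys. */
static void encode_be(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* The integer a Big-Endian CPU sees when it loads the 4 key bytes. */
static uint32_t load_be(const unsigned char k[4])
{
    return ((uint32_t)k[0] << 24) | ((uint32_t)k[1] << 16) |
           ((uint32_t)k[2] << 8)  |  (uint32_t)k[3];
}

/* The integer a Little-Endian CPU sees when it loads the same bytes. */
static uint32_t load_le(const unsigned char k[4])
{
    return ((uint32_t)k[3] << 24) | ((uint32_t)k[2] << 16) |
           ((uint32_t)k[1] << 8)  |  (uint32_t)k[0];
}

/* Walk n keys in string-sorted order and count how often the integer
 * view of the key is smaller than the integer view of the previous key. */
static size_t count_descents(uint32_t n, uint32_t (*load)(const unsigned char *))
{
    unsigned char key[4];
    uint32_t prev = 0;
    size_t descents = 0;

    for (uint32_t i = 0; i < n; i++) {
        encode_be(i, key);
        uint32_t cur = load(key);
        if (i > 0 && cur < prev)
            descents++;
        prev = cur;
    }
    return descents;
}
```

With load_be the integer view never goes backwards. With load_le consecutive
keys land 0x01000000 apart and the sequence wraps back after every 256th key,
so inserts that are sequential as strings become widely scattered as integers.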
Closing this ITS, no bug.
On Tue, Aug 20, 2013 at 9:04 PM, Quanah Gibson-Mount <quanah(a)zimbra.com> wrote:
> --On Sunday, August 18, 2013 11:46 AM +0000 romange(a)gmail.com wrote:
>> Full_Name: Roman Gershman
>> OS: linux 3.8.0-25-generic
>> Submission from: (NULL) (22.214.171.124)
> Please provide further information, specifically:
> The size of values
> Insert order
> Sample code if possible
> Quanah Gibson-Mount
> Lead Engineer
> Zimbra, Inc
> Zimbra :: the leader in open source messaging and collaboration
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/