Back on the server side of things...
Using ldapadd to load my test database (380836 entries, 533MB LDIF, 1.2GB id2entry size) takes well over an hour using synchronous writes.
With DB_TXN_WRITE_NOSYNC and a 512 MB BDB cache, plus periodic checkpoints, it takes 39:21.55 minutes. With the patch in add.c:1.247, and the duplicate check in slap_mods_check replaced by a quicksort, it takes 20:23.24 minutes.
With a 1GB BDB cache it takes 14:06.48 minutes. With the above and the new ldapadd client code, it takes 12:49.74 minutes.
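For readers unfamiliar with the change, here is a minimal sketch of a sort-based duplicate check, under stated assumptions. It is not the actual slap_mods_check code: `struct val`, `val_cmp`, and `has_duplicate` are hypothetical names, values are compared byte-wise rather than through a matching rule, and the C library's `qsort` stands in for the quicksort mentioned above. The idea is that sorting a temporary array of pointers and comparing neighbours costs O(n log n) instead of the O(n^2) of comparing every pair.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an attribute value: a length plus bytes. */
struct val {
    size_t len;
    const char *data;
};

/* Compare two values through pointers-to-pointers, as qsort requires
 * when the array being sorted holds pointers. Byte-wise order first,
 * then shorter-before-longer. */
static int val_cmp(const void *a, const void *b)
{
    const struct val *x = *(const struct val *const *)a;
    const struct val *y = *(const struct val *const *)b;
    size_t n = x->len < y->len ? x->len : y->len;
    int rc = memcmp(x->data, y->data, n);

    if (rc != 0)
        return rc;
    return (x->len > y->len) - (x->len < y->len);
}

/* Returns 1 if any two values are equal, 0 if all are distinct,
 * -1 if the temporary array cannot be allocated.
 * Only the temporary pointer array is sorted; vals[] keeps its order. */
int has_duplicate(const struct val *vals, size_t n)
{
    const struct val **tmp;
    size_t i;
    int dup = 0;

    if (n < 2)
        return 0;

    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    for (i = 0; i < n; i++)
        tmp[i] = &vals[i];

    qsort(tmp, n, sizeof *tmp, val_cmp);

    /* After sorting, equal values must be adjacent. */
    for (i = 1; i < n; i++) {
        if (val_cmp(&tmp[i - 1], &tmp[i]) == 0) {
            dup = 1;
            break;
        }
    }

    free(tmp);
    return dup;
}
```

Because only the pointer array is reordered, the caller's values stay in their original order, which corresponds to the "sort a temporary copy and discard the results" option.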
The question now is whether to keep the attribute values in sorted order, or to just sort a temporary copy at some strategic points and discard the results. Keeping them in sorted order breaks a number of tests in the test suite, since results are compared to LDIF files with values in the original order. But it offers the possibility of speeding up Modifies, value_find, and a number of other functions. It definitely has a lot of potential upside, at the cost of breaking current expectations about attribute value ordering.
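To illustrate where the speedup for lookups would come from, here is a rough sketch of a binary search over values that are already stored in sorted order. This is an assumption-laden illustration, not the real value_find: `struct sval`, `sval_cmp`, and `sorted_find` are hypothetical names, and the comparison is plain byte-wise ordering rather than a matching rule.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an attribute value: a length plus bytes. */
struct sval {
    size_t len;
    const char *data;
};

/* Byte-wise order first, then shorter-before-longer. */
static int sval_cmp(const struct sval *x, const struct sval *y)
{
    size_t n = x->len < y->len ? x->len : y->len;
    int rc = memcmp(x->data, y->data, n);

    if (rc != 0)
        return rc;
    return (x->len > y->len) - (x->len < y->len);
}

/* Binary search in vals[0..n-1], which must already be sorted by
 * sval_cmp. Returns the index of a matching value, or -1 if absent. */
long sorted_find(const struct sval *vals, size_t n, const struct sval *key)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int rc = sval_cmp(&vals[mid], key);

        if (rc == 0)
            return (long)mid;
        if (rc < 0)
            lo = mid + 1;   /* key sorts after vals[mid] */
        else
            hi = mid;       /* key sorts before vals[mid] */
    }
    return -1;
}
```

With values kept sorted, a lookup takes O(log n) comparisons instead of a linear scan, which matters most for attributes with very many values. The trade-off is exactly the one described above: the stored order no longer matches the order in which the values were supplied.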
Any thoughts?