A few thoughts occurred to me today about our indexing code:

1) We compute a hash preset for each invocation, crunching the syntax and matching rule's OID, among other things. (It used to be worse; we used to recompute this for each individual value, even though it's a constant.) There's no need to recompute this on every invocation: we can compute it once at first use and reuse the result. It should speed up index generation, particularly on smaller attribute values. I'm preparing a patch to test this now. (A sketch of the idea follows below.)

2) Using this precomputed hash, we can drop the syntax, mr, and prefix arguments from the indexer function signature. That will also speed things up.

3) I note that the 'use' argument is also never used in our indexer functions. Will drop this as well.
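To make point 1) concrete, here is a minimal, self-contained C sketch of the idea. The struct names, the FNV-1a hash, and the example OIDs are illustrative stand-ins, not the actual slapd indexer code: the constant inputs (the syntax and matching rule OIDs) are folded into a cached seed once, and per-value hashing starts from that seed, which is also what lets the per-call signature drop the syntax, mr, prefix, and 'use' arguments (points 2 and 3).

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>

/* Illustrative stand-ins for the schema objects the real indexer sees. */
struct syntax        { const char *oid; };
struct matching_rule { const char *oid; };

/* Cached "hash preset": the hash state after folding in the constant
 * inputs (syntax OID, matching rule OID). Computed once, then reused. */
struct index_preset {
    uint32_t seed;
    int      ready;
};

/* Simple FNV-1a fold, standing in for whatever hash the indexer uses. */
static uint32_t
fnv1a( uint32_t h, const void *data, size_t len )
{
    const unsigned char *p = data;
    while ( len-- ) {
        h ^= *p++;
        h *= 16777619u;
    }
    return h;
}

/* Compute the constant part lazily on first use and cache the result,
 * instead of recomputing it on every invocation. */
static uint32_t
preset_get( struct index_preset *pre, const struct syntax *syn,
    const struct matching_rule *mr )
{
    if ( !pre->ready ) {
        uint32_t h = 2166136261u;  /* FNV offset basis */
        h = fnv1a( h, syn->oid, strlen( syn->oid ));
        h = fnv1a( h, mr->oid, strlen( mr->oid ));
        pre->seed = h;
        pre->ready = 1;
    }
    return pre->seed;
}

/* Trimmed per-value indexer: with the preset cached alongside the
 * attribute's index configuration, only the value itself varies, so
 * the syntax, mr, prefix, and 'use' arguments are no longer needed. */
static uint32_t
index_key( const struct index_preset *pre, const char *value, size_t len )
{
    return fnv1a( pre->seed, value, len );
}

int
main( void )
{
    struct syntax        syn = { "1.3.6.1.4.1.1466.115.121.1.15" };  /* Directory String */
    struct matching_rule mr  = { "2.5.13.2" };  /* caseIgnoreMatch */
    struct index_preset  pre = { 0, 0 };

    /* The first call computes the preset; later calls just reuse it. */
    preset_get( &pre, &syn, &mr );
    printf( "key(foo) = %08" PRIx32 "\n", index_key( &pre, "foo", 3 ));
    printf( "key(bar) = %08" PRIx32 "\n", index_key( &pre, "bar", 3 ));
    return 0;
}

In the real code the preset would presumably live alongside the attribute's index configuration, so the first-use computation happens once per indexed attribute rather than once per indexing call.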
--On Friday, March 14, 2014 3:42 AM -0700 Howard Chu hyc@symas.com wrote:
[...]
Please send the patch my way! ;)
--Quanah
--
Quanah Gibson-Mount
Architect - Server
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
Please send the patch my way! ;)
Complete patch is on my indx2 branch on ada. (I just copied the back-mdb changes over to back-bdb/hdb today; the back-mdb code hasn't changed from last week.)
Howard Chu wrote:
Complete patch is on my indx2 branch on ada. (I just copied the back-mdb changes over to back-bdb/hdb today; the back-mdb code hasn't changed from last week.)
For reference, the impact has been tiny on the data sets I've tested: around 3% improvement at most. But the code is cleaner anyway.