Hi,
I'm trying to use >=/<= filters on an eq-indexed generalizedTime attribute. It seems the eq index does not work for a <= filter.

olcAttributeTypes: ( 2.999.777.1.1.1.5 NAME 'logTime' DESC 'Time'
  EQUALITY generalizedTimeMatch ORDERING generalizedTimeOrderingMatch
  SYNTAX 1.3.6.1.4.1.1466.115.121.1.24 )
olcDbIndex: logTime eq

(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
(|(logTime=20120920144001+0400)(logTime=20120920144008+0400)) - 0.03sec

~1,000,000 records in the db, target = 100,000,000 records.

What should I do to make >= filters work efficiently? I would consider using another data type that works reliably with <=/>= filters: strings? integers? Anything that can mimic TIME semantics. Any advice?
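For illustration only (a sketch; the host and base DN below are placeholders, not from the thread), such a range search could be issued as:

  ldapsearch -x -H ldap://localhost -b "ou=logs,dc=example,dc=com" \
      '(&(logTime>=201209201440+0400)(logTime<=201209201450+0400))' dn logTime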
14.04.2012 19:44, Michael Ströder wrote:
can I influence the order of index usage by order in the filter or slapd index configuration?
The same question from me. Is it possible to select indexes using some sort of filter syntax?
Roman Rybalko wrote:
Hi,
I'm trying to use >=/<= filters on an eq-indexed generalizedTime attribute. It seems the eq index does not work for a <= filter.
Actually the eq index works fine.
olcAttributeTypes: ( 2.999.777.1.1.1.5 NAME 'logTime' DESC 'Time'
  EQUALITY generalizedTimeMatch ORDERING generalizedTimeOrderingMatch
  SYNTAX 1.3.6.1.4.1.1466.115.121.1.24 )
olcDbIndex: logTime eq

(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
(|(logTime=20120920144001+0400)(logTime=20120920144008+0400)) - 0.03sec

~1,000,000 records in the db, target = 100,000,000 records. What should I do to make >= filters work efficiently?
Use a correct filter. Your clauses above use invalid syntax.
24.09.2012 22:23, Howard Chu wrote:
(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
(|(logTime=20120920144001+0400)(logTime=20120920144008+0400)) - 0.03sec
Use a correct filter. Your clauses above use invalid syntax.
Please point out exactly where my syntax is invalid.
Roman Rybalko wrote:
24.09.2012 22:23, Howard Chu wrote:
(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
(|(logTime=20120920144001+0400)(logTime=20120920144008+0400)) - 0.03sec
Use a correct filter. Your clauses above use invalid syntax.
Please point out exactly where my syntax is invalid.
You must be blind. You wrote:
(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
Check the proper format of a GeneralizedTime value. You have omitted the seconds field, so you're effectively looking for every entry greater than the year 20, December 09. The index lookup for this will most likely hit every entry in your DB, which is why it takes 57 seconds.
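For illustration only (a sketch, not part of the original message), the same range restated with an explicit seconds field would read:

  (&(logTime>=20120920144000+0400)(logTime<=20120920145000+0400))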
24.09.2012 23:04, Howard Chu wrote:
Roman Rybalko wrote:
24.09.2012 22:23, Howard Chu wrote:
(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
(|(logTime=20120920144001+0400)(logTime=20120920144008+0400)) - 0.03sec
Use a correct filter. Your clauses above use invalid syntax.
Please point out exactly where my syntax is invalid.
You must be blind. You wrote:
(&(logTime>=201209201440+0400)(logTime<=201209201450+0400)) - 57sec
Check the proper format of a GeneralizedTime value. You have omitted the seconds field, so you're effectively looking for every entry greater than the year 20, December 09. The index lookup for this will most likely hit every entry in your DB, which is why it takes 57 seconds.
Many thanks for the suggestion. Apologies for my persistence.
I tried the full GeneralizedTime format (with seconds, with fractions), but the search is still slow. Even the search (|(logTime=20120920144001+0400)(logTime=20120920144008+0400)), which takes less than a second and returns 2 entries, takes more than 50 seconds when reformulated as (&(logTime>=20120920144001+0400)(logTime<=20120920144008+0400)).
According to RFC 4517 (http://tools.ietf.org/html/rfc4517#section-3.3.13), GeneralizedTime has the syntax:
GeneralizedTime = century year month day hour [ minute [ second / leap-second ] ] [ fraction ] g-time-zone
which means that minutes and seconds may be omitted. Probably that's not implemented... no problem.
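For illustration, a few values that are all syntactically valid under this ABNF (the timestamps are invented for the example):

  2012092014Z              hour only
  201209201440+0400        hour and minute
  20120920144001+0400      hour, minute and second
  20120920144001.5Z        with a fraction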
OpenLDAP version 2.4.23
How may I optimize (&(>=)(<=)) searches?
Roman Rybalko wrote:
I tried the full GeneralizedTime format (with seconds, with fractions), but the search is still slow. Even the search (|(logTime=20120920144001+0400)(logTime=20120920144008+0400)), which takes less than a second and returns 2 entries, takes more than 50 seconds when reformulated as (&(logTime>=20120920144001+0400)(logTime<=20120920144008+0400)).
According to RFC 4517 (http://tools.ietf.org/html/rfc4517#section-3.3.13), GeneralizedTime has the syntax:
GeneralizedTime = century year month day hour [ minute [ second / leap-second ] ] [ fraction ] g-time-zone
which means that minutes and seconds may be omitted. Probably that's not implemented... no problem.
My mistake. Our generalizedTime validator handles those optional components.
OpenLDAP version 2.4.23
How may I optimize (&(>=)(<=)) searches?
Aside from the equality index, there's not much else to be done. I'd suggest you pull the code in RE24 and see how back-mdb performs with your data.
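For anyone following along, a rough sketch of one way to try back-mdb with existing data, using slapd.conf-style configuration (the suffix, paths, and maxsize value are placeholders, not taken from the thread):

  # dump the existing database to LDIF
  slapcat -f slapd.conf -b "dc=example,dc=com" -l data.ldif

  # slapd.conf fragment for a back-mdb database built from RE24
  database  mdb
  suffix    "dc=example,dc=com"
  directory /var/lib/ldap-mdb
  maxsize   10737418240
  index     logTime eq

  # reload the data into the new backend
  slapadd -f slapd.conf -b "dc=example,dc=com" -l data.ldif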
25.09.2012 01:31, Howard Chu wrote:
Aside from the equality index, there's not much else to be done. I'd suggest you pull the code in RE24 and see how back-mdb performs with your data.
Pulled 2a68553ec103eec28237c2608b3fea149a492b76 and tried mdb - it works in exactly the same way.

(&(logTime>=20120816200001+0400)(logTime<=20120816200501+0400)) - 12sec
(|(logTime=20120816200001+0400)(logTime=20120816200501+0400)) - 0.01sec

~100,000 objects in the db.
Should I rely only on (|(=)(=)(=)(=)...) filters?
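For illustration only (a sketch, not from the original message), rewriting a time range as an OR of equality clauses means enumerating every possible value in the range, e.g. one clause per second:

  (|(logTime=20120816200001+0400)
    (logTime=20120816200002+0400)
    ...
    (logTime=20120816200501+0400))

A five-minute window at one-second granularity expands to roughly 300 clauses, so this only stays practical for short ranges or coarse timestamps.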
Roman Rybalko wrote:
25.09.2012 01:31, Howard Chu wrote:
Aside from the equality index, there's not much else to be done. I'd suggest you pull the code in RE24 and see how back-mdb performs with your data.
Pulled 2a68553ec103eec28237c2608b3fea149a492b76 and tried mdb - it works in exactly the same way.
Looks to me like it's 4-5x faster.....
(&(logTime>=20120816200001+0400)(logTime<=20120816200501+0400)) - 12sec
(|(logTime=20120816200001+0400)(logTime=20120816200501+0400)) - 0.01sec
~100,000 objects in the db
Should I rely only on (|(=)(=)(=)(=)...) filters?
If that suits you, sure. Personally, I would submit an enhancement request to the ITS about getting this indexing code restructured, ideally with a patch attached that fixes it as desired.
27.09.2012 14:21, Howard Chu wrote:
Looks to me like it's 4-5x faster.....
(&(logTime>=20120816200001+0400)(logTime<=20120816200501+0400)) - 12sec
(|(logTime=20120816200001+0400)(logTime=20120816200501+0400)) - 0.01sec
~100,000 objects in the db
Actually, yes: I tried the same search on a 1,000,000-object db and MDB really is 5x faster, taking 12-13sec. It's also very convenient to get a rough idea of the DB size from the shared memory usage of the process.
Except it's still buggy. I hit an "mdb_add: txn_commit failed : MDB_PAGE_FULL: Internal error - page has no more space (-30786)" error. How should I report it - via the ITS (URL please) or the openldap-devel list? I also have a full debug build, so I can look into the stack trace.
Roman Rybalko wrote:
27.09.2012 14:21, Howard Chu wrote:
Looks to me like it's 4-5x faster.....
(&(logTime>=20120816200001+0400)(logTime<=20120816200501+0400)) - 12sec
(|(logTime=20120816200001+0400)(logTime=20120816200501+0400)) - 0.01sec
~100,000 objects in the db
Actually, yes: I tried the same search on a 1,000,000-object db and MDB really is 5x faster, taking 12-13sec. It's also very convenient to get a rough idea of the DB size from the shared memory usage of the process.
Except it's still buggy. I hit an "mdb_add: txn_commit failed : MDB_PAGE_FULL: Internal error - page has no more space (-30786)" error.
Did you already have commit 0c4c6fe72a57f812e4486cd017298f730df19c23 ?
How should I report it - via the ITS (URL please)
www.openldap.org
or the openldap-devel list? I also have a full debug build, so I can look into the stack trace.
27.09.2012 17:26, Howard Chu wrote:
Roman Rybalko wrote:
I hit an "mdb_add: txn_commit failed : MDB_PAGE_FULL: Internal error - page has no more space (-30786)" error.
Did you already have commit 0c4c6fe72a57f812e4486cd017298f730df19c23 ?
No, I didn't. But now I have commit a1c2dc6 (next after 0c4c6fe) and it works fine. Thank you!
25.09.2012 01:31, Howard Chu wrote:
Roman Rybalko wrote:
How may I optimize (&(>=)(<=)) searches?
Aside from the equality index, there's not much else to be done. I'd suggest you pull the code in RE24 and see how back-mdb performs with your data.
Actually, MDB works no faster than HDB with an indexed generalizedTime attribute; the timings are non-constant (mdb: 22sec-78sec vs hdb: 52sec), probably due to intensive mmap I/O (dataset size: 1M objects).
But an eq index on Integers:

olcAttributeTypes: ( 2.999.777.1.1.1.3 NAME 'logOrderingInteger'
  DESC 'Integer with ordering' EQUALITY integerMatch
  ORDERING integerOrderingMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.27 )

with (&(<=)(>=)) search filters works really fast on MDB, no slower than any other eq/sub index on strings (0.02sec-0.6sec on a 1,000,000-object dataset). I didn't try it on HDB, so it may or may not behave the same, but MDB seems to have a big advantage in memory consumption (MDB uses less memory than HDB).
So I decided to encode my time values as eq-indexed integers and search them with the MDB backend. That fits my requirements best.
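For illustration, a sketch of the integer-based approach (the index line, the Unix-epoch encoding, and the example values are assumptions, not from the original messages):

  olcDbIndex: logOrderingInteger eq

  # store each timestamp as Unix epoch seconds,
  # e.g. 2012-08-16 20:00:01 +0400 -> 1345132801
  (&(logOrderingInteger>=1345132801)(logOrderingInteger<=1345133101))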