[Bug 9197] New: slapd-ldap/slapo-chain hits error 80 after idletimeout
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9197
Bug ID: 9197
Summary: slapd-ldap/slapo-chain hits error 80 after idletimeout
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
From a customer:
In order to communicate via the LB managed writable ldap, we have to ensure
that an idle connection is periodically refreshed. If we do not, the LB will
silently drop the connection after 5 minutes.
Therefore to combat that I set an olcIdleTimeout on the writable server so that
the chain cached connections will be removed before the LB timeout hits.
However, the slapd-ldap client goes into CLOSE_WAIT state, which causes
subsequent ldapmodify updates being brokered by the read only instance to fail
with err=80. There appear to be a few bugs filed on this in the past against
slapd-ldap, but it's not clear whether we are hitting the same issue or a new
one.
I've also connected the read only instances directly to the writable ldap
instances and the CLOSE_WAIT issue persists, so I don't believe the CLOSE_WAIT
issue is caused by the LB.
These are the other threads I found when I started looking into this problem,
though I think they are using the ldap-proxy:
https://www.openldap.org/lists/openldap-technical/201301/msg00323.html
http://www.openldap.org/lists/openldap-software/201004/msg00060.html
https://www.openldap.org/lists/openldap-bugs/200412/msg00029.html
The LB we have seems to be set to forget connections after 5 minutes, so a
keepalive of 240:10:30 seemed like it should have worked. I assumed it wasn't
taking effect because the man page says "Only some systems support the
customization of these values". However, only after setting keepalive to
60:10:30 did I maintain a stable connection, so there may be other network
settings at play that I'm not aware of.
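To make the timing in the report explicit, here is a minimal sketch that works out when the first keepalive probe is sent and roughly when a dead peer would be detected for a slapd-ldap `keepalive <idle>:<probes>:<interval>` value. The function name and the 300 second LB timeout constant are illustrative assumptions; only the value format comes from the man page.

```python
def keepalive_timing(spec: str) -> dict:
    """Parse '<idle>:<probes>:<interval>' (seconds, count, seconds)."""
    idle, probes, interval = (int(part) for part in spec.split(":"))
    return {
        # TCP starts probing once the socket has been idle this long.
        "first_probe_after": idle,
        # All probes unanswered: the connection is dropped around this point.
        "dead_peer_detected_after": idle + probes * interval,
    }


LB_IDLE_TIMEOUT = 300  # assumption: the 5 minute LB timeout from the report

for spec in ("240:10:30", "60:10:30"):
    timing = keepalive_timing(spec)
    margin = LB_IDLE_TIMEOUT - timing["first_probe_after"]
    print(f"{spec}: first probe after {timing['first_probe_after']}s idle, "
          f"{margin}s before the LB timeout")
```

On paper both values send the first probe before the 300 second limit, which is consistent with the reporter's suspicion that something else (the keepalive values not being applied, or another timeout on the path) is involved.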
--
You are receiving this mail because:
You are on the CC list for the bug.
[Bug 9189] New: Add GSSAPI channel-bindings support
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9189
Bug ID: 9189
Summary: Add GSSAPI channel-bindings support
Product: OpenLDAP
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: libraries
Assignee: bugs(a)openldap.org
Reporter: iboukris(a)gmail.com
Target Milestone: ---
Recently MS announced that they plan to enforce channel-bindings for LDAP over
TLS (ADV190023).
To support it on the client side, we need to pass "tls-endpoint" bindings
(tls-server-end-point in RFC 5929) to the SASL plugin, and make use of that in
GSSAPI.
See also:
https://github.com/cyrusimap/cyrus-sasl/pull/601
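For orientation, a rough sketch of how the tls-server-end-point value from RFC 5929 is derived from the server certificate. This is not the proposed libldap or Cyrus SASL code; the function names are made up for illustration, and the caller is assumed to already know the hash used in the certificate's signature algorithm (extracting it from the DER is out of scope here).

```python
import hashlib


def tls_server_end_point(cert_der: bytes, sig_hash: str = "sha256") -> bytes:
    """Hash of the DER-encoded server certificate, per RFC 5929 section 4.1.

    sig_hash is the hash from the certificate's signature algorithm;
    MD5 and SHA-1 are replaced by SHA-256 as the RFC requires.
    """
    name = sig_hash.lower().replace("-", "")
    if name in ("md5", "sha1"):
        name = "sha256"
    return hashlib.new(name, cert_der).digest()


def gss_channel_binding_data(cert_der: bytes, sig_hash: str = "sha256") -> bytes:
    """Application data for the GSSAPI channel bindings: type prefix, colon, hash."""
    return b"tls-server-end-point:" + tls_server_end_point(cert_der, sig_hash)


# Placeholder bytes standing in for a real DER certificate.
fake_cert = b"0\x82 not a real certificate"
print(gss_channel_binding_data(fake_cert).hex())
```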
[Issue 9341] New: Delta-sync MPR needs to be stable regardless of ordering
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9341
Issue ID: 9341
Summary: Delta-sync MPR needs to be stable regardless of ordering
Product: OpenLDAP
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: replication
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: ondra(a)mistotebe.net
Target Milestone: ---
If two or more updates are spread across several providers before they have a
chance to learn about the others, all replicas need to arrive at the same
content regardless of the order in which they arrive.
One example that is broken at the moment:
- (csn a) server 1 accepts a modify
- (csn b) server 2 accepts a delete on the same DN
- (csn c) server 2 accepts an add on that DN again
If a replica receives the actions in the order bca vs. abc, the content of the
entry will be different even though the final CSN set is the same -> they will
never converge. The ordering 'bac' also needs to result in eventual
convergence, even if it means a refresh or replication from either provider
stalling temporarily?
Merge request with this test case (so far):
https://git.openldap.org/openldap/openldap/-/merge_requests/145
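The divergence in the example above can be illustrated with a toy model. This is only a sketch of naive in-order application, not how syncrepl or the merge request's test actually work; the entry contents and the `apply`/`replay` helpers are invented for illustration.

```python
def apply(entry, op):
    """Apply one operation to an entry (a dict, or None if it does not exist)."""
    kind, payload = op
    if kind == "modify":
        # A modify of a missing entry has nothing to act on and is dropped.
        return None if entry is None else {**entry, **payload}
    if kind == "delete":
        return None
    if kind == "add":
        # An add on top of an existing entry is ignored in this toy model.
        return dict(payload) if entry is None else entry
    raise ValueError(f"unknown operation {kind!r}")


def replay(initial, ops):
    entry = dict(initial)
    for op in ops:
        entry = apply(entry, op)
    return entry


initial = {"cn": "x", "description": "original"}
a = ("modify", {"description": "changed"})  # csn a, accepted by server 1
b = ("delete", None)                        # csn b, accepted by server 2
c = ("add", {"cn": "x"})                    # csn c, accepted by server 2

print("abc:", replay(initial, [a, b, c]))
print("bca:", replay(initial, [b, c, a]))
```

In this model the two replicas end up with the same set of CSNs but different entry content, because the modify lands on the old entry in one ordering and on the re-added entry in the other.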
[Issue 9354] New: Tool for monitoring slapd status
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9354
Issue ID: 9354
Summary: Tool for monitoring slapd status
Product: OpenLDAP
Version: 2.4.53
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: enhancement
Priority: ---
Component: client tools
Assignee: bugs(a)openldap.org
Reporter: hyc(a)openldap.org
Target Milestone: ---
Something we've seen in various scripts many times - this tool continuously
reports operation counts from cn=Monitor and tracks contextCSN from a specified
baseDN, polling at the specified interval (default 10 seconds). It tracks when
contextCSN changes start and stop, and for a set of multiprovider servers,
reports the lag between consumers and providers.
Sample output of a screen:
2020-09-21 06:25:36
ldap://tx01
       Entries  Bind  Unbind   Search  Compare  Modify  ModDN     Add  Delete  Abandon  Extended
Num    1601143    95     108   617710        0       0      0  600000       0        0         9
Num/s     0.00  0.00    0.00     0.40     0.00    0.00   0.00    0.00    0.00     0.00      0.00
contextCSN: 20200921004518.559704Z#000000#001#000000 idle
contextCSN: 20200921004548.656746Z#000000#002#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:12:24, sync'd, max delta 00:03:11
contextCSN: 20200921004446.224575Z#000000#003#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:12:24, sync'd, max delta 00:02:54
contextCSN: 20200921004331.841460Z#000000#004#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:12:24, sync'd, max delta 00:02:19
ldap://tx02
       Entries  Bind  Unbind   Search  Compare  Modify  ModDN     Add  Delete  Abandon  Extended
Num    1601143   105     120  2217678        0       0      0  400000  200000        0         3
Num/s     0.00  0.00    0.00     0.40     0.00    0.00   0.00    0.00    0.00     0.00      0.00
contextCSN: 20200921004518.559704Z#000000#001#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:11:33, sync'd, max delta 00:03:09
contextCSN: 20200921004548.656746Z#000000#002#000000 idle
contextCSN: 20200921004446.224575Z#000000#003#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:11:33, sync'd, max delta 00:02:52
contextCSN: 20200921004331.841460Z#000000#004#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:11:33, sync'd, max delta 00:02:17
ldap://uk03
       Entries  Bind  Unbind   Search  Compare  Modify  ModDN     Add  Delete  Abandon  Extended
Num    1601143  1407     107   675570        0    3677      0  601135      14        0         3
Num/s     0.00  0.00    0.00     0.40     0.00    0.00   0.00    0.00    0.00     0.00      0.00
contextCSN: 20200921004518.559704Z#000000#001#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:05:56, sync'd, max delta 00:02:11
contextCSN: 20200921004548.656746Z#000000#002#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:05:56, sync'd, max delta 00:02:14
contextCSN: 20200921004446.224575Z#000000#003#000000 idle
contextCSN: 20200921004331.841460Z#000000#004#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:05:56, sync'd, max delta 00:01:33
ldap://uk04
       Entries  Bind  Unbind   Search  Compare  Modify  ModDN     Add  Delete  Abandon  Extended
Num    1601143    98     116  2217670        0       0      0  400000  200000        0         3
Num/s     0.00  0.00    0.00     0.40     0.00    0.00   0.00    0.00    0.00     0.00      0.00
contextCSN: 20200921004518.559704Z#000000#001#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:02:11, sync'd, max delta 00:01:13
contextCSN: 20200921004548.656746Z#000000#002#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:02:11, sync'd, max delta 00:01:17
contextCSN: 20200921004446.224575Z#000000#003#000000 actv@2020-09-21 00:58:57, idle@2020-09-21 01:02:11, sync'd, max delta 00:01:01
contextCSN: 20200921004331.841460Z#000000#004#000000 idle
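For readers scripting something similar, a small sketch of splitting a contextCSN value like the ones in the sample output into its timestamp, change count, server ID and modification number. The `parse_csn` helper is hypothetical and is not taken from the tool; how the tool itself derives "max delta" is not shown in the report, so the subtraction at the end is only an example of comparing two CSN timestamps.

```python
from datetime import datetime, timezone


def parse_csn(csn: str):
    """Split 'YYYYmmddHHMMSS.ffffffZ#count#sid#mod' into its four fields."""
    stamp, count, sid, mod = csn.split("#")
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # count, sid and mod are hexadecimal fields.
    return when, int(count, 16), int(sid, 16), int(mod, 16)


csn_sid1 = "20200921004518.559704Z#000000#001#000000"
csn_sid2 = "20200921004548.656746Z#000000#002#000000"

when1, _, sid1, _ = parse_csn(csn_sid1)
when2, _, sid2, _ = parse_csn(csn_sid2)
print(f"SID {sid1}: {when1.isoformat()}")
print(f"SID {sid2}: {when2.isoformat()}")
print("difference between the two timestamps:", when2 - when1)
```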
Exhausting Locked SEM_UNDO Semaphores on MacOS
by Kris Zyp
In LMDB, if you attempt to open more than 10 transactions (on different environments) in one process on MacOS, mdb_txn_begin will fail. This can be reproduced by opening 11 different database environments in one process and calling mdb_txn_begin (as write transactions) on each one. The 11th call will return EINVAL. I believe this is because (on this OS, MacOS 10.13.6) System V semaphores have a limit of 10 SEM_UNDO locked semaphores per process, so when mdb_txn_begin attempts to open an 11th transaction, the mdb_sem_wait/semop call fails with EINVAL. I was testing with the latest LMDB from mdb.master.
After debugging this and figuring out the cause, I was able to work around it by compiling with MDB_USE_POSIX_SEM, which seems to resolve the issue. But this still leaves a few questions:
Is there any issue with compiling with MDB_USE_POSIX_SEM on MacOS? Would it be better if LMDB defaulted to this shared lock implementation for this OS?
And/or would there be value in LMDB providing a more specific error message in this situation? The documentation doesn't indicate EINVAL as a possible return value for mdb_txn_begin, and this is a very generic error with little indication of the root problem, that was rather confusing to me, at least. Or is there an expectation that processes can only open a limited number of database environments (and have open transactions on them)? (Our server typically has about 30 environments open with about 2-16 dbs/env, with many concurrent transactions, without issue on other OSes.)
I would be happy to put together a patch for this, but I am not sure of the reasons for the selection of the different shared lock implementations on different OSes. Anyway, I'd be glad to help contribute a patch if there is a specific way this should work. Thank you!
[Bug 9200] New: 2.4 to 2.5 upgrade documentation
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9200
Bug ID: 9200
Summary: 2.4 to 2.5 upgrade documentation
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: blocker
Priority: ---
Component: documentation
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
For the 2.5 release, we need to document the upgrade procedures for moving from
OpenLDAP 2.4 to OpenLDAP 2.5.
[Bug 9222] New: Fix presence list to use a btree instead of an AVL tree
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9222
Bug ID: 9222
Summary: Fix presence list to use a btree instead of an AVL tree
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
[23:34] <hyc> ok, so far heap profile shows that memory use during refresh is
normal
[23:35] <hyc> not wonderful, but normal. mem usage grows because we're
recording the present list while receiving entries in the refresh
[23:36] <hyc> I'm seeing for 1.2GB of data about 235MB of presentlist
[23:36] <hyc> which is pretty awful, considering presentlist is just a list of
UUIDs
[23:36] <hyc> being stored in an avl tree
[23:37] <hyc> a btree would have been better here, and we could just use an
unsorted segmented array
[23:42] <hyc> for the accumulation phase anyway. we need to be able to lookup
records during the delete phase
[00:05] <hyc> this stuff seriously needs a rewrite
[01:13] <hyc> 2.8M records x 16 bytes per uuid so this should be no more than
48MB of overhead
[01:13] <hyc> and instead it's 3-400MB
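The arithmetic and the "flat array" idea from the log can be sketched as follows. This is only an illustration in Python, not slapd code: it uses one contiguous buffer rather than the segmented array hyc mentions, and the class and method names are invented. The point is that the payload is exactly 16 bytes per UUID, with a single sort before the lookup phase.

```python
import uuid

UUID_LEN = 16

# 2.8M records x 16 bytes, as in the log: 44.8 MB of raw UUID data.
print(2_800_000 * UUID_LEN / 1e6, "MB")


class FlatPresentList:
    """UUIDs packed back to back; appended unsorted, sorted once for lookups."""

    def __init__(self):
        self._buf = bytearray()
        self._sorted = True  # an empty list is trivially sorted

    def add(self, entry_uuid: bytes) -> None:
        if len(entry_uuid) != UUID_LEN:
            raise ValueError("expected a 16-byte UUID")
        self._buf += entry_uuid  # accumulation phase: plain append
        self._sorted = False

    def __len__(self) -> int:
        return len(self._buf) // UUID_LEN

    def payload_bytes(self) -> int:
        return len(self._buf)

    def sort(self) -> None:
        # Python needs temporary objects here; C could sort the buffer in place.
        records = sorted(bytes(self._buf[i:i + UUID_LEN])
                         for i in range(0, len(self._buf), UUID_LEN))
        self._buf = bytearray(b"".join(records))
        self._sorted = True

    def __contains__(self, entry_uuid: bytes) -> bool:
        if not self._sorted:
            self.sort()
        lo, hi = 0, len(self)
        while lo < hi:  # binary search over fixed-width records
            mid = (lo + hi) // 2
            if self._buf[mid * UUID_LEN:(mid + 1) * UUID_LEN] < entry_uuid:
                lo = mid + 1
            else:
                hi = mid
        return self._buf[lo * UUID_LEN:(lo + 1) * UUID_LEN] == entry_uuid
```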
[Issue 9356] New: Add list of peerSIDs to consumer cookie to reduce cross traffic
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9356
Issue ID: 9356
Summary: Add list of peerSIDs to consumer cookie to reduce cross traffic
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
If we add a list of peerSIDs to the cookie, each consumer can tell its
providers which other servers it talks to, and each provider can then omit
sending that consumer any updates originating from those peers.
Some special handling is needed if a connection dies.
If a consumer loses one of its peer connections, and after N retries is still
not connected, it should send a new cookie to its remaining peers saying
"here's my new peer list" with the missing one removed. Likewise, if a retry
eventually connects again, it can send a new cookie again.
Make that peer list reset configurable in the syncrepl config stanza. This can
help account for the admin's knowledge that some links may be more or less
stable than others.
The idea here is that if one of your other peers can still see the missing
peer, they can start routing updates to you again.
The consumer should abandon all existing persist sessions and send a new sync
search with the new cookie to all remaining peers.
On the consumer side, this also means adding the SID for a given provider to
the syncrepl stanza, to save having to discover the peer SID.
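A rough sketch of the proposed behaviour, to make the moving parts concrete. Everything here is hypothetical: the `Consumer` class, the `max_retries` knob and `updates_to_send` are illustrative names, and no cookie wire format is implied.

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    max_retries: int                              # the "N retries" from the proposal
    peer_sids: set = field(default_factory=set)   # SIDs advertised in the cookie
    failures: dict = field(default_factory=dict)  # failed retries per peer SID

    def retry_failed(self, sid: int) -> bool:
        """Record a failed retry; True means a new cookie must be sent."""
        self.failures[sid] = self.failures.get(sid, 0) + 1
        if self.failures[sid] >= self.max_retries and sid in self.peer_sids:
            self.peer_sids.discard(sid)
            return True
        return False

    def reconnected(self, sid: int) -> bool:
        """Peer is reachable again; True means a new cookie must be sent."""
        self.failures.pop(sid, None)
        if sid not in self.peer_sids:
            self.peer_sids.add(sid)
            return True
        return False


def updates_to_send(updates, consumer_peer_sids):
    """Provider side: skip updates that originated at one of the consumer's peers."""
    return [(sid, csn) for sid, csn in updates if sid not in consumer_peer_sids]


consumer = Consumer(max_retries=3, peer_sids={2, 3})
pending = [(1, "csn-a"), (2, "csn-b"), (3, "csn-c")]  # (originating SID, CSN)
print(updates_to_send(pending, consumer.peer_sids))
```

Once the link to SID 3 has failed `max_retries` times, SID 3 drops out of the advertised list and the provider starts relaying its updates again, which is the rerouting described above.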
[Issue 9272] New: Invalid search results for subordinate/glued database
by openldap-its@openldap.org
https://bugs.openldap.org/show_bug.cgi?id=9272
Issue ID: 9272
Summary: Invalid search results for subordinate/glued database
Product: OpenLDAP
Version: 2.4.47
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: grapvar(a)gmail.com
Target Milestone: ---
Here is a trivial test case. Look at the following bunch of glued
DITs/databases, declared in this order:
| suffix ou=a,ou=1,ou=T # subordinate; contains only one (top-level) entry
| suffix ou=2,ou=T # subordinate; contains only one (top-level) entry
| suffix ou=b,ou=1,ou=T # subordinate; contains only one (top-level) entry
| suffix ou=T # master database, has two entries, top-level
| ` ou=1 # ... and this child entry
Let's query the united database:
| $ ldapsearch -b ou=1,ou=T -s sub '' nx
| dn: ou=1,ou=T
| dn: ou=a,ou=1,ou=T
| dn: ou=b,ou=1,ou=T
Nice! But wait, what if ...
| $ ldapsearch -b ou=1,ou=T -s sub -E\!pr=2/noprompt '' nx
| dn: ou=1,ou=T
| dn: ou=a,ou=1,ou=T
|
| # pagedresults: cookie=//////////8=
... BANG! ...
| Server is unwilling to perform (53)
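As a side note on the output above, the paged results cookie can be decoded to see what the server handed back. The decoding itself is plain base64; what the value means to the backend is not stated in the report, so reading it as a "no ID" sentinel is only an assumption.

```python
import base64

cookie = base64.b64decode("//////////8=")
print(len(cookie), cookie.hex())                   # 8 ffffffffffffffff
print(int.from_bytes(cookie, "big", signed=True))  # -1
```

The cookie is eight 0xFF bytes, that is, an all-ones 64-bit value (byte order does not matter here).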
The problem is in glue_op_search(), which has several issues:
* different parts of the code make different assumptions about data structures
* different parts of the code track state inconsistently
* some code looks highly likely to be dead code
I mean that it is likely possible to build other bug-triggering test cases,
and glue_op_search() needs not just a fix for the bug above, but intensive
cleanup and restructuring.