https://bugs.openldap.org/show_bug.cgi?id=10274
Issue ID: 10274
Summary: Replication issue on MMR configuration
Product: OpenLDAP
Version: 2.5.14
Hardware: All
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: falgon.comp(a)gmail.com
Target Milestone: ---
Created attachment 1036
--> https://bugs.openldap.org/attachment.cgi?id=1036&action=edit
In this attachment you will find 2 openldap configurations for 2 instances +
slamd conf exemple + 5 screenshots to show the issue and one text file to
explain what you see
Hello we are openning this issue further to the initial post in technical :
https://lists.openldap.org/hyperkitty/list/openldap-technical@openldap.org/…
Issue :
We are working on a project and we've come across an issue with the replication
after performance testing :
*Configuration :*
RHEL 8.6
OpenLDAP 2.5.14
*MMR-delta *configuration on multiple servers attached
300,000 users configured and used for tests
*olcLastBind: TRUE*
Use of SLAMD (performance shooting)
*Problem description:*
We are currently running performance and resilience tests on our infrastructure
using the SLAMD tool configured to perform BINDs and MODs on a defined range of
accounts.
We use a load balancer (VIP) to poll all of our servers equally. (but it is
possible to do performance tests directly on each of the directories)
With our current infrastructure we're able to perform approximately 300
MOD/BIND/s. Beyond that, we start to generate delays and can randomly come
across one issue.
However, when we run performance tests that exceed our write capacity, our
replication between servers can randomly create an incident with directories
being unable to catch up with their replication delay.
The directories update their contextCSNs, but extremely slowly (like freezing).
From then on, it's impossible for the directories to catch again. (even with no
incoming traffic)
A restart of the instance is required to perform a full refresh and solve the
incident.
We have enabled synchronization logs and have no error or refresh logs to
indicate a problem ( we can provide you with logs if necessary).
We suspect a write collision or a replication conflict but this is never write
in our sync logs.
We've run a lot of tests.
For example, when we run a performance test on a single live server, we don't
reproduce the problem.
Anothers examples: if we define different accounts ranges for each server with
SALMD, we don't reproduce the problem either.
If we use only one account for the range, we don't reproduce the problem
either.
______________________________________________________________________
I have add some screenshots on attachement to show you the issue and all the
explanations.
______________________________________________________________________
*Symptoms :*
One or more directories can no longer be replicated normally after performance
testing ends.
No apparent error logs.
Need a restart of instances to solve the problem.
*How to reproduce the problem:*
Have at least two servers in MMR mode
Set LastBind to TRUE
Perform a SLAMD shot from a LoadBalancer in bandwidth mode OR start multiple
SLAMD test on same time for each server with the same account range.
Exceed the maximum write capacity of the servers.
*SLAMD configuration :*
authrate.sh --hostname ${HOSTNAME} --port ${PORTSSL} \
--useSSL --trustStorePath ${CACERTJKS} \
--trustStorePassword ${CACERTJKSPW} --bindDN "${BINDDN}" \
--bindPassword ${BINDPW} --baseDN "${BASEDN}" \
--filter "(uid=[${RANGE}])" --credentials ${USERPW} \
--warmUpIntervals ${WARMUP} \
--numThreads ${NTHREADS} ${ARGS}
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9564
Issue ID: 9564
Summary: Race condition with freeing the spilled pages from
transaction
Product: LMDB
Version: 0.9.29
Hardware: Other
OS: Mac OS
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: kriszyp(a)gmail.com
Target Milestone: ---
Created attachment 825
--> https://bugs.openldap.org/attachment.cgi?id=825&action=edit
Free the spilled pages and dirty overflows before unlocking the write mutex
The spilled pages (a transaction's mt_spill_pgs) is freed *after* a write
transaction's mutex is unlocked (in mdb.master3). This can result in a race
condition where a second transaction can start and subsequently assign a new
mt_spill_pgs list to the transaction structure that it reuses. If this occurs
after the first transaction unlocks the mutex, but before it performs the free
operation on mt_spill_pgs, then the first transaction will end up calling a
free on the same spilled page list as the second transaction, resulting in a
double free (and crash).
It would seem to be an extremely unlikely scenario to actually happen, since
the free call is literally the next operation after the mutex is unlocked, and
the second transaction would need to make it all the way to the point of saving
the freelist before a page spill list is likely to be allocated. Consequently,
this probably has rarely happened. However, one of our users (see
https://github.com/DoctorEvidence/lmdb-store/issues/48 for the discussion) has
noticed this occurring, and it seems that it may be particularly likely to
happen on MacOS on the new M1 silicon. Perhaps there is some peculiarity to how
the threads are more likely to yield execution after a mutex unlock, I am not
sure. I was able to reproduce the issue by intentionally manipulating the
timing (sleeping before the free) to verify that the race condition is
technically feasible, and apparently this can happen "in the wild" on MacOS, at
least with an M1.
It is also worth noting that this is due to (or a regression from) the fix for
ITS#9155
(https://github.com/LMDB/lmdb/commit/cb256f409bb53efeda4ac69ee8165a0b4fc1a277)
where the free call was moved outside the conditional (for having a parent)
that had previously never been executed after the mutex is unlocked, but now is
called after the unlock.
Anyway, the solution is relatively simple, the free call simply has to be moved
above the unlock. In my patch, I also moved the free call for mt_dirty_ovs. I
am not sure what OVERFLOW_NOTYET/mt_dirty_ovs is for, but presumably it should
be handled the same. This could alternately be solved by saving the reference
to these lists before unlocking, and freeing after unlocking, which would
slightly decrease the amount of processing within the mutex guarded code. Let
me know if you prefer a patch that does that.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10191
Issue ID: 10191
Summary: backend searches should respond to pause requests
Product: OpenLDAP
Version: 2.6.7
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: hyc(a)openldap.org
Target Milestone: ---
A long running search will cause mods to cn=config to wait a long time. Search
ops should periodically check for threadpool pause requests.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9739
Issue ID: 9739
Summary: Undefined reference to ber_sockbuf_io_udp in 2.6.0
Product: OpenLDAP
Version: 2.6.0
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: build
Assignee: bugs(a)openldap.org
Reporter: simon.pichugin(a)gmail.com
Target Milestone: ---
While I was trying to build OpenLDAP 2.6 on Fedora Rawhide I've got the error
message:
/usr/bin/ld: ./.libs/libldap.so: undefined reference to
`ber_sockbuf_io_udp'
I've checked commits from https://bugs.openldap.org/show_bug.cgi?id=9673 and
found that 'ber_sockbuf_io_udp' was not added to
https://git.openldap.org/openldap/openldap/-/blob/master/libraries/liblber/…
I've asked on the project's mailing list and got a reply:
"That symbol only exists if OpenLDAP is built with LDAP_CONNECTIONLESS
defined, which is not a supported feature. Feel free to file a bug report
at https://bugs.openldap.org/"
https://lists.openldap.org/hyperkitty/list/openldap-technical@openldap.org/…
Hence, creating the bug.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10205
Issue ID: 10205
Summary: SSL handshake blocks forever in async mode if server
unaccessible
Product: OpenLDAP
Version: 2.5.17
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: libraries
Assignee: bugs(a)openldap.org
Reporter: regtube(a)hotmail.com
Target Milestone: ---
When ldaps:// scheme is used to connect to currently unaccessible server with
LDAP_OPT_CONNECT_ASYNC and LDAP_OPT_NETWORK_TIMEOUT options set, it blocks
forever on SSL_connect.
Here is a trace:
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP winserv.test.net:636
ldap_new_socket: 3
ldap_prepare_socket: 3
ldap_connect_to_host: Trying 192.168.56.2:636
ldap_pvt_connect: fd: 3 tm: 30 async: -1
ldap_ndelay_on: 3
attempting to connect:
connect errno: 115
ldap_int_poll: fd: 3 tm: 0
ldap_err2string
[2024-04-25 15:41:27.112] [error] [:1] bind(): Connecting (X)
[2024-04-25 15:41:27.112] [error] [:1] err: -18
ldap_sasl_bind
ldap_send_initial_request
ldap_int_poll: fd: 3 tm: 0
ldap_is_sock_ready: 3
ldap_ndelay_off: 3
TLS trace: SSL_connect:before SSL initialization
TLS trace: SSL_connect:SSLv3/TLS write client hello
Looks like it happens because non-blocking mode is cleared from the socket
(ldap_ndelay_off) after the first poll for write, and non-blocking mode is
never restored before attempt to do tls connect, because of the check that
assumes that non-blocking mode has already been set for async mode:
if ( !async ) {
/* if async, this has already been set */
ber_sockbuf_ctrl( sb, LBER_SB_OPT_SET_NONBLOCK, (void*)1 );
}
while in fact it was cleared.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10254
Issue ID: 10254
Summary: Allow upgrading password hash on bind
Product: OpenLDAP
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: me(a)floriswesterman.nl
Target Milestone: ---
Many OpenLDAP installations are likely to contain relatively old password
hashes such as SSHA and CRYPT, as modern alternatives such as Argon are only
recent additions. Due to the nature of password hashes, it is of course not
possible to "unhash" the old values and rehash them with a more modern
algorithm. The presence of these old password hashes poses a liability in case
of information leaks or hacks.
Currently, the only way to upgrade a password hash is to wait for the user to
change their password. This can be sped up by expiring passwords and forcing
users to change them. However, this can be slow and frequent password rotation
is no longer considered a best practice.
It would be a very helpful addition to add support for upgrading a password
hash on bind. This is implemented in the 389 directory server:
https://www.port389.org/docs/389ds/design/pwupgrade-on-bind.html
Essentially, when a user binds, the password is checked like normal. In case of
a successful bind, the proposed feature would check the hash algorithm used for
the password; and in case it is not equal to the current `olcPasswordHash`
value, the user-provided password is rehashed using the new algorithm and
stored. This way, the old hashes are phased out more quickly, without being a
disturbance to users.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9343
Issue ID: 9343
Summary: Expand ppolicy policy configuration to allow URL
filter
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
Currently, ppolicy only supports a single global default policy, and past that
any policies must be manually added to a given user entry if they are supposed
to have something other than the default policy.
Also, some sites want no default policy, and only a specific subset to have a
policy applied to them.
For both of these cases, it would be helpful if it were possible to configure a
policy to apply to a set of users via a URL similar to the way we handle
creating groups of users in dynlist
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10169
Issue ID: 10169
Summary: Add support for token only authentication with otp
overlay
Product: OpenLDAP
Version: 2.6.6
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
Currently the OTP overlay is password + token. It would be nice to be able to
configure it so it can run in a token only mode, similar to the slapo-totp
overlay in contrib. This would allow us to have a project supported solution
and retire that contrib module.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9186
Bug ID: 9186
Summary: RFE: More metrics in cn=monitor
Product: OpenLDAP
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: michael(a)stroeder.com
Target Milestone: ---
Currently I'm grepping metrics from syslog with mtail:
https://gitlab.com/ae-dir/ansible-ae-dir-server/-/blob/master/templates/mta…
With a new binary logging this is not possible anymore.
Thus it would be nice if cn=monitor provides more metrics.
1. Overall connection count per listener starting at 0 when started. This would
be a simple counter added to:
entries cn=Listener 0,cn=Listeners,cn=Monitor
2. Counter for the various "deferring" messages separated by the reason for
deferring.
3. Counters for all possible result codes. In my mtail program I also label it
with the result type.
--
You are receiving this mail because:
You are on the CC list for the bug.
https://bugs.openldap.org/show_bug.cgi?id=10092
Issue ID: 10092
Summary: Local logging doesn't build on Windows
Product: OpenLDAP
Version: 2.6.6
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: hyc(a)openldap.org
Target Milestone: ---
slapd/logging.c uses writev() to write a prefix and the log message together in
one call. This feature doesn't exist on Windows. The closest equivalent,
WriteFileGather, only works on page sized and aligned writes. On Windows the
only way to write as desired is to copy the message into a new buffer first.
--
You are receiving this mail because:
You are on the CC list for the issue.