https://bugs.openldap.org/show_bug.cgi?id=10395
Issue ID: 10395
Summary: Support multiple readers on uncommitted changes
Product: LMDB
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: renault.cle(a)gmail.com
Target Milestone: ---
Created attachment 1086
--> https://bugs.openldap.org/attachment.cgi?id=1086&action=edit
A patch to support multiple readers on uncommitted changes
Hello,
The attached patch is not meant to be merged immediately into LMDB. Still, it
demonstrates how I added a helpful feature to the key-value store: reading
uncommitted changes from multiple transactions. I am conscious that the patch
still requires some work and uses non-C99 features, i.e., atomics are C11,
which could be a blocker for it to be merged upstream. I would also be
delighted to merge these changes under a flag/define to ensure we don't impact
other users with non-C99 stuff.
The main feature we need at Meilisearch is to read uncommitted changes from
multiple threads, compute parallel post-processing of different data structures
[1], and speed up the search requests. We could have done the post-processing
in a following transaction by opening multiple read transactions, but this
would mean that the post-processed data structure would not include newly
inserted or modified document IDs. Both data structures would be desync.
Regarding the design choice, I decided to follow the same design as the nested
write transactions: use the parent argument of the mdb_txn_begin [2], and allow
the MDB_RDONLY flag, which was disallowed when the parent argument was non-NULL
[3]. I find it clear enough that, by calling the mdb_txn_begin function with
these arguments, you can call it multiple times (I need to update the doc) to
obtain nested read-only transactions from the parent write transaction.
ret = mdb_txn_begin(env, parent_txn, MDB_RDONLY, new_nested_rtxn);
Note that this early proposal lacks security and error handling. The generated
transactions are fake-read-only and actually write transactions that share the
underlying parent allocations and data structures. This is unsafe and must be
changed or reviewed carefully, but most importantly, we need to add read-only
capabilities to these transactions to disallow writes. Using a Rust wrapper on
top of LMDB, I wrapped the fake read-only transactions into ReadTxn, which
disallows any writes at compile time. However, I haven't checked the conflict
database creations or openings.
The main issues I encountered were concurrent free of the main shared data
structures when the different threads owning the transactions were dropping the
transactions simultaneously. So, I decided to implement the equivalent of an
ARC to free resources only when the last nested transaction was freed.
I can share numbers about how this feature improves the post-processing by
4x-9x or from 1200s to 120s [1]. You can look at this PR, which I would be
happy to merge once an improved version of this patch lands on LMDB upstream.
I would be very happy if you could guide me a bit on how I could improve this
patch to make it mergeable into LMDB. We want to contribute useful features
like this to LMDB and not keep a deviant fork. LMDB works great; we are happy
about it, and its performance is predictable.
Have a lovely week,
kero
[1]: https://github.com/meilisearch/meilisearch/pull/5307
[2]:
https://github.com/LMDB/lmdb/blob/14d6629bc8a9fe40d8a6bee1bf71c45afe7576b6/…
[3]:
https://github.com/LMDB/lmdb/blob/14d6629bc8a9fe40d8a6bee1bf71c45afe7576b6/…
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10353
Issue ID: 10353
Summary: No TLS connection on Windows because of missing
ENOTCONN in socket.h
Product: OpenLDAP
Version: 2.6.10
Hardware: All
OS: Windows
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: libraries
Assignee: bugs(a)openldap.org
Reporter: julien.wadel(a)belledonne-communications.com
Target Milestone: ---
On Windows, the TLS connection cannot be done and we get the connection error:
Can't contact LDAP server.
=> Connections are done with WSAGetLastError().
After getting WSAEWOULDBLOCK, the connection is not restart because of the
state WSAENOTCONN that is not known.
OpenLDAP use ENOTCONN that is set to 126 by "ucrt/errno.h" while WSAENOTCONN
is 10057L.
Adding #define ENOTCONN WSAENOTCONN
like for EWOULDBLOCK resolve the issue.
Reference commit on external project:
https://gitlab.linphone.org/BC/public/external/openldap/-/commit/62fbfb12e8…
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10274
Issue ID: 10274
Summary: Replication issue on MMR configuration
Product: OpenLDAP
Version: 2.5.14
Hardware: All
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: falgon.comp(a)gmail.com
Target Milestone: ---
Created attachment 1036
--> https://bugs.openldap.org/attachment.cgi?id=1036&action=edit
In this attachment you will find 2 openldap configurations for 2 instances +
slamd conf exemple + 5 screenshots to show the issue and one text file to
explain what you see
Hello we are openning this issue further to the initial post in technical :
https://lists.openldap.org/hyperkitty/list/openldap-technical@openldap.org/…
Issue :
We are working on a project and we've come across an issue with the replication
after performance testing :
*Configuration :*
RHEL 8.6
OpenLDAP 2.5.14
*MMR-delta *configuration on multiple servers attached
300,000 users configured and used for tests
*olcLastBind: TRUE*
Use of SLAMD (performance shooting)
*Problem description:*
We are currently running performance and resilience tests on our infrastructure
using the SLAMD tool configured to perform BINDs and MODs on a defined range of
accounts.
We use a load balancer (VIP) to poll all of our servers equally. (but it is
possible to do performance tests directly on each of the directories)
With our current infrastructure we're able to perform approximately 300
MOD/BIND/s. Beyond that, we start to generate delays and can randomly come
across one issue.
However, when we run performance tests that exceed our write capacity, our
replication between servers can randomly create an incident with directories
being unable to catch up with their replication delay.
The directories update their contextCSNs, but extremely slowly (like freezing).
From then on, it's impossible for the directories to catch again. (even with no
incoming traffic)
A restart of the instance is required to perform a full refresh and solve the
incident.
We have enabled synchronization logs and have no error or refresh logs to
indicate a problem ( we can provide you with logs if necessary).
We suspect a write collision or a replication conflict but this is never write
in our sync logs.
We've run a lot of tests.
For example, when we run a performance test on a single live server, we don't
reproduce the problem.
Anothers examples: if we define different accounts ranges for each server with
SALMD, we don't reproduce the problem either.
If we use only one account for the range, we don't reproduce the problem
either.
______________________________________________________________________
I have add some screenshots on attachement to show you the issue and all the
explanations.
______________________________________________________________________
*Symptoms :*
One or more directories can no longer be replicated normally after performance
testing ends.
No apparent error logs.
Need a restart of instances to solve the problem.
*How to reproduce the problem:*
Have at least two servers in MMR mode
Set LastBind to TRUE
Perform a SLAMD shot from a LoadBalancer in bandwidth mode OR start multiple
SLAMD test on same time for each server with the same account range.
Exceed the maximum write capacity of the servers.
*SLAMD configuration :*
authrate.sh --hostname ${HOSTNAME} --port ${PORTSSL} \
--useSSL --trustStorePath ${CACERTJKS} \
--trustStorePassword ${CACERTJKSPW} --bindDN "${BINDDN}" \
--bindPassword ${BINDPW} --baseDN "${BASEDN}" \
--filter "(uid=[${RANGE}])" --credentials ${USERPW} \
--warmUpIntervals ${WARMUP} \
--numThreads ${NTHREADS} ${ARGS}
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10391
Issue ID: 10391
Summary: Add proper compiler/linker flag for threading support
on HP-UX
Product: OpenLDAP
Version: 2.6.10
Hardware: Other
OS: Other
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: libraries
Assignee: bugs(a)openldap.org
Reporter: michael.osipov(a)innomotics.com
Target Milestone: ---
Created attachment 1085
--> https://bugs.openldap.org/attachment.cgi?id=1085&action=edit
Patch against master
If an application uses a thread-enabled library or the application itself is
thread enabled it is imperative on HP-UX to use the -mt flag through out build.
The compiler will transform into proper -D and -l flags.
From the manpage:
-mt Sets various -D flags to enable multi-threading. Also
sets -lpthread. +Oopenmp automatically implies -mt.
For details see HP C/aC++ Online Programmer's Guide.
Find a Git-formatted patch attached.
Sample output:
libtool: link: /opt/aCC/bin/aCC -AC99 +We901 -o ldapsearch ldapsearch.o
common.o ldsversion.o -L/opt/ports/lib/hpux32
../../libraries/liblutil/liblutil.a ../../libraries/libldap/.libs/libldap.a
-L/opt/ports/lib
/var/tmp/ports/work/openldap-threading-hpux/libraries/liblber/.libs/liblber.a
../../libraries/liblber/.libs/liblber.a /opt/ports/lib/hpux32/libsasl2.so -ldl
-lssl -lcrypto -mt -Wl,+b -Wl,/opt/ports/lib/hpux32
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9564
Issue ID: 9564
Summary: Race condition with freeing the spilled pages from
transaction
Product: LMDB
Version: 0.9.29
Hardware: Other
OS: Mac OS
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: kriszyp(a)gmail.com
Target Milestone: ---
Created attachment 825
--> https://bugs.openldap.org/attachment.cgi?id=825&action=edit
Free the spilled pages and dirty overflows before unlocking the write mutex
The spilled pages (a transaction's mt_spill_pgs) is freed *after* a write
transaction's mutex is unlocked (in mdb.master3). This can result in a race
condition where a second transaction can start and subsequently assign a new
mt_spill_pgs list to the transaction structure that it reuses. If this occurs
after the first transaction unlocks the mutex, but before it performs the free
operation on mt_spill_pgs, then the first transaction will end up calling a
free on the same spilled page list as the second transaction, resulting in a
double free (and crash).
It would seem to be an extremely unlikely scenario to actually happen, since
the free call is literally the next operation after the mutex is unlocked, and
the second transaction would need to make it all the way to the point of saving
the freelist before a page spill list is likely to be allocated. Consequently,
this probably has rarely happened. However, one of our users (see
https://github.com/DoctorEvidence/lmdb-store/issues/48 for the discussion) has
noticed this occurring, and it seems that it may be particularly likely to
happen on MacOS on the new M1 silicon. Perhaps there is some peculiarity to how
the threads are more likely to yield execution after a mutex unlock, I am not
sure. I was able to reproduce the issue by intentionally manipulating the
timing (sleeping before the free) to verify that the race condition is
technically feasible, and apparently this can happen "in the wild" on MacOS, at
least with an M1.
It is also worth noting that this is due to (or a regression from) the fix for
ITS#9155
(https://github.com/LMDB/lmdb/commit/cb256f409bb53efeda4ac69ee8165a0b4fc1a277)
where the free call was moved outside the conditional (for having a parent)
that had previously never been executed after the mutex is unlocked, but now is
called after the unlock.
Anyway, the solution is relatively simple, the free call simply has to be moved
above the unlock. In my patch, I also moved the free call for mt_dirty_ovs. I
am not sure what OVERFLOW_NOTYET/mt_dirty_ovs is for, but presumably it should
be handled the same. This could alternately be solved by saving the reference
to these lists before unlocking, and freeing after unlocking, which would
slightly decrease the amount of processing within the mutex guarded code. Let
me know if you prefer a patch that does that.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10392
Issue ID: 10392
Summary: back-ldap does not return a response if incorrect
secprops is configured
Product: OpenLDAP
Version: 2.6.10
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: nivanova(a)symas.com
Target Milestone: ---
If a non-existent secprops mechanism is configured as part of a back-ldap
idassert-bind directive, back-ldap does not return a response to the following
requests, leaving the client waiting. In the same case meta and asyncmeta would
return an LDAP error. My guess is failure to establish the target connection is
handled incorrectly in that code path.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10381
Issue ID: 10381
Summary: Logformat config is broken
Product: OpenLDAP
Version: 2.6.10
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: hyc(a)openldap.org
Target Milestone: ---
After ITS#10140, logformat config parsing and emitting is broken. Some formats
don't emit in cn=config, some abort. On Windows, log prefix is corrupted.
Patch is being tested.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10191
Issue ID: 10191
Summary: backend searches should respond to pause requests
Product: OpenLDAP
Version: 2.6.7
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: hyc(a)openldap.org
Target Milestone: ---
A long running search will cause mods to cn=config to wait a long time. Search
ops should periodically check for threadpool pause requests.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10393
Issue ID: 10393
Summary: Duplicate test names test090-asyncmeta-conttl and
test090-auditlog
Product: OpenLDAP
Version: 2.6.10
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: test suite
Assignee: bugs(a)openldap.org
Reporter: nivanova(a)symas.com
Target Milestone: ---
Probably patch approval timing problem, I suggest to rename
test090-asyncmeta-conttl to test091-asyncmeta-conttl.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10372
Issue ID: 10372
Summary: support lastbind_forward_updates outside of the
lastbind overlay
Product: OpenLDAP
Version: 2.6.10
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: lklement(a)symas.com
Target Milestone: ---
according to the documentation for the LastBind overlay (slapo-lastbind) this
overlay is no longer recommended unless lastbind_forward_updates is used.
Can this functionality be added to the current lastbind configuration
recommendation?
--
You are receiving this mail because:
You are on the CC list for the issue.