https://bugs.openldap.org/show_bug.cgi?id=9378
Issue ID: 9378
Summary: Crash in mdb_put() / mdb_page_dirty()
Product: LMDB
Version: 0.9.26
Hardware: All
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: nate(a)kde.org
Target Milestone: ---
The KDE Baloo file indexer uses lmdb as its database (source code available at
https://invent.kde.org/frameworks/baloo). Our most common crash, with over 100
duplicate bug reports, is in lmdb. Here's the bug report tracking it:
https://bugs.kde.org/show_bug.cgi?id=389848.
The version of lmdb does not seem to matter much. We have bug reports from Arch
users with lmdb 0.9.26 as well as bug reports from people using many earlier
versions.
Here's an example backtrace, taken from
https://bugs.kde.org/show_bug.cgi?id=426195:
#6 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#7 0x00007f3c0bbb9859 in __GI_abort () at abort.c:79
#8 0x00007f3c0b23ba83 in mdb_assert_fail (env=0x55e2ad710600,
expr_txt=expr_txt@entry=0x7f3c0b23e02f "rc == 0",
func=func@entry=0x7f3c0b23e978 <__func__.7221> "mdb_page_dirty",
line=line@entry=2127, file=0x7f3c0b23e010 "mdb.c") at mdb.c:1542
#9 0x00007f3c0b2306d5 in mdb_page_dirty (mp=<optimized out>,
txn=0x55e2ad7109f0) at mdb.c:2114
#10 mdb_page_dirty (txn=0x55e2ad7109f0, mp=<optimized out>) at mdb.c:2114
#11 0x00007f3c0b231966 in mdb_page_alloc (num=num@entry=1,
mp=mp@entry=0x7f3c0727aee8, mc=<optimized out>) at mdb.c:2308
#12 0x00007f3c0b231ba3 in mdb_page_touch (mc=mc@entry=0x7f3c0727b420) at
mdb.c:2495
#13 0x00007f3c0b2337c7 in mdb_cursor_touch (mc=mc@entry=0x7f3c0727b420) at
mdb.c:6523
#14 0x00007f3c0b2368f9 in mdb_cursor_put (mc=mc@entry=0x7f3c0727b420,
key=key@entry=0x7f3c0727b810, data=data@entry=0x7f3c0727b820,
flags=flags@entry=0) at mdb.c:6657
#15 0x00007f3c0b23976b in mdb_put (txn=0x55e2ad7109f0, dbi=5,
key=key@entry=0x7f3c0727b810, data=data@entry=0x7f3c0727b820,
flags=flags@entry=0) at mdb.c:9022
#16 0x00007f3c0c7124c5 in Baloo::DocumentDB::put
(this=this@entry=0x7f3c0727b960, docId=<optimized out>,
docId@entry=27041423333263366, list=...) at ./src/engine/documentdb.cpp:79
#17 0x00007f3c0c743da7 in Baloo::WriteTransaction::replaceDocument
(this=0x55e2ad7ea340, doc=..., operations=operations@entry=...) at
./src/engine/writetransaction.cpp:232
#18 0x00007f3c0c736b16 in Baloo::Transaction::replaceDocument
(this=this@entry=0x7f3c0727bc10, doc=..., operations=operations@entry=...) at
./src/engine/transaction.cpp:295
#19 0x000055e2ac5d6cbc in Baloo::UnindexedFileIndexer::run
(this=0x55e2ad79ca20) at
/usr/include/x86_64-linux-gnu/qt5/QtCore/qrefcount.h:60
#20 0x00007f3c0c177f82 in QThreadPoolThread::run (this=0x55e2ad717f20) at
thread/qthreadpool.cpp:99
#21 0x00007f3c0c1749d2 in QThreadPrivate::start (arg=0x55e2ad717f20) at
thread/qthread_unix.cpp:361
#22 0x00007f3c0b29d609 in start_thread (arg=<optimized out>) at
pthread_create.c:477
#23 0x00007f3c0bcb6103 in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9736
Issue ID: 9736
Summary: pwrite bug in OSX breaking LMDB promise about the
maximum value size
Product: LMDB
Version: unspecified
Hardware: All
OS: Mac OS
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: renault.cle(a)gmail.com
Target Milestone: ---
Hi,
I was working with LMDB and found an issue when trying to write a value of
approximately 3.3GiB in the database, I dive into the LMDB source code of the
mdb_put method using the lldb debugger and found out that it was not related to
an issue in LMDB itself but rather a bug in the pwrite function of the Mac OS
libc implementation.
The pwrite function is given four parameters, the file descriptor, the buffer,
the count of bytes to write from the buffer and, the offset of where to write
it in the file. On Mac OS the count of bytes is a size_t that must be a 64bits
unsigned integer but when you call pwrite with a number bigger or equal to 2^31
it returns an error 22 (invalid argument). LMDB was returning a 22 error from
the mdb_put call and not an EINVAL because the error was cause by an internal
issue and not something catchable by LMDB.
I am not sure about what we can do, can we implement this single pwrite [1] as
multiple pwrite with counts smaller than 2^31 in a loop, just for Mac OS? Like
for Windows where we do specific things for this operating system too?
I also found this issue on the RocksDB repository [2] about a similar problem
they have with pwrite and write on Mac OS it seems. I understand that this is
not a real promise that LMDB is specifying but rather an "in theory" rule [3].
Thank you for your time,
kero
[1]:
https://github.com/LMDB/lmdb/blob/01b1b7dc204abdf3849536979205dc9e3a0e3ece/…
[2]: https://github.com/facebook/rocksdb/issues/5169
[3]: http://www.lmdb.tech/doc/group__mdb.html#structMDB__val
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10054
Issue ID: 10054
Summary: Value size limited to 2,147,479,552 bytes
Product: LMDB
Version: unspecified
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: louis(a)meilisearch.com
Target Milestone: ---
Hello,
According to the documentation[0], a database that is not using `MDB_DUPSORT`
can store values up to `0xffffffff` bytes (around 4GB).
In practice, under Linux, the actual limit is `0x7ffff000` though (2^31 - 4096,
so around 2GB).
This is due to the write loop in `mdb_page_flush`. The `wsize` value
determining how many bytes will be written can be as big as
`4096*dp->mp_pages`[1], and the number of overflow pages grows with the size of
the value put inside the DB.
The `wsize` is not split in smaller chunks in the case where there are many
overflow pages to write, and as a result the call to `pwrite`[2] does not
perform a full write, but only a "short" write of 2147479552 bytes (the maximum
allowed on a call to `pwrite` on Linux[3]).
This would be OK if the short write condition was handled by looping and
performing another `pwrite` with the rest of the data, but instead `EIO` is
returned[4].
There seems to be a related, but different issue on macOS when trying to
`pwrite` more the 2^31 bytes, that was already reported[5].
This issue was reported to me by a Meilisearch user because it causes their
database indexing to fail[6]. I had to investigate a bit because their setup
was peculiar (high number of documents in their database) and the `EIO` error
code is not very descriptive of the underlying issue.
I join a C reproducer of the issue that attempts to add a 2147479553 bytes
value to the DB and fails with `EIO` (decreasing `nb_items` to a smaller value
such as `2107479552` does succeed)[7].
Thank you for making LMDB!
Louis Dureuil.
[0]:
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#LL284…
[1]:
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#LL3770…
[2]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L3820
[3]:
https://stackoverflow.com/questions/70368651/why-cant-linux-write-more-than…
[4]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L3840
[5]: https://bugs.openldap.org/show_bug.cgi?id=9736
[6]: https://github.com/meilisearch/meilisearch/issues/3654
[7]: https://github.com/dureuill/lmdb_3654/tree/main
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10274
Issue ID: 10274
Summary: Replication issue on MMR configuration
Product: OpenLDAP
Version: 2.5.14
Hardware: All
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs(a)openldap.org
Reporter: falgon.comp(a)gmail.com
Target Milestone: ---
Created attachment 1036
--> https://bugs.openldap.org/attachment.cgi?id=1036&action=edit
In this attachment you will find 2 openldap configurations for 2 instances +
slamd conf exemple + 5 screenshots to show the issue and one text file to
explain what you see
Hello we are openning this issue further to the initial post in technical :
https://lists.openldap.org/hyperkitty/list/openldap-technical@openldap.org/…
Issue :
We are working on a project and we've come across an issue with the replication
after performance testing :
*Configuration :*
RHEL 8.6
OpenLDAP 2.5.14
*MMR-delta *configuration on multiple servers attached
300,000 users configured and used for tests
*olcLastBind: TRUE*
Use of SLAMD (performance shooting)
*Problem description:*
We are currently running performance and resilience tests on our infrastructure
using the SLAMD tool configured to perform BINDs and MODs on a defined range of
accounts.
We use a load balancer (VIP) to poll all of our servers equally. (but it is
possible to do performance tests directly on each of the directories)
With our current infrastructure we're able to perform approximately 300
MOD/BIND/s. Beyond that, we start to generate delays and can randomly come
across one issue.
However, when we run performance tests that exceed our write capacity, our
replication between servers can randomly create an incident with directories
being unable to catch up with their replication delay.
The directories update their contextCSNs, but extremely slowly (like freezing).
From then on, it's impossible for the directories to catch again. (even with no
incoming traffic)
A restart of the instance is required to perform a full refresh and solve the
incident.
We have enabled synchronization logs and have no error or refresh logs to
indicate a problem ( we can provide you with logs if necessary).
We suspect a write collision or a replication conflict but this is never write
in our sync logs.
We've run a lot of tests.
For example, when we run a performance test on a single live server, we don't
reproduce the problem.
Anothers examples: if we define different accounts ranges for each server with
SALMD, we don't reproduce the problem either.
If we use only one account for the range, we don't reproduce the problem
either.
______________________________________________________________________
I have add some screenshots on attachement to show you the issue and all the
explanations.
______________________________________________________________________
*Symptoms :*
One or more directories can no longer be replicated normally after performance
testing ends.
No apparent error logs.
Need a restart of instances to solve the problem.
*How to reproduce the problem:*
Have at least two servers in MMR mode
Set LastBind to TRUE
Perform a SLAMD shot from a LoadBalancer in bandwidth mode OR start multiple
SLAMD test on same time for each server with the same account range.
Exceed the maximum write capacity of the servers.
*SLAMD configuration :*
authrate.sh --hostname ${HOSTNAME} --port ${PORTSSL} \
--useSSL --trustStorePath ${CACERTJKS} \
--trustStorePassword ${CACERTJKSPW} --bindDN "${BINDDN}" \
--bindPassword ${BINDPW} --baseDN "${BASEDN}" \
--filter "(uid=[${RANGE}])" --credentials ${USERPW} \
--warmUpIntervals ${WARMUP} \
--numThreads ${NTHREADS} ${ARGS}
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=10280
Issue ID: 10280
Summary: Combining positive & negated filters doesn't work with
dynlist
Product: OpenLDAP
Version: 2.5.18
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: code(a)pipoprods.org
Target Milestone: ---
The directory contains 3 users & 2 groups.
user1 is in group1, user2 is in group2, user3 isn't is any group.
Filter [1] matches users that are either:
- member of group1
- member of group2
✅ It returns user1 & user2
Filter [2] matches user that are:
- not member of group1 nor group2
✅ It returns user3
Filter [3] should match users that are either:
- member of group1
- member of group2
- not member of group1 nor group2
❌ It should return the 3 users but only returns users matched by the first part
of the filter (whatever the first part, if we swap both parts we get the
complementary search results)
Filter [1]:
(|(memberOf=cn=group1,ou=example-groups,dc=example,dc=com)(memberOf=cn=group2,ou=example-groups,dc=example,dc=com))
Filter [2]:
(!(|(memberOf=cn=group1,ou=example-groups,dc=example,dc=com)(memberOf=cn=group2,ou=example-groups,dc=example,dc=com)))
Filter [3]:
(|(memberOf=cn=group1,ou=example-groups,dc=example,dc=com)(memberOf=cn=group2,ou=example-groups,dc=example,dc=com)(!(|(memberOf=cn=group1,ou=example-groups,dc=example,dc=com)(memberOf=cn=group2,ou=example-groups,dc=example,dc=com))))
Here's my dynlist config:
```
dn: olcOverlay={2}dynlist,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcDynListConfig
olcOverlay: {2}dynlist
olcDynListAttrSet: {0}groupOfURLs memberURL member+memberOf@groupOfNames
structuralObjectClass: olcDynListConfig
entryUUID: 7df8328a-fd72-103e-82df-6fed25d5f6c8
creatorsName: cn=config
createTimestamp: 20240902122741Z
entryCSN: 20240902122741.257759Z#000000#000#000000
modifiersName: cn=config
modifyTimestamp: 20240902122741Z
```
Here's a LDIF to initialise directory contents:
```
dn: ou=example-groups,dc=example,dc=com
changetype: add
objectClass: organizationalUnit
ou: example-groups
dn: ou=example-users,dc=example,dc=com
changetype: add
objectClass: organizationalUnit
ou: example-users
dn: uid=user1,ou=example-users,dc=example,dc=com
changetype: add
objectClass: inetOrgPerson
cn: User
sn: One
uid: user1
dn: uid=user2,ou=example-users,dc=example,dc=com
changetype: add
objectClass: inetOrgPerson
cn: User
sn: Two
uid: user2
dn: uid=user3,ou=example-users,dc=example,dc=com
changetype: add
objectClass: inetOrgPerson
cn: User
sn: Three
uid: user3
dn: cn=group1,ou=example-groups,dc=example,dc=com
changetype: add
objectClass: groupOfNames
cn: group1
member: uid=user1,ou=example-users,dc=example,dc=com
dn: cn=group2,ou=example-groups,dc=example,dc=com
changetype: add
objectClass: groupOfNames
cn: group2
member: uid=user2,ou=example-users,dc=example,dc=com
```
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9431
Issue ID: 9431
Summary: back-mdb: Always have an equality index for
objectClass
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: backends
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
Data storage backends require an equality index on objectClass to function
correctly. As this is a hard requirement it should be automatic with back-mdb.
Why this wasn't done in the past with other backends isn't exactly clear, it
may have been due to their requirements to have additional cache layers. That
however is not necessary with back-mdb.
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=9204
Bug ID: 9204
Summary: slapo-constraint allows anyone to apply Relax control
Product: OpenLDAP
Version: 2.4.49
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: ryan(a)openldap.org
Target Milestone: ---
slapo-constraint doesn't limit who can use the Relax control, beyond the global
limits applied by slapd. In practice, for many modifications this means any
configured constraints are advisory only.
In my opinion this should be considered a bug, in design if not implementation.
I expect many admins would not read the man page closely enough to realize the
behaviour does technically adhere to the letter of what's written there.
Either slapd should require manage privileges for the Relax control globally,
or slapo-constraint should perform a check for manage privilege itself, like
slapo-unique does.
Quoting ando in https://bugs.openldap.org/show_bug.cgi?id=5705#c4:
> Well, a user with "manage" privileges on related data could bypass
> constraints enforced by slapo-constraint(5) by using the "relax"
> control. The rationale is that a user with manage privileges could be
> able to repair an entry that needs to violate a constraint for good
> reasons. Note that the user:
>
> - must have enough privileges to do it (manage)
>
> - must inform the DSA that intends to violate the constraint (by using
> the control)
but such privileges are currently not being required.
--
You are receiving this mail because:
You are on the CC list for the bug.