https://bugs.openldap.org/show_bug.cgi?id=9360
Issue ID: 9360
Summary: MDB_BAD_TXN: Transaction must abort, has a child, or
is invalid
Product: LMDB
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: spam(a)markandruth.co.uk
Target Milestone: ---
I have 2 Python scripts writing to a database (lmdb 0.9.26, py-lmdb 0.98) and
5-10 long-running Lua processes (using the lightningmdb module, which uses
lmdb 0.9.22) serving queries from the same database.
The database seems fine, not corrupted, and the Python writes keep working the
whole time. But periodically (perhaps 10-20% of the time), in a way I cannot
reliably reproduce, a Lua process starts up and then every query it issues
fails: the dbi_open call inside its transaction returns "MDB_BAD_TXN:
Transaction must abort, has a child, or is invalid". Restarting the processes
immediately does not fix the issue, but stopping both the Lua and Python
processes and starting them again after a 5-20s wait usually does. This has
been reproduced on multiple servers, but I'm at a loss as to how to debug it
any further.
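For reference, a minimal C sketch of the pattern liblmdb expects around
mdb_dbi_open (open the named handle once, in its own transaction, then reuse
it); this is only an illustration of the C API, not an analysis of what the
lightningmdb or py-lmdb bindings actually do:

    #include <lmdb.h>

    /* Open a named DBI once and cache the handle for later transactions.
     * mdb_dbi_open() returns MDB_BAD_TXN when the transaction it is given
     * is already unusable (it has an open child, or has hit an error). */
    int open_db_once(MDB_env *env, const char *name, MDB_dbi *dbi)
    {
        MDB_txn *txn;
        int rc = mdb_txn_begin(env, NULL, 0, &txn);  /* write txn, so MDB_CREATE is allowed */
        if (rc) return rc;
        rc = mdb_dbi_open(txn, name, MDB_CREATE, dbi);
        if (rc) { mdb_txn_abort(txn); return rc; }
        return mdb_txn_commit(txn);                  /* the MDB_dbi stays valid after commit */
    }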
https://bugs.openldap.org/show_bug.cgi?id=9208
Bug ID: 9208
Summary: LMDB feature request: variant of mdb_env_copy{,fd2}
that takes transaction as parameter
Product: LMDB
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: github(a)nicwatson.org
Target Milestone: ---
The mdb_env_copy* functions create a read transaction themselves to run the
backup on. New variants of these functions (one for mdb_env_copy2 and one for
mdb_env_copyfd2) would take a transaction parameter, and that transaction
would be used instead of creating a new one.
Application code could use these new functions to synchronize consistent live
backups across multiple LMDB instances (potentially across multiple hosts).
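For illustration only, one possible shape for such a variant, modelled on the
existing mdb_env_copy2() prototype; the name and signature below are
hypothetical, not part of liblmdb:

    #include <lmdb.h>

    /* Hypothetical prototype -- not an existing liblmdb function.
     * It would behave like mdb_env_copy2(), but run the backup on the
     * caller-supplied read-only transaction instead of opening its own,
     * so an application could first take read snapshots of several
     * environments (possibly on several hosts) at a coordinated point
     * in time, and then copy each environment from its own snapshot. */
    int mdb_env_copy_txn(MDB_env *env, MDB_txn *txn,
                         const char *path, unsigned int flags);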
https://bugs.openldap.org/show_bug.cgi?id=9619
Issue ID: 9619
Summary: mdb_env_copy2 with MDB_CP_COMPACT in mdb.master3
produces corrupt mdb file
Product: LMDB
Version: 0.9.29
Hardware: All
OS: Windows
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: kriszyp(a)gmail.com
Target Milestone: ---
When copying an LMDB database with mdb_env_copy2 and the MDB_CP_COMPACT flag
on mdb.master3, the resulting mdb file appears to be corrupt: using it with
LMDB produces segmentation faults. Copying without the compacting flag seems
to work fine. I apologize that this is not a very good issue report; I haven't
had a chance to narrow it down to a more reproducible/isolated case, or to
look for a patch. I thought I would report it in case there are any ideas on
what could cause this. The segmentation faults always seem to be faults on
memory writes (as opposed to faults on reads). Then again, perhaps the current
backup/copy functionality is eventually going to be replaced by incremental
backup/copy anyway
(https://twitter.com/hyc_symas/status/1315651814096875520). I'll try to update
this if I get a chance to investigate more, but otherwise feel free to
ignore it or consider it low priority, since the workaround is easy.
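For reference, the call in question is roughly the following (a minimal sketch
using the existing mdb_env_copy2() API; the source and destination paths are
placeholders):

    #include <lmdb.h>
    #include <stdio.h>

    int main(void)
    {
        MDB_env *env;
        int rc = mdb_env_create(&env);
        if (rc) return rc;
        rc = mdb_env_open(env, "./srcdb", MDB_RDONLY, 0664);  /* source environment */
        if (rc) { mdb_env_close(env); return rc; }

        /* Compacting copy: omits free pages and renumbers the rest.
         * Omitting MDB_CP_COMPACT (flags = 0) reportedly works fine. */
        rc = mdb_env_copy2(env, "./dstdb", MDB_CP_COMPACT);
        if (rc) fprintf(stderr, "copy failed: %s\n", mdb_strerror(rc));

        mdb_env_close(env);
        return rc;
    }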
https://bugs.openldap.org/show_bug.cgi?id=9475
Issue ID: 9475
Summary: Add support for MAP_POPULATE
Product: LMDB
Version: unspecified
Hardware: All
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: aa531811820(a)gmail.com
Target Milestone: ---
In some cases (such as on cloud computing platforms), reading large files is
very fast while reading small files is very slow, and we have enough memory,
so we would like to prefetch the entire LMDB file into memory when it is
mmap'd, via the MAP_POPULATE flag. According to our tests, this is faster than
using the readahead flag. Here are some test figures:
# mmap with no readahead
read one sample: 0.2s
total time: 4800s
# mmap with readahead
read one sample: 0.0001s~0.03s
total time: 95.86s
# mmap with MAP_POPULATE
db init: 20s
read one sample: 0.0001s
total time: 78s
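For illustration, this is the flag in question on Linux, shown outside of LMDB
itself (inside LMDB it would need to be added to the flags of the mmap() call
made while opening the environment); the command-line handling is just a
placeholder:

    #define _GNU_SOURCE            /* MAP_POPULATE is Linux-specific */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);

        /* MAP_POPULATE asks the kernel to pre-fault the whole mapping up
         * front (reading the file into the page cache), instead of faulting
         * pages in lazily one by one. */
        void *p = mmap(NULL, st.st_size, PROT_READ,
                       MAP_SHARED | MAP_POPULATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* ... read from the mapping ... */
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }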
https://bugs.openldap.org/show_bug.cgi?id=9397
Issue ID: 9397
Summary: LMDB: A second process opening a file with
MDB_WRITEMAP can cause the first to SIGBUS
Product: LMDB
Version: 0.9.26
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: github(a)nicwatson.org
Target Milestone: ---
Created attachment 780
--> https://bugs.openldap.org/attachment.cgi?id=780&action=edit
Full reproduction of SIGBUS MDB_WRITEMAP issue (works on Linux only)
The fundamental problem is that an ftruncate() on Linux that makes a file
smaller will cause accesses past the new end of the file to SIGBUS (see the
mmap man page).
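As a standalone illustration of that mechanism, independent of LMDB (the file
name is a placeholder; the last access is expected to crash with SIGBUS on
Linux):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("demo.bin", O_RDWR | O_CREAT, 0644);
        ftruncate(fd, 8192);                        /* file is two pages long */

        char *map = mmap(NULL, 8192, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        ftruncate(fd, 4096);                        /* shrink the file by one page */

        map[0] = 'x';                               /* still inside the file: fine */
        map[4096] = 'x';                            /* past the new EOF: SIGBUS */
        return 0;
    }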
The sequence that causes a SIGBUS involves two processes.
1. The first process opens a new LMDB file with MDB_WRITEMAP.
2. The second process opens the same LMDB file with MDB_WRITEMAP and with an
explicit map_size smaller than the first process's map size.
* This causes an ftruncate that makes the underlying file *smaller*.
3. (Optional) The second process closes the environment and exits.
4. The first process opens a write transaction and writes a bunch of data.
5. The first process commits the transaction. This causes a memory read from
the mapped memory that's now past the end of the file. On Linux, this triggers
a SIGBUS.
Attached is code that fully reproduces the problem on Linux.
The most straightforward solution is to allow ftruncate to *reduce* the file
size only when the caller is the only reader. Another possibility is to check
the file size, and ftruncate if necessary, every time a write transaction is
opened. A third possibility is to catch the SIGBUS signal.
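A minimal sketch of the second option, written here as an application-side
workaround using only the public API (liblmdb itself would do the equivalent
internally when a write transaction starts); note this only prevents the
SIGBUS, it does not recover page contents lost to the earlier truncation:

    #include <lmdb.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int begin_write_txn_checked(MDB_env *env, MDB_txn **txn)
    {
        MDB_envinfo info;
        mdb_filehandle_t fd;
        struct stat st;

        mdb_env_info(env, &info);               /* info.me_mapsize = configured map size */
        mdb_env_get_fd(env, &fd);
        fstat(fd, &st);

        if ((size_t)st.st_size < info.me_mapsize)
            ftruncate(fd, info.me_mapsize);     /* grow the file back under the mapping */

        return mdb_txn_begin(env, NULL, 0, txn);
    }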
Repro note: I used clone() to create the subprocess to most straightforwardly
demonstrate that the problem is not due to inherited file descriptors. The
problem still manifests when the processes are completely independent.
https://bugs.openldap.org/show_bug.cgi?id=9207
Bug ID: 9207
Summary: Remove Moznss compatibility layer
Product: OpenLDAP
Version: 2.5
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: libraries
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
For the 2.5 release, remove the MozNSS compatibility layer.
https://bugs.openldap.org/show_bug.cgi?id=9204
Bug ID: 9204
Summary: slapo-constraint allows anyone to apply Relax control
Product: OpenLDAP
Version: 2.4.49
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: overlays
Assignee: bugs(a)openldap.org
Reporter: ryan(a)openldap.org
Target Milestone: ---
slapo-constraint doesn't limit who can use the Relax control, beyond the global
limits applied by slapd. In practice, for many modifications this means any
configured constraints are advisory only.
In my opinion this should be considered a bug, in design if not in
implementation. I expect many admins would not read the man page closely
enough to realize that the behaviour technically adheres to the letter of
what's written there.
Either slapd should require manage privileges for the Relax control globally,
or slapo-constraint should perform a check for manage privilege itself, like
slapo-unique does.
Quoting ando in https://bugs.openldap.org/show_bug.cgi?id=5705#c4:
> Well, a user with "manage" privileges on related data could bypass
> constraints enforced by slapo-constraint(5) by using the "relax"
> control. The rationale is that a user with manage privileges could be
> able to repair an entry that needs to violate a constraint for good
> reasons. Note that the user:
>
> - must have enough privileges to do it (manage)
>
> - must inform the DSA that intends to violate the constraint (by using
> the control)
but such privileges are currently not being required.
https://bugs.openldap.org/show_bug.cgi?id=9223
Bug ID: 9223
Summary: Add support for incremental backup
Product: LMDB
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: quanah(a)openldap.org
Target Milestone: ---
For LMDB 1.0, add support for incremental backups.
https://bugs.openldap.org/show_bug.cgi?id=9434
Issue ID: 9434
Summary: Abysmal write performance with certain data patterns
Product: LMDB
Version: 0.9.24
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: ---
Component: liblmdb
Assignee: bugs(a)openldap.org
Reporter: tina(a)tina.pm
Target Milestone: ---
Created attachment 784
--> https://bugs.openldap.org/attachment.cgi?id=784&action=edit
Monitoring graph of disk usage
Hi,
I have recently written a project for a customer that relies heavily on LMDB
and in which performance is critical. Sadly, after completing the project I
started having all kinds of problems once the DB started to grow. This has
gotten so bad that the project release had to be postponed, and I have been
asked to rewrite the DB layer using a different engine unless I can find a
solution quickly.
I have so far found 4 serious issues, which I suspect are related either to
the size of the database or to the patterns of the data:
* Writing a value in some of the subdatabases has become increasingly slow,
and commits are taking far too long to complete. This is running on a powerful
computer with SSDs, and the 95th percentile of commit times is around 400ms.
The single-writer limitation means I have run out of optimisations to try.
* For some reason I cannot understand, the disk usage has grown to over 2x the
size of the actual data stored, and the free space does not seem to be
reclaimed. The file takes up 348 GB, while the used pages amount to only 162
GB.
* A couple of days ago there was a sudden spike in disk usage (not correlated
with increases in actual data stored, or even with the last pageno used) that
filled the disk in a couple of hours. You can see this in the attached
captures of the monitoring graphs, which show actual disk usage (bottom) and
page counts as reported by LMDB (top). The bottom graph is total disk usage
for the partition, which is almost exclusively the database; ignore the few
dips in size, which come from removing other files.
* Running `mdb_dump` for backups takes up to 7 hours for the database, and
restores are totally unusable: I tried to re-create the database after the
weird space spike and had to stop after 24h, when not even 30% of the data had
been restored! This alone is a deal-breaker, as we have no usable way to back
up and restore the database.
For context, this is the mdb_stat output with a description of each
subdatabase. I have no explanation for the ridiculous number of free pages,
and even running mdb_stat takes a few seconds:
Environment Info
Map address: (nil)
Map size: 397166026752
Page size: 4096
Max pages: 96964362
Number of pages used: 90991042
Last transaction ID: 14647267
Max readers: 126
Number of readers used: 4
Freelist Status
Tree depth: 3
Branch pages: 26
Leaf pages: 5168
Overflow pages: 74319
Entries: 111981
Free pages: 36352392
Status of Main DB
Tree depth: 1
Branch pages: 0
Leaf pages: 1
Overflow pages: 0
Entries: 8
Status of audit_log
Tree depth: 4
Branch pages: 309
Leaf pages: 69154
Overflow pages: 6082343
Entries: 2061655
* Audit log: MDB_INTEGERKEY, big values (12 KB avg). Append only, few reads.
Status of audit_idx
Tree depth: 4
Branch pages: 261
Leaf pages: 27310
Overflow pages: 0
Entries: 2006963
* Audit index 1: 40 byte keys, 8 byte values. Append only; it has fewer
records because I disabled it yesterday due to its impact on performance.
Status of time_idx
Tree depth: 3
Branch pages: 22
Leaf pages: 4611
Overflow pages: 0
Entries: 2061655
* Audit index 2: MDB_INTEGERKEY, MDB_DUPSORT, MDB_DUPFIXED; 40 byte values.
Append only.
Status of item_db
Tree depth: 4
Branch pages: 132
Leaf pages: 10040
Overflow pages: 0
Entries: 186291
* Main data store: 40 byte keys, small values (220b avg). Lots of reads and new
records, very few deletes and no updates.
Status of user_state_db
Tree depth: 5
Branch pages: 83283
Leaf pages: 9289578
Overflow pages: 32
Entries: 207894432
* User state: 20-40 byte keys, small values (180b avg), *many* entries. Lots
of reads and updates.
Status of item_users_idx
Tree depth: 4
Branch pages: 203
Leaf pages: 16532
Overflow pages: 0
Entries: 1035586217
* User / data matrix index: MDB_DUPSORT; 40 byte keys, 20-40 byte values,
*really big*. Lots of writes, very few deletes and no updates.
Status of user_log
Tree depth: 5
Branch pages: 361275
Leaf pages: 26570347
Overflow pages: 0
Entries: 1035586217
* User log: 30-50 byte keys, small values (100b avg), 1e9 records. Append
only, very few reads. I had to stop the restore operation while this was being
recreated, because after 24h only 50% of the entries had been restored. Thanks
to monitoring, I measured this maxing out at 7,000 entries per second; the
other databases restored at much slower rates than this!
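(For scale, using the figures above: restoring user_log's 1,035,586,217
entries at the observed peak of 7,000 entries per second would alone take
roughly 1.04e9 / 7,000 ≈ 148,000 seconds, i.e. about 41 hours, which is
consistent with only around half of them completing within 24 hours.)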
Any help would be really appreciated!
Thanks. Tina.
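For reference, the Environment Info figures above can also be read
programmatically, which may help with the monitoring mentioned earlier (a
minimal sketch using the real mdb_env_info()/mdb_env_stat() calls; the
database path is a placeholder):

    #include <lmdb.h>
    #include <stdio.h>

    int main(void)
    {
        MDB_env *env;
        MDB_envinfo info;
        MDB_stat st;

        mdb_env_create(&env);
        mdb_env_open(env, "./db", MDB_RDONLY, 0664);   /* path is a placeholder */

        mdb_env_info(env, &info);
        mdb_env_stat(env, &st);

        /* Same data that "mdb_stat -e" prints as Environment Info. */
        printf("Map size: %zu\n", info.me_mapsize);
        printf("Page size: %u\n", st.ms_psize);
        printf("Number of pages used: %zu\n", info.me_last_pgno + 1);
        printf("Last transaction ID: %zu\n", info.me_last_txnid);
        printf("Max readers: %u\n", info.me_maxreaders);
        printf("Number of readers used: %u\n", info.me_numreaders);

        mdb_env_close(env);
        return 0;
    }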