We have an environment with no flags that contains a database with no flags. The database is append-only: no deletions or modifications. It is written by a single RW transaction, with no RO transactions active. We observe that when we commit and recreate the RW transaction every 2000 insertion ops, the data.mdb file on disk is 2x larger than when committing every 64000 insertion ops. The mdb_copy -c utility shrinks the large 2000-ops-per-commit file to almost the same size as the 64000-ops-per-commit one. mdb_stat -e on the data.mdb shows that when we commit more often and get the bigger file, the number of used pages is larger by the same proportion.
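
For reference, our write loop is essentially the following (a minimal sketch in C against the LMDB API; the function and variable names and the batch constant are illustrative, and error handling is mostly omitted):

#include <lmdb.h>

#define COMMIT_EVERY 2000  /* batch size we vary: 2000 vs 64000 */

/* Sketch of our write loop: a single RW transaction,
 * committed and reopened every COMMIT_EVERY insertions. */
void write_loop(MDB_env *env, MDB_dbi dbi,
                const char *keys[], size_t keylens[],
                const unsigned int vals[], size_t n)
{
    MDB_txn *txn;
    mdb_txn_begin(env, NULL, 0, &txn);   /* RW transaction */
    for (size_t i = 0; i < n; i++) {
        MDB_val k = { keylens[i], (void *)keys[i] };
        MDB_val v = { sizeof(unsigned int), (void *)&vals[i] };
        mdb_put(txn, dbi, &k, &v, 0);    /* append-only inserts */
        if ((i + 1) % COMMIT_EVERY == 0) {
            mdb_txn_commit(txn);         /* make updates visible to readers */
            mdb_txn_begin(env, NULL, 0, &txn);
        }
    }
    mdb_txn_commit(txn);
}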

In production we will have several large DBs (>1TB) on an NVMe card, and we do not have the 2x spare space for periodic mdb_copy -c compactions (nor can we stop the writing process). We also need to commit every 2000 write ops, because there will be short-lived RO transactions that need to see the DB updates every 2000 writes.
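
The reader side looks roughly like this (again an illustrative sketch, not our exact code). An RO transaction is a snapshot of the last commit, so uncommitted writes are invisible to it, which is what forces the 2000-op commit cadence:

#include <lmdb.h>
#include <string.h>

/* Sketch of a short-lived reader. The RO transaction sees only data
 * committed before mdb_txn_begin(); the value is copied out before
 * the snapshot is released. */
int read_u32(MDB_env *env, MDB_dbi dbi, MDB_val key, unsigned int *out)
{
    MDB_txn *txn;
    MDB_val val;
    int rc = mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
    if (rc) return rc;
    rc = mdb_get(txn, dbi, &key, &val);
    if (rc == 0)
        memcpy(out, val.mv_data, sizeof *out); /* copy before ending txn */
    mdb_txn_abort(txn); /* ends the snapshot, frees the reader slot */
    return rc;
}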

1.  Why does the file size on disk depend on the commit frequency? (Our guess is that with less frequent commits LMDB can pack the data into pages more efficiently.)

2.  What can we do to reduce the size of data.mdb if we must commit frequently? Can we use any environment, transaction, or DB flags, or anything else?

We are on Linux 5.4.0 with an ext4 filesystem. The DB that grows 2x faster with more frequent commits has a bytearr key -> u32 val structure (the byte-array key is between 31 and 36 bytes). Another DB with the reverse structure, u32 key -> bytearr, grows only 10% larger in the more-frequent-commits regime.
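
For completeness, this is roughly how we collect the numbers programmatically, equivalent to mdb_stat -e plus per-database stats (a sketch; opening of the dbi handle is omitted). The ratio ms_entries / ms_leaf_pages gives the average leaf fill, which is what we compared between the two commit regimes:

#include <lmdb.h>
#include <stdio.h>

/* Sketch: print environment-wide and per-database page statistics,
 * the programmatic counterpart of `mdb_stat -e`. */
void print_stats(MDB_env *env, MDB_dbi dbi)
{
    MDB_envinfo info;
    MDB_stat st;
    MDB_txn *txn;

    mdb_env_info(env, &info);
    printf("last_pgno: %zu\n", (size_t)info.me_last_pgno);

    mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
    mdb_stat(txn, dbi, &st);
    printf("psize=%u depth=%u branch=%zu leaf=%zu overflow=%zu entries=%zu\n",
           st.ms_psize, st.ms_depth,
           (size_t)st.ms_branch_pages, (size_t)st.ms_leaf_pages,
           (size_t)st.ms_overflow_pages, (size_t)st.ms_entries);
    mdb_txn_abort(txn);
}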