I had to duplicate an LMDB database for replication recently, and used mdb_copy to do so. One server is using the original data.mdb database (which is sparse) and the other is using the mdb_copy non-sparse data.mdb file. The two servers are identical (hardware, OS, software and configuration). OpenLDAP-2.4.39 is being used, 64 bit Linux OS. mdb_stat shows the map size as the same, which is expected.
Will the use of the non-sparse file cause any performance issues?
The reason for asking is that I am seeing a difference in search times between the two. With 20 million objects, a search on modifyTimestamp (which is indexed) gives:
server 1: approx 1s
server 2: approx 60s
Server 2 started with the same search time as server 1 when the databases were originally copied, but has slowly increased its search time over about a week for this same search.
Geoff Swan wrote:
I had to duplicate an LMDB database for replication recently, and used mdb_copy to do so. One server is using the original data.mdb database (which is sparse)
and the other is using the mdb_copy non-sparse data.mdb file.
If you specified no special options, the file produced by mdb_copy is identical to the original - it will also be sparse if the original is.
The two servers are identical (hardware, OS, software and
configuration). OpenLDAP-2.4.39 is being used, 64 bit Linux OS. mdb_stat shows the map size as the same, which is expected.
Will the use of the non-sparse file cause any performance issues?
Question is irrelevant since both are sparse files.
The reason for asking is that I am seeing a difference in search times between the two. With 20 million objects, a search on modifyTimestamp (which is indexed) gives:
server 1: approx 1s
server 2: approx 60s
Server 2 started with the same search time as server 1 when the databases were originally copied, but has slowly increased its search time over about a week for this same search.
Look at disk I/O and memory usage, the LMDB file itself has no bearing here.
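A minimal sketch of one way to look at the memory side, assuming a Linux host where /proc/meminfo is available. LMDB is memory-mapped, so how much of the database stays in the OS page cache matters for search times. The helper names parse_meminfo and cache_summary are hypothetical and only for illustration; they are not part of OpenLDAP or LMDB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def cache_summary(info):
    """Return free and page-cache memory as percentages of MemTotal."""
    total = info["MemTotal"]
    return {
        "free_pct": 100.0 * info.get("MemFree", 0) / total,
        "cached_pct": 100.0 * info.get("Cached", 0) / total,
    }
```

To use it on a live system, read the file with open("/proc/meminfo").read() and pass the text to parse_meminfo. Sampling it on both servers while the slow search runs would show whether the page cache differs between them; disk I/O needs a separate tool such as iostat.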
--On Monday, March 23, 2015 8:38 PM +0000 Howard Chu <hyc@symas.com> wrote:
Geoff Swan wrote:
I had to duplicate an LMDB database for replication recently, and used mdb_copy to do so. One server is using the original data.mdb database (which is sparse)
and the other is using the mdb_copy non-sparse data.mdb file.
If you specified no special options, the file produced by mdb_copy is identical to the original - it will also be sparse if the original is.
Well, to be clear: while the DB is sparse, mdb_copy by default drops the unused map space.
I.e., if I specify an 80GB maxsize on a 20MB db, the database copy made by mdb_copy will be 20MB, not 80GB. It's still sparse, but the unused portion has been dropped.
[zimbra@zre-ldap003 db]$ ls -l
total 820
-rw-------. 1 zimbra zimbra 52710469632 Mar 23 14:50 data.mdb
-rw-------. 1 zimbra zimbra 8192 Mar 23 14:53 lock.mdb
[zimbra@zre-ldap003 db]$ mkdir -p /tmp/mdb/db
[zimbra@zre-ldap003 db]$ mdb_copy . /tmp/mdb/db
[zimbra@zre-ldap003 db]$ cd /tmp/mdb/db
[zimbra@zre-ldap003 db]$ ls -l
total 816
-rw-r-----. 1 zimbra zimbra 835584 Mar 23 14:54 data.mdb
I think this is the behavior they're referring to. However, in my experience, after starting up slapd with an mdb_copy'd db, where sparse files are in use, the size will be set to whatever slapd's configured to use after slapd is started. For example:
[zimbra@zre-ldap003 db]$ cd
[zimbra@zre-ldap003 ~]$ ldap stop
Killing slapd with pid 29463 done.
[zimbra@zre-ldap003 ~]$ cd data/ldap/mdb
[zimbra@zre-ldap003 mdb]$ mv db db.old
[zimbra@zre-ldap003 mdb]$ mv /tmp/mdb/db .
[zimbra@zre-ldap003 mdb]$ cd db
[zimbra@zre-ldap003 db]$ ls -l
total 816
-rw-r-----. 1 zimbra zimbra 835584 Mar 23 14:54 data.mdb
[zimbra@zre-ldap003 db]$ ldap start
Started slapd: pid 28079
[zimbra@zre-ldap003 db]$ ls -l
total 820
-rw-r-----. 1 zimbra zimbra 52710469632 Mar 23 14:55 data.mdb
-rw-------. 1 zimbra zimbra 8192 Mar 23 14:55 lock.mdb
If that is not being seen, then your configurations are not as identical as thought.
--Quanah
--
Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
On 24/03/2015 6:56 AM, Quanah Gibson-Mount wrote:
--On Monday, March 23, 2015 8:38 PM +0000 Howard Chu <hyc@symas.com> wrote:
Geoff Swan wrote:
I had to duplicate an LMDB database for replication recently, and used mdb_copy to do so. One server is using the original data.mdb database (which is sparse)
and the other is using the mdb_copy non-sparse data.mdb file.
If you specified no special options, the file produced by mdb_copy is identical to the original - it will also be sparse if the original is.
Well, to be clear: While the DB is sparse, mdb_copy does drop the unused map space when using mdb_copy by default.
That is what I have seen. The filesystem reports 2TB file size for the first server, and 590GB file size for the mdb copy, with default options.
I.e., if I specified an 80GB maxsize on a 20MB db, then the database copy done via mdb_copy will be 20MB not 80GB. I.e., it's still sparse, but the unused portion has been dropped.
OK
I think this is the behavior they're referring to. However, in my experience, after starting up slapd with an mdb_copy'd db, where sparse files are in use, the size will be set to whatever slapd's configured to use after slapd is started.
This is not what is being seen. The file size remains at 590GB on the server using the copy, and 2TB on the original server.
If that is not being seen, then your configurations are not as identical as thought.
The configurations are identical. The servers are direct copies of each other. Hence the question concerning performance as one is ramping up to a much slower search time than the other.
--On Wednesday, March 25, 2015 8:41 AM +1100 Geoff Swan <gswan3@bigpond.net.au> wrote:
Well, to be clear: While the DB is sparse, mdb_copy does drop the unused map space when using mdb_copy by default.
That is what I have seen. The filesystem reports 2TB file size for the first server, and 590GB file size for the mdb copy, with default options.
Use du -c to get actual used space instead of the maxsize value.
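As a rough illustration of the difference between the size that ls -l reports and the space du reports, here is a small sketch that reads both values from os.stat. It assumes a POSIX system where st_blocks is counted in 512-byte units, as on Linux. The function names file_sizes and looks_sparse are made up for this example.

```python
import os


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is what ls -l shows; allocated_bytes is roughly what du shows.
    """
    st = os.stat(path)
    apparent = st.st_size
    allocated = st.st_blocks * 512  # assumption: 512-byte units, as on Linux
    return apparent, allocated


def looks_sparse(path):
    """True when the file occupies less disk space than its apparent size."""
    apparent, allocated = file_sizes(path)
    return allocated < apparent
```

Running file_sizes on data.mdb on each server would show whether the 2TB and 590GB figures are apparent sizes only, and how much disk each file actually occupies.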
I think this is the behavior they're referring to. However, in my experience, after starting up slapd with an mdb_copy'd db, where sparse files are in use, the size will be set to whatever slapd's configured to use after slapd is started.
This is not what is being seen. The file size remains at 590GB on the server using the copy, and 2TB on the original server.
Then there's something odd happening on your server where you placed the copy. What OpenLDAP version are you using?
--Quanah
On 26/03/2015 6:23 AM, Quanah Gibson-Mount wrote:
--On Wednesday, March 25, 2015 8:41 AM +1100 Geoff Swan <gswan3@bigpond.net.au> wrote:
Well, to be clear: While the DB is sparse, mdb_copy does drop the unused map space when using mdb_copy by default.
That is what I have seen. The filesystem reports 2TB file size for the first server, and 590GB file size for the mdb copy, with default options.
Use du -c to get actual used space instead of the maxsize value.
I think this is the behavior they're referring to. However, in my experience, after starting up slapd with an mdb_copy'd db, where sparse files are in use, the size will be set to whatever slapd's configured to use after slapd is started.
This is not what is being seen. The file size remains at 590GB on the server using the copy, and 2TB on the original server.
Then there's something odd happening on your server where you placed the copy. What OpenLDAP version are you using?
It is 2.4.39. I might try a slapcat/slapadd to rebuild the db file and see if that corrects the problem.
On 26/03/2015 8:13 AM, Geoff Swan wrote:
On 26/03/2015 6:23 AM, Quanah Gibson-Mount wrote:
--On Wednesday, March 25, 2015 8:41 AM +1100 Geoff Swan <gswan3@bigpond.net.au> wrote:
Well, to be clear: While the DB is sparse, mdb_copy does drop the unused map space when using mdb_copy by default.
That is what I have seen. The filesystem reports 2TB file size for the first server, and 590GB file size for the mdb copy, with default options.
Use du -c to get actual used space instead of the maxsize value.
I think this is the behavior they're referring to. However, in my experience, after starting up slapd with an mdb_copy'd db, where sparse files are in use, the size will be set to whatever slapd's configured to use after slapd is started.
This is not what is being seen. The file size remains at 590GB on the server using the copy, and 2TB on the original server.
Then there's something odd happening on your server where you placed the copy. What OpenLDAP version are you using?
It is 2.4.39. I might try a slapcat/slapadd to rebuild the db file and see if that corrects the problem.
Further testing may give some clues. The search is on modifyTimestamp, in particular branches, to find objects with modifyTimestamp>=value. If value is fairly close to the current date and time, the search returns quickly. However, if value is a few days ago, the search appears to take many hours, even though no objects match the filter (i.e. the result set size has no effect). I am not sure why this should be the case, given that modifyTimestamp is indexed and there is plenty of memory, with 30-50% free during the search operation.
On 24/03/2015 6:38 AM, Howard Chu wrote:
Geoff Swan wrote:
I had to duplicate an LMDB database for replication recently, and used mdb_copy to do so. One server is using the original data.mdb database (which is sparse)
and the other is using the mdb_copy non-sparse data.mdb file.
If you specified no special options, the file produced by mdb_copy is identical to the original - it will also be sparse if the original is.
The two servers are identical (hardware, OS, software and
configuration). OpenLDAP-2.4.39 is being used, 64 bit Linux OS. mdb_stat shows the map size as the same, which is expected.
Will the use of the non-sparse file cause any performance issues?
Question is irrelevant since both are sparse files.
Thanks for the clarification, Howard. The data file from the mdb_copy snapshot was transferred over the network to the other server using scp, which I understand does not recognise sparse files, so the copy is likely to be non-sparse. I guess this would be considered a corruption, and it would be best to start with a fresh copy?
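For what it is worth, a file that has lost its holes in transit still contains the same bytes, since holes read back as zeros. Tools such as GNU cp --sparse=always or rsync --sparse can recreate the holes. The sketch below shows the same idea in Python: copy a file block by block and seek over all-zero blocks instead of writing them. The function name sparse_copy and the 1MB block size are arbitrary choices for illustration. It assumes slapd is stopped and that it is run against a copy, not a live database.

```python
import os


def sparse_copy(src, dst, block_size=1024 * 1024):
    """Copy src to dst, leaving holes wherever src has all-zero blocks."""
    zero = bytes(block_size)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            if chunk == zero[:len(chunk)]:
                # Skip over the zero block; the filesystem may leave a hole here.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # If the file ends in skipped zeros, extend dst to the full source length.
        fout.truncate(fin.tell())
```

Whether the skipped regions really become holes depends on the filesystem. The resulting file has the same length and contents as the source either way.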
The reason for asking is that I am seeing a difference in search times between the two. With 20 million objects, a search on modifyTimestamp (which is indexed) gives:
server 1: approx 1s
server 2: approx 60s
Server 2 started with the same search time as server 1 when the databases were originally copied, but has slowly increased its search time over about a week for this same search.
Look at disk I/O and memory usage, the LMDB file itself has no bearing here.
openldap-technical@openldap.org