Yeah, that worked.

DB_CONFIG now contains:

# one 1 GB cache
set_cachesize 0 1073741824 1

# Data Directory
set_data_dir /var/lib/ldap/domain

# Transaction Log settings
set_lg_regionmax        1048576
set_lg_max              10485760
set_lg_bsize            2097152
set_lg_dir /var/lib/ldap/domain

# autoremove log files
set_flags DB_LOG_AUTOREMOVE
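
As a sanity check on the numbers above, here is a minimal sketch that decodes the sizes these directives ask for. It assumes the documented Berkeley DB argument order for set_cachesize (gbytes, bytes, ncache); the function names parse_db_config and cache_bytes are made up for illustration and are not part of any OpenLDAP or Berkeley DB tool.

```python
# Sketch: decode the cache and log sizes requested by DB_CONFIG directives.
# Assumption: set_cachesize takes <gbytes> <bytes> <ncache>, in that order.

DB_CONFIG = """\
# one 1 GB cache
set_cachesize 0 1073741824 1
set_lg_regionmax        1048576
set_lg_max              10485760
set_lg_bsize            2097152
"""

def parse_db_config(text):
    """Return {directive: [int, ...]} for numeric directives; skip comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = line.split()
        if args and all(a.isdigit() for a in args):
            settings[name] = [int(a) for a in args]
    return settings

def cache_bytes(settings):
    """Total cache size in bytes: gbytes * 2**30 + bytes."""
    gbytes, nbytes, _ncache = settings["set_cachesize"]
    return gbytes * 2**30 + nbytes

settings = parse_db_config(DB_CONFIG)
total = cache_bytes(settings)
print(f"cache:         {total / 2**20:.0f} MiB")
print(f"log file size: {settings['set_lg_max'][0] / 2**20:.0f} MiB")
print(f"log buffer:    {settings['set_lg_bsize'][0] / 2**20:.0f} MiB")
```

With the values above this reports a 1024 MiB cache, 10 MiB log files, and a 2 MiB log buffer, which matches the 1.0G __db.003 and the 10M log.* files in the directory listing further down.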

Stats now show:

80      Last allocated locker ID
0x7fffffff      Current maximum unused locker ID
9       Number of lock modes
1000    Maximum number of locks possible
1000    Maximum number of lockers possible
1000    Maximum number of lock objects possible
80      Number of lock object partitions
16      Number of current locks
724     Maximum number of locks at any one time
12      Maximum number of locks in any one bucket
9       Maximum number of locks stolen by for an empty partition
2       Maximum number of locks stolen for any one partition
80      Number of current lockers
82      Maximum number of lockers at any one time
16      Number of current lock objects
391     Maximum number of lock objects at any one time
6       Maximum number of lock objects in any one bucket
0       Maximum number of objects stolen by for an empty partition
0       Maximum number of objects stolen for any one partition
1630324 Total number of locks requested
1630299 Total number of locks released
0       Total number of locks upgraded
18      Total number of locks downgraded
1       Lock requests not available due to conflicts, for which we waited
0       Lock requests not available due to conflicts, for which we did not wait
0       Number of deadlocks
0       Lock timeout value
0       Number of locks that have timed out
0       Transaction timeout value
0       Number of transactions that have timed out
744KB   The size of the lock region
1       The number of partition locks that required waiting (0%)
1       The maximum number of times any partition lock was waited for (0%)
0       The number of object queue operations that required waiting (0%)
0       The number of locker allocations that required waiting (0%)
0       The number of region locks that required waiting (0%)
6       Maximum hash bucket length
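
One way to read these numbers is to compare each peak against its configured maximum. The following is a rough sketch, not an official tool: it assumes the label strings exactly as they appear in the db_stat -c output above (other Berkeley DB versions may word them differently), and the names parse_stats and peak_usage are hypothetical.

```python
# Sketch: report peak lock-table usage from `db_stat -c` text.
# Assumption: labels match the output pasted above word for word.

SAMPLE = """\
1000    Maximum number of locks possible
1000    Maximum number of lockers possible
1000    Maximum number of lock objects possible
724     Maximum number of locks at any one time
82      Maximum number of lockers at any one time
391     Maximum number of lock objects at any one time
"""

# (configured limit label, observed peak label) for each resource
PAIRS = {
    "locks": ("Maximum number of locks possible",
              "Maximum number of locks at any one time"),
    "lockers": ("Maximum number of lockers possible",
                "Maximum number of lockers at any one time"),
    "lock objects": ("Maximum number of lock objects possible",
                     "Maximum number of lock objects at any one time"),
}

def parse_stats(text):
    """Map each label to its integer value; skip lines like '348M ...'."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            stats[parts[1].strip()] = int(parts[0])
    return stats

def peak_usage(stats):
    """Return {resource: peak / limit} for every pair found in stats."""
    usage = {}
    for name, (limit_label, peak_label) in PAIRS.items():
        if limit_label in stats and peak_label in stats:
            usage[name] = stats[peak_label] / stats[limit_label]
    return usage

usage = peak_usage(parse_stats(SAMPLE))
for name, frac in usage.items():
    print(f"{name}: peak {frac:.0%} of configured maximum")
```

For comparison, the earlier run quoted below peaked at 986 of 1000 locks, so the same calculation would put it at roughly 99% of the limit.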

Our entire database, dumped to LDIF, is only 1MB, and we write to it maybe once or twice a week, so the 1GB cache seems rather wasteful.  From reading http://www.openldap.org/faq/data/cache/1072.html I understood that the cache should be in memory, not on disk.  Yet I see the 1GB file on disk while memory usage for slapd has not actually increased.  Does it preallocate the entire cache on disk and then flush from memory to disk only when the "set_lg_bsize" value has been reached?

Moreover, after removing "checkpoint" from slapd.conf and using the above DB_CONFIG, I now see several more 10MB log.* files:

-rw------- 1 ldap ldap  10M Aug 30 11:45 log.0000000455
-rw------- 1 ldap ldap  10M Aug 30 11:45 log.0000000456
-rw------- 1 ldap ldap  10M Aug 30 11:45 log.0000000457
-rw------- 1 ldap ldap  10M Aug 30 11:45 log.0000000458
-rw------- 1 ldap ldap  24K Aug 30 11:46 __db.001
-rw------- 1 ldap ldap  32K Aug 30 11:55 __db.006
-rw------- 1 ldap ldap 3.0M Aug 30 11:55 __db.004
-rw------- 1 ldap ldap 1.0G Aug 30 11:55 __db.003
-rw------- 1 ldap ldap 744K Aug 30 11:56 __db.005
-rw------- 1 ldap ldap  35M Aug 30 11:56 __db.002

Why do so many transaction log files exist if my entire database happens to be only 1MB?





Thanks,

Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Physiology and Biophysics
Weill Cornell Medicine
E: doug@med.cornell.edu
O: 212-746-6305
F: 212-746-8690

On Wed, Aug 30, 2017 at 10:11 AM, Douglas Duckworth <dod2014@med.cornell.edu> wrote:
Once DB_CONFIG is in place, I should stop slapd and then run this to reinitialize the database with the new runtime configuration:

sudo -u ldap /usr/bin/db_recover -h /var/lib/ldap/domain -v


On Wed, Aug 30, 2017 at 10:06 AM, Douglas Duckworth <dod2014@med.cornell.edu> wrote:
This seems to help

user@ldap[~]$ sudo -u ldap /usr/bin/db_stat -h /var/lib/ldap/domain -c
566     Last allocated locker ID
0x7fffffff      Current maximum unused locker ID
9       Number of lock modes
1000    Maximum number of locks possible
1000    Maximum number of lockers possible
1000    Maximum number of lock objects possible
80      Number of lock object partitions
16      Number of current locks
986     Maximum number of locks at any one time
14      Maximum number of locks in any one bucket
303     Maximum number of locks stolen by for an empty partition
18      Maximum number of locks stolen for any one partition
90      Number of current lockers
130     Maximum number of lockers at any one time
16      Number of current lock objects
519     Maximum number of lock objects at any one time
8       Maximum number of lock objects in any one bucket
0       Maximum number of objects stolen by for an empty partition
0       Maximum number of objects stolen for any one partition
348M    Total number of locks requested (348174715)
348M    Total number of locks released (348174394)
0       Total number of locks upgraded
112     Total number of locks downgraded
10622   Lock requests not available due to conflicts, for which we waited            <------ sounds bad
0       Lock requests not available due to conflicts, for which we did not wait
2       Number of deadlocks
0       Lock timeout value
0       Number of locks that have timed out
0       Transaction timeout value
0       Number of transactions that have timed out
744KB   The size of the lock region
221341  The number of partition locks that required waiting (0%)
5041    The maximum number of times any partition lock was waited for (0%)
1       The number of object queue operations that required waiting (0%)
40577   The number of locker allocations that required waiting (0%)
0       The number of region locks that required waiting (0%)
8       Maximum hash bucket length

Only four clients are currently using this cluster so perhaps I should actually use DB_CONFIG before putting it into production.



On Tue, Aug 29, 2017 at 2:13 PM, Douglas Duckworth <dod2014@med.cornell.edu> wrote:
Adding

# checkpointing - added 8/29/2017
checkpoint 128 10

to slapd.conf and then running

sudo db_archive -d -h /var/lib/ldap/domain

removed the old log files.  /var is now using under 1GB.

Thanks Howard!

Our LDAP server contains about 4000 entries.  At what point would adding DB_CONFIG be needed for performance reasons?  How would I even ascertain that there are performance issues?



On Mon, Aug 28, 2017 at 10:19 AM, Douglas Duckworth <dod2014@med.cornell.edu> wrote:
Thanks for the reply, Howard, and for pointing me in the right direction.  From what I have read there are two options.

1) Copy /usr/share/openldap-servers/DB_CONFIG.example to /var/lib/ldap/domain then rebuild the database.
2) Enable checkpointing in slapd.conf

Does enabling checkpointing in slapd.conf require rebuilding the database, or can I simply restart slapd?  We are not using online configuration.

Best
Doug




On Fri, Aug 25, 2017 at 8:55 AM, Howard Chu <hyc@symas.com> wrote:
Douglas Duckworth wrote:
> Hi
>
> I am running an openldap-servers-2.4.40-16.el6.x86_64 cluster on CentOS 6.9.  My
> /var/lib/ldap directory contains many 10MB log files, and the /var partition is
> rather small.
>
> I've read they can be removed either by running "sudo db_archive -d -h
> /var/lib/ldap/domain" or by defining "DB_LOG_AUTOREMOVE" within the file
> "DB_CONFIG."  That file does not presently exist whereas the db_archive
> command does not actually remove any of the log files.

If the db_archive command doesn't remove anything, that means it thinks all of
the log files are still in active use.

Read the docs more carefully.
http://docs.oracle.com/cd/E17076_05/html/programmer_reference/transapp_logfile.html

>
> Can I remove the old log files manually using rm?

Not if the above is true, you will corrupt the logs and the DB will fail to
open on a subsequent restart.

> If not should I create
> /var/lib/ldap/DB_CONFIG then restart slapd to make this removal automatic?

> Do you have any idea why db_archive does not work or produce any helpful error
> to stdout?

There's no error message because there's no error, everything is working as
designed.

You need to do periodic checkpoints to allow log files to be closed, and then
db_archive will be able to remove some of them.
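
The order of operations described here (checkpoint first, then archive) could be scripted roughly as follows. This is only a dry-run sketch that prints the commands rather than running them. It assumes the Berkeley DB db_checkpoint utility is installed alongside db_archive and that the commands run as the ldap user; log_cleanup_commands is a hypothetical helper name.

```python
# Sketch: build the commands for "checkpoint, then remove unneeded logs".
# db_checkpoint -1 forces a single checkpoint and exits;
# db_archive -d removes log files that are no longer needed.
# Assumption: both utilities are on PATH and the DB files belong to `user`.
import shlex

def log_cleanup_commands(db_home, user="ldap"):
    """Return the argv lists to run, in order, for one cleanup pass."""
    prefix = ["sudo", "-u", user]
    return [
        prefix + ["db_checkpoint", "-1", "-h", db_home],
        prefix + ["db_archive", "-d", "-h", db_home],
    ]

commands = log_cleanup_commands("/var/lib/ldap/domain")
for argv in commands:
    # Dry run: print each command instead of executing it.
    print(shlex.join(argv))
```

If slapd is already checkpointing on its own via the "checkpoint" directive in slapd.conf, the first command is probably redundant and only the db_archive step would be needed.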

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/