I recently installed some updates and configuration changes on one of my LDAP slaves. Replication broke mysteriously after that. I turned on full debugging on slapd and just saw this:
-----------
Jul 11 12:42:51 pip slapd[30723]: =>do_syncrepl rid 001
Jul 11 12:42:51 pip slapd[30723]: daemon: epoll: listen=8 active_threads=0 tvp=zero
Jul 11 12:42:51 pip slapd[30723]: =>do_syncrep2 rid 001
Jul 11 12:42:51 pip slapd[30723]: do_syncrep2: rid 001 LDAP_RES_SEARCH_RESULT
Jul 11 12:42:51 pip slapd[30723]: connection_get(12)
Jul 11 12:42:51 pip slapd[30723]: connection_get(12): got connid=0
Jul 11 12:42:51 pip slapd[30723]: daemon: removing 12
Jul 11 12:42:51 pip slapd[30723]: daemon: activity on 1 descriptor
Jul 11 12:42:51 pip slapd[30723]: daemon: activity on:
Jul 11 12:42:51 pip slapd[30723]:
Jul 11 12:42:51 pip slapd[30723]: daemon: epoll: listen=7 active_threads=0 tvp=zero
Jul 11 12:42:51 pip slapd[30723]: daemon: epoll: listen=8 active_threads=0 tvp=zero
Jul 11 12:42:51 pip slapd[30723]: daemon: activity on 1 descriptor
Jul 11 12:42:51 pip slapd[30723]: do_syncrepl: rid 001 retrying (9 retries left)
-----------
The replica would continue to connect over and over again to the master, and the logs just kept saying "retrying".
Finally, I ended up having to disable TLS on the replica and temporarily allow plaintext authentication on the master so that I could capture the replication traffic unencrypted.
On reviewing the packet capture, it was immediately obvious that the search was failing with a protocol error because derefAliases was set to always. A quick Google search indicated that other people have had a similar problem, generally because they changed the global LDAP configuration file.
Indeed, I had switched to NFS home directories with the automounter, and the LDAP integration for my deployment required the automount client to dereference aliases, so I had set "DEREF always" in /etc/openldap/ldap.conf, a setting that slapd also inherits.
It would be useful if replication failure provided better error messages; something in the logs indicating that a protocol error had occurred because of an invalid dereferencing setting would have saved me a lot of time. Also, if alias dereferencing is not valid for a syncrepl query, shouldn't the server simply override that setting from the global configuration and do the right thing?
In any case, I find myself stuck: the automounter requires alias dereferencing in order to work, while slapd requires alias dereferencing to be disabled.
There appear to be three ways to define configuration: the global configuration file, a configuration file in the home directory, or an environment variable.
The global configuration file will not work, as I require a different option setting for each of the two processes. The home directory configuration file will not work, as both automount and slapd look in ~root. While I could probably kludge an init script to pass an environment variable to one or the other process, the init script framework does not allow for that, and I would prefer something that fits within the intended operating system configuration.
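For reference, the environment variable kludge I have in mind would look roughly like the sketch below. It relies on the ldap.conf(5) convention that an option can be overridden by an environment variable named "LDAP" plus the option name, so DEREF becomes LDAPDEREF. The function name with_ldap_deref, the slapd path, and the "-u ldap" argument are placeholders of my own, and I have not confirmed that slapd's syncrepl client honours the override.

```shell
# Run one command with its own DEREF setting, leaving ldap.conf and the
# calling shell's environment untouched.
# with_ldap_deref is a hypothetical helper name, not part of OpenLDAP.
with_ldap_deref() {
    value=$1    # one of: never, searching, finding, always
    shift
    # env(1) sets LDAPDEREF only for the command it launches.
    env LDAPDEREF="$value" "$@"
}

# Intended use (path and arguments are assumptions about my install):
#   with_ldap_deref never /usr/sbin/slapd -u ldap
```

Only the wrapped process sees the override; automount would keep reading "DEREF always" from the global file.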
Any suggestions on how to best have a different LDAP configuration for two processes both running as root?
Would there be any value in modifying slapd to ignore the alias dereferencing setting for the purposes of syncrepl? Or in enhancing the slapd.conf file to allow setting general LDAP configuration options? Or perhaps in adding a command-line option that specifies either an alternate configuration file or individual LDAP configuration options?
Thanks much for any suggestions or assistance...