SunOS ldapslave01.unix 5.10 Generic_141445-09 i86pc i386 i86pc db-4.8.30.NC.tar.gz openldap-2.4.23.tgz
Hello list,
We have had a rare bug hit us now and then, that appears to be quite persistent. I know it isn't fun to read through someone else's setup and troubles, but I am hoping to at least get some hints at what to look at next.
Let us imagine our setup as:
uid=user1,o=a.gmail.com,ou=mail,dc=company,dc=com uid=user2,o=a.gmail.com,ou=mail,dc=company,dc=com uid=user1,o=b.gmail.com,ou=mail,dc=company,dc=com uid=user1,o=c.gmail.com,ou=mail,dc=company,dc=com
(Not literally gmail.com, replacing customer domains manually, but same "style" of service).
ldapmaster -> slave1 -> localhost slapd (pop/imap) -> slave2 -> localhost slapd (postfix) -> slave3 -> localhost slapd (ftp) -> slave4 -> localhost slapd (smtp/forward)
All using syncrepl. One master, only ever updated by the provisioning engine. 4 immediate slaves (full ldap tree), then about 150 work servers, running localhost slapd with only applicable tree. Ie, only "ou=mail" for the mail-related servers.
What happens now and then, is that:
o=b.gmail.com,ou=mail,dc=company,dc=com
gets deleted. Gone.
But you can access record "uid=user1,o=b.gmail.com,ou=mail,dc=company,dc=com" just fine. But not "o=b.gmail.com,ou=mail,dc=company,dc=com". How is that possible? Mismatching indexes?
Amusingly, POP and IMAP work fine since they authenticate against the final user record "uid=user1,o=b.gmail.com,ou=mail,dc=company,dc=com". but Postfix will fail, as it verifies that the domain exists "o=b.gmail.com,ou=mail,dc=company,dc=com". syncrepl for accounts under tree "o=b.gmail.com,ou=mail,dc=company,dc=com" appear to stop until the record is fixed.
There is an easy fix, if we modify the record "o=b.gmail.com,ou=mail,dc=company,dc=com" on the master, it gets pushed out.
It is not always the same record, but it does seem to be a class of certain records. And the missing records seem to happen most on certain servers. 80% of the time, on the vmx servers. It far worse when they disappear on the slaves, since many localhost slapds will then not get synced.
Now, this happens more frequently when we do deletions. Usually around the first of the month, since that is when we delete customers who left. Alas, we moved the entire LDAP DNS tree to its own LDAP master, so have been deleting this tree for 20 days, and will keep deleting for 12 more days. So, at the moment, we have this error once, or twice, a day.
At the moment, I do not really know what goes wrong. I am hoping that perhaps I can get some good ideas on where to look, and what to look for.
This morning, we lost "i.gmail.com" on vmx02. If I grep for it in the logs, we only find:
Apr 27 09:17:51 vmx02.unix slapd[13990]: [ID 338579 local4.debug] nonpresent_cal lback: rid=275 nonpresent UUID 82d9570a-cdd6-102f-9364-65cc77dfbe36, dn o=i.gmail.com,ou=mail,dc=company,dc=com
Apr 27 09:18:11 vmx02.unix slapd[13990]: [ID 905397 local4.debug] syncrepl_del_nonpresent: rid=275 be_delete o=i.gmail.com,ou=mail,dc=company,dc=com (66)
# grep nonpresent_callback /var/log/slaplog| wc -l 517707
# grep syncrepl_del_nonpresent /var/log/slaplog|wc -l 9
What is also curious, is the UUID, for example:
slaplog.20110424.gz:Apr 24 15:31:26 vmx02.unix slapd[13990]: [ID 338579 local4.debug] nonpresent_callback: rid=275 nonpresent UUID 82d9570a-cdd6-102f-9364-65cc77dfbe36, dn o=i.gmail.com,ou=mail,dc=company,dc=com
slaplog.20110424.gz:Apr 24 23:41:00 vmx02.unix slapd[13990]: [ID 338579 local4.debug] nonpresent_callback: rid=275 nonpresent UUID 82d9570a-cdd6-102f-9364-65cc77dfbe36, dn o=i.gmail.com,ou=mail,dc=company,dc=com
Persistent uuid? Is it ok that UUID is repeated? Are unique over time, or sometimes recycled?
If I move from vmx02, to its LDAP provider, in this case, ldapslave03 (rid = 329)
Apr 27 09:12:27 ldapslave03.unix slapd[20759]: [ID 338579 local4.debug] nonprese nt_callback: rid=329 present UUID d6215e0e-1c55-102f-8da7-0b6e3175d2bb, dn o=i.gmail.com,ou=mail,dc=company,dc=com
/export/backup/ldapslave03.unix# gzgrep d6215e0e-1c55-102f-8da7-0b6e35d2bb slaplog.20110* Jan 6 07:15:39 Feb 8 18:47:48 Apr 24 15:22:21 Apr 24 23:36:22
Is it worth pursuing the repeated UUID? Is that some stuck record that keeps retrying over and over? Is it that is constantly trying to delete these records, and correctly failing, except now and then, it succeeds like that of vmx02 this morning?
Then there is the slightly different message, with error "66".
Apr 27 09:20:06 ldapslave03.unix slapd[20759]: [ID 976560 local4.debug] syncprov_sendresp: cookie=
Apr 27 09:20:06 ldapslave03.unix last message repeated 1 time
Apr 27 09:20:06 ldapslave03.unix slapd[20759]: [ID 905397 local4.debug] syncrepl _del_nonpresent: rid=329 be_delete o=i.gmail.com,ou=ftp,dc=company,dc=com (66)
Apr 27 09:20:06 ldapslave03.unix slapd[20759]: [ID 976560 local4.debug] syncprov_sendresp: cookie=
Apr 27 09:20:06 ldapslave03.unix last message repeated 4 times
I believe 66 is LDAP_NOT_ALLOWED_ON_NONLEAF, which is correct. It has thousands of accounts in there, so it definitely should not delete it. But why does it try to delete it in the first place? It is not us trying to delete half the tree (well, you know, pretty sure...ish.. it isn't us) so where is it coming from?
Or should I look at that empty "cookie="? Why that empty, it starts back at:
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 859137 local4.debug] slap_queue_csn: queing 21798d80 20110427000316.304365Z#000000#000#000000
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 976560 local4.debug] syncprov_sendresp: cookie=rid=268,csn=20110427000316.304365Z#000000#000#000000
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 976560 local4.debug] syncprov_sendresp: cookie=rid=278,csn=20110427000316.304365Z#000000#000#000000
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 775493 local4.debug] slap_graduate_commit_csn: removing a2bee60 0110427000316.304365Z#000000#000#000000
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 905397 local4.debug] syncrepl_del_nonpresent: rid=329 be_delete uid=$account,ou=users,ou=radius,dc=company,dc=com (0)
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 502419 local4.debug] Entry uid=userX,o=$customerdomain.com,ou=mail,dc=company,dc=com CSN 20110426223343.974235Z#000000#000#000000 older or equal to ctx 20110426223343.974235Z#000000 #000#000000
Apr 27 09:17:29 ldapslave03.unix last message repeated 1 time
Apr 27 09:17:29 ldapslave03.unix slapd[20759]: [ID 976560 local4.debug] syncprov_sendresp: cookie=
Apr 27 09:17:29 ldapslave03.unix last message repeated 1 time
then a few hundred lines of empty "cookie=" lines.
It is also worth to note that very close to the time that it lost the record today, we have:
Apr 27 09:11:25 vmx02.unix slapd[13990]: [ID 763815 local4.debug] connection_input: conn=6946 deferring operation: too many executing
and it is almost as if, it kills the replication connection randomly when it gets busy. Whatever record it was working on at the time is half processed?
Alas, there are no entries at all for any of these on the master. Possibly because master's loglevel is "sync", which don't seem to do much when it is the provider. Recommended log level would be?
Now, there is version openldap-2.4.24 out, with a very large list of fixes, but I did not see anything immediately obvious. If there is a fix that "looks good" to fix this, I can go through the upgrade. But it is a 3 week process to make it happen on all servers.
Worse is, I have tried to replicate the trouble in a test network, running the exact same data, and binaries, with a master, slave and localhost slapd all on one machine. But I can not make the trouble happen in a simulated environment. Possibly as I am not generating the same load. Quite annoying.
slapd.conf is pushed out with git, and is the same everywhere, except that slaves have the syncrepl section, with separate rids, and memory cache size are set differently.
include /usr/local/etc/openldap/schema/core.schema include /usr/local/etc/openldap/schema/cosine.schema include /usr/local/etc/openldap/schema/nis.schema include /usr/local/etc/openldap/schema/qmail.schema include /usr/local/etc/openldap/schema/local.schema include /usr/local/etc/openldap/schema/RADIUS-LDAPv3.schema include /usr/local/etc/openldap/schema/AmavisdLDAP.schema include /usr/local/etc/openldap/schema/analyse.schema include /usr/local/etc/openldap/schema/formail.schema include /usr/local/etc/openldap/schema/dlz.schema include /usr/local/etc/openldap/schema/filter.schema pidfile /usr/local/var/run/slapd.pid argsfile /usr/local/var/run/slapd.args loglevel sync
database monitor access to dn.subtree="cn=Monitor" by dn="cn=admin,dc=company,dc=com" read by * none
access to attrs=userPassword by self write by anonymous auth by * none
access to * by self write by dn="cn=admin,dc=company,dc=com" write by * read
password-hash {CRYPT}
database hdb suffix "dc=company,dc=com" rootdn "cn=admin,dc=company,dc=com" directory /usr/local/var/openldap-data rootpw $secret
index objectClass eq index uid eq index uidNumber eq index mail eq index mailAlternateAddress pres,eq index deliveryMode eq index accountStatus eq index gecos eq index radiusGroupName eq index o pres,eq index entryCSN,entryUUID eq index gidNumber eq index DNSType eq index DNSIPAddr eq index DNSData eq index DNSHostName eq
checkpoint 128 15 cachesize 6000 idlcachesize 18000
overlay syncprov syncprov-checkpoint 100 10 syncprov-sessionlog 1000
dbconfig set_lk_detect DB_LOCK_DEFAULT dbconfig set_lg_max 52428800 dbconfig set_cachesize 4 0 1 dbconfig set_flags db_log_autoremove dbconfig set_lk_max_objects 1500 dbconfig set_lk_max_locks 1500 dbconfig set_lk_max_lockers 1500
syncrepl rid=( $IP-4TH-OCTET + 256 ) provider=ldap://172.20.12.163 type=refreshAndPersist interval=00:00:00:30 searchbase="dc=company,dc=com" filter="(objectClass=*)" attrs="*,+" scope=sub schemachecking=off bindmethod=simple binddn="cn=admin,dc=company,dc=com" credentials="$secret" retry="60 10 300 +"
updateref ldap://172.20.12.113
(Actually, 172.20.12.163 and 172.20.12.113 are the same server, just aliases on same nic. But updateref is never actually used, we only one writer that talks directly to master).
--On Wednesday, April 27, 2011 11:38 AM +0900 Jorgen Lundman lundman@lundman.net wrote:
SunOS ldapslave01.unix 5.10 Generic_141445-09 i86pc i386 i86pc db-4.8.30.NC.tar.gz openldap-2.4.23.tgz
Hello list,
We have had a rare bug hit us now and then, that appears to be quite persistent. I know it isn't fun to read through someone else's setup and troubles, but I am hoping to at least get some hints at what to look at next.
My only comment would be, if you want reliable replication, use delta-syncrepl.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
My only comment would be, if you want reliable replication, use delta-syncrepl.
Are you saying syncrepl is known to be broken? Or shoddy in general? I find the whole issue that a "top level" of a tree can disappear, but the tree itself still works to be troubling. Perhaps BerkeleyDB broken?
I did look at delta-syncrepl, but it never ran very well, and seemed quite over complicated for marginal gains. As I understood it, syncrepl always sends entire records, whereas delta-syncrepl can update just the attribute of a record. This did not sound like it would make any difference. Is my understanding incorrect?
This is what I added to the slapd.conf on master when testing it. Currently, not on the production servers.
*** In the middle of slapd.conf
# delta-syncrepl needs full access access to * by dn.base="cn=replicator,dc=company,dc=com" read by * break
*** At the end of slapd.conf
overlay accesslog logdb "cn=deltalog" logops writes logsuccess TRUE logpurge 2+00:00 1+00:00
# define the replica provider for this database # (last directives in database section) overlay syncprov # contextCSN saved to database every 100 updates or ten minutes syncprov-checkpoint 100 10
# Let the replica DN have limitless searches limits dn.exact="cn=replicator,dc=company,dc=com" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
# Accesslog database definitions database hdb suffix cn=deltalog directory /usr/local/var/openldap-data/deltalog rootdn cn=deltalog index default eq index entryCSN,objectClass,reqEnd,reqResult,reqStart # dbconfig set_lk_detect DB_LOCK_DEFAULT dbconfig set_lg_max 52428800 dbconfig set_cachesize 0 536870912 0 dbconfig set_flags db_log_autoremove dbconfig set_lk_max_objects 200 dbconfig set_lk_max_locks 200 dbconfig set_lk_max_lockers 200
# the access log is also a provider # syncprov-nopresent inhibits the present phase of synchronization # syncprov-reloadhint TRUE mandatory for delta sync overlay syncprov syncprov-nopresent TRUE syncprov-reloadhint TRUE
--On Wednesday, April 27, 2011 12:04 PM +0900 Jorgen Lundman lundman@lundman.net wrote:
Quanah Gibson-Mount wrote:
My only comment would be, if you want reliable replication, use delta-syncrepl.
Are you saying syncrepl is known to be broken? Or shoddy in general? I find the whole issue that a "top level" of a tree can disappear, but the tree itself still works to be troubling. Perhaps BerkeleyDB broken?
I'm saying that with delta-syncrepl, my replication experience has been reliable. From what I see on the list, people who use normal syncrepl do not have reliable replication.
I did look at delta-syncrepl, but it never ran very well, and seemed quite over complicated for marginal gains. As I understood it, syncrepl always sends entire records, whereas delta-syncrepl can update just the attribute of a record. This did not sound like it would make any difference. Is my understanding incorrect?
With delta-syncrepl, the changes are read directly out of a changes database, in order. The gains are significant.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
From what I see on the list, people who use normal syncrepl do not have reliable replication.
Frankly this sounds scary! Especially since you have more knowledge of the internals here... :-/
With delta-syncrepl, the changes are read directly out of a changes database, in order. The gains are significant.
IIRC with delta-syncrepl all write operations are serialized. Yes?
Ciao, Michael.
--On Wednesday, April 27, 2011 7:41 PM +0200 Michael Ströder michael@stroeder.com wrote:
Quanah Gibson-Mount wrote:
From what I see on the list, people who use normal syncrepl do not have reliable replication.
Frankly this sounds scary! Especially since you have more knowledge of the internals here... :-/
With delta-syncrepl, the changes are read directly out of a changes database, in order. The gains are significant.
IIRC with delta-syncrepl all write operations are serialized. Yes?
Correct. This improves performance significantly over normal syncrepl. It may seem counter intuitive, and yet it reduces the contention in the underlying database. I tested this heavily in the past. ;)
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
--On Wednesday, April 27, 2011 7:41 PM +0200 Michael Ströder michael@stroeder.com wrote:
Quanah Gibson-Mount wrote:
From what I see on the list, people who use normal syncrepl do not have reliable replication.
Frankly this sounds scary! Especially since you have more knowledge of the internals here... :-/
With delta-syncrepl, the changes are read directly out of a changes database, in order. The gains are significant.
IIRC with delta-syncrepl all write operations are serialized. Yes?
Correct.
Couldn't this be the cause for delta-syncrepl to seem more reliable than normal syncrepl (without slapo-accesslog)?
This improves performance significantly over normal syncrepl. It may seem counter intuitive, and yet it reduces the contention in the underlying database. I tested this heavily in the past. ;)
I've read the results of your tests before. Many thanks for that.
Ciao, Michael.
--On Wednesday, April 27, 2011 8:01 PM +0200 Michael Ströder michael@stroeder.com wrote:
Correct.
Couldn't this be the cause for delta-syncrepl to seem more reliable than normal syncrepl (without slapo-accesslog)?
I don't think so, because that is on the master, not on the replicas. Replicas still get changes in the order they were written to the master (supposedly anyhow. ;) ).
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
I'm saying that with delta-syncrepl, my replication experience has been reliable. From what I see on the list, people who use normal syncrepl do not have reliable replication.
Are the slapd.conf lines I pasted for delta-syncrepl correct? If it is just a matter of adding those lines, and creating the directory, I can do that and see if it improves.
--On Thursday, April 28, 2011 7:59 AM +0900 Jorgen Lundman lundman@lundman.net wrote:
Are the slapd.conf lines I pasted for delta-syncrepl correct? If it is just a matter of adding those lines, and creating the directory, I can do that and see if it improves.
I would suggest reading the documentation on configuring delta-syncrepl as well as looking at the configurations the test suite creates. There is configuration that is necessary on both the replica and provider that has some differences to regular syncrepl.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
My only comment would be, if you want reliable replication, use delta-syncrepl.
This really strikes me.
1. CHANGES contains a lot of references to fixes made to syncrepl and delta-syncrepl. And it seems not every issue is resolved yet.
2. delta-syncrepl cannot be used with multi-master replication.
So I wonder what are the really supported replication setups?
Ciao, Michael.
--On Wednesday, April 27, 2011 9:15 AM +0200 Michael Ströder michael@stroeder.com wrote:
Quanah Gibson-Mount wrote:
My only comment would be, if you want reliable replication, use delta-syncrepl.
This really strikes me.
- CHANGES contains a lot of references to fixes made to syncrepl and
delta-syncrepl. And it seems not every issue is resolved yet.
delta-syncrepl has been very solid for me, with the exception of the occasional edge case. The last time there was an edge case encountered was with 2.4.22 (4/24/2010).
- delta-syncrepl cannot be used with multi-master replication.
This will no be the case in the near future.
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
--On Wednesday, April 27, 2011 9:15 AM +0200 Michael Ströder michael@stroeder.com wrote:
Quanah Gibson-Mount wrote:
My only comment would be, if you want reliable replication, use delta-syncrepl.
This really strikes me.
- CHANGES contains a lot of references to fixes made to syncrepl and
delta-syncrepl. And it seems not every issue is resolved yet.
delta-syncrepl has been very solid for me, with the exception of the occasional edge case. The last time there was an edge case encountered was with 2.4.22 (4/24/2010).
Are you running with slapo-memberof? (see also ITS#6915 since delta-syncrepl needs slapo-accesslog)
Ciao, Michael.
--On Wednesday, April 27, 2011 7:37 PM +0200 Michael Ströder michael@stroeder.com wrote:
Quanah Gibson-Mount wrote:
--On Wednesday, April 27, 2011 9:15 AM +0200 Michael Ströder michael@stroeder.com wrote:
Quanah Gibson-Mount wrote:
My only comment would be, if you want reliable replication, use delta-syncrepl.
This really strikes me.
- CHANGES contains a lot of references to fixes made to syncrepl and
delta-syncrepl. And it seems not every issue is resolved yet.
delta-syncrepl has been very solid for me, with the exception of the occasional edge case. The last time there was an edge case encountered was with 2.4.22 (4/24/2010).
Are you running with slapo-memberof? (see also ITS#6915 since delta-syncrepl needs slapo-accesslog)
No, I'm not. But as noted in the ITS, that's a long standing bug in slapo-memberof, and not in delta-syncrepl. It has in fact existed since slapo-memberof was first written. ;)
--Quanah
--
Quanah Gibson-Mount Sr. Member of Technical Staff Zimbra, Inc A Division of VMware, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
delta-syncrepl has been very solid for me, with the exception of the occasional edge case. The last time there was an edge case encountered was with 2.4.22 (4/24/2010).
Are you running with slapo-memberof? (see also ITS#6915 since delta-syncrepl needs slapo-accesslog)
No, I'm not. But as noted in the ITS, that's a long standing bug in slapo-memberof, and not in delta-syncrepl. It has in fact existed since slapo-memberof was first written. ;)
I've also read the ITS carefully. And I've observed my logdb!
The point is that the operations by slapo-memberof reach the logdb but the write operation to the group entry is not in the logdb! So from my understanding delta-syncrepl won't work correctly if slapo-memberof is used on the provider no matter in which part of the code the bug was.
So the general recommendation for delta-syncrepl has some caveats...
Ciao, Michael.
openldap-technical@openldap.org