--On Wednesday, February 15, 2017 6:36 PM -0800 "Paul B. Henson" henson@acm.org wrote:
On Wed, Feb 15, 2017 at 12:22:29PM -0800, Quanah Gibson-Mount wrote:
I would suggest filing an ITS with the full backtrace info, so I can track it.
Ok, will do.
It could be useful to have the entry data from the accesslog as well for the failed replication op, as we can see the failed entry DN in the output of your backtrace.
That would be in the accesslog on the server that crashed? Hmm, the server that crashed is the master, and all updates were going to it. Am I confused, or did the update that caused the crash come in via syncrepl though, and hence originate from a different server? So the accesslog entry you want would be from that server, not the server that crashed? But given no other servers should have been receiving updates, how would an update have been received via replication? Or is this another issue like the memberOf problem where updates are being improperly replicated?
It appears to be crashing while writing the change to the accesslog database. It's odd that the value for the attribute is NULL. Do we know for sure what the client doing the write op to the server is sending?
Hmm, looking at the logs that correspond with one of the crashes:
This operation appears to succeed? Then there's this:
Feb 14 04:00:13 fosse slapd[12524]: conn=37859 op=806 MOD dn="uid=vntruong,ou=user,dc=csupomona,dc=edu"
Feb 14 04:00:13 fosse slapd[12524]: conn=37859 op=806 MOD attr=csupomonaEduPersonExpiration
Yeah, so this is the operation that actually failed... It'd be interesting to know if it succeeded in the primary DB but failed when writing to the accesslog DB (i.e., the master and its consumers are now out of sync for that entry), or if the entire write op failed (master and consumers are in sync for the entry).
when I restarted the server. I guess I am confused; the entryCSN has serverID 0, the ID of this server, so this isn't a replicated op, it's an op from this server. So why does the backtrace show the change coming in via syncrepl? It seems like it's getting applied twice. The change is deleting the attribute, so the second time it's getting applied you would get a no such attribute error...
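The serverID check described above can be sketched in a few lines of Python. This is only an illustration, assuming the usual four-field OpenLDAP CSN layout (timestamp, change count, serverID in hex, modification count, separated by `#`); the helper name `csn_server_id` and the sample CSN value are made up for the example and are not taken from the logs.

```python
def csn_server_id(csn: str) -> int:
    """Return the serverID (SID) field of an entryCSN as an integer."""
    # Assumed layout: timestamp # change count # serverID (hex) # mod count
    fields = csn.split("#")
    if len(fields) != 4:
        raise ValueError(f"not a CSN: {csn!r}")
    return int(fields[2], 16)


# Hypothetical CSN value, invented for illustration only.
sample = "20170214120013.000000Z#000000#000#000000"
local_sid = 0  # serverID of the server that crashed, per the message above

# If the SID in the entryCSN matches the local serverID, the change
# originated on this server rather than arriving from a peer.
print(csn_server_id(sample) == local_sid)
```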
Hm, so I guess my question would be is it doing the op like this:
dn: ...
changetype: modify
replace: csupomonaEduPersonExpiration
csupomonaEduPersonExpiration:
Or is it doing it like this:
dn: ...
changetype: modify
delete: csupomonaEduPersonExpiration
Because the NULL value seems to imply the former.
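To make the difference between the two forms concrete, here is a rough Python sketch that pulls the change operations out of an LDIF modify record and shows which values, if any, accompany each one. The `modify_ops` helper is hypothetical and deliberately simplistic (no line folding, no base64 handling); it is not how slapd parses anything, just a way to see that the first form carries one empty value while the second carries none.

```python
def modify_ops(ldif: str):
    """Return [(op, attr, values)] for each change in one LDIF modify record."""
    ops = []
    current = None
    for raw in ldif.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "-":
            # "-" ends the current change within the record.
            current = None
            continue
        key, _, val = line.partition(":")
        key, val = key.strip(), val.strip()
        if key.lower() in ("add", "delete", "replace"):
            # Start of a change: remember the op and the attribute it targets.
            current = (key.lower(), val, [])
            ops.append(current)
        elif current is not None and key.lower() == current[1].lower():
            # A value line for the attribute being changed (may be empty).
            current[2].append(val)
    return ops


replace_form = """dn: ...
changetype: modify
replace: csupomonaEduPersonExpiration
csupomonaEduPersonExpiration:
"""

delete_form = """dn: ...
changetype: modify
delete: csupomonaEduPersonExpiration
"""

print(modify_ops(replace_form))
print(modify_ops(delete_form))
```

The replace form yields a single empty-string value for the attribute, while the delete form yields an empty value list, which is the distinction that matters when trying to explain a NULL value in the backtrace.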
--Quanah
--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com