--On Wednesday, February 15, 2017 6:36 PM -0800 "Paul B. Henson"
On Wed, Feb 15, 2017 at 12:22:29PM -0800, Quanah Gibson-Mount wrote:
> I would suggest filing an ITS with the full backtrace info, so I can
> track it.
Ok, will do.
> It could be useful to have the entry data from the accesslog as
> well for the failed replication op, as we can see the failed entry DN in
> the output of your backtrace.
That would be in the accesslog on the server that crashed? Hmm, the
server that crashed is the master, and all updates were going to it. Am
I confused, or did the update that caused the crash come in via syncrepl
though, and hence originate from a different server? So the accesslog
entry you want would be from that server, not the server that crashed?
But given no other servers should have been receiving updates, how would
an update have been received via replication? Or is this another issue
like the memberOf problem where updates are being improperly replicated?
It appears to be crashing while writing the change to the accesslog
database. It's odd that the value for the attribute is NULL. Do we know
for sure what the client doing the write op to the server is sending?
Hmm, looking at the logs that correspond with one of the crashes:
This operation appears to succeed? Then there's this:
Feb 14 04:00:13 fosse slapd: conn=37859 op=806 MOD dn="uid=vntruong,ou=user,dc=csupomona,dc=edu"
Feb 14 04:00:13 fosse slapd: conn=37859 op=806 MOD attr=csupomonaEduPersonExpiration
Yeah, so this is the operation that actually failed... It'd be interesting
to know if it succeeded in the primary DB but failed when writing to the
accesslog DB (i.e., the master and its consumers are now out of sync for
that entry), or if the entire write op failed (master and consumers are in
sync for the entry).
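One way to tell those two cases apart is to read the same entry from the master and from a consumer and compare them. The sketch below is only an illustration in Python: it assumes each entry has already been fetched (for example by hand from ldapsearch output) into a plain dict of attribute name to list of values. The function name `compare_entry`, the input shape, and the sample entryCSN strings are all hypothetical placeholders, not data from this incident.

```python
def compare_entry(master, consumer, attr):
    """Compare one entry as read from two servers.

    master / consumer: dicts mapping attribute name -> list of values
    (hypothetical shape; build them however you read the entries).
    attr: the attribute the failed MOD was touching.
    """
    return {
        # Same entryCSN on both sides suggests both applied the same last write.
        "csn_match": master.get("entryCSN") == consumer.get("entryCSN"),
        # Whether the attribute still exists on each side.
        "attr_on_master": attr in master,
        "attr_on_consumer": attr in consumer,
    }


# Placeholder data: the master applied the delete, the consumer never saw it.
master_entry = {
    "entryCSN": ["20170101000002.000000Z#000000#000#000000"],
}
consumer_entry = {
    "entryCSN": ["20170101000001.000000Z#000000#000#000000"],
    "csupomonaEduPersonExpiration": ["placeholder"],
}

result = compare_entry(master_entry, consumer_entry,
                       "csupomonaEduPersonExpiration")
```

If `csn_match` is false and the attribute is present on only one side, the write reached the primary DB on the master but was never replicated. If both sides agree, the whole op failed.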
when I restarted the server. I guess I am confused; the entryCSN has
serverID 0, the ID of this server, so this isn't a replicated op, it's
an op from this server. So why does the backtrace show the change coming
in via syncrepl? It seems like it's getting applied twice. The change is
deleting the attribute, so the second time it's getting applied you
would get a no such attribute error...
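For reference, the server ID can be read straight out of an entryCSN value. The Python sketch below assumes the OpenLDAP 2.4 CSN layout of `timestamp#count#sid#mod`, with the SID as three hex digits. The helper name `csn_server_id` and the sample CSN strings are made up for illustration.

```python
import re

# Assumed OpenLDAP 2.4 CSN layout: <timestamp>Z#<count>#<sid>#<mod>
# e.g. 20170101000000.000000Z#000000#000#000000 (placeholder value)
CSN_RE = re.compile(
    r"^(\d{14}(?:\.\d{1,6})?Z)"   # generalized time, optional microseconds
    r"#([0-9a-fA-F]{6})"          # change count
    r"#([0-9a-fA-F]{3})"          # server ID (SID), hex
    r"#([0-9a-fA-F]{6})$"         # modification number
)


def csn_server_id(csn):
    """Return the server ID encoded in a CSN string as an int."""
    match = CSN_RE.match(csn)
    if match is None:
        raise ValueError("not a recognized CSN: %r" % csn)
    return int(match.group(3), 16)


local_sid = csn_server_id("20170101000000.000000Z#000000#000#000000")
other_sid = csn_server_id("20170101000000.000000Z#000000#00a#000000")
```

A SID equal to the local server's own ID means the change was stamped locally rather than by another provider.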
Hm, so I guess my question would be is it doing the op like this:
Or is it doing it like this:
Because the NULL value seems to imply the former.
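As background, an LDAP modify can delete an attribute in two ways: with no values listed, which removes the whole attribute, or with specific values listed, which removes only those values. The Python sketch below is a toy model of that distinction and of what happens when the same delete is applied twice. It is not slapd's implementation; `apply_delete` and `LdapError` are hypothetical names, and attribute-name case folding and matching rules are ignored.

```python
NO_SUCH_ATTRIBUTE = 16  # LDAP result code noSuchAttribute (RFC 4511)


class LdapError(Exception):
    """Toy error carrying an LDAP result code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def apply_delete(entry, attr, values=None):
    """Apply a modify/delete to an entry held as a dict of lists.

    values=None models a delete with no values (drop the whole attribute);
    a list models a delete of those specific values.
    """
    if attr not in entry:
        raise LdapError(NO_SUCH_ATTRIBUTE, "no such attribute: " + attr)
    if values is None:
        del entry[attr]
        return
    missing = [v for v in values if v not in entry[attr]]
    if missing:
        raise LdapError(NO_SUCH_ATTRIBUTE, "no such value for " + attr)
    remaining = [v for v in entry[attr] if v not in values]
    if remaining:
        entry[attr] = remaining
    else:
        del entry[attr]  # an attribute cannot exist with zero values


# Placeholder entry: first delete succeeds, the repeat fails with code 16.
entry = {"csupomonaEduPersonExpiration": ["placeholder"]}
apply_delete(entry, "csupomonaEduPersonExpiration")
try:
    apply_delete(entry, "csupomonaEduPersonExpiration")
    second_code = None
except LdapError as err:
    second_code = err.code
```

In this model either form fails with noSuchAttribute when replayed against an entry that has already lost the attribute.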