On 12/12/11 02:26 PM, Howard Chu wrote:
But meanwhile... can anyone tell me if seeing errors like the following is normal when replicating cn=config?
No. Errors are by definition not normal.
That's good to establish; other projects sometimes disagree. :)
The test suite tests these types of replication setups. Does "make test" pass on your build?
With flying colours. I'm inserting Debug() statements all over the place to figure out where the "downgrade" happens, since gdb apparently affects things enough to make the issue more miss than hit. As near as I can tell, the "Operations" structure is coming out of slap_op_alloc() with op->o_hdr->oh_protocol already set to "2" by the time do_search() is called.
Can you confirm whether Operations structures are meant to be recycled?
To explain, these servers are being monitored by Nagios, which does a simple bind and search every five minutes. It *only* uses LDAPv2 (I didn't write the test, I think it came with Nagios).
I'm only going by the pointer, but it seems like the Operations structure gets recycled between these LDAPv2 connections and my LDAPv3 syncrepl query, and the protocol value is carried over. Then things explode. I've found code that initializes oh_protocol if the value isn't set, but nothing that resets it if it already holds a "valid" value.
So I'm trying to figure out whether: a) I'm getting the wrong Op structure, one belonging to a different connection; b) I'm getting a recycled Op structure that isn't cleaned up properly; or c) there's some internal memory corruption happening, possibly due to a bug in Linux or VMware.