Rein Tollevik wrote:
On Wed, 19 Mar 2008, Howard Chu wrote:
syncprov_playlog() in syncprov.c generates such invalid cookies when replaying a sessionlog where no entries has been deleted. The patch also fixes this bug.
This should never happen, since the cookie is not generated if there were no deletes recorded.
Well, it do happen anyhow. Using the 2.4.8 release this can easily be reproduced with a configuration where the syncrepl provider (with sessionlogs enabled) is on a glue backend, the consumer replicates the entire glue subtree. Assuming both servers are in sync and stopped, start the producer, modify an entry in the subordinate db and then start the consumer. The producer will send a cookie with an empty csn= value as in this sync debug output:
slapd starting slap_queue_csn: queing 0x418001a0 20080320125116.487494Z#000000#000#000000 slap_graduate_commit_csn: removing 0x8d6ae0 20080320125116.487494Z#000000#000#000000 srs csn 20080320124421.642686Z#000000#000#000000 log csn 20080320125116.487494Z#000000#000#000000 syncprov_playlog: cookie=rid=007,csn= syncprov_search_response: cookie=rid=007,csn=20080320125116.487494Z#000000#000#000000
I have also tested it using the cvs head source, and the good news is that this was accidentally fixed (at least partly) by the ITS#5434 change. I.e it was caused by the be_search() in syncprov_playlog() not being propagated to the subordinate database and thereby not finding the entry that was modified.
The bad news is that there is still a race condition that would have the same effect. If someone deletes the entry that was modified just before be_search() is done it will return the same empty cookie. I was able to reproduce this in a debugger by setting a break point before be_search(), deleting the entry with ldapdelete and in the debugger run next over the be_search() call before continuing.
So, I still think it is best to keep delcsn[0].bv_val NULL until an actual cookie has been written to it, as slap_compose_sync_cookie() will create a cookie with an empty csn= if passed a csn array where bv_len is zero but bv_val is non-NULL. Although an alternative and maybe more correct fix could be to change slap_compose_sync_cookie() instead?
No. Thanks for your explanation of what you observed. The actual bug is in the mods processing, when it decides that a modified entry should be treated as a delete instead. It increments ndel then but doesn't set delcsn in that case. That's what actually needs to be fixed.