On Wed, 30 Apr 2008, Howard Chu wrote:
rein@OpenLDAP.org wrote:
When syncrepl and syncprov are both used on a glue database, the contextCSN values received from the syncrepl producers are not passed on to the syncprov consumers when changes in subordinate databases are received. The reason is that syncrepl queues the CSNs in the glue backend, while syncprov fetches them from the backend where the changes are made. As a consequence, the consumers are passed a cookie without any CSN value.
My first attempt at fixing this was to change syncprov to fetch the queued CSN values from the glue backend where it was used. But that failed, as other modules queue the CSN values in their own backend when they change things.
What other modules? Generally there cannot be any other sources of changes.
Sorry, I should have written other configurations. The CSNs get queued in the subordinate database when syncrepl is used there, or not at all (i.e. in regular updates that come in through the frontend).
Instead I changed ctxcsn.c so that it always queues them in the glue backend where syncprov is used. But I don't feel that my understanding of this stuff is good enough to be sure that this is the optimal solution.
I definitely don't like references to the syncprov overlay appearing in main slapd code like that. We need a different solution.
That's reasonable, but the test for syncrepl is probably not needed if this solution is kept. The test was more or less a copy and paste from syncrepl, where it finds out which backend to write through. To me it makes sense to have a single queue of CSN values in a glued configuration, no matter if or where syncprov is used.
At one point in the past, I had changed syncrepl.c to queue the CSNs in both places, but that seemed rather sloppy. Still, it may work best here.
I don't like duplicating information; sooner or later one of the places tends to end up with wrong info.
Another approach could be to have syncprov look in the glue database if it fails to find any queued CSN in a subordinate db. I haven't tested it, but that should work in both configurations. It should also remove the need to always look for the glue db which my patch requires. Would that be better?
Btw, in syncprov_checkpoint() there is a similar SLAP_GLUE_SUBORDINATE test. Should that have included an overlay_is_inst() clause as well?
Perhaps. You would have to use op->o_bd->bd_self instead of op->o_bd on that call.
The current test (introduced to fix ITS#5433) causes the contextCSN to be written to the glue database when syncprov is used on a subordinate db, which appears wrong to me.
Could you elaborate on when op->o_bd->bd_self must be used instead of op->o_bd? I understand that op->o_bd may be a copy of the original structure that op->o_bd->bd_self refers to, but I'm not sure when it must be used. Btw, could op->o_bd->bd_self->bd_info be used to fetch the BackendInfo that can be used to call the top-most bd_search (and similar) also in overlays?
Rein