Rein Tollevik wrote:
On Wed, 30 Apr 2008, Howard Chu wrote:
rein@OpenLDAP.org wrote:
My first attempt at fixing this was to change syncprov to fetch the queued CSN values from the glue backend where it was used. But that failed, as other modules queue the CSN values in their own backend when they change things.
What other modules? Generally there cannot be any other sources of changes.
Sorry, I should have written other configurations. The CSNs get queued in the subordinate database when syncrepl is used there, or not at all (i.e. in regular updates that come in through the frontend).
OK, but that's again quite a special case. I.e., that's multi-master; in the default (single-master) case there cannot be regular updates arriving through the frontend. When a single-master syncrepl consumer is configured, that is the only possible source of updates. Let's be sure we've solved this question for the single-master case first, before addressing the multi-master case.
While it's expected that the software will be able to handle multiple glued DBs and multi-master across them, I seriously doubt that anyone out there actually knows how to configure and maintain such a setup yet.
Instead I changed ctxcsn.c so that it always queues them in the glue backend where syncprov is used. But I don't feel that my understanding of this stuff is good enough to be sure that this is the optimal solution.
I definitely don't like references to the syncprov overlay appearing in main slapd code like that. We need a different solution.
To me it makes sense to have a single queue of CSN values in a glued configuration, no matter if or where syncprov is used.
Yes, I can probably go along with that. The downside is that it may reduce write concurrency a bit, compared to a glued configuration where each glued DB is otherwise independent.
Another approach could be to have syncprov look in the glue database if it fails to find any queued CSN in a subordinate db. I haven't tested it, but that should work in both configurations. It should also remove the need to always look for the glue db which my patch requires. Would that be better?
That sounds like a decent alternative.
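For reference, the fallback lookup proposed above could be sketched roughly as follows. This is only a toy model, not slapd code: `demo_db`, `demo_queue_csn`, `demo_find_local` and `demo_find_csn` are hypothetical names invented for illustration, and the real pending-CSN queue in slapd is structured differently. The sketch shows only the lookup order: the subordinate database first, then the glue (superior) database.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the real slapd structures. */
#define DEMO_MAX_PENDING 8

typedef struct demo_db {
    const char *suffix;
    struct demo_db *glue_parent;              /* NULL for the glue database itself */
    const char *pending_csn[DEMO_MAX_PENDING];
    int pending_op[DEMO_MAX_PENDING];
    int npending;
} demo_db;

/* Queue a CSN for operation op_id on db. Returns 0 on success, -1 if full. */
int demo_queue_csn(demo_db *db, int op_id, const char *csn)
{
    if (db->npending >= DEMO_MAX_PENDING)
        return -1;
    db->pending_op[db->npending] = op_id;
    db->pending_csn[db->npending] = csn;
    db->npending++;
    return 0;
}

/* Look for a CSN queued for op_id in this database only. */
const char *demo_find_local(const demo_db *db, int op_id)
{
    int i;
    for (i = 0; i < db->npending; i++) {
        if (db->pending_op[i] == op_id)
            return db->pending_csn[i];
    }
    return NULL;
}

/* Proposed order: try the database we are running in, and only if nothing
 * is queued there fall back to the glue database above it. */
const char *demo_find_csn(const demo_db *db, int op_id)
{
    const char *csn = demo_find_local(db, op_id);
    if (csn == NULL && db->glue_parent != NULL)
        csn = demo_find_local(db->glue_parent, op_id);
    return csn;
}
```

With this order, a lookup from the subordinate finds a CSN whether it was queued locally (the syncrepl case) or in the glue database, while a lookup from the glue database never sees entries queued only in a subordinate.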
Btw, in syncprov_checkpoint() there is a similar SLAP_GLUE_SUBORDINATE test, should that have included an overlay_is_inst() clause as well?
Perhaps. You would have to use op->o_bd->bd_self instead of op->o_bd on that call.
The current test (introduced to fix ITS#5433) causes the contextCSN to be written to the glue database when syncprov is used on a subordinate db, which appears wrong to me.
Understood.
Again, the question is whether the admin intended to configure a single syncprov over an entire glued DB, or individual syncprovs over each component of the glued tree. The distinction is vital, and it's detected based on whether the syncprov overlay is above the glue overlay in the overlay stack, or below it, on the topmost DB.
Could you elaborate on when op->o_bd->bd_self must be used instead of op->o_bd? I understand that op->o_bd may be a copy of the original structure that op->o_bd->bd_self refers to, but I'm not sure when it must be used. Btw, could op->o_bd->bd_self->bd_info be used to fetch the BackendInfo that can be used to call the top-most bd_search (and similar) also in overlays?
If you read the code for overlay_is_inst() it should be obvious - that function only works when used with a real BackendDB structure. The local copy structure has had its bd_info replaced with whatever on_inst structure corresponds to the current overlay.
Yes, the bd_self points to the topmost structure, so you can use it for be_search. Much of what's happening in these overlays was intended to avoid starting over at the top though, because the code is already running in the desired overlay context.
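To illustrate the bd_self point, here is a simplified model of why an instance check needs the real structure. It is a sketch under stated assumptions: `demo_backend`, `demo_info`, `demo_overlay`, `demo_enter_overlay` and `demo_overlay_is_inst` are hypothetical names, and the real overlay_is_inst() does not return -1 (the model uses that value only to make the misuse visible). Only the field names bd_info and bd_self mirror the ones discussed above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; NOT the real slapd definitions. */
typedef struct demo_overlay {
    const char *type;
    struct demo_overlay *next;
} demo_overlay;

typedef struct demo_info {
    int is_stack_head;        /* 1: describes the whole overlay stack */
    demo_overlay *overlays;   /* meaningful only when is_stack_head is 1 */
} demo_info;

typedef struct demo_backend {
    demo_info *bd_info;
    struct demo_backend *bd_self;   /* always points at the real structure */
} demo_backend;

/* Model of an instance check: it walks the overlay list, so it only makes
 * sense when bd_info still describes the whole stack. */
int demo_overlay_is_inst(const demo_backend *be, const char *type)
{
    const demo_overlay *ov;
    if (!be->bd_info->is_stack_head)
        return -1;   /* handed a per-overlay copy: no list to walk */
    for (ov = be->bd_info->overlays; ov != NULL; ov = ov->next) {
        if (strcmp(ov->type, type) == 0)
            return 1;
    }
    return 0;
}

/* Model of entering an overlay: the operation gets a local copy whose
 * bd_info is replaced by the current overlay's own info. */
demo_backend demo_enter_overlay(const demo_backend *real, demo_info *inst_info)
{
    demo_backend copy = *real;   /* bd_self is copied and still names the real one */
    copy.bd_info = inst_info;
    return copy;
}
```

In this model, calling the check on the copy cannot work, while calling it through the copy's bd_self reaches the real structure and its full overlay list.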