https://bugs.openldap.org/show_bug.cgi?id=8649
--- Comment #5 from Quanah Gibson-Mount <quanah(a)openldap.org> ---
Commits:
• bc29154c
by Howard Chu at 2021-08-03T13:10:27+01:00
ITS#8649 syncrepl: fix backend selection in glued DBs
--
You are receiving this mail because:
You are on the CC list for the issue.
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #30 from Hallvard Furuseth <h.b.furuseth(a)usit.uio.no> ---
I wrote:
> https://bugs.openldap.org/show_bug.cgi?id=8958
>> In patch #5
>> + ldap_pvt_thread_pool_setspeed( &connection_pool, ctx, 0 );
>>
>> Shouldn't the minimum speed be 1, not 0?
>
> That's just the API. 0 = "slowest". I didn't want to export details
> of the tpool implementation, which might get replaced. Could use 0.0
> so it looks different, if floating point numbers are OK in libldap.
That is, I have a vague feeling that merely mentioning a floating point
number would require libm in some C implementations. Don't remember.
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #29 from Hallvard Furuseth <h.b.furuseth(a)usit.uio.no> ---
On 03.08.2021 14:25, openldap-its(a)openldap.org wrote:
> --- Comment #24 from Howard Chu <hyc(a)openldap.org> ---
> (In reply to Quanah Gibson-Mount from comment #23)
>> Created attachment 799 [details]
>> proposed fix
>
> In patch #5
> + ldap_pvt_thread_pool_setspeed( &connection_pool, ctx, 0 );
>
> Shouldn't the minimum speed be 1, not 0?
That's just the API. 0 = "slowest". I didn't want to export details
of the tpool implementation, which might get replaced. Could use 0.0
so it looks different, if floating point numbers are OK in libldap.
> Since you have
> +enum { NOT_PAUSED = 0, WANT_PAUSE = LDAP_PVT_THREAD_POOL_SPEED_MAX+1, PAUSED };
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #28 from Hallvard Furuseth <h.b.furuseth(a)usit.uio.no> ---
On 03.08.2021 14:36, openldap-its(a)openldap.org wrote:
> https://bugs.openldap.org/show_bug.cgi?id=8958
>
> --- Comment #25 from Howard Chu <hyc(a)openldap.org> ---
> (In reply to Hallvard Furuseth from comment #17)
>
>> (...) A pause only stops tasks with speed < ltp_pause.
>> In thread_pool_pause(), replace the WANT_PAUSE stage with
>>
>> while (++ltp_pause <= max speed) {
>> wait until no more tasks with speed < ltp_pause;
>> }
>>
>> Then fast tasks should breeze past slow ones when preparing
>> to pause. Until all threads have slow tasks, anyway.
>
> I don't understand how this solves anything. If a slow indexing
> task is currently running, and a fast config mod comes in, it's
> still the case that the config change could pull the DB out from
> under the indexer task. So there's nothing safe about letting the
> fast task progress while the slow task is still running.
Fast tasks still wait for *running* slow tasks. And when
there is no pause involved, slow tasks get scheduled normally.
This is only about scheduling when something wants a pause.
setspeed() does CHECK_PAUSE, standing aside for faster tasks.
Then, a fast task which wants a pause (cn=config change #2) won't
block other fast tasks while a slower task (indexer) is running.
So normal tasks will keep getting scheduled, instead of slapd
locking up for them.
This all depends on there being only a few config changes/slow
tasks at any time, since they do occupy a thread.
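For readers following the thread, the staged rule described above ("a pause only stops tasks with speed < ltp_pause") can be sketched in a few lines of C. This is only an illustration, not the patch: SPEED_MAX = 1 is assumed from the FAST=1/SLOW=0 proposal, the enum mirrors the one quoted from the patch, and must_stand_aside() and runnable_tasks() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, taken from the FAST=1 / SLOW=0 proposal in this thread;
 * the real LDAP_PVT_THREAD_POOL_SPEED_MAX may differ. */
enum { SPEED_SLOW = 0, SPEED_FAST = 1, SPEED_MAX = SPEED_FAST };

/* Mirrors the enum quoted from the patch: stages 1..SPEED_MAX are the
 * intermediate steps a pause request walks through before WANT_PAUSE. */
enum { NOT_PAUSED = 0, WANT_PAUSE = SPEED_MAX + 1, PAUSED };

/* Hypothetical helper: a task stands aside at its next pause check once
 * its declared speed is below the current pause stage. */
bool must_stand_aside(int speed, int pause_stage)
{
    assert(speed >= SPEED_SLOW && speed <= SPEED_MAX);
    return speed < pause_stage;
}

/* Hypothetical helper: count the tasks still allowed to run at a stage. */
int runnable_tasks(const int *speeds, int n, int pause_stage)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (!must_stand_aside(speeds[i], pause_stage))
            count++;
    }
    return count;
}
```

With one slow and two fast tasks, runnable_tasks() gives 3 at NOT_PAUSED, 2 at stage 1 (only the slow task stands aside), and 0 at WANT_PAUSE.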
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #27 from Ondřej Kuzník <ondra(a)mistotebe.net> ---
On Tue, Aug 03, 2021 at 12:42:01PM +0000, openldap-its(a)openldap.org wrote:
> I don't think we should be changing anything else about how tpool
> handles pauses. We should just be fixing this specific case of the
> indexer being a slow task, by implementing checkpointing into the
> indexer. I.e., when it detects a pause request it should save its
> current progress and pause itself. If it gets resumed it can pick up
> where it left off, or if a config change affects it, it can abort or
> start over. A checkpointing mechanism is needed anyway, for the
> case of a (clean) shutdown while the indexer is running.
I'll put a suggestion here that we discussed previously: to support this
checkpointing for pauses/shutdowns, the indexer could be writing to a
"scratch" database (whatever that means for each backend) along with
resume data and move them into place when finished. You mentioned that
for liblmdb, this would need support for a database to be renamed.
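As a rough file-based analogy for the "scratch then move into place" idea (not an LMDB example, since LMDB named databases are not plain files), a sketch in standard C might look like the following. build_then_publish() and its parameters are hypothetical, resume data is left out, and rename() only replaces an existing destination on POSIX-like systems.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build the new index in scratch_path, then move it
 * over final_path in one step. Returns 0 on success, -1 on failure, in
 * which case the scratch file is removed and final_path is untouched. */
int build_then_publish(const char *scratch_path, const char *final_path,
                       const char *contents)
{
    assert(scratch_path != NULL && final_path != NULL && contents != NULL);

    FILE *fp = fopen(scratch_path, "w");
    if (fp == NULL)
        return -1;

    int ok = (fputs(contents, fp) != EOF);
    if (fclose(fp) != 0)
        ok = 0;

    /* Readers only ever see the old file or the complete new one. */
    if (ok && rename(scratch_path, final_path) == 0)
        return 0;

    remove(scratch_path);
    return -1;
}
```

If the work is interrupted before the rename, the live file is never touched, which is the property the scratch database is meant to provide.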
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #26 from Howard Chu <hyc(a)openldap.org> ---
(In reply to Howard Chu from comment #25)
> (In reply to Hallvard Furuseth from comment #17)
>
> > Duuh, right. I got stuck looking for what's special about the
> > indexing task and couldn't find it:-( I need to make it special.
> >
> > So, let tasks declare their expected speed until finish or
> > between pausechecks. At FAST=1 (default) and SLOW=0.
> > A pause only stops tasks with speed < ltp_pause.
> > In thread_pool_pause(), replace the WANT_PAUSE stage with
> >
> > while (++ltp_pause <= max speed) {
> > wait until no more tasks with speed < ltp_pause;
> > }
> >
> > Then fast tasks should breeze past slow ones when preparing
> > to pause. Until all threads have slow tasks, anyway.
>
> I don't understand how this solves anything. If a slow indexing
> task is currently running, and a fast config mod comes in, it's
> still the case that the config change could pull the DB out from
> under the indexer task. So there's nothing safe about letting the
> fast task progress while the slow task is still running.
I don't think we should be changing anything else about how tpool
handles pauses. We should just be fixing this specific case of the
indexer being a slow task, by implementing checkpointing into the
indexer. I.e., when it detects a pause request it should save its
current progress and pause itself. If it gets resumed it can pick up
where it left off, or if a config change affects it, it can abort or
start over. A checkpointing mechanism is needed anyway, for the
case of a (clean) shutdown while the indexer is running.
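A minimal sketch of what such checkpointing might look like, assuming the indexer walks entries by a numeric ID and polls for a pause between entries. All names here (index_checkpoint, index_run, index_restart, pause_when_budget_spent) are hypothetical and not slapd or tpool API; a real indexer would poll the thread pool's pause state instead of the demo callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resume data: the next entry ID still to be indexed. */
typedef struct {
    size_t next_id;
} index_checkpoint;

typedef enum { INDEX_DONE, INDEX_PAUSED } index_status;

/* Callback polled between entries; returns true when a pause is wanted. */
typedef bool (*pause_check_fn)(void *ctx);

/* Index entries [cp->next_id, total). On a pause request the function
 * returns early; cp->next_id already holds the resume point. */
index_status index_run(index_checkpoint *cp, size_t total,
                       pause_check_fn pause_requested, void *ctx)
{
    assert(cp != NULL && pause_requested != NULL);
    while (cp->next_id < total) {
        if (pause_requested(ctx))
            return INDEX_PAUSED;
        /* ... index entry cp->next_id here ... */
        cp->next_id++;
    }
    return INDEX_DONE;
}

/* A config change that invalidates the work so far starts over. */
void index_restart(index_checkpoint *cp)
{
    cp->next_id = 0;
}

/* Demo pause source: ctx points to an int budget of entries that may be
 * processed before a pause is requested. */
bool pause_when_budget_spent(void *ctx)
{
    int *budget = ctx;
    if (*budget == 0)
        return true;
    (*budget)--;
    return false;
}
```

Calling index_run() again with the same checkpoint resumes where the previous call stopped; the same saved state could be written out on a clean shutdown.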
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #25 from Howard Chu <hyc(a)openldap.org> ---
(In reply to Hallvard Furuseth from comment #17)
> Duuh, right. I got stuck looking for what's special about the
> indexing task and couldn't find it:-( I need to make it special.
>
> So, let tasks declare their expected speed until finish or
> between pausechecks. At FAST=1 (default) and SLOW=0.
> A pause only stops tasks with speed < ltp_pause.
> In thread_pool_pause(), replace the WANT_PAUSE stage with
>
> while (++ltp_pause <= max speed) {
> wait until no more tasks with speed < ltp_pause;
> }
>
> Then fast tasks should breeze past slow ones when preparing
> to pause. Until all threads have slow tasks, anyway.
I don't understand how this solves anything. If a slow indexing
task is currently running, and a fast config mod comes in, it's
still the case that the config change could pull the DB out from
under the indexer task. So there's nothing safe about letting the
fast task progress while the slow task is still running.
https://bugs.openldap.org/show_bug.cgi?id=8958
--- Comment #24 from Howard Chu <hyc(a)openldap.org> ---
(In reply to Quanah Gibson-Mount from comment #23)
> Created attachment 799 [details]
> proposed fix
In patch #5
+ ldap_pvt_thread_pool_setspeed( &connection_pool, ctx, 0 );
Shouldn't the minimum speed be 1, not 0?
Since you have
+enum { NOT_PAUSED = 0, WANT_PAUSE = LDAP_PVT_THREAD_POOL_SPEED_MAX+1, PAUSED };
A speed of 0 would mean no pause at all, wouldn't it?
https://bugs.openldap.org/show_bug.cgi?id=8649
Howard Chu <hyc(a)openldap.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |TEST
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #4 from Howard Chu <hyc(a)openldap.org> ---
fixed in master