Re: ACLs broken by ITS#5419
by hyc@symas.com
Rein Tollevik wrote:
> On Mon, 24 Mar 2008, Howard Chu wrote:
>
>> rein(a)basefarm.no wrote:
>>> The change to servers/slapd/backend.c for ITS#5416 seem to have broken the
>>> ability for group and set statements in access control lines to refer to
>>> entries
>>> outside the backend currently being operated on.
>> That ability was never intended in the first place. Historically, backends in
>> slapd have been treated as isolated DSAs with no connection to each other.
>> They've required special mechanisms (like back-relay or slapo-glue) to be
>> joined.
>
> Yes, I know, the change that allowed this was imo the one that made sets
> and groups really useful. Our database configuration still has traces of
> the workarounds the lack of this feature once forced us to make..
>
> But, the latest change also removes this ability for databases subordinate
> to the same common superior (i.e using the slapo-glue). If I understand
> you correct it is a bug that glue'ed databases cannot refer to each other,
> although I still consider it a bug (or at least a huge drawback) if this
> would no longer be generally possible.
Actually, looking back over CVS, it seems this ability has existed since
OpenLDAP 2.0, intended or not. Will have to work up a better solution to
restore that behavior.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
15 years, 2 months
Re: (ITS#5439) syncprov race condition seg. fault
by hyc@symas.com
rein(a)basefarm.no wrote:
> Full_Name: Rein Tollevik
> Version: CVS head
> OS: CentOS 4.4
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (81.93.160.250)
>
>
> We have bin hit by what looks like a race condition bug in syncprov. We got
> some core dumps all showing stack frames like the one at the end. As such nasty
> bugs tends to do it have behaved OK after I restarted slapd with more debug
> output :-( (trace + stats + stats2 + sync).
>
> The configuration is a master server with multiple bdb backend databases all
> being subordinate to the same glue database where syncprov is used. One of the
> backends is a syncrepl consumer from another server, the server is master for
> the other backends. There are multiple consumers for the syncprov suffix, which
> I assume is what causes the race condition to happen.
>
> Note the a=0xBAD argument to attr_find(), which I expect is the result of some
> other thread freeing the attribute list it was called with while it was
> processing it. The rs->sr_entry->e_attrs argument passed to attr_find() as the
> original "a" argument by findpres_cb() looks like a perfectly valid structure,
> as are all the attributes found by following the a_next pointer. The list is
> terminated by an attribute with a NULL a_next value, none of the a_next values
> are 0xBAD.
I don't believe that's the cause. Notice that arg0 in stack frame #9 is also
0xbad, even though it is shown correctly in frames 8 and 10. Something else is
going on.
> I'm currently trying to gather more information related to this bug, any
> pointers as to what I should look for is appreciated. I'm posting this bug
> report now in the hope that the stack frame should enlighten someone with better
> knowledge of the code than what I have.
Check for stack overruns, compile without optimization and make sure it's not
a compiler optimization bug, etc.
>
> Rein Tollevik
> Basefarm AS
>
> #0 0x0807d03a in attr_find (a=0xbad, desc=0x81e8680) at attr.c:665
> #1 0xb7a656f6 in findpres_cb (op=0xaf068ba4, rs=0xaf068b68) at syncprov.c:546
> #2 0x0808416d in slap_response_play (op=0xaf068ba4, rs=0xaf068b68) at
> result.c:307
> #3 0x0808555b in slap_send_search_entry (op=0xaf068ba4, rs=0xaf068b68) at
> result.c:770
> #4 0x080f2cdc in bdb_search (op=0xaf068ba4, rs=0xaf068b68) at search.c:870
> #5 0x080db72b in overlay_op_walk (op=0xaf068ba4, rs=0xaf068b68,
> which=op_search, oi=0x8274218, on=0x8274318) at backover.c:653
> #6 0x080dbcaf in over_op_func (op=0xaf068ba4, rs=0xaf068b68, which=op_search)
> at backover.c:705
> #7 0x080dbdef in over_op_search (op=0xaf068ba4, rs=0xaf068b68) at
> backover.c:727
> #8 0x080d9570 in glue_sub_search (op=0xaf068ba4, rs=0xaf068b68, b0=0xaf068ba4,
> on=0xaf068ba4) at backglue.c:340
> #9 0x080da131 in glue_op_search (op=0xbad, rs=0xaf068b68) at backglue.c:459
> #10 0x080db6d5 in overlay_op_walk (op=0xaf068ba4, rs=0xaf068b68,
> which=op_search, oi=0x8271860, on=0x8271a60) at backover.c:643
> #11 0x080dbcaf in over_op_func (op=0xaf068ba4, rs=0xaf068b68, which=op_search)
> at backover.c:705
> #12 0x080dbdef in over_op_search (op=0xaf068ba4, rs=0xaf068b68) at
> backover.c:727
> #13 0xb7a65ff4 in syncprov_findcsn (op=0x85c7e60, mode=FIND_PRESENT) at
> syncprov.c:700
> #14 0xb7a670a0 in syncprov_op_search (op=0x85c7e60, rs=0xaf06a1c0) at
> syncprov.c:2277
> #15 0x080db6d5 in overlay_op_walk (op=0x85c7e60, rs=0xaf06a1c0, which=op_search,
> oi=0x8271860, on=0x8271b60) at backover.c:643
> #16 0x080dbcaf in over_op_func (op=0x85c7e60, rs=0xaf06a1c0, which=op_search) at
> backover.c:705
> #17 0x080dbdef in over_op_search (op=0x85c7e60, rs=0xaf06a1c0) at
> backover.c:727
> #18 0x08076554 in fe_op_search (op=0x85c7e60, rs=0xaf06a1c0) at search.c:368
> #19 0x080770e4 in do_search (op=0x85c7e60, rs=0xaf06a1c0) at search.c:217
> #20 0x08073e28 in connection_operation (ctx=0xaf06a2b8, arg_v=0x85c7e60) at
> connection.c:1084
> #21 0x08074f14 in connection_read_thread (ctx=0xaf06a2b8, argv=0x59) at
> connection.c:1211
> #22 0xb7fb5546 in ldap_int_thread_pool_wrapper (xpool=0x81ee240) at tpool.c:663
> #23 0xb7c80371 in start_thread () from /lib/tls/libpthread.so.0
> #24 0xb7c17ffe in clone () from /lib/tls/libc.so.6
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
15 years, 2 months
Re: (ITS#5340) REP_ENTRY_MODIFIABLE bug in dynlist
by ando@sys-net.it
h.b.furuseth(a)usit.uio.no wrote:
> h.b.furuseth(a)usit.uio.no writes:
>> overlays/dynlist.c lines 371-376 sets REP_ENTRY_MUSTBEFREED in rs->sr_flags
>> without duplicating the entry, if REP_ENTRY_MODIFIABLE was set.
>> Thus the entry gets freed twice. Breaks test044-dynlist with back-ldif.
>
> Actually obeying rs->sr_flags seems to be a more general problem - but I
> don't know what are bugs and what are my lacking understanding of the
> flags. I.e. when can an overlay change REP_ENTRY_MUSTBEFREED,
> REP_ENTRY_MUSTRELEASE? Usually backends/overlays that do not set these
> flags, don't seem to expect them to change either.
In general, callbacks are supposed to know about the existence of
rs->sr_flags and deal with them. In detail, callbacks should use
mechanisms that allow them to dispose of read-only resources without
counting on the fact that rs->sr_* fields will be persistent through the
callback chain. So, callbacks that need to modify data should check
whether that data is modifiable; if it's not, they should first copy it,
and set the rs->sr_flags as appropriate. The main reason for those
flags to exist is optimization, otherwise we'd always duplicate data and
dispose of it when done. So back-bdb/back-hdb, which are supposed to be
the high-end database implementations, will always pass read-only data
to the callback chain; however, if an overlay needs to muck with that
data, it will first copy it, and set the REP_ENTRY_MODIFYABLE to inform
other overlays that data can be freely modified without copying it
again. On its own, back-bdb/back-hdb will take care of releasing the
original entry without the need to use the value of rs->sr_entry; it'll
use its copy of the entry's pointer.
The existence of the REP_ENTRY_MUSTRELEASE value is not totally clear to
me; it seems to indicate that in some cases callbacks want to pass a
read-only entry without providing further cleanup code to release it; it
makes sense, since this simple operation can be done by
slap_send_search_entry() instead, if all that's required; however, this
means that any callback that replaces the entry must remember to release
it and must muck with rs->sr_flags accordingly. However, it is not
clear what happens when a callback chain is interrupted by
slap_null_cb() or similar, without getting to slap_send_search_entry().
This seems to indicate that callbacks should always provide a last
resort means to release the resources they set; if read-only, by keeping
track of what they sent; if modifiable, by freeing the contents of
rs->sr_* if not NULL, setting it to NULL to prevent further cleanup.
Similarly, the existence of REP_ENTRY_MUSTBEFREED is not totally clear:
in principle as soon as REP_ENTRY_MODIFYABLE is set, it should imply
REP_ENTRY_MUSTBEFREED; the only difference in the semantics of the two
is that REP_ENTRY_MODIFYABLE without REP_ENTRY_MUSTBEFREED implies that
the callback that set the former will take care of freeing the entry;
however, other callbacks may further modify it, so freeing temporary
data should probably be left to the final handler.
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati(a)sys-net.it
---------------------------------------
15 years, 2 months
(ITS#5439) syncprov race condition seg. fault
by rein@basefarm.no
Full_Name: Rein Tollevik
Version: CVS head
OS: CentOS 4.4
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (81.93.160.250)
We have bin hit by what looks like a race condition bug in syncprov. We got
some core dumps all showing stack frames like the one at the end. As such nasty
bugs tends to do it have behaved OK after I restarted slapd with more debug
output :-( (trace + stats + stats2 + sync).
The configuration is a master server with multiple bdb backend databases all
being subordinate to the same glue database where syncprov is used. One of the
backends is a syncrepl consumer from another server, the server is master for
the other backends. There are multiple consumers for the syncprov suffix, which
I assume is what causes the race condition to happen.
Note the a=0xBAD argument to attr_find(), which I expect is the result of some
other thread freeing the attribute list it was called with while it was
processing it. The rs->sr_entry->e_attrs argument passed to attr_find() as the
original "a" argument by findpres_cb() looks like a perfectly valid structure,
as are all the attributes found by following the a_next pointer. The list is
terminated by an attribute with a NULL a_next value, none of the a_next values
are 0xBAD.
I'm currently trying to gather more information related to this bug, any
pointers as to what I should look for is appreciated. I'm posting this bug
report now in the hope that the stack frame should enlighten someone with better
knowledge of the code than what I have.
Rein Tollevik
Basefarm AS
#0 0x0807d03a in attr_find (a=0xbad, desc=0x81e8680) at attr.c:665
#1 0xb7a656f6 in findpres_cb (op=0xaf068ba4, rs=0xaf068b68) at syncprov.c:546
#2 0x0808416d in slap_response_play (op=0xaf068ba4, rs=0xaf068b68) at
result.c:307
#3 0x0808555b in slap_send_search_entry (op=0xaf068ba4, rs=0xaf068b68) at
result.c:770
#4 0x080f2cdc in bdb_search (op=0xaf068ba4, rs=0xaf068b68) at search.c:870
#5 0x080db72b in overlay_op_walk (op=0xaf068ba4, rs=0xaf068b68,
which=op_search, oi=0x8274218, on=0x8274318) at backover.c:653
#6 0x080dbcaf in over_op_func (op=0xaf068ba4, rs=0xaf068b68, which=op_search)
at backover.c:705
#7 0x080dbdef in over_op_search (op=0xaf068ba4, rs=0xaf068b68) at
backover.c:727
#8 0x080d9570 in glue_sub_search (op=0xaf068ba4, rs=0xaf068b68, b0=0xaf068ba4,
on=0xaf068ba4) at backglue.c:340
#9 0x080da131 in glue_op_search (op=0xbad, rs=0xaf068b68) at backglue.c:459
#10 0x080db6d5 in overlay_op_walk (op=0xaf068ba4, rs=0xaf068b68,
which=op_search, oi=0x8271860, on=0x8271a60) at backover.c:643
#11 0x080dbcaf in over_op_func (op=0xaf068ba4, rs=0xaf068b68, which=op_search)
at backover.c:705
#12 0x080dbdef in over_op_search (op=0xaf068ba4, rs=0xaf068b68) at
backover.c:727
#13 0xb7a65ff4 in syncprov_findcsn (op=0x85c7e60, mode=FIND_PRESENT) at
syncprov.c:700
#14 0xb7a670a0 in syncprov_op_search (op=0x85c7e60, rs=0xaf06a1c0) at
syncprov.c:2277
#15 0x080db6d5 in overlay_op_walk (op=0x85c7e60, rs=0xaf06a1c0, which=op_search,
oi=0x8271860, on=0x8271b60) at backover.c:643
#16 0x080dbcaf in over_op_func (op=0x85c7e60, rs=0xaf06a1c0, which=op_search) at
backover.c:705
#17 0x080dbdef in over_op_search (op=0x85c7e60, rs=0xaf06a1c0) at
backover.c:727
#18 0x08076554 in fe_op_search (op=0x85c7e60, rs=0xaf06a1c0) at search.c:368
#19 0x080770e4 in do_search (op=0x85c7e60, rs=0xaf06a1c0) at search.c:217
#20 0x08073e28 in connection_operation (ctx=0xaf06a2b8, arg_v=0x85c7e60) at
connection.c:1084
#21 0x08074f14 in connection_read_thread (ctx=0xaf06a2b8, argv=0x59) at
connection.c:1211
#22 0xb7fb5546 in ldap_int_thread_pool_wrapper (xpool=0x81ee240) at tpool.c:663
#23 0xb7c80371 in start_thread () from /lib/tls/libpthread.so.0
#24 0xb7c17ffe in clone () from /lib/tls/libc.so.6
15 years, 2 months
Re: (ITS#5401) slapd crashes in mirrormode
by ali.pouya@free.fr
Howard Chu wrote :
>A patch for this is now in CVS HEAD. Please test.
>I was unable to reproduce the reported crash, so your feedback is important
>here.
Hi Howard;
We tested the HEAD extracted march 18 around 16h GMT.
It's OK.
Thanks a lot.
The ITS can be closed.
Best regards
Ali
15 years, 2 months
Re: ACLs broken by ITS#5419
by rein@basefarm.no
On Mon, 24 Mar 2008, Howard Chu wrote:
> rein(a)basefarm.no wrote:
>>
>> The change to servers/slapd/backend.c for ITS#5416 seem to have broken the
>> ability for group and set statements in access control lines to refer to
>> entries
>> outside the backend currently being operated on.
>
> That ability was never intended in the first place. Historically, backends in
> slapd have been treated as isolated DSAs with no connection to each other.
> They've required special mechanisms (like back-relay or slapo-glue) to be
> joined.
Yes, I know, the change that allowed this was imo the one that made sets
and groups really useful. Our database configuration still has traces of
the workarounds the lack of this feature once forced us to make..
But, the latest change also removes this ability for databases subordinate
to the same common superior (i.e using the slapo-glue). If I understand
you correct it is a bug that glue'ed databases cannot refer to each other,
although I still consider it a bug (or at least a huge drawback) if this
would no longer be generally possible.
Rein
15 years, 2 months
Re: ACLs broken by ITS#5419
by hyc@symas.com
rein(a)basefarm.no wrote:
> Full_Name: Rein Tollevik
> Version: CVS head
> OS: linux, solaris
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (84.215.2.34)
>
>
> The change to servers/slapd/backend.c for ITS#5416 seem to have broken the
> ability for group and set statements in access control lines to refer to entries
> outside the backend currently being operated on.
That ability was never intended in the first place. Historically, backends in
slapd have been treated as isolated DSAs with no connection to each other.
They've required special mechanisms (like back-relay or slapo-glue) to be joined.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
15 years, 2 months
ACLs broken by ITS#5419
by rein@basefarm.no
Full_Name: Rein Tollevik
Version: CVS head
OS: linux, solaris
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (84.215.2.34)
The change to servers/slapd/backend.c for ITS#5416 seem to have broken the
ability for group and set statements in access control lines to refer to entries
outside the backend currently being operated on.
This is caused by op->op_bd being set back to the backend being operated on when
backend_group() was called, not the backend returned by select_backend() for the
group being referred to as was always done before this change was made.
>From what I can see of the cases where backend_group() is called (i.e in ACLs,
limits and the dynlist and dyngroup overlays) I think the correct fix is to back
out this changes as far as that function is conserned. But I don't know enough
of other consequences, so I haven't tried it.
Rein Tollevik
Basefarm AS
15 years, 2 months
Re: (ITS#5434) syncprov in a glue'ed environment must search through the glue overlay
by rein@basefarm.no
On Wed, 19 Mar 2008, Howard Chu wrote:
> rein(a)basefarm.no wrote:
>>
>> When the syncprov overlay is stacked on top of other overlays (notably the
>> glue
>> overlay) it must search through the next overlay, not the original backend
>> DB.
>> Otherwise it will not find any entries from the subordinate databases and
>> consumers will remove those entries when they enters the refresh phase of
>> the
>> syncrepl protocol.
>>
>> The attached patch fixes this problem.
>
> Thanks, fixed in HEAD
It looks as if my fix to this bug was utterly wrong :-( It probably only
works when syncprov is stacked on top of the glue overlay. not with other
overlays.
First, searching through the next overlay looses the overlay semantics
(i.e to return SLAP_CB_CONTINUE etc.) Even worse, a seg. fault will
occur if the bi_op_search() method is not implement by the next overlay.
I don't know whether that is actually allowed or not, but it is not done
in at least the auditlog overlay.
The correct fix should (I hope...) be to use the original overinst
structure as the bd_info, as the patch at the end (which is relative to
the current cvs head) implements. Since o_sync_mode appear to always
have been cleared before be_search() is called it should be ignored by
syncprov_op_search().
Sorry for the inconvenience!
Rein
Index: OpenLDAP/servers/slapd/overlays/syncprov.c
diff -u OpenLDAP/servers/slapd/overlays/syncprov.c:1.1.1.12 OpenLDAP/servers/slapd/overlays/syncprov.c:1.9
--- OpenLDAP/servers/slapd/overlays/syncprov.c:1.1.1.12 Sat Mar 22 16:48:20 2008
+++ OpenLDAP/servers/slapd/overlays/syncprov.c Sun Mar 23 14:06:03 2008
@@ -696,10 +696,7 @@
break;
}
- if ( on->on_next )
- fop.o_bd->bd_info = (BackendInfo *)on->on_next;
- else
- fop.o_bd->bd_info = on->on_info->oi_orig;
+ fop.o_bd->bd_info = (BackendInfo *)on->on_info;
fop.o_bd->be_search( &fop, &frs );
fop.o_bd->bd_info = (BackendInfo *)on;
@@ -1531,10 +1528,7 @@
fop.ors_filter = ⁡
cb.sc_response = playlog_cb;
- if ( on->on_next )
- fop.o_bd->bd_info = (BackendInfo *)on->on_next;
- else
- fop.o_bd->bd_info = on->on_info->oi_orig;
+ fop.o_bd->bd_info = (BackendInfo *)on->on_info;
for ( i=ndel; i<num; i++ ) {
if ( uuids[i].bv_len == 0 ) continue;
15 years, 2 months
Re: (ITS#5340) REP_ENTRY_MODIFIABLE bug in dynlist
by h.b.furuseth@usit.uio.no
h.b.furuseth(a)usit.uio.no writes:
> overlays/dynlist.c lines 371-376 sets REP_ENTRY_MUSTBEFREED in rs->sr_flags
> without duplicating the entry, if REP_ENTRY_MODIFIABLE was set.
> Thus the entry gets freed twice. Breaks test044-dynlist with back-ldif.
Actually obeying rs->sr_flags seems to be a more general problem - but I
don't know what are bugs and what are my lacking understanding of the
flags. I.e. when can an overlay change REP_ENTRY_MUSTBEFREED,
REP_ENTRY_MUSTRELEASE? Usually backends/overlays that do not set these
flags, don't seem to expect them to change either.
--
Hallvard
15 years, 2 months