Full_Name: Quanah Gibson-Mount
Version: 2.3.37
OS: Linux 2.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (38.104.138.6)
Several customers have found that delta-syncrepl locks up after a time. Today it
occurred again, and this time we were able to gather some logging data. The
last operation logged was a MOD op. This matches past lockups, where the last
logged operation was also either a MOD or an ADD.
The following files will be uploaded to the ftp site, where # will be the
assigned ITS number.
#-dbstat.delta.out.2007-10-01
which is the db_stat information for the accesslog DB
#-db_stat.out.2007-10-01
which is the db_stat information for the main DB
#-pstak.out.2007-10-01
which is the pstack information for the slapd process
Unfortunately no GDB info was retrieved this time, and reportedly gcore hung.
The most interesting part is I see no WRITE locks held in either DB, and all the
client threads are hung in a mutex.
jclarke(a)linagora.com wrote:
> I have tested this on both 2.3.38 and HEAD (same version on all 3 servers), and
> behaviour is quite different, though the end result is the same.
Due to multiple limitations, this is not expected to work in 2.3 at all.
> On HEAD, things are quite different:
> request done: ld 0x82c2228 msgid 2
> do_syncrep2: rid=7 LDAP_RES_SEARCH_RESULT
> nonpresent_callback: rid=7 got UUID b5797e8c-0486-102c-83e0-79137da6179f, dn dc=ossa,dc=linagora,dc=org
> nonpresent_callback: rid=7 got UUID b5a2f834-0486-102c-83e1-79137da6179f, dn uid=root,dc=ossa,dc=linagora,dc=org
> nonpresent_callback: rid=7 got UUID b5a31526-0486-102c-83e2-79137da6179f, dn uid=replicator,dc=ossa,dc=linagora,dc=org
> adresse de be_modify : 80c8840
> null_callback : error code 0x32
> syncrepl_updateCookie: rid=7 be_modify failed (50)
This is Insufficient Access. You have not configured the glue databases with
identical rootDN's, which is clearly documented as a requirement for glued databases.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Full_Name: Jonathan Clarke
Version: 2.3.38 and HEAD (but with slightly different results)
OS: Linux
URL: ftp://ftp.openldap.org/incoming/unwanted-deletes-syncrepl-glue.tar.gz
Submission from: (NULL) (213.41.243.192)
Hi folks,
I've come across an issue with a server using the glue overlay with one of its
subordinate databases syncrepl'd. There are two problems:
1) updating the root database's contextCSN
2) when replicating this whole server with syncrepl (to a 3rd server), certain
updates cause many entries to be deleted from the consumer.
The following "schema" should describe this setup more clearly (names TOP,
MIDDLE and BOTTOM are for easy reference):
TOP:
|----------------------------|
| One bdb backend:           |
| dc=ossa,dc=linagora,dc=org |
|----------------------------|
              |
MIDDLE:       |
|-------------------------------|
| Two bdb backends + glue:      |
| 1) dc=ossa,dc=linagora,dc=org |
|    subordinate                |
|    syncrepl from above server |
| 2) dc=linagora,dc=org         |
|    'master' for this branch   |
|                               |
| syncprov overlay              |
|-------------------------------|
              |
BOTTOM:       |
|-------------------------------|
| One bdb backend:              |
| dc=linagora,dc=org            |
| syncrepl from above server    |
|-------------------------------|
All config files, and some sample data sets, are in the archive at the URL
above.
I have tested this on both 2.3.38 and HEAD (same version on all 3 servers), and
behaviour is quite different, though the end result is the same.
On 2.3.38:
1) Set up all three servers, make sure they're sync'ed.
2) Modify some attribute on the TOP server (I add a description to the root DN,
dc=ossa,dc=linagora,dc=org)
3) Watch this modification propagate to the MIDDLE server. The contextCSN in
dc=linagora,dc=org is not updated, but the one in dc=ossa,dc=linagora,dc=org is
(it equals the entryCSN of the entry I modified). The output is the following
with loglevel=stats+sync:
8>---------------------------------------------------------------
request done: ld 0x822ec20 msgid 1
do_syncrep2: rid 007 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
syncrepl_entry: rid 007 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid 007 be_search (0)
syncrepl_entry: rid 007 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid 007 be_modify (0)
request done: ld 0x822ec20 msgid 2
do_syncrep2: rid 007 LDAP_RES_SEARCH_RESULT
8>---------------------------------------------------------------
4) Watch the BOTTOM server (see schema above) do its syncrepl and delete some
entries below the glued database (glued on MIDDLE server, not on this one). The
output is the following with loglevel=stats+sync:
8>---------------------------------------------------------------
request done: ld 0x822a320 msgid 1
request done: ld 0x822a320 msgid 2
do_syncrep2: rid 888 LDAP_RES_SEARCH_RESULT
request done: ld 0x822a320 msgid 1
do_syncrep2: rid 888 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
syncrepl_entry: rid 888 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid 888 be_search (0)
syncrepl_entry: rid 888 dc=linagora,dc=org
syncrepl_entry: rid 888 be_modify (0)
syncrepl_entry: rid 888 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid 888 be_search (0)
syncrepl_entry: rid 888 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid 888 be_modify (0)
request done: ld 0x822a320 msgid 2
do_syncrep2: rid 888 LDAP_RES_SEARCH_RESULT
syncrepl_del_nonpresent: rid 888 be_delete uid=replicator,dc=ossa,dc=linagora,dc=org (0)
syncrepl_del_nonpresent: rid 888 be_delete uid=root,dc=ossa,dc=linagora,dc=org (0)
8>---------------------------------------------------------------
On HEAD, things are quite different:
1) Start the TOP server.
2) Start the MIDDLE server. Errors happen immediately, on first sync attempt. The
output is the following with loglevel=stats+sync:
8>---------------------------------------------------------------
slapd starting
request done: ld 0x82c2228 msgid 1
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5797e8c-0486-102c-83e0-79137da6179f
syncrepl_entry: rid=7 be_search (32)
syncrepl_entry: rid=7 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 be_add (0)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a2f834-0486-102c-83e1-79137da6179f
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=root,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 be_add (0)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a31526-0486-102c-83e2-79137da6179f
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=replicator,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 be_add (0)
request done: ld 0x82c2228 msgid 2
do_syncrep2: rid=7 LDAP_RES_SEARCH_RESULT
nonpresent_callback: rid=7 got UUID b5797e8c-0486-102c-83e0-79137da6179f, dn dc=ossa,dc=linagora,dc=org
nonpresent_callback: rid=7 got UUID b5a2f834-0486-102c-83e1-79137da6179f, dn uid=root,dc=ossa,dc=linagora,dc=org
nonpresent_callback: rid=7 got UUID b5a31526-0486-102c-83e2-79137da6179f, dn uid=replicator,dc=ossa,dc=linagora,dc=org
adresse de be_modify : 80c8840
null_callback : error code 0x32
syncrepl_updateCookie: rid=7 be_modify failed (50)
request done: ld 0x82c2228 msgid 1
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5797e8c-0486-102c-83e0-79137da6179f
dn_callback : entries have identical CSN dc=ossa,dc=linagora,dc=org ours 20071001162546.703481Z#000000#000#000000, new 20071001162546.703481Z#000000#000#000000
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 entry unchanged, ignored (dc=ossa,dc=linagora,dc=org)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a2f834-0486-102c-83e1-79137da6179f
dn_callback : entries have identical CSN uid=root,dc=ossa,dc=linagora,dc=org ours 20071001162546.975377Z#000000#000#000000, new 20071001162546.975377Z#000000#000#000000
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=root,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 entry unchanged, ignored (uid=root,dc=ossa,dc=linagora,dc=org)
syncrepl_entry: rid=7 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
syncrepl_entry: rid=7 inserted UUID b5a31526-0486-102c-83e2-79137da6179f
dn_callback : entries have identical CSN uid=replicator,dc=ossa,dc=linagora,dc=org ours 20071001162546.976133Z#000000#000#000000, new 20071001162546.976133Z#000000#000#000000
syncrepl_entry: rid=7 be_search (0)
syncrepl_entry: rid=7 uid=replicator,dc=ossa,dc=linagora,dc=org
syncrepl_entry: rid=7 entry unchanged, ignored (uid=replicator,dc=ossa,dc=linagora,dc=org)
8>---------------------------------------------------------------
Obviously, the desired result is that entries are not deleted from the BOTTOM
server when replication happens. I'm a bit at a loss as to the logic behind
these updates, and how to go about correcting them.
I tried applying a patch backported from HEAD to 2.3.38 that makes syncrepl
update the contextCSN in the real root (not the bdb database root). It works in
that the contextCSN is updated correctly, but replication to BOTTOM still has
unwanted deletes. The patch is in the archive attached
(update-root-contextCSN.diff) and corresponds to revisions 1.308 and 1.309 in
syncrepl.c CVS log.
I am completely available to provide any more information necessary: logs,
testing, gdb output, etc. Any help or pointers most welcome!
Thanks in advance,
Jon
h.b.furuseth(a)usit.uio.no wrote:
> OpenLDAP is full of code like
> ptr += snprintf( ptr, ... );
> and
> bv.bv_len = snprintf( ... );
> <use bv>;
Let me add that there seem to be occasional improper uses of sprintf()
too (usually in marginal code, fortunately).
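
For readers unfamiliar with why the quoted pattern is risky: C99 snprintf()
returns the length the output would have had, not the length actually written,
and a negative value on error, so "ptr += snprintf(...)" can move ptr past the
end of the buffer. A minimal sketch of a checked alternative follows. The
helper name buf_appendf() is hypothetical and is not an OpenLDAP function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of OpenLDAP): append formatted text at
 * *ptr without ever advancing past end.
 * Returns 0 on success, -1 on a formatting error or on truncation.
 * On failure *ptr is left unchanged.
 */
static int
buf_appendf( char **ptr, char *end, const char *fmt, ... )
{
    va_list ap;
    size_t room;
    int n;

    assert( ptr != NULL && *ptr != NULL && end != NULL && *ptr <= end );
    room = (size_t)( end - *ptr );

    va_start( ap, fmt );
    n = vsnprintf( *ptr, room, fmt, ap );
    va_end( ap );

    /* n < 0: error. n >= room: output was truncated, and n is the
     * length that would have been written had there been room. */
    if ( n < 0 || (size_t) n >= room ) {
        return -1;
    }
    *ptr += n;
    return 0;
}
```

A caller would keep a pointer to the end of its buffer and stop at the first
-1 instead of blindly accumulating return values.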
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati(a)sys-net.it
---------------------------------------
hyc(a)symas.com wrote:
> Howard Chu wrote:
>> I just got tripped trying to import an LDIF with a cert with 16 byte
>> SerialNumber. I've patched this to just use the same hexadecimal format that
>> OpenSSL uses when the number is larger than ber_int_t. We really don't want
>> the format to change just because someone has a BigNum library available; it
>> needs to stay consistent.
>
> But we still need to fix serialNumberAndIssuerNormalize() to normalize to Hex now.
> And in case somebody feeds in a very large decimal integer, we still need a
> multi-word decimal-to-binary converter. As such, this bug cannot be closed yet.
OK. Does it make any sense to just move to a hex-only syntax, prefixed
by "0x", with no sign as you mentioned earlier, or should we preserve
compatibility with the original form, where the minus sign is allowed
while a number not starting with "0x" should be treated as decimal? The
latter would be probably better, but we'd need to convert decimal to
hex, and this could fail if decimals are too large.
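
To make the decimal-to-hex step concrete, here is a rough sketch of a
multi-word converter: it multiplies a fixed-size big-endian byte array by 10
and adds each digit, reporting overflow instead of wrapping. The names
dec2bin() and bin2hex() and the lowercase "0x" output are assumptions made for
illustration; this is not the code in serialNumberAndIssuerNormalize(), and
sign handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert an unsigned decimal string to a
 * big-endian integer stored in out[0..outlen-1].
 * Returns 0 on success, -1 on an empty string, a non-digit,
 * or a value too large for outlen bytes.
 */
static int
dec2bin( const char *dec, unsigned char *out, size_t outlen )
{
    size_t i;

    assert( dec != NULL && out != NULL );
    if ( *dec == '\0' ) {
        return -1;
    }
    memset( out, 0, outlen );

    for ( ; *dec != '\0'; dec++ ) {
        unsigned int carry;

        if ( *dec < '0' || *dec > '9' ) {
            return -1;
        }
        carry = (unsigned int)( *dec - '0' );

        /* out = out * 10 + digit, starting at the least significant
         * byte, which is the last one in big-endian order */
        for ( i = outlen; i-- > 0; ) {
            unsigned int v = out[i] * 10u + carry;
            out[i] = (unsigned char)( v & 0xffu );
            carry = v >> 8;
        }
        if ( carry != 0 ) {
            return -1;    /* value does not fit in outlen bytes */
        }
    }
    return 0;
}

/*
 * Hypothetical helper: write "0x" plus two hex digits per byte.
 * hex must have room for 2 * len + 3 bytes.
 */
static void
bin2hex( const unsigned char *in, size_t len, char *hex )
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    *hex++ = '0';
    *hex++ = 'x';
    for ( i = 0; i < len; i++ ) {
        *hex++ = digits[in[i] >> 4];
        *hex++ = digits[in[i] & 0x0f];
    }
    *hex = '\0';
}
```

With a buffer sized for the largest serial number one is willing to accept,
the "too large" case becomes an explicit error return rather than a silent
truncation.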
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati(a)sys-net.it
---------------------------------------
Howard Chu wrote:
> I just got tripped trying to import an LDIF with a cert with 16 byte
> SerialNumber. I've patched this to just use the same hexadecimal format that
> OpenSSL uses when the number is larger than ber_int_t. We really don't want
> the format to change just because someone has a BigNum library available; it
> needs to stay consistent.
But we still need to fix serialNumberAndIssuerNormalize() to normalize to Hex now.
And in case somebody feeds in a very large decimal integer, we still need a
multi-word decimal-to-binary converter. As such, this bug cannot be closed yet.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
hyc(a)symas.com wrote:
> Of course now that userPassword and authPassword already exist, all the good
> attribute names are already gone. ;)
"password"?
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati(a)sys-net.it
---------------------------------------
Full_Name: Howard Chu
Version: HEAD/RE24
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (63.193.240.128)
Submitted by: hyc
In RE23 the slap_passwd_parse() function parses the exop data destructively, so
it cannot be successfully parsed again by any other function that needs to. In
HEAD there is a new LBER option to specify non-destructive parsing of strings,
and slap_passwd_parse() has been changed to use this option.
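
The report does not spell out the mechanism, so the following is only a toy
illustration of how in-place parsing can be destructive in general. It is not
liblber code: the struct, the flag TOY_NOTERM and the function
toy_get_string() are all hypothetical. The parser returns a pointer into the
caller's buffer and, unless told otherwise, NUL-terminates the value in place,
which overwrites the tag byte of the next element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical length + pointer pair, for illustration only */
struct toy_bv {
    size_t len;
    unsigned char *val;
};

#define TOY_TAG_STR 0x04    /* toy tag marking a string element */
#define TOY_NOTERM  0x01    /* hypothetical flag: do not NUL-terminate */

/*
 * Parse one tag/length/value element at buf[*pos]. bv->val points into
 * buf (no copy). Unless TOY_NOTERM is set, a NUL is written just after
 * the value, clobbering whatever byte follows it in buf.
 * Returns 0 on success, -1 on a malformed or unexpected element.
 */
static int
toy_get_string( unsigned char *buf, size_t buflen, size_t *pos,
    struct toy_bv *bv, int flags )
{
    size_t p, len;

    assert( buf != NULL && pos != NULL && bv != NULL );
    p = *pos;
    if ( p > buflen || buflen - p < 2 || buf[p] != TOY_TAG_STR ) {
        return -1;
    }
    len = buf[p + 1];
    if ( len > buflen - p - 2 ) {
        return -1;
    }
    bv->val = &buf[p + 2];
    bv->len = len;
    p += 2 + len;

    /* Destructive step: terminate the string inside the input buffer */
    if ( !( flags & TOY_NOTERM ) && p < buflen ) {
        buf[p] = '\0';
    }
    *pos = p;
    return 0;
}
```

In this toy, a first consumer that reads one field without TOY_NOTERM leaves
the buffer unparseable for a second consumer that walks it from the start,
while passing TOY_NOTERM keeps the buffer intact at the cost of values that
are not NUL-terminated.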