Re: (ITS#4817) modify "replace" command firing conflict error where it shouldn't be
by daniel@ncsu.edu
Hi Howard! As much as I hate to say this, it did not solve the problem I
was running into. =( I rebuilt OpenLDAP with the entry.c patch and
rebuilt the LDAP database from scratch (pulling from sources and using
slapadd). Then I ran the following modification via ldapmodify (as always,
sensitive info is XXX'd out):
dn: uid=XXX,ou=students,ou=people,dc=ncsu,dc=edu
changetype: modify
replace: ou
ou: Political Science - Int Politics Conc
ou: B A - History
-
replace: ncsucurriculumcode
ncsucurriculumcode: LIP
ncsucurriculumcode: LAH
And I get:
modifying entry "uid=XXX,ou=students,ou=people,dc=ncsu,dc=edu"
ldap_modify: Type or value exists (20)
additional info: modify/replace: ou: value #1 already exists
(And again, I can 'fix' it by doing a delete: ou and then redoing the
replace; ncsucurriculumcode fails in exactly the same way. After deleting
and redoing the replace, I can replace the ou over and over again
with the same data and it works every time from then on.)
Entry as it stood before running that statement was (as produced by slapcat):
dn: uid=XXX,ou=students,ou=people,dc=ncsu,dc=edu
objectClass: person
objectClass: inetOrgPerson
objectClass: ncsuPerson
uid: XXX
cn: xxx
sn: XXX
title: Sophomore
ncsuTwoPartName: XXX
organizationalStatus: registered
o: NC State University
givenName: XXX
ncsuMiddleName: XXX
initials: XXX
displayName: XXX
ncsuAltDisplayName: XXX
ncsuCampusID: XXX
ncsuClassCode: SO
ou: Political Science - Int Politics Conc
ncsuCurriculumCode: LIP
ou: B A - History
ncsuCurriculumCode: LAH
mail: ztmiller(a)gmail.com
ncsuPrimaryEMail: XXX
registeredAddress: XXX
postalAddress: XXX
telephoneNumber: XXX
l: Graham
st: NC
postalCode: 27253
ncsuPrimaryRole: student
structuralObjectClass: inetOrgPerson
entryUUID: d10a18fe-4286-102b-89b0-9fbe6f44e542
creatorsName: cn=XXX,dc=ncsu,dc=edu
modifiersName: cn=xxX,dc=ncsu,dc=edu
createTimestamp: 20070127191747Z
modifyTimestamp: 20070127191747Z
entryCSN: 20070127191747Z#000014#00#000000
Note that yes, I am replacing the setting with exactly what it was before,
for testing purposes, but there's no reason that shouldn't work, correct?
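For reference, here is a minimal sketch of the replace semantics described in RFC 4511 (section 4.6): replace swaps out all existing values of the attribute, creates the attribute if it is missing, and deletes it when no values are given. It is a toy model, not OpenLDAP code; the function name apply_replace and the dict-based entry are assumptions made for illustration.

```python
# Hypothetical model of RFC 4511 modify/replace; not taken from slapd.
def apply_replace(entry, attr, values):
    """Return a copy of `entry` with all values of `attr` replaced by `values`.

    Attribute names are matched case-insensitively. An empty `values`
    list removes the attribute, and is a no-op if it does not exist.
    """
    key = attr.lower()
    # Drop every existing value of the attribute, whatever its spelling.
    result = {name: list(vals) for name, vals in entry.items()
              if name.lower() != key}
    if values:
        result[attr] = list(values)
    return result


entry = {
    "ou": ["Political Science - Int Politics Conc", "B A - History"],
    "ncsuCurriculumCode": ["LIP", "LAH"],
}

# Replacing with the values already present leaves the entry unchanged,
# no matter how often it is repeated.
once = apply_replace(entry, "ou", entry["ou"])
twice = apply_replace(once, "ou", entry["ou"])
```

Under this model the request above never has a reason to return "Type or value exists", because the old values are discarded before the new ones are stored.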
Daniel
> Daniel Henninger wrote:
>
>> RE23... ;D Does that mean I should wait for the next minor release? Or
>> shall I pull from CVS/SVN? or?
>
> You can pull the current slapd/entry.c from CVS (use the
> OPENLDAP_REL_ENG_2_3 tag) if you want to patch your current builds.
> The fix will be in 2.3.34.
>
> --
> -- Howard Chu
> Chief Architect, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc
> OpenLDAP Core Team http://www.openldap.org/project/
>
16 years, 4 months
Re: (ITS#4821) test043 cores; apparently incorrect resource usage
by ando@sys-net.it
ando(a)sys-net.it wrote:
> hyc(a)symas.com wrote:
>
>> Working now with patched libldap_r/tpool.c
>
> But now it leaks a lot at shutdown...
I've plugged the config related leaks; what appears to be missing is the
companion of connection_fake_init() that takes care of destroying the
slab. BTW, there's a SLAPI-related connection_fake_destroy() which has
nothing to do with the purpose...
p.
16 years, 4 months
Re: (ITS#4821) test043 cores; apparently incorrect resource usage
by ando@sys-net.it
hyc(a)symas.com wrote:
> Working now with patched libldap_r/tpool.c
But now it leaks a lot at shutdown...
<producer>
==5246== 24,118,976 (1,728 direct, 24,117,248 indirect) bytes in 54 blocks are definitely lost in loss record 5 of 8
==5246== at 0x4004405: malloc (vg_replace_malloc.c:149)
==5246== by 0x8217EC7: ber_memalloc_x (memory.c:226)
==5246== by 0x809ECCC: ch_malloc (ch_malloc.c:54)
==5246== by 0x80DD5A1: slap_sl_mem_create (sl_malloc.c:122)
==5246== by 0x8081145: connection_fake_init (connection.c:1997)
==5246== by 0x8070E03: config_back_db_open (bconfig.c:5272)
==5246== by 0x808D1F0: backend_startup_one (backend.c:212)
==5246== by 0x808D67D: backend_startup (backend.c:303)
==5246== by 0x80B47B5: slap_startup (init.c:248)
==5246== by 0x8062D75: main (main.c:923)
</producer>
<consumer>
==5265== 5,243,255 (375 direct, 5,242,880 indirect) bytes in 12 blocks are definitely lost in loss record 3 of 7
==5265== at 0x4004405: malloc (vg_replace_malloc.c:149)
==5265== by 0x8217EC7: ber_memalloc_x (memory.c:226)
==5265== by 0x81FAA5E: ldap_dn2bv_x (getdn.c:3027)
==5265== by 0x8096621: dnNormalize (dn.c:485)
==5265== by 0x80EBB59: parse_syncrepl_line (syncrepl.c:3039)
==5265== by 0x80EDB0E: add_syncrepl (syncrepl.c:3408)
==5265== by 0x80EED08: syncrepl_config (syncrepl.c:3670)
==5265== by 0x8073163: config_set_vals (config.c:305)
==5265== by 0x807364E: config_add_vals (config.c:374)
==5265== by 0x80745EE: read_config_file (config.c:731)
==5265== by 0x806C3A2: read_config (bconfig.c:3440)
==5265== by 0x8062668: main (main.c:745)
...
==5301== 23,068,672 bytes in 8 blocks are possibly lost in loss record 7 of 7
==5301== at 0x4004405: malloc (vg_replace_malloc.c:149)
==5301== by 0x8217EC7: ber_memalloc_x (memory.c:226)
==5301== by 0x809ECCC: ch_malloc (ch_malloc.c:54)
==5301== by 0x80DD5AF: slap_sl_mem_create (sl_malloc.c:123)
==5301== by 0x8081145: connection_fake_init (connection.c:1997)
==5301== by 0x8070E03: config_back_db_open (bconfig.c:5272)
==5301== by 0x808D1F0: backend_startup_one (backend.c:212)
==5301== by 0x808D67D: backend_startup (backend.c:303)
==5301== by 0x80B47B5: slap_startup (init.c:248)
==5301== by 0x8062D75: main (main.c:923)
</consumer>
p.
16 years, 4 months
Re: (ITS#4821) test043 cores; apparently incorrect resource usage
by hyc@symas.com
hyc(a)symas.com wrote:
> ando(a)sys-net.it wrote:
>> ando(a)sys-net.it wrote:
>>
>>> #4 0x080bc142 in entry_schema_check (op=0xb6f10760, e=0x8a606dc, oldattrs=0x0,
>>> manage=0, add_soc=1, text=0xb6f10738,
>>> textbuf=0xb6f1040c "��@", textlen=256) at schema_check.c:87
>>> 87 assert( a->a_vals[0].bv_val != NULL );
>> The two things were actually unrelated. The latter is now fixed in
>> HEAD; I still get two cores, but both are related to calling
>> bdb_locker_id_free() at shutdown with an invalid environment.
>
> I see that as well, looking into it. That invocation of bdb_locker_id_free()
> should of course not be happening.
>
Working now with patched libldap_r/tpool.c
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/
16 years, 4 months
Re: (ITS#4821) test043 cores; apparently incorrect resource usage
by hyc@symas.com
ando(a)sys-net.it wrote:
> ando(a)sys-net.it wrote:
>
>> #4 0x080bc142 in entry_schema_check (op=0xb6f10760, e=0x8a606dc, oldattrs=0x0,
>> manage=0, add_soc=1, text=0xb6f10738,
>> textbuf=0xb6f1040c "��@", textlen=256) at schema_check.c:87
>> 87 assert( a->a_vals[0].bv_val != NULL );
>
> The two things were actually unrelated. The latter is now fixed in
> HEAD; I still get two cores, but both are related to calling
> bdb_locker_id_free() at shutdown with an invalid environment.
I see that as well, looking into it. That invocation of bdb_locker_id_free()
should of course not be happening.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/
16 years, 4 months
Re: (ITS#4821) test043 cores; apparently incorrect resource usage
by ando@sys-net.it
ando(a)sys-net.it wrote:
> #4 0x080bc142 in entry_schema_check (op=0xb6f10760, e=0x8a606dc, oldattrs=0x0,
> manage=0, add_soc=1, text=0xb6f10738,
> textbuf=0xb6f1040c "��@", textlen=256) at schema_check.c:87
> 87 assert( a->a_vals[0].bv_val != NULL );
The two things were actually unrelated. The latter is now fixed in
HEAD; I still get two cores, but both are related to calling
bdb_locker_id_free() at shutdown with an invalid environment.
p.
16 years, 4 months
(ITS#4821) test043 cores; apparently incorrect resource usage
by ando@sys-net.it
Full_Name: Pierangelo Masarati
Version: HEAD
OS: Linux 2.6 (CentOS 4.4)
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (87.28.220.33)
Submitted by: ando
test043 results in a couple of core dumps. The first occurs in the producer.
Valgrind shows:
==1654== Invalid read of size 4
==1654== at 0x8148BC3: bdb_locker_id_free (cache.c:1360)
==1654== by 0x81E5F7A: ldap_pvt_thread_pool_context_reset (tpool.c:903)
==1654== by 0x80B48EF: slap_destroy (init.c:275)
==1654== by 0x8062E2B: main (main.c:954)
==1654== Address 0x4ED36A0 is 520 bytes inside a block of size 680 free'd
==1654== at 0x4004EFA: free (vg_replace_malloc.c:235)
==1654== by 0x8217E09: ber_memfree_x (memory.c:149)
==1654== by 0x8217E6D: ber_memfree (memory.c:162)
==1654== by 0x58E09E: __os_free (in /lib/tls/i686/libdb-4.2.so)
==1654== by 0x56D6E6: __dbenv_close (in /lib/tls/i686/libdb-4.2.so)
==1654== by 0x56D816: __dbenv_close_pp (in /lib/tls/i686/libdb-4.2.so)
==1654== by 0x80FE3D8: bdb_db_close (init.c:512)
==1654== by 0x80F1430: over_db_close (backover.c:195)
==1654== by 0x808D7A4: backend_shutdown (backend.c:352)
==1654== by 0x80B4846: slap_shutdown (init.c:258)
==1654== by 0x8062E0A: main (main.c:947)
bdb_locker_id_free() is passed to ldap_pvt_thread_pool_setkey() as the cleanup
function, but apparently the keys are cleaned up __after__ the key (the db env)
has already been destroyed (in bdb_db_close(), per the trace above).
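The ordering problem can be sketched with a small model. This is only an illustration of why the cleanup must run while the environment is still alive; it is not the slapd code and not necessarily how the issue was resolved. All class and function names below (Env, PoolContext, locker_id_free, shutdown) are hypothetical stand-ins.

```python
# Hypothetical model of per-thread keys with cleanup callbacks.
class Env:
    """Stands in for the database environment used as the key."""
    def __init__(self):
        self.open = True

    def close(self):
        self.open = False


class PoolContext:
    """Stores (data, cleanup) per key and runs the cleanups on reset."""
    def __init__(self):
        self._keys = {}

    def setkey(self, key, data, cleanup):
        self._keys[key] = (data, cleanup)

    def reset(self):
        for key, (data, cleanup) in list(self._keys.items()):
            cleanup(key, data)
        self._keys.clear()


def locker_id_free(env, locker_id):
    """Cleanup callback: it needs the environment to still be open."""
    if not env.open:
        raise RuntimeError("locker freed after its environment was closed")
    return locker_id


def shutdown(env, ctx, reset_first):
    """Tear down in either order so both outcomes can be observed."""
    if reset_first:
        ctx.reset()   # cleanups run while the environment is valid
        env.close()
    else:
        env.close()   # order seen in the report: environment goes first
        ctx.reset()   # the cleanup now touches a dead environment
```

Calling shutdown(env, ctx, reset_first=False) raises in this model, which corresponds to the invalid read Valgrind reports; with reset_first=True the cleanup completes normally.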
The second core also occurs in the producer: the test doesn't realize the
producer cored, so it tries to restart it as the test pattern dictates.
In this case Valgrind only reports
slapd: schema_check.c:87: entry_schema_check: Assertion `a->a_vals[0].bv_val != ((void *)0)' failed.
If the same problem is reproduced without valgrind (the vgcore appears to be
screwed), an attribute whose first value is NULL seems to be left around:
(gdb) bt
#0 0x003957a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x003d57a5 in raise () from /lib/tls/libc.so.6
#2 0x003d7209 in abort () from /lib/tls/libc.so.6
#3 0x003ced91 in __assert_fail () from /lib/tls/libc.so.6
#4 0x080bc142 in entry_schema_check (op=0xb6f10760, e=0x8a606dc, oldattrs=0x0,
manage=0, add_soc=1, text=0xb6f10738,
textbuf=0xb6f1040c "��@", textlen=256) at schema_check.c:87
#5 0x08144565 in bdb_add (op=0xb6f10760, rs=0xb6f10724) at add.c:97
#6 0x080f1c30 in overlay_op_walk (op=0xb6f10760, rs=0xb6f10724, which=op_add,
oi=0x89e5968, on=0x0) at backover.c:507
#7 0x080f1de5 in over_op_func (op=0xb6f10760, rs=0xb6f10724, which=op_add) at
backover.c:559
#8 0x080f1ef1 in over_op_add (op=0xb6f10760, rs=0xb6f10724) at backover.c:605
#9 0x081a308d in accesslog_response (op=0x8a84160, rs=0xb6f111c8) at
accesslog.c:1279
#10 0x080f1560 in over_back_response (op=0x8a84160, rs=0xb6f111c8) at
backover.c:237
#11 0x08090f2d in slap_response_play (op=0x8a84160, rs=0xb6f111c8) at
result.c:317
#12 0x080910cd in send_ldap_response (op=0x8a84160, rs=0xb6f111c8) at
result.c:391
#13 0x08091dd2 in slap_send_ldap_result (op=0x8a84160, rs=0xb6f111c8) at
result.c:638
#14 0x08103653 in bdb_modrdn (op=0x8a84160, rs=0xb6f111c8) at modrdn.c:784
#15 0x080f1c30 in overlay_op_walk (op=0x8a84160, rs=0xb6f111c8, which=op_modrdn,
oi=0x89fb040, on=0x0) at backover.c:507
#16 0x080f1de5 in over_op_func (op=0x8a84160, rs=0xb6f111c8, which=op_modrdn) at
backover.c:559
#17 0x080f1ecf in over_op_modrdn (op=0x8a84160, rs=0xb6f111c8) at
backover.c:599
#18 0x0809e220 in fe_op_modrdn (op=0x8a84160, rs=0xb6f111c8) at modrdn.c:318
#19 0x0809db83 in do_modrdn (op=0x8a84160, rs=0xb6f111c8) at modrdn.c:185
#20 0x0807ef0f in connection_operation (ctx=0xb6f112a4, arg_v=0x8a84160) at
connection.c:1129
#21 0x0807f3dc in connection_read_thread (ctx=0xb6f112a4, argv=0x10) at
connection.c:1257
#22 0x081e57a8 in ldap_int_thread_pool_wrapper (xpool=0x89bd968) at tpool.c:704
#23 0x00692371 in start_thread () from /lib/tls/libpthread.so.0
#24 0x00475ffe in clone () from /lib/tls/libc.so.6
(gdb) frame 4
#4 0x080bc142 in entry_schema_check (op=0xb6f10760, e=0x8a606dc, oldattrs=0x0,
manage=0, add_soc=1, text=0xb6f10738,
textbuf=0xb6f1040c "��@", textlen=256) at schema_check.c:87
87 assert( a->a_vals[0].bv_val != NULL );
(gdb) p a
$1 = (Attribute *) 0xb6f78e34
(gdb) p a->a_vals
$2 = 0x8a93ef8
(gdb) p a->a_vals[0]
$3 = {bv_len = 0, bv_val = 0x0}
No solution comes to mind right now for the first problem; I'm investigating the
second to see where the NULL attr comes from.
p.
16 years, 4 months
Re: (ITS#4820) More issues with modify opattrs in syncrepl
by ando@sys-net.it
Howard Chu wrote:
> I guess we could just set DBFLAG_NOLASTMOD on the db. We really only
> need to make sure a structuralObjectClass and entryUUID get set for an
> Add operation.
Just to avoid overloading stuff, I prefer to add a field to the
rs_modify structure that prevents the operational attrs from being created.
There might be cases where we sometimes want opattrs and sometimes
don't, and having to create a copy of the BackendDB structure just to
toggle a flag seems like overkill.
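A rough sketch of the per-operation switch, under the assumption that the backend fills in any modify opattrs the request does not carry. It is a toy model rather than the actual patch; ModifyRequest, no_opattrs and apply_modify are hypothetical names, and the DNs are made up.

```python
from dataclasses import dataclass


@dataclass
class ModifyRequest:
    mods: dict                 # attribute -> new value
    no_opattrs: bool = False   # hypothetical per-operation switch


def apply_modify(entry, req, local_rootdn, now):
    """Apply req.mods; fill in modify opattrs unless the request opts out."""
    updated = dict(entry)
    updated.update(req.mods)
    if not req.no_opattrs:
        # Model of the backend adding whatever modify opattrs are missing.
        if "modifiersName" not in req.mods:
            updated["modifiersName"] = local_rootdn
        if "modifyTimestamp" not in req.mods:
            updated["modifyTimestamp"] = now
    return updated


consumer_copy = {
    "title": "Sophomore",
    "modifiersName": "cn=user,dc=example,dc=com",
    "modifyTimestamp": "20070127191747Z",
}
mods = {"title": "Junior"}   # modifiersName unchanged, so it is not sent
rootdn = "cn=consumer-root,dc=example,dc=com"

default = apply_modify(consumer_copy, ModifyRequest(mods), rootdn,
                       "20070128000000Z")
opted_out = apply_modify(consumer_copy, ModifyRequest(mods, no_opattrs=True),
                         rootdn, "20070128000000Z")
```

In the default case the consumer's rootdn overwrites modifiersName, which is the inconsistency described in the report; with the flag set, the existing value survives. Because the switch travels with the request, the database structure does not have to be copied.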
I'm about to commit a patch for this. The fix makes it possible to request
most of the operational attrs in test017 and test018 (entryCSN is the only
one that can't be checked, because it comes out in a different order; I need
to find out why). In test019, requesting all modify opattrs leaves even more
attributes out of order, so I'm not going to touch it. In any case, I've
checked that, apart from the print order, the values are all in sync. As for
print ordering, I guess a workaround would be to sort each LDIF line when
comparing outputs. This would cure the tests, but it might still
overload mods checking in syncrepl; redesigning syncrepl so that
out-of-order is no longer an issue could be an option.
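The sorting workaround could look roughly like the sketch below: unfold LDIF continuation lines, sort the lines inside each entry, then compare. This is an assumption about how such a comparison might be written, not what the test scripts do; the helper names unfold, normalize and same_content are made up.

```python
def unfold(ldif_text):
    """Join LDIF continuation lines (lines that start with one space)."""
    lines = []
    for line in ldif_text.splitlines():
        if line.startswith(" ") and lines:
            lines[-1] += line[1:]
        else:
            lines.append(line)
    return lines


def normalize(ldif_text):
    """Return the entries in order, each as a sorted list of its lines."""
    entries, current = [], []
    for line in unfold(ldif_text):
        if line.strip() == "":
            # A blank line ends the current entry.
            if current:
                entries.append(sorted(current))
                current = []
        elif not line.startswith("#"):
            current.append(line)
    if current:
        entries.append(sorted(current))
    return entries


def same_content(ldif_a, ldif_b):
    """True if both dumps hold the same lines per entry, in any order."""
    return normalize(ldif_a) == normalize(ldif_b)
```

Entry order is still significant here; only the order of attribute lines within an entry is ignored, which is the difference described above.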
p.
16 years, 4 months
Re: (ITS#4820) More issues with modify opattrs in syncrepl
by hyc@symas.com
ando(a)sys-net.it wrote:
> Full_Name: Pierangelo Masarati
> Version: HEAD (re23?)
> OS: irrelevant
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (87.28.220.33)
> Submitted by: ando
>
>
> I've modified syncrepl tests so that producer/consumer comparison includes all
> operational attributes as well. This yields some surprises. For example, after
> a modify, the modifiersName is changed into the consumer's rootdn. This occurs
> because the modified entry returned by syncprov contains the same modifiersName
> as the consumer's copy, so no modifiersName gets appended to the list of
> modifications that's passed to the underlying database. At this point, the
> underlying database adds the missing modify opattrs, thus breaking consistency.
> I've modified syncrepl so as to always check if the modify opattrs are present,
> and, if absent, pull them off the entry sent by syncprov. However, in this
> case, it's likely that the order of the attributes breaks between the producer
> and the consumer, thus causing issues at the next check. Maybe a better fix
> would be to allow syncrepl to instruct the database not to create the missing
> modify opattrs. Comments?
I guess we could just set DBFLAG_NOLASTMOD on the db. We really only need to
make sure a structuralObjectClass and entryUUID get set for an Add operation.
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/
16 years, 4 months
Re: (ITS#4819) test018-syncreplication-persist failed (exit 1)
by ando@sys-net.it
michael.stroeder(a)t-systems.com wrote:
> Comparing retrieved entries from master and slave...
> test failed - master and slave databases differ
Please post the testrun/server*.out files, or
diff -u testrun/server1.out testrun/server2.out
Needless to say, it's working fine here.
p.
16 years, 4 months