openldap-bugs August 2008

openldap-bugs@openldap.org

40 participants
169 discussions

(ITS#5665) slapd crashing with slapo-pcache when using attrset "*"
by toby＠inf.ed.ac.uk 22 Aug '08

Full_Name: Toby Blake
Version: 2.4.11
OS: Scientific Linux 5.1
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.215.24.127)

Hi there,

I have been seeing problems when using slapo-pcache with openldap-2.4.11, specifically when using an attrset of "*".

- openldap-2.4.11 on scientific linux 5.1
- We build our own RPMs. I have built them with no optimisation (-O0) for the purposes of debugging.

Relevant part of slapd.conf:

overlay pcache
proxycache bdb 5000 1 500 60
proxycachequeries 10000
proxyattrset 0 "*"
proxytemplate (uid=) 0 60 60

What seems to happen is that a matching query will get answered and added to the cache - all is fine until that entry expires and is then deleted from the cache. The next matching query will then cause slapd to crash, either with an abort or a segfault. This is repeatable.

I have been testing with the above configuration and the following queries:

ldapsearch -x "uid=toby"
ldapsearch -x "uid=blah"

(the first for a positive reply, the second for a negative)

I have seen three different types of crash, all at the same point (i.e. directly triggered by the query following the entry being deleted from the cache).

So, here are the 3 different backtraces:

backtrace 1:

Thread 1 (process 13771):
#0 0x081c39fb in ber_put_string (ber=0x9839c00, str=0x79626f74 <Address 0x79626f74 out of bounds>, tag=4294967295) at encode.c:396
#1 0x081c488a in ber_printf (ber=0x9839c00, fmt=0x8227d5d "v}N}") at encode.c:828
#2 0x08198957 in ldap_build_search_req (ld=0x9827920, base=0xb56051a4 "dc=inf,dc=ed,dc=ac,dc=uk", scope=2, filter=0xb5605234 "(uid=toby)", attrs=0x9831838, attrsonly=0, sctrls=0x0, cctrls=0x0, timelimit=3600, sizelimit=24576, idp=0xb5f05d78) at search.c:328
#3 0x081982fa in ldap_search_ext (ld=0x9827920, base=0xb56051a4 "dc=inf,dc=ed,dc=ac,dc=uk", scope=2, filter=0xb5605234 "(uid=toby)", attrs=0x9831838, attrsonly=0, sctrls=0x0, cctrls=0x0, timeout=0xb5f05e28, sizelimit=24576, msgidp=0xb5f05e3c) at search.c:100
#4 0x08116466 in ldap_back_search (op=0x9811140, rs=0xb5f07110) at search.c:216
#5 0x080eb88e in overlay_op_walk (op=0x9811140, rs=0xb5f07110, which=op_search, oi=0x97a6da8, on=0x0) at backover.c:646
#6 0x080eba96 in over_op_func (op=0x9811140, rs=0xb5f07110, which=op_search) at backover.c:698
#7 0x080ebb3a in over_op_search (op=0x9811140, rs=0xb5f07110) at backover.c:720
#8 0x08070e83 in fe_op_search (op=0x9811140, rs=0xb5f07110) at search.c:366
#9 0x080707e1 in do_search (op=0x9811140, rs=0xb5f07110) at search.c:217
#10 0x0806d530 in connection_operation (ctx=0xb5f07200, arg_v=0x9811140) at connection.c:1084
#11 0x0806da1d in connection_read_thread (ctx=0xb5f07200, argv=0x18) at connection.c:1211
#12 0x08192de9 in ldap_int_thread_pool_wrapper (xpool=0x9785880) at tpool.c:663
#13 0x0076046b in start_thread () from /lib/libpthread.so.0
#14 0x006b7dbe in clone () from /lib/libc.so.6
(gdb)

backtrace 2:

Thread 1 (process 27627):
#0 0x0065305a in free () from /lib/libc.so.6
#1 0x081c69ca in ber_memfree_x (p=0x9c8a1a0, ctx=0x0) at memory.c:152
#2 0x080d4020 in slap_sl_free (ptr=0x9c8a1a0, ctx=0x9c87c40) at sl_malloc.c:456
#3 0x080708de in do_search (op=0x9c89d78, rs=0xb5b8d110) at search.c:233
#4 0x0806d530 in connection_operation (ctx=0xb5b8d200, arg_v=0x9c89d78) at connection.c:1084
#5 0x0806da1d in connection_read_thread (ctx=0xb5b8d200, argv=0x10) at connection.c:1211
#6 0x08192de9 in ldap_int_thread_pool_wrapper (xpool=0x9bfe880) at tpool.c:663
#7 0x0076046b in start_thread () from /lib/libpthread.so.0
#8 0x006b7dbe in clone () from /lib/libc.so.6
(gdb)

backtrace 3:

Thread 1 (process 10333):
#0 0x080bc4b2 in ad_inlist (desc=0x8efa9c8, attrs=0x8f8c488) at ad.c:586
#1 0x08080641 in fe_aux_operational (op=0x8f8bce0, rs=0xb5b8b110) at backend.c:1885
#2 0x08080809 in backend_operational (op=0x8f8bce0, rs=0xb5b8b110) at backend.c:1933
#3 0x080829f6 in slap_send_search_entry (op=0x8f8bce0, rs=0xb5b8b110) at result.c:778
#4 0x0811684c in ldap_back_search (op=0x8f8bce0, rs=0xb5b8b110) at search.c:338
#5 0x080eb88e in overlay_op_walk (op=0x8f8bce0, rs=0xb5b8b110, which=op_search, oi=0x8f21da8, on=0x0) at backover.c:646
#6 0x080eba96 in over_op_func (op=0x8f8bce0, rs=0xb5b8b110, which=op_search) at backover.c:698
#7 0x080ebb3a in over_op_search (op=0x8f8bce0, rs=0xb5b8b110) at backover.c:720
#8 0x08070e83 in fe_op_search (op=0x8f8bce0, rs=0xb5b8b110) at search.c:366
#9 0x080707e1 in do_search (op=0x8f8bce0, rs=0xb5b8b110) at search.c:217
#10 0x0806d530 in connection_operation (ctx=0xb5b8b200, arg_v=0x8f8bce0) at connection.c:1084
#11 0x0806da1d in connection_read_thread (ctx=0xb5b8b200, argv=0x10) at connection.c:1211
#12 0x08192de9 in ldap_int_thread_pool_wrapper (xpool=0x8f00880) at tpool.c:663
#13 0x0076046b in start_thread () from /lib/libpthread.so.0
#14 0x006b7dbe in clone () from /lib/libc.so.6
(gdb)

In an hour of testing (with a positive query) yesterday, nine of the crashes were with backtrace 3, two were with backtrace 1, and one was with backtrace 2.

In an hour of testing with a negative query, all of the crashes were essentially backtrace 2, but with a longer stack:

Thread 1 (process 18684):
#0 0x00220402 in __kernel_vsyscall ()
#1 0x0060fd20 in raise () from /lib/libc.so.6
#2 0x00611631 in abort () from /lib/libc.so.6
#3 0x00647e6b in __libc_message () from /lib/libc.so.6
#4 0x0064fb16 in _int_free () from /lib/libc.so.6
#5 0x00653070 in free () from /lib/libc.so.6
#6 0x081c69ca in ber_memfree_x (p=0x9bf1488, ctx=0x0) at memory.c:152
#7 0x080d4020 in slap_sl_free (ptr=0x9bf1488, ctx=0x9bee420) at sl_malloc.c:456
#8 0x080708de in do_search (op=0x9bf1110, rs=0xb5f27110) at search.c:233
#9 0x0806d530 in connection_operation (ctx=0xb5f27200, arg_v=0x9bf1110) at connection.c:1084
#10 0x0806da1d in connection_read_thread (ctx=0xb5f27200, argv=0x10) at connection.c:1211
#11 0x08192de9 in ldap_int_thread_pool_wrapper (xpool=0x9b65880) at tpool.c:663
#12 0x0076046b in start_thread () from /lib/libpthread.so.0
#13 0x006b7dbe in clone () from /lib/libc.so.6
(gdb)

Please let me know if there is any additional information I can provide.

Cheers
Toby Blake
School of Informatics
University of Edinburgh
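The reproduction steps in this report can be sketched as a small script. This is an illustration only and not part of the original report: the function names and the five-second margin are hypothetical, and reading the two 60-second values as the template TTL and the cache consistency-check period is an assumption based on the quoted slapd.conf lines. The script issues a query, waits long enough for the cached query to expire and be removed, then repeats the query that reportedly triggers the crash.

```python
import subprocess
import time

# Values copied from the slapd.conf fragment in the report.
# Assumption: the last field of "proxycache" is the consistency-check
# period and the number after the attrset index in "proxytemplate" is the TTL.
TEMPLATE_TTL = 60
CHECK_PERIOD = 60


def search_command(filter_str):
    """Build the ldapsearch invocation used in the report."""
    return ["ldapsearch", "-x", filter_str]


def wait_seconds(ttl=TEMPLATE_TTL, period=CHECK_PERIOD, margin=5):
    """Seconds to wait so the cached query has expired and been purged."""
    return ttl + period + margin


def reproduce(filter_str="uid=toby", run=subprocess.run, sleep=time.sleep):
    """Query, wait for expiry, query again; return both exit codes."""
    first = run(search_command(filter_str)).returncode
    sleep(wait_seconds())
    second = run(search_command(filter_str)).returncode
    return first, second
```

Calling `reproduce("uid=blah")` would exercise the negative-reply case in the same way. The `run` and `sleep` parameters exist only so the sequence can be checked without a live server.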

Re: (ITS#5661) contextCSN gets corrupted on the stand by mirror
by ali.pouya＠free.fr 21 Aug '08

Hi Pierangelo,

>> contextCSN: 20080727021429.070493Z#000000#000#000000
>> contextCSN:: +HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA==
>
> which looks like
>
> 4 bytes of garbage + "0802033718.300111Z#000000#001#000000"
>

Yes, but I would like to add a clarification: under vi the 4 bytes are handled as only 2 characters. In fact, each time the problem occurs I repair my database using a BDB C program which reads the first key from id2entry.bdb and writes it to disk. Then I use vi to fix the contextCSN before writing the key back to the database. In vi I do not delete any characters; I only replace them with 20, then I fix the rest of the fields.

Another clarification: once the first two characters get corrupted, the rest of the contextCSN gets stuck and does not follow write operations.

> I note that, according to the sid values you assigned to servers A and
> B, the first contextCSN should not appear, since it has sid == 0,
> while the second one, apart from the corruption, is plausible (as
> you're writing to server A, with sid == 1).
>

Yes. The contextCSN with sid=0 is there because at the beginning I initialized my directory without a SID (it defaults to 0), then I set two different SIDs for A and B.

Best Regards
Ali
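The decoding quoted above can be checked with a few lines of Python. This is a minimal sketch added for illustration; the only input is the base64 value from the report, and the split at 4 bytes follows the description in the quoted reply.

```python
import base64

# Base64 value of the second contextCSN, copied from the report.
ENCODED = "+HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA=="

raw = base64.b64decode(ENCODED)

# Split at 4 bytes, as described in the quoted reply.
garbage, rest = raw[:4], raw[4:]

print(len(raw))               # 40
print(garbage.hex())          # f8760309
print(rest.decode("ascii"))   # 0802033718.300111Z#000000#001#000000
```

The decoded value is 40 bytes long, which is also the length of the intact contextCSN shown next to it, so the 4 leading bytes occupy the position where the first four characters of the timestamp would normally be.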

Re: (ITS#5661) contextCSN gets corrupted on the stand by mirror
by ando＠sys-net.it 21 Aug '08

ali.pouya(a)free.fr wrote:
> Full_Name: Ali Pouya
> Version: 2.4.11
> OS: Linux 2.6
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (145.242.11.4)
>
>
> I think there is a documentation issue for OpenLdap 2.4.11 :
> The chapter 17.4.4 of the Admin Guide recommends configuring TWO syncrepl
> directives for each mirror side. If I do so, the contextCSN of the stand by
> mirror gets corrupted very easily. But if I configure the mirrors with only ONE
> syncrepl directive it's OK.
>
> The test environment :
> I have a test directory with two mirrors A (sid=1) and B (sid=2) configured as
> recommended in the Admin's Guide, and a replica C connected to A.
> The directory contains 10 million objects, and I use the server A for writing
> 500 000 new ones.
>
> Very often and without any apparent reason the contextCSN in the memory of B
> gets suddenly corrupted while those of A and C are OK.
> In this situation the contextCSN of B gets stuck but B continues to receive data
> from A.
>
> The value of contextCSN in base 64 is :
>
> contextCSN: 20080727021429.070493Z#000000#000#000000
> contextCSN:: +HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA==

which looks like

4 bytes of garbage + "0802033718.300111Z#000000#001#000000"

I note that, according to the sid values you assigned to servers A and B, the first contextCSN should not appear, since it has sid == 0, while the second one, apart from the corruption, is plausible (as you're writing to server A, with sid == 1).

> I note that only the part indicating the year (2008) is garbled. Maybe this
> part is handled differently ?

No.

> At service shutdown B writes the corrupt contextCSN to the disk.
> At service startup B reads the corrupt contextCSN from the disk and begins to
> scan ALL of the data base.
>
> Also it sends a sync request to A (a persistent search containing the corrupt
> contextCSN in the control field) causing A to scan the WHOLE data base.
> The replica C remains safe.

The fact that the two servers scan the whole database is a side effect of the incorrect contextCSN; I wouldn't worry about it, since it should go away once the corruption is tracked down and fixed.

> If I reverse the roles of A and B the corruption occurs on A (always on the
> stand by mirror).
>
> I have already encountered the contextCSN corruption problem in OpenLdap 2.3 and
> this was one of my reasons to migrate to 2.4.11.

p.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Fax: +39 0382 476497
Email: ando(a)sys-net.it
-----------------------------------
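The sid comparison made in this reply can be illustrated by splitting each value on `#`. This is a sketch under stated assumptions, not OpenLDAP code: the field names `count` and `mod` are hypothetical labels, reading the numeric fields as hexadecimal is an assumption, and the second value is shown with "2008" restored by hand in place of the 4 garbage bytes, following the reporter's remark that only the year is garbled.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: str
    count: int  # hypothetical name for the second field
    sid: int    # server ID, the field discussed in the thread
    mod: int    # hypothetical name for the fourth field


def parse_csn(value: str) -> CSN:
    """Split a CSN string into its four '#'-separated fields."""
    parts = value.split("#")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '#'-separated fields, got {len(parts)}")
    timestamp, count, sid, mod = parts
    # Assumption: the numeric fields are hexadecimal.
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


first = parse_csn("20080727021429.070493Z#000000#000#000000")
# Second value with the year restored by hand (see the assumption above).
second = parse_csn("20080802033718.300111Z#000000#001#000000")

print(first.sid, second.sid)  # 0 1
```

With servers A and B configured as sid=1 and sid=2, a value carrying sid 0 is the one that stands out, which matches the observation in the reply.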

Re: (ITS#5652) configure errors
by ando＠sys-net.it 21 Aug '08

edpena(a)cisco.com wrote:
> Full_Name: Ed Pena
> Version: openldap-2.3.39
> OS: hpux 11.11
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (64.102.254.33)

Please provide config.log resulting from the execution of configure.

p.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Fax: +39 0382 476497
Email: ando(a)sys-net.it
-----------------------------------

Re: (ITS#5653) Segmentation Fault running slapd with mysql back-end
by ando＠sys-net.it 21 Aug '08

A fix is in HEAD code; it checks args before trying to parse them. Please test.

p.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Fax: +39 0382 476497
Email: ando(a)sys-net.it
-----------------------------------

Re: (ITS#5662) Comments in schema declarations separated by semicolon
by Kurt＠OpenLDAP.org 21 Aug '08

On Aug 21, 2008, at 9:57 AM, michael(a)stroeder.com wrote:

> Hallvard B Furuseth wrote:
>> michael(a)stroeder.com writes:
>>> hyc(a)symas.com wrote:
>>>> Who benefits from this feature?
>>> An admin copying&pasting a schema from a standard document which
>>> uses this format. I'm currently looking at such a document with ~500
>>> occurrences of OIDs used in declarations instead of NAMEs.
>>
>> Which one? It's not RFC 4512 format. RFC 4512 uses ';' for comments
>> _about_ the syntax of schema elements, not _in_ their syntax.
>
> http://tools.ietf.org/draft/draft-dally-acp133-and-ldap/

There is a lot of crap in I-Ds, and even some crap in RFCs.

-- Kurt

Re: (ITS#5664) Deadlocks when writing in parallell (two processes)
by quanah＠zimbra.com 21 Aug '08

--On Thursday, August 21, 2008 6:29 PM +0000 hyc(a)symas.com wrote:

> stelios.xx.grigoriadis(a)ericsson.com wrote:
>> tom.bjorkholm(a)aastra.com wrote:
>>> Full_Name: Stelios Grigoriadis & Tom Björkholm
>>> Version: 2.3.39
>>> OS: Novell SLES 10
>>> URL: ftp://ftp.openldap.org/incoming/
>>> Submission from: (NULL) (194.237.142.7)
>>>
>>>
>>> We get a lot of DB_LOCK_DEADLOCK when using client programs that for a
>>> period of time continuously write to OpenLDAP.
>>> Version is 2.3.39.
>>>
>>> The information added is of the form:
>>> ebcmdCustomer=0+ebcmdDir=220xx,ou=AuthCodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com
>>> where xx varies.
>>>
>>> Snippet of the output:
>>> Mar 27 13:03:21 ldapt1 slapd[7589]: => bdb_dn2id_add: subtree
>>> (ebcmdCustomer=0+ebcmdDir=22037,ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
>>> put failed: -30995
>>> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id
>>> failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>>> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
>>> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
>>> insert failed: -30995
>>> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id
>>> failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>>> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
>>> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
>>> insert failed: -30995
>>> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id
>>> failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>>> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
>>> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
>>> insert failed: -30995
>>> Mar 27 13:03:38 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id
>>> failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>>>
>>
>> We've temporarily fixed the problem by introducing a static mutex before
>> any add/update operation.
>
> There's no problem to fix. Deadlocks are normal in these scenarios, and
> the code automatically retries. This ITS will be closed.

I will note that testing on 2.3 has shown time and again that serialized updates perform better, regardless. Using accesslog with delta-syncrepl replication essentially enforces this. The more you can serialize updates, particularly with batch provisioning, the smoother your system will operate. This may not apply to 2.4.

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
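Serializing updates on the client side, in the spirit of the mutex workaround quoted above, might look like the following. It is a generic sketch, not the reporters' code: `add_entry` is a hypothetical stand-in for an LDAP add request, and a `threading.Lock` only serializes writers inside one process, so writers in separate processes (as in the subject of this ITS) would need some cross-process mechanism instead.

```python
import threading

_write_lock = threading.Lock()  # shared by every writer thread in this process


def serialized(write_fn):
    """Wrap write_fn so that only one call runs at a time."""
    def wrapper(*args, **kwargs):
        with _write_lock:
            return write_fn(*args, **kwargs)
    return wrapper


@serialized
def add_entry(store, dn):
    # Hypothetical stand-in for sending one LDAP add request.
    store.append(dn)


def run_writers(dns, workers=4):
    """Add all DNs from several threads; the lock keeps adds one at a time."""
    store = []

    def work(chunk):
        for dn in chunk:
            add_entry(store, dn)

    threads = [threading.Thread(target=work, args=(dns[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

For example, `run_writers([f"ebcmdDir=220{i:02d}" for i in range(20)])` performs twenty adds from four threads with no two adds in flight at once.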

Re: (ITS#5664) Deadlocks when writing in parallell (two processes)
by hyc＠symas.com 21 Aug '08

stelios.xx.grigoriadis(a)ericsson.com wrote:
> tom.bjorkholm(a)aastra.com wrote:
>> Full_Name: Stelios Grigoriadis & Tom Björkholm
>> Version: 2.3.39
>> OS: Novell SLES 10
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (194.237.142.7)
>>
>>
>> We get a lot of DB_LOCK_DEADLOCK when using client programs that for a period of
>> time continuously writes to OpenLDAP.
>> Version is 2.3.39.
>>
>> The information added is of the form:
>> ebcmdCustomer=0+ebcmdDir=220xx,ou=AuthCodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com
>> where xx varies.
>>
>> Snippet of the output:
>> Mar 27 13:03:21 ldapt1 slapd[7589]: => bdb_dn2id_add: subtree
>> (ebcmdCustomer=0+ebcmdDir=22037,ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com)
>> put failed: -30995
>> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
>> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>> Mar 27 13:03:26 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
>> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
>> failed: -30995
>> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
>> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>> Mar 27 13:03:28 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
>> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
>> failed: -30995
>> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
>> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>> Mar 27 13:03:36 ldapt1 slapd[7589]: => bdb_dn2id_add: parent
>> (ou=authcodes,ebcmdVersion=0,ebcmdProduct=ebcmd,dc=example,dc=com) insert
>> failed: -30995
>> Mar 27 13:03:38 ldapt1 slapd[7589]: => bdb_idl_insert_key: c_put id failed:
>> DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995)
>>
>
> We've temporarily fixed the problem by introducing a static mutex before
> any add/update operation.

There's no problem to fix. Deadlocks are normal in these scenarios, and the code automatically retries. This ITS will be closed.

--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
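The retry-on-deadlock idea mentioned in this reply follows a common pattern, sketched generically below. This is not slapd's actual code: `DeadlockError`, `retry_on_deadlock`, the attempt limit, and the backoff values are all hypothetical and only illustrate how a transaction chosen as a deadlock victim can simply be run again.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a DB_LOCK_DEADLOCK result."""


def retry_on_deadlock(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn(); if it is picked as a deadlock victim, back off and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Randomized, growing delay so competing writers are less
            # likely to collide again immediately.
            sleep(base_delay * attempt * random.random())
```

Under this pattern a deadlock message in the log is a record of a retried attempt rather than a failed write, which fits the statement that there is no problem to fix.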

Re: (ITS#5662) Comments in schema declarations separated by semicolon
by michael＠stroeder.com 21 Aug '08

Hallvard B Furuseth wrote:
> michael(a)stroeder.com writes:
>> hyc(a)symas.com wrote:
>>> Who benefits from this feature?
>> An admin copying&pasting a schema from a standard document which uses
>> this format. I'm currently looking at such a document with ~500
>> occurrences of OIDs used in declarations instead of NAMEs.
>
> Which one? It's not RFC 4512 format. RFC 4512 uses ';' for comments
> _about_ the syntax of schema elements, not _in_ their syntax.

http://tools.ietf.org/draft/draft-dally-acp133-and-ldap/

Not for my professional work. I was just looking for really complex schemas for testing web2ldap.

Ciao, Michael.

Re: (ITS#5653) Segmentation Fault running slapd with mysql back-end
by ando＠sys-net.it 21 Aug '08

ollieeillo(a)yahoo.co.uk wrote:

> backsql_oc_get_attr_mapping(): executing at_query
> "SELECT name,sel_expr,from_tbls,join_where,add_proc,delete_proc,param_order,expect_return,sel_expr_u
> FROM ldap_attr_mappings WHERE oc_map_id=?"
> for objectClass "document"
> with param oc_id="2"
> attributeType:
> name="(null)"
> sel_expr="(null)"
> from="(null)"
> join_where="(null)"
> add_proc="(null)"
> delete_proc="(null)"
> sel_expr_u=""
> Segmentation fault

This log seems to indicate that the ldap_attr_mappings table contains NULLs for required fields. A check will be added to the code, but yours definitely looks like a user error... fix your table and retry.

p.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Fax: +39 0382 476497
Email: ando(a)sys-net.it
-----------------------------------
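A table can be checked for the kind of NULLs described above before slapd is started. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in for the real MySQL backend, the sample rows are invented, and treating `name`, `sel_expr`, and `from_tbls` as the required columns is an assumption, since the reply does not list which fields are required. The table and column names are taken from the query in the quoted log.

```python
import sqlite3

# Assumption: these are the columns that must not be NULL.
REQUIRED = ("name", "sel_expr", "from_tbls")


def rows_with_nulls(conn, required=REQUIRED):
    """Return oc_map_id values of rows with a NULL in any required column."""
    where = " OR ".join(f"{col} IS NULL" for col in required)
    sql = f"SELECT oc_map_id FROM ldap_attr_mappings WHERE {where}"
    return [row[0] for row in conn.execute(sql)]


# In-memory stand-in for the real database, with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ldap_attr_mappings ("
    "oc_map_id INTEGER, name TEXT, sel_expr TEXT, from_tbls TEXT, join_where TEXT)"
)
conn.executemany(
    "INSERT INTO ldap_attr_mappings VALUES (?, ?, ?, ?, ?)",
    [
        (1, "cn", "persons.name", "persons", None),  # complete row
        (2, None, None, None, None),                 # row like the one in the log
    ],
)

print(rows_with_nulls(conn))  # [2]
```

Against a real deployment the same `IS NULL` condition could be run directly in the database's own client; only the connection setup would differ.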
