openldap-bugs October 2007

openldap-bugs@openldap.org

38 participants
215 discussions

(ITS#5176) Segmentation Fault in test001-slapadd
by openldap＠consotec.de 08 Oct '07


Full_Name: Mark
Version: 2.3.38
OS: Suse Linux 10.01
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (84.128.87.178)

** Problem
test001-slapadd fails with a segmentation fault.

** Environment
Suse Linux 10.1
Berkeley DB 4.6.21
OpenLDAP 2.3.38

** Configuration of ldap
configure --prefix=/usr --enable-debug

** Shared library dependencies
libdb-4.6.so => /usr/lib/libdb-4.6.so (0x40028000)
libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x40169000)
libdl.so.2 => /lib/libdl.so.2 (0x40182000)
libresolv.so.2 => /lib/libresolv.so.2 (0x40185000)
libpthread.so.0 => /lib/i686/libpthread.so.0 (0x40197000)
libc.so.6 => /lib/i686/libc.so.6 (0x401e9000)

** Stack trace
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 32771 (LWP 28247)]
0x400bd3bd in __lock_get_internal (lt=0x8215ae8, sh_locker=0x7, flags=0, obj=0x82168d4, lock_mode=DB_LOCK_READ, timeout=0, lock=0x40d8e50c) at ../lock/lock.c:740
740 no_dd = sh_locker->master_locker == INVALID_ROFF &&
(gdb) bt
#0 0x400bd3bd in __lock_get_internal (lt=0x8215ae8, sh_locker=0x7, flags=0, obj=0x82168d4, lock_mode=DB_LOCK_READ, timeout=0, lock=0x40d8e50c) at ../lock/lock.c:740
#1 0x400bcc85 in __lock_get (dbenv=0x82154e8, locker=0x7, flags=0, obj=0x82168d4, lock_mode=DB_LOCK_READ, lock=0x40d8e50c) at ../lock/lock.c:447
#2 0x400ec2f7 in __db_lget (dbc=0x8216858, action=0, pgno=1, mode=DB_LOCK_READ, lkflags=0, lockp=0x40d8e50c) at ../db/db_meta.c:1012
#3 0x40054d7d in __bam_get_root (dbc=0x8216858, pg=1, slevel=1, flags=1409, stack=0x40d8e614) at ../btree/bt_search.c:94
#4 0x400551a4 in __bam_search (dbc=0x8216858, root_pgno=1, key=0x40d8e97c, flags=1409, slevel=1, recnop=0x0, exactp=0x40d8e818) at ../btree/bt_search.c:200
#5 0x40045160 in __bamc_search (dbc=0x8216858, root_pgno=0, key=0x40d8e97c, flags=26, exactp=0x40d8e818) at ../btree/bt_cursor.c:2486
#6 0x400411bc in __bamc_get (dbc=0x8216858, key=0x40d8e97c, data=0x40d8e95c, flags=26, pgnop=0x40d8e8ac) at ../btree/bt_cursor.c:961
#7 0x400da81d in __dbc_get (dbc_arg=0x8217158, key=0x40d8e97c, data=0x40d8e95c, flags=26) at ../db/db_cam.c:697
#8 0x400e7dfb in __dbc_get_pp (dbc=0x8217158, key=0x40d8e97c, data=0x40d8e95c, flags=26) at ../db/db_iface.c:2022
#9 0x080d455f in bdb_id2entry (be=0x7, tid=0x0, locker=7, id=1, e=0x40d8e9f8) at id2entry.c:125
#10 0x080cdabb in bdb_cache_find_id (op=0x822a638, tid=0x0, id=1, eip=0x40d8ea84, islocked=0, locker=7, lock=0x40d8eb1c) at cache.c:760
#11 0x080d12cd in bdb_dn2entry (op=0x822a638, tid=0x0, dn=0x0, e=0x40d8eb14, matched=1, locker=7, lock=0x40d8eb1c) at dn2entry.c:68
#12 0x080b4ae9 in bdb_search (op=0x822a638, rs=0x40e4fc9c) at search.c:374
#13 0x0805ec5b in fe_op_search (op=0x822a638, rs=0x40e4fc9c) at search.c:355
#14 0x0805e43f in do_search (op=0x822a638, rs=0x40e4fc9c) at search.c:217
#15 0x0805cae2 in connection_operation (ctx=0x7, arg_v=0x822a638) at connection.c:1133
#16 0x080fd514 in ldap_int_thread_pool_wrapper (xpool=0x81c0a88) at tpool.c:478
#17 0x4019cf60 in pthread_start_thread () from /lib/i686/libpthread.so.0
#18 0x4019d0fe in pthread_start_thread_event () from /lib/i686/libpthread.so.0
#19 0x402c5327 in clone () from /lib/i686/libc.so.6

** Description
We added some printfs and found that sh_locker has the value 0x7, which seems to be an index, not a valid lock for the database.

Note: With the following modification the tests run:

File: servers/slapd/back-bdb/id2entry.c
Line: 120

    #if 0
        /* Use our own locker if needed */
        if ( !tid && locker )
            cursor->locker = locker;
    #endif

Any help is appreciated.

Regards
Mark


Re: (ITS#5171) hdb txn_checkpoint failures
by richton＠nbcs.rutgers.edu 08 Oct '07


> It's still rather suspicious that slave4 and slave6 both had identical log
> status for base1 (1/188113) but different requested locations (1/8730339 vs
> 1/8730401). If they're identically configured slaves then they ought to be in
> lock-step. Then again, obviously they're not identical since slave6 doesn't
> show base4 in your log.

Identical is relative. They've got the same OpenLDAP and supporting binaries running on the same patches of Solaris 9 running identical turn-up scripts with identical configuration files. But this is production, so we've got data changes over time. For instance, the slaves bootstrap with a slapadd -q, and the underlying slapcat could easily be different from slave4 vs. slave6 (the most recent one is automatically used). I'd imagine this would look different at the db layer, even once syncrepl eventually converged the logical data?

> Do you have the db_stat output from an uncorrupted slave? What about the
> master?

Sure... https://www.nbcs.rutgers.edu/~richton/its5171_dbstatl2


Re: (ITS#5171) hdb txn_checkpoint failures
by hyc＠symas.com 08 Oct '07


Aaron Richton wrote:
>> itself. Again, we can't really tell without single-stepping thru the BDB
>> library code. It may not be worth the effort, but that's your call.
>
> The lock was
>
> env_region.c:290 MUTEX_LOCK(dbenv, &renv->mutex);
>
> but that wasn't making much sense....and after a couple minutes in dbx I
> realized that I've been killing myself with the attempts at db_stat.
> Yesterday's attempts were running db_* binaries with a wrong (but
> compatible) ABI. It'd be nice if Sleepycat had some more/earlier checks
> for that, but oh well...

Kinda figured that that's what happened.

> So anyway, I corrupted base2/slave4 by running the wrong db_stat, but that
> left three other bases on slave4 and all three bases on slave6. I ran
> db_stat -l on them, the output is:
>
> https://www.nbcs.rutgers.edu/~richton/its5171_dbstatl

> BTW, this ABI screwup shouldn't be the root cause of the failures...I
> haven't tried any db tools until the course of debugging this. These are
> AUTOREMOVE, so db_archive is unlikely, for instance.

It's still rather suspicious that slave4 and slave6 both had identical log status for base1 (1/188113) but different requested locations (1/8730339 vs 1/8730401). If they're identically configured slaves then they ought to be in lock-step. Then again, obviously they're not identical since slave6 doesn't show base4 in your log.

Do you have the db_stat output from an uncorrupted slave? What about the master?

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/


Re: (ITS#5171) hdb txn_checkpoint failures
by richton＠nbcs.rutgers.edu 08 Oct '07


> itself. Again, we can't really tell without single-stepping thru the BDB
> library code. It may not be worth the effort, but that's your call.

The lock was

env_region.c:290 MUTEX_LOCK(dbenv, &renv->mutex);

but that wasn't making much sense....and after a couple minutes in dbx I realized that I've been killing myself with the attempts at db_stat. Yesterday's attempts were running db_* binaries with a wrong (but compatible) ABI. It'd be nice if Sleepycat had some more/earlier checks for that, but oh well...

So anyway, I corrupted base2/slave4 by running the wrong db_stat, but that left three other bases on slave4 and all three bases on slave6. I ran db_stat -l on them, the output is:

https://www.nbcs.rutgers.edu/~richton/its5171_dbstatl

BTW, this ABI screwup shouldn't be the root cause of the failures...I haven't tried any db tools until the course of debugging this. These are AUTOREMOVE, so db_archive is unlikely, for instance.


Re: (ITS#5050) array bounds violation in test045-syncreplication-proxied
by hyc＠symas.com 08 Oct '07


h.b.furuseth(a)usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: HEAD, RE23
> OS: Linux
> URL:
> Submission from: (NULL) (129.240.202.105)
> Submitted by: hallvard
>
>
> syncprov + back-ldap, and presumably + back-meta if that were used,
> give an array bounds violation in test045-syncreplication-proxied:
>
> syncprov_db_open() uses connection_fake_init(), which sets op->o_tag=0.
> It passes op to back-ldap.
>
> ldap_back_op_result() assumes op is a known LDAP request: It calls
> slap_req2op() and gets SLAP_OP_LAST (for unknown tag). That is used as
> an index into ldapinfo_t.li_timeout[], which has size SLAP_OP_LAST.
>
> back-meta/bind.c does the same in meta_back_bind_op_result() and
> meta_back_op_result().

I fixed back-ldap but I believe the same fix is still needed in back-meta.

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/
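
The out-of-bounds index described in this report can be sketched in isolation. This is a minimal illustration, not the fix committed to back-ldap: the enum, the tag constants, and the helpers req2op() and timeout_for_tag() are hypothetical stand-ins for slap_req2op(), SLAP_OP_LAST and li_timeout[], and it assumes that treating an unknown tag as "no timeout configured" is acceptable.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for slapd's operation indexes. As in the report,
 * the last member doubles as the "unknown tag" result and the array size. */
enum {
    OP_BIND = 0,
    OP_SEARCH,
    OP_LAST
};

/* Illustrative tag values only. */
#define TAG_BIND   0x60UL
#define TAG_SEARCH 0x63UL

/* Map a request tag to an operation index; unknown tags map to OP_LAST. */
static int req2op(unsigned long tag)
{
    switch (tag) {
    case TAG_BIND:   return OP_BIND;
    case TAG_SEARCH: return OP_SEARCH;
    default:         return OP_LAST;
    }
}

/* Look up the timeout for a tag without indexing past the array.
 * timeouts[] has OP_LAST elements, so OP_LAST itself is not a valid index. */
static time_t timeout_for_tag(const time_t timeouts[OP_LAST], unsigned long tag)
{
    assert(timeouts != NULL);

    int opidx = req2op(tag);
    if (opidx == OP_LAST) {
        /* Unknown tag (e.g. o_tag == 0 from a fake connection): no timeout. */
        return 0;
    }
    return timeouts[opidx];
}
```

With a table such as { 5, 10 }, a bind tag yields 5, a search tag yields 10, and a zero tag yields 0 instead of reading one element past the end of the array.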


(ITS#5175) After 6 times authenticate sucessfully.Open LDAP is failing
by pattanaikhr＠gmail.com 08 Oct '07


Full_Name: HR Pattanaik
Version: 2.2.29
OS: Windows XP Professional
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (203.126.136.220)

When I sign in through my authentication page, authentication succeeds 5-6 times and then starts failing. I have searched many forums but have not found a solution, so I am raising the issue here. I hope to get a solution; please reply.

The exception I get is:

javax.naming.ServiceUnavailableException: localhost: 389; socket closed


Re: ITS#5174 openldap.schema entry not valid per RFC 4512
by ando＠sys-net.it 08 Oct '07


Not sure if ordering of optional sequence members is required by RFC 4234, but the change you suggest sounds harmless. OpenLDAP software, in this sense, is usually permissive in what is accepted and strict in what is emitted.

Thanks, p.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Email:   pierangelo.masarati(a)sys-net.it
---------------------------------------


Re: (ITS#5164) slapi plugins prevent slapd starting.
by m.d.t.evans＠qmul.ac.uk 08 Oct '07


On Sun, 2007-10-07 at 16:29 -0700, Howard Chu wrote:
> m.d.t.evans(a)qmul.ac.uk wrote:
> > Full_Name: Martin Evans
> > Version: 2.4.5beta
> > OS: Linux
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (138.37.8.140)
> >
> >
> > This is mentioned in #4611 but marked as fixed there.
>
> Now fixed in HEAD.
>

Thanks! I spotted your fixes to slapi/plugin.c, back-monitor/conn.c and back-monitor/database.c, applied these to my version of 2.4.5beta, and they seem to work fine.

Martin.

--
-- Dr MDT Evans, Computing Services, Queen Mary, University of London


(ITS#5174) openldap.schema entry not valid per RFC 4512
by bhanafee＠gmail.com 08 Oct '07


Full_Name: Brian Hanafee
Version: 1.24.2.2
OS: Mac OS/X
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (24.4.251.75)

The final definition in /servers/slapd/schema/openldap.schema is not valid per RFC 4512. It reads:

objectClass ( OpenLDAPobjectClass:6
    NAME 'OpenLDAPdisplayableObject'
    DESC 'OpenLDAP Displayable Object'
    MAY displayName
    AUXILIARY )

Per RFC 4512, section 4.1.1, the 'kind' AUXILIARY comes before any MUST or MAY entries. The corrected entry should read:

objectClass ( OpenLDAPobjectClass:6
    NAME 'OpenLDAPdisplayableObject'
    DESC 'OpenLDAP Displayable Object'
    AUXILIARY
    MAY displayName )
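
The ordering rule cited above (kind before MUST/MAY) could be checked mechanically. The following is a rough sketch, not slapd's schema parser: kind_precedes_attrs() and tok_is() are hypothetical helpers, keywords are matched in upper case only, and an attribute that happened to be named like a kind keyword would be misreported.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if the token [tok, tok+len) equals the keyword kw. */
static int tok_is(const char *tok, size_t len, const char *kw)
{
    return strlen(kw) == len && strncmp(tok, kw, len) == 0;
}

/* Characters that separate bare tokens in an object class description. */
static int is_sep(char c)
{
    return isspace((unsigned char)c) || c == '(' || c == ')' || c == '$';
}

/* Return 0 if a kind keyword (ABSTRACT, STRUCTURAL, AUXILIARY) appears
 * after MUST or MAY, otherwise 1. Quoted strings are skipped so that
 * NAME and DESC values cannot be mistaken for keywords. */
static int kind_precedes_attrs(const char *def)
{
    assert(def != NULL);

    int seen_attrs = 0;
    const char *p = def;

    while (*p != '\0') {
        if (is_sep(*p)) {
            p++;
            continue;
        }
        if (*p == '\'') {               /* skip a quoted string */
            p++;
            while (*p != '\0' && *p != '\'')
                p++;
            if (*p == '\'')
                p++;
            continue;
        }

        const char *start = p;          /* scan one bare token */
        while (*p != '\0' && !is_sep(*p) && *p != '\'')
            p++;
        size_t len = (size_t)(p - start);

        if (tok_is(start, len, "MUST") || tok_is(start, len, "MAY")) {
            seen_attrs = 1;
        } else if (seen_attrs &&
                   (tok_is(start, len, "ABSTRACT") ||
                    tok_is(start, len, "STRUCTURAL") ||
                    tok_is(start, len, "AUXILIARY"))) {
            return 0;                   /* kind found after MUST/MAY */
        }
    }
    return 1;
}
```

Applied to the two definitions in this report (the parenthesized part after objectClass), the check returns 0 for the original entry and 1 for the corrected one.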


Re: (ITS#5171) hdb txn_checkpoint failures
by hyc＠symas.com 08 Oct '07


richton(a)nbcs.rutgers.edu wrote:
>> If this is happening even with slapd cleanly shut down then it should also
>> prevent slapd from restarting, since slapd first attempts to join an existing
>> environment before trying to create a new one. And that really implies that
>> the rest of the environment is shot.
>
> Agreed, but that's a pretty awful condition to have in a long-running
> slapd process. Without db_stat (easily) working, is there any hope at
> finding clues as to how this might have happened, or is it just time to
> rm/slapadd and hope it doesn't happen again?

It doesn't seem like we can get much more info out of this. One more thing to try would be a full-debug build of libdb, so we can see exactly where it hangs when trying to join the environment. Looking thru the code, I only see one mutex to acquire the environment, and looking at your stack trace it's already past that location, but the trace could be lying.

Also the mutex used to lock the environment is a regular mutex, not a persistent lock. So when all processes have closed the environment, there shouldn't be anything left to conflict with here. So most likely the environment data structures are hosed, and the thread is locking against itself. Again, we can't really tell without single-stepping thru the BDB library code. It may not be worth the effort, but that's your call.

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/

