openldap-bugs January 2012

openldap-bugs@openldap.org

22 participants
76 discussions

(ITS#7133) slapd dies with SIGSEGV on SIGHUP
by dpmcgee＠gmail.com 24 Jan '12

Full_Name: Dan McGee
Version: 2.4.28
OS: Arch Linux, 3.0.2 kernel
URL:
Submission from: (NULL) (2002:47c2:29f0:1:21f:d0ff:fea2:ee12)

Running the test suite looks like a disaster because every time the test server is killed you see lines like this (probably 1409 times for every single test, who knows):

>>>>> Test succeeded
./scripts/test000-rootdse: line 84: 3589 Segmentation fault $SLAPD -f $CONF1 -h $URI1 -d $LVL $TIMING > $LOG1 2>&1
>>>>> test000-rootdse completed OK for bdb.

Here is a GDB backtrace of that:

Program received signal SIGHUP, Hangup.
0x00007f6a3727306f in pthread_join () from /lib/libpthread.so.0
(gdb) c
Continuing.
[Thread 0x7f6a32e0c700 (LWP 2789) exited]
[Thread 0x7f6a3260b700 (LWP 2804) exited]

Program received signal SIGSEGV, Segmentation fault.
0x00007f6a36f42a84 in free () from /lib/libc.so.6
(gdb) bt
#0 0x00007f6a36f42a84 in free () from /lib/libc.so.6
#1 0x00007f6a3440f320 in ?? () from /usr/lib/libldap-2.4.so.2
#2 0x00007f6a3850a0b8 in _dl_close_worker () from /lib/ld-linux-x86-64.so.2
#3 0x00007f6a3850ab4c in _dl_close () from /lib/ld-linux-x86-64.so.2
#4 0x00007f6a385052a6 in _dl_catch_error () from /lib/ld-linux-x86-64.so.2
#5 0x00007f6a360884cf in ?? () from /lib/libdl.so.2
#6 0x00007f6a3608800f in dlclose () from /lib/libdl.so.2
#7 0x00007f6a376d13a4 in _sasl_done_with_plugins () from /usr/lib/libsasl2.so.2
#8 0x00007f6a376c935b in sasl_done () from /usr/lib/libsasl2.so.2
#9 0x0000000000481fc9 in slap_sasl_destroy ()
#10 0x000000000045f108 in slap_destroy ()
#11 0x00000000004195b9 in main ()

And given that there is SASL stuff in the backtrace, this machine is running libsasl 2.1.23.

(ITS#7132) syncrepl uses freed naming attrs
by h.b.furuseth＠usit.uio.no 24 Jan '12

Full_Name: Hallvard B Furuseth
Version: 2.4.28, master
OS: Linux x86_64
URL:
Submission from: (NULL) (129.240.203.186)
Submitted by: hallvard

syncrepl_entry() reads naming attrs in an entry saved by dn_callback() in an internal search, after the search returned and released the entry. Found with valgrind and -DSLAP_NO_SL_MALLOC in test017 with mdb. Fixing.

(ITS#7131) Wrong integer types for connection loops
by h.b.furuseth＠usit.uio.no 23 Jan '12

Full_Name: Hallvard B Furuseth
Version: 2.4.28, master
OS:
URL:
Submission from: (NULL) (195.1.106.125)
Submitted by: hallvard

connection_<first,next,done>() expect a ber_socket_t *connindex, but the callers pass an int *. Fixing.
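
The mismatch described above can be illustrated with a minimal sketch. It assumes a platform where the socket type is wider than int; `sketch_socket_t`, `sketch_first`, `sketch_next` and `sketch_count` are hypothetical stand-ins and do not reproduce the real connection_first()/connection_next() signatures. The point is only that a cursor updated through a pointer must be declared with exactly the type the callee expects, since the callee writes sizeof(that type) bytes through it.

```c
#include <stddef.h>

/* Hypothetical stand-in for ber_socket_t; assumed here to be wider than int. */
typedef long sketch_socket_t;

#define SKETCH_NCONNS 4

/* Iterator in the style of a first/next pair: the cursor lives in the
 * caller and is updated through the pointer argument. */
static int sketch_first(sketch_socket_t *index)
{
    *index = 0;
    return *index < SKETCH_NCONNS;
}

static int sketch_next(sketch_socket_t *index)
{
    (*index)++;
    return *index < SKETCH_NCONNS;
}

/* Counts the slots by walking the iterator with a correctly typed cursor. */
int sketch_count(void)
{
    sketch_socket_t connindex; /* same type as the parameter, not int */
    int n = 0;
    int more;

    for (more = sketch_first(&connindex); more; more = sketch_next(&connindex))
        n++;
    return n;
}
```

Declaring `connindex` as `int` and passing `&connindex` here would be a pointer type mismatch that a C compiler is required to diagnose, and on a platform where the two types differ in size the callee would write past the caller's variable.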

Re: (ITS#7130) OpenLDAP with BackSQL and Postgres. Upper on bigint?
by masarati＠aero.polimi.it 23 Jan '12

> Full_Name: Manny
> Version: 2.4.23
> OS: RHEL6
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (193.171.77.1)
>
>
> Hi there.
>
> I'm posting this into the ITS as I didn't get a response on the mailing
> list after 1 week.
>
> I'm using the latest stable release of openldap, with back-sql and
> postgresql as a backend.
> I have an sssd which uses this openldap server for ID providing and
> authentication.
> A recent update in this sssd changed the filter used to retrieve
> groupids of users, which surfaced what seems to be a bug in backsql.
>
> From my investigation it seems to me that when constructing the search
> query, openldap tries to use the UPPER function on every criteria in
> the WHERE clause, no matter which type it is. This causes an error in
> postgresql, as the gidNumber that is supposed to be filtered is of
> type "bigint".

I don't recall seeing your message in openldap-technical, which it belongs to. In any case, there should be a field in ldap_at_mappings that tells how and when an attribute value needs to be uppercased. However, I can't check right now whether it works as intended, and back-sql is unmaintained.

p.

(ITS#7130) OpenLDAP with BackSQL and Postgres. Upper on bigint?
by dermaniac＠gmail.com 23 Jan '12

Full_Name: Manny
Version: 2.4.23
OS: RHEL6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (193.171.77.1)

Hi there.

I'm posting this into the ITS as I didn't get a response on the mailing list after 1 week.

I'm using the latest stable release of openldap, with back-sql and postgresql as a backend. I have an sssd which uses this openldap server for ID providing and authentication. A recent update in this sssd changed the filter used to retrieve groupids of users, which surfaced what seems to be a bug in backsql.

From my investigation it seems to me that when constructing the search query, openldap tries to use the UPPER function on every criteria in the WHERE clause, no matter which type it is. This causes an error in postgresql, as the gidNumber that is supposed to be filtered is of type "bigint".

I could simply remove "upper_func" from my slapd.conf, but I need it for other queries. Changing the field gidNumber in the database to text would also be a workaround, but I'd rather prefer this fixed in backsql.

Anyways, here is the important part of my slapd.conf:

#-----------------------------------------------
database sql
suffix "dc=mydomain,dc=at"
rootdn "cn=Manager,dc=mydomain,dc=at"
rootpw mysecret
dbname mydbname
dbuser mydbuser
dbpasswd myuserpw
subtree_cond "UPPER(ldap_entries.dn) LIKE UPPER('%'||?)"
insentry_stmt "insert into ldap_entries (id,dn,oc_map_id,parent,keyval) values ((select max(id)+1 from ldap_entries),?,?,?,?)"
upper_func "upper"
upper_needs_cast "yes"
strcast_func "text"
concat_pattern "?||?"
has_ldapinfo_dn_ru no
sizelimit -1
#-----------------------------------------------

The error can be reproduced in my case when running this command:

#-----------------------------------------------
ldapsearch -H ldaps://127.0.0.1/ -b ou=groups,ou=myserver,dc=mydomain,dc=at -D "uid=myuser,ou=users,ou=myserver,dc=mydomain,DC=at" '(gidNumber=512)' -W
#-----------------------------------------------

Which results in this output:

#-----------------------------------------------
# extended LDIF
#
# LDAPv3
# base <ou=groups,ou=myserver,dc=mydomain,dc=at> with scope subtree
# filter: (gidNumber=512)
# requesting: ALL
#

# search result
search: 2
result: 80 Other (e.g., implementation specific) error
#-----------------------------------------------

Here is the interesting part of the level 255 log output:

#-----------------------------------------------
<==backsql_oc_get_candidates(): 0
==>backsql_oc_get_candidates(): oc="sambaDomain"
==>backsql_srch_query()
==>backsql_process_filter()
<==backsql_process_filter() succeeded
<==backsql_srch_query() returns SELECT DISTINCT ldap_entries.id,samba_domain_ldap.id,text('sambaDomain') AS objectClass,ldap_entries.dn AS dn FROM ldap_entries,samba_domain_ldap WHERE samba_domain_ldap.id=ldap_entries.keyval AND ldap_entries.oc_map_id=? AND UPPER(ldap_entries.dn) LIKE UPPER('%'||?) AND 7=7
Constructed query: SELECT DISTINCT ldap_entries.id,samba_domain_ldap.id,text('sambaDomain') AS objectClass,ldap_entries.dn AS dn FROM ldap_entries,samba_domain_ldap WHERE samba_domain_ldap.id=ldap_entries.keyval AND ldap_entries.oc_map_id=? AND UPPER(ldap_entries.dn) LIKE UPPER('%'||?) AND 7=7
id: '6'
(sub)dn: "%OU=GROUPS,OU=MYSERVER,DC=MYDOMAIN,DC=AT"
<==backsql_oc_get_candidates(): 0
==>backsql_oc_get_candidates(): oc="inetOrgPerson"
==>backsql_srch_query()
==>backsql_process_filter()
==>backsql_process_filter_attr(gidNumber)
<==backsql_process_filter_attr(gidNumber)
<==backsql_process_filter() succeeded
<==backsql_srch_query() returns SELECT DISTINCT ldap_entries.id,OS_USER.id,text('inetOrgPerson') AS objectClass,ldap_entries.dn AS dn FROM ldap_entries,OS_USER WHERE OS_USER.id=ldap_entries.keyval AND ldap_entries.oc_map_id=? AND UPPER(ldap_entries.dn) LIKE UPPER('%'||?) AND (upper(OS_USER.gidnumber)='512')
Constructed query: SELECT DISTINCT ldap_entries.id,OS_USER.id,text('inetOrgPerson') AS objectClass,ldap_entries.dn AS dn FROM ldap_entries,OS_USER WHERE OS_USER.id=ldap_entries.keyval AND ldap_entries.oc_map_id=? AND UPPER(ldap_entries.dn) LIKE UPPER('%'||?) AND (upper(OS_USER.gidnumber)='512')
id: '8'
(sub)dn: "%OU=GROUPS,OU=MYSERVER,DC=MYDOMAIN,DC=AT"
backsql_oc_get_candidates(): error executing query
Return code: -1
nativeErrCode=7 SQLengineState=S1000 msg="[unixODBC]ERROR: function upper(bigint) does not exist at character 288;#012Error while executing the query"
#-----------------------------------------------

Thanks a lot
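
The behaviour the report asks for can be sketched roughly as follows. This is not back-sql code: `sketch_eq_condition` and `enum sketch_coltype` are hypothetical names, and the assumption that the filter builder knows each column's type is made purely for illustration. The sketch wraps the column in the upper function only when the column is textual, so a bigint column such as gidnumber is compared without it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical column classification; not part of back-sql. */
enum sketch_coltype { SKETCH_COL_TEXT, SKETCH_COL_NUMERIC };

/* Writes an equality condition into buf.
 * Returns 0 on success, -1 if the result does not fit.
 * The value is embedded literally only to mirror the logged query;
 * real code should pass values as bound parameters instead. */
int sketch_eq_condition(char *buf, size_t size, const char *column,
                        const char *value, enum sketch_coltype type,
                        const char *upper_func)
{
    int n;

    if (type == SKETCH_COL_TEXT && upper_func != NULL) {
        /* Case-insensitive match: apply the function to both sides. */
        n = snprintf(buf, size, "(%s(%s)=%s('%s'))",
                     upper_func, column, upper_func, value);
    } else {
        /* Non-text column: never wrap it in the upper function. */
        n = snprintf(buf, size, "(%s='%s')", column, value);
    }
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With a numeric column the sketch yields `(OS_USER.gidnumber='512')`, which avoids the `upper(bigint)` call that PostgreSQL rejects in the log above.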

Re: (ITS#7113) broken read-only replica results in assertion failure and crash on master slapd process
by hyc＠symas.com 21 Jan '12

kacarstensen(a)csupomona.edu wrote:
> I've investigated this issue a little bit more since my initial bug
> report.

> I'm not sure if connection_write is supposed to validate that a stream
> is active before or after calling slapd_clr_write, but it seems like the
> assertion wouldn't be an issue if that validation were performed before
> calling slapd_clr_write. To test this thought, I rebuilt openldap 2.4.28
> with the following patch:
>
> --- openldap-2.4.28/servers/slapd/connection.c 2011-11-25 10:52:29.000000000 -0800
> +++ openldap-2.4.28-new/servers/slapd/connection.c 2012-01-12 13:35:45.000000000 -0800
> @@ -1893,8 +1893,6 @@
>
>      assert( connections != NULL );
>
> -    slapd_clr_write( s, 0 );
> -
>      c = connection_get( s );
>      if( c == NULL ) {
>          Debug( LDAP_DEBUG_ANY,
> @@ -1903,6 +1901,8 @@
>          return -1;
>      }
>
> +    slapd_clr_write( s, 0 );
> +
>  #ifdef HAVE_TLS
>      if ( c->c_is_tls && c->c_needs_tls_accept ) {
>          connection_return( c );
>
> and tried to reproduce the problem under the same circumstances as
> reported in my initial bug report. The master slapd tolerated the
> misconfigured replicas for 5 days without crashing; before, it would
> crash reliably within a half hour or so. I didn't notice any regressions
> due to the patch, though the master slapd wasn't exposed to a typical
> workload during the experiment.
>
> Any thoughts on this patch?

Sounds OK to me, committed to git master. Thanks.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

(ITS#7129) slapo-valsort(5) missing description of LDAP_CONTROL_VALSORT
by quanah＠OpenLDAP.org 20 Jan '12

Full_Name:
Version: 2.4.28
OS: NA
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (75.108.184.39)

While the slapo-valsort manual page describes the ways in which data can be sorted and displayed, it does not note the fact that there is a control that can be used to disable it from applying to a search result:

LDAP_CONTROL_VALSORT

This is useful information for people wishing to use the overlay.

Re: (ITS#7127) Syncrepl config uses freed data
by hyc＠symas.com 19 Jan '12

h.b.furuseth(a)usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: 2.4.21++, master
> OS:
> URL:
> Submission from: (NULL) (195.1.106.125)
> Submitted by: hallvard
>
>
> In syncrepl_config(), ldap_pvt_runqueue_remove() frees 're',
> then the retract statement reads 're->routine':
>
>     ldap_pvt_runqueue_remove( &slapd_rq, re );
>     ldap_pvt_thread_mutex_unlock( &slapd_rq.rq_mutex );
>     if ( ldap_pvt_thread_pool_retract( &connection_pool,
>         re->routine, re ) > 0 )
>
> Formally I think the pointer 're' itself is invalid after freeing it,
> so the ISO C-clean fix would involve calling retract() first. If
> that's wrong: I assume the thread pool is paused at this point, so
> the task can not be started (and use re) before it can be retracted,
> and we can just read re->routine before freeing re.

Makes sense. Fixed in master.

> Found by Valgrind in test063-delta-multimaster.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

(ITS#7127) Syncrepl config uses freed data
by h.b.furuseth＠usit.uio.no 19 Jan '12

Full_Name: Hallvard B Furuseth
Version: 2.4.21++, master
OS:
URL:
Submission from: (NULL) (195.1.106.125)
Submitted by: hallvard

In syncrepl_config(), ldap_pvt_runqueue_remove() frees 're', then the retract statement reads 're->routine':

    ldap_pvt_runqueue_remove( &slapd_rq, re );
    ldap_pvt_thread_mutex_unlock( &slapd_rq.rq_mutex );
    if ( ldap_pvt_thread_pool_retract( &connection_pool,
        re->routine, re ) > 0 )

Formally I think the pointer 're' itself is invalid after freeing it, so the ISO C-clean fix would involve calling retract() first. If that's wrong: I assume the thread pool is paused at this point, so the task can not be started (and use re) before it can be retracted, and we can just read re->routine before freeing re.

Found by Valgrind in test063-delta-multimaster.
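
The "retract first" ordering suggested in the report can be sketched as below. This is an illustration only: `struct sketch_task`, `struct sketch_pool`, `sketch_pool_retract` and `sketch_queue_remove` are hypothetical stand-ins for the run queue and thread pool, and nothing here is taken from the change that actually went into master. The sketch simply finishes every use of the task pointer, including reading its `routine` field, before the call that frees it.

```c
#include <stdlib.h>

typedef void *(*sketch_fn)(void *ctx, void *arg);

/* Hypothetical run-queue entry and a pool that holds one pending task. */
struct sketch_task { sketch_fn routine; };
struct sketch_pool { struct sketch_task *pending; };

static void *sketch_routine(void *ctx, void *arg)
{
    (void)ctx;
    return arg;
}

/* Drops the pending task if routine and argument both match.
 * Returns 1 if a task was retracted, 0 otherwise. */
static int sketch_pool_retract(struct sketch_pool *pool, sketch_fn routine,
                               void *arg)
{
    if (pool->pending != NULL && (void *)pool->pending == arg &&
        pool->pending->routine == routine) {
        pool->pending = NULL;
        return 1;
    }
    return 0;
}

/* Stand-in for a queue removal that also frees the entry. */
static void sketch_queue_remove(struct sketch_task *t)
{
    free(t);
}

/* Retract while the task is still valid, then remove (and free) it. */
int sketch_retract_then_remove(void)
{
    struct sketch_pool pool;
    struct sketch_task *t = malloc(sizeof *t);
    int retracted;

    if (t == NULL)
        return -1;
    t->routine = sketch_routine;
    pool.pending = t;

    retracted = sketch_pool_retract(&pool, t->routine, t); /* t is alive */
    sketch_queue_remove(t);                                /* t is freed */
    /* t is neither read nor passed anywhere after this point. */
    return retracted;
}
```

The alternative mentioned in the report, copying `re->routine` into a local before the free, avoids the invalid field read but still passes the freed pointer value to retract, which is why the report calls the reordering the ISO C-clean option.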

Re: (ITS#7113) broken read-only replica results in assertion failure and crash on master slapd process
by kacarstensen＠csupomona.edu 17 Jan '12

I've investigated this issue a little bit more since my initial bug report.

With loglevel conns sync in slapd.conf, the following are the last messages written to the log before slapd crashes:

Jan 12 09:33:19 shelley slapd[2482]:
Jan 12 09:33:19 shelley slapd[2482]: daemon: write active on 15
Jan 12 09:33:19 shelley slapd[2482]: daemon: epoll: listen=9 active_threads=0 tvp=zero
Jan 12 09:33:19 shelley slapd[2482]: daemon: epoll: listen=10 active_threads=0 tvp=zero
Jan 12 09:33:19 shelley slapd[2482]: daemon: activity on 1 descriptor
Jan 12 09:33:19 shelley slapd[2482]: daemon: activity on:
Jan 12 09:33:19 shelley slapd[2482]: 15rw
Jan 12 09:33:19 shelley slapd[2482]: ber_flush2 failed errno=32 reason="Broken pipe"
Jan 12 09:33:19 shelley slapd[2482]: connection_closing: readying conn=1377 sd=15 for close
Jan 12 09:33:19 shelley slapd[2482]: send_search_entry: conn 1377 ber write failed.
Jan 12 09:33:19 shelley slapd[2482]: connection_resched: attempting closing conn=1377 sd=15
Jan 12 09:33:19 shelley slapd[2482]: daemon: removing 15
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection!
Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error
Jan 12 09:33:19 shelley slapd[2482]:
Jan 12 09:33:19 shelley slapd[2482]: daemon: write active on 15

"daemon: write active on 15" comes from line 2660 or so of daemon.c. Here's that with some context:

    for ( i = 0; nwfds > 0; i++ ) {
        ber_socket_t wd;
        if ( ! SLAP_EVENT_IS_WRITE( i ) ) continue;
        wd = i;

        SLAP_EVENT_CLR_WRITE( wd );
        nwfds--;

        Debug(LDAP_DEBUG_CONNS, "daemon: write active on %d\n", wd, 0, 0);

        /*
         * NOTE: it is possible that the connection was closed and that
         * the stream is now inactive. connection_write() must validate
         * the stream is still active.
         *
         * ITS#4338: if the stream is invalid, there is no need to close it
         * here. It has already been closed in connection.c.
         */
        if ( connection_write( wd ) < 0 ) {
            if ( SLAP_EVENT_IS_READ( wd)) {
                SLAP_EVENT_CLR_READ( (unsigned) wd);
                nrfds--;
            }
        }
    }

We do some housekeeping on the descriptor, then call connection_write to take care of the write. Here's connection_write from connection.c:

    int connection_write(ber_socket_t s)
    {
        Connection *c;
        Operation *op;

        assert( connections != NULL );

        slapd_clr_write( s, 0 );

        c = connection_get( s );
        if( c == NULL ) {
            Debug( LDAP_DEBUG_ANY,
                "connection_write(%ld): no connection!\n",
                (long)s, 0, 0 );
            return -1;
        }
        [...]

connection_get seems to perform the validation that is mentioned by the comment in daemon.c, but connection_get is called after slapd_clr_write, which is the source of the assertion implicated by the backtrace in the initial bug report:

    void slapd_clr_write( ber_socket_t s, int wake )
    {
        int id = DAEMON_ID(s);
        ldap_pvt_thread_mutex_lock( &slap_daemon[id].sd_mutex );

        if ( SLAP_SOCK_IS_WRITE( id, s )) {
            assert( SLAP_SOCK_IS_ACTIVE( id, s ));

            SLAP_SOCK_CLR_WRITE( id, s );
            slap_daemon[id].sd_nwriters--;
        }
        ldap_pvt_thread_mutex_unlock( &slap_daemon[id].sd_mutex );
        WAKE_LISTENER(id,wake);
    }

I'm not sure if connection_write is supposed to validate that a stream is active before or after calling slapd_clr_write, but it seems like the assertion wouldn't be an issue if that validation were performed before calling slapd_clr_write. To test this thought, I rebuilt openldap 2.4.28 with the following patch:

--- openldap-2.4.28/servers/slapd/connection.c 2011-11-25 10:52:29.000000000 -0800
+++ openldap-2.4.28-new/servers/slapd/connection.c 2012-01-12 13:35:45.000000000 -0800
@@ -1893,8 +1893,6 @@

     assert( connections != NULL );

-    slapd_clr_write( s, 0 );
-
     c = connection_get( s );
     if( c == NULL ) {
         Debug( LDAP_DEBUG_ANY,
@@ -1903,6 +1901,8 @@
         return -1;
     }

+    slapd_clr_write( s, 0 );
+
 #ifdef HAVE_TLS
     if ( c->c_is_tls && c->c_needs_tls_accept ) {
         connection_return( c );

and tried to reproduce the problem under the same circumstances as reported in my initial bug report. The master slapd tolerated the misconfigured replicas for 5 days without crashing; before, it would crash reliably within a half hour or so. I didn't notice any regressions due to the patch, though the master slapd wasn't exposed to a typical workload during the experiment.

Any thoughts on this patch?

--
Kevan Carstensen <kacarstensen(a)csupomona.edu>
Operating Systems Analyst, I&IT Systems, Cal Poly Pomona
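
The effect of the reordering in the patch can be shown with a small self-contained model. It is only a sketch: `struct sketch_daemon` and the `sketch_*` functions are hypothetical simplifications, the three flag arrays stand in for slapd's connection table and socket bookkeeping, and locking and listener wake-ups are left out. The assumed failure state is the one the assertion implies, a descriptor whose write interest is still recorded although it is no longer active.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAXFD 32

/* Simplified model of the state involved; all names are hypothetical. */
struct sketch_daemon {
    unsigned char conn_in_use[SKETCH_MAXFD]; /* connection slot still open */
    unsigned char sock_active[SKETCH_MAXFD]; /* descriptor still registered */
    unsigned char sock_write[SKETCH_MAXFD];  /* write interest recorded */
    int nwriters;
};

/* Same shape as slapd_clr_write: asserts the descriptor is active. */
static void sketch_clr_write(struct sketch_daemon *d, int s)
{
    if (d->sock_write[s]) {
        assert(d->sock_active[s]);
        d->sock_write[s] = 0;
        d->nwriters--;
    }
}

/* Stand-in for connection_get: nonzero only for a live connection. */
static int sketch_connection_get(const struct sketch_daemon *d, int s)
{
    return d->conn_in_use[s];
}

/* Patched order: validate the connection first, clear the write bit after. */
int sketch_connection_write(struct sketch_daemon *d, int s)
{
    if (!sketch_connection_get(d, s))
        return -1; /* stale descriptor: the assert is never reached */
    sketch_clr_write(d, s);
    return 0;
}

/* Usage: descriptor 15 was already closed but its write bit is still set. */
int sketch_stale_write(void)
{
    struct sketch_daemon d;

    memset(&d, 0, sizeof d);
    d.sock_write[15] = 1;
    return sketch_connection_write(&d, 15);
}
```

In this model, calling `sketch_clr_write` before the validation would trip the assertion for descriptor 15, whereas the patched order returns -1 and leaves the cleanup to the caller, matching the behaviour described in the message.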
