Full_Name: Olivier GERMAIN
Version: 2.4.24
OS: Solaris 5.10
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (212.157.248.2)
Dear All,
One of our customers got a signal 6 during a call to ldap_abandon.
The signal occurred while maintenance was being performed on a firewall that is
not located on the same network. It seems that ldap_abandon raised this
assert because it hit a "can't happen" situation. I don't have much
information on the action performed on the firewall which led to this SIGABRT.
Please find the logs I have:
2011-07-19 04:08:52.876909|ENG|FATAL|84|engine.cpp(109)|
/appl/kabira/MIPSPLUS-1.5.1-110510-SOL64/kis/distrib/kabira/lib/libsyssrv.so:sw_debugSignalHandler+0x18
2011-07-19 04:08:52.877541|ENG|FATAL|84|engine.cpp(109)|
/lib/sparcv9/libc.so.1:<static-func>+0x5b0
2011-07-19 04:08:52.878082|ENG|FATAL|84|engine.cpp(109)|
/lib/sparcv9/libc.so.1:<static-func>+0x628
2011-07-19 04:08:52.878658|ENG|FATAL|84|engine.cpp(109)|
/lib/sparcv9/libc.so.1:_lwp_kill+0x8 [ Signal 6 (SIGABRT) ]
2011-07-19 04:08:52.879196|ENG|FATAL|84|engine.cpp(109)|
/lib/sparcv9/libc.so.1:abort+0xd0
2011-07-19 04:08:52.879733|ENG|FATAL|84|engine.cpp(109)|
/lib/sparcv9/libc.so.1:_assert+0x74
2011-07-19 04:08:52.880325|ENG|FATAL|84|engine.cpp(109)|
/appl/kabira/MIPSPLUS-1.5.1-110510-SOL64/kis/3rdparty/sol/openldap/2.4.24_64/lib/libldap-2.4.so.2.6.0:<static-func>+0x4ec
2011-07-19 04:08:52.880776|ENG|FATAL|84|engine.cpp(109)|
/appl/kabira/MIPSPLUS-1.5.1-110510-SOL64/kis/3rdparty/sol/openldap/2.4.24_64/lib/libldap-2.4.so.2.6.0:ldap_abandon+0xac
I am in the process of reproducing this signal 6, but I would like to share
the following idea.
Do you think it would be possible to enhance the OpenLDAP library to avoid
this assert? Instead of aborting on an assert, could the library return an
error status code telling us that something went wrong, so that we can handle
it in our source code?
My best regards
Olivier GERMAIN
TIBCO Support team
whm(a)stanford.edu wrote:
> --On Tuesday, August 02, 2011 11:03:24 AM -0700 Quanah Gibson-Mount<quanah(a)zimbra.com> wrote:
>
>> --On Tuesday, August 02, 2011 5:54 PM +0000 whm(a)stanford.edu wrote:
>>>>> Your log shows that the subsequent search request initiates a new
>>>>> Bind to the remote server, which implies that it's not re-using the
>>>>> same connection as the first request. Since a paged results cookie
>>>>> is only valid within the context of a single connection, you get
>>>>> this error result.
>>>>
>>>> Not sure which log you are looking at. When I look at the log:
>>>>
>>>> http://www.stanford.edu/~whm/files/ldap-debugging/slapd-trace-paged-resu
>>>> lts.log.gz
>>>>
>>>> The only connection I see in the log is conn=1000 and it ends with:
>>>>
>>>> conn=1000 op=5 SEARCH RESULT tag=101 err=2 nentries=0 text=paged results
>>>> cookie is invalid ldap_read: want=8, got=7
>>>> 0000: 30 05 02 01 07 42 00 0....B.
>>>> ldap_read: want=8, got=0
>>>>
>>>> conn=1000 op=6 UNBIND
>>>> conn=1000 fd=11 closed
>>>>
>>>> These tests were made with a single ldapsearch request. The ldapsearch
>>>> tests fail when using the proxy and succeed when connecting directly to
>>>> the LDAP server with the database on it.
>>>>
>>>> A side note: the test case I submitted used ldapsearch, but the
>>>> problem was uncovered using a python application that is used for
>>>> syncing Gmail account data.
>>>>
>>>> Bill
>>>
>>> I have copied the backend server configuration to
>>> http://www.stanford.edu/~whm/files/ldap-debugging/. I dumped a
>>> copy of cn=config and there is a file-based version in the ldap
>>> subdirectory as well.
>>
>> Where's the configuration for the slapd-ldap server? That's of the
>> most importance...
>>
>> --Quanah
>
> Of course, sorry about that. I have copied the files to the web site.
Sounds like this may be related to ITS#6817. Please try adding a dummy binddn
to your idassert-bind directive and re-test.
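For reference, a sketch of what an idassert-bind directive with a dummy binddn might look like in the proxy's slapd-ldap configuration; the URI, DN, and mechanism below are placeholder values, not taken from the submitter's setup:

```
# slapd.conf fragment (hypothetical values)
database        ldap
uri             "ldap://ldap-master.example.edu"
idassert-bind   bindmethod=sasl
                saslmech=GSSAPI
                binddn="cn=proxy,dc=example,dc=edu"
                mode=self
```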
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Full_Name: Duncan Idaho
Version: 2.4.25
OS: Centos 5.5
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (216.148.0.72)
We are running a cluster of "synchronization hosts" that act as Producers for a
large number of Consumers and additionally Consume from a single global
read/write master. After several weeks of stable operation, synchronization
came to a halt on both nodes in the Producer/Consumer cluster simultaneously.
Stack-trace:
Thread 15 (Thread 10745):
#0 0x000000322ea07b35 in pthread_join () from /lib64/libpthread.so.0
#1 0x000000000041929d in slapd_daemon () at daemon.c:2922
#2 0x000000000040730e in main (argc=<value optimized out>, argv=0x7fffd2567cb8)
at main.c:983
Thread 14 (Thread 8866):
#0 0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1 0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:672
#2 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#3 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 13 (Thread 14876):
#0 0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2 0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00000000004d7e48 in syncprov_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10)
at syncprov.c:2476
#4 0x000000000047e64a in overlay_op_walk (op=0x2aaac06ab4a0, rs=0x471a9c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#5 0x000000000047ec27 in over_op_func (op=0x2aaac06ab4a0, rs=0x471a9c10,
which=op_search) at backover.c:721
#6 0x0000000000420a79 in fe_op_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
search.c:372
#7 0x0000000000421217 in do_search (op=0x2aaac06ab4a0, rs=0x471a9c10) at
search.c:217
#8 0x000000000041e965 in connection_operation (ctx=0x471a9d60, arg_v=<value
optimized out>) at connection.c:1113
#9 0x000000000041f23c in connection_read_thread (ctx=0x471a9d60, argv=<value
optimized out>) at connection.c:1249
#10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
Thread 12 (Thread 14875):
#0 0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1 0x00000000004ec695 in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:672
#2 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#3 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 11 (Thread 14874):
#0 0x000000322ea0d605 in __lll_unlock_wake () from /lib64/libpthread.so.0
#1 0x000000322ea0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
#2 0x000000322ea0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
#3 0x00000000004d7e5e in syncprov_op_search (op=0x2aaac0897eb0, rs=0x461a7c10)
at syncprov.c:2471
#4 0x000000000047e64a in overlay_op_walk (op=0x2aaac0897eb0, rs=0x461a7c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#5 0x000000000047ec27 in over_op_func (op=0x2aaac0897eb0, rs=0x461a7c10,
which=op_search) at backover.c:721
#6 0x0000000000420a79 in fe_op_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
search.c:372
#7 0x0000000000421217 in do_search (op=0x2aaac0897eb0, rs=0x461a7c10) at
search.c:217
#8 0x000000000041e965 in connection_operation (ctx=0x461a7d60, arg_v=<value
optimized out>) at connection.c:1113
#9 0x000000000041f23c in connection_read_thread (ctx=0x461a7d60, argv=<value
optimized out>) at connection.c:1249
#10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
Thread 10 (Thread 14873):
#0 0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2 0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00000000004d7e48 in syncprov_op_search (op=0x2aaab82b63c0, rs=0x459a6c10)
at syncprov.c:2476
#4 0x000000000047e64a in overlay_op_walk (op=0x2aaab82b63c0, rs=0x459a6c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#5 0x000000000047ec27 in over_op_func (op=0x2aaab82b63c0, rs=0x459a6c10,
which=op_search) at backover.c:721
#6 0x0000000000420a79 in fe_op_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
search.c:372
#7 0x0000000000421217 in do_search (op=0x2aaab82b63c0, rs=0x459a6c10) at
search.c:217
#8 0x000000000041e965 in connection_operation (ctx=0x459a6d60, arg_v=<value
optimized out>) at connection.c:1113
#9 0x000000000041f23c in connection_read_thread (ctx=0x459a6d60, argv=<value
optimized out>) at connection.c:1249
#10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
Thread 9 (Thread 14872):
#0 0x000000322ea0d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x000000322ea08e1a in _L_lock_1034 () from /lib64/libpthread.so.0
#2 0x000000322ea08cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00000000004d7e48 in syncprov_op_search (op=0x2aaab89e58c0, rs=0x451a5c10)
at syncprov.c:2476
#4 0x000000000047e64a in overlay_op_walk (op=0x2aaab89e58c0, rs=0x451a5c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#5 0x000000000047ec27 in over_op_func (op=0x2aaab89e58c0, rs=0x451a5c10,
which=op_search) at backover.c:721
#6 0x0000000000420a79 in fe_op_search (op=0x2aaab89e58c0, rs=0x451a5c10) at
search.c:372
#7 0x0000000000421217 in do_search (op=0x2aaab89e58c0, rs=0x451a5c10) at
search.c:217
#8 0x000000000041e965 in connection_operation (ctx=0x451a5d60, arg_v=<value
optimized out>) at connection.c:1113
#9 0x000000000041f23c in connection_read_thread (ctx=0x451a5d60, argv=<value
optimized out>) at connection.c:1249
#10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
Thread 8 (Thread 16034):
#0 0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1 0x000000000042db1c in send_ldap_ber (op=<value optimized out>,
ber=0x449a43f0) at result.c:372
#2 0x00000000004308d8 in slap_send_search_entry (op=0x449a45a0, rs=0x449a4af0)
at result.c:1410
#3 0x00000000004da80d in syncprov_qplay (ctx=<value optimized out>, arg=<value
optimized out>) at syncprov.c:884
#4 syncprov_qtask (ctx=<value optimized out>, arg=<value optimized out>) at
syncprov.c:1006
#5 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#6 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#7 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#8 0x0000000000000000 in ?? ()
Thread 7 (Thread 16033):
#0 0x000000322ea0d605 in __lll_unlock_wake () from /lib64/libpthread.so.0
#1 0x000000322ea0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
#2 0x000000322ea0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
#3 0x00000000004d7e5e in syncprov_op_search (op=0x2d554b70, rs=0x441a3c10) at
syncprov.c:2471
#4 0x000000000047e64a in overlay_op_walk (op=0x2d554b70, rs=0x441a3c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#5 0x000000000047ec27 in over_op_func (op=0x2d554b70, rs=0x441a3c10,
which=op_search) at backover.c:721
#6 0x0000000000420a79 in fe_op_search (op=0x2d554b70, rs=0x441a3c10) at
search.c:372
#7 0x0000000000421217 in do_search (op=0x2d554b70, rs=0x441a3c10) at
search.c:217
#8 0x000000000041e965 in connection_operation (ctx=0x441a3d60, arg_v=<value
optimized out>) at connection.c:1113
#9 0x000000000041f23c in connection_read_thread (ctx=0x441a3d60, argv=<value
optimized out>) at connection.c:1249
#10 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#11 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#12 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#13 0x0000000000000000 in ?? ()
Thread 6 (Thread 11937):
#0 0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
#1 0x00000000004d7e7f in syncprov_op_search (op=0x2aaab802af20, rs=0x418f7c10)
at syncprov.c:2475
#2 0x000000000047e64a in overlay_op_walk (op=0x2aaab802af20, rs=0x418f7c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#3 0x000000000047ec27 in over_op_func (op=0x2aaab802af20, rs=0x418f7c10,
which=op_search) at backover.c:721
#4 0x0000000000420a79 in fe_op_search (op=0x2aaab802af20, rs=0x418f7c10) at
search.c:372
#5 0x0000000000421217 in do_search (op=0x2aaab802af20, rs=0x418f7c10) at
search.c:217
#6 0x000000000041e965 in connection_operation (ctx=0x418f7d60, arg_v=<value
optimized out>) at connection.c:1113
#7 0x000000000041f23c in connection_read_thread (ctx=0x418f7d60, argv=<value
optimized out>) at connection.c:1249
#8 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#9 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 5 (Thread 10755):
#0 0x000000322ea0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1 0x00002b1d8ae85bd1 in __db_pthread_mutex_lock () from
/usr/local/openldap/lib/libdb-5.1.so
#2 0x00002b1d8ae8528a in __db_tas_mutex_lock () from
/usr/local/openldap/lib/libdb-5.1.so
#3 0x00002b1d8af1804c in __lock_get_internal () from
/usr/local/openldap/lib/libdb-5.1.so
#4 0x00002b1d8af189ea in __lock_vec () from
/usr/local/openldap/lib/libdb-5.1.so
#5 0x00002b1d8af1969b in __lock_vec_pp () from
/usr/local/openldap/lib/libdb-5.1.so
#6 0x00000000004bc6c1 in hdb_cache_entry_db_relock (bdb=0xe2bc780, txn=<value
optimized out>, ei=0x2aaadaefc6a0, rw=1, tryOnly=<value optimized out>,
lock=0x439a1640) at cache.c:198
#7 0x00000000004bcb70 in hdb_cache_modify (bdb=0x2b1d8f0fa00c,
e=0x2b1df490e138, newAttrs=0x2b1e071d6918, txn=0x0, lock=0x0) at cache.c:1231
#8 0x0000000000495d13 in hdb_modify (op=0x439a2120, rs=0x439a1cb0) at
modify.c:662
#9 0x000000000047e6c2 in overlay_op_walk (op=0x439a2120, rs=0x439a1cb0,
which=op_modify, oi=0xe2be010, on=0x0) at backover.c:669
#10 0x000000000047ec27 in over_op_func (op=0x439a2120, rs=0x439a1cb0,
which=op_modify) at backover.c:721
#11 0x000000000047240f in syncrepl_entry (si=0xe2c2800, op=0x439a2120,
entry=0x2b1df490e408, modlist=0x439a2c68, syncstate=2, syncUUID=<value optimized
out>, syncCSN=0x2aaade6b6ab0) at syncrepl.c:2694
#12 0x0000000000479cf7 in do_syncrep2 (ctx=<value optimized out>, arg=<value
optimized out>) at syncrepl.c:955
#13 do_syncrepl (ctx=<value optimized out>, arg=<value optimized out>) at
syncrepl.c:1455
#14 0x000000000041f7ca in connection_read_thread (ctx=0x439a2d60, argv=<value
optimized out>) at connection.c:1251
#15 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#16 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#17 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#18 0x0000000000000000 in ?? ()
Thread 4 (Thread 10752):
#0 0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
#1 0x00000000004d7e7f in syncprov_op_search (op=0xe6f0990, rs=0x410f6c10) at
syncprov.c:2475
#2 0x000000000047e64a in overlay_op_walk (op=0xe6f0990, rs=0x410f6c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#3 0x000000000047ec27 in over_op_func (op=0xe6f0990, rs=0x410f6c10,
which=op_search) at backover.c:721
#4 0x0000000000420a79 in fe_op_search (op=0xe6f0990, rs=0x410f6c10) at
search.c:372
#5 0x0000000000421217 in do_search (op=0xe6f0990, rs=0x410f6c10) at
search.c:217
#6 0x000000000041e965 in connection_operation (ctx=0x410f6d60, arg_v=<value
optimized out>) at connection.c:1113
#7 0x000000000041f23c in connection_read_thread (ctx=0x410f6d60, argv=<value
optimized out>) at connection.c:1249
#8 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#9 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 3 (Thread 10749):
#0 0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
#1 0x00000000004d7e7f in syncprov_op_search (op=0xe5eefa0, rs=0x431a1c10) at
syncprov.c:2475
#2 0x000000000047e64a in overlay_op_walk (op=0xe5eefa0, rs=0x431a1c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#3 0x000000000047ec27 in over_op_func (op=0xe5eefa0, rs=0x431a1c10,
which=op_search) at backover.c:721
#4 0x0000000000420a79 in fe_op_search (op=0xe5eefa0, rs=0x431a1c10) at
search.c:372
#5 0x0000000000421217 in do_search (op=0xe5eefa0, rs=0x431a1c10) at
search.c:217
#6 0x000000000041e965 in connection_operation (ctx=0x431a1d60, arg_v=<value
optimized out>) at connection.c:1113
#7 0x000000000041f23c in connection_read_thread (ctx=0x431a1d60, argv=<value
optimized out>) at connection.c:1249
#8 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#9 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 2 (Thread 10748):
#0 0x000000322e2baa27 in sched_yield () from /lib64/libc.so.6
#1 0x00000000004d7e7f in syncprov_op_search (op=0xe5ebbf0, rs=0x429a0c10) at
syncprov.c:2475
#2 0x000000000047e64a in overlay_op_walk (op=0xe5ebbf0, rs=0x429a0c10,
which=op_search, oi=0xe2be010, on=0xe2c30b0) at backover.c:659
#3 0x000000000047ec27 in over_op_func (op=0xe5ebbf0, rs=0x429a0c10,
which=op_search) at backover.c:721
#4 0x0000000000420a79 in fe_op_search (op=0xe5ebbf0, rs=0x429a0c10) at
search.c:372
#5 0x0000000000421217 in do_search (op=0xe5ebbf0, rs=0x429a0c10) at
search.c:217
#6 0x000000000041e965 in connection_operation (ctx=0x429a0d60, arg_v=<value
optimized out>) at connection.c:1113
#7 0x000000000041f23c in connection_read_thread (ctx=0x429a0d60, argv=<value
optimized out>) at connection.c:1249
#8 0x00000000004ec63c in ldap_int_thread_pool_wrapper (xpool=0xe232120) at
tpool.c:685
#9 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#10 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()
Thread 1 (Thread 10747):
#0 0x000000322e2d4108 in epoll_wait () from /lib64/libc.so.6
#1 0x000000000041bc8b in slapd_daemon_task (ptr=<value optimized out>) at
daemon.c:2528
#2 0x000000322ea0673d in start_thread () from /lib64/libpthread.so.0
#3 0x000000322e2d3d1d in clone () from /lib64/libc.so.6
#4 0x0000000000000000 in ?? ()
Logs at the time of the lock-up:
Producer:
Jul 28 22:27:08 admin02 slapd[5178]: conn=422733 op=92 MOD
dn="thing=thing1,ou=things,dc=example,dc=net"
Jul 28 22:27:08 admin02 slapd[5178]: conn=422733 op=92 MOD attr=stuff
Jul 28 22:27:08 admin02 slapd[5178]: slap_queue_csn: queing 0x4b0883f0
20110728222708.058275Z#000000#000#000000
Jul 28 22:27:08 admin02 slapd[5178]: syncprov_sendresp: cookie=rid=001
,csn=20110728222708.058275Z#000000#000#000000
Jul 28 22:27:08 admin02 slapd[5178]: syncprov_sendresp: cookie=rid=001
Producer later:
Jul 29 07:04:55 admin02 slapd[5178]: Entry
thing=thing1,ou=things,dc=example,dc=net CSN
20110728222708.058275Z#000000#000#000000 older or equal to ctx
20110728222708.058275Z#000000#000#000000
Producer/Consumer:
Jul 28 22:27:08 su1-admin-syncrepl01 slapd[10745]: do_syncrep2: rid=001
cookie=rid=001,csn=20110728222708.058275Z#000000#000#000000
Jul 28 22:27:08 su1-admin-syncrepl01 slapd[10745]: slap_queue_csn: queing
0x2aaadecf4b00 20110728222708.058275Z#000000#000#000000
^^ No more syncrepl logs after this
Help!
--On Tuesday, August 02, 2011 11:03:24 AM -0700 Quanah Gibson-Mount <quanah(a)zimbra.com> wrote:
> --On Tuesday, August 02, 2011 5:54 PM +0000 whm(a)stanford.edu wrote:
>
>>
>>
>> --On Tuesday, August 02, 2011 10:08:32 AM -0700 Bill MacAllister
>> <whm(a)stanford.edu> wrote:
>>
>>>
>>>
>>> --On Monday, August 01, 2011 02:46:50 PM -0700 Howard Chu
>>> <hyc(a)symas.com> wrote:
>>>
>>>> whm(a)stanford.edu wrote:
>>>>> Full_Name: Bill MacAllister
>>>>> Version: 2.4.26
>>>>> OS: Debian 6
>>>>> URL: ftp://ftp.openldap.org/incoming/
>>>>> Submission from: (NULL) (171.64.19.165)
>>>>>
>>>>>
>>>>> We typically set up local proxy servers to support applications that
>>>>> cannot support a GSSAPI bind to the directory server. The proxy
>>>>> server allows anonymous access to the directory for connections from
>>>>> the localhost and connects to the master using GSSAPI. We are
>>>>> experiencing failures when we attempt to use the paged results
>>>>> control on the proxy. For example:
>>>>>
>>>>> ldapsearch -E pr=1000/noprompt -x -b "cn=people,dc=stanford,dc=edu" -h
>>>>> localhost "(&(objectClass=suPerson)(suVisibIdentity=world))" ou
>>>>> telephonenumber title
>>>>>
>>>>> ends with the error:
>>>>>
>>>>> # search result
>>>>> search: 5
>>>>> result: 0 Success
>>>>> control: 1.2.840.113556.1.4.319 false MA0CAQAECCiDAAAAAAAA
>>>>> pagedresults: cookie=KIMAAAAAAAA=
>>>>> # extended LDIF
>>>>> #
>>>>> # LDAPv3
>>>>> # base<cn=people,dc=stanford,dc=edu> with scope subtree
>>>>> # filter: (&(objectClass=suPerson)(suVisibIdentity=world))
>>>>> # requesting: ou telephonenumber title
>>>>> # with pagedResults control: size=1000
>>>>> #
>>>>>
>>>>> # search result
>>>>> search: 6
>>>>> result: 2 Protocol error
>>>>> text: paged results cookie is invalid
>>>>>
>>>>> # numResponses: 4005
>>>>> # numEntries: 4000
>>>>>
>>>>> This result is not consistent. We have seen examples where 2000 and
>>>>> 3000 entries being returned and then the error. Another test that we
>>>>> performed with a slightly more complex filter, i.e.
>>>>>
>>>>> "(&(objectClass=suPerson)(|(suVisibIdentity=world)(suVisibIdentity=
>>>>> world)))"
>>>>>
>>>>> usually returned 1000 entries before erroring.
>>>>>
>>>>> Issuing a similar search directly against the backend ldap server
>>>>> completes without
>>>>> error.
>>>>>
>>>>> We have seen the same behavior on OpenLDAP 2.4.23 as well.
>>>>>
>>>>> Logs generated running slapd standalone with '-d stats,packets' are
>>>>> available at http://www.stanford.edu/~whm/files/ldap-debugging/.
>>>>
>>>> Your log shows that the subsequent search request initiates a new
>>>> Bind to the remote server, which implies that it's not re-using the
>>>> same connection as the first request. Since a paged results cookie
>>>> is only valid within the context of a single connection, you get
>>>> this error result.
>>>
>>> Not sure which log you are looking at. When I look at the log:
>>>
>>> http://www.stanford.edu/~whm/files/ldap-debugging/slapd-trace-paged-resu
>>> lts.log.gz
>>>
>>> The only connection I see in the log is conn=1000 and it ends with:
>>>
>>> conn=1000 op=5 SEARCH RESULT tag=101 err=2 nentries=0 text=paged results
>>> cookie is invalid ldap_read: want=8, got=7
>>> 0000: 30 05 02 01 07 42 00 0....B.
>>> ldap_read: want=8, got=0
>>>
>>> conn=1000 op=6 UNBIND
>>> conn=1000 fd=11 closed
>>>
>>> These tests were made with a single ldapsearch request. The ldapsearch
>>> tests fail when using the proxy and succeed when connecting directly to
>>> the LDAP server with the database on it.
>>>
>>> A side note: the test case I submitted used ldapsearch, but the
>>> problem was uncovered using a Python application that is used for
>>> syncing Gmail account data.
>>>
>>> Bill
>>
>> I have copied the backend server configuration to
>> http://www.stanford.edu/~whm/files/ldap-debugging/. I dumped a
>> copy of cn=config and there is a file-based version in the ldap
>> subdirectory as well.
>
> Where's the configuration for the slapd-ldap server? That's of the
> most importance...
>
> --Quanah
Of course, sorry about that. I have copied the files to the web site.
Bill
--
Bill MacAllister
Infrastructure Delivery Group, Stanford University
--On Tuesday, August 02, 2011 5:54 PM +0000 whm(a)stanford.edu wrote:
>
>
> --On Tuesday, August 02, 2011 10:08:32 AM -0700 Bill MacAllister
> <whm(a)stanford.edu> wrote:
>
>>
>>
>> --On Monday, August 01, 2011 02:46:50 PM -0700 Howard Chu
>> <hyc(a)symas.com> wrote:
>>
>>> whm(a)stanford.edu wrote:
>>>> Full_Name: Bill MacAllister
>>>> Version: 2.4.26
>>>> OS: Debian 6
>>>> URL: ftp://ftp.openldap.org/incoming/
>>>> Submission from: (NULL) (171.64.19.165)
>>>>
>>>>
>>>> We typically setup local proxy servers to support applications that
>>>> cannot support a GSSAPI bind to the directory server. The proxy
>>>> server allows anonymous access to the directory for connections from
>>>> the localhost and connects to the master using GSSAPI. We are
>>>> experiencing failures when we attempt to use the paged results
>>>> control on the proxy. For example:
>>>>
>>>> ldapsearch -E pr=1000/noprompt -x -b "cn=people,dc=stanford,dc=edu" -h
>>>> localhost "(&(objectClass=suPerson)(suVisibIdentity=world))" ou
>>>> telephonenumber title
>>>>
>>>> ends with the error:
>>>>
>>>> # search result
>>>> search: 5
>>>> result: 0 Success
>>>> control: 1.2.840.113556.1.4.319 false MA0CAQAECCiDAAAAAAAA
>>>> pagedresults: cookie=KIMAAAAAAAA=
>>>> # extended LDIF
>>>> #
>>>> # LDAPv3
>>>> # base<cn=people,dc=stanford,dc=edu> with scope subtree
>>>> # filter: (&(objectClass=suPerson)(suVisibIdentity=world))
>>>> # requesting: ou telephonenumber title
>>>> # with pagedResults control: size=1000
>>>> #
>>>>
>>>> # search result
>>>> search: 6
>>>> result: 2 Protocol error
>>>> text: paged results cookie is invalid
>>>>
>>>> # numResponses: 4005
>>>> # numEntries: 4000
>>>>
>>>> This result is not consistent. We have seen examples where 2000 and
>>>> 3000 entries were returned before the error. Another test that we
>>>> performed with a slightly more complex filter, i.e.
>>>>
>>>> "(&(objectClass=suPerson)(|(suVisibIdentity=world)(suVisibIdentity=
>>>> world)))"
>>>>
>>>> usually returned 1000 entries before erroring.
>>>>
>>>> Issuing a similar search directly against the backend ldap server
>>>> completes without error.
>>>>
>>>> We have seen the same behavior on OpenLDAP 2.4.23 as well.
>>>>
>>>> Logs generated running slapd standalone with '-d stats,packets' are
>>>> available at http://www.stanford.edu/~whm/files/ldap-debugging/.
>>>
>>> Your log shows that the subsequent search request initiates a new
>>> Bind to the remote server, which implies that it's not re-using the
>>> same connection as the first request. Since a paged results cookie
>>> is only valid within the context of a single connection, you get
>>> this error result.
>>
>> Not sure which log you are looking at. When I look at the log:
>>
>> http://www.stanford.edu/~whm/files/ldap-debugging/slapd-trace-paged-results.log.gz
>>
>> The only connection I see in the log is conn=1000 and it ends with:
>>
>> conn=1000 op=5 SEARCH RESULT tag=101 err=2 nentries=0 text=paged results
>> cookie is invalid
>> ldap_read: want=8, got=7
>> 0000: 30 05 02 01 07 42 00 0....B.
>> ldap_read: want=8, got=0
>>
>> conn=1000 op=6 UNBIND
>> conn=1000 fd=11 closed
>>
>> These tests were made with a single ldapsearch request. The ldapsearch
>> tests fail when using the proxy and succeed when connecting directly to
>> the LDAP server with the database on it.
>>
>> A side note: the test case I submitted used ldapsearch, but the
>> problem was uncovered using a Python application that is used for
>> syncing Gmail account data.
>>
>> Bill
>
> I have copied the backend server configuration to
> http://www.stanford.edu/~whm/files/ldap-debugging/. I dumped a
> copy of cn=config and there is a file-based version in the ldap
> subdirectory as well.
Where's the configuration for the slapd-ldap server? That's of the most
importance...
--Quanah
--
Quanah Gibson-Mount
Sr. Member of Technical Staff
Zimbra, Inc
A Division of VMware, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Full_Name: Bjoern Jacke
Version: 2.4.23
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (193.159.216.39)
When TIMEOUT is set and a timeout actually occurs, there is nothing that shows
that we timed out. For example, if you do a large/slow LDAP search using
ldapsearch and then stop the conversation between slapd and ldapsearch with
"killall -STOP slapd", ldapsearch will terminate after TIMEOUT seconds, return
the number of records that it got so far, and exit with return code 0 and no
error message. I'd expect that when the TIMEOUT occurs we get a warning and a
return code != 0.
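[Editor's note: the behaviour this report asks for, partial output plus a nonzero exit status when the timeout fires, can be sketched client-side. This is a hypothetical illustration of the desired semantics, not ldapsearch's actual code; `collect_entries` and `fetch_next` are invented names.]

```python
# Sketch of the requested behaviour: when the client-side timeout
# expires mid-search, keep the entries received so far but report the
# truncation through a nonzero exit code instead of silently exiting 0.
# This models the desired semantics; it is not ldapsearch's real code.
import time

def collect_entries(fetch_next, timeout):
    """Call fetch_next() until it returns None (search complete) or
    until `timeout` seconds elapse; return (entries, exit_code)."""
    deadline = time.monotonic() + timeout
    entries = []
    timed_out = False
    while True:
        if time.monotonic() >= deadline:
            timed_out = True          # surface the truncation
            break
        entry = fetch_next()
        if entry is None:
            break
        entries.append(entry)
    # A timed-out search is an error, even if some entries arrived.
    return entries, (1 if timed_out else 0)
```

A wrapper like this would let scripts distinguish "all entries retrieved" (exit 0) from "server stopped responding and the result set is truncated" (exit 1).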
Quanah Gibson-Mount wrote:
> --On Monday, August 01, 2011 9:47 PM +0000 hyc(a)symas.com wrote:
>> Your log shows that the subsequent search request initiates a new Bind to
>> the remote server, which implies that it's not re-using the same
>> connection as the first request. Since a paged results cookie is only
>> valid within the context of a single connection, you get this error
>> result.
>
> Given that the client making the request is ldapsearch, which is not going
> to rebind mid search, and you can see it is done with noprompt, so there's
> no human interaction here, it seems like back-ldap is buggy here. Why is
> *back-ldap* closing its connection to the upstream server and initiating a
> new bind?
You haven't provided any slapd configuration info. Without seeing that, it's
way premature to claim there's any bug here.
Look at the log yourself. back-ldap *doesn't* close the connection. Don't spew
garbage about what it does when it didn't actually do such a thing.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/