jclarke(a)linagora.com wrote:
> With a simple master/slave setup, and the rwm overlay activated for the bindDN
> context, any modify operation made on the master and replicated with slurpd
> causes the slave to crash.
>
> Below you will find a backtrace of the slave when it crashes. The
> configuration files we're using on the master and the slave, and an LDIF of
> the contents of the directory, are in the archive on ftp.openldap.org
> indicated in the ITS.
>
> We are using 2.3.37 for both master and slave, and have confirmed that
> deactivating the rwm overlay in slapd.conf avoids the problem.
I'll try to look at your report, but let me note that slurpd is
deprecated; since you're running OpenLDAP 2.3.37, you are strongly
encouraged to use syncrepl instead.
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati(a)sys-net.it
---------------------------------------
rhafer(a)suse.de wrote:
> Hm, current HEAD first calls add_query(), which adds the CachedQuery to the
> cache, and only after that calls cache_entries() to add the entries of that
> query to the cache. That means that query_containment already knows about the
> query before its result is completely cached.
> In RE23 it is just the other way around (first cache_entries(), then
> add_query()).
> I see two possible solutions:
>
> 1. Switch back to the old behaviour. But I guess the change was made for a
> reason, which I don't know yet. It seems the change happened between r1.95 and
> r1.96 of pcache.c (log message: "Fix concurrency issues").
>
> 2. Protect the cached query with an rw_lock. Write-lock it while
> cache_entries() is executing and read-lock it during searches. This would give
> us the behaviour that Ando suggested in the discussion of ITS#5112 (pcache
> would not try to cache the same search request multiple times, but block the
> second request until the first one is cached and then answer it from the
> cache).
+1 (if there's no drawback, of course)
p.
On Wednesday, 29 August 2007, Howard Chu wrote:
> rhafer(a)suse.de wrote:
> > Full_Name: Ralf Haferkamp
> > Version: HEAD
> > OS:
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (89.166.180.39)
> >
> >
> > While the results of a query are being cached, slapo-pcache will answer
> > queries that match the same template from the cache that is currently
> > being populated. This means that subsequent queries will get incomplete
> > results until the original query is completely cached.
>
> I don't see how that is possible. The query-in-progress isn't added to the
> cache until the final result is received. Until then, query_containment
> should not know anything is there to answer with.
Hm, current HEAD first calls add_query(), which adds the CachedQuery to the
cache, and only after that calls cache_entries() to add the entries of that
query to the cache. That means that query_containment already knows about the
query before its result is completely cached.
In RE23 it is just the other way around (first cache_entries(), then
add_query()).
I see two possible solutions:
1. Switch back to the old behaviour. But I guess the change was made for a
reason, which I don't know yet. It seems the change happened between r1.95 and
r1.96 of pcache.c (log message: "Fix concurrency issues").
2. Protect the cached query with an rw_lock. Write-lock it while
cache_entries() is executing and read-lock it during searches. This would give
us the behaviour that Ando suggested in the discussion of ITS#5112 (pcache
would not try to cache the same search request multiple times, but block the
second request until the first one is cached and then answer it from the
cache).
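As a rough illustration of option 2, here is a minimal sketch in Python rather than the C of pcache.c. It is not pcache code: `QueryCache`, `CachedQuery`, `lookup_or_register`, `fill` and `search` are hypothetical names, and a one-shot event stands in for the read/write lock. The query is registered first (as add_query does in HEAD), but readers block until the entries have been stored, so nobody sees a half-populated result and the backend is asked only once.

```python
import threading


class CachedQuery:
    """Hypothetical stand-in for a cached query and its entries."""

    def __init__(self, key):
        self.key = key
        self.entries = []
        self._ready = threading.Event()  # set once population is complete

    def fill(self, entries):
        # Plays the role of cache_entries(): store everything, then publish.
        for entry in entries:
            self.entries.append(entry)
        self._ready.set()

    def read(self, timeout=None):
        # Readers wait here instead of seeing partial results.
        if not self._ready.wait(timeout):
            raise TimeoutError("cached query was never completed")
        return list(self.entries)


class QueryCache:
    def __init__(self):
        self._lock = threading.Lock()
        self._queries = {}

    def lookup_or_register(self, key):
        """Return (query, is_owner); the owner is responsible for fill()."""
        with self._lock:
            query = self._queries.get(key)
            if query is not None:
                return query, False
            query = CachedQuery(key)
            self._queries[key] = query
            return query, True


def search(cache, key, backend):
    """Answer from the cache, populating it on the first request only."""
    query, is_owner = cache.lookup_or_register(key)
    if is_owner:
        query.fill(backend(key))
    return query.read()
```

A real implementation would also have to handle the case where the first request fails or is abandoned; in this sketch the waiting readers would then block until their timeout.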
--
Ralf
Full_Name: Jonathan Clarke
Version: 2.3.37
OS: Debian Etch
URL: ftp://ftp.openldap.org/incoming/jclarke-slurpd-rwm-modify-bug.tar.gz
Submission from: (NULL) (134.157.159.100)
Hello guys,
With a simple master/slave setup, and the rwm overlay activated for the bindDN
context, any modify operation made on the master and replicated with slurpd
causes the slave to crash.
Below you will find a backtrace of the slave when it crashes. The
configuration files we're using on the master and the slave, and an LDIF of
the contents of the directory, are in the archive on ftp.openldap.org
indicated in the ITS.
We are using 2.3.37 for both master and slave, and have confirmed that
deactivating the rwm overlay in slapd.conf avoids the problem.
Thanks in advance for your help!
Jon
***** BACKTRACE ******
conn=0 fd=14 ACCEPT from IP=127.0.0.1:57752 (IP=0.0.0.0:392)
[New Thread 32771 (LWP 21187)]
conn=0 op=0 BIND dn="uid=replicator,dc=linagora,dc=org" method=128
conn=0 op=0 BIND dn="uid=replicator,dc=linagora,dc=org" mech=SIMPLE ssf=0
conn=0 op=0 RESULT tag=97 err=0 text=
conn=0 op=1 MOD dn="dc=linagora,dc=org"
conn=0 op=1 MOD attr=description entryCSN modifiersName modifyTimestamp
conn=0 op=1 RESULT tag=103 err=0 text=
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 32771 (LWP 21187)]
0x4035dafb in free () from /lib/libc.so.6
(gdb) bt
#0 0x4035dafb in free () from /lib/libc.so.6
#1 0x4035f805 in malloc () from /lib/libc.so.6
#2 0x0813e70f in ber_memalloc_x (s=1, ctx=0x40412660) at memory.c:226
#3 0x08079fa8 in ch_malloc (size=16) at ch_malloc.c:54
#4 0x08101567 in rwm_op_modify (op=0x8246fc0, rs=0x6301dce4) at rwm.c:590
#5 0x080b6c5e in overlay_op_walk (op=0x8246fc0, rs=0x6301dce4, which=op_modify, oi=0x81e9688, on=0x81e9778) at backover.c:639
#6 0x080b70ce in over_op_func (op=0x8246fc0, rs=0x6301dce4, which=op_modify) at backover.c:701
#7 0x08077e00 in do_modify (op=0x8246fc0, rs=0x6301dce4) at modify.c:200
#8 0x080609bb in connection_operation (ctx=0x6301dd58, arg_v=0x8246fc0) at connection.c:1133
#9 0x0811a8b4 in ldap_int_thread_pool_wrapper (xpool=0x81d0000) at tpool.c:478
#10 0x402adc51 in pthread_start_thread () from /lib/libpthread.so.0
#11 0x402addb4 in pthread_start_thread_event () from /lib/libpthread.so.0
#12 0x403b438a in clone () from /lib/libc.so.6
***** End of backtrace *****
lwartha(a)gmail.com wrote:
> I made a stupid typo and used the word acces instead of access in slapd.conf.
> Unfortunately slaptest -u does not display any notice about it and reports
> that config file testing succeeded.
Because up to 2.3 OpenLDAP didn't care about unrecognized statements.
If you run slaptest with -dconfig you'll notice warnings (which you should
do when debugging slapd.conf; it is not the default because in most
cases people are only interested in a yes/no answer; ITS#4930 was
filed to remove even the "success" string).
>
> This of course takes a lot of investigation to find.
>
> Thanks for fixing it in the future.
This has been changed in 2.4. Now all errors, including
unrecognized statements, cause a bailout.
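
The difference between the two behaviours can be sketched as follows. This is an illustrative Python toy, not slapd's actual config parser: `check_config`, the `strict` switch and the tiny `KNOWN` set are assumptions made up for the example.

```python
# Tiny illustrative subset of directive names; the real list is much longer.
KNOWN = {"include", "database", "suffix", "rootdn", "access"}


def check_config(lines, strict):
    """Return warnings for unknown directives, or raise if strict is set."""
    warnings = []
    for lineno, raw in enumerate(lines, 1):
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # blank line or comment
        if raw[:1].isspace():
            continue  # treated here as a continuation of the previous line
        keyword = raw.split()[0]
        if keyword not in KNOWN:
            message = "line %d: unknown directive <%s>" % (lineno, keyword)
            if strict:
                # Bail out on the first unrecognized statement.
                raise ValueError(message)
            # Lenient mode: remember the warning and carry on.
            warnings.append(message)
    return warnings
```

With `strict=False` a misspelt `acces` line only produces a warning that the caller may never display; with `strict=True` the check fails immediately.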
p.
Full_Name: Ladislav Wartha
Version: 2.3.27-4
OS: Fedora Core release 6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (195.122.198.222)
Hi,
I made a stupid typo and used the word acces instead of access in slapd.conf.
Unfortunately slaptest -u does not display any notice about it and reports
that config file testing succeeded.
This of course takes a lot of investigation to find.
Thanks for fixing it in the future.
Best regards,
Ladislav
hyc(a)symas.com wrote:
> ando(a)sys-net.it wrote:
>> When slapadd'ing -q, existing database log files seem to become unusable. If
>> this is correct, as it seems to be, slapadd could refuse to start with -q if log
>> files are present, or, for example, remove the logs if -qq.
>
> I guess we could add that check. The docs already say that if an error occurs,
> the entire database will be unusable. As such, you should only use it for
> initially populating a database, not for adding to an existing one.
The story is that I placed logs in a separate directory and I forgot to
clean them up when regenerating the DB after removing the database files :)
p.
ando(a)sys-net.it wrote:
> When slapadd'ing -q, existing database log files seem to become unusable. If
> this is correct, as it seems to be, slapadd could refuse to start with -q if log
> files are present, or, for example, remove the logs if -qq.
I guess we could add that check. The docs already say that if an error occurs,
the entire database will be unusable. As such, you should only use it for
initially populating a database, not for adding to an existing one.
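
A minimal sketch of what such a check might look like, written in Python for brevity rather than as slapadd code. The function names and the numeric `quick` levels are hypothetical (level 2 models the proposed -qq), and the sketch assumes the transaction logs use the usual Berkeley DB naming, log. followed by digits.

```python
import glob
import os


def stale_logs(logdir):
    """List files in logdir that look like Berkeley DB transaction logs."""
    return sorted(glob.glob(os.path.join(logdir, "log.[0-9]*")))


def preflight(logdir, quick):
    """quick: 0 = normal run, 1 = -q, 2 = the proposed -qq (hypothetical).

    Returns the list of log files that were removed.
    """
    logs = stale_logs(logdir)
    if quick == 0 or not logs:
        return []
    if quick == 1:
        # Refuse to start rather than silently invalidate existing logs.
        raise RuntimeError("existing log files found: %s" % ", ".join(logs))
    for path in logs:
        os.remove(path)
    return logs
```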
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
To reproduce:
- set idlcache
- search one entry, so that the idl gets cached
- delete that entry, so that the idl gets cleared - but head/tail don't
- search another entry so that it gets cached - head/tail are corrupted
I have a fix for this coming shortly (it affects 2.4.5 as well, sigh; not
sure about re23).
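
The general shape of the problem can be illustrated with a generic doubly linked list; this is a Python sketch, not the IDL cache code, and `Node` and `LRUList` are hypothetical names. When the only node is unlinked, both head and tail must be reset, otherwise the next insertion links against a stale pointer.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class LRUList:
    """Doubly linked list with explicit head and tail pointers."""

    def __init__(self):
        self.head = None
        self.tail = None

    def push_front(self, node):
        node.prev = None
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node

    def unlink(self, node):
        # Update the neighbours, or head/tail when there is no neighbour.
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        node.prev = node.next = None
```

Leaving out the two `else` branches in `unlink` gives the kind of stale head/tail described in the steps above.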
p.
Full_Name: Pierangelo Masarati
Version: since back-config
OS: irrelevant
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (193.203.230.29)
Submitted by: ando
Setting a negative number for cachesize causes an assertion failure in
ch_calloc(); for idlcachesize it is simply accepted; the same holds for most
of the remaining numeric data.
A neat solution would be to define ARG_UINT/ARG_ULONG and redefine the types as
appropriate (they are currently ints, but a negative value either has no
meaning or causes a crash). There seems to be room for 8 more types in the
type mask.
Not a showstopper, though.
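
To illustrate the kind of validation an unsigned argument type would give, here is a minimal sketch in Python rather than the config code's C. `parse_uint` and its `max_value` default are hypothetical and are not the proposed ARG_UINT implementation.

```python
def parse_uint(text, max_value=2**32 - 1):
    """Parse a non-negative decimal integer, rejecting signs and junk."""
    s = text.strip()
    # Only plain ASCII digits are accepted, so "-1", "+5", "" and "1.5" fail.
    if not (s.isascii() and s.isdigit()):
        raise ValueError("not an unsigned integer: %r" % text)
    value = int(s)
    if value > max_value:
        raise ValueError("value out of range: %r" % text)
    return value
```

A setting such as cachesize -1 would then be rejected when the configuration is parsed instead of reaching the allocator.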