(ITS#8039) syncprov memory leak
by hyc@openldap.org
Full_Name: Howard Chu
Version: 2.4.40
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (110.49.207.42)
Submitted by: hyc
In a 3-way MMR setup I'm seeing a consistent leak of queued Entries in syncprov.
I'm not entirely sure of the exact sequence to reproduce it, but this is what I
have so far:
Test DB is 2.5GB on disk. One of the nodes (server #3) is configured with an mdb
maxsize of 1GB, so it will always fail to sync up. The other two have sufficient
maxsize configured.
While server #3 is running, stop server #1 and #2 and completely reinit their
DBs. rm the old files, rerun slapadd -w on server #1 (all new entryUUIDs and
entryCSNs). Start server #1. Server #3 will reconnect and start trying to update
itself and most of the updates will fail (as it runs out of space). Start server
#2 (with an empty DB). It will start syncing from server #1. syncprov leaks will
occur on both #2 and #3 during this time.
Part of the trigger here appears to be from slapadd'ing the DB on server #1
without using the -S <sid> argument, so the DB contents all have SID #0.
In this case updates get sent from #1 to #2 and #3 and syncprov on #2 and
#3 queues them up for transmission to other nodes, respectively. Syncprov
doesn't know that it doesn't need to do this, because the SID #0 in the updates
doesn't match any of the servers' SID #1, 2, or 3.
The point in the code where the leak occurs is not obvious; the queued entries
are all refcounted and the refcount is incremented whenever an entry is matched
to an outbound queue. The count is decremented when a queue finishes sending the
entry. In this particular case, I don't believe the entry is matched to any
queue, so it should simply be freed again at the end of syncprov_matchops().
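The refcounting scheme described above can be sketched as follows. This is only an illustration under assumed names (qentry, qentry_ref, qentry_unref and match_ops are hypothetical, not the syncprov structures or functions); it shows why an entry matched to no queue should be released as soon as matching ends.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a queued entry; not the syncprov structure. */
typedef struct qentry {
    int refcount;   /* holders: the creator plus every matched queue */
    int *freed;     /* set to 1 on release so a caller can observe the free */
} qentry;

static qentry *qentry_new(int *freed_flag) {
    qentry *e = malloc(sizeof(*e));
    if (e == NULL) return NULL;
    e->refcount = 1;            /* the creator's own reference */
    e->freed = freed_flag;
    *freed_flag = 0;
    return e;
}

/* Taken whenever the entry is matched to an outbound queue. */
static void qentry_ref(qentry *e) { e->refcount++; }

/* Dropped when a queue finishes sending, or when matching is done. */
static void qentry_unref(qentry *e) {
    if (--e->refcount == 0) {
        *e->freed = 1;
        free(e);
    }
}

/* Match one entry against nqueues queues and return the match count.
 * The creator's reference is always dropped at the end, so an entry
 * that matched nothing is freed right here. */
static int match_ops(qentry *e, const int *queue_matches, int nqueues) {
    int i, matched = 0;
    for (i = 0; i < nqueues; i++) {
        if (queue_matches[i]) {
            qentry_ref(e);
            matched++;
        }
    }
    qentry_unref(e);
    return matched;
}
```

In this model a leak can only come from a reference that is taken but never dropped, or from skipping the final unref on some exit path.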
8 years, 10 months
(ITS#8038) syncrepl presentlist bug
by hyc@openldap.org
Full_Name: Howard Chu
Version: 2.4
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (180.180.122.131)
Submitted by: hyc
During a refresh the consumer maintains a presentlist containing the UUIDs of
all entries that exist on the provider. At the end of the refresh this list is
used to delete any entries on the consumer that don't exist on the provider.
In refreshAndPersist, at the end of the refresh phase, the presentlist is not
being freed as it should. It remains intact and unused until the persist session
breaks and is restarted.
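A rough sketch of the intended presentlist lifecycle, under assumptions: the presentlist type and the pl_add, pl_contains, pl_free and end_refresh helpers are hypothetical names for illustration, and a flat array stands in for whatever structure syncrepl really uses. The point is only that the list is released at the end of the refresh phase rather than kept for the whole persist session.

```c
#include <stdlib.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical presentlist: a growable array of provider entryUUIDs. */
typedef struct presentlist {
    unsigned char (*uuids)[UUID_LEN];
    size_t count, cap;
} presentlist;

/* Record one UUID reported as present by the provider. */
static int pl_add(presentlist *pl, const unsigned char *uuid) {
    if (pl->count == pl->cap) {
        size_t ncap = pl->cap ? pl->cap * 2 : 8;
        void *p = realloc(pl->uuids, ncap * UUID_LEN);
        if (p == NULL) return -1;
        pl->uuids = p;
        pl->cap = ncap;
    }
    memcpy(pl->uuids[pl->count++], uuid, UUID_LEN);
    return 0;
}

static int pl_contains(const presentlist *pl, const unsigned char *uuid) {
    size_t i;
    for (i = 0; i < pl->count; i++)
        if (memcmp(pl->uuids[i], uuid, UUID_LEN) == 0) return 1;
    return 0;
}

static void pl_free(presentlist *pl) {
    free(pl->uuids);
    pl->uuids = NULL;
    pl->count = pl->cap = 0;
}

/* End of refresh: flag local entries missing on the provider (keep[i] = 0),
 * then free the list immediately. Returns the number of entries to delete. */
static size_t end_refresh(presentlist *pl, unsigned char (*local)[UUID_LEN],
                          size_t nlocal, int *keep) {
    size_t i, ndel = 0;
    for (i = 0; i < nlocal; i++) {
        keep[i] = pl_contains(pl, local[i]);
        if (!keep[i]) ndel++;
    }
    pl_free(pl);    /* not deferred until the persist session ends */
    return ndel;
}
```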
8 years, 10 months
Re: (ITS#8023) slappasswd with sha2 overlay can generate hashes but not salted hashes
by freebsd@jonathanprice.org
I have now made progress in narrowing down the cause further.

I have noticed that it is a regression between FreeBSD 9.x -> FreeBSD 10.x.
For this reason, I will move any updates on this to the FreeBSD bug tracker,
rather than the OpenLDAP one, as the bug is platform specific.

Future news will be posted here:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197004

Thank you for your time,

-Jonathan

January 22 2015 2:25 PM, freebsd(a)jonathanprice.org wrote:
> Sorry for the slow response, but I have made some progress with the issue.
>
> (as an aside, I installed a build from LTB, and unfortunately it does not
> contain this overlay)
>
> I have detailed my findings (including some trawling through the source)
> over on the FreeBSD bug tracker, as I suspect it could well be a platform
> related issue. Nonetheless, it might be worth reading:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197004
>
> January 14 2015 4:31 PM, "Quanah Gibson-Mount" <quanah@zimbra.com> wrote:
>
>> --On Wednesday, January 14, 2015 11:00 AM +0000 freebsd(a)jonathanprice.org
>> wrote:
>>
>>> Hi,
>>>
>>> I tried 2.4.39 under FreeBSD and still had the same issue.
>>>
>>> I have also tried the packages for both CentOS 7 and Debian Wheezy, but
>>> unfortunately neither of them include the SHA2 overlay by default.
>>>
>>> Finally, I tried installing zimbra-core and zimbra-ldap under CentOS.
>>> When I used this installation, it worked successfully.
>>>
>>> I ran slapd -V on the zimbra installation, and it's 2.4.39. However,
>>> based on it still not working on 2.4.39 on FreeBSD it appears to have
>>> narrowed it down to two reasons: - An issue with the packaging under
>>> FreeBSD
>>> - The functionality is specific to Zimbra
>>>
>>> The next step in the process to narrow this down is to do a manual
>>> compilation on CentOS, including the SHA2 overlay. If this works, then it
>>> would confirm it to be a FreeBSD issue, and if it doesn't work that would
>>> strongly suggest that Zimbra has modified something.
>>
>> You could simply grab the LTB project builds. I'm pretty sure they build
>> out the contrib modules.
>>
>> In any case, I already noted that Zimbra doesn't patch anything in OpenLDAP
>> that would affect this area.
>>
>> --Quanah
>>
>> --
>>
>> Quanah Gibson-Mount
>> Platform Architect
>> Zimbra, Inc.
>> _______________________________
>>
>> Zimbra :: the leader in open source messaging and collaboration
8 years, 10 months
(ITS#8037) Modifying structural OC w/relax fails on delta-syncrepl consumers
by ian@uns.ac.rs
Full_Name: Ivan Nejgebauer
Version: 2.4.41 Engineering
OS: Linux
URL: ftp://ftp.openldap.org/incoming/ivannejgebauer-150128.tgz
Submission from: (NULL) (2001:4170:2000:2:11e5:197a:fff8:8042)
If an ldapmodify which changes an entry's structural object class using the
Relax Rules control is successfully performed on the provider in a
provider/consumer pair running delta-syncrepl, the modification will fail on the
consumer because relax is not in effect when the consumer attempts to modify its
copy of the entry.
The attached archive, which should be extracted in the root of the OpenLDAP
source tree, contains scripts and data to replicate the issue. Steps to
reproduce:
$ sh relax-syncrepl-test/conf-ldap-mdb && make depend && make
$ cd relax-syncrepl-test
$ make clean-all master replica
$ ./start-master.sh
$ ./mod-l-master.sh here # modifies an entry to prime accesslog
$ ./start-replica.sh # writes SYNC debugging to replica.log
$ ./mod-relax-master.sh # ldapmodify w/relax
$ tail replica.log # "entry failed schema check: ..."
$ ./stop-replica.sh
$ ./stop-master.sh
A trivial but indiscriminate fix is to activate Relax Rules for every modify op
on the consumer:
--- servers/slapd/syncrepl.c.orig 2015-01-22 03:02:09.000000000 +0100
+++ servers/slapd/syncrepl.c 2015-01-28 10:31:22.225060880 +0100
@@ -2349,6 +2349,7 @@
oes.oe_si = si;
LDAP_SLIST_INSERT_HEAD( &op->o_extra,
&oes.oe, oe_next );
}
+ op->o_relax = SLAP_CONTROL_CRITICAL;
rc = op->o_bd->be_modify( op, &rs );
if ( SLAP_MULTIMASTER( op->o_bd )) {
LDAP_SLIST_REMOVE( &op->o_extra,
&oes.oe, OpExtra, oe_next );
A real fix would involve modifying the persistent search to include reqControls
in its attribute list and activating Relax Rules on the consumer only if it had
been active on the provider when the modification occurred.
8 years, 10 months
back-mdb regression from ITS#7904
by hyc@openldap.org
Full_Name: Howard Chu
Version: 2.4.40
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (180.180.122.131)
Submitted by: hyc
The scopes array points to dn2id records from the previous read txn; some or all
of these may be invalid after the txn_reset/txn_renew in mdb_writewait. They
need to be refreshed in mdb_waitfixup. Patch coming shortly.
8 years, 10 months
(ITS#8035) syncrepl consumer memleak
by hyc@openldap.org
Full_Name: Howard Chu
Version: HEAD/RR24
OS:
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (180.180.122.131)
Submitted by: hyc
The consumer can leak a cookie if it gets an error and retries the sync session
and it was the initial refresh of the DB, i.e., it had no valid cookie or
contextCSN yet.
It can also leak an entry if it's in Persist phase and while attempting to add
an entry it gets a NO_SUCH_OBJECT error.
Fix coming shortly.
8 years, 10 months
Re: (ITS#8034) lmdb-0.9.14 | Undefined symbols for architecture ppc: "_posix_memalign" referenced from _mdb_env_copyfd2 | ld: symbol(s) not found for architecture ppc
by h.b.furuseth@usit.uio.no
On 24/01/15 10:10, Howard Chu wrote:
>> We could drop memalign. malloc(desired space + 1 OS page), then
>> adjust for alignment. On machines with a sane linear address
>> space where we can tell alignment from the address, anyway.
>
> Such an address space is already a requirement for LMDB, since mmaps
> are page-aligned.
I may have used the wrong word. mmap() must align, but that's no
reason (size_t)pointer has to look sane. If it wants, the machine
can still have PDP-endian descending address representation with
a checkbit as every 4th bit, while running in a mode with big-
endian integers, all of which shows up in (size_t)pointer.
Anyway, the branch now tries to test for sane-looking addresses.
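For illustration, the over-allocate-and-adjust idea quoted above might look roughly like the sketch below. This is not the code in the mdb/memalign branch; aligned_malloc and aligned_free are hypothetical names, and the sketch assumes exactly the "sane" case under discussion, where converting a pointer to uintptr_t yields a linear address whose low bits reflect alignment.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes aligned to align (a power of two) using plain malloc.
 * The block is over-allocated by align plus one pointer; the original malloc
 * pointer is stored just below the returned address for aligned_free().
 * Assumption: (uintptr_t)pointer behaves like a linear address. */
static void *aligned_malloc(size_t size, size_t align) {
    unsigned char *raw, *base, *p;
    size_t pad;

    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;                        /* not a power of two */
    if (size > SIZE_MAX - align - sizeof(void *))
        return NULL;                        /* size computation would overflow */

    raw = malloc(size + align + sizeof(void *));
    if (raw == NULL)
        return NULL;

    base = raw + sizeof(void *);            /* leave room for the back-pointer */
    pad = (align - ((uintptr_t)base & (align - 1))) & (align - 1);
    p = base + pad;                         /* first aligned address >= base */
    ((void **)p)[-1] = raw;                 /* remember what to free */
    return p;
}

static void aligned_free(void *p) {
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

With align set to the OS page size this wastes up to one page per allocation, which matches the "desired space + 1 OS page" cost mentioned in the quoted text.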
8 years, 10 months
Re: (ITS#8034) lmdb-0.9.14 | Undefined symbols for architecture ppc: "_posix_memalign" referenced from _mdb_env_copyfd2 | ld: symbol(s) not found for architecture ppc
by hyc@symas.com
Hallvard Breien Furuseth wrote:
> On 23/01/15 18:10, hyc(a)symas.com wrote:
>>> LMDB should not be pulled separately from OpenLDAP. I.e., only the
>>> bundled
>>> version of LMDB should be used with a given version of OpenLDAP.
>>
>> I've already had this conversation with the gentoo maintainers; they
>> refuse to listen to reason. It's all their problem now.
>
> Still, memalign() is a problem. mdb.c defines HAVE_MEMALIGN,
> but it may be wrong for the user to -D"HAVE_MEMALIGN" since that
> may omit whatever #include file declares it. <malloc/malloc.h>
> (some Apple stuff I think), <malloc.h> dunno what else.
>
> We could drop memalign. malloc(desired space + 1 OS page), then
> adjust for alignment. On machines with a sane linear address
> space where we can tell alignment from the address, anyway.
Such an address space is already a requirement for LMDB, since mmaps are
page-aligned.
>
> On weirder hosts, if you care about them, omit alignment altogether
> if posix_memalign is missing. And omit O_DIRECT/F_NOCACHE in
> mdb_env_copy2(). I gather those are why we need alignment.
Right.
> Branch "mdb/memalign" in <git://git.uio.no/u/hbf/openldap.git>
> has draft code.
Still don't have time to check this myself, may have an opportunity
tomorrow.
>
> Daniel: You can try that branch, and configure with
> CPPFLAGS="-DMDB_MEMALIGN_METHOD=2".
>
>
> I guess the preprocessor test should be "defined(a test for
> Darwin: __APPLE__, _MACOSX_ or..?)" and not __PPC__. PowerPC is
> an architecture, while features like posix_memalign are defined
> by compilers/operating systems. The macports issue disables mdb
> for Darwin. Or maybe not, I don't know Mac, Darwin, PPC or
> Gentoo, or if you just said this has been resolved, so I'll stay
> out of that issue.
>
Right. I thought you (OP) were talking about Gentoo Linux on PPC, not
MacOSX on PPC. Linux on PPC should be using the same glibc as any other
architecture, and as such ought to already have posix_memalign.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
8 years, 10 months
Re: (ITS#8018) a lot of warnings building with -Wall
by h.b.furuseth@usit.uio.no
On 24/01/15 07:26, leo(a)yuriev.ru wrote:
> Of course, I understand that __VA_ARGS__ are used only for debugging
> currently.
Aha, a misunderstanding there. Debug() is _not_ just for
debugging, it's also for logging. "Rebus" coding, as you say:-(
20 years of cruft in the code. I see you've talked about that in
ITS#8015, so I'll answer the rest there instead.
> But this makes me think that recent development and testing of
> OpenLDAP are carried out only in today's environment (modern Linux and gcc).
Certainly not. _Mainly_ on a few platforms, probably. People
submit bug reports about other platforms, Howard once in a while
says "I tested on <that platform> and did/did not find your
problem", etc. See the commit logs and mailinglists, and the
Build Environment part of CHANGES in REL_ENG: gcc-isms and other
stuff creep in, and get thrown out again.
> Could anyone provide the list of system-compiler pairs with which
> OpenLDAP was tested or just successfully compiled?
Not to my knowledge. We just try to keep the code portable,
within some constraints like we don't support weirdness like
32-bit 'char'.
> For instance, in our fork we plan to limit support to modern systems
> with gcc, clang and the latest msvc compilers.
That's incompatible with our coding practice. We just try to keep
the code portable, within some constraints like we don't support
weirdness like 32-bit 'char' and some religious disagreements
about what portability is. Anyway, if some code isn't portable,
it needs a good excuse for getting in and staying in.
Note that lots of unportable code can be written portably,
or at least it can keep the unportability in one place.
E.g. your repo's change of /bin/sh to bash is not an option
for us. But maybe tests/run.in and tests/scripts/all could
be changed to optionally invoke them with another shell.
--
Hallvard
8 years, 10 months