Fwd: (ITS#5661) contextCSN gets corrupted on the stand by mirror
by ghenry@OpenLDAP.org
For Ticket records. Please keep to openldap-its
----- Forwarded Message -----
From: "ali pouya" <ali.pouya(a)free.fr>
To: ghenry(a)OpenLDAP.org
Sent: Tuesday, 19 August, 2008 1:48:53 PM GMT +00:00 GMT Britain, Ireland, Portugal
Subject: Re: (ITS#5661) contextCSN gets corrupted on the stand by mirror
Hi Gavin,
Below are the answers to your questions:
> Can we get your bdb version, your config and the logs of an empty mirrormode
> node B pulling in the data loaded in mirrormode A (posted/hosted online
> somewhere).
The BDB version is 4.6.21.
Attached is the file conf.tar.gz, which contains the configuration of B.
The file syncrepl.conf.simple works well, but the file syncrepl.conf.double
garbles the contextCSN (I write more than 1000 entries per minute).
Do you want a log for the 10 million entries? At which loglevel?
The problem only happens while there are write operations on A, not when
server A is idle.
>
> Also, has this always happened on the same machine? What are the specs of the
> servers?
The problem happens on the standby server: if I write on B instead, the
contextCSN of A gets corrupted (I have already tested this).
My servers are quad-processor Xeon 2.2 GHz machines.
I think this is not related to the hardware; rather, the "year" part of the
contextCSN may not be well protected against concurrent operations (?).
>
> Is this a fresh install?
Yes for 2.4.11, but I have been using OpenLDAP for 5 years on various projects.
Best Regards
Ali
>
--
Kind Regards,
Gavin Henry.
OpenLDAP Engineering Team.
E ghenry(a)OpenLDAP.org
Community developed LDAP software.
http://www.openldap.org/project/
[Attachment: conf.tar.gz (application/x-gzip)]
Re: (ITS#5661) contextCSN gets corrupted on the stand by mirror
by ghenry@OpenLDAP.org
> I think there is a documentation issue for OpenLDAP 2.4.11:
> Chapter 17.4.4 of the Admin Guide recommends configuring TWO syncrepl
> directives for each mirror side. If I do so, the contextCSN of the standby
> mirror gets corrupted very easily. But if I configure the mirrors with only ONE
> syncrepl directive it's OK.
The documentation is correct.
> The test environment:
> I have a test directory with two mirrors A (sid=1) and B (sid=2) configured as
> recommended in the Admin Guide, and a replica C connected to A.
> The directory contains 10 million objects, and I use server A for writing
> 500 000 new ones.
>
> Very often and without any apparent reason the contextCSN in the memory of B
> suddenly gets corrupted while those of A and C are OK.
> In this situation the contextCSN of B gets stuck but B continues to receive data
> from A.
>
> The value of contextCSN in base 64 is:
>
> contextCSN: 20080727021429.070493Z#000000#000#000000
> contextCSN:: +HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA==
perl -MMIME::Base64 -e 'print decode_base64("+HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA=="), "\n";'
does look very funny :-(
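For readers without Perl at hand, the same decoding can be sketched in Python (the choice of Python is an assumption of this note; the thread itself only uses the Perl one-liner above). Splitting the decoded bytes after the first four makes it easy to see which part is no longer printable ASCII.

```python
import base64

# Base64 form of the second contextCSN value, copied from the report.
encoded = "+HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA=="

raw = base64.b64decode(encoded)

# The year field of a CSN timestamp is 4 characters long, so split there.
year_part, rest = raw[:4], raw[4:]

# Show the year bytes as hex because they are not printable text.
print("year bytes:", " ".join(f"{b:02x}" for b in year_part))
print("remainder :", rest.decode("ascii"))
```

The year bytes print as `f8 76 03 09`, while the remainder is the readable tail `0802033718.300111Z#000000#001#000000`.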
Can we get your bdb version, your config and the logs of an empty mirrormode
node B pulling in the data loaded in mirrormode A (posted/hosted online somewhere).
Also, has this always happened on the same machine? What are the specs of the servers?
Is this a fresh install?
--
Kind Regards,
Gavin Henry.
T +44 (0) 1224 279484
M +44 (0) 7930 323266
F +44 (0) 1224 824887
E ghenry(a)suretecsystems.com
Open Source. Open Solutions(tm).
http://www.suretecsystems.com/
(ITS#5661) contextCSN gets corrupted on the stand by mirror
by ali.pouya@free.fr
Full_Name: Ali Pouya
Version: 2.4.11
OS: Linux 2.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (145.242.11.4)
I think there is a documentation issue for OpenLDAP 2.4.11:
Chapter 17.4.4 of the Admin Guide recommends configuring TWO syncrepl
directives for each mirror side. If I do so, the contextCSN of the standby
mirror gets corrupted very easily. But if I configure the mirrors with only ONE
syncrepl directive it's OK.
The test environment:
I have a test directory with two mirrors A (sid=1) and B (sid=2) configured as
recommended in the Admin Guide, and a replica C connected to A.
The directory contains 10 million objects, and I use the server A for writing
500 000 new ones.
Very often and without any apparent reason the contextCSN in the memory of B
gets suddenly corrupted while those of A and C are OK.
In this situation the contextCSN of B gets stuck but B continues to receive data
from A.
The values of contextCSN on B (the second one is shown in base 64) are:
contextCSN: 20080727021429.070493Z#000000#000#000000
contextCSN:: +HYDCTA4MDIwMzM3MTguMzAwMTExWiMwMDAwMDAjMDAxIzAwMDAwMA==
I note that only the part indicating the year (2008) is garbled. Maybe this
part is handled differently?
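The observation that only the year is damaged can be checked mechanically. The sketch below compares a raw value byte by byte against a per-position template. The template is an assumption inferred from the well-formed value shown above, not taken from the OpenLDAP sources, and the function name `bad_offsets` is hypothetical.

```python
import string

# Template inferred from "20080727021429.070493Z#000000#000#000000"
# (assumption): D = decimal digit, H = hex digit, other characters literal.
TEMPLATE = "D" * 14 + "." + "D" * 6 + "Z#" + "H" * 6 + "#" + "H" * 3 + "#" + "H" * 6

def bad_offsets(csn):
    """Return the byte offsets at which csn (bytes) deviates from TEMPLATE."""
    if len(csn) != len(TEMPLATE):
        # Wrong length: report every position as suspect.
        return list(range(max(len(csn), len(TEMPLATE))))
    bad = []
    for i, (byte, kind) in enumerate(zip(csn, TEMPLATE)):
        ch = chr(byte)
        if kind == "D":
            ok = ch in string.digits
        elif kind == "H":
            ok = ch in string.hexdigits
        else:
            ok = ch == kind
        if not ok:
            bad.append(i)
    return bad

good = b"20080727021429.070493Z#000000#000#000000"
# Raw bytes of the base64 value above: four non-ASCII-digit bytes, then text.
corrupt = bytes.fromhex("f8760309") + b"0802033718.300111Z#000000#001#000000"

print(bad_offsets(good))
print(bad_offsets(corrupt))
```

For the corrupt value, only offsets 0 to 3 are reported, which is exactly the year field.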
At service shutdown B writes the corrupt contextCSN to the disk.
At service startup B reads the corrupt contextCSN from the disk and begins to
scan ALL of the database.
It also sends a sync request to A (a persistent search containing the corrupt
contextCSN in the control field), causing A to scan the WHOLE database.
The replica C remains safe.
If I reverse the roles of A and B the corruption occurs on A (always on the
standby mirror).
I have already encountered the contextCSN corruption problem in OpenLDAP 2.3 and
this was one of my reasons to migrate to 2.4.11.
Thanks for your HELP
Best Regards
Ali Pouya
(ITS#5659) patch for collect overlay
by brett.maxfield@gmail.com
Full_Name: Brett Maxfield
Version: HEAD
OS: Linux suse 2.6.22.18-0.2-xen #1 SMP 2008-06-09 13:53:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
URL: ftp://ftp.openldap.org/incoming/brett-maxfield-080817.tgz
Submission from: (NULL) (220.245.180.135)
Fixes a bug in the collect overlay where it would also collect attributes into
the entry that supplies those attributes, causing duplicate attributes in that
entry.
Also adds an alternate config format for easier entry, e.g. the old format:
overlay collect
collectinfo <dn> <attribute>
is still supported, and in addition the following is now accepted:
overlay collect
collectinfo <dn> <attribute1>,<attribute2>,<attributeN>
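To illustrate the two accepted forms, here is a minimal Python sketch that splits a `collectinfo` line into a DN and a list of attributes. It is not the code from the patch (which is C inside slapd); the function name `parse_collectinfo`, the use of `shlex` for quoting, and the example DN and attribute names are all assumptions made for illustration.

```python
import shlex

def parse_collectinfo(line):
    """Split 'collectinfo <dn> <attr>[,<attr>...]' into (dn, [attrs])."""
    # shlex handles a quoted DN; this quoting behaviour is an assumption
    # of the sketch, not a statement about slapd's own config parser.
    words = shlex.split(line)
    if len(words) != 3 or words[0] != "collectinfo":
        raise ValueError("expected: collectinfo <dn> <attr>[,<attr>...]")
    dn = words[1]
    # A single attribute (old format) is just a list of length one.
    attrs = [a.strip() for a in words[2].split(",") if a.strip()]
    if not attrs:
        raise ValueError("no attributes given")
    return dn, attrs

# Hypothetical example values.
print(parse_collectinfo('collectinfo "ou=people,dc=example,dc=com" mail'))
print(parse_collectinfo('collectinfo "ou=people,dc=example,dc=com" mail,telephoneNumber'))
```

Treating the old single-attribute form as a one-element list means both formats can share the same code path afterwards.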
Re: (ITS#5658) slapo-perl symbol updates
by rra@stanford.edu
rra(a)stanford.edu writes:
> It's a general change to the libperl API and may not only affect HPPA.
> You're apparently now required to call those macros, and in the future
> they may have more effects.
Here is a patch, untested alas because I don't have time at the moment to
set up something with back-perl. It compiles. I'm not fully sure that
the handling of embedded and argc is correct.
Perl 5.10 (and some earlier versions) require calling some additional
macros around Perl interpreter setup and shutdown. Not doing these
calls causes problems on HPPA at least, and may affect other platforms
in the future. This patch adds the additional code modelled on the
perlembed man page and a working patch for INN.
Debian Bug#495069
ITS #5658
--- openldap.orig/servers/slapd/back-perl/close.c
+++ openldap/servers/slapd/back-perl/close.c
@@ -30,6 +30,9 @@
{
perl_destruct(PERL_INTERPRETER);
perl_free(PERL_INTERPRETER);
+#ifdef PERL_SYS_TERM
+ PERL_SYS_TERM();
+#endif
PERL_INTERPRETER = NULL;
ldap_pvt_thread_mutex_destroy( &perl_interpreter_mutex );
--- openldap.orig/servers/slapd/back-perl/init.c
+++ openldap/servers/slapd/back-perl/init.c
@@ -37,6 +37,7 @@
)
{
char *embedding[] = { "", "-e", "0" };
+ int argc = 3;
bi->bi_open = NULL;
bi->bi_config = 0;
@@ -77,9 +78,15 @@
ldap_pvt_thread_mutex_init( &perl_interpreter_mutex );
+#ifdef PERL_SYS_INIT3
+ PERL_SYS_INIT3(&argc, &embedding, (char **)NULL);
+#endif
PERL_INTERPRETER = perl_alloc();
perl_construct(PERL_INTERPRETER);
- perl_parse(PERL_INTERPRETER, perl_back_xs_init, 3, embedding, (char **)NULL);
+#ifdef PERL_EXIT_DESTRUCT_END
+ PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
+#endif
+ perl_parse(PERL_INTERPRETER, perl_back_xs_init, argc, embedding, (char **)NULL);
perl_run(PERL_INTERPRETER);
return 0;
}
--
Russ Allbery (rra(a)stanford.edu) <http://www.eyrie.org/~eagle/>
Re: (ITS#5658) slapo-perl symbol updates
by rra@stanford.edu
quanah(a)OpenLDAP.org writes:
> Full_Name: Quanah Gibson-Mount
> Version: 2.4.11
> OS: NA
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (69.109.79.65)
>
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=495069
>
> Not sure how much we care about hppa architecture.
It's a general change to the libperl API and may not only affect HPPA.
You're apparently now required to call those macros, and in the future
they may have more effects.
What we did for INN was add:
#ifdef PERL_SYS_INIT3
PERL_SYS_INIT3(&argc, &argv, &env);
#endif
before perl_alloc,
#ifdef PERL_EXIT_DESTRUCT_END
PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
#endif
after perl_construct, and:
#ifdef PERL_SYS_TERM
PERL_SYS_TERM();
#endif
after perl_free. See man perlembed; with Perl 5.10, its example now
reads:
int main(int argc, char **argv, char **env)
{
PERL_SYS_INIT3(&argc,&argv,&env);
my_perl = perl_alloc();
perl_construct(my_perl);
PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
perl_parse(my_perl, NULL, argc, argv, (char **)NULL);
perl_run(my_perl);
perl_destruct(my_perl);
perl_free(my_perl);
PERL_SYS_TERM();
}
Notice that we don't use the "env" pointer. Normally handed to
"perl_parse" as its final argument, "env" here is replaced by
"NULL", which means that the current environment will be used. The
macros PERL_SYS_INIT3() and PERL_SYS_TERM() provide system-specific
tune up of the C runtime environment necessary to run Perl
interpreters; since PERL_SYS_INIT3() may change "env", it may be
more appropriate to provide "env" as an argument to perl_parse().
--
Russ Allbery (rra(a)stanford.edu) <http://www.eyrie.org/~eagle/>
(ITS#5658) slapo-perl symbol updates
by quanah@OpenLDAP.org
Full_Name: Quanah Gibson-Mount
Version: 2.4.11
OS: NA
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (69.109.79.65)
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=495069
Not sure how much we care about hppa architecture.
As described in the 'perlembed' document, programs embedding Perl
must use the PERL_SYS_INIT3() and PERL_SYS_TERM() macros to provide
system-specific tune up of the C runtime environment necessary to run
Perl interpreters.
Your package has been identified as failing this:
- at least one of the binary packages built from the source depends
on libperl5.10
- the unpacked source matches 'perl_parse' but not 'PERL_SYS_INIT3'
As a consequence, the embedded Perl interpreter is most probably
broken on the hppa architecture, where PERL_SYS_INIT3() is needed for
initializing lock structures. Without this, calling perl_parse() will
hang inside pthread_mutex_lock(). See #486069 for more information.
(ITS#5657) Broken symlinks in /etc/ssl/certs prevent ssl connection
by Martin.vGagern@gmx.net
Full_Name: Martin von Gagern
Version: 2.3.43
OS: Gentoo Linux
URL: https://bugs.gentoo.org/show_bug.cgi?id=234816
Submission from: (NULL) (84.188.31.84)
On systems with broken symlinks in the default certificate directory of openssl,
I found it impossible to establish an ldaps connection with openldap.
I originally reported this issue as
https://bugs.gentoo.org/show_bug.cgi?id=234816
I first encountered this when luma reported that I could not connect to
a server. I then reproduced this on the command line:
$ ldapsearch -H ldaps://ldap.domain.tld
ldap_sasl_interactive_bind_s: Can't contact LDAP server (-1)
$ ldapsearch -d 15 -H ldaps://ldap.domain.tld
...
ldap_connect_timeout: fd: 3 tm: -1 async: 0
TLS: could not load client CA list (file:`',dir:`/etc/ssl/certs/').
TLS: error:0906D06C:PEM routines:PEM_read_bio:no start line pem_lib.c:647
... previous line repeated 13 times ...
TLS: error:02001002:system library:fopen:No such file or directory bss_file.c:356
TLS: error:20074002:BIO routines:FILE_CTRL:system lib bss_file.c:358
ldap_perror
ldap_sasl_interactive_bind_s: Can't contact LDAP server (-1)
Sadly it does not say which file can't be opened. The "no start line" errors seem
to be non-fatal. The fopen error in bss_file.c from dev-libs/openssl, on the
other hand, seems to cause the connection to fail.
The root cause seemed to be several broken symlinks in /etc/ssl/certs.
Steps to solve the issue by removing those links:
# find /etc/ssl/certs -type l ! -xtype f ! -xtype d -ok rm -f {} \;
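Before deleting anything, it can be useful to list the offending links first. The following Python sketch is a read-only approximation of the find command above. It is an illustration rather than an exact equivalent: it does not recurse into subdirectories and it only reports links whose target does not exist. The function name `dangling_symlinks` is hypothetical.

```python
import os

def dangling_symlinks(directory):
    """Return sorted paths of symlinks in directory whose target is missing."""
    result = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # islink() looks at the link itself, exists() follows the link,
        # so a link that exists but resolves to nothing is dangling.
        if os.path.islink(path) and not os.path.exists(path):
            result.append(path)
    return sorted(result)
```

Calling `dangling_symlinks("/etc/ssl/certs")` returns the paths that the find command would offer to remove in the simple case of links pointing at nonexistent files.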
Steps to reproduce the issue, provided you have an SSL-Enabled LDAP server:
# ln -s bar.pem /etc/ssl/certs/foo.pem
$ ldapsearch -d 15 -H ldaps://ldap.domain.tld
Ideally openldap should treat these errors as non-fatal, the way s_client does,
so that such a broken symlink won't prevent all connections.
If that approach proves infeasible (as the openssl function
SSL_add_dir_cert_subjects_to_stack seems to behave this way and adding all files
manually might be a pain), at least there should be some error message telling
the user that the problem lies in the local SSL certificates setup, not really
in the network connection or remote configuration.
By the way, I'm using openssl version 0.9.8h.