ando(a)sys-net.it wrote:
> emmanuel.duru(a)atosorigin.com wrote:
>> I see that tm->tm_usub is negative; there seem to be overflows between
>> LARGE_INTEGER and int variables.
It would take over 4 billion operations in a single timer tick (on the order
of nanoseconds) to make tm_usub overflow. That seems pretty unlikely.
> If the problem disappears by initializing the static variables in
> lutil_gettime(), then it might be a compiler issue.
I suppose that's always possible...
The original post shows that the tm_usec field is negative. That could happen
if the offset we computed between the SYSTEMTIME and the PerformanceCounter
was wrong, or if the SYSTEMTIME was adjusted while the process was running.
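As a rough illustration of why a wrong offset shows up as a negative microsecond field: if the fractional part is obtained by subtraction, it can fall outside 0..999999 and needs to be folded back into the seconds. The sketch below is only an assumption about the general shape of such a fix; `split_time` and `normalize_usec` are hypothetical names and are not taken from lutil_gettime().

```c
#include <assert.h>

/* Hypothetical type and helper for illustration; not OpenLDAP code. */
struct split_time {
    long long sec;   /* whole seconds */
    long long usec;  /* microseconds, 0..999999 after normalization */
};

/* Fold an out-of-range (possibly negative) microsecond value into seconds. */
static struct split_time normalize_usec(long long sec, long long usec)
{
    struct split_time t;

    sec += usec / 1000000;   /* C99 division truncates toward zero */
    usec %= 1000000;         /* remainder keeps the sign of usec */
    if (usec < 0) {          /* borrow one second so usec is non-negative */
        usec += 1000000;
        sec -= 1;
    }
    assert(usec >= 0 && usec < 1000000);

    t.sec = sec;
    t.usec = usec;
    return t;
}
```

For example, `normalize_usec(100, -657205)` yields 99 seconds and 342795 microseconds.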
What version of Windows are you running? 32 or 64 bit? Can you single-step
through this function with a debugger and verify all of the values? I haven't
run this code on Windows in a long time; it would take a bit of effort to
resurrect my build environment.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
I'm sorry, I thought this was an initialization issue, but it is not;
Howard Chu is right.
After some debugging, it is the tm_usec part that is negative (not
tm_usub, my mistake); tm_usec is computed from the large integers.
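To show how a negative tm_usec ends up as the minus sign in the reported entryCSN, here is a small sketch. The format string only mimics the layout of the value quoted in the report; `format_csn_time` and `csn_fraction_ok` are hypothetical helpers, and the real slapd code may differ.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter mimicking the CSN layout seen in the report. */
static int format_csn_time(char *buf, size_t len,
                           int year, int mon, int day,
                           int hour, int min, int sec, int usec)
{
    /* "%06d" pads to six characters but still prints the sign, so a
     * negative usec yields seven characters including '-'. */
    return snprintf(buf, len,
                    "%04d%02d%02d%02d%02d%02d.%06dZ#%06x#%03x#%06x",
                    year, mon, day, hour, min, sec, usec, 0u, 0u, 0u);
}

/* Return 1 if the text after '.' is exactly six digits followed by 'Z'. */
static int csn_fraction_ok(const char *csn)
{
    const char *p = strchr(csn, '.');
    int i;

    if (p == NULL)
        return 0;
    for (i = 1; i <= 6; i++) {
        if (!isdigit((unsigned char)p[i]))
            return 0;   /* also stops at an early end of string */
    }
    return p[7] == 'Z';
}
```

Formatting 2008-08-22 12:41:30 with usec = -657205 reproduces the string from the report, `20080822124130.-657205Z#000000#000#000000`, which `csn_fraction_ok` rejects.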
> -----Original Message-----
> From: Pierangelo Masarati [mailto:ando@sys-net.it]
> Sent: Wednesday, 27 August 2008 12:23
> To: emmanuel.duru(a)atosorigin.com
> Cc: openldap-its(a)openldap.org
> Subject: Re: (ITS#5668) Invalid entryCSN generated, and slapd will not
> restart
>
> emmanuel.duru(a)atosorigin.com wrote:
> > I see that tm->tm_usub is negative; there seem to be overflows between
> > LARGE_INTEGER and int variables.
>
> If the problem disappears by initializing the static variables in
> lutil_gettime(), then it might be a compiler issue.
>
> p.
>
>
> Ing. Pierangelo Masarati
> OpenLDAP Core Team
>
> SysNet s.r.l.
> via Dossi, 8 - 27100 Pavia - ITALIA
> http://www.sys-net.it
> -----------------------------------
> Office: +39 02 23998309
> Mobile: +39 333 4963172
> Fax: +39 0382 476497
> Email: ando(a)sys-net.it
> -----------------------------------
emmanuel.duru(a)atosorigin.com wrote:
> I see that tm->tm_usub is negative; there seem to be overflows between
> LARGE_INTEGER and int variables.
If the problem disappears by initializing the static variables in
lutil_gettime(), then it might be a compiler issue.
p.
Ing. Pierangelo Masarati
OpenLDAP Core Team
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
-----------------------------------
Office: +39 02 23998309
Mobile: +39 333 4963172
Fax: +39 0382 476497
Email: ando(a)sys-net.it
-----------------------------------
Fixed in HEAD.
Thanks.
--
Kind Regards,
Gavin Henry.
OpenLDAP Engineering Team.
E ghenry(a)OpenLDAP.org
Community developed LDAP software.
http://www.openldap.org/project/
emmanuel.duru(a)atosorigin.com wrote:
> Full_Name: Emmanuel Duru
> Version: 2.4.11
> OS: Windows
> URL:
> Submission from: (NULL) (80.78.0.137)
>
>
> On Windows, slapd generates entryCSN values such as:
> 20080822124130.-657205Z#000000#000#000000 (the problem is the minus sign).
> This is also the case when generating the cn=config branch from a slapd.conf
> file.
> As a result, slapd will not restart, because it checks the validity of
> entryCSN values in the cn=config branch at startup.
> I believe that the problem comes from uninitialized static variables in the
> lutil_gettime() function.
> Note that initializations are also missing in the non-WIN32 section.
No. The C standard specifies that global and static variables are initialized
to zero by default. If you're a C programmer you should know this already.
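The zero-initialization rule can be checked with a few lines of standard C. This is a generic sketch with made-up variable names, not the statics from lutil_gettime():

```c
#include <stddef.h>

/* File-scope objects with static storage duration and no initializer. */
static long file_scope_counter;
static void *file_scope_ptr;

/* A block-scope static is zero before the first call and keeps its value. */
static int next_call_number(void)
{
    static int calls;
    return ++calls;
}

/* Arithmetic types start at 0 and pointers start as null pointers. */
static int statics_start_at_zero(void)
{
    static long long offset;
    static double scale;
    static char *name;

    return offset == 0 && scale == 0.0 && name == NULL;
}
```

The first call to `next_call_number()` returns 1 and the second returns 2, without any explicit initializer.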
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Howard Chu wrote:
> ali.pouya(a)free.fr wrote:
>> I should point out that I have set the following directives (I have no delete
>> operations in my directory and I want to avoid triggering the present phase):
>>
>> syncprov-nopresent TRUE
>> syncprov-reloadhint TRUE
>>
>> If I comment out at least one of these directives then the problem disappears
>> (the object o1 is present in the replica).
>
> You're tripping over a behavior change that was made to fix ITS#5493. For now,
> you should only use both of those settings together if the underlying database
> is an accesslog DB (which always returns entries in modification order). We
> should probably use some other method of detecting the accesslog...
In fact, just drop the syncprov-reloadhint setting and you'll be fine. (Since
you never have delete operations, reloads will never occur anyway.)
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
ali.pouya(a)free.fr wrote:
> Full_Name: Ali Pouya
> Version: 2.4.11
> OS: Linux 2.6 (Fedora)
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (145.242.11.3)
>
>
> I have a directory with a master and one replica in RefreshAndPersist mode.
>
> The replica is synchronized with the master. I stop it to make a backup.
> During that time I perform three write operations on the master:
>
> I add a new object o1,
> then I modify an already existing object o2,
> and finally I add another new object o3.
>
> After service startup, the replica gets synchronized with the master, and the
> contextCSN attributes are the same on both servers.
> But the object o1 is missing from the replica!
>
> More investigation shows that the sync provider sends objects to the consumer in
> createTimestamp order.
> In other words, the sync information is sent in this order: o2, then o1, then
> o3.
>
> After getting o2, the consumer rejects o1 (which now has a smaller entryCSN)
> with this message in the log:
>
> ....
> do_syncrep2: cookie=rid=002,csn=20080822130259.472005Z#000000#001#000000
> do_syncrep2: rid=002 CSN too old, ignoring
> 20080822130259.472005Z#000000#001#000000
> ldap_msgfree
> ...
>
> I think sync data should be sent in entryCSN order rather than in
> createTimestamp order.
That is not possible, nor is it supposed to be necessary with the way the
syncrepl protocol was designed. During a refresh (which occurs at server
startup time, at least) the consumer's context is not updated until all of the
modified entries have been received. So, this particular comparison should
never fail like this.
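A toy model may make the difference concrete. It assumes that fixed-width CSN strings can be ordered with a plain string comparison and that the consumer ignores any entry whose CSN is not newer than its context; `count_accepted` and the CSN values in the usage note are hypothetical and do not come from the syncrepl source.

```c
#include <string.h>

/*
 * Hypothetical model of a consumer receiving entries in the given order.
 * update_per_entry != 0: the context advances after every accepted entry.
 * update_per_entry == 0: the context stays at start_csn until the end of
 *                        the refresh, as described above.
 * Returns the number of entries accepted.
 */
static int count_accepted(const char *const csns[], int n,
                          const char *start_csn, int update_per_entry)
{
    const char *context = start_csn;  /* CSN that incoming entries must beat */
    int accepted = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (strcmp(csns[i], context) <= 0)
            continue;                 /* "CSN too old, ignoring" */
        accepted++;
        if (update_per_entry)
            context = csns[i];        /* newer than context, so it is the max */
    }
    return accepted;
}
```

With entries sent in the order o2, o1, o3 (where o1 has the smallest CSN), the per-entry variant accepts only two entries and drops o1, while the deferred variant accepts all three.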
> I should point out that I have set the following directives (I have no delete
> operations in my directory and I want to avoid triggering the present phase):
>
> syncprov-nopresent TRUE
> syncprov-reloadhint TRUE
>
> If I comment out at least one of these directives then the problem disappears
> (the object o1 is present in the replica).
You're tripping over a behavior change that was made to fix ITS#5493. For now,
you should only use both of those settings together if the underlying database
is an accesslog DB (which always returns entries in modification order). We
should probably use some other method of detecting the accesslog...
> The environment and configuration files are the same as for ITS 5661.
> Of course I can provide any other information required.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Full_Name: Ali Pouya
Version: 2.4.11
OS: Linux 2.6 (Fedora)
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (145.242.11.3)
I have a directory with a master and one replica in RefreshAndPersist mode.
The replica is synchronized with the master. I stop it to make a backup.
During that time I perform three write operations on the master:
I add a new object o1,
then I modify an already existing object o2,
and finally I add another new object o3.
After service startup, the replica gets synchronized with the master, and the
contextCSN attributes are the same on both servers.
But the object o1 is missing from the replica!
More investigation shows that the sync provider sends objects to the consumer in
createTimestamp order.
In other words, the sync information is sent in this order: o2, then o1, then
o3.
After getting o2, the consumer rejects o1 (which now has a smaller entryCSN)
with this message in the log:
....
do_syncrep2: cookie=rid=002,csn=20080822130259.472005Z#000000#001#000000
do_syncrep2: rid=002 CSN too old, ignoring
20080822130259.472005Z#000000#001#000000
ldap_msgfree
...
I think sync data should be sent in entryCSN order rather than in
createTimestamp order.
I should point out that I have set the following directives (I have no delete
operations in my directory and I want to avoid triggering the present phase):
syncprov-nopresent TRUE
syncprov-reloadhint TRUE
If I comment out at least one of these directives then the problem disappears
(the object o1 is present in the replica).
The environment and configuration files are the same as for ITS 5661.
Of course I can provide any other information required.
Best Regards
Ali Pouya