Q: "dn: cn=Uptime,cn=Time,cn=Monitor" non-monotonic occasionally?

List overview All Threads
Download

newer

older

Re: Export issue after an DN change

ACLs: restrict by IP and user

Ulrich Windl

28 Apr 2016 28 Apr '16

1:48 a.m.

Hi!

I'm running a NRPE monitor that monitors openLDAP. To compute delta values, the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time difference). If the time difference is negative, the plugin reports a "server restart" event.

So far, so good...

However, very rarely (once every few weeks) a false server restart is reported. Viewing the historic logs, I can only assume that openLDAP reports a bad uptime value now and then. Could this be?

Example plugin logs: 2016/04/28 10:25:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.246], [3 searches with 12 entries in 0.057s], 2340577, 0.10084, 0.07563, 0.35294, 0, 0, 0, 0, 0, 0, 0.08403, 0.61345 2016/04/28 10:23:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.288], [3 searches with 12 entries in 0.052s], 2340458, 0.09917, 0.08264, 0.32231, 0, 0, 0, 0, 0, 0, 0.08264, 0.58678 2016/04/28 10:21:14 LDAP: Operations Rate UNKNOWN HARD 1 UNKNOWN: [Lock.Delay=0.221], [3 searches with 12 entries in 0.052s], 2340337, [Operations.Bind.i: restart detected (2340337 - 2340341 == -4)], 284553, [Operations.Unbind.i: restart detected (2340337 - 2340341 == -4)], 205927, [Operations.Search.i: restart detected (2340337 - 2340341 == -4)], 985106, [Operations.Compare.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modify.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modrdn.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Add.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Delete.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Abandon.i: restart detected (2340337 - 2340341 == -4)], 4, [Operations.Extended.i: restart detected (2340337 - 2340341 == -4)], 245199, [Operations.i: restart detected (2340337 - 2340341 == -4)], 1720790 2016/04/28 10:19:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.501], [3 searches with 12 entries in 0.060s], 2340341, 0.13333, 0.09167, 0.65000, 0, 0, 0, 0, 0, 0, 0.11667, 0.99167

(The first number after the messages in square brackets is the uptime, then come the operations rate for several LDAP operations) At 10:19:14 the uptime was 2340341, and at 10:25:14 the uptime was 2340577. That could be OK, but at 10:21:14 the uptime was reported as 2340337 which is less than reported at 10:19:14.

Has there ever been such an issue with uptime?

Regards, Ulrich

Show replies by date

Michael Ströder

28 Apr 28 Apr

10:26 a.m.

New subject: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

Ulrich Windl wrote:

...

I'm running a NRPE monitor that monitors openLDAP. To compute delta values, the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time difference). If the time difference is negative, the plugin reports a "server restart" event.

Uptime is the least usable time value for this task. Anyway...

...

So far, so good...

However, very rarely (once every few weeks) a false server restart is reported. Viewing the historic logs, I can only assume that openLDAP reports a bad uptime value now and then. Could this be?

Example plugin logs: 2016/04/28 10:25:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.246], [3 searches with 12 entries in 0.057s], 2340577, 0.10084, 0.07563, 0.35294, 0, 0, 0, 0, 0, 0, 0.08403, 0.61345 2016/04/28 10:23:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.288], [3 searches with 12 entries in 0.052s], 2340458, 0.09917, 0.08264, 0.32231, 0, 0, 0, 0, 0, 0, 0.08264, 0.58678 2016/04/28 10:21:14 LDAP: Operations Rate UNKNOWN HARD 1 UNKNOWN: [Lock.Delay=0.221], [3 searches with 12 entries in 0.052s], 2340337, [Operations.Bind.i: restart detected (2340337 - 2340341 == -4)], 284553, [Operations.Unbind.i: restart detected (2340337 - 2340341 == -4)], 205927, [Operations.Search.i: restart detected (2340337 - 2340341 == -4)], 985106, [Operations.Compare.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modify.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modrdn.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Add.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Delete.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Abandon.i: restart detected (2340337 - 2340341 == -4)], 4, [Operations.Extended.i: restart detected (2340337 - 2340341 == -4)], 245199, [Operations.i: restart detected (2340337 - 2340341 == -4)], 1720790 2016/04/28 10:19:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.501], [3 searches with 12 entries in 0.060s], 2340341, 0.13333, 0.09167, 0.65000, 0, 0, 0, 0, 0, 0, 0.11667, 0.99167

(The first number after the messages in square brackets is the uptime, then come the operations rate for several LDAP operations) At 10:19:14 the uptime was 2340341, and at 10:25:14 the uptime was 2340577. That could be OK, but at 10:21:14 the uptime was reported as 2340337 which is less than reported at 10:19:14.

Has there ever been such an issue with uptime?

I don't want to preclude anything but I personally never experienced an issue with the uptime value. Especially it's way more likely that something in your system environment caused slapd to restart. Only you can find out.

Ciao, Michael.

Howard Chu

10:54 a.m.

New subject: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

Michael Ströder wrote:

...

Ulrich Windl wrote:

...
I'm running a NRPE monitor that monitors openLDAP. To compute delta values, the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time difference). If the time difference is negative, the plugin reports a "server restart" event.

Uptime is the least usable time value for this task. Anyway...

...
So far, so good...

However, very rarely (once every few weeks) a false server restart is reported. Viewing the historic logs, I can only assume that openLDAP reports a bad uptime value now and then. Could this be?

Example plugin logs: 2016/04/28 10:25:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.246], [3 searches with 12 entries in 0.057s], 2340577, 0.10084, 0.07563, 0.35294, 0, 0, 0, 0, 0, 0, 0.08403, 0.61345 2016/04/28 10:23:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.288], [3 searches with 12 entries in 0.052s], 2340458, 0.09917, 0.08264, 0.32231, 0, 0, 0, 0, 0, 0, 0.08264, 0.58678 2016/04/28 10:21:14 LDAP: Operations Rate UNKNOWN HARD 1 UNKNOWN: [Lock.Delay=0.221], [3 searches with 12 entries in 0.052s], 2340337, [Operations.Bind.i: restart detected (2340337 - 2340341 == -4)], 284553, [Operations.Unbind.i: restart detected (2340337 - 2340341 == -4)], 205927, [Operations.Search.i: restart detected (2340337 - 2340341 == -4)], 985106, [Operations.Compare.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modify.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modrdn.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Add.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Delete.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Abandon.i: restart detected (2340337 - 2340341 == -4)], 4, [Operations.Extended.i: restart detected (2340337 - 2340341 == -4)], 245199, [Operations.i: restart detected (2340337 - 2340341 == -4)], 1720790 2016/04/28 10:19:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.501], [3 searches with 12 entries in 0.060s], 2340341, 0.13333, 0.09167, 0.65000, 0, 0, 0, 0, 0, 0, 0.11667, 0.99167

(The first number after the messages in square brackets is the uptime, then come the operations rate for several LDAP operations) At 10:19:14 the uptime was 2340341, and at 10:25:14 the uptime was 2340577. That could be OK, but at 10:21:14 the uptime was reported as 2340337 which is less than reported at 10:19:14.

Has there ever been such an issue with uptime?

I don't want to preclude anything but I personally never experienced an issue with the uptime value. Especially it's way more likely that something in your system environment caused slapd to restart. Only you can find out.

Time values are based on the time(3) subroutine, which can also be affected by anyone explicitly resetting the system clock. Of course, such an event would also be visible on regular syslog timestamps etc.

-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/

Michael Ströder

11:37 a.m.

New subject: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

Howard Chu wrote:

...

Time values are based on the time(3) subroutine, which can also be affected by anyone explicitly resetting the system clock. Of course, such an event would also be visible on regular syslog timestamps etc.

If someone manages to muck with this on a production system one has likely more serious issues than a monitoring check. ;-)

Ciao, Michael.

Ulrich Windl

29 Apr 29 Apr

12:41 a.m.

New subject: Antw: Re: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

...

...
...
Howard Chu hyc@symas.com schrieb am 28.04.2016 um 19:54 in Nachricht

57224E67.7080507@symas.com:

...

Time values are based on the time(3) subroutine, which can also be affected by anyone explicitly resetting the system clock. Of course, such an event would

also be visible on regular syslog timestamps etc.

Is ldapd's uptime using time() minus some fixed value? Why not using the POSIX CLOCK_MONOTONIC value instead.

Actually you gave a clue: the slapd is running in a VM, and the VM had been live-migrated. Maybe the glitch originated from there.

Regards, Ulrich

...

-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/

Michael Ströder

1:01 a.m.

New subject: Antw: Re: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

Ulrich Windl wrote:

...

Actually you gave a clue: the slapd is running in a VM, and the VM had been live-migrated. Maybe the glitch originated from there.

VMs and time synchronisation are often an issue. I expected it to be something like this. Note that if you setup MMR this can cause some trouble.

Ciao, Michael.

Ulrich Windl

12:37 a.m.

New subject: Antw: Re: Q: "dn: cn=Uptime,cn=Time,cn=Monitor" non-monotonic occasionally?

...

...
...
Michael Ströder michael@stroeder.com schrieb am 28.04.2016 um 19:26 in

Nachricht 572247C0.9030300@stroeder.com:

...

Ulrich Windl wrote:

...
I'm running a NRPE monitor that monitors openLDAP. To compute delta

values,

...

...
the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time

difference).

...
If the time difference is negative, the plugin reports a "server restart" event.

Uptime is the least usable time value for this task. Anyway...

What would you recommend instead?

...

...
So far, so good...

However, very rarely (once every few weeks) a false server restart is

reported. Viewing the historic logs, I can only assume that openLDAP reports

...

a bad uptime value now and then. Could this be?

...
Example plugin logs: 2016/04/28 10:25:14 LDAP: Operations

Rate OK HARD 1 OK: [Lock.Delay=0.246], [3

...

searches with 12 entries in 0.057s], 2340577, 0.10084, 0.07563, 0.35294, 0,

...

0, 0, 0, 0, 0, 0.08403, 0.61345

...
2016/04/28 10:23:14 LDAP: Operations

Rate OK HARD 1 OK: [Lock.Delay=0.288], [3

...

searches with 12 entries in 0.052s], 2340458, 0.09917, 0.08264, 0.32231, 0,

...

0, 0, 0, 0, 0, 0.08264, 0.58678

...
2016/04/28 10:21:14 LDAP: Operations

Rate UNKNOWN HARD 1 UNKNOWN:

...

[Lock.Delay=0.221], [3 searches with 12 entries in 0.052s], 2340337, [Operations.Bind.i: restart detected (2340337 - 2340341 == -4)], 284553, [Operations.Unbind.i: restart detected (2340337 - 2340341 == -4)], 205927, [Operations.Search.i: restart detected (2340337 - 2340341 == -4)], 985106, [Operations.Compare.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modify.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Modrdn.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Add.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Delete.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Abandon.i: restart detected (2340337 - 2340341 == -4)], 4, [Operations.Extended.i: restart detected (2340337 - 2340341 == -4)], 245199,

...

[Operations.i: restart detected (2340337 - 2340341 == -4)],

1720790

...

...
2016/04/28 10:19:14 LDAP: Operations

Rate OK HARD 1 OK: [Lock.Delay=0.501], [3

...

searches with 12 entries in 0.060s], 2340341, 0.13333, 0.09167, 0.65000, 0,

...

0, 0, 0, 0, 0, 0.11667, 0.99167

...
(The first number after the messages in square brackets is the uptime, then

...

come the operations rate for several LDAP operations)

...
At 10:19:14 the uptime was 2340341, and at 10:25:14 the uptime was 2340577.

...

That could be OK, but at 10:21:14 the uptime was reported as 2340337 which

...

less than reported at 10:19:14.

...
Has there ever been such an issue with uptime?

I don't want to preclude anything but I personally never experienced an issue with the uptime value. Especially it's way more likely that something in your system environment caused slapd to restart. Only you can find out.

Obviously slapd did NOT reastart; otherwise the uptime value would be much lower.

Anyway glad to hear that it is not a known issue.

Regards, Ulrich

Michael Ströder

12:58 a.m.

New subject: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

Ulrich Windl wrote:

...

...
...
...
Michael Ströder michael@stroeder.com schrieb am 28.04.2016 um 19:26 in

Nachricht 572247C0.9030300@stroeder.com:

...
Ulrich Windl wrote:

...
I'm running a NRPE monitor that monitors openLDAP. To compute delta

values,

...
...
the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time

difference).

...
If the time difference is negative, the plugin reports a "server restart" event.

Uptime is the least usable time value for this task. Anyway...

What would you recommend instead?

Calculate the difference quotient over the real time which is supposed to be monotonically increasing given your time sync setup is correct. If you insist on reading it from cn=monitor it's cn=Current,cn=Time,cn=Monitor.

Ciao, Michael.

Ulrich Windl

1:05 a.m.

New subject: Antw: Re: Q: "dn: cn=Uptime,cn=Time,cn=Monitor" non-monotonic occasionally?

...

...
...
Michael Ströder michael@stroeder.com schrieb am 29.04.2016 um 09:58 in

Nachricht 57231425.2080702@stroeder.com:

...

Ulrich Windl wrote:

...
...
...
...
Michael Ströder michael@stroeder.com schrieb am 28.04.2016 um 19:26

...

...
Nachricht 572247C0.9030300@stroeder.com:

...
Ulrich Windl wrote:

...
I'm running a NRPE monitor that monitors openLDAP. To compute delta

values,

...
...
the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported

...

...
...
...
openLDAP to compare it with the current value (to build the time

difference).

...
If the time difference is negative, the plugin reports a "server

restart"

...

...
...
...
event.

Uptime is the least usable time value for this task. Anyway...

What would you recommend instead?

Calculate the difference quotient over the real time which is supposed to

...

monotonically increasing given your time sync setup is correct. If you insist on reading it from cn=monitor it's cn=Current,cn=Time,cn=Monitor.

But for both cases I would be unable to detect a restart of the slapd. Rates may seem negative then.

...

Ciao, Michael.

Michael Ströder

1:35 a.m.

New subject: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

Ulrich Windl wrote:

...

But for both cases I would be unable to detect a restart of the slapd. Rates may seem negative then.

Yes, your monitoring script has to deal with that anyway. But at least the time difference will not be negative.

Ciao, Michael.

Brett ＠Google

2:50 a.m.

New subject: Q: "dn: cn=Uptime, cn=Time, cn=Monitor" non-monotonic occasionally?

On Thu, Apr 28, 2016 at 6:48 PM, Ulrich Windl < Ulrich.Windl@rz.uni-regensburg.de> wrote:

...

Hi!

I'm running a NRPE monitor that monitors openLDAP. To compute delta values, the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time difference). If the time difference is negative, the plugin reports a "server restart" event.

So far, so good...

However, very rarely (once every few weeks) a false server restart is reported. Viewing the historic logs, I can only assume that openLDAP reports a bad uptime value now and then. Could this be?

If uptime is a calculation based on the system clock, is there some process such as ntpd or vmware tools, periodically updating the system clock, sometimes backwards ?

If so, maybe look for an option to limit backwards updates.

Cheers Brett

Ulrich Windl

5:06 a.m.

New subject: Antw: Re: Q: "dn: cn=Uptime,cn=Time,cn=Monitor" non-monotonic occasionally?

...

...
...
"Brett @Google" brett.maxfield@gmail.com schrieb am 29.04.2016 um 11:50 in

Nachricht CAJiMZ95aOoN7pVAHKDzmamzkQCL9J6xKQoanSQCv=eq3daz0Cg@mail.gmail.com:

...

On Thu, Apr 28, 2016 at 6:48 PM, Ulrich Windl < Ulrich.Windl@rz.uni-regensburg.de> wrote:

...
Hi!

I'm running a NRPE monitor that monitors openLDAP. To compute delta values, the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by openLDAP to compare it with the current value (to build the time difference). If the time difference is negative, the plugin reports a "server restart" event.

So far, so good...

However, very rarely (once every few weeks) a false server restart is reported. Viewing the historic logs, I can only assume that openLDAP reports a bad uptime value now and then. Could this be?

If uptime is a calculation based on the system clock, is there some process such as ntpd or vmware tools, periodically updating the system clock, sometimes backwards ?

It should not, but the time could behave odd (jump) immediately after migration. I'll have to investigate this further.

...

If so, maybe look for an option to limit backwards updates.

Cheers Brett

3464

Age (days ago)

3465

Last active (days ago)

openldap-technical@openldap.org

11 comments

4 participants

tags (0)

participants (4)

Brett ＠Google
Howard Chu
Michael Ströder
Ulrich Windl