Michael Ströder wrote:
Ulrich Windl wrote:
> I'm running a NRPE monitor that monitors openLDAP. To compute delta values,
> the NRPE plugin stores the "dn: cn=Uptime,cn=Time,cn=Monitor" reported by
> openLDAP to compare it with the current value (to build the time difference).
> If the time difference is negative, the plugin reports a "server restart"
> event.
Uptime is the least usable time value for this task. Anyway...
> So far, so good...
>
> However, very rarely (once every few weeks) a false server restart is reported.
Viewing the historic logs, I can only assume that openLDAP reports a bad uptime value now
and then. Could this be?
>
> Example plugin logs:
> 2016/04/28 10:25:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.246], [3
searches with 12 entries in 0.057s], 2340577, 0.10084, 0.07563, 0.35294, 0, 0, 0, 0, 0, 0,
0.08403, 0.61345
> 2016/04/28 10:23:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.288], [3
searches with 12 entries in 0.052s], 2340458, 0.09917, 0.08264, 0.32231, 0, 0, 0, 0, 0, 0,
0.08264, 0.58678
> 2016/04/28 10:21:14 LDAP: Operations Rate UNKNOWN HARD 1 UNKNOWN: [Lock.Delay=0.221],
[3 searches with 12 entries in 0.052s], 2340337, [Operations.Bind.i: restart detected
(2340337 - 2340341 == -4)], 284553, [Operations.Unbind.i: restart detected (2340337 -
2340341 == -4)], 205927, [Operations.Search.i: restart detected (2340337 - 2340341 ==
-4)], 985106, [Operations.Compare.i: restart detected (2340337 - 2340341 == -4)], 0,
[Operations.Modify.i: restart detected (2340337 - 2340341 == -4)], 0,
[Operations.Modrdn.i: restart detected (2340337 - 2340341 == -4)], 0, [Operations.Add.i:
restart detected (2340337 - 2340341 == -4)], 0, [Operations.Delete.i: restart detected
(2340337 - 2340341 == -4)], 0, [Operations.Abandon.i: restart detected (2340337 - 2340341
== -4)], 4, [Operations.Extended.i: restart detected (2340337 - 2340341 == -4)], 245199,
[Operations.i: restart detected (2340337 - 2340341 == -4)], 1720790
> 2016/04/28 10:19:14 LDAP: Operations Rate OK HARD 1 OK: [Lock.Delay=0.501], [3
searches with 12 entries in 0.060s], 2340341, 0.13333, 0.09167, 0.65000, 0, 0, 0, 0, 0, 0,
0.11667, 0.99167
>
> (The first number after the messages in square brackets is the uptime, then come the
operations rate for several LDAP operations)
> At 10:19:14 the uptime was 2340341, and at 10:25:14 the uptime was 2340577. That
could be OK, but at 10:21:14 the uptime was reported as 2340337 which is less than
reported at 10:19:14.
>
> Has there ever been such an issue with uptime?
I don't want to preclude anything but I personally never experienced an issue
with the uptime value. Especially it's way more likely that something in your
system environment caused slapd to restart. Only you can find out.
Time values are based on the time(3) subroutine, which can also be affected by
anyone explicitly resetting the system clock. Of course, such an event would
also be visible on regular syslog timestamps etc.
--
-- Howard Chu
CTO, Symas Corp.
http://www.symas.com
Director, Highland Sun
http://highlandsun.com/hyc/
Chief Architect, OpenLDAP
http://www.openldap.org/project/