Re: LMDB dead process detection

18 Jul 2013

      Howard Chu writes:
...
There's been a long-running discussion about the need to have APIs in
liblmdb for displaying the reader table and clearing out stale slots.
Quite a few open questions on the topic:
(...)
3) What approach should be used for automatic detection of stale slots?
Currently we record the process ID and thread ID of a reader in
the table.  It's not clear to me that the thread ID has anything more
than informational value. Since we register a per-thread destructor
for slots, exiting threads should never be leaving stale slots in the
first place.
Unless the thread is killed with TerminateThread() on Windows. The doc
has a bunch of dire warnings about that, but I suspect real life may
differ from Microsoft's recommendations.
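
For reference, the per-thread destructor mechanism mentioned above can be sketched with pthread_key_create(). This is only an illustration, not liblmdb's code: reader_slot, table, acquire_slot() and reader_thread() are hypothetical names, and the table here is an ordinary static array rather than a shared lock file.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical reader slot; pid == 0 means the slot is free. */
typedef struct { pid_t pid; pthread_t tid; } reader_slot;

#define NSLOTS 8
static reader_slot table[NSLOTS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_key_t slot_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

/* Runs automatically when a thread that owns a slot exits. */
static void release_slot(void *p)
{
    reader_slot *s = p;
    pthread_mutex_lock(&table_lock);
    s->pid = 0;
    pthread_mutex_unlock(&table_lock);
}

static void make_key(void)
{
    int rc = pthread_key_create(&slot_key, release_slot);
    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's slot, claiming a free one on first use. */
reader_slot *acquire_slot(void)
{
    pthread_once(&key_once, make_key);
    reader_slot *s = pthread_getspecific(slot_key);
    if (s != NULL)
        return s;
    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < NSLOTS && s == NULL; i++) {
        if (table[i].pid == 0) {
            s = &table[i];
            s->pid = getpid();
            s->tid = pthread_self();
        }
    }
    pthread_mutex_unlock(&table_lock);
    if (s != NULL)
        pthread_setspecific(slot_key, s);
    return s;
}

/* Demo thread body: take a slot and exit without releasing it. */
void *reader_thread(void *arg)
{
    (void)arg;
    return acquire_slot();
}
```

A thread that returns normally or calls pthread_exit() gets its slot cleared by release_slot(). A thread that is forcibly terminated from outside, as with TerminateThread(), never runs the destructor, which is the case that would leave a stale slot.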
...
I'm also not sure that there are good APIs for an outside
caller to determine the liveness of a given thread ID.
As far as I can tell: Windows has thread IDs and handles for this.
Posix does not provide a way for outside callers to get at threads -
either to kill them or to examine them.  Individual OSes may, but then they
likely provide both.  E.g. Linux clone() can create a thread, and
tgkill() can kill it.  These calls use another ID than the Posix
thread ID.  I hope we don't want to know...
...
The process ID is also prone to wraparound; it's still very common
for Linux systems to use 15 bit process IDs. (...)
A) set a byte range lock for every process attached to the
environment.
(...)
       c) This approach won't tell us if a process is in Zombie state.
Misplaced (c).  This is the approach which does work portably for
Zombies, at least on Unix.  And as we've discussed, on at least some
OSes, approach (B) below can also check for zombies, but it may take
more time.
...
B) check process ID and process start time.

This appears to be a fairly reliable approach, and reasonably fast,
but there is no POSIX standard API for obtaining this process
information.
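
As an illustration of approach (B) on one OS: on Linux the start time is field 22 of /proc/<pid>/stat, counted in clock ticks since boot. The sketch below is Linux-only and the names proc_start_time() and reader_is_stale() are hypothetical; other systems would need their own lookup, which is the portability problem noted above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Read the start time of `pid` (field 22 of /proc/<pid>/stat).
 * Returns 0 on success, -1 if the process does not exist or the
 * file cannot be parsed. */
int proc_start_time(pid_t pid, unsigned long long *start)
{
    char path[64], buf[1024];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Field 2 is "(comm)" and may itself contain spaces or
     * parentheses, so resume parsing after the LAST ')'. */
    char *p = strrchr(buf, ')');
    if (p == NULL)
        return -1;
    p++;

    /* Skip fields 3 through 21 (19 space-separated tokens). */
    for (int i = 0; i < 19; i++) {
        while (*p == ' ')
            p++;
        if (*p == '\0')
            return -1;
        while (*p != '\0' && *p != ' ')
            p++;
    }

    char *end;
    *start = strtoull(p, &end, 10);
    return end == p ? -1 : 0;
}

/* A reader record is stale if its process is gone, or if the PID
 * now belongs to a process with a different start time (PID reuse).
 * Returns 1 if stale, 0 if the recorded process still exists. */
int reader_is_stale(pid_t pid, unsigned long long recorded_start)
{
    unsigned long long now;
    if (proc_start_time(pid, &now) != 0)
        return 1;
    return now != recorded_start;
}
```

The recorded_start value is what would have to be stored in each reader table record. Field 3 of the same file is the process state, so a zombie check could be added here by looking for 'Z'.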
(...)
We can implement approach (A) fairly easily, with no major
repercussions.  For (B) we would need to add a field to the reader
table records to store the process start time. (Thus a lockfile format
change.)
We need to change the lockfile version anyway.  Otherwise a process
using the current MDB version and one using either of these
approaches could sabotage each other.
-- 
Hallvard
