Re: (ITS#7364) mdb: clean up POSIX semaphores on environment close.
by h.b.furuseth@usit.uio.no
Reopening this.
This is worse with a database with is intended to be used by
different users (A and B):
A opens the DB and creates the semaphores with e.g. mode 0660.
B opens it, A closes, B closes - and fails sem_unlink() which
only A and root can do.
Next, B (or C) fails mdb_env_open() because sem_unlink() fails
again.
The work-around I can think of is a "multi-uid" mode which instead
resets the semaphore with sem_post() if sem_getvalue() returns 0.
I don't know how ugly that is considered to be. Could ask
comp.programming.unix, or check what Berkeley DB does. This mode
should use mode 0666 for the semaphores (temporarily setting umask
0, yuck), or it should not sem_unlink() since next user may create
the semaphores with a group which gives the wrong users access.
Or root may give it root-only access, as in the original report.
If not unlinking, we need a special "remove database + semaphores"
API call.
Or use some other sync primitive, like file locks. The Linux manual
says these work over NFS though, which sounds like they must be rather
slow. flock() does not, but this time mdb would need to seek first.
I don't know which sync variants BSD offers.
Other matters with the current implementation - I'll patch these:
mdb_env_excl_lock() need not retry getting a non-exclusive lock when
closing. mdb_env_close() can pass *excl = -1 to tell it not to.
mdb_env_setup_locks() can sem_unlink both semaphores before doing
anything else, so that reopening a database as root will clean up.
Drop the error checks of sem_unlink (so both get called), instead
use O_EXCL in sem_open(,O_CREAT,,). Unless I'm missing something,
the error checks just work like an emulation of O_EXCL anyway.
9 years, 8 months
Re: (ITS#7363) libmdb should use POSIX semaphores on non-apple BSD systems too.
by h.b.furuseth@usit.uio.no
Chris Mikkelson writes:
> The *BSD systems define PTHREAD_PROCESS_SHARED but do
> not implement sharable pthread mutexes. The flag name
> is part of the API, but implementation of the behavior
> it requests is optional.
Yes, sorry about that.
How about something like a "__BSD" preprocessor symbol to replace
"BSD"? There surely is one. It'll start with an underscore
followed by either another underscore or an uppercase letter.
--
Hallvard
9 years, 8 months
Re: (ITS#7377) Poor libmdb error handling
by Hallvard B Furuseth
Howard Chu wrote:
>h.b.furuseth(a)usit.uio.no wrote:
>> Maybe add a flag to request error paranoia, like setting
>> PTHREAD_MUTEX_ERRORCHECK. Trying to catch if a current mdb
>> process has lost is locks. (The fcntl F_SETLK is lost if the
>> process closes another file descriptor to the same file,
>> e.g. after some other module in the process opened and closed the
>> mdb lock file.) Catching a change to the pid from getpid().
>
> That's really not interesting to me. No other module has any business
> opening and reading our lock file.
I skipped a step there. The other module could also be using mdb.
"other module" = "a program component which the module using this
MDB_env does not know about". They may even be linking different
libmdb.so's, so the 'libmdb's can't notify each other via static vars.
There are a few other use cases like a monolith program which also
can backup a directory - which the user could point at the mdb dir.
Anyway, a CAVEATS section in the doc is the obvious fix.
> We haven't bothered with this for alock either, and
> it hasn't been a problem.
Note that this ITS is about libmdb, not back-mdb.
>> The fork() restriction needs to be documented, as does "don't open
>> or close the same database twice in the same process".
>
> Go ahead.
Hmm. I suggested a too pessimistic caveats section once, I can dig
that up and try again.
9 years, 8 months
Re: (ITS#7377) Poor libmdb error handling
by hyc@symas.com
h.b.furuseth(a)usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: 2.4.32
> OS: Linux x86_64
> URL:
> Submission from: (NULL) (193.69.163.163)
> Submitted by: hallvard
>
>
> libmdb omits necessary checks for system call errors, and does
> not always clean up after the errors it does detect.
>
> Also mdb could catch more user errors - including "cannot happen"
> mutex errors from mutex lock/unlock, since some threading user
> errors are be non-obvious from the doc.
>
> At least sem_wait() needs an EINTR loop. POSIX says "The
> sem_wait() function is interruptible by the delivery of a signal."
> The Linux manpage adds that a signal handler always interrupts it
> regardless of sigaction SA_RESTART. I don't know about BSD.
>
> Some error checks first need updates to the Windows emulation, or
> will look cleaner that way. E.g. pthread_key_create().
>
> I'll fix mdb_env_open()'s error cleanup, at least.
> (Hyc, that moves cleanup out of mdb_env_setup_locks(). Any
> preference for using ~25 exit points 'return <ErrCode() or rc>'
> vs. ~25 'goto <failure label>' which does the same?)
In general I prefer that functions have only one exit point, so multiple gotos
is preferable.
> Maybe add a flag to request error paranoia, like setting
> PTHREAD_MUTEX_ERRORCHECK. Trying to catch if a current mdb
> process has lost is locks. (The fcntl F_SETLK is lost if the
> process closes another file descriptor to the same file,
> e.g. after some other module in the process opened and closed the
> mdb lock file.) Catching a change to the pid from getpid().
That's really not interesting to me. No other module has any business opening
and reading our lock file. We haven't bothered with this for alock either, and
it hasn't been a problem.
Change of pid... not worth bothering. Just add a note to the docs.
> Regarding the pid: Repeating getpid() - in mdb_txn_renew0() and
> mdb_env_close() - is useless since mdb cannot survive fork()
> anyway. fork drops fcntl locks. Might as well store the pid (or
> just some unique "environment id") in MDB_env and use that.
Agreed. I had a patch here that does just that, seems it got lost in some
other edits.
> OTOH
> some strategically placed getpid()s can then catch the user error
> of having forked, and set e.g. MDB_FATAL_ERROR.
Not really interesting.
> The fork() restriction needs to be documented, as does "don't open
> or close the same database twice in the same process".
Go ahead.
> mdb_env_open() should catch being called twice. Except, should
> it try to allow a 2nd call if 1st mdb_env_open() failed? It can
> easily do that if 1st attempt failed before reading params from
> the DB files, but fail if the 1st open got that far.
Called twice on the same env handle? Again, not interesting.
As a rule, we should of course check for failure return codes from normal
operation. We should not add extensive checks for programmer error/misuse of API.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
9 years, 8 months
Re: (ITS#7377) Poor libmdb error handling
by openldap-its@OpenLDAP.org
I messed up last reply headers somehow. Trying again. I wrote:
> Also mdb could catch more user errors - including "cannot happen"
> mutex errors from mutex lock/unlock, since some threading user
> errors are be non-obvious from the doc.
That is, "cannot happen" when there is no user error.
> The fork() restriction needs to be documented, as does "don't
> open or close the same database twice in the same process".
Correction, "don't have the same database open twice at the same
time in the same process".
9 years, 8 months
Re: (ITS#7377) Poor libmdb error handling
by openldap-its@OpenLDAP.org
I wrote:
> Also mdb could catch more user errors - including "cannot happen"
> mutex errors from mutex lock/unlock, since some threading user
> errors are be non-obvious from the doc.
That is, "cannot happen" when there is no user error.
> The fork() restriction needs to be documented, as does "don't
> open or close the same database twice in the same process".
Correction, "don't have the same database open twice at the same
time in the same process".
9 years, 8 months
(ITS#7377) Poor libmdb error handling
by h.b.furuseth@usit.uio.no
Full_Name: Hallvard B Furuseth
Version: 2.4.32
OS: Linux x86_64
URL:
Submission from: (NULL) (193.69.163.163)
Submitted by: hallvard
libmdb omits necessary checks for system call errors, and does
not always clean up after the errors it does detect.
Also mdb could catch more user errors - including "cannot happen"
mutex errors from mutex lock/unlock, since some threading user
errors are be non-obvious from the doc.
At least sem_wait() needs an EINTR loop. POSIX says "The
sem_wait() function is interruptible by the delivery of a signal."
The Linux manpage adds that a signal handler always interrupts it
regardless of sigaction SA_RESTART. I don't know about BSD.
Some error checks first need updates to the Windows emulation, or
will look cleaner that way. E.g. pthread_key_create().
I'll fix mdb_env_open()'s error cleanup, at least.
(Hyc, that moves cleanup out of mdb_env_setup_locks(). Any
preference for using ~25 exit points 'return <ErrCode() or rc>'
vs. ~25 'goto <failure label>' which does the same?)
Maybe add a flag to request error paranoia, like setting
PTHREAD_MUTEX_ERRORCHECK. Trying to catch if a current mdb
process has lost is locks. (The fcntl F_SETLK is lost if the
process closes another file descriptor to the same file,
e.g. after some other module in the process opened and closed the
mdb lock file.) Catching a change to the pid from getpid().
Regarding the pid: Repeating getpid() - in mdb_txn_renew0() and
mdb_env_close() - is useless since mdb cannot survive fork()
anyway. fork drops fcntl locks. Might as well store the pid (or
just some unique "environment id") in MDB_env and use that. OTOH
some strategically placed getpid()s can then catch the user error
of having forked, and set e.g. MDB_FATAL_ERROR.
The fork() restriction needs to be documented, as does "don't open
or close the same database twice in the same process".
mdb_env_open() should catch being called twice. Except, should
it try to allow a 2nd call if 1st mdb_env_open() failed? It can
easily do that if 1st attempt failed before reading params from
the DB files, but fail if the 1st open got that far.
9 years, 8 months