Re: (ITS#7774) LMDB assertion failure during Postfix cache cleanup
by h.b.furuseth@usit.uio.no
I wrote:
>Wietse Venema writes:
>> I can confirm that with LMDB-0.9.10 + ITS#7756 patch, the assertion
>> failure goes away when I close the cursor (and its read transaction)
>> before changing the map or the database, and restore the cursor
>> afterwards. I guess this got broken when things were changed to use
>> external locks in order to avoid world-writable lockfiles.
Does this mean we can close this ITS? I made a major typo here:
> Yes.
>
> Ending the read-only transaction and starting a new one means you are
> no longer seeing the same snapshot, though. So I hope the reason you
> couldn't use single transaction which I suggested previously, was
**not**
> that
> the reader transaction needed to keep seeing the same snapshot.
--
Hallvard
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by h.b.furuseth@usit.uio.no
wietse(a)porcupine.org writes:
> Below is my "least painful" patch for both parties. What do you think?
That's roughly what I wrote was thinking, yes.
We haven't heard if it will go in at all, though:-) If this ITS
is a reaction to ITS#7774, then the motivation is not catching
an lmdb bug, but letting the user band-aid a user error. Still
makes sense to me, lmdb lets the user screw up in many ways.
Anyway:
Why drop the info parameter? If you don't need it after all,
that's fine. But then it won't be coming back later. It
doesn't fit in the varargs, they're reserved for the message.
I ended up declaring a type for it in ldmb.h, to make future
expansion a bit friendlier:
/** Dummy definition for future extension of #mdb_assert_cb(). */
typedef struct MDB_assertion MDB_assertion;
I dislike when a feature makes me #include <stdarg.h>, so I'd
not make one myself. I'd build the message in a buffer. Also
the code needs fleshing out a bit: Obey NDEBUG, and omit all
this if something already #defined assert. See below.
Finally, __func__ needs a wrapper. It can go in a renamed
"compat" section:
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -151,3 +150,3 @@
*/
-/** @defgroup compat Windows Compatibility Macros
+/** @defgroup compat Compatibility Macros
* A bunch of macros to minimize the amount of platform-specific ifdefs
@@ -158,2 +157,13 @@
*/
+
+ /** Wrapper around __func__, which is a C99 feature */
+#if __STDC_VERSION__ >= 199901L
+# define mdb_func_ __func__
+#elif __GNUC__ >= 2 || _MSC_VER >= 1300
+# define mdb_func_ __FUNCTION__
+#else
+/* If a debug message says <mdb_unknown>(), update the #if statements above */
+# define mdb_func_ "<mdb_unknown>"
+#endif
+
#ifdef _WIN32
@@ -329,3 +339,3 @@ static txnid_t mdb_debug_start;
# define DPRINTF0(fmt, ...) \
- fprintf(stderr, "%s:%d " fmt "\n", __func__, __LINE__, __VA_ARGS__)
+ fprintf(stderr, "%s:%d " fmt "\n", mdb_func_, __LINE__, __VA_ARGS__)
#else
Google found more compiler #ifdefs, but I'm not sure which of
these are still relevant as non-C99 compilers. Could add them
later at need:
__func__: (defined(__SUNPRO_C) && defined(__C99FEATURES__))
__FUNC__: __BORLANDC__ >= 0x550
__FUNCTION__: __MWERKS__ >= 0x3000 || defined(__ghs__) || __DMC__ >= 0x810 || __IBMCPP__ >= 500
And the assert() itself:
/** assert(3) replacement.
* Omitted in programs that #include mdb.c and already define assert().
*/
#ifndef assert
# ifdef NDEBUG
# define assert(expr) ((void) 0)
# else
# define assert(expr) ((expr) ? (void)0 : \
mdb_assert_fail(#expr, mdb_func_, __FILE__, __LINE__))
static void
mdb_assert_fail(const char *expr, const char *func, const char *file, int line)
{
char buf[400];
sprintf(buf, "%.100s:%d: Assertion '%.200s' failed in %.40s()",
file, line, expr, func);
if (mdb_assert_cb)
mdb_assert_cb(buf, NULL);
fprintf(stderr, "%s\n", buf);
abort();
}
# endif /* NDEBUG */
#endif /* assert */
--
Hallvard
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by wietse@porcupine.org
Wietse Venema:
> Hallvard Breien Furuseth:
> > wietse(a)porcupine.org writes:
> > >Hallvard Breien Furuseth:
> > >> I suppose we could s/assert(foo)/mdb_assert(mc, foo)/ when that
> > >> compiles so it can report env, txn, dbi. Howard, should I do that?
> > >
> > > That was my idea. Allow me to code up an example, it is simpler
> > > than arguing in the abstract.
> >
> > By all means, but I expect we understand each other. I just needed to
> > know how muchimpact this needs to have on the lmdb's code. The answer
> > is it must walk all over it. And now that I think of it, it'll likely
> > introduce merge conflicts in every outstanding branch of any size.
> > So my preference is not to do this just yet.
>
> I am settling for less, and come with a few-line change, so that
> LMDB might still end up in this month's stable Postfix release.
>
> Hang on for a few more minutes.
Below is my "least painful" patch for both parties. What do you think?
Wietse
*** ./work/mdb-mdb/libraries/liblmdb/mdb.c- Tue Nov 12 11:10:33 2013
--- ./work/mdb-mdb/libraries/liblmdb/mdb.c Thu Jan 2 16:30:47 2014
***************
*** 65,71 ****
#include <fcntl.h>
#endif
- #include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
--- 65,70 ----
***************
*** 146,151 ****
--- 145,155 ----
# error "Two's complement, reasonably sized integer types, please"
#endif
+ /* assert(3) clone. Move this to the project-correct location. */
+ static void mdb_assert(const char *, const char *, int, const char *);
+ #define assert(e) ((e) ? (void) 0 : mdb_assert(__func__, __FILE__, \
+ __LINE__, #e))
+
/** @defgroup internal MDB Internals
* @{
*/
***************
*** 3288,3293 ****
--- 3292,3311 ----
return MDB_SUCCESS;
}
+ void
+ (*mdb_assert_cb)(const char *,...);
+
+ static void
+ mdb_assert(const char *func, const char *file, int line, const char *text)
+ {
+ if (mdb_assert_cb != 0)
+ mdb_assert_cb("Assertion failed: file %s, line %d, function %s: %s",
+ file, line, func, text);
+ fprintf(stderr, "Assertion failed: file %s, line %d, function %s: %s\n",
+ file, line, func, text);
+ abort();
+ }
+
static int
mdb_env_map(MDB_env *env, void *addr, int newsize)
{
*** ./work/mdb-mdb/libraries/liblmdb/lmdb.h- Tue Nov 12 11:10:33 2013
--- ./work/mdb-mdb/libraries/liblmdb/lmdb.h Thu Jan 2 16:01:37 2014
***************
*** 1429,1434 ****
--- 1429,1441 ----
* @return 0 on success, non-zero on failure.
*/
int mdb_reader_check(MDB_env *env, int *dead);
+
+ /** @brief Notify application of assertion failure.
+ * This should become obsolete as lmdb's error handling matures.
+ * @param[in] fmt The assertion message, not including newline.
+ */
+ extern void (*mdb_assert_cb)(const char *fmt, ...);
+
/** @} */
#ifdef __cplusplus
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by wietse@porcupine.org
Hallvard Breien Furuseth:
> wietse(a)porcupine.org writes:
> >Hallvard Breien Furuseth:
> >> I suppose we could s/assert(foo)/mdb_assert(mc, foo)/ when that
> >> compiles so it can report env, txn, dbi. Howard, should I do that?
> >
> > That was my idea. Allow me to code up an example, it is simpler
> > than arguing in the abstract.
>
> By all means, but I expect we understand each other. I just needed to
> know how muchimpact this needs to have on the lmdb's code. The answer
> is it must walk all over it. And now that I think of it, it'll likely
> introduce merge conflicts in every outstanding branch of any size.
> So my preference is not to do this just yet.
I am settling for less, and come with a few-line change, so that
LMDB might still end up in this month's stable Postfix release.
Hang on for a few more minutes.
Wietse
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by h.b.furuseth@usit.uio.no
wietse(a)porcupine.org writes:
>Hallvard Breien Furuseth:
>> I suppose we could s/assert(foo)/mdb_assert(mc, foo)/ when that
>> compiles so it can report env, txn, dbi. Howard, should I do that?
>
> That was my idea. Allow me to code up an example, it is simpler
> than arguing in the abstract.
By all means, but I expect we understand each other. I just needed to
know how muchimpact this needs to have on the lmdb's code. The answer
is it must walk all over it. And now that I think of it, it'll likely
introduce merge conflicts in every outstanding branch of any size.
So my preference is not to do this just yet.
--
Hallvard
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by wietse@porcupine.org
Hallvard Breien Furuseth:
> wietse(a)porcupine.org writes:
> >Hallvard Breien Furuseth:
> >> /** User-settable callback for assert(), called before abort()
> >> * instead of printing the message.
> >> *
> >> * assert(some user errors) might be unaffected. This hack
> >> * should become obsolete as lmdb's error handling matures.
> >> * @param[in] msg The assertion message, not including newline.
> >> * @parem[in] info Currently NULL. Might change in the future.
> >> */
> >> extern void (*mdb_assert_cb)(char *msg, void *info);
> >>
> >> (Or maybe call it -before- printing the message, you can abort or
> >> longjmp yourself if you don't want the message.)
> >
> > Looks good. I'll code up an example implementation to set and use
> > the callback. The "info" argument should be sufficient to figure
> > out which database is failing.
>
> To repeat, 'info' would be NULL initially. Does that make it useless to
> you? Various pieces of code don't know any context of interest to the
> user anyway.
That is unhelpful, as my opens lots of databases.
Instead I propose that for each open database I give you my function
pointer AND my context pointer. Then, you call my function with my
context pointer when there is a problem.
My program opens multiple databases. It is useless if you pass me
your LMDB handle, as I now have to find out a) which of all my open
databases is LMDB and b) which of those LMDB databases is messed up.
That totally breaks all abstractions.
> I suppose we could s/assert(foo)/mdb_assert(mc, foo)/ when that
> compiles so it can report env, txn, dbi. Howard, should I do that?
That was my idea. Allow me to code up an example, it is simpler
than arguing in the abstract.
Wietse
7 years
Re: (ITS#7774) LMDB assertion failure during Postfix cache cleanup
by h.b.furuseth@usit.uio.no
Wietse Venema writes:
> As written in the bug report the assertion also failed with
> mdb_map_size set to an insanely large value (half a billion) in
> which case the map size never changes.
Yes...
> But, it is good to know that the cursor is invalidated by both map
> size changes and transactions that change the database.
No, that's the wrong idea. This is a case of a user error breaking
snapshot isolation: Ending the transaction invalidates its cursors,
and you have to end the transaction in these two cases (modifying
when using MDB_NOLOCK, or resizing).
Those are "user errors" because that made it simple to code those
features. We could add code to make them non-errors on in common cases
and on common hosts, but I'm not sure if that's worth the trouble.
> I can confirm that with LMDB-0.9.10 + ITS#7756 patch, the assertion
> failure goes away when I close the cursor (and its read transaction)
> before changing the map or the database, and restore the cursor
> afterwards. I guess this got broken when things were changed to use
> external locks in order to avoid world-writable lockfiles.
Yes.
Ending the read-only transaction and starting a new one means you are
no longer seeing the same snapshot, though. So I hope the reason you
couldn't use single transaction which I suggested previously, was that
the reader transaction needed to keep seeing the same snapshot.
--
Hallvard
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by h.b.furuseth@usit.uio.no
wietse(a)porcupine.org writes:
>Hallvard Breien Furuseth:
>> /** User-settable callback for assert(), called before abort()
>> * instead of printing the message.
>> *
>> * assert(some user errors) might be unaffected. This hack
>> * should become obsolete as lmdb's error handling matures.
>> * @param[in] msg The assertion message, not including newline.
>> * @parem[in] info Currently NULL. Might change in the future.
>> */
>> extern void (*mdb_assert_cb)(char *msg, void *info);
>>
>> (Or maybe call it -before- printing the message, you can abort or
>> longjmp yourself if you don't want the message.)
>
> Looks good. I'll code up an example implementation to set and use
> the callback. The "info" argument should be sufficient to figure
> out which database is failing.
To repeat, 'info' would be NULL initially. Does that make it useless to
you? Various pieces of code don't know any context of interest to the
user anyway.
I suppose we could s/assert(foo)/mdb_assert(mc, foo)/ when that
compiles so it can report env, txn, dbi. Howard, should I do that?
--
Hallvard
7 years
Re: (ITS#7775) LMDB terminates Postfix daemon process without logfile record
by wietse@porcupine.org
Hallvard Breien Furuseth:
> > - Invoke an application call-back function with the error code and
> > problem description text. If the error code is MDB_PANIC then
> > the application knows that it needs to make final arrangements.
>
> Sounds good to me for now, since it doesn't mean walking all over
> the code. mdb.c can just #define its own assert(). No error
> code, the context is just "we're about to abort" and I don't know
> of other planned callbacks which the code would help.
>
> /** User-settable callback for assert(), called before abort()
> * instead of printing the message.
> *
> * assert(some user errors) might be unaffected. This hack
> * should become obsolete as lmdb's error handling matures.
> * @param[in] msg The assertion message, not including newline.
> * @parem[in] info Currently NULL. Might change in the future.
> */
> extern void (*mdb_assert_cb)(char *msg, void *info);
>
> (Or maybe call it -before- printing the message, you can abort or
> longjmp yourself if you don't want the message.)
Looks good. I'll code up an example implementation to set and use
the callback. The "info" argument should be sufficient to figure
out which database is failing.
Wietse
7 years
Re: (ITS#7774) LMDB assertion failure during Postfix cache cleanup
by wietse@porcupine.org
Hallvard Breien Furuseth:
> Wietse Venema writes:
> > Why do you talk about map size changes when I delete a database entry??
>
> That's how Postfix uses lmdb. It calls mdb_env_set_mapsize(). Looking
As written in the bug report the assertion also failed with
mdb_map_size set to an insanely large value (half a billion) in
which case the map size never changes.
But, it is good to know that the cursor is invalidated by both map
size changes and transactions that change the database.
I can confirm that with LMDB-0.9.10 + ITS#7756 patch, the assertion
failure goes away when I close the cursor (and its read transaction)
before changing the map or the database, and restore the cursor
afterwards. I guess this got broken when things were changed to use
external locks in order to avoid world-writable lockfiles.
Wietse
7 years