Full_Name: Hallvard B Furuseth
Version: HEAD
OS: Linux x86_64
URL:
Submission from: (NULL) (129.240.6.233)
Submitted by: hallvard
Here are a number of bugs and other issues with slapd/sl_malloc.c, and
one for ber_memcalloc. Some for RE24, likely some for RE25, but I don't
know where to cut off for RE24? (Waiting for someone to say...)
* It claims to align on 2*sizeof(int) boundaries, but it looks like
stack:realloc and pool:mem_create can un-align to (2n+1)*sizeof(int)
boundaries on 32-bit hosts, where sizeof(ber_len_t) == sizeof(int).
A slap_sl_realloc() fix for the stack implementation is to replace
size += pad + sizeof( ber_len_t );
size &= ~pad;
with
size = ((size + pad) & ~pad) + sizeof( ber_len_t );
however see the optimizations and tweaks below.
For the pool implementation, maybe slap_sl_mem_create() can just set
so->so_ptr = (char *) sh->sh_base + sizeof(ber_len_t) % (pad+1);
Unless that messes up its expectations of how much space is left, I
haven't checked. Should mem_create size be increased accordingly?
Unfortunately sl_malloc.c exports its internals, so I don't know what
kind of fix would affect it:
* Why is there a slap_sl_mem_detach() when there is no way to reattach
it? As far as I can tell all the caller can do is to inspect/clean up
the returned slab_heap itself.
* slap_sl_mem_create() switching implementation stack<->pool on an old
memctx slab can crash or leak, it runs the wrong cleanup code.
Fix: Remove code duplication, rearrange. The pool cleanup duplicates
slap_sl_mem_destroy(), except the final frees. Let destroy(key=NULL,)
mean don't free and call that. Also the create/destroy pool branches
duplicate the stack branches.
* However, will the pool (i.e. non-stack) implementation ever be used?
If not, a simpler fix to some of this is to delete it. OTOH if it
someday may be used, hopefully some of these cleanups will be leaving
it a bit more correct, easier to read, and faster. Though I haven't
gone to the drastic step of studying it to see how it works.
Or we could #ifdef it out by default. It's unclear to me why this is
a run-time argument to the create function instead of instead of a
-D<compiler flag>, why would the caller care?
#define SLAP_SLAB_MODE <undef or 1:stack, 0:pool: -1:support both>.
#define SLAP_SLAB_ISSTACK(stack) <stack if both, else SLAP_SLAB_MODE>
* Regarding code duplication, the #ifdef SLAP_NO_SL_MALLOC code
duplicates the !ctx code. Make that if(<enum No_sl_malloc> && !ctx)
and the #ifdefs can be killed.
* Also we can throw the #ifdef NO_THREADS/#else/#endif code out to two
macros #define GET_MEMCTX()/SET_MEMCTX().
* The "enough room?" test (ptr + size < endptr) is invalid, can e.g.
wrap around, when there is not enough room. Use (size < endptr-ptr).
* Some failures assert(0)/exit() like ch_malloc.c, but without giving a
Debug() message first.
* The slap_sl_malloc() pool implementation can return NULL on failure
at comment "FIXME: missing return; guessing...". For consistency it
should exit on failure like the rest of sl_malloc.c and ch_malloc.c.
* Realloc(last block) when the following free space is not big enough,
does not move sh_last and reclaim free space properly.
* slap_sl_calloc()/ber_memcalloc() do not catch <count>*<size> overflow:
ber_len_t total = n * size;
if (!n && total/n == size) <allocate...>;
That can be optimized a bit, if anyone cares:
#define LIM_SQRT(t) /* some value < sqrt(max value of unsigned type t) */ \
((0UL|(t)-1) >>31>>31 > 1 ? ((t)1 <<32) - 1 : \
(0UL|(t)-1) >>31 ? 65535U : (0UL|(t)-1) >>15 ? 255U : 15U)
...
/* sqrt test = slight optimization: often avoids the division */
if ((n | size) <= LIM_SQRT(ber_len_t) || !n || total/n == size) ...
* slap_sl_<malloc/realloc> increase size before proceeding, they should
reduce it before falling back to ch_malloc and for Debug messages.
Space optimizations/cleanup:
* To save space on 64-bit hosts: We can store store sizes (head, tail)
in 4-byte typedef ber_uint_t slap_sl_size_t instead of a ber_len_t.
After fixing the misalignment issue above, however.
* Also the stack implementation does not need to allocate the tail.
It can store the "block is freed" bit in the head of the _next_ block,
and only store a tail in free blocks (i.e. if that bit is set).
I think that does add a bit of code, though it is convenient for the
"reclaim freed blocks at sh_last" code in slap_sl_free().
* Ignoring last point, the "reclaim" code can in any case be simplified:
/* mark it free */
p[-1] = size |= 1;
/* reclaim free space off tail */
if (sh->sh_last == p) {
do {
p = (slap_sllen_t*) ((char*) p - size + 1) - 1;
size = p[-1];
} while (size & 1);
sh->sh_last = p;
}
* Realloc can check if previous or next block is free before giving up
and doing malloc/copy/free. That does multiply its code size though.
Dunno if it's a good idea. Needs testing/timing, I guess.
* Aligning with (size & ~pad) is wrong for non-two's complement signed
pad, it should be (size & -(pad + 1)) aka (size & -Alignment).
All right, so I'm nitpicking. I just dislike unportable code which
isn't even simpler than the portable variant.
* pad and order_start can be compile-time enums instead of computed at
run time (if any compilers do not optimize it away already):
enum {
Align = 2*sizeof(int),
Align_log2 = 1 + (Align>2) + (Align>4) + (Align>8) + (Align>16),
pad = Align - 1, /* but drop that, it's just as easy to use Align */
order_start = Align_log2 /* which also can be dropped */
};
...
assert(Align == 1 << Align_log2);
/* Adding head+tail preserves alignment */
assert(2*sizeof(ber_len_t) % Align == 0);
* The pool code can be cleaned up by basing 'order' on Align-sized
chunks instead of byte-sized chunks, i.e. subtract order_start from
sh->sh_maxorder, order/i/j variables. Also subtract 1 from
mem_create:order, so that 'order' means the same as 'order' elsewhere.
(Straightforward replacement, no need to invoke the brain and
understand the code:-)
* The pool implementation sometimes does 1<<foo where it should do
(ber_len_t)1<<foo or 1U<<foo.