Sorry, I forgot to include the db_stat output. It is at http://pastebin.com/5x39j9ru

Regards, Maxim.


On Thu, May 23, 2013 at 12:19 PM, Maxim Shaposhnik <mshaposhnik@codenvy.com> wrote:
Hello, Howard.
Many thanks for your quick reply.

I have tried version 4.8.30 and the POSIX mutex option for BDB. Unfortunately, this doesn't help.

The debugging results for the slapd process are here: http://pastebin.com/FYjp4h61

One more thing I have noticed (maybe it can give some clue): while slapd no longer responds on tcp:389, I am still able to iterate over the DB using ldapsearch.

This looks suspicious to me. Is it?

This time, another table was deadlocked:
80000004 WRITE         1 HELD    dn2id.bdb                 page          2
8000005c READ          1 WAIT    dn2id.bdb                 page          2
80000014 READ          1 WAIT    dn2id.bdb                 page          2
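
For reference, a listing like the one above can be checked mechanically. The following is a minimal Python sketch, not part of db_stat or OpenLDAP: the names LOCK_LINE, parse_locks and blocked_by are made up for illustration, and the regular expression assumes the column layout shown in this thread (locker, mode, count, status, database, page/handle, number).

```python
import re
from collections import defaultdict

# Assumed layout of a lock line, as it appears in this thread:
# "8000005c READ          1 WAIT    dn2id.bdb                 page          2"
LOCK_LINE = re.compile(
    r"^\s*([0-9a-f]+)\s+(\w+)\s+(\d+)\s+(\w+)\s+(\S+)\s+(page|handle)\s+(\d+)\s*$"
)


def parse_locks(text):
    """Return one dict per lock line; lines in any other shape are skipped."""
    locks = []
    for line in text.splitlines():
        m = LOCK_LINE.match(line)
        if m:
            locker, mode, count, status, db, kind, num = m.groups()
            locks.append({
                "locker": locker,
                "mode": mode,
                "count": int(count),
                "status": status,
                "object": (db, kind, int(num)),
            })
    return locks


def blocked_by(locks):
    """Map each waiting locker to the other lockers holding the same object."""
    holders = defaultdict(set)
    for lock in locks:
        if lock["status"] == "HELD":
            holders[lock["object"]].add(lock["locker"])
    result = {}
    for lock in locks:
        if lock["status"] == "WAIT":
            others = holders[lock["object"]] - {lock["locker"]}
            result.setdefault(lock["locker"], set()).update(others)
    return result


# The three lock lines quoted in this message.
SAMPLE = """\
80000004 WRITE         1 HELD    dn2id.bdb                 page          2
8000005c READ          1 WAIT    dn2id.bdb                 page          2
80000014 READ          1 WAIT    dn2id.bdb                 page          2
"""

if __name__ == "__main__":
    for waiter, blockers in blocked_by(parse_locks(SAMPLE)).items():
        print(waiter, "waits for", sorted(blockers))
```

For these three lines it reports that both 8000005c and 80000014 wait for 80000004. It does not say why 80000004 keeps its write lock; that still needs the gdb backtrace.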

What else can I do?

Thanks again for your help.

Regards, Maxim.



On Wed, May 22, 2013 at 6:55 PM, Howard Chu <hyc@symas.com> wrote:
Maxim Shaposhnik wrote:
Hi,

I'm facing an OpenLDAP freeze problem on concurrent item modification.

The OS is FC17 with OpenLDAP 2.4.35. I tried both BerkeleyDB versions
5.2.36 and the latest, 5.3.21. The DB size is about 50K.

From my experiments, LDAP stops responding when the count of locks on
objectClass.bdb reaches 3 (with fewer than 3, it seems to resolve OK):

80000573 READ          3 HELD    objectClass.bdb           page          3
80000573 WRITE         7 HELD    objectClass.bdb           page          3
800001b6 READ          1 WAIT    objectClass.bdb           page          3


I also tried different lock detector schemes (different values for
set_lk_detect) without success.

What could be the root cause of this situation?

It seems problems like this have only been coming up since BerkeleyDB 5. See if switching back to BDB 4.8 helps.

Also make sure BDB is configured --with-mutex=POSIX/pthreads

Compile slapd with no optimization, and with debug symbols enabled (AC_CFLAGS=-g)

The next time this situation occurs, get both the db_stat -CA output and also gdb the slapd process, "thread apply all bt full"

Without both the gdb and db_stat output there's nothing we can say.
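
Since both outputs have to be taken while slapd is still hung, it may help to script the capture. This is only a rough Python sketch under stated assumptions: the function names, the output file names, the PID 1234 and the environment directory /var/lib/ldap are placeholders for illustration, and db_stat and gdb are assumed to be on the PATH. The commands themselves are the ones named above (db_stat -CA, with -h for the database environment directory, and gdb attached with -p and run in batch mode).

```python
import subprocess


def diagnostic_commands(pid, db_home):
    """Build the two command lines; nothing is executed here."""
    return [
        # Lock region statistics for the BDB environment in db_home.
        ["db_stat", "-CA", "-h", db_home],
        # Attach to the running slapd, dump every thread's stack, detach.
        ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt full"],
    ]


def collect(pid, db_home, out_prefix="slapd-hang"):
    """Run each command and save its combined output to a text file."""
    written = []
    for cmd in diagnostic_commands(pid, db_home):
        path = f"{out_prefix}-{cmd[0]}.txt"
        with open(path, "w") as fh:
            subprocess.run(cmd, stdout=fh, stderr=subprocess.STDOUT, check=False)
        written.append(path)
    return written


if __name__ == "__main__":
    # Placeholder values: substitute the real slapd PID and DB directory.
    print(collect(1234, "/var/lib/ldap"))
```

Attaching gdb normally needs the same user as slapd or root, and the backtrace is only useful if slapd was built with debug symbols as described above.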




This is my full db_stat output:

  db_stat -CA
Default locking region information:
19      Last allocated locker ID
0x7fffffff      Current maximum unused locker ID
9       Number of lock modes
200     Initial number of locks allocated
0       Initial number of lockers allocated
200     Initial number of lock objects allocated
3000    Maximum number of locks possible
1500    Maximum number of lockers possible
1500    Maximum number of lock objects possible
200     Current number of locks allocated
15      Current number of lockers allocated
200     Current number of lock objects allocated
40      Number of lock object partitions
2053    Size of object hash table
46      Number of current locks
115     Maximum number of locks at any one time
6       Maximum number of locks in any one bucket
11      Maximum number of locks stolen by for an empty partition
4       Maximum number of locks stolen for any one partition
13      Number of current lockers
15      Maximum number of lockers at any one time
26      Number of current lock objects
74      Maximum number of lock objects at any one time
2       Maximum number of lock objects in any one bucket
0       Maximum number of objects stolen by for an empty partition
0       Maximum number of objects stolen for any one partition
88126   Total number of locks requested
87895   Total number of locks released
0       Total number of locks upgraded
16      Total number of locks downgraded
174     Lock requests not available due to conflicts, for which we waited
153     Lock requests not available due to conflicts, for which we did not wait
11      Number of deadlocks
0       Lock timeout value
0       Number of locks that have timed out
0       Transaction timeout value
0       Number of transactions that have timed out
2MB 504KB       Region size
16      The number of partition locks that required waiting (0%)
8       The maximum number of times any partition lock was waited for (0%)
0       The number of object queue operations that required waiting (0%)
1       The number of locker allocations that required waiting (0%)
2       The number of region locks that required waiting (0%)
2       Maximum hash bucket length
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
=-=-=-=-=-=-=-=-=-=
Lock REGINFO information:
Environment     Region type
1       Region ID
__db.001        Region name
0x7fc6ffe7f000  Region address
0x7fc6ffe7f0a0  Region allocation head
0x7fc70007f5b0  Region primary address
0       Region maximum allocation
0       Region allocated
Region allocations: 2874 allocations, 0 failures, 2750 frees, 7 longest
Allocations by power-of-two sizes:
   1KB   2869
   2KB   0
   4KB   1
   8KB   0
  16KB   0
  32KB   0
  64KB   2
 128KB   0
 256KB   1
 512KB   0
1024KB   1
REGION_SHARED   Region flags
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock region parameters:
2       Lock region region mutex [2/59655 0% 25161/140492677707584] <wakeups 0/1>
2053    locker table size
2053    object table size
2099280 obj_off
2316456 locker_off
1       need_dd
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock conflict matrix:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers:
Locker   Mode      Count Status  ----------------- Object ---------------
        e dd=11 locks held 1    write locks 0    pid/thread 23242/140324977571648 flags 10   priority 100
        e READ          1 HELD    id2entry.bdb              handle        0
        f dd=10 locks held 1    write locks 0    pid/thread 23242/140324977571648 flags 10   priority 100
        f READ          1 HELD    dn2id.bdb                 handle        0
       10 dd= 9 locks held 0    write locks 0    pid/thread 23242/140324977571648 flags 0    priority 100
       11 dd= 6 locks held 1    write locks 0    pid/thread 23242/140324451448576 flags 10   priority 100
       11 READ          1 HELD    objectClass.bdb           handle        0
       12 dd= 5 locks held 1    write locks 0    pid/thread 23242/140324451448576 flags 10   priority 100
       12 READ          1 HELD    cloudIdeAliases.bdb       handle        0
       13 dd= 4 locks held 1    write locks 0    pid/thread 23242/140324451448576 flags 10   priority 100
       13 READ          1 HELD    ou.bdb                    handle        0
8000019c dd= 8 locks held 0    write locks 0    pid/thread 23242/140324977571648 flags 0    priority 100
8000019d dd= 7 locks held 0    write locks 0    pid/thread 23242/140324451448576 flags 0    priority 100
800001a1 dd= 3 locks held 0    write locks 0    pid/thread 23242/140324443055872 flags 0    priority 100
800001b6 dd= 2 locks held 0    write locks 0    pid/thread 23242/140324332623616 flags 0    priority 100
800001b6 READ          1 WAIT    objectClass.bdb           page          3
8000045f dd= 1 locks held 1    write locks 1    pid/thread 23242/140324164859648 flags 0    priority 100
8000045f WRITE         1 HELD    cloudIdeAliases.bdb       page       5337
80000572 dd= 0 locks held 2    write locks 0    pid/thread 23242/140324451448576 flags 0    priority 100
80000572 READ          1 HELD    0x23f140 len:   9 data: 020000000000000000
80000572 READ          1 HELD    dn2id.bdb                 page      10752
80000573 dd= 0 locks held 36   write locks 19   pid/thread 23242/140324451448576 flags 0    priority 100
80000573 READ          1 WAIT    cloudIdeAliases.bdb       page       5337
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       4604
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       4604
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       6375
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       6375
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        200
80000573 READ          1 HELD    cloudIdeAliases.bdb       page        200
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       1438
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       1438
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page         16
80000573 READ          1 HELD    cloudIdeAliases.bdb       page         16
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        286
80000573 READ          1 HELD    cloudIdeAliases.bdb       page        286
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       2308
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       2308
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       4708
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       4708
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        123
80000573 READ          1 HELD    cloudIdeAliases.bdb       page        123
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        540
80000573 READ          1 HELD    cloudIdeAliases.bdb       page        540
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       4737
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       4737
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       2806
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       2806
80000573 WRITE         1 HELD    ou.bdb                    page        271
80000573 READ          1 HELD    ou.bdb                    page        271
80000573 WRITE         7 HELD    objectClass.bdb           page          3
80000573 READ          3 HELD    objectClass.bdb           page          3
80000573 WRITE         3 HELD    objectClass.bdb           page          2
80000573 READ          1 HELD    objectClass.bdb           page          2
80000573 WRITE         1 HELD    dn2id.bdb                 page      10234
80000573 READ          1 HELD    dn2id.bdb                 page      10234
80000573 WRITE         1 HELD    dn2id.bdb                 page          2
80000573 READ          1 HELD    dn2id.bdb                 page          2
80000573 WRITE         1 HELD    dn2id.bdb                 page      10225
80000573 WRITE         1 HELD    dn2id.bdb                 page      10752
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object:
Locker   Mode      Count Status  ----------------- Object ---------------
80000573 READ          1 HELD    cloudIdeAliases.bdb       page       4604
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       4604

80000573 READ          1 HELD    cloudIdeAliases.bdb       page       4708
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       4708

80000573 READ          1 HELD    cloudIdeAliases.bdb       page        540
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        540

       13 READ          1 HELD    ou.bdb                    handle        0

80000573 READ          1 HELD    cloudIdeAliases.bdb       page       2806
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       2806

80000573 READ          1 HELD    cloudIdeAliases.bdb       page       4737
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       4737

80000573 READ          1 HELD    ou.bdb                    page        271
80000573 WRITE         1 HELD    ou.bdb                    page        271

80000572 READ          1 HELD    dn2id.bdb                 page      10752
80000573 WRITE         1 HELD    dn2id.bdb                 page      10752

        e READ          1 HELD    id2entry.bdb              handle        0

8000045f WRITE         1 HELD    cloudIdeAliases.bdb       page       5337
80000573 READ          1 WAIT    cloudIdeAliases.bdb       page       5337

80000573 READ          1 HELD    cloudIdeAliases.bdb       page       1438
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       1438

        f READ          1 HELD    dn2id.bdb                 handle        0

80000573 READ          1 HELD    dn2id.bdb                 page          2
80000573 WRITE         1 HELD    dn2id.bdb                 page          2

80000572 READ          1 HELD    0x23f140 len:   9 data: 020000000000000000

80000573 READ          1 HELD    dn2id.bdb                 page      10234
80000573 WRITE         1 HELD    dn2id.bdb                 page      10234

80000573 WRITE         1 HELD    dn2id.bdb                 page      10225

80000573 READ          1 HELD    cloudIdeAliases.bdb       page        123
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        123

       12 READ          1 HELD    cloudIdeAliases.bdb       handle        0

80000573 READ          1 HELD    cloudIdeAliases.bdb       page         16
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page         16

80000573 READ          1 HELD    cloudIdeAliases.bdb       page        200
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        200

80000573 READ          1 HELD    cloudIdeAliases.bdb       page       6375
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       6375

80000573 READ          3 HELD    objectClass.bdb           page          3
80000573 WRITE         7 HELD    objectClass.bdb           page          3
800001b6 READ          1 WAIT    objectClass.bdb           page          3

80000573 READ          1 HELD    objectClass.bdb           page          2
80000573 WRITE         3 HELD    objectClass.bdb           page          2

       11 READ          1 HELD    objectClass.bdb           handle        0

80000573 READ          1 HELD    cloudIdeAliases.bdb       page       2308
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page       2308

80000573 READ          1 HELD    cloudIdeAliases.bdb       page        286
80000573 WRITE         1 HELD    cloudIdeAliases.bdb       page        286


--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/