RE: (ITS#7633) Slapd hangs on hdb write lock
by ck@cksoft.de
Hi Dusan,
On Fri, 28 Jun 2013, dusan.fric(a)t-systems.sk wrote:
> Hi Chu,
>
> thanks for your quick response. We will contact Oracle for advice on this.
>
> However, what exactly do you mean by 'try different settings for the deadlock detector'?
>
> BTW, our DB 4.7.25 is of course fully patched (all 4 patches).
If you can update to 2.4.35, you might want to test the mdb backend with your workload,
especially since you can easily reproduce the problem.
Greetings
Christian
>
> Regards,
> Dusan
>
> -----Original Message-----
> From: Howard Chu [mailto:hyc@symas.com]
> Sent: 26 June 2013 13:32
> To: Fric, Dusan; openldap-its(a)openldap.org
> Subject: Re: (ITS#7633) Slapd hangs on hdb write lock
>
> dusan.fric(a)t-systems.sk wrote:
>> Full_Name: Dusan Fric
>> Version: 2.4.32
>> OS: RHEL 5.7 x64
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (88.212.40.139)
>>
>>
>> We are experiencing frequent hangs in slapd. Once it hangs, we usually
>> cannot connect again until we kill -9 the slapd process and
>> restart it. The directory is used for 2 applications as user eDir and
>> we have been using it in production for over a month. We have noted that
>> the busier the directory becomes, the more often it hangs (now twice per week).
>>
>> We have installed identical configurations on 3 environments, each has
>> only one single server (no replication). There are 30k entries in the
>> directory (production).
>>
>> We are running:
>>
>> RHEL 5.7 x64 (VMWare with NFS mountpoints)
>> OpenLDAP 2.4.32
>> Berkeley DB 4.8.30 (4.7.25)
>>
>> We started with DB 4.8.30. After downgrading to version 4.7.25
>> (as suggested in
>> ITS#7378 -
>> https://www.openldap.org/its/index.cgi/Incoming?id=7378;selectid=7378)
>> we still face the same issue.
>>
>> The problem occurs when two requests try to update attributes of a
>> record simultaneously, in most cases for the same DN value.
>> We can easily reproduce it with a Java test program running 2 threads,
>> each connecting to the LDAP server and updating the record for a
>> particular DN value.
>> It need not be the same DN value; DN values that reside on the
>> same BDB page are enough.
>
> Thanks for the traces, but the information you've provided shows that this is a BDB problem, not an OpenLDAP bug. You'll have to contact Oracle for help.
> The BDB deadlock detector is supposed to handle these situations.
>
> For example, in your 4.8.30 lock status there are only 2 conflicting transactions, 80000028 and 80000029, and they are contending on a single page: each holds a READ lock on mail.bdb page 3 and is waiting for a WRITE lock on that same page. This is one of the most elementary deadlock situations.
>
> You might try different settings for the deadlock detector, but ultimately this appears to be a BDB bug and the choice of detector shouldn't matter.
>
>> Configuration
>>
>> olc cn=config (part):
>> dn: cn=config
>> objectClass: olcGlobal
>> cn: config
>> olcConcurrency: 0
>> olcConnMaxPending: 100
>> olcConnMaxPendingAuth: 1000
>> olcSockbufMaxIncoming: 262143
>> olcSockbufMaxIncomingAuth: 16777215
>> olcThreads: 32
>>
>> olc DB hdb config (part):
>> dn: olcDatabase={1}hdb,cn=config
>> objectClass: olcHdbConfig
>> objectClass: olcDatabaseConfig
>> olcDatabase: {1}hdb
>> olcDbCacheSize: 80000
>> olcDbCheckpoint: 128 5
>> olcDbConfig: {0}set_cachesize 0 268435456 1
>> olcDbConfig: {1}set_lg_max 10485760
>> olcDbConfig: {2}set_lg_bsize 2097152
>> olcDbConfig: {3}set_lg_dir /pkg/openldap/dblog
>> olcDbConfig: {4}set_lg_regionmax 262144
>> olcDbConfig: {5}set_lk_detect DB_LOCK_EXPIRE
>> olcDbConfig: {6}set_flags DB_TXN_NOSYNC
>> olcDbDirtyRead: FALSE
>> olcDbDNcacheSize: 0
>> olcDbIDLcacheSize: 240000
>> olcDbIndex: default eq,sub
>> olcDbIndex: objectClass eq
>> olcDbIndex: cn eq,sub
>> olcDbIndex: uid eq,sub
>> olcDbIndex: mail pres,eq
>> olcDbIndex: sn eq,sub
>> olcDbIndex: member eq
>> olcDbLinearIndex: FALSE
>> olcDbMode: 0600
>> olcDbNoSync: TRUE
>> olcDbSearchStack: 20
>> olcDbShmKey: 0
>> olcLastMod: TRUE
>> olcMaxDerefDepth: 15
>> olcMonitoring: TRUE
>> olcReadOnly: FALSE
>>
>> We have managed to collect db_stat lock information, which indicates
>> the same issue with DB write locks on both DB versions.
>>
>> db_stat -C ol (4.8.30)
>> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>> Lock REGINFO information:
>> Lock Region type
>> 5 Region ID
>> /apps/DECCLASA-1/data/openldap/__db.005 Region name
>> 0x2ab41cdc3000 Region address
>> 0x2ab41cdc3138 Region primary address
>> 0 Region maximum allocation
>> 0 Region allocated
>> Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
>> REGION_JOIN_OK Region flags
>> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>> Locks grouped by lockers:
>> Locker Mode Count Status ----------------- Object ---------------
>> 1 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
>> 1 READ 1 HELD id2entry.bdb handle 0
>> 2 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
>> 3 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
>> 3 READ 1 HELD dn2id.bdb handle 0
>> 4 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 5 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
>> 6 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
>> 7 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 8 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
>> 9 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
>> 9 READ 1 HELD objectClass.bdb handle 0
>> a dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> b dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> c dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
>> c READ 1 HELD uid.bdb handle 0
>> d dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> f dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> 10 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
>> 10 READ 1 HELD member.bdb handle 0
>> 11 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 12 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 13 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
>> 13 READ 1 HELD mail.bdb handle 0
>> 14 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 15 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 16 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 17 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 18 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
>> 18 READ 1 HELD sn.bdb handle 0
>> 19 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 1a dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 1b dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
>> 1b READ 1 HELD cn.bdb handle 0
>> 1c dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 1d dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 1e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> 1f dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
>> 80000003 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
>> 80000004 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
>> 80000005 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
>> 80000006 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
>> 80000007 dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
>> 80000026 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
>> 80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
>> 80000027 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
>> 80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
>> 80000028 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
>> 80000028 WRITE 1 WAIT mail.bdb page 3
>> 80000028 READ 1 HELD mail.bdb page 3
>> 80000029 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
>> 80000029 WRITE 1 WAIT mail.bdb page 3
>> 80000029 READ 1 HELD mail.bdb page 3
>> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>> Locks grouped by object:
>> Locker Mode Count Status ----------------- Object ---------------
>> 80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
>>
>> 1b READ 1 HELD cn.bdb handle 0
>>
>> 10 READ 1 HELD member.bdb handle 0
>>
>> 9 READ 1 HELD objectClass.bdb handle 0
>>
>> c READ 1 HELD uid.bdb handle 0
>>
>> 80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
>>
>> 1 READ 1 HELD id2entry.bdb handle 0
>>
>> 18 READ 1 HELD sn.bdb handle 0
>>
>> 3 READ 1 HELD dn2id.bdb handle 0
>>
>> 80000029 READ 1 HELD mail.bdb page 3
>> 80000028 READ 1 HELD mail.bdb page 3
>> 80000029 WRITE 1 WAIT mail.bdb page 3
>> 80000028 WRITE 1 WAIT mail.bdb page 3
>>
>> 13 READ 1 HELD mail.bdb handle 0
>>
>> db_stat -C ol (4.7.25)
>> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>> Lock REGINFO information:
>> Lock Region type
>> 5 Region ID
>> /apps/DECCLASA-1/data/openldap/__db.005 Region name
>> 0x2b4a8a577000 Original region address
>> 0x2b4a8a577000 Region address
>> 0x2b4a8a577138 Region primary address
>> 0 Region maximum allocation
>> 0 Region allocated
>> Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
>> REGION_JOIN_OK Region flags
>> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>> Locks grouped by lockers:
>> Locker Mode Count Status ----------------- Object ---------------
>> 1 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
>> 1 READ 1 HELD id2entry.bdb handle 0
>> 2 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
>> 3 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
>> 3 READ 1 HELD dn2id.bdb handle 0
>> 4 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
>> 5 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
>> 6 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 7 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
>> 7 READ 1 HELD objectClass.bdb handle 0
>> 8 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 9 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> a dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> b dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> c dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
>> c READ 1 HELD cn.bdb handle 0
>> d dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> e dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> f dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
>> f READ 1 HELD uid.bdb handle 0
>> 10 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 11 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 12 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
>> 12 READ 1 HELD member.bdb handle 0
>> 13 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 14 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 15 dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
>> 15 READ 1 HELD mail.bdb handle 0
>> 16 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 17 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 18 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 19 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 1a dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
>> 1a READ 1 HELD sn.bdb handle 0
>> 1b dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 1c dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 1d dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 1e dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 1f dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 80000003 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
>> 80000004 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
>> 80000008 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
>> 80000038 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
>> 800000a9 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
>> 800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
>> 800000aa dd= 0 locks held 2 write locks 0 pid/thread 32447/1097341248
>> 800000aa WRITE 1 WAIT mail.bdb page 3
>> 800000aa READ 1 HELD mail.bdb page 1
>> 800000aa READ 1 HELD mail.bdb page 3
>> 800000ab dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
>> 800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
>> 800000ac dd= 0 locks held 2 write locks 0 pid/thread 32447/1105733952
>> 800000ac WRITE 1 WAIT mail.bdb page 3
>> 800000ac READ 1 HELD mail.bdb page 1
>> 800000ac READ 1 HELD mail.bdb page 3
>> 800000b1 dd= 0 locks held 0 write locks 0 pid/thread 32447/1122519360
>> 800000b2 dd= 0 locks held 0 write locks 0 pid/thread 32447/1130912064
>> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>> Locks grouped by object:
>> Locker Mode Count Status ----------------- Object ---------------
>> 800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
>>
>> 1 READ 1 HELD id2entry.bdb handle 0
>>
>> 3 READ 1 HELD dn2id.bdb handle 0
>>
>> 7 READ 1 HELD objectClass.bdb handle 0
>>
>> f READ 1 HELD uid.bdb handle 0
>>
>> 1a READ 1 HELD sn.bdb handle 0
>>
>> 800000aa READ 1 HELD mail.bdb page 3
>> 800000ac READ 1 HELD mail.bdb page 3
>> 800000aa WRITE 1 WAIT mail.bdb page 3
>> 800000ac WRITE 1 WAIT mail.bdb page 3
>>
>> c READ 1 HELD cn.bdb handle 0
>>
>> 15 READ 1 HELD mail.bdb handle 0
>>
>> 800000aa READ 1 HELD mail.bdb page 1
>> 800000ac READ 1 HELD mail.bdb page 1
>>
>> 12 READ 1 HELD member.bdb handle 0
>>
>> 800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
>>
>>
>> We have also collected the backtrace of threads for both DB versions
>> which I have uploaded to:
>> https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.8.30_20130625.txt
>>
>> https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.7.25_20130625.txt
>>
>>
>
>
> --
> -- Howard Chu
> CTO, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/
>
>
>
--
Christian Kratzer CK Software GmbH
Email: ck(a)cksoft.de Wildberger Weg 24/2
Phone: +49 7032 893 997 - 0 D-71126 Gaeufelden
Fax: +49 7032 893 997 - 9 HRB 245288, Amtsgericht Stuttgart
Web: http://www.cksoft.de/ Geschaeftsfuehrer: Christian Kratzer
10 years, 5 months
RE: (ITS#7633) Slapd hangs on hdb write lock
by dusan.fric@t-systems.sk
Hi Chu,
thanks for your quick response. We will contact Oracle for advice on this.
However, what exactly do you mean by 'try different settings for the deadlock detector'?
BTW, our DB 4.7.25 is of course fully patched (all 4 patches).
Regards,
Dusan
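For context on the question above: the detector policy is whatever follows `set_lk_detect` in the DB_CONFIG directives, which in the configuration quoted below are the `olcDbConfig` values (currently `DB_LOCK_EXPIRE`). As far as the Berkeley DB documentation describes it, `DB_LOCK_EXPIRE` only rejects lock requests that have timed out and performs no other deadlock detection, so trying `DB_LOCK_DEFAULT` or another policy would be one way to read the suggestion. The following is a minimal Python sketch that swaps the policy in a list of directives; the helper name `with_lock_detect` is made up for illustration, and applying the result to cn=config is left out.

```python
# Policy names documented for Berkeley DB's set_lk_detect.
POLICIES = {
    "DB_LOCK_DEFAULT", "DB_LOCK_EXPIRE", "DB_LOCK_MAXLOCKS",
    "DB_LOCK_MAXWRITE", "DB_LOCK_MINLOCKS", "DB_LOCK_MINWRITE",
    "DB_LOCK_OLDEST", "DB_LOCK_RANDOM", "DB_LOCK_YOUNGEST",
}


def with_lock_detect(directives, policy):
    """Return a copy of the DB_CONFIG directives using the given policy.

    Hypothetical helper: it only edits a list of strings.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown deadlock detector policy: {policy}")
    updated, seen = [], False
    for line in directives:
        if line.split()[:1] == ["set_lk_detect"]:
            line, seen = f"set_lk_detect {policy}", True
        updated.append(line)
    if not seen:
        # No detector configured yet: add the directive at the end.
        updated.append(f"set_lk_detect {policy}")
    return updated


# A subset of the directives from the reported configuration.
current = [
    "set_cachesize 0 268435456 1",
    "set_lk_detect DB_LOCK_EXPIRE",
    "set_flags DB_TXN_NOSYNC",
]
candidate = with_lock_detect(current, "DB_LOCK_DEFAULT")
print(candidate)
```

The original list is left untouched, so the old and new settings can be compared before anything is written back.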
-----Original Message-----
From: Howard Chu [mailto:hyc@symas.com]
Sent: 26 June 2013 13:32
To: Fric, Dusan; openldap-its(a)openldap.org
Subject: Re: (ITS#7633) Slapd hangs on hdb write lock
dusan.fric(a)t-systems.sk wrote:
> Full_Name: Dusan Fric
> Version: 2.4.32
> OS: RHEL 5.7 x64
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (88.212.40.139)
>
>
> We are experiencing frequent hangs in slapd. Once it hangs, we usually
> cannot connect again until we kill -9 the slapd process and
> restart it. The directory is used for 2 applications as user eDir and
> we have been using it in production for over a month. We have noted that
> the busier the directory becomes, the more often it hangs (now twice per week).
>
> We have installed identical configurations on 3 environments, each has
> only one single server (no replication). There are 30k entries in the
> directory (production).
>
> We are running:
>
> RHEL 5.7 x64 (VMWare with NFS mountpoints)
> OpenLDAP 2.4.32
> Berkeley DB 4.8.30 (4.7.25)
>
> We started with DB 4.8.30. After downgrading to version 4.7.25
> (as suggested in
> ITS#7378 -
> https://www.openldap.org/its/index.cgi/Incoming?id=7378;selectid=7378)
> we still face the same issue.
>
> The problem occurs when two requests try to update attributes of a
> record simultaneously, in most cases for the same DN value.
> We can easily reproduce it with a Java test program running 2 threads,
> each connecting to the LDAP server and updating the record for a
> particular DN value.
> It need not be the same DN value; DN values that reside on the
> same BDB page are enough.
Thanks for the traces, but the information you've provided shows that this is a BDB problem, not an OpenLDAP bug. You'll have to contact Oracle for help.
The BDB deadlock detector is supposed to handle these situations.
For example, in your 4.8.30 lock status there are only 2 conflicting transactions, 80000028 and 80000029, and they are contending on a single page: each holds a READ lock on mail.bdb page 3 and is waiting for a WRITE lock on that same page. This is one of the most elementary deadlock situations.
You might try different settings for the deadlock detector, but ultimately this appears to be a BDB bug and the choice of detector shouldn't matter.
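To make the cycle explicit, the two lockers can be put into a waits-for graph. The following is a minimal Python sketch, not part of BDB or OpenLDAP: the lock rows are transcribed by hand from the 4.8.30 listing quoted below, and the names `waits_for` and `find_cycle` are made up for this illustration. It assumes the usual compatibility rule that a WRITE request has to wait for every other locker holding a lock on the same object.

```python
from collections import defaultdict

# (locker, mode, status, object) rows copied from the 4.8.30 db_stat output.
LOCKS = [
    ("80000028", "WRITE", "WAIT", "mail.bdb page 3"),
    ("80000028", "READ", "HELD", "mail.bdb page 3"),
    ("80000029", "WRITE", "WAIT", "mail.bdb page 3"),
    ("80000029", "READ", "HELD", "mail.bdb page 3"),
]


def waits_for(locks):
    """Map each waiting locker to the set of lockers it is blocked by."""
    holders = defaultdict(set)
    for locker, _mode, status, obj in locks:
        if status == "HELD":
            holders[obj].add(locker)
    graph = defaultdict(set)
    for locker, mode, status, obj in locks:
        if status == "WAIT" and mode == "WRITE":
            # A writer is blocked by every *other* locker holding the object.
            graph[locker] |= holders[obj] - {locker}
    return dict(graph)


def find_cycle(graph):
    """Return one cycle as a list of lockers, or None if there is none."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in sorted(graph.get(node, ())):
            found = visit(nxt, path + [node])
            if found:
                return found
        return None

    for start in sorted(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


graph = waits_for(LOCKS)
cycle = find_cycle(graph)
print(graph)
print(cycle)
```

Both lockers already share the READ lock and each wants to upgrade it, so neither WRITE request can be granted until one of the two is aborted.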
> Configuration
>
> olc cn=config (part):
> dn: cn=config
> objectClass: olcGlobal
> cn: config
> olcConcurrency: 0
> olcConnMaxPending: 100
> olcConnMaxPendingAuth: 1000
> olcSockbufMaxIncoming: 262143
> olcSockbufMaxIncomingAuth: 16777215
> olcThreads: 32
>
> olc DB hdb config (part):
> dn: olcDatabase={1}hdb,cn=config
> objectClass: olcHdbConfig
> objectClass: olcDatabaseConfig
> olcDatabase: {1}hdb
> olcDbCacheSize: 80000
> olcDbCheckpoint: 128 5
> olcDbConfig: {0}set_cachesize 0 268435456 1
> olcDbConfig: {1}set_lg_max 10485760
> olcDbConfig: {2}set_lg_bsize 2097152
> olcDbConfig: {3}set_lg_dir /pkg/openldap/dblog
> olcDbConfig: {4}set_lg_regionmax 262144
> olcDbConfig: {5}set_lk_detect DB_LOCK_EXPIRE
> olcDbConfig: {6}set_flags DB_TXN_NOSYNC
> olcDbDirtyRead: FALSE
> olcDbDNcacheSize: 0
> olcDbIDLcacheSize: 240000
> olcDbIndex: default eq,sub
> olcDbIndex: objectClass eq
> olcDbIndex: cn eq,sub
> olcDbIndex: uid eq,sub
> olcDbIndex: mail pres,eq
> olcDbIndex: sn eq,sub
> olcDbIndex: member eq
> olcDbLinearIndex: FALSE
> olcDbMode: 0600
> olcDbNoSync: TRUE
> olcDbSearchStack: 20
> olcDbShmKey: 0
> olcLastMod: TRUE
> olcMaxDerefDepth: 15
> olcMonitoring: TRUE
> olcReadOnly: FALSE
>
> We have managed to collect db_stat lock information, which indicates
> the same issue with DB write locks on both DB versions.
>
> db_stat -C ol (4.8.30)
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Lock REGINFO information:
> Lock Region type
> 5 Region ID
> /apps/DECCLASA-1/data/openldap/__db.005 Region name
> 0x2ab41cdc3000 Region address
> 0x2ab41cdc3138 Region primary address
> 0 Region maximum allocation
> 0 Region allocated
> Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
> REGION_JOIN_OK Region flags
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by lockers:
> Locker Mode Count Status ----------------- Object ---------------
> 1 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
> 1 READ 1 HELD id2entry.bdb handle 0
> 2 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
> 3 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
> 3 READ 1 HELD dn2id.bdb handle 0
> 4 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 5 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
> 6 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
> 7 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 8 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
> 9 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> 9 READ 1 HELD objectClass.bdb handle 0
> a dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> b dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> c dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> c READ 1 HELD uid.bdb handle 0
> d dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> f dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> 10 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 10 READ 1 HELD member.bdb handle 0
> 11 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 12 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 13 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 13 READ 1 HELD mail.bdb handle 0
> 14 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 15 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 16 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 17 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 18 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 18 READ 1 HELD sn.bdb handle 0
> 19 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1a dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1b dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 1b READ 1 HELD cn.bdb handle 0
> 1c dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1d dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> 1f dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
> 80000003 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
> 80000004 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
> 80000005 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 80000006 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
> 80000007 dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> 80000026 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> 80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
> 80000027 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
> 80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
> 80000028 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
> 80000028 WRITE 1 WAIT mail.bdb page 3
> 80000028 READ 1 HELD mail.bdb page 3
> 80000029 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> 80000029 WRITE 1 WAIT mail.bdb page 3
> 80000029 READ 1 HELD mail.bdb page 3
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by object:
> Locker Mode Count Status ----------------- Object ---------------
> 80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
>
> 1b READ 1 HELD cn.bdb handle 0
>
> 10 READ 1 HELD member.bdb handle 0
>
> 9 READ 1 HELD objectClass.bdb handle 0
>
> c READ 1 HELD uid.bdb handle 0
>
> 80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
>
> 1 READ 1 HELD id2entry.bdb handle 0
>
> 18 READ 1 HELD sn.bdb handle 0
>
> 3 READ 1 HELD dn2id.bdb handle 0
>
> 80000029 READ 1 HELD mail.bdb page 3
> 80000028 READ 1 HELD mail.bdb page 3
> 80000029 WRITE 1 WAIT mail.bdb page 3
> 80000028 WRITE 1 WAIT mail.bdb page 3
>
> 13 READ 1 HELD mail.bdb handle 0
>
> db_stat -C ol (4.7.25)
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Lock REGINFO information:
> Lock Region type
> 5 Region ID
> /apps/DECCLASA-1/data/openldap/__db.005 Region name
> 0x2b4a8a577000 Original region address
> 0x2b4a8a577000 Region address
> 0x2b4a8a577138 Region primary address
> 0 Region maximum allocation
> 0 Region allocated
> Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
> REGION_JOIN_OK Region flags
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by lockers:
> Locker Mode Count Status ----------------- Object ---------------
> 1 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
> 1 READ 1 HELD id2entry.bdb handle 0
> 2 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
> 3 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
> 3 READ 1 HELD dn2id.bdb handle 0
> 4 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
> 5 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
> 6 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 7 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> 7 READ 1 HELD objectClass.bdb handle 0
> 8 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 9 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> a dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> b dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> c dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> c READ 1 HELD cn.bdb handle 0
> d dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> e dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> f dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> f READ 1 HELD uid.bdb handle 0
> 10 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 11 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 12 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> 12 READ 1 HELD member.bdb handle 0
> 13 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 14 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 15 dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
> 15 READ 1 HELD mail.bdb handle 0
> 16 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 17 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 18 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 19 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1a dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
> 1a READ 1 HELD sn.bdb handle 0
> 1b dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1c dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1d dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1e dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1f dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 80000003 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
> 80000004 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 80000008 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 80000038 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
> 800000a9 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> 800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
> 800000aa dd= 0 locks held 2 write locks 0 pid/thread 32447/1097341248
> 800000aa WRITE 1 WAIT mail.bdb page 3
> 800000aa READ 1 HELD mail.bdb page 1
> 800000aa READ 1 HELD mail.bdb page 3
> 800000ab dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
> 800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
> 800000ac dd= 0 locks held 2 write locks 0 pid/thread 32447/1105733952
> 800000ac WRITE 1 WAIT mail.bdb page 3
> 800000ac READ 1 HELD mail.bdb page 1
> 800000ac READ 1 HELD mail.bdb page 3
> 800000b1 dd= 0 locks held 0 write locks 0 pid/thread 32447/1122519360
> 800000b2 dd= 0 locks held 0 write locks 0 pid/thread 32447/1130912064
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by object:
> Locker Mode Count Status ----------------- Object ---------------
> 800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
>
> 1 READ 1 HELD id2entry.bdb handle 0
>
> 3 READ 1 HELD dn2id.bdb handle 0
>
> 7 READ 1 HELD objectClass.bdb handle 0
>
> f READ 1 HELD uid.bdb handle 0
>
> 1a READ 1 HELD sn.bdb handle 0
>
> 800000aa READ 1 HELD mail.bdb page 3
> 800000ac READ 1 HELD mail.bdb page 3
> 800000aa WRITE 1 WAIT mail.bdb page 3
> 800000ac WRITE 1 WAIT mail.bdb page 3
>
> c READ 1 HELD cn.bdb handle 0
>
> 15 READ 1 HELD mail.bdb handle 0
>
> 800000aa READ 1 HELD mail.bdb page 1
> 800000ac READ 1 HELD mail.bdb page 1
>
> 12 READ 1 HELD member.bdb handle 0
>
> 800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
>
>
> We have also collected the backtrace of threads for both DB versions
> which I have uploaded to:
> https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.8.30_20130625.txt
>
> https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.7.25_20130625.txt
>
>
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
10 years, 5 months
Re: (ITS#7377) Poor libmdb error handling
by h.b.furuseth@usit.uio.no
I wrote:
> at least MDB_KEYEXIST can happen after touching pages, which affects
> the txn size even if the change "does nothing".
Only when updating subDBs, I think.
--
Hallvard
10 years, 5 months
Re: (ITS#7377) Poor libmdb error handling
by h.b.furuseth@usit.uio.no
Howard Chu writes:
> h.b.furuseth(a)usit.uio.no wrote:
>> Changes from a failed liblmdb operation can be visible. Sometimes
>> the resulting DB will also be inconsistent. The operation should
>> invalidate the affected transaction/cursors or revert the change.
>
> The transaction should be invalidated. Nothing more can be done
> once any error occurs in a transaction.
That would break back-mdb for MDB_NOTFOUND/MDB_KEYEXIST. And at
least MDB_KEYEXIST can happen after touching pages, which affects
the txn size even if the change "does nothing".
I expected to generally invalidate except in necessary cases, and
from then on make more cases "gentle" at our convenience without
spending much time on it.
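The rule being discussed can be modelled in a few lines. This is a toy Python sketch, not liblmdb code: `ToyTxn`, its `failed` flag and the `GENTLE` set are invented for illustration, the two-key limit is an arbitrary way to force a hard error, and the MDB_* strings are only labels borrowed from lmdb.h. The errors in `GENTLE` leave the transaction usable; any other error poisons it so that later operations and the commit are refused.

```python
class TxnError(Exception):
    def __init__(self, code):
        super().__init__(code)
        self.code = code


# Errors that leave the transaction usable in this sketch.
GENTLE = {"MDB_NOTFOUND", "MDB_KEYEXIST"}


class ToyTxn:
    """Toy write transaction over a dict; not the liblmdb implementation."""

    def __init__(self, store):
        self.store = store
        self.pending = dict(store)  # private copy until commit
        self.failed = False

    def _fail(self, code):
        if code not in GENTLE:
            self.failed = True  # poison the txn: only abort makes sense now
        raise TxnError(code)

    def _check(self):
        if self.failed:
            raise TxnError("MDB_BAD_TXN")

    def get(self, key):
        self._check()
        if key not in self.pending:
            self._fail("MDB_NOTFOUND")
        return self.pending[key]

    def put(self, key, value, no_overwrite=False):
        self._check()
        if no_overwrite and key in self.pending:
            self._fail("MDB_KEYEXIST")
        if len(self.pending) >= 2 and key not in self.pending:
            self._fail("MDB_MAP_FULL")  # pretend the map holds two keys
        self.pending[key] = value

    def commit(self):
        self._check()
        self.store.clear()
        self.store.update(self.pending)


store = {"a": 1}
txn = ToyTxn(store)
codes = []
for op in (lambda: txn.get("zz"),          # gentle failure
           lambda: txn.put("a", 2, True),  # gentle failure
           lambda: txn.put("b", 2),        # succeeds
           lambda: txn.put("c", 3),        # hard failure, poisons the txn
           lambda: txn.commit()):          # refused
    try:
        op()
        codes.append("ok")
    except TxnError as exc:
        codes.append(exc.code)
print(codes)
```

Growing the `GENTLE` set over time corresponds to making more cases "gentle" at convenience, without changing the default of invalidating.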
>> put(MDB_MULTIPLE) can write some of the provided items and then
>> fail, but be able to keep the txn usable. (...)
>
> No, "partial success" is not allowed. A transaction is atomic, all
> or nothing.
It would be atomic like a partial OS write(). It'd be up to the
user what to do next. Not that I care myself if it is supported.
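The write() analogy can be made concrete. The sketch below is plain Python with invented names (`put_multiple`, `capacity`); it does not call LMDB, and MDB_PARTIAL_SUCCESS remains a hypothetical return value from this thread. It shows the caller-side pattern only: the operation reports how many items it stored, and the caller decides what to do with the rest.

```python
def put_multiple(table, items, capacity):
    """Store items until the table is full; return how many were written."""
    written = 0
    for key, value in items:
        if len(table) >= capacity and key not in table:
            break  # out of room: stop, but keep what was already stored
        table[key] = value
        written += 1
    return written


table = {}
items = [("k1", "a"), ("k2", "b"), ("k3", "c")]
done = put_multiple(table, items, capacity=2)
remaining = items[done:]  # what the caller would retry, or give up on
print(done, remaining)
```

Under the all-or-nothing rule quoted above, the caller's only option after a short count would be to abort the txn rather than retry `remaining`.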
>> A failed txn could still grow the txn size, unless MDB "un-touches"
>> pages which memcmp says have not changed, rewinds me_pglast and
>> me_pghead, etc. Which seems a lot of work, likely not worth the
>> trouble.
>
> pghead/pglast should not persist into the environment until a successful
> commit. Until then any error should just discard them.
Sorry, I meant a failed _operation_ could grow them and thus affect
the committed freelist size.
>> I think cursors need a C_INVALID flag, so future ops cannot give a
>> surprising success result instead of failing: get(MDB_NEXT/MDB_PREV)
>> take ~C_INITIALIZED to mean (re)start from the beginning/end.
>
> Good idea.
...or C_VALID so a single OR/AND can set/clear them both.
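A small bit-flag sketch of the difference, in Python for brevity. C_INITIALIZED is an existing cursor flag in mdb.c; C_INVALID and C_VALID are the hypothetical flags under discussion, the numeric values are placeholders rather than the library's, and `next_action` is just one possible reading of how get(MDB_NEXT) would consult them.

```python
# Placeholder bit values for illustration only.
C_INITIALIZED = 0x01
C_VALID = 0x02     # hypothetical "cursor may be used" flag
C_INVALID = 0x04   # hypothetical alternative with inverted sense

C_USABLE = C_INITIALIZED | C_VALID


def invalidate_with_valid(flags):
    # One AND clears both bits.
    return flags & ~C_USABLE


def invalidate_with_invalid(flags):
    # Needs an AND to clear C_INITIALIZED and an OR to set C_INVALID.
    return (flags & ~C_INITIALIZED) | C_INVALID


def next_action(flags):
    """What a get(MDB_NEXT)-style call would do under the C_VALID scheme."""
    if not flags & C_VALID:
        return "error"
    if not flags & C_INITIALIZED:
        return "start from first item"
    return "advance"


fresh = C_VALID                    # new cursor: valid, not yet positioned
positioned = fresh | C_INITIALIZED
broken = invalidate_with_valid(positioned)
print(next_action(fresh), next_action(positioned), next_action(broken))
```

With the positive flag, an invalidated cursor can no longer be mistaken for a fresh one that should restart from the beginning.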
>> Don't know how often reverting is a relevant option. (...)
>
> Reverting is irrelevant. Once a txn is aborted no changes are visible.
True, but I meant while the txn is still live and usable. Hopefully
it's only relevant for "nice to have" cases.
--
Hallvard
10 years, 5 months
Re: (ITS#7377) Poor libmdb error handling
by hyc@symas.com
h.b.furuseth(a)usit.uio.no wrote:
> Changes from a failed liblmdb operation can be visible. Sometimes
> the resulting DB will also be inconsistent. The operation should
> invalidate the affected transaction/cursors or revert the change.
The transaction should be invalidated. Nothing more can be done once any error
occurs in a transaction.
> At least I hope it's feasible to invalidate so the user cannot see
> surprising results, instead of just documenting "don't do that".
>
> Related issues:
>
> put(MDB_MULTIPLE) can write some of the provided items and then
> fail, but be able to keep the txn usable. It should invalidate the
> txn anyway, or document that the caller must check for e.g. return
> value MDB_PARTIAL_SUCCESS. Maybe invalidate unless the caller
> requests support for partial success. put() would update the input
> params to indicate which items remain unwritten.
No, "partial success" is not allowed. A transaction is atomic, all or nothing.
> A failed txn could still grow the txn size, unless MDB "un-touches"
> pages which memcmp says have not changed, rewinds me_pglast and
> me_pghead, etc. Which seems a lot of work, likely not worth the
> trouble.
pghead/pglast should not persist into the environment until a successful
commit. Until then any error should just discard them.
>
> I think cursors need a C_INVALID flag, so future ops cannot give a
> surprising success result instead of failing: get(MDB_NEXT/MDB_PREV)
> take ~C_INITIALIZED to mean (re)start from the beginning/end.
Good idea.
> Don't know how often reverting is a relevant option. Rearranging
> code can help, but maybe not much. E.g. the mdb_del() in mdb_drop()
> can do some harmless touches and then fail cleanly - but mdb_drop()
> must invalidate the txn anyway since mdb_drop0() has deleted pages.
> Unless it does extra work - remember mt_free_pgs[0] before and after
> mdb_drop0(), then delete from mt_free_pgs[] the pages added by
> drop0. Or drop0 could be done last since it can fail cleanly just
> by resetting mt_free_pgs[0]. But I don't know if del(MDB_MULTIPLE)
> can also do that if the DB used a subpage. drop0 can't examine the
> subpage after it got deleted.
>
Reverting is irrelevant. Once a txn is aborted no changes are visible.
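The rules stated in this reply (any error invalidates the transaction, and a transaction is all or nothing) can be illustrated with a toy model. It is a sketch only: `ToyTxn`, its status codes, and the staging map are hypothetical, and the empty-key check merely stands in for "some operation failed". Real LMDB works on copy-on-write pages rather than a staging map.

```cpp
#include <cassert>
#include <map>
#include <string>

enum TxnStatus { TXN_OK, TXN_INVALID_KEY, TXN_BAD };

// Toy write transaction over an in-memory map.
class ToyTxn {
public:
    explicit ToyTxn(std::map<std::string, std::string>& db)
        : db_(db), failed_(false) {}

    TxnStatus put(const std::string& key, const std::string& value) {
        if (failed_) return TXN_BAD;      // sticky: nothing works after an error
        if (key.empty()) {                // stand-in for any failing operation
            failed_ = true;
            return TXN_INVALID_KEY;
        }
        pending_[key] = value;            // staged, not yet visible in db_
        return TXN_OK;
    }

    TxnStatus commit() {
        if (failed_) {                    // a failed txn can only be discarded
            pending_.clear();
            return TXN_BAD;
        }
        for (const auto& kv : pending_) db_[kv.first] = kv.second;
        pending_.clear();
        return TXN_OK;
    }

    // Discard all staged changes; nothing becomes visible.
    void abort_txn() {
        pending_.clear();
        failed_ = true;
    }

private:
    std::map<std::string, std::string>& db_;
    std::map<std::string, std::string> pending_;
    bool failed_;
};
```

In this model a batch that fails halfway leaves the map untouched, so a caller never has to reason about which items of a partially applied put were written.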
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
10 years, 5 months
Re: (ITS#7377) Poor libmdb error handling
by h.b.furuseth@usit.uio.no
Changes from a failed liblmdb operation can be visible. Sometimes
the resulting DB will also be inconsistent. The operation should
invalidate the affected transaction/cursors or revert the change.
At least I hope it's feasible to invalidate so the user cannot see
surprising results, instead of just documenting "don't do that".
Related issues:
put(MDB_MULTIPLE) can write some of the provided items and then
fail, but be able to keep the txn usable. It should invalidate the
txn anyway, or document that the caller must check for e.g. return
value MDB_PARTIAL_SUCCESS. Maybe invalidate unless the caller
requests support for partial success. put() would update the input
params to indicate which items remain unwritten.
A failed txn could still grow the txn size, unless MDB "un-touches"
pages which memcmp says have not changed, rewinds me_pglast and
me_pghead, etc. Which seems a lot of work, likely not worth the
trouble.
I think cursors need a C_INVALID flag, so future ops cannot give a
surprising success result instead of failing: get(MDB_NEXT/MDB_PREV)
take ~C_INITIALIZED to mean (re)start from the beginning/end.
Don't know how often reverting is a relevant option. Rearranging
code can help, but maybe not much. E.g. the mdb_del() in mdb_drop()
can do some harmless touches and then fail cleanly - but mdb_drop()
must invalidate the txn anyway since mdb_drop0() has deleted pages.
Unless it does extra work - remember mt_free_pgs[0] before and after
mdb_drop0(), then delete from mt_free_pgs[] the pages added by
drop0. Or drop0 could be done last since it can fail cleanly just
by resetting mt_free_pgs[0]. But I don't know if del(MDB_MULTIPLE)
can also do that if the DB used a subpage. drop0 can't examine the
subpage after it got deleted.
--
Hallvard
10 years, 5 months
Re: (ITS#7633) Slapd hangs on hdb write lock
by hyc@symas.com
dusan.fric(a)t-systems.sk wrote:
> Full_Name: Dusan Fric
> Version: 2.4.32
> OS: RHEL 5.7 x64
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (88.212.40.139)
>
>
> We are experiencing frequent hangs in slapd. Once hung, we usually cannot
> connect until we kill -9 the slapd process and restart it. The directory is used
> for 2 applications as user eDir and we have been using it in production for over
> 1 month - we have noted that the busier the directory becomes, the more often it
> hangs (now twice per week).
>
> We have installed identical configurations on 3 environments, each has only one
> single server (no replication). There are 30k entries in the directory
> (production).
>
> We are running:
>
> RHEL 5.7 x64 (VMWare with NFS mountpoints)
> OpenLDAP 2.4.32
> Berkeley DB 4.8.30 (4.7.25)
>
> We started with DB 4.8.30; after downgrading to version 4.7.25 (according to
> ITS#7378 - https://www.openldap.org/its/index.cgi/Incoming?id=7378;selectid=7378)
> we are facing the same issue.
>
> The problem occurs when two requests simultaneously try to update attributes of
> a record, in most cases for the same DN.
> We can easily reproduce it with a Java test program running 2 threads, each
> connecting to the LDAP server and updating the record for a particular DN.
> The DNs need not be identical; it is enough that they reside on the same BDB
> page.
Thanks for the traces, but the information you've provided shows that this is
a BDB problem, not an OpenLDAP bug. You'll have to contact Oracle for help.
The BDB deadlock detector is supposed to handle these situations.
For example in your 4.8.30 lock status there are only 2 conflicting
transactions 80000028 and 80000029 and they are only contending on a single
page. This is one of the most elementary deadlock situations.
You might try different settings for the deadlock detector, but ultimately
this appears to be a BDB bug and the choice of detector shouldn't matter.
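To make the "elementary deadlock" concrete, the following sketch builds a wait-for graph from lock-table rows like those in the db_stat output and checks it for a cycle. It is not BDB's detector: `LockEntry`, `build_wait_graph`, `has_deadlock`, and `sample_locks` are hypothetical names, and the conflict rule (any pairing that involves a WRITE conflicts) is a simplifying assumption. The sample rows are the four mail.bdb page 3 entries from the 4.8.30 dump.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// One row of the "Locks grouped by lockers" table.
struct LockEntry {
    std::string locker;
    std::string mode;    // "READ" or "WRITE"
    std::string status;  // "HELD" or "WAIT"
    std::string object;  // e.g. "mail.bdb page 3"
};

// locker -> set of lockers it is waiting on
using WaitGraph = std::map<std::string, std::set<std::string>>;

WaitGraph build_wait_graph(const std::vector<LockEntry>& locks) {
    WaitGraph g;
    for (const auto& w : locks) {
        if (w.status != "WAIT") continue;
        for (const auto& h : locks) {
            if (h.status != "HELD" || h.object != w.object ||
                h.locker == w.locker) continue;
            // READ/READ is compatible; anything involving WRITE conflicts.
            if (w.mode == "WRITE" || h.mode == "WRITE")
                g[w.locker].insert(h.locker);
        }
    }
    return g;
}

// Depth-first search; reaching a node already on the current path is a cycle.
static bool dfs(const WaitGraph& g, const std::string& node,
                std::set<std::string>& on_path, std::set<std::string>& done) {
    if (on_path.count(node)) return true;
    if (done.count(node)) return false;
    on_path.insert(node);
    auto it = g.find(node);
    if (it != g.end())
        for (const auto& next : it->second)
            if (dfs(g, next, on_path, done)) return true;
    on_path.erase(node);
    done.insert(node);
    return false;
}

bool has_deadlock(const WaitGraph& g) {
    std::set<std::string> on_path, done;
    for (const auto& kv : g)
        if (dfs(g, kv.first, on_path, done)) return true;
    return false;
}

// The mail.bdb page 3 rows from the 4.8.30 lock status.
std::vector<LockEntry> sample_locks() {
    return {
        {"80000028", "WRITE", "WAIT", "mail.bdb page 3"},
        {"80000028", "READ",  "HELD", "mail.bdb page 3"},
        {"80000029", "WRITE", "WAIT", "mail.bdb page 3"},
        {"80000029", "READ",  "HELD", "mail.bdb page 3"},
    };
}
```

Each transaction holds a READ lock on page 3 and waits to upgrade it to WRITE, so each waits on the other: a two-node cycle that neither can break without one of them being aborted.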
> Configuration
>
> olc cn=config (part):
> dn: cn=config
> objectClass: olcGlobal
> cn: config
> olcConcurrency: 0
> olcConnMaxPending: 100
> olcConnMaxPendingAuth: 1000
> olcSockbufMaxIncoming: 262143
> olcSockbufMaxIncomingAuth: 16777215
> olcThreads: 32
>
> olc DB hdb config (part):
> dn: olcDatabase={1}hdb,cn=config
> objectClass: olcHdbConfig
> objectClass: olcDatabaseConfig
> olcDatabase: {1}hdb
> olcDbCacheSize: 80000
> olcDbCheckpoint: 128 5
> olcDbConfig: {0}set_cachesize 0 268435456 1
> olcDbConfig: {1}set_lg_max 10485760
> olcDbConfig: {2}set_lg_bsize 2097152
> olcDbConfig: {3}set_lg_dir /pkg/openldap/dblog
> olcDbConfig: {4}set_lg_regionmax 262144
> olcDbConfig: {5}set_lk_detect DB_LOCK_EXPIRE
> olcDbConfig: {6}set_flags DB_TXN_NOSYNC
> olcDbDirtyRead: FALSE
> olcDbDNcacheSize: 0
> olcDbIDLcacheSize: 240000
> olcDbIndex: default eq,sub
> olcDbIndex: objectClass eq
> olcDbIndex: cn eq,sub
> olcDbIndex: uid eq,sub
> olcDbIndex: mail pres,eq
> olcDbIndex: sn eq,sub
> olcDbIndex: member eq
> olcDbLinearIndex: FALSE
> olcDbMode: 0600
> olcDbNoSync: TRUE
> olcDbSearchStack: 20
> olcDbShmKey: 0
> olcLastMod: TRUE
> olcMaxDerefDepth: 15
> olcMonitoring: TRUE
> olcReadOnly: FALSE
>
> We have managed to collect db_stat lock information, which indicates the same
> issue with DB write locks on both DB versions.
>
> db_stat -C ol (4.8.30)
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Lock REGINFO information:
> Lock Region type
> 5 Region ID
> /apps/DECCLASA-1/data/openldap/__db.005 Region name
> 0x2ab41cdc3000 Region address
> 0x2ab41cdc3138 Region primary address
> 0 Region maximum allocation
> 0 Region allocated
> Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
> REGION_JOIN_OK Region flags
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by lockers:
> Locker Mode Count Status ----------------- Object ---------------
> 1 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
> 1 READ 1 HELD id2entry.bdb handle 0
> 2 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
> 3 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
> 3 READ 1 HELD dn2id.bdb handle 0
> 4 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 5 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
> 6 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
> 7 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 8 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
> 9 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> 9 READ 1 HELD objectClass.bdb handle 0
> a dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> b dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> c dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> c READ 1 HELD uid.bdb handle 0
> d dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> f dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> 10 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 10 READ 1 HELD member.bdb handle 0
> 11 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 12 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 13 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 13 READ 1 HELD mail.bdb handle 0
> 14 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 15 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 16 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 17 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 18 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 18 READ 1 HELD sn.bdb handle 0
> 19 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1a dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1b dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
> 1b READ 1 HELD cn.bdb handle 0
> 1c dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1d dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 1e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> 1f dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
> 80000003 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
> 80000004 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
> 80000005 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
> 80000006 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
> 80000007 dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
> 80000026 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> 80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
> 80000027 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
> 80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
> 80000028 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
> 80000028 WRITE 1 WAIT mail.bdb page 3
> 80000028 READ 1 HELD mail.bdb page 3
> 80000029 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
> 80000029 WRITE 1 WAIT mail.bdb page 3
> 80000029 READ 1 HELD mail.bdb page 3
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by object:
> Locker Mode Count Status ----------------- Object ---------------
> 80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
>
> 1b READ 1 HELD cn.bdb handle 0
>
> 10 READ 1 HELD member.bdb handle 0
>
> 9 READ 1 HELD objectClass.bdb handle 0
>
> c READ 1 HELD uid.bdb handle 0
>
> 80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
>
> 1 READ 1 HELD id2entry.bdb handle 0
>
> 18 READ 1 HELD sn.bdb handle 0
>
> 3 READ 1 HELD dn2id.bdb handle 0
>
> 80000029 READ 1 HELD mail.bdb page 3
> 80000028 READ 1 HELD mail.bdb page 3
> 80000029 WRITE 1 WAIT mail.bdb page 3
> 80000028 WRITE 1 WAIT mail.bdb page 3
>
> 13 READ 1 HELD mail.bdb handle 0
>
> db_stat -C ol (4.7.25)
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Lock REGINFO information:
> Lock Region type
> 5 Region ID
> /apps/DECCLASA-1/data/openldap/__db.005 Region name
> 0x2b4a8a577000 Original region address
> 0x2b4a8a577000 Region address
> 0x2b4a8a577138 Region primary address
> 0 Region maximum allocation
> 0 Region allocated
> Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
> REGION_JOIN_OK Region flags
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by lockers:
> Locker Mode Count Status ----------------- Object ---------------
> 1 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
> 1 READ 1 HELD id2entry.bdb handle 0
> 2 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
> 3 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
> 3 READ 1 HELD dn2id.bdb handle 0
> 4 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
> 5 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
> 6 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 7 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> 7 READ 1 HELD objectClass.bdb handle 0
> 8 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 9 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> a dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> b dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> c dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> c READ 1 HELD cn.bdb handle 0
> d dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> e dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> f dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> f READ 1 HELD uid.bdb handle 0
> 10 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 11 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 12 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> 12 READ 1 HELD member.bdb handle 0
> 13 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 14 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 15 dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
> 15 READ 1 HELD mail.bdb handle 0
> 16 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 17 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 18 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 19 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1a dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
> 1a READ 1 HELD sn.bdb handle 0
> 1b dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1c dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1d dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1e dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 1f dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 80000003 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
> 80000004 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
> 80000008 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
> 80000038 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
> 800000a9 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
> 800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
> 800000aa dd= 0 locks held 2 write locks 0 pid/thread 32447/1097341248
> 800000aa WRITE 1 WAIT mail.bdb page 3
> 800000aa READ 1 HELD mail.bdb page 1
> 800000aa READ 1 HELD mail.bdb page 3
> 800000ab dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
> 800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
> 800000ac dd= 0 locks held 2 write locks 0 pid/thread 32447/1105733952
> 800000ac WRITE 1 WAIT mail.bdb page 3
> 800000ac READ 1 HELD mail.bdb page 1
> 800000ac READ 1 HELD mail.bdb page 3
> 800000b1 dd= 0 locks held 0 write locks 0 pid/thread 32447/1122519360
> 800000b2 dd= 0 locks held 0 write locks 0 pid/thread 32447/1130912064
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Locks grouped by object:
> Locker Mode Count Status ----------------- Object ---------------
> 800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
>
> 1 READ 1 HELD id2entry.bdb handle 0
>
> 3 READ 1 HELD dn2id.bdb handle 0
>
> 7 READ 1 HELD objectClass.bdb handle 0
>
> f READ 1 HELD uid.bdb handle 0
>
> 1a READ 1 HELD sn.bdb handle 0
>
> 800000aa READ 1 HELD mail.bdb page 3
> 800000ac READ 1 HELD mail.bdb page 3
> 800000aa WRITE 1 WAIT mail.bdb page 3
> 800000ac WRITE 1 WAIT mail.bdb page 3
>
> c READ 1 HELD cn.bdb handle 0
>
> 15 READ 1 HELD mail.bdb handle 0
>
> 800000aa READ 1 HELD mail.bdb page 1
> 800000ac READ 1 HELD mail.bdb page 1
>
> 12 READ 1 HELD member.bdb handle 0
>
> 800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
>
>
> We have also collected the backtrace of threads for both DB versions which I
> have uploaded to:
> https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.8.30_201...
>
> https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.7.25_201...
>
>
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
10 years, 5 months
(ITS#7633) Slapd hangs on hdb write lock
by dusan.fric@t-systems.sk
Full_Name: Dusan Fric
Version: 2.4.32
OS: RHEL 5.7 x64
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (88.212.40.139)
We are experiencing frequent hangs in slapd. Once hung, we usually cannot
connect until we kill -9 the slapd process and restart it. The directory is used
for 2 applications as user eDir and we have been using it in production for over
1 month - we have noted that the busier the directory becomes, the more often it
hangs (now twice per week).
We have installed identical configurations on 3 environments, each has only one
single server (no replication). There are 30k entries in the directory
(production).
We are running:
RHEL 5.7 x64 (VMWare with NFS mountpoints)
OpenLDAP 2.4.32
Berkeley DB 4.8.30 (4.7.25)
We started with DB 4.8.30; after downgrading to version 4.7.25 (according to
ITS#7378 - https://www.openldap.org/its/index.cgi/Incoming?id=7378;selectid=7378)
we are facing the same issue.
The problem occurs when two requests simultaneously try to update attributes of
a record, in most cases for the same DN.
We can easily reproduce it with a Java test program running 2 threads, each
connecting to the LDAP server and updating the record for a particular DN.
The DNs need not be identical; it is enough that they reside on the same BDB
page.
Configuration
olc cn=config (part):
dn: cn=config
objectClass: olcGlobal
cn: config
olcConcurrency: 0
olcConnMaxPending: 100
olcConnMaxPendingAuth: 1000
olcSockbufMaxIncoming: 262143
olcSockbufMaxIncomingAuth: 16777215
olcThreads: 32
olc DB hdb config (part):
dn: olcDatabase={1}hdb,cn=config
objectClass: olcHdbConfig
objectClass: olcDatabaseConfig
olcDatabase: {1}hdb
olcDbCacheSize: 80000
olcDbCheckpoint: 128 5
olcDbConfig: {0}set_cachesize 0 268435456 1
olcDbConfig: {1}set_lg_max 10485760
olcDbConfig: {2}set_lg_bsize 2097152
olcDbConfig: {3}set_lg_dir /pkg/openldap/dblog
olcDbConfig: {4}set_lg_regionmax 262144
olcDbConfig: {5}set_lk_detect DB_LOCK_EXPIRE
olcDbConfig: {6}set_flags DB_TXN_NOSYNC
olcDbDirtyRead: FALSE
olcDbDNcacheSize: 0
olcDbIDLcacheSize: 240000
olcDbIndex: default eq,sub
olcDbIndex: objectClass eq
olcDbIndex: cn eq,sub
olcDbIndex: uid eq,sub
olcDbIndex: mail pres,eq
olcDbIndex: sn eq,sub
olcDbIndex: member eq
olcDbLinearIndex: FALSE
olcDbMode: 0600
olcDbNoSync: TRUE
olcDbSearchStack: 20
olcDbShmKey: 0
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcMonitoring: TRUE
olcReadOnly: FALSE
We have managed to collect db_stat lock information, which indicates the same
issue with DB write locks on both DB versions.
db_stat -C ol (4.8.30)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock REGINFO information:
Lock Region type
5 Region ID
/apps/DECCLASA-1/data/openldap/__db.005 Region name
0x2ab41cdc3000 Region address
0x2ab41cdc3138 Region primary address
0 Region maximum allocation
0 Region allocated
Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
REGION_JOIN_OK Region flags
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers:
Locker Mode Count Status ----------------- Object ---------------
1 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
1 READ 1 HELD id2entry.bdb handle 0
2 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
3 dd= 0 locks held 1 write locks 0 pid/thread 30034/47555767609232
3 READ 1 HELD dn2id.bdb handle 0
4 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
5 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
6 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
7 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
8 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
9 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
9 READ 1 HELD objectClass.bdb handle 0
a dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
b dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
c dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
c READ 1 HELD uid.bdb handle 0
d dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
f dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
10 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
10 READ 1 HELD member.bdb handle 0
11 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
12 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
13 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
13 READ 1 HELD mail.bdb handle 0
14 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
15 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
16 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
17 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
18 dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
18 READ 1 HELD sn.bdb handle 0
19 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
1a dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
1b dd= 0 locks held 1 write locks 0 pid/thread 30034/1147971904
1b READ 1 HELD cn.bdb handle 0
1c dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
1d dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
1e dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
1f dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
80000003 dd= 0 locks held 0 write locks 0 pid/thread 30034/47555767609232
80000004 dd= 0 locks held 0 write locks 0 pid/thread 30034/1139579200
80000005 dd= 0 locks held 0 write locks 0 pid/thread 30034/1147971904
80000006 dd= 0 locks held 0 write locks 0 pid/thread 30034/1131186496
80000007 dd= 0 locks held 0 write locks 0 pid/thread 30034/1122793792
80000026 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
80000027 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
80000028 dd= 0 locks held 1 write locks 0 pid/thread 30034/1131186496
80000028 WRITE 1 WAIT mail.bdb page 3
80000028 READ 1 HELD mail.bdb page 3
80000029 dd= 0 locks held 1 write locks 0 pid/thread 30034/1122793792
80000029 WRITE 1 WAIT mail.bdb page 3
80000029 READ 1 HELD mail.bdb page 3
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object:
Locker Mode Count Status ----------------- Object ---------------
80000027 READ 1 HELD 0x36490 len: 9 data: 0x370000000000000000
1b READ 1 HELD cn.bdb handle 0
10 READ 1 HELD member.bdb handle 0
9 READ 1 HELD objectClass.bdb handle 0
c READ 1 HELD uid.bdb handle 0
80000026 READ 1 HELD 0x5e948 len: 9 data: 0x560x0500000000000000
1 READ 1 HELD id2entry.bdb handle 0
18 READ 1 HELD sn.bdb handle 0
3 READ 1 HELD dn2id.bdb handle 0
80000029 READ 1 HELD mail.bdb page 3
80000028 READ 1 HELD mail.bdb page 3
80000029 WRITE 1 WAIT mail.bdb page 3
80000028 WRITE 1 WAIT mail.bdb page 3
13 READ 1 HELD mail.bdb handle 0
db_stat -C ol (4.7.25)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Lock REGINFO information:
Lock Region type
5 Region ID
/apps/DECCLASA-1/data/openldap/__db.005 Region name
0x2b4a8a577000 Original region address
0x2b4a8a577000 Region address
0x2b4a8a577138 Region primary address
0 Region maximum allocation
0 Region allocated
Region allocations: 3006 allocations, 0 failures, 0 frees, 1 longest
REGION_JOIN_OK Region flags
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by lockers:
Locker Mode Count Status ----------------- Object ---------------
1 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
1 READ 1 HELD id2entry.bdb handle 0
2 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
3 dd= 0 locks held 1 write locks 0 pid/thread 32447/47463753631632
3 READ 1 HELD dn2id.bdb handle 0
4 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
5 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
6 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
7 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
7 READ 1 HELD objectClass.bdb handle 0
8 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
9 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
a dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
b dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
c dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
c READ 1 HELD cn.bdb handle 0
d dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
e dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
f dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
f READ 1 HELD uid.bdb handle 0
10 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
11 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
12 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
12 READ 1 HELD member.bdb handle 0
13 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
14 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
15 dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
15 READ 1 HELD mail.bdb handle 0
16 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
17 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
18 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
19 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
1a dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
1a READ 1 HELD sn.bdb handle 0
1b dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
1c dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
1d dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
1e dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
1f dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
80000003 dd= 0 locks held 0 write locks 0 pid/thread 32447/47463753631632
80000004 dd= 0 locks held 0 write locks 0 pid/thread 32447/1097341248
80000008 dd= 0 locks held 0 write locks 0 pid/thread 32447/1105733952
80000038 dd= 0 locks held 0 write locks 0 pid/thread 32447/1114126656
800000a9 dd= 0 locks held 1 write locks 0 pid/thread 32447/1097341248
800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
800000aa dd= 0 locks held 2 write locks 0 pid/thread 32447/1097341248
800000aa WRITE 1 WAIT mail.bdb page 3
800000aa READ 1 HELD mail.bdb page 1
800000aa READ 1 HELD mail.bdb page 3
800000ab dd= 0 locks held 1 write locks 0 pid/thread 32447/1105733952
800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
800000ac dd= 0 locks held 2 write locks 0 pid/thread 32447/1105733952
800000ac WRITE 1 WAIT mail.bdb page 3
800000ac READ 1 HELD mail.bdb page 1
800000ac READ 1 HELD mail.bdb page 3
800000b1 dd= 0 locks held 0 write locks 0 pid/thread 32447/1122519360
800000b2 dd= 0 locks held 0 write locks 0 pid/thread 32447/1130912064
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Locks grouped by object:
Locker Mode Count Status ----------------- Object ---------------
800000a9 READ 1 HELD 0x2d228 len: 9 data: 0x370000000000000000
1 READ 1 HELD id2entry.bdb handle 0
3 READ 1 HELD dn2id.bdb handle 0
7 READ 1 HELD objectClass.bdb handle 0
f READ 1 HELD uid.bdb handle 0
1a READ 1 HELD sn.bdb handle 0
800000aa READ 1 HELD mail.bdb page 3
800000ac READ 1 HELD mail.bdb page 3
800000aa WRITE 1 WAIT mail.bdb page 3
800000ac WRITE 1 WAIT mail.bdb page 3
c READ 1 HELD cn.bdb handle 0
15 READ 1 HELD mail.bdb handle 0
800000aa READ 1 HELD mail.bdb page 1
800000ac READ 1 HELD mail.bdb page 1
12 READ 1 HELD member.bdb handle 0
800000ab READ 1 HELD 0x43d58 len: 9 data: 0x540x0500000000000000
We have also collected the backtrace of threads for both DB versions which I
have uploaded to:
https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.8.30_201...
https://dl.dropboxusercontent.com/u/92115703/dusan_gdb_threads_4.7.25_201...
10 years, 5 months
(ITS#7632) ldap_start_tls_s core dumps for 64 bit
by a16474@gmail.com
Full_Name: Amit Sinha
Version: openldap-2.4.35
OS: Linux 2.6.18-308.11.1.el5
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (72.163.217.105)
When I compile and run the code below on 64-bit Linux, it core dumps in
ldap_start_tls_s().
#include <ldap.h>
#include <iostream>
using namespace std;

int main()
{
    const char* hostname = "myhost.mydomain.com";
    LDAP* l = ldap_init(hostname, 389);
    int version = LDAP_VERSION3;

    if (LDAP_SUCCESS != ldap_set_option(NULL, LDAP_OPT_DEBUG_LEVEL, "7"))
    {
        cout << "error LDAP_OPT_DEBUG_LEVEL" << endl;
    }
    if (LDAP_SUCCESS != ldap_set_option(l, LDAP_OPT_PROTOCOL_VERSION, &version))
    {
        cout << "error LDAP_OPT_PROTOCOL_VERSION\n";
    }
    if (LDAP_SUCCESS != ldap_set_option(l, LDAP_OPT_REFERRALS, LDAP_OPT_OFF))
    {
        cout << "error LDAP_OPT_REFERRALS\n";
    }
    if (LDAP_SUCCESS != ldap_set_option(NULL, LDAP_OPT_X_TLS_CTX, NULL))
    {
        cout << "error LDAP_OPT_X_TLS_CTX\n";
    }
    if (LDAP_SUCCESS != ldap_set_option(NULL, LDAP_OPT_X_TLS_CACERTDIR, "/myDirOfCert"))
    {
        cout << "error LDAP_OPT_X_TLS_CACERTDIR\n";
    }

    int rc;
    if (LDAP_SUCCESS != (rc = ldap_start_tls_s(l, NULL, NULL)))
    {
        cout << "error ldap_start_tls_s:" << ldap_err2string(rc) << endl;
    }
}
BACKTRACE:
(gdb) bt
#0 0x00002b7d266bbbff in sk_value () from /root/openssl/openssl-1.0.1e/libcrypto.so.1.0.0
#1 0x00002b7d264a62fe in ssl3_output_cert_chain () from /root/openssl/openssl-1.0.1e/libssl.so.1.0.0
#2 0x00002b7d2649fffd in ssl3_send_client_certificate () from /root/openssl/openssl-1.0.1e/libssl.so.1.0.0
#3 0x00002b7d264a0531 in ssl3_connect () from /root/openssl/openssl-1.0.1e/libssl.so.1.0.0
#4 0x00002b7d264a9487 in ssl23_connect () from /root/openssl/openssl-1.0.1e/libssl.so.1.0.0
#5 0x00002b7d263676ec in tlso_session_connect (ld=<value optimized out>, sess=0x0) at tls_o.c:363
#6 0x00002b7d2636698d in ldap_int_tls_connect (ld=0x3d44e80, conn=0x3d675f0, srv=<value optimized out>) at tls2.c:362
#7 ldap_int_tls_start (ld=0x3d44e80, conn=0x3d675f0, srv=<value optimized out>) at tls2.c:860
#8 0x00002b7d26366d35 in ldap_start_tls_s (ld=0x3d44e80, serverctrls=0x0, clientctrls=<value optimized out>) at tls2.c:1040
#9 0x0000000000400dd2 in main ()
The OpenSSL version I am using is openssl-1.0.1e.
10 years, 5 months