Hello,
after inserting (ADD) one object, I get following messages in the logfile and the sapld hangs:
Jun 1 09:02:24 ldap-01 slapd[8836]: conn=633789 op=1 ADD dn="cn=3,cn=2,cn=node,cn=1,cn=BBB,cn=AAA,cn=companies,ou=root" Jun 1 09:02:24 ldap-01 slapd[8836]: => bdb_idl_insert_key: c_get failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) Jun 1 09:02:24 ldap-01 slapd[8836]: => bdb_dn2id_add 0x205e7e: subtree (cn=BBB,cn=AAA,cn=companies,ou=root) insert failed: -30995
After this, I don't see any messages in the log till the staff was initiating a stop/start:
Jun 1 09:09:29 ldap-01 slapd[8836]: daemon: shutdown requested and initiated. Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633113 fd=9 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=405426 fd=12 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633787 fd=13 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=1011 fd=14 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=1013 fd=18 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632703 fd=33 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632710 fd=37 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632883 fd=39 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632762 fd=40 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633211 fd=41 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633735 fd=45 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632829 fd=47 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633170 fd=48 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633200 fd=50 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633788 fd=55 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: slapd shutdown: waiting for 22 operations/tasks to finish Jun 1 09:09:42 ldap-01 slapd[20945]: @(#) $OpenLDAP: slapd 2.4.30 $ opensuse-buildservice@opensuse.org Jun 1 09:09:43 ldap-01 slapd[20945]: slapd starting
After the error message at 09:02:24 the slapd did not answer any request. I cannot recover that problem in a test environment.
The server is running in a MM environment (two masters), and the server gets 200-1200 search request/s Because of this high rate, we set "loglevel 0". Since we updated the slapd to 2.4.30 (from 2.4.28) the server crashes/hangs about on times a week. Because of this we set loglevel 256 back again.
Would be very nice, if I can fix the problem, please help.
Thanks in advance
Meike
PS: The slapd is a modified, self compiled version because of larger IDL with following changes:
openldap-2.4.30/servers/slapd/back-bdb/idl.h: -#define BDB_IDL_LOGN 16 /* DB_SIZE is 2^16, UM_SIZE is 2^17 */ +#define BDB_IDL_LOGN 17 /* DB_SIZE is 2^17, UM_SIZE is 2^18 */
openldap-2.4.30/include/ldap_pvt_thread.h: -# define LDAP_PVT_THREAD_STACK_SIZE ( 1 * 1024 * 1024 * sizeof(void *) ) +# define LDAP_PVT_THREAD_STACK_SIZE ( 2 * 1024 * 1024 * sizeof(void *) )
All tests where running well!
Hello dear list,
does anyone can help me?
Kindly regards and thanks
Meike
2012/6/1 Meike Stone meike.stone@googlemail.com:
Hello,
after inserting (ADD) one object, I get following messages in the logfile and the sapld hangs:
Jun 1 09:02:24 ldap-01 slapd[8836]: conn=633789 op=1 ADD dn="cn=3,cn=2,cn=node,cn=1,cn=BBB,cn=AAA,cn=companies,ou=root" Jun 1 09:02:24 ldap-01 slapd[8836]: => bdb_idl_insert_key: c_get failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) Jun 1 09:02:24 ldap-01 slapd[8836]: => bdb_dn2id_add 0x205e7e: subtree (cn=BBB,cn=AAA,cn=companies,ou=root) insert failed: -30995
After this, I don't see any messages in the log till the staff was initiating a stop/start:
Jun 1 09:09:29 ldap-01 slapd[8836]: daemon: shutdown requested and initiated. Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633113 fd=9 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=405426 fd=12 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633787 fd=13 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=1011 fd=14 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=1013 fd=18 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632703 fd=33 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632710 fd=37 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632883 fd=39 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632762 fd=40 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633211 fd=41 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633735 fd=45 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632829 fd=47 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633170 fd=48 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633200 fd=50 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633788 fd=55 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: slapd shutdown: waiting for 22 operations/tasks to finish Jun 1 09:09:42 ldap-01 slapd[20945]: @(#) $OpenLDAP: slapd 2.4.30 $ opensuse-buildservice@opensuse.org Jun 1 09:09:43 ldap-01 slapd[20945]: slapd starting
After the error message at 09:02:24 the slapd did not answer any request. I cannot recover that problem in a test environment.
The server is running in a MM environment (two masters), and the server gets 200-1200 search request/s Because of this high rate, we set "loglevel 0". Since we updated the slapd to 2.4.30 (from 2.4.28) the server crashes/hangs about on times a week. Because of this we set loglevel 256 back again.
Would be very nice, if I can fix the problem, please help.
Thanks in advance
Meike
PS: The slapd is a modified, self compiled version because of larger IDL with following changes:
openldap-2.4.30/servers/slapd/back-bdb/idl.h: -#define BDB_IDL_LOGN 16 /* DB_SIZE is 2^16, UM_SIZE is 2^17 */ +#define BDB_IDL_LOGN 17 /* DB_SIZE is 2^17, UM_SIZE is 2^18 */
openldap-2.4.30/include/ldap_pvt_thread.h: -# define LDAP_PVT_THREAD_STACK_SIZE ( 1 * 1024 * 1024 * sizeof(void *) ) +# define LDAP_PVT_THREAD_STACK_SIZE ( 2 * 1024 * 1024 * sizeof(void *) )
All tests where running well!
Hello,
yesterday, the slapd hangs again, but this time the other master...
Here is the log from the time, the Server was hanging. But today, no error was logged, not in Master 2 (ldap-02) nor Master 1 (not included).
Jun 10 15:19:03 ldap-02 slapd[31642]: conn=1710241 fd=61 ACCEPT from IP=19.168.1.10:45678 (IP=0.0.0.0:389) Jun 10 15:19:03 ldap-02 slapd[31642]: conn=1710241 op=0 BIND dn="cn=root,ou=admins,ou=root" method=128 Jun 10 15:19:03 ldap-02 slapd[31642]: conn=1710241 op=0 BIND dn="cn=root,ou=admins,ou=root" mech=SIMPLE ssf=0 Jun 10 15:19:03 ldap-02 slapd[31642]: conn=1710241 op=0 RESULT tag=97 err=0 text= Jun 10 15:19:03 ldap-02 slapd[31642]: conn=1710241 op=1 ADD dn="cn=object1,cn=node10,cn=Room,cn=2,cn=1329915568-54148,cn=BBB,cn=AAA,cn=companies,ou=root" Jun 10 15:29:28 ldap-02 slapd[31642]: daemon: shutdown requested and initiated. Jun 10 15:29:28 ldap-02 slapd[31642]: conn=206330 fd=9 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1000 fd=12 closed (connection lost on write) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709749 fd=31 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709789 fd=34 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709851 fd=36 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709870 fd=38 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709888 fd=40 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709896 fd=41 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709914 fd=43 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709899 fd=44 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1710145 fd=47 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709933 fd=48 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1710075 fd=50 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1709981 fd=52 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1710016 fd=54 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1710150 fd=56 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1710240 fd=59 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: conn=1710238 fd=60 closed (slapd shutdown) Jun 10 15:29:28 ldap-02 slapd[31642]: slapd shutdown: waiting for 20 operations/tasks to finish Jun 10 15:34:20 ldap-02 slapd[5030]: @(#) $OpenLDAP: slapd 2.4.30 $ opensuse-buildservice@opensuse.org Jun 10 15:34:20 ldap-02 slapd[5030]: daemon: IPv6 socket() failed errno=97 (Address family not supported by protocol) Jun 10 15:34:20 ldap-02 slapd[5037]: bdb_db_open: database "ou=root": unclean shutdown detected; attempting recovery. Jun 10 15:34:21 ldap-02 slapd[5037]: slapd starting Jun 10 15:34:21 ldap-02 slapd[5037]: conn=1000 fd=13 ACCEPT from IP=127.0.0.1:56444 (IP=0.0.0.0:389) Jun 10 15:34:21 ldap-02 slapd[5037]: conn=1000 op=0 BIND dn="" method=128 Jun 10 15:34:21 ldap-02 slapd[5037]: conn=1000 op=0 RESULT tag=97 err=48 text=anonymous bind disallowed Jun 10 15:34:21 ldap-02 slapd[5037]: conn=1000 op=1 UNBIND Jun 10 15:34:21 ldap-02 slapd[5037]: conn=1000 fd=13 closed Jun 10 15:34:22 ldap-02 slapd[5037]: conn=1001 fd=13 ACCEPT from IP=192.168.1.10:48743 (IP=0.0.0.0:389)
As you can see, the ADD operation was the last operation on (15:19:03). On 15:29:28, the responsible operation staff did reboot the slapd. He took care, that no slapd process was running before he started the slapd again.
We set "gentlehup on" on the config, so the server tried to answer all pending request and close the connections..(Does the server answer pending requests?). But after starting, bdb_db_open says "unclean shutdown detected", so it seems to me, that the server quit after time without fulfill all requirements for a clean shutdown?
Additionally, I have running sar to collect informations about the system activity (see below). Here we can see, that in the time range (15:20-15:30) while the slapd was hanging, the slapd accepts new connections (tcpsck / connection count is growing), but does not serve answers (tcp-tw / no "closed" connection).
15:00:01 totsck tcpsck udpsck rawsck ip-frag tcp-tw 15:05:01 165 8 5 0 0 11 15:10:01 164 7 5 0 0 17 15:15:01 166 9 5 0 0 14 15:20:01 196 79 5 0 0 0 15:25:01 194 166 5 0 0 0 15:30:01 159 4 5 0 0 16 15:35:01 162 6 5 0 0 14 15:40:01 167 11 5 0 0 27 15:45:01 167 11 5 0 0 28
During the whole time, the CPU was idling between 97% ... 100%
The Systems are running in production, so I can't make any tests. In our test environment are no problems seen till now. But there the load (ldap operations) is very low ..
The configuration for larger IDL (see first posting), we running since 2,5 years without problems by mostly the same size of the directory.
Thanks Meike
2012/6/6 Meike Stone meike.stone@googlemail.com:
Hello dear list,
does anyone can help me?
Kindly regards and thanks
Meike
2012/6/1 Meike Stone meike.stone@googlemail.com:
Hello,
after inserting (ADD) one object, I get following messages in the logfile and the sapld hangs:
Jun 1 09:02:24 ldap-01 slapd[8836]: conn=633789 op=1 ADD dn="cn=3,cn=2,cn=node,cn=1,cn=BBB,cn=AAA,cn=companies,ou=root" Jun 1 09:02:24 ldap-01 slapd[8836]: => bdb_idl_insert_key: c_get failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30995) Jun 1 09:02:24 ldap-01 slapd[8836]: => bdb_dn2id_add 0x205e7e: subtree (cn=BBB,cn=AAA,cn=companies,ou=root) insert failed: -30995
After this, I don't see any messages in the log till the staff was initiating a stop/start:
Jun 1 09:09:29 ldap-01 slapd[8836]: daemon: shutdown requested and initiated. Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633113 fd=9 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=405426 fd=12 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633787 fd=13 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=1011 fd=14 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=1013 fd=18 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632703 fd=33 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632710 fd=37 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632883 fd=39 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632762 fd=40 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633211 fd=41 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633735 fd=45 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=632829 fd=47 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633170 fd=48 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633200 fd=50 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: conn=633788 fd=55 closed (slapd shutdown) Jun 1 09:09:29 ldap-01 slapd[8836]: slapd shutdown: waiting for 22 operations/tasks to finish Jun 1 09:09:42 ldap-01 slapd[20945]: @(#) $OpenLDAP: slapd 2.4.30 $ opensuse-buildservice@opensuse.org Jun 1 09:09:43 ldap-01 slapd[20945]: slapd starting
After the error message at 09:02:24 the slapd did not answer any request. I cannot recover that problem in a test environment.
The server is running in a MM environment (two masters), and the server gets 200-1200 search request/s Because of this high rate, we set "loglevel 0". Since we updated the slapd to 2.4.30 (from 2.4.28) the server crashes/hangs about on times a week. Because of this we set loglevel 256 back again.
Would be very nice, if I can fix the problem, please help.
Thanks in advance
Meike
PS: The slapd is a modified, self compiled version because of larger IDL with following changes:
openldap-2.4.30/servers/slapd/back-bdb/idl.h: -#define BDB_IDL_LOGN 16 /* DB_SIZE is 2^16, UM_SIZE is 2^17 */ +#define BDB_IDL_LOGN 17 /* DB_SIZE is 2^17, UM_SIZE is 2^18 */
openldap-2.4.30/include/ldap_pvt_thread.h: -# define LDAP_PVT_THREAD_STACK_SIZE ( 1 * 1024 * 1024 * sizeof(void *) ) +# define LDAP_PVT_THREAD_STACK_SIZE ( 2 * 1024 * 1024 * sizeof(void *) )
All tests where running well!
openldap-technical@openldap.org