On Sat, 29 Mar 2008, ando@sys-net.it wrote:
rein@basefarm.no wrote:
I was seeing random failures of the test050-syncrepl-multimaster test. In one of the failures, slapd went into a tight loop traversing a circular runqueue it had managed to create in slapd_rq.task_list. This seems to have been caused by missing mutex locks around accesses to slapd_rq, which the patch uploaded to ftp://ftp.openldap.org/incoming/slapd_rq_lock.patch fixes.
Before I applied this patch, the test failed after being run a few times; with the patch applied, it has now passed 100 times and is still counting.
Locks in back-bdb/config.c should be pointless, as modifications to the configuration should only occur while all threads are paused. The rest makes sort of sense, but I'd leave it to Howard.
That is probably true; it looks as if the places in config.c where locks really are required already had them. My patch adds locks everywhere slapd_rq was used without them, as I don't have enough knowledge of the code to know which functions are guaranteed to be used only when threads are not running.
I believe that the important part of the patch is the change to syncrepl.c, but I found it best to add locks everywhere just to be on the safe side.
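As an illustration of the general pattern only (this is not the patch and not slapd code), here is a minimal standalone sketch of a task list whose every access is wrapped in a mutex. All names in it (struct runqueue, rq_insert, rq_count, rq_demo) are hypothetical stand-ins, and it assumes plain POSIX threads rather than slapd's own thread wrappers. The bounded traversal in rq_count is there only to show how a circular list would be detected instead of being walked forever.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the OpenLDAP types. */
struct task {
    struct task *next;
    int id;
};

struct runqueue {
    pthread_mutex_t mutex;      /* guards task_list */
    struct task *task_list;
};

/* Insert at the head; the list is only touched while the mutex is held. */
static void rq_insert(struct runqueue *rq, struct task *t)
{
    pthread_mutex_lock(&rq->mutex);
    t->next = rq->task_list;
    rq->task_list = t;
    pthread_mutex_unlock(&rq->mutex);
}

/* Count the tasks; return -1 if more than 'limit' nodes are seen,
 * which would indicate a cycle or other corruption. */
static int rq_count(struct runqueue *rq, int limit)
{
    int n = 0;
    pthread_mutex_lock(&rq->mutex);
    for (struct task *t = rq->task_list; t != NULL; t = t->next) {
        if (++n > limit) {
            n = -1;
            break;
        }
    }
    pthread_mutex_unlock(&rq->mutex);
    return n;
}

#define NTHREADS   4
#define PER_THREAD 1000

struct worker_arg {
    struct runqueue *rq;
    struct task *tasks;         /* this worker's slice of the pool */
};

/* Each worker inserts its own PER_THREAD tasks into the shared queue. */
static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < PER_THREAD; i++) {
        a->tasks[i].id = i;
        rq_insert(a->rq, &a->tasks[i]);
    }
    return NULL;
}

/* Run the workers concurrently and return the final task count,
 * or -1 on allocation failure, thread failure, or a corrupted list. */
static int rq_demo(void)
{
    struct runqueue rq;
    pthread_t tid[NTHREADS];
    struct worker_arg args[NTHREADS];
    int started = 0;
    int n;

    struct task *pool = calloc((size_t)NTHREADS * PER_THREAD, sizeof(*pool));
    if (pool == NULL)
        return -1;

    pthread_mutex_init(&rq.mutex, NULL);
    rq.task_list = NULL;

    for (int i = 0; i < NTHREADS; i++) {
        args[i].rq = &rq;
        args[i].tasks = pool + (size_t)i * PER_THREAD;
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    n = (started == NTHREADS) ? rq_count(&rq, NTHREADS * PER_THREAD) : -1;

    pthread_mutex_destroy(&rq.mutex);
    free(pool);
    return n;
}
```

With the lock and unlock calls in rq_insert in place, rq_demo() should return NTHREADS * PER_THREAD. Removing them makes the two-step head update racy, so tasks can be lost or the list can end up malformed, which is the general kind of damage described above.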
Looking at my copy of the patch, it appears that another syncrepl.c change, which I was sure I had edited out, has slipped through anyhow :-(. It is the first two modifications related to ldap_get_option; please disregard them in this bug report. They have been reported in ITS#5403.
Rein