I'm seeing random failures with test050 and current HEAD. This time I saved the testrun dirs. There seem to be two different kinds of failures:
- one kind may simply be related to incomplete sync between the different instances of slapd (database differs);
- the other one worries me more:
[luca@luca-nb tests]$ ./run -b hdb test050
Cleaning up test run directory leftover from previous run.
Running ./scripts/test050-syncrepl-multimaster...
running defines.sh
Initializing server configurations...
Starting producer slapd on TCP/IP port 9011...
Using ldapsearch to check that producer slapd is running...
Inserting syncprov overlay on producer...
Starting consumer slapd on TCP/IP port 9012...
Using ldapsearch to check that consumer slapd is running...
Configuring syncrepl on consumer...
Starting consumer2 slapd on TCP/IP port 9013...
Using ldapsearch to check that consumer2 slapd is running...
Configuring syncrepl on consumer2...
Adding schema and databases on producer...
Using ldapadd to populate producer...
Waiting 20 seconds for syncrepl to receive changes...
Using ldapsearch to check that syncrepl received database changes...
Using ldapsearch to check that syncrepl received database changes on consumer2...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ldapsearch failed (32)!
see attachment
Ing. Luca Scamoni Responsabile Ricerca e Sviluppo
SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
Office: +39 0382 573859 (137)
Mobile: +39 347 1014425
Email: luca.scamoni@sys-net.it
Luca Scamoni wrote:
I'm seeing random failures with test050 and current HEAD. This time I saved the testrun dirs. There seem to be two different kinds of failures:
- one kind may simply be related to incomplete sync between the different instances of slapd (database differs);
- the other one worries me more:
Seems like server3 never received the config entries for dc=example,dc=com. I haven't been able to reproduce this issue. I suppose this indicates a race condition somewhere in the initial refresh. What kind of system are you testing on? Can you try inserting a "sleep 3" at around line 277 of test050, right before "Adding schema and databases"...
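For reference, the suggested delay could be sketched as below. The helper name `pause_before_add` and its optional argument are hypothetical and not part of test050; the exact position in the script ("around line 277") is taken from the message above and may differ between revisions.

```shell
#!/bin/sh
# Hypothetical helper, not part of the actual test050 script.
# Pauses for the given number of seconds (default 3) so that the
# consumers have time to finish their initial refresh.
pause_before_add() {
    delay=${1:-3}
    echo "Sleeping $delay as per Howard request..."
    sleep "$delay"
}
```

Calling `pause_before_add 3` right before the "Adding schema and databases on producer..." step would be equivalent to inserting a plain `sleep 3` there.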
Howard Chu wrote:
Seems like server3 never received the config entries for dc=example,dc=com. I haven't been able to reproduce this issue. I suppose this indicates a race condition somewhere in the initial refresh. What kind of system are you testing on? Can you try inserting a "sleep 3" at around line 277 of test050, right before "Adding schema and databases"...
I'm testing on FC8 on my laptop (Pentium M 1.8GHz, 2GB RAM), with db-4.6.21, cyrus-sasl-2.1.22 and openssl-0.9.8g, all from tarballs. I can hardly reproduce it either: say, once in 20 or so runs. I've been able to reproduce it again even with the suggested delay. Attached are the directories resulting from two failed runs:
- testrun.f1 shows different databases at the end of test050;
- testrun.f2 shows the results of this run:

[luca@luca-nb tests]$ ./run -b hdb test050
Cleaning up test run directory leftover from previous run.
Running ./scripts/test050-syncrepl-multimaster...
running defines.sh
Initializing server configurations...
Starting producer slapd on TCP/IP port 9011...
Using ldapsearch to check that producer slapd is running...
Inserting syncprov overlay on producer...
Starting consumer slapd on TCP/IP port 9012...
Using ldapsearch to check that consumer slapd is running...
Configuring syncrepl on consumer...
Starting consumer2 slapd on TCP/IP port 9013...
Using ldapsearch to check that consumer2 slapd is running...
Configuring syncrepl on consumer2...
Sleeping 3 as per Howard request...
Adding schema and databases on producer...
Using ldapadd to populate producer...
Waiting 20 seconds for syncrepl to receive changes...
Using ldapsearch to check that syncrepl received database changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ldapsearch failed (32)!
if there's anything I can try...
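Since the failure shows up only about once in 20 runs, a small loop can help catch it. The following is a minimal sketch, not part of the OpenLDAP test suite: the function name `repeat_test` is hypothetical, and it assumes the test run directory is called `testrun` in the current directory and that the test command exits non-zero on failure. Failed runs are saved under the testrun.fN naming used above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the OpenLDAP test suite.
# Usage: repeat_test RUNS COMMAND [ARGS...]
# Runs COMMAND RUNS times. After each failing run, the "testrun"
# directory (assumed name), if present, is renamed to testrun.fN so
# that the cleanup done by the next run does not delete it.
# Prints the number of failed runs.
repeat_test() {
    runs=$1
    shift
    i=1
    failures=0
    while [ "$i" -le "$runs" ]; do
        if "$@" > "run.$i.log" 2>&1; then
            rm -f "run.$i.log"          # passing run: discard its log
        else
            failures=$((failures + 1))
            mv "run.$i.log" "testrun.f$failures.log"
            if [ -d testrun ]; then
                mv testrun "testrun.f$failures"
            fi
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, `repeat_test 20 ./run -b hdb test050` from the tests directory would run the test 20 times and print how many runs failed. Saved directories from an earlier invocation should be moved away first, since the numbering restarts at testrun.f1.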
Howard Chu wrote:
Seems like server3 never received the config entries for dc=example,dc=com. I haven't been able to reproduce this issue. I suppose this indicates a race condition somewhere in the initial refresh. What kind of system are you testing on? Can you try inserting a "sleep 3" at around line 277 of test050, right before "Adding schema and databases"...
That worked for me. I'm just building HEAD so I can test some changes to test006-acls.