Hi folks,
I've just been testing a build of openldap-2.4.21 here in preparation for deployment, and have found that test058-syncrepl-asymmetric seems to fail randomly on our test box (on both i386 and amd64). A sample output is given below:
Starting test058-syncrepl-asymmetric for bdb...
running defines.sh
Initializing master configurations...
Initializing search configurations...
Starting central master slapd on TCP/IP port 9011...
Using ldapsearch to check that central master slapd is running...
Starting site1 master slapd on TCP/IP port 9012...
Using ldapsearch to check that site1 master is running...
Starting site2 master slapd on TCP/IP port 9013...
Using ldapsearch to check that site2 master is running...
Starting central search slapd on TCP/IP port 9014...
Using ldapsearch to check that central search slapd is running...
Starting site1 search slapd on TCP/IP port 9015...
Using ldapsearch to check that site1 search slapd is running...
Waiting 1 seconds for slapd to start...
Starting site2 search slapd on TCP/IP port 9016...
Using ldapsearch to check that site2 search slapd is running...
Adding schema on ldap://localhost:9011/...
Adding backend module on ldap://localhost:9011/...
Adding schema on ldap://localhost:9012/...
Adding backend module on ldap://localhost:9012/...
Adding schema on ldap://localhost:9013/...
Adding backend module on ldap://localhost:9013/...
Adding schema on ldap://localhost:9014/...
Adding backend module on ldap://localhost:9014/...
Adding schema on ldap://localhost:9015/...
Adding backend module on ldap://localhost:9015/...
Adding schema on ldap://localhost:9016/...
Adding backend module on ldap://localhost:9016/...
Adding database config on central master...
Adding database config on site1 master...
Adding database config on site2 master...
Adding access rules on central master...
Adding access rules on site1 master...
Adding access rules on site2 master...
Adding database config on central search...
Adding database config on site1 search...
Adding database config on site2 search...
Populating central master...
Adding syncrepl on site1 master...
Adding syncrepl on site2 master...
Using ldapsearch to check that site1 master received changes...
Using ldapsearch to check that site2 master received changes...
Populating site1 master...
Populating site2 master...
Stopping site1 master...
Adding syncrepl on central master...
Using ldapsearch to check that central master received site2 entries...
Restarting site1 master slapd on TCP/IP port 9012...
Using ldapsearch to check that site1 master is running...
Using ldapsearch to check that central master received site1 entries...
Adding syncrepl consumer on central search...
Adding syncrepl consumer on site1 search...
Adding syncrepl consumer on site2 search...
Using ldapsearch to check that central search received changes...
Using ldapsearch to check that site1 search received changes...
Using ldapsearch to check that site2 search received changes...
Checking contextCSN after initial replication...
Using ldapmodify to modify first backend on central master...
Using ldapsearch to check replication to central search...
Using ldapsearch to check replication to site1 search...
Using ldapsearch to check replication to site2 search...
Checking contextCSN after modify of first backend on central master...
Using ldapmodify to modify second backend on central master...
Using ldapsearch to check replication to site2 search...
Using ldapsearch to check no replication to site1 master...
Using ldapsearch to check no replication to central search...
Checking contextCSN after modify of second backend on central master...
Using ldapmodify to modify first backend on site1 master...
Using ldapsearch to check replication to site1 search...
Using ldapsearch to check replication to site2 master...
Using ldapsearch to check no replication to site2 search...
Using ldapsearch to check no replication to central search...
Checking contextCSN after modify of first backend on site1 master...
Using ldapmodify to modify second backend on site1 master...
Using ldapsearch to check replication to site1 search...
Using ldapsearch to check no replication to central master...
Checking contextCSN after modify of second backend on site1 master...
Using ldapmodify to modify first backend on site2 master...
Using ldapsearch to check replication to central master...
Using ldapsearch to check replication to site2 search...
Using ldapsearch to check no replication to site1 master...
Using ldapsearch to check no replication to central search...
Checking contextCSN after modify of first backend on site2 master...
Using ldapmodify to modify second backend on site2 master...
Using ldapsearch to check replication to site2 search...
Using ldapsearch to check no replication to central master...
Checking contextCSN after modify of second backend on site2 master...
Stopping central master and site2 servers to test start with emtpy db...
Starting site2 master slapd on TCP/IP port 9013...
Using ldapsearch to check that site2 master slapd is running...
Starting site2 search slapd on TCP/IP port 9016...
Using ldapsearch to check that site2 search slapd is running...
Waiting 1 seconds for slapd to start...
Waiting 2 seconds for slapd to start...
Waiting 3 seconds for slapd to start...
Starting central master slapd on TCP/IP port 9011...
Using ldapsearch to check that central master slapd is running...
Using ldapsearch to check that site2 master received base...
Using ldapsearch to check that site2 search received base...
Waiting 1 seconds for syncrepl to receive changes...
Waiting 2 seconds for syncrepl to receive changes...
Checking contextCSN after site2 servers repopulated...
Adding syncrepl of second site1 master backend on central master...
Using ldapsearch to check that central master received second site1 backend...
Waiting 1 seconds for syncrepl to receive changes...
Waiting 2 seconds for syncrepl to receive changes...
Waiting 3 seconds for syncrepl to receive changes...
Waiting 4 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ERROR: Second site1 backend not replicated to central master
Restarting central master slapd on TCP/IP port 9011...
Using ldapsearch to check that central master slapd is running...
Waiting 1 seconds for slapd to start...
Using ldapsearch to check that central master received second site1 backend...
Using ldapsearch to check that central search received second site1 backend...
Waiting 1 seconds for syncrepl to receive changes...
Waiting 2 seconds for syncrepl to receive changes...
Waiting 3 seconds for syncrepl to receive changes...
Waiting 4 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ERROR: Second site1 backend not replicated to central search
Restarting central search slapd on TCP/IP port 9014...
Using ldapsearch to check that central search slapd is running...
Waiting 1 seconds for slapd to start...
Using ldapsearch to check that central search received second site1 backend...
Running 1 of 10 syncrepl race tests...
Stopping central master...
Using ldapadd to add entry on site1 master...
Starting central master again...
Using ldapsearch to check that central master received entry...
Using ldapsearch to check that central search received entry...
Stopping central master...
Using ldapdelete to delete entry on site1 master...
Starting central master again...
Using ldapsearch to check that entry was deleted on central master...
Using ldapsearch to check that entry was deleted on central search...
ERROR: Entry not removed on central search!
Race error found after 1 of 10 iterations
Found 3 errors
Exiting with a false success status for now
/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/tests/scripts/test058-syncrepl-asymmetric completed OK for bdb.
Is this something I should be worried about? One thing to note is that the underlying disk array is currently being rebuilt with extra disks, so I/O performance on this machine is quite poor at the moment. I'm wondering whether this is a timing issue with the test script rather than a problem within OpenLDAP itself?
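For illustration only, here is a minimal sketch of the kind of bounded polling loop in which this sort of timing problem can show up. The helper name wait_for, the attempt count and the delay are assumptions made for the example and are not taken from the OpenLDAP test scripts; the commented ldapsearch probe merely mirrors the "Using ldapsearch to check..." steps in the output above.

```sh
#!/bin/sh
# Sketch only: wait_for is a hypothetical helper, not part of the OpenLDAP test suite.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        # Succeed as soon as the probe command succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        echo "Waiting $delay seconds (attempt $try of $attempts)..."
        sleep "$delay"
        try=$((try + 1))
    done
    # The wait budget is used up. On a slow machine this may only mean the
    # server needed longer than ATTEMPTS * DELAY_SECONDS to become ready.
    return 1
}

# Example probe (illustrative values, assumed rather than copied from the scripts):
# wait_for 6 5 ldapsearch -x -s base -b "" -H ldap://localhost:9011/ '(objectClass=*)'
```

If the probe only fails because the fixed budget ran out, raising the attempt count or the delay on the slow box should make the failure disappear, which would point at timing rather than at slapd itself.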
ATB,
Mark.
--On Monday, June 21, 2010 5:45 PM +0100 Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> wrote:
> Hi folks,
> I've just been testing a build of openldap-2.4.21 here in preparation for deployment, and have found that test058-syncrepl-asymmetric seems to fail randomly on our test box (on both i386 and amd64). A sample output is given below:
> Exiting with a false success status for now
The point of this exit message is that this test is currently known to fail, and failure should be ignored.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
> The point of this exit message is that this test is currently known to fail, and failure should be ignored.
> --Quanah
Okay, thanks. This was the output on our i386 build host, so I just went to cross-check against our amd64 build host and I am also seeing some random failures in test043, e.g.
Starting test043-delta-syncrepl for bdb...
running defines.sh
Starting producer slapd on TCP/IP port 9011...
Using ldapsearch to check that producer slapd is running...
Waiting 5 seconds for slapd to start...
Using ldapadd to create the context prefix entries in the producer...
Starting consumer slapd on TCP/IP port 9012...
Using ldapsearch to check that consumer slapd is running...
Waiting 5 seconds for slapd to start...
Using ldapadd to populate the producer directory...
Waiting 7 seconds for syncrepl to receive changes...
Stopping the provider, sleeping 10 seconds and restarting it...
Using ldapsearch to check that producer slapd is running...
Waiting 5 seconds for slapd to start...
Waiting 5 seconds for slapd to start...
Waiting 5 seconds for slapd to start...
Waiting 5 seconds for slapd to start...
Waiting 5 seconds for slapd to start...
Waiting 5 seconds for slapd to start...
ldapsearch failed (255)!
/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/tests/scripts/test043-delta-syncrepl: line 156: kill: (30144) - No such process
/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/tests/scripts/test043-delta-syncrepl failed for bdb (exit 255)
make[3]: *** [bdb-mod] Error 255
make[3]: Leaving directory `/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/debian/build/tests'
make[2]: *** [test] Error 2
make[2]: Leaving directory `/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/debian/build/tests'
make[1]: *** [test] Error 2
make[1]: Leaving directory `/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/debian/build'
make: *** [build-stamp] Error 2
dpkg-buildpackage: error: debian/rules build gave error exit status 2
build@lenny-amd64-build:~/deb/openldap/2.4.21/openldap_2.4.21.orig$ cd debian/build/tests/
build@lenny-amd64-build:~/deb/openldap/2.4.21/openldap_2.4.21.orig/debian/build/tests$ ./run test043-delta-syncrepl
Cleaning up test run directory leftover from previous run.
Running /home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/tests/scripts/test043-delta-syncrepl for bdb...
running defines.sh
Starting producer slapd on TCP/IP port 9011...
Using ldapsearch to check that producer slapd is running...
Waiting 5 seconds for slapd to start...
Using ldapadd to create the context prefix entries in the producer...
Starting consumer slapd on TCP/IP port 9012...
Using ldapsearch to check that consumer slapd is running...
Using ldapadd to populate the producer directory...
Waiting 7 seconds for syncrepl to receive changes...
Stopping the provider, sleeping 10 seconds and restarting it...
Using ldapsearch to check that producer slapd is running...
Using ldapmodify to modify producer directory...
Waiting 7 seconds for syncrepl to receive changes...
Stopping consumer to test recovery...
Modifying more entries on the producer...
Restarting consumer...
Waiting 7 seconds for syncrepl to receive changes...
Try updating the consumer slapd...
Waiting 7 seconds for syncrepl to receive changes...
Using ldapsearch to read all the entries from the producer...
Using ldapsearch to read all the entries from the consumer...
Filtering producer results...
Filtering consumer results...
Comparing retrieved entries from producer and consumer...
Test succeeded
ATB,
Mark.
--On Tuesday, June 22, 2010 5:24 PM +0100 Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> wrote:
> Quanah Gibson-Mount wrote:
>> The point of this exit message is that this test is currently known to fail, and failure should be ignored.
>> --Quanah
> Okay, thanks. This was the output on our i386 build host, so I just went to cross-check against our amd64 build host and I am also seeing some random failures in test043, e.g.
> Starting test043-delta-syncrepl for bdb...
> running defines.sh
> Starting producer slapd on TCP/IP port 9011...
> Using ldapsearch to check that producer slapd is running...
> Waiting 5 seconds for slapd to start...
> Using ldapadd to create the context prefix entries in the producer...
> Starting consumer slapd on TCP/IP port 9012...
> Using ldapsearch to check that consumer slapd is running...
> Waiting 5 seconds for slapd to start...
> Using ldapadd to populate the producer directory...
> Waiting 7 seconds for syncrepl to receive changes...
> Stopping the provider, sleeping 10 seconds and restarting it...
> Using ldapsearch to check that producer slapd is running...
> Waiting 5 seconds for slapd to start...
> Waiting 5 seconds for slapd to start...
> Waiting 5 seconds for slapd to start...
> Waiting 5 seconds for slapd to start...
> Waiting 5 seconds for slapd to start...
> Waiting 5 seconds for slapd to start...
> ldapsearch failed (255)!
> /home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/tests/scripts/test043-delta-syncrepl: line 156: kill: (30144) - No such process
Can you reliably reproduce it using the run script? I.e.,
./run -b <backend> -l 500 test043
That would run the test 500 times using the specified backend. Your output suggests that slapd didn't start, but you haven't provided anything from slapd.1.log in the testrun directory, so there is no telling why it failed, or whether the script simply didn't allow enough time for slapd to start up on your system.
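As a rough sketch of how that log could be captured at the moment of failure: the helper below reruns a command until it first fails and then prints the tail of a log file. The name repeat_until_fail and the 50-line tail are assumptions made for illustration, not part of the OpenLDAP test suite, and the sketch assumes ./run exits non-zero when a test fails, as the make output earlier in the thread suggests.

```sh
#!/bin/sh
# Sketch only: repeat_until_fail is a hypothetical helper, not part of the OpenLDAP test suite.
# Usage: repeat_until_fail COUNT LOGFILE COMMAND [ARGS...]
repeat_until_fail() {
    count=$1
    logfile=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        # Run the command and record its exit status if it fails.
        rc=0
        "$@" || rc=$?
        if [ "$rc" -ne 0 ]; then
            echo "iteration $i failed with exit status $rc"
            # Show the end of the log, if present, before a later run cleans it up.
            if [ -f "$logfile" ]; then
                tail -n 50 "$logfile"
            fi
            return "$rc"
        fi
        i=$((i + 1))
    done
    echo "all $count iterations passed"
    return 0
}

# Example (backend left as a placeholder, as in the command above):
# repeat_until_fail 500 testrun/slapd.1.log ./run -b <backend> test043
```

Stopping at the first failure matters here because the run script cleans up the test run directory left over from the previous run, so the log from a failed iteration would otherwise be lost.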
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
openldap-technical@openldap.org