Hi folks,
I've just been testing a build of openldap-2.4.21 here in preparation for deployment, and have found that test058-syncrepl-asymmetric seems to fail randomly on our test box (on both i386 and amd64). A sample output is given below:
Starting test058-syncrepl-asymmetric for bdb...
running defines.sh
Initializing master configurations...
Initializing search configurations...
Starting central master slapd on TCP/IP port 9011...
Using ldapsearch to check that central master slapd is running...
Starting site1 master slapd on TCP/IP port 9012...
Using ldapsearch to check that site1 master is running...
Starting site2 master slapd on TCP/IP port 9013...
Using ldapsearch to check that site2 master is running...
Starting central search slapd on TCP/IP port 9014...
Using ldapsearch to check that central search slapd is running...
Starting site1 search slapd on TCP/IP port 9015...
Using ldapsearch to check that site1 search slapd is running...
Waiting 1 seconds for slapd to start...
Starting site2 search slapd on TCP/IP port 9016...
Using ldapsearch to check that site2 search slapd is running...
Adding schema on ldap://localhost:9011/...
Adding backend module on ldap://localhost:9011/...
Adding schema on ldap://localhost:9012/...
Adding backend module on ldap://localhost:9012/...
Adding schema on ldap://localhost:9013/...
Adding backend module on ldap://localhost:9013/...
Adding schema on ldap://localhost:9014/...
Adding backend module on ldap://localhost:9014/...
Adding schema on ldap://localhost:9015/...
Adding backend module on ldap://localhost:9015/...
Adding schema on ldap://localhost:9016/...
Adding backend module on ldap://localhost:9016/...
Adding database config on central master...
Adding database config on site1 master...
Adding database config on site2 master...
Adding access rules on central master...
Adding access rules on site1 master...
Adding access rules on site2 master...
Adding database config on central search...
Adding database config on site1 search...
Adding database config on site2 search...
Populating central master...
Adding syncrepl on site1 master...
Adding syncrepl on site2 master...
Using ldapsearch to check that site1 master received changes...
Using ldapsearch to check that site2 master received changes...
Populating site1 master...
Populating site2 master...
Stopping site1 master...
Adding syncrepl on central master...
Using ldapsearch to check that central master received site2 entries...
Restarting site1 master slapd on TCP/IP port 9012...
Using ldapsearch to check that site1 master is running...
Using ldapsearch to check that central master received site1 entries...
Adding syncrepl consumer on central search...
Adding syncrepl consumer on site1 search...
Adding syncrepl consumer on site2 search...
Using ldapsearch to check that central search received changes...
Using ldapsearch to check that site1 search received changes...
Using ldapsearch to check that site2 search received changes...
Checking contextCSN after initial replication...
Using ldapmodify to modify first backend on central master...
Using ldapsearch to check replication to central search...
Using ldapsearch to check replication to site1 search...
Using ldapsearch to check replication to site2 search...
Checking contextCSN after modify of first backend on central master...
Using ldapmodify to modify second backend on central master...
Using ldapsearch to check replication to site2 search...
Using ldapsearch to check no replication to site1 master...
Using ldapsearch to check no replication to central search...
Checking contextCSN after modify of second backend on central master...
Using ldapmodify to modify first backend on site1 master...
Using ldapsearch to check replication to site1 search...
Using ldapsearch to check replication to site2 master...
Using ldapsearch to check no replication to site2 search...
Using ldapsearch to check no replication to central search...
Checking contextCSN after modify of first backend on site1 master...
Using ldapmodify to modify second backend on site1 master...
Using ldapsearch to check replication to site1 search...
Using ldapsearch to check no replication to central master...
Checking contextCSN after modify of second backend on site1 master...
Using ldapmodify to modify first backend on site2 master...
Using ldapsearch to check replication to central master...
Using ldapsearch to check replication to site2 search...
Using ldapsearch to check no replication to site1 master...
Using ldapsearch to check no replication to central search...
Checking contextCSN after modify of first backend on site2 master...
Using ldapmodify to modify second backend on site2 master...
Using ldapsearch to check replication to site2 search...
Using ldapsearch to check no replication to central master...
Checking contextCSN after modify of second backend on site2 master...
Stopping central master and site2 servers to test start with emtpy db...
Starting site2 master slapd on TCP/IP port 9013...
Using ldapsearch to check that site2 master slapd is running...
Starting site2 search slapd on TCP/IP port 9016...
Using ldapsearch to check that site2 search slapd is running...
Waiting 1 seconds for slapd to start...
Waiting 2 seconds for slapd to start...
Waiting 3 seconds for slapd to start...
Starting central master slapd on TCP/IP port 9011...
Using ldapsearch to check that central master slapd is running...
Using ldapsearch to check that site2 master received base...
Using ldapsearch to check that site2 search received base...
Waiting 1 seconds for syncrepl to receive changes...
Waiting 2 seconds for syncrepl to receive changes...
Checking contextCSN after site2 servers repopulated...
Adding syncrepl of second site1 master backend on central master...
Using ldapsearch to check that central master received second site1 backend...
Waiting 1 seconds for syncrepl to receive changes...
Waiting 2 seconds for syncrepl to receive changes...
Waiting 3 seconds for syncrepl to receive changes...
Waiting 4 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ERROR: Second site1 backend not replicated to central master
Restarting central master slapd on TCP/IP port 9011...
Using ldapsearch to check that central master slapd is running...
Waiting 1 seconds for slapd to start...
Using ldapsearch to check that central master received second site1 backend...
Using ldapsearch to check that central search received second site1 backend...
Waiting 1 seconds for syncrepl to receive changes...
Waiting 2 seconds for syncrepl to receive changes...
Waiting 3 seconds for syncrepl to receive changes...
Waiting 4 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ERROR: Second site1 backend not replicated to central search
Restarting central search slapd on TCP/IP port 9014...
Using ldapsearch to check that central search slapd is running...
Waiting 1 seconds for slapd to start...
Using ldapsearch to check that central search received second site1 backend...
Running 1 of 10 syncrepl race tests...
Stopping central master...
Using ldapadd to add entry on site1 master...
Starting central master again...
Using ldapsearch to check that central master received entry...
Using ldapsearch to check that central search received entry...
Stopping central master...
Using ldapdelete to delete entry on site1 master...
Starting central master again...
Using ldapsearch to check that entry was deleted on central master...
Using ldapsearch to check that entry was deleted on central search...
ERROR: Entry not removed on central search!
Race error found after 1 of 10 iterations
Found 3 errors
Exiting with a false success status for now
/home/build/deb/openldap/2.4.21/openldap_2.4.21.orig/tests/scripts/test058-syncrepl-asymmetric completed OK for bdb.
Is this something I should be worried about? One thing to note is that the underlying disk array is currently being rebuilt with extra disks, so I/O performance on this machine is quite poor at the moment. All three errors above follow the script's fixed waits for replication, so I'm wondering whether this is a timing issue in the test script rather than a problem within OpenLDAP itself?
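Because the script exits with a success status even when it logs errors, the final "completed OK" line can't be relied on when repeating the test. Here is a minimal sketch of a checker that could be run over saved output from repeated runs. It is not part of the OpenLDAP test suite: the function name `summarize` and the abbreviated `SAMPLE` text are my own illustration, and it assumes one log message per line.

```python
import re

# Patterns for the two kinds of lines of interest in the test output.
ERROR_RE = re.compile(r"^ERROR: (.*)$")
WAIT_RE = re.compile(r"^Waiting (\d+) seconds for (.+)\.\.\.$")


def summarize(log_text):
    """Return (errors, longest_wait, claimed_ok) for one test run's output.

    errors       -- list of messages from lines starting with "ERROR: "
    longest_wait -- dict mapping what was waited for to the largest wait seen
    claimed_ok   -- True if the output contains a "completed OK" line
    """
    errors = []
    longest_wait = {}
    claimed_ok = False
    for raw in log_text.splitlines():
        line = raw.strip()
        m = ERROR_RE.match(line)
        if m:
            errors.append(m.group(1))
            continue
        m = WAIT_RE.match(line)
        if m:
            seconds, what = int(m.group(1)), m.group(2)
            longest_wait[what] = max(longest_wait.get(what, 0), seconds)
            continue
        if "completed OK" in line:
            claimed_ok = True
    return errors, longest_wait, claimed_ok


# Abbreviated excerpt in the style of the output above (illustrative only;
# the script path is shortened).
SAMPLE = """\
Waiting 4 seconds for syncrepl to receive changes...
Waiting 5 seconds for syncrepl to receive changes...
ERROR: Second site1 backend not replicated to central master
Waiting 1 seconds for slapd to start...
ERROR: Entry not removed on central search!
Exiting with a false success status for now
test058-syncrepl-asymmetric completed OK for bdb.
"""

if __name__ == "__main__":
    errors, waits, ok = summarize(SAMPLE)
    print("claimed OK:", ok)
    print("errors:", len(errors))
    for message in errors:
        print("  -", message)
    for what, seconds in waits.items():
        print("longest wait for", what + ":", seconds, "s")
```

Comparing the error counts from runs taken before and after the array rebuild finishes would give a rough indication of whether the failures track the I/O load.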
ATB,
Mark.