--On Friday, September 03, 2010 01:23:17 AM -0700 Bill MacAllister whm@stanford.edu wrote:
The problem with the database was only coincidental. Restoring the database got the failing replica past the problem replication event.
In the replica pool of 6 servers we have seen the problem on three of the servers. Thinking about this more, it is unlikely to be a slave problem, since the slaves have been in use for about 6 weeks and we did not see the problem during that time. We saw it only after we changed the master to 2.4.23. I have captured a master debug log of the problem event. It is at http://www.stanford.edu/~whm/files/master-debug.txt.
Bill