In my environment I need to replicate from a single master to 125 globally distributed read-only consumers.
I've tried two approaches and run into problems with both.
First, I tried a multi-tier replication strategy in which the master syncs to regional consumers, each of which in turn acts as a producer for around 20 slaves. A server should be able to act as both a producer and a consumer, but in my experience with 2.4.25 this causes a repeatable segfault within a day's time (test_filter() is passed a NULL filter in syncrepl.c). I think this would be the best solution if I could resolve the segfault.
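To put rough numbers on the tiered layout, here is a small sketch of the arithmetic. It assumes the 125 consumers are all leaf slaves and that each regional server feeds at most 20 of them; the function name is only illustrative.

```python
import math

TOTAL_CONSUMERS = 125  # leaf consumers, taken from the description above
FANOUT = 20            # approximate slaves per regional server (assumed upper bound)


def regional_servers_needed(total: int, fanout: int) -> int:
    """Smallest number of regional servers that covers `total` leaf consumers."""
    return math.ceil(total / fanout)


regionals = regional_servers_needed(TOTAL_CONSUMERS, FANOUT)
print(f"regional servers syncing directly from the master: {regionals}")
```

Under those assumptions the master would serve only a handful of replication sessions instead of 125, which is why I would prefer this layout if it were stable.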
The other option I've tried is pointing all 125 slaves at a single master. This works if I bring the slaves up gradually, but if they all connect at once (for example, after a master restart), the initial sync seems to monopolize one thread per replica, which causes other searches to fail for more than 30 seconds. Raising the thread count above 125 seems to solve the issue on a test machine, but I'm hesitant to do this on the production master, which is heavily used for a variety of other purposes.
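One workaround I have considered for the simultaneous-reconnect case is to spread the consumers' start times over a window, imitating the gradual bring-up that works. This is only a sketch of the idea, assuming a hypothetical wrapper script on each consumer that waits before starting slapd; the function name and the 300-second window are my own placeholders, not slapd features.

```python
import random
from typing import Optional


def startup_delay(window_seconds: float, rng: Optional[random.Random] = None) -> float:
    """Pick a random delay in [0, window_seconds] so consumers do not all connect at once."""
    if window_seconds < 0:
        raise ValueError("window_seconds must be non-negative")
    rng = rng or random.Random()
    return rng.uniform(0.0, window_seconds)


# Example: each consumer picks its own delay inside a 5-minute window.
example_delay = startup_delay(300.0)
print(f"would wait {example_delay:.1f}s before starting the consumer")
```

The wrapper would sleep for the returned number of seconds and then start the consumer. This does not help if the master itself drops and re-accepts all connections faster than the window, so it is a mitigation at best.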
Can anyone offer advice on how to resolve either of these issues, or suggest other methods for replicating to this many slaves?
Thanks, Duncan.