On Monday 02 March 2009 12:18:23, Howard Chu wrote:
Adrien Futschik wrote:
Considering that M1 & M3 are on the same server and therefore have exactly the same time, if this were a time-related problem, I shouldn't get any "CSN too old" messages between M1 & M3 and M2 & M4, should I?
I have also noticed that when M1 gets a new entry and passes it to M2 & M3 & M4, when M2 & M3 & M4 receive it, they also pass it to M2 & M3 & M4! I don't understand why this happens, but it looks very much like this is what's happening, because sometimes M2 will have passed it to M4 before M4 has actually received the add order from M1.
I therefore happened to notice that sometimes entries sent from M1 are received in the wrong order by the other masters, and therefore some entries may be skipped!
Yes, that makes sense. The CSN check assumes changes will always be received in the same order they were sent from the provider. Obviously in this case this assumption is wrong. You should submit an ITS for this.
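To illustrate why the ordering assumption matters, here is a minimal sketch in Python. It is not slapd's actual code: CSNs are modelled as plain integers, and the consumer is assumed to remember only the newest CSN it has applied. The name `apply_changes` is hypothetical.

```python
def apply_changes(arrivals):
    """Simulate a consumer that tracks only the newest CSN it has applied.

    arrivals: list of (csn, entry) pairs in the order they reach the consumer.
    CSNs are plain integers here (an assumption made for illustration).
    Returns (applied, skipped) lists of entry names.
    """
    newest_csn = 0  # hypothetical stand-in for the consumer's stored CSN
    applied, skipped = [], []
    for csn, entry in arrivals:
        if csn <= newest_csn:
            # Looks older than something already applied: "CSN too old".
            skipped.append(entry)
        else:
            applied.append(entry)
            newest_csn = csn
    return applied, skipped


# Same three changes, delivered in order and then out of order.
in_order = apply_changes([(1, "A"), (2, "B"), (3, "C")])
out_of_order = apply_changes([(1, "A"), (3, "C"), (2, "B")])
```

In the second call, entry B arrives after C, compares as too old, and is never applied, which matches the skipped entries described above.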
This problem was discussed on the -devel list back in 2007; the code ought to be using a spanning tree/routing algorithm to ensure that when multiple routes exist for propagating a change, the change is delivered exactly once. Unfortunately no one has spent any further time on this issue since then.
I am not sure whether it is M1 that sends them in the wrong order to M2, which then cascades them to M3 & M4, or whether it is the order of M2's queue that is wrong. I guess it must be the second option.
I'll submit an ITS right away.
Personally, I believe that the best way to avoid this problem would be not to propagate entries just received from another master.
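A rough way to compare the two behaviours is the following Python sketch. It is a toy model, not OpenLDAP code: it assumes four masters that all replicate directly from each other, and the function name `propagate` and its parameters are hypothetical.

```python
from collections import deque


def propagate(masters, origin, forward_received):
    """Count deliveries of one change in a fully meshed set of masters.

    forward_received=True : a master re-sends a change it received from a peer.
    forward_received=False: only the originating master sends the change.
    Returns (deliveries, duplicates, masters_holding_the_change).
    """
    has_change = {origin}
    deliveries = 0
    duplicates = 0
    # Each queue item is one message in flight: (sender, receiver).
    queue = deque((origin, peer) for peer in masters if peer != origin)
    while queue:
        sender, receiver = queue.popleft()
        deliveries += 1
        if receiver in has_change:
            duplicates += 1  # receiver already has it; message is wasted
            continue
        has_change.add(receiver)
        if forward_received:
            # Receiver passes the change on to every other master.
            queue.extend((receiver, peer) for peer in masters if peer != receiver)
    return deliveries, duplicates, has_change


masters = ["M1", "M2", "M3", "M4"]
with_forwarding = propagate(masters, "M1", forward_received=True)
origin_only = propagate(masters, "M1", forward_received=False)
```

In this model, forwarding produces 12 deliveries of which 9 are duplicates, while origin-only sending produces 3 deliveries and no duplicates. Note that origin-only sending reaches everyone only because every master here is directly connected to M1; in a topology where that is not true, some forwarding would still be required.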
Adrien Futschik