https://bugs.openldap.org/show_bug.cgi?id=8102
--- Comment #5 from Howard Chu hyc@openldap.org --- (In reply to tpretz@gmail.com from comment #3)
> When a cookie is not sent with an entry, the cs_pmutex is not acquired. Without some protection, non-cookie modifications will race each other between syncrepl threads.
> So, I am testing surrounding the syncrepl_entry "if" block (line 1036) with a cs_pmutex lock/release (when punlock < 0) to serialize non-cookie mods just like the cookie ones. So far this is running tests and I haven't seen the null_callback issue, either when catching up from the session log or when running with ongoing out-of-order writes being replicated (running alongside unmodified 2.4.44 to compare differences).
> When acquiring the cs_pmutex I have used the same logic as at line 958 (using trylock, with a shutdown check). I wonder if it is safe to acquire the mutex with a standard ldap_pvt_thread_mutex_lock at this point without spinning.
Sounds like you've done things correctly. It's not safe to do a normal mutex_lock here because the wait time could be quite long, and it could interfere with the pool pause or shutdown operations.
We'll be adding the same code now.
> line numbers from RELENG_2_4 (721a038b7bc9732f52eeef5324c180c4f137cd75)
> Thanks
> Tom
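
For readers following along, here is a minimal sketch of the "trylock with a shutdown check" pattern discussed above. It is not the syncrepl.c code: it uses plain POSIX threads instead of the ldap_pvt_thread wrappers, and `pmutex`, `shutting_down` and `lock_unless_shutdown` are hypothetical stand-ins for cs_pmutex, the server's shutdown flag and the acquisition loop. Any handling of a thread pool pause is left out.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not OpenLDAP symbols. */
static pthread_mutex_t pmutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool shutting_down = false;

/*
 * Spin on trylock instead of blocking in pthread_mutex_lock, so the
 * caller can notice a shutdown request while another thread holds
 * the mutex for a long time.
 *
 * Returns true with the mutex held, or false (mutex NOT held) if a
 * shutdown was requested before the lock could be taken.
 */
static bool lock_unless_shutdown(pthread_mutex_t *m, atomic_bool *stop)
{
    while (pthread_mutex_trylock(m) != 0) {
        if (atomic_load(stop))
            return false;   /* give up: the caller must not unlock */
        sched_yield();      /* let the current holder make progress */
    }
    return true;
}
```

A caller would wrap the non-cookie modification in `if (lock_unless_shutdown(&pmutex, &shutting_down)) { ...; pthread_mutex_unlock(&pmutex); }` and bail out otherwise. The point of the design is the one made in the reply: a blocking lock cannot react to shutdown while it waits, whereas the loop rechecks the flag on every failed attempt.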