On Wed, 30 Apr 2008, Howard Chu wrote:
rein@OpenLDAP.org wrote:
syncprov_findbase() must search the backend saved with the syncrepl operation, not the one from the operation passed as argument. The backend in the op argument can be a subordinate database, in which case the search for the base in the superior database will fail, and syncrepl consumers will be force to do a an unneccessary full refresh of the database.
OK.
The patch at the end should fix this. Note that both fop.o_bd and fop.o_bd->bd_info can be changed by the overlay_op_walk() call, which is the reason for the long pointer traversal to find the correct bd_info to save and restore.
But the overlay_op_walk call is only appropriate when the DB to be searched is the current database, and the current DB is an overlay DB structure.
Ah, the changing of the BackendDB->bd_info that takes place when overlays are called feels like an open pit I manage to fall into every time I get close to it... I wish it could be replaced in a future version.
Your patch causes fc->fss->s_op->o_bd's bd_info pointer to change, which is not allowed. That's in the original backendDB, which must be treated as read-only since multiple threads may be accessing it. The correct approach here is to use a new local backendDB variable, copy the s_op->o_bd into it, and then just do a regular be_search invocation instead of using overlay_op_walk.
But, this patch must not take effect on the first call to syncprov_findbase (which occurred in syncprov_op_search) - in that case, the current code is correct. So, you need to tweak things based on whether (s_flags & PS_IS_REFRESHING) is true or not - if true, this is the first search, and it should use the original code. Else, it must use be_search.
A new patch that I hope should fix this is at the end. It always use be_search, after putting back the original bd_info if needed. I feel that using the generic be_search is better than interfering directly with the overlay code as overlay_op_walk does. I also tested for SLAP_ISOVERLAY rather than PS_IS_REFRESHING, as that appeared more generic to me. But again, I may be totally wrong here. Does this patch look better?
Rein
Index: OpenLDAP/servers/slapd/overlays/syncprov.c =================================================================== RCS file: /f/CVSROOT/drift/OpenLDAP/servers/slapd/overlays/syncprov.c,v retrieving revision 1.1.1.18 diff -u -u -r1.1.1.18 syncprov.c --- OpenLDAP/servers/slapd/overlays/syncprov.c 30 Apr 2008 11:17:58 -0000 1.1.1.18 +++ OpenLDAP/servers/slapd/overlays/syncprov.c 2 May 2008 11:19:46 -0000 @@ -404,7 +404,7 @@ slap_callback cb = {0}; Operation fop; SlapReply frs = { REP_RESULT }; - BackendInfo *bi; + BackendDB be; int rc;
fc->fss->s_flags ^= PS_FIND_BASE; @@ -413,10 +413,15 @@ fop = *fc->fss->s_op;
fop.o_hdr = op->o_hdr; - fop.o_bd = op->o_bd; fop.o_time = op->o_time; fop.o_tincr = op->o_tincr; - bi = op->o_bd->bd_info; + + if ( SLAP_ISOVERLAY( fop.o_bd )) { + slap_overinst *on = (slap_overinst *)fop.o_bd->bd_info; + be = *fop.o_bd; + be.bd_info = (BackendInfo *)on->on_info; + fop.o_bd = &be; + }
cb.sc_response = findbase_cb; cb.sc_private = fc; @@ -434,8 +439,7 @@ fop.ors_filter = &generic_filter; fop.ors_filterstr = generic_filterstr;
- rc = overlay_op_walk( &fop, &frs, op_search, on->on_info, on ); - op->o_bd->bd_info = bi; + rc = fop.o_bd->be_search( &fop, &frs ); } else { ldap_pvt_thread_mutex_unlock( &fc->fss->s_mutex ); fc->fbase = 1;