On Thu, Apr 03, 2008 at 01:53:17PM -0700, Howard Chu wrote:
John Morrissey wrote:
On Thu, Apr 03, 2008 at 12:10:49AM -0700, Howard Chu wrote:
Yes, it seems a bit silly to explicitly replicate it; that means the producer has to send its contents twice to each delta-sync consumer. Just configure an accesslog overlay on each consumer and let them regenerate the log locally.
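For illustration, a consumer that regenerates the log locally might be configured roughly as follows. This is only a sketch in slapd.conf syntax: the suffix dc=example,dc=com, the directories, and the purge schedule are placeholders that do not come from this thread, and the consumer's existing syncrepl stanza is left out.

```
# Consumer slapd.conf (sketch; names and paths are assumed placeholders)

# Database that holds the locally generated access log
database    hdb
suffix      cn=accesslog
directory   /var/lib/ldap/accesslog
rootdn      cn=accesslog
index       default eq
index       entryCSN,objectClass,reqEnd,reqResult,reqStart

# Main database; its syncrepl stanza is omitted here
database    hdb
suffix      dc=example,dc=com
directory   /var/lib/ldap/example

# Log successful write operations into cn=accesslog
overlay     accesslog
logdb       cn=accesslog
logops      writes
logsuccess  TRUE
# Daily scan that purges entries older than 7 days (assumed schedule)
logpurge    07+00:00 01+00:00
```

The log database is defined before the main database so that it already exists when the overlay refers to it through logdb.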
Won't configuring a separate accesslog on the consumers result in slightly different accesslogs (relative to each other and to the provider) being generated, since there is inherent latency between the writes on the provider and the writes on the slaves? Also, won't the reqAuthzID in the consumers' accesslogs be different, since the syncrepl engine uses the rootdn internally to make the changes to consumers' databases?
There is write latency no matter what. If you interrupt things before an operation can be logged, then of course it will be different, and explicit replication won't change that fact.
I'm not contesting that. The write latency I'm referring to is between the time of the write (and therefore the relevant timestamp(s)) on the provider and consumers if the consumers generate their own access logs.
Yes, the reqAuthzID will be different. But the replication mechanisms don't care about that; the modifiersName opattr is recorded and replicated explicitly. All of the other attributes will be identical.
Of course, delta-syncrepl will be just fine if one of the independently-accesslogging consumers is promoted to provider.
But what if someone wants to use the accesslog data for something else, something that *does* depend on reqAuthzID? Granted, it would take a perfect storm: a consumer is promoted *and* someone uses the accesslog for something other than delta-syncrepl *and* that other use cares about the differences. Still, a discrepancy like this between the two databases could easily be overlooked.
I'd like to avoid that situation entirely if it's reasonably easy to do so, and replicating the accesslog database from the master seems straightforward. It doesn't matter to me that the data are transferred twice; the bandwidth consumed by syncrepl in this situation is so small as to be easily dwarfed by all manner of other traffic on our networks.
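Replicating the log database itself would look something like the following on each consumer. Again a sketch rather than the exact configuration in use here: the provider URL, bind DN, credentials, rid, and directory are assumed placeholders, and the provider side needs the syncprov overlay on its cn=accesslog database for this to work.

```
# Consumer slapd.conf (sketch; all values below are assumed placeholders)
database    hdb
suffix      cn=accesslog
directory   /var/lib/ldap/accesslog
rootdn      cn=accesslog
index       default eq
index       entryCSN,entryUUID,objectClass eq

# Plain syncrepl of the provider's access log database
syncrepl    rid=002
            provider=ldap://master.example.com
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=com"
            credentials=secret
            searchbase="cn=accesslog"
            type=refreshAndPersist
            retry="60 +"
```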
That aside, the accesslog overlay indeed does not set an entryUUID on the database it creates. Only the contextCSN is carried over (as the entryCSN, if I'm reading accesslog_db_open() in servers/slapd/overlays/accesslog.c correctly):

    /* Get contextCSN from main DB */
    op->o_bd = be;
    op->o_bd->bd_info = on->on_info->oi_orig;
    rc = be_entry_get_rw( op, be->be_nsuffix, NULL,
        slap_schema.si_ad_contextCSN, 0, &e_ctx );

    if ( e_ctx ) {
        Attribute *a;

        a = attr_find( e_ctx->e_attrs, slap_schema.si_ad_contextCSN );
        if ( a ) {
            attr_merge( e, slap_schema.si_ad_entryCSN, a->a_vals, NULL );
            attr_merge( e, a->a_desc, a->a_vals, NULL );
        }
        be_entry_release_rw( op, e_ctx, 0 );
    }
    op->o_bd->bd_info = (BackendInfo *)on;
    op->o_bd = li->li_db;

entryUUID is NO-USER-MODIFICATION, so I made creative use of the updatedn directive on the master to add an entryUUID to the accesslog database. Our consumers then started replicating that database on their own.
If it's the appropriate place, could accesslog_db_open() be modified to add an entryUUID when creating its database? I don't see a reason this shouldn't Just Work.
john