On 02/08/14 18:57, Howard Chu wrote:
Hallvard Breien Furuseth wrote:
> If so, MDB_NOLOCK may be in trouble since it uses pick_meta()
> instead of mti_txnid. Should there be a separate CACHEFLUSH after
> writing the datapages if MDB_NOLOCK, and the current CACHEFLUSH
> should just flush the metapages?

I don't see any reason for that. As always, the only thing that matters
is when the metapages get written.

What matters is that nothing *sees* the metapage before its data
pages, nor sees the mti_txnid change before the metapage. I thought
that's what cache coherency and memory ordering were about.
So to explain my previous message a bit: A cacheflush() which flushes
a metapage and its datapages all in one chunk makes me nervous. If
that's necessary (rather than just flushing the meta at that point), I
imagine that just before the flush, it's possible for something to see
the metapage before its datapages. Delaying the mti_txnid change
protects against that, except when something does not use mti_txnid;
hence the concern for MDB_NOLOCK, which uses mdb_env_pick_meta().
Hmm. There are some other places that use pick_meta even when
the lockfile is in use. Maybe they too should select the metapage
with (mti_txnid & 1).
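
To make the difference concrete, here is a minimal sketch of the two
ways of choosing a metapage. The struct and function names below are
simplified stand-ins, not the real mdb.c definitions, and the sketch
assumes that a commit with txnid N writes metapage (N & 1).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real LMDB structures. */
typedef struct {
    size_t mm_txnid;      /* txnid recorded in this metapage */
} meta_t;

typedef struct {
    meta_t metas[2];      /* the two alternating metapages */
    size_t mti_txnid;     /* last txnid published via the lockfile */
} env_t;

/* No lockfile (MDB_NOLOCK): take whichever metapage has the newer
 * txnid. This can return a metapage that was just written but whose
 * txnid has not been published yet. */
static int pick_meta(const env_t *env)
{
    return env->metas[0].mm_txnid < env->metas[1].mm_txnid;
}

/* With the lockfile: trust only the published txnid; its low bit
 * selects the metapage. */
static int pick_meta_by_txnid(const env_t *env)
{
    return (int)(env->mti_txnid & 1);
}
```

With metapage 1 already holding txnid 5 while mti_txnid is still 4,
pick_meta() returns 1 but pick_meta_by_txnid() returns 0. That window
is the one I am worried about.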
> Does the code contradict this comment above, or is it about
> something else?
> /* Memory ordering issues are irrelevant ... */
Quite simply, on MIPS, write()s into the buffer cache aren't coherent
with the on-chip data cache (or the instruction cache, but that is
irrelevant here).