contextCSN of subordinate syncrepl DBs
by Rein Tollevik
I've been trying to figure out why syncrepl used on a backend that is
subordinate to a glue database with the syncprov overlay should save the
contextCSN in the suffix of the glue database rather than the suffix of
the backend where syncrepl is used. But all I come up with are reasons
why this should not be the case. So, unless anyone can enlighten me as
to what I'm missing, I suggest that this be changed.
The problem with the current design is that it makes it impossible to
reliably replicate more than one subordinate db from the same remote
server, as there are now race conditions where one of the subordinate
backends could save an updated contextCSN value that is picked up by the
other before it has finished its synchronization. An example of a
configuration where more than one subordinate db replicated from the
same server might be necessary is the central master described in my
previous posting in
http://www.openldap.org/lists/openldap-devel/200806/msg00041.html
My idea as to how this race condition could be verified was to add
enough entries to one of the backends (while the consumer was stopped)
to make it possible to restart the consumer after the first backend had
saved the updated contextCSN but before the second has finished its
synchronization. But I was able to reproduce it simply by adding or
deleting an entry in one of the backends before starting the consumer.
Far too often, the backend without any changes was able to pick up and
save the updated contextCSN from the producer before syncrepl on the
second backend fetched its initial value. I.e., it started with an updated
contextCSN and didn't receive the changes that had taken place on the
producer. If syncrepl stored the values in the suffix of their own
database then they wouldn't interfere with each other like this.
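The race can be shown with a small deterministic sketch. Everything in it is hypothetical: the integers merely stand in for real CSN values, and the `run` function and store dictionaries are illustration only, not OpenLDAP code.

```python
# Toy model of the race; integers stand in for CSN values and all names
# are hypothetical, none of this is OpenLDAP code.

PROVIDER_CSN = 2          # the provider has a change the consumers lack
STALE_CSN = 1             # value saved before the consumer was stopped


def run(shared_store: bool) -> dict:
    """Return, per backend, whether it asked the provider for changes."""
    if shared_store:
        # Both backends read and write the glue suffix.
        glue = {"contextCSN": STALE_CSN}
        stores = {"db1": glue, "db2": glue}
    else:
        # Each backend keeps the value in its own suffix.
        stores = {"db1": {"contextCSN": STALE_CSN},
                  "db2": {"contextCSN": STALE_CSN}}

    requested = {}
    # db1 has no pending changes, so it finishes first and saves the
    # provider's CSN before db2 has read its initial value.
    for name in ("db1", "db2"):
        cookie = stores[name]["contextCSN"]       # initial value
        requested[name] = cookie < PROVIDER_CSN   # stale cookie => refresh
        stores[name]["contextCSN"] = PROVIDER_CSN
    return requested


print(run(shared_store=True))    # db2 sees the new CSN and skips its changes
print(run(shared_store=False))   # both backends refresh
```

With the shared store, the second backend never asks for the changes it is missing; with one store per suffix, both do.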
There is a similar problem in syncprov, as it must use the lowest
contextCSN value (with a given sid) saved by the syncrepl backends
configured within the subtree where syncprov is used. But to do that it
also needs to distinguish the contextCSN values of each syncrepl
backend, which it can't do when they all save them in the glue suffix.
This also implies that syncprov must ignore contextCSN updates from
syncrepl until all syncrepl backends have saved a value, and that
syncprov on the provider must send newCookie sync info messages
whenever it updates its contextCSN value for a changed entry that isn't
being replicated to a consumer. I.e., as outlined in the message referred to above.
Neither of these changes should interfere with ordinary multi-master
configurations where syncrepl and syncprov are both used on the same
(glue) database.
I'll volunteer to implement and test the necessary changes if this is
the right solution. But to know whether my analysis is correct or not I
need feedback. So, comments please?
--
Rein Tollevik
Basefarm AS
13 years, 6 months
contextCSN interaction between syncrepl and syncprov
by Rein Tollevik
The remaining errors and race condition that test058 demonstrates cannot
be solved unless syncrepl is changed to always store the contextCSN in
the suffix of the database where it is configured, not the suffix of its
glue database as it does today.
Assuming serverID 0 is reserved for the single master case, syncrepl and
syncprov can in that case only be configured within the same database
context if syncprov is a pure forwarding server, i.e., it will not update
any CSN value and syncrepl has no need to fetch any values from it.
In the multi-master case it is only the contextCSN whose SID matches the
current serverID that syncprov maintains; the others are all received by
syncrepl. So, the only time syncrepl should need an updated CSN from
syncprov is when it is about to present it to its peer, i.e., when it
initiates a refresh phase. Actually, a race condition that would render
the state of the database undetermined could occur if syncrepl fetches
an updated CSN from syncprov during the initial refresh phase. So, it
should be sufficient to read the contextCSN values from the database
before a new refresh phase is initiated, independent of whether syncprov
is in use or not.
Syncrepl will receive updates to the contextCSN value with its own SID
from its peers, at least with ITS#5972 and ITS#5973 in place. I.e., the
normal ignoring of updates tagged with a too old contextCSN value will
continue to work. It should also be safe to ignore all updates tagged
with a contextCSN or entryCSN value whose SID is the current server's
non-zero serverID, provided a complete refresh cycle is known to have
taken place. I.e., when a contextCSN value with the current non-zero
serverID was read from the database before the refresh phase started, or
after the persistent phase has been entered.
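The two ignore rules above could be sketched roughly as follows. The function names are hypothetical, and the sketch assumes CSN strings of the form timestamp#count#sid#mod with the SID in hexadecimal; it is an illustration of the rules, not the syncrepl implementation.

```python
# Sketch of the filtering rules; function names are hypothetical and the
# CSN layout (timestamp#count#sid#mod, SID in hex) is an assumption.


def csn_sid(csn: str) -> int:
    """Extract the serverID from a CSN string."""
    return int(csn.split("#")[2], 16)


def should_ignore(update_csn: str, context_csns: dict, server_id: int,
                  refresh_complete: bool) -> bool:
    """Decide whether syncrepl may skip an incoming update.

    context_csns maps SID -> newest contextCSN read from the database.
    """
    sid = csn_sid(update_csn)
    # Rule 1: updates carrying our own non-zero SID originated here, so they
    # can be skipped once a complete refresh cycle is known to have run.
    if server_id != 0 and sid == server_id and refresh_complete:
        return True
    # Rule 2: the normal check, skip anything not newer than what we hold.
    # Fixed-width CSN strings compare correctly as plain strings.
    known = context_csns.get(sid)
    return known is not None and update_csn <= known


held = {1: "20090325172618.000000Z#000000#001#000000"}
old = "20090325172600.000000Z#000000#001#000000"
new = "20090325172700.000000Z#000000#001#000000"
own = "20090325172700.000000Z#000000#002#000000"

print(should_ignore(old, held, server_id=2, refresh_complete=True))
print(should_ignore(new, held, server_id=2, refresh_complete=True))
print(should_ignore(own, held, server_id=2, refresh_complete=True))
print(should_ignore(own, held, server_id=2, refresh_complete=False))
```

Note that the own-SID rule only applies once `refresh_complete` is set; before that, an update with the local SID is treated like any other.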
The state of the database will be undetermined unless an initial refresh
(i.e., starting from an empty database or CSN set) has been run to
completion. I cannot see how this can be avoided, and as far as I know
it is so now too. It might be worth mentioning in the doc. though
(unless it already is).
Syncprov must continue to monitor the contextCSN updates from syncrepl.
When it receives updates destined for the suffix of the database where
it is itself configured, it must replace any CSN value whose SID matches
its own non-zero serverID with the value it manages itself (which should
be greater than or equal to the value syncrepl tried to store unless
something is seriously wrong). Updates to "foreign" contextCSN values
(i.e., those with a SID not matching the current non-zero serverID) should
be imported into the set of contextCSN values syncprov itself maintains.
Syncprov could also short-circuit the contextCSN update and delay it to
its own checkpoint. I'm not sure what effect the checkpoint feature
has today when syncrepl constantly updates the contextCSN.
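A minimal sketch of that merge step, assuming the contextCSN set is modelled as a plain SID-to-CSN mapping. `merge_update` is a hypothetical name, and the checkpoint handling is left out.

```python
# Hypothetical sketch: how syncprov might merge a contextCSN update that
# syncrepl writes to syncprov's own suffix. Not OpenLDAP code.


def merge_update(own_csns: dict, incoming: dict, server_id: int) -> dict:
    """Return the contextCSN set to keep; both inputs map SID -> CSN."""
    merged = dict(own_csns)
    for sid, csn in incoming.items():
        if server_id != 0 and sid == server_id:
            # Our own SID: keep the value syncprov manages itself, which
            # should already be >= what syncrepl tried to store.
            continue
        # "Foreign" SID: import the value into syncprov's set.
        merged[sid] = csn
    return merged


own = {1: "20090325172600.000000Z#000000#001#000000",
       2: "20090325172900.000000Z#000000#002#000000"}
update = {1: "20090325172700.000000Z#000000#001#000000",
          2: "20090325172800.000000Z#000000#002#000000"}

result = merge_update(own, update, server_id=2)
print(result[1])   # the foreign value from the update
print(result[2])   # the locally managed value, unchanged
```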
Syncprov must, when syncrepl updates the contextCSN in the suffix of a
subordinate DB, update its own knowledge of the "foreign" CSNs to be the
*lowest* CSN with any given SID stored in all the subordinate DBs (where
syncrepl is configured). And no update must take place unless a
contextCSN value has been stored in *all* the syncrepl-enabled
subordinate DBs. Any values matching the current non-zero serverID
should be updated in this case too, but a new value should probably not
be inserted.
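The "lowest CSN per SID, and only once every subordinate DB has a value" rule might look like this. The function and the suffix names are hypothetical. The sketch also assumes that a SID missing from one DB simply does not constrain the minimum, which the text above does not settle, and it leaves out the handling of the local serverID.

```python
# Hypothetical sketch of the aggregation rule; per_db maps a subordinate
# suffix to the {SID: CSN} set syncrepl has stored there.


def lowest_foreign_csns(per_db: dict, server_id: int):
    """Return {SID: lowest CSN}, or None if some DB has stored nothing yet."""
    if not per_db or any(not csns for csns in per_db.values()):
        return None          # wait until *all* syncrepl DBs have a value
    lowest = {}
    for csns in per_db.values():
        for sid, csn in csns.items():
            if server_id != 0 and sid == server_id:
                continue     # own SID is managed by syncprov itself
            if sid not in lowest or csn < lowest[sid]:
                lowest[sid] = csn
    return lowest


older = "20090325172600.000000Z#000000#001#000000"
newer = "20090325172700.000000Z#000000#001#000000"

print(lowest_foreign_csns({"ou=a,dc=example": {1: newer},
                           "ou=b,dc=example": {}}, server_id=2))
print(lowest_foreign_csns({"ou=a,dc=example": {1: newer},
                           "ou=b,dc=example": {1: older}}, server_id=2))
```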
These changes should (unless I'm completely lost, that is) create a
cleaner interface between syncrepl and syncprov without harming the
current multi-master configurations, and make asymmetric multi-master
configurations like the one in test058 work. Comments please?
Rein
13 years, 6 months
NUMA-aware tcmalloc
by Howard Chu
For those of you running multi-socket Opteron servers (and eventually,
multi-socket Nehalem servers), AMD published a whitepaper last week on their
work adapting Google's tcmalloc to be NUMA-aware. The whitepaper includes
links to their source code / diffs. It appears to be quite a performance boost
in their (very artificial) benchmark. I'll be trying it out soon myself.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
14 years
Build farm?
by William Jojo
Is there a build farm presently for OpenLDAP? I have a server in the
Samba build farm and I am willing to offer the same for the OpenLDAP
team for testing on AIX 5.3/6.1 if you desire.
In lieu of a build farm, are there already platform-independent scripts
that can automate build/test of cvs snapshots, gather the relevant
success/failure pieces and send them back to the team?
Last question: if neither of these exists, is there a desire for such
instruments?
Cheers,
Bill
14 years, 2 months
Re: SEGV on AIX (Was: Please test RE24 (3/18/2009 call for testing))
by William Jojo
---- Original message ----
>Date: Wed, 25 Mar 2009 13:26:18 -0400 (EDT)
>From: Aaron Richton <richton(a)nbcs.rutgers.edu>
>Subject: Re: SEGV on AIX (Was: Please test RE24 (3/18/2009 call for testing))
>To: William Jojo <w.jojo(a)hvcc.edu>
>Cc: openldap-devel(a)openldap.org
>
>On Wed, 25 Mar 2009, William Jojo wrote:
>
>> Was running test050 500 times. Been watching threads on test038 and test050
>> and decided to stress them on AIX as well. I am betting there is an
>
>Thanks for that. Platform diversity is always a plus...
My pleasure. test050 successful for 500 iterations. Could not reproduce the original failure.
Cheers,
Bill
14 years, 2 months
Please test RE24 (3/18/2009 call for testing)
by Quanah Gibson-Mount
Please test RE24 for possible 2.4.16 release.
Thanks!
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
14 years, 2 months
search progress control
by Howard Chu
Dunno if there's already anything like this, I haven't bothered to search yet.
I was considering adding the progress meter to ldapadd. Then it was suggested
that it might be useful for slapcat/ldapsearch as well. For ldapsearch we
would need a means to tell the client how large the result set is expected to
be. Currently it's easy for back-bdb/hdb to provide this number, because they
just walk through an IDL of candidates where the IDL size is already known. Of
course the real result set size may be smaller due to actual filter evaluation.
This seems to call for a control that we could attach to a search request:
"give me an estimate of the result set size and update me every N entries".
The response control would be attached to the first entry and every N entries
after that, with an updated estimate of the total number of results. A control
like this would be particularly useful for administrators using GUIs to browse
a large directory, to give feedback about when they will receive the complete
set of entries, and give an opportunity to decide to abandon the search or
continue.
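A rough sketch of the server-side behaviour such a control might have. The generator and its parameters are hypothetical; the estimate used here (entries already sent, plus the current one, plus candidates not yet examined) is just one plausible way to derive the number from an IDL of known size.

```python
# Hypothetical sketch of the proposed control: attach an estimate to the
# first returned entry and to every N-th entry after it.


def search_with_progress(candidates, matches, every_n):
    """Yield (entry, estimate); estimate is None when no control is attached."""
    remaining = len(candidates)   # IDL size is known up front
    sent = 0
    for entry in candidates:
        remaining -= 1
        if not matches(entry):
            continue              # filter evaluation shrinks the real result
        estimate = None
        if sent % every_n == 0:
            # Upper bound: sent so far + this entry + unexamined candidates.
            estimate = sent + 1 + remaining
        sent += 1
        yield entry, estimate


results = list(search_with_progress(list(range(10)),
                                    lambda e: e % 2 == 0, every_n=2))
for entry, estimate in results:
    print(entry, estimate)
```

The estimate starts at the candidate count and tightens as candidates are rejected by the filter, which matches the remark that the real result set may be smaller than the IDL.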
Comments?
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
14 years, 2 months
dITStructureRules/nameForms in subschema subentry for informational purpose
by Michael Ströder
HI!
Discussed this very briefly with Howard at LDAPcon 2007 based on an idea
of Steve:
Support for dITStructureRules and nameForms is still in OpenLDAP's TODO.
In the meanwhile slapd could accept definitions for both in slapd.conf
and simply pass them on to a schema-aware LDAP client for informational
purposes without enforcing them, in the same way as rootDSE <file> in
slapd.conf.
Opinions?
Ciao, Michael.
--
Michael Ströder
E-Mail: michael(a)stroeder.com
http://www.stroeder.com
14 years, 2 months
LD_LIBRARY_PATH for make test
by Michael Ströder
HI!
I'm running make test on a system where older OpenLDAP libs are
installed from the Linux distribution packages (here RPMs of openSUSE
11.1). This leads to problems during 'make test':
LDAP vendor version mismatch: library 20413, header 20416
>>>>> Test failed
>>>>> ./scripts/test000-rootdse failed (exit 1)
make[2]: *** [bdb-yes] Error 1
make[2]: Leaving directory
`/usr/src/michael/openldap/OPENLDAP_REL_ENG_2_4/openldap/tests'
make[1]: *** [test] Error 2
make[1]: Leaving directory
`/usr/src/michael/openldap/OPENLDAP_REL_ENG_2_4/openldap/tests'
make: *** [test] Error 2
$ grep -r LD_LIBRARY_PATH tests/
tests/scripts/defines.sh:LD_LIBRARY_PATH=$TESTWD/../libraries:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
Hmm...
$ find libraries/ -name "*.so"
libraries/libldap_r/.libs/libldap_r.so
libraries/liblber/.libs/liblber.so
libraries/libldap/.libs/libldap.so
Does setting LD_LIBRARY_PATH in tests/scripts/defines.sh have any effect?
I've changed this line to (line wrapped)
LD_LIBRARY_PATH=$TESTWD/../libraries/liblber/.libs:$TESTWD/../libraries/libldap/.libs:$TESTWD/../libraries/libldap_r/.libs:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH
and now it works as expected.
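The corrected value can also be assembled programmatically. A minimal sketch, assuming the three library directories found above; `build_library_path` is a hypothetical helper and not part of the OpenLDAP test scripts. Unlike the shell line, it omits the trailing element when no previous LD_LIBRARY_PATH is set.

```python
import posixpath


def build_library_path(testwd, existing="",
                       libs=("liblber", "libldap", "libldap_r")):
    """Build an LD_LIBRARY_PATH pointing at the libtool .libs directories."""
    dirs = [posixpath.join(testwd, "..", "libraries", lib, ".libs")
            for lib in libs]
    if existing:
        # Keep whatever was already set, but after the freshly built libs.
        dirs.append(existing)
    return ":".join(dirs)


print(build_library_path("/src/openldap/tests", existing="/usr/lib"))
```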
This might also be the reason that some tests failed in the past
although gdb reported that the right libs were used. But the symbols
were missing in gdb's stack trace.
Ciao, Michael.
14 years, 2 months
Passwordmodify ext.Op. question
by Dieter Kluenter
Hi,
it seems that slapd generates a random password and modifies the
userPassword even though a newpassword is presented. These few lines of
Perl show what I am talking about; based on this code, slapd
generates a random password and modifies userPassword.
$msg = $ldap->set_password(
    oldpassword => '0J2zrRpD',
    newpassword => 'Mu321Ha'
);
die "Error: ",$msg->code(), ":",$msg->error() if ($msg->code());
print "Password modified to: ", $msg->gen_password() , "\n";
I know this code is buggy, but still, as newpassword is present, either
the genpassword call should be rejected or an error message should be
returned; under no circumstances should the password be replaced with
a random one.
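The expected server-side decision can be sketched as below. This is a hypothetical model of the Password Modify semantics as described in RFC 3062 (a password is generated only when the request carries no new password); the function name is made up and it is not slapd's code.

```python
import secrets


def password_modify(old_passwd, new_passwd):
    """Return (password_to_store, gen_passwd_for_response)."""
    if new_passwd is not None:
        # Client supplied a new password: store it, generate nothing.
        return new_passwd, None
    # No new password in the request: the server generates one and
    # reports it back to the client.
    generated = secrets.token_urlsafe(8)
    return generated, generated


print(password_modify("0J2zrRpD", "Mu321Ha"))
stored, returned = password_modify("0J2zrRpD", None)
print(stored == returned)
```

Under this model, a random password can only end up in userPassword if the new password never reached the server in the request.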
Any thoughts?
-Dieter
--
Dieter Klünter | Systemberatung
http://www.dpunkt.de/buecher/2104.html
sip: +49.180.1555.7770535
GPG Key ID:8EF7B6C6
53°08'09,95"N
10°08'02,42"E
14 years, 2 months