contextCSN of subordinate syncrepl DBs
by Rein Tollevik
I've been trying to figure out why syncrepl used on a backend that is
subordinate to a glue database with the syncprov overlay should save the
contextCSN in the suffix of the glue database rather than the suffix of
the backend where syncrepl is used. But all I come up with are reasons
why this should not be the case. So, unless anyone can enlighten me as
to what I'm missing, I suggest that this be changed.
The problem with the current design is that it makes it impossible to
reliably replicate more than one subordinate db from the same remote
server, as there are now race conditions where one of the subordinate
backends could save an updated contextCSN value that is picked up by the
other before it has finished its synchronization. An example of a
configuration where more than one subordinate db replicated from the
same server might be necessary is the central master described in my
previous posting in
http://www.openldap.org/lists/openldap-devel/200806/msg00041.html
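To make the scenario concrete, the kind of consumer-side setup I'm talking about looks roughly like this (suffixes, rid values, credentials and the provider URL are of course only placeholders):

# Two subordinate databases under a common glue suffix, each pulling a
# different subtree from the same remote provider.
database  bdb
suffix    "ou=people,dc=example,dc=com"
subordinate
rootdn    "ou=people,dc=example,dc=com"
directory /var/lib/ldap/people
syncrepl  rid=001
          provider=ldap://master.example.com
          type=refreshAndPersist
          searchbase="ou=people,dc=example,dc=com"
          bindmethod=simple
          binddn="cn=replica,dc=example,dc=com"
          credentials=secret

database  bdb
suffix    "ou=groups,dc=example,dc=com"
subordinate
rootdn    "ou=groups,dc=example,dc=com"
directory /var/lib/ldap/groups
syncrepl  rid=002
          provider=ldap://master.example.com
          type=refreshAndPersist
          searchbase="ou=groups,dc=example,dc=com"
          bindmethod=simple
          binddn="cn=replica,dc=example,dc=com"
          credentials=secret

# Glue database with syncprov; with the current code both consumers store
# their contextCSN here, which is where they can race with each other.
database  bdb
suffix    "dc=example,dc=com"
rootdn    "dc=example,dc=com"
directory /var/lib/ldap/glue
overlay   syncprov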
My idea as to how this race condition could be verified was to add
enough entries to one of the backends (while the consumer was stopped)
to make it possible to restart the consumer after the first backend had
saved the updated contextCSN but before the second had finished its
synchronization. But I was able to reproduce it by simply adding or
deleting an entry in one of the backends before starting the consumer.
Far too often the backend without any changes was able to pick up and
save the updated contextCSN from the producer before syncrepl on the
second backend fetched its initial value. I.e., it started with an
updated contextCSN and didn't receive the changes that had taken place
on the producer. If each syncrepl instance stored the values in the
suffix of its own database, they wouldn't interfere with each other
like this.
There is a similar problem in syncprov, as it must use the lowest
contextCSN value (with a given SID) saved by the syncrepl backends
configured within the subtree where syncprov is used. But to do that it
also needs to distinguish the contextCSN values of each syncrepl
backend, which it can't do when they all save them in the glue suffix.
This also implies that syncprov must ignore contextCSN updates from
syncrepl until all syncrepl backends have saved a value, and that
syncprov on the provider must send newCookie sync info messages when it
updates its contextCSN value and the changed entry isn't being
replicated to a consumer. I.e., as outlined in the message referred to above.
Neither of these changes should interfere with ordinary multi-master
configurations where syncrepl and syncprov are both used on the same
(glue) database.
I'll volunteer to implement and test the necessary changes if this is
the right solution. But to know whether my analysis is correct or not I
need feedback. So, comments please?
--
Rein Tollevik
Basefarm AS
contextCSN interaction between syncrepl and syncprov
by Rein Tollevik
The remaining errors and the race condition that test058 demonstrates
cannot be solved unless syncrepl is changed to always store the
contextCSN in the suffix of the database where it is configured, not in
the suffix of its glue database as it does today.
Assuming serverID 0 is reserved for the single-master case, syncrepl and
syncprov can in that case only be configured within the same database
context if syncprov is a pure forwarding server, i.e., it will not update
any CSN value and syncrepl has no need to fetch any values from it.
In the multi-master case it is only the contextCSN whose SID matches the
current serverID that syncprov maintains; the others are all received by
syncrepl. So, the only time syncrepl should need an updated CSN from
syncprov is when it is about to present it to its peer, i.e., when it
initiates a refresh phase. Actually, a race condition that would render
the state of the database undetermined could occur if syncrepl fetched
an updated CSN from syncprov during the initial refresh phase. So, it
should be sufficient to read the contextCSN values from the database
before a new refresh phase is initiated, independent of whether syncprov
is in use or not.
Syncrepl will receive updates to the contextCSN value with its own SID
from its peers, at least with ITS#5972 and ITS#5973 in place. I.e., the
normal ignoring of updates tagged with a too-old contextCSN value will
continue to work. It should also be safe to ignore all updates tagged
with a contextCSN or entryCSN value whose SID is the current server's
non-zero serverID, provided a complete refresh cycle is known to have
taken place. I.e., when a contextCSN value with the current non-zero
serverID was read from the database before the refresh phase started, or
after the persistent phase has been entered.
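In code, the rule I'm proposing amounts to something like this (a sketch only; the names are not taken from the actual slapd sources):

/* Sketch: may an incoming contextCSN/entryCSN update be ignored purely
 * because it carries our own SID?  "refresh_complete" means a contextCSN
 * with our SID was read from the database before the refresh started, or
 * the persist phase has already been entered.  Illustrative names only. */
static int
csn_sid_may_ignore( int csn_sid, int my_sid, int refresh_complete )
{
	if ( my_sid == 0 )
		return 0;	/* single-master: SID 0 is never filtered on */
	if ( csn_sid != my_sid )
		return 0;	/* foreign SID: never ignore on this basis */
	return refresh_complete;
}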
The state of the database will be undetermined unless an initial refresh
(i.e., starting from an empty database or CSN set) has been run to
completion. I cannot see how this can be avoided, and as far as I know
it is so now too. It might be worth mentioning in the documentation,
though (unless it already is).
Syncprov must continue to monitor the contextCSN updates from syncrepl.
When it receives updates destined for the suffix of the database it
itself is configured on, it must replace any CSN value whose SID matches
its own non-zero serverID with the value it manages itself (which should
be greater than or equal to the value syncrepl tried to store unless
something is seriously wrong). Updates to "foreign" contextCSN values
(i.e., those with a SID not matching the current non-zero serverID)
should be imported into the set of contextCSN values syncprov itself
maintains. Syncprov could also short-circuit the contextCSN update and
delay it until its own checkpoint. I'm not sure what effect the
checkpoint feature has today when syncrepl constantly updates the
contextCSN.
Syncprov must, when syncrepl updates the contextCSN in the suffix of a
subordinate DB, update its own knowledge of the "foreign" CSNs to be the
*lowest* CSN with any given SID stored in all the subordinate DBs (where
syncrepl is configured). And no update must take place unless a
contextCSN value has been stored in *all* the syncrepl-enabled
subordinate DBs. Any values matching the current non-zero serverID
should be updated in this case too, but a new value should probably not
be inserted.
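Expressed as code, the aggregation rule above would be roughly the following; the names and types are invented for illustration, and contextCSN values carrying the same SID (and thus having the same length and layout) order correctly as plain strings:

#include <string.h>

/* csns[] holds, for one particular SID, the contextCSN value currently
 * stored by each syncrepl-enabled subordinate DB, or NULL where that DB
 * has not stored one yet.  Returns the value syncprov should publish at
 * the glue suffix, or NULL meaning "publish nothing for this SID yet". */
static const char *
glue_csn_for_sid( const char *csns[], int nsubs )
{
	const char *lowest = NULL;
	int i;

	for ( i = 0; i < nsubs; i++ ) {
		if ( csns[i] == NULL )
			return NULL;	/* some subordinate is still missing a value */
		if ( lowest == NULL || strcmp( csns[i], lowest ) < 0 )
			lowest = csns[i];
	}
	return lowest;
}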
These changes should (unless I'm completely lost, that is..) create a
cleaner interface between syncrepl and syncprov without harming the
current multi-master configurations, and make asymmetric multi-master
configurations like the one in test058 work. Comments please?
Rein
random topics
by Howard Chu
Some of these should go onto the roadmap or TODO list. Just scribbling them
down here before I forget.
Had a chat with tridge and abartlet about the LDAP TXN support they're looking
for with Samba4. One of their requirements was being able to do reads inside
the TXN, and that is explicitly not supported by the LDAP TXN draft. But after
talking a bit more about how they do things in LDB, we decided to take the
same approach here - LDB only allows one writer when a TXN is active. We can
write a (global) overlay that gives this same behavior, and then not have to
worry about any of the other TXN details.
While thinking about back-mdb, I realized that since we're mapping storage to
memory we don't really need a traditional disk-DB structure. Instead we can
just manage the space as its own heap and use in-memory structures like AVL
trees and such. I've been thinking about extending our current AVL library to
a T-tree implementation. Further reading along those lines led me to
Cache-Sensitive T-trees (CST-Trees) which also have the important property of
being friendly to CPU cacheline sizes, and so behaving a lot better in
multi-processor systems. This is a pretty generic project - if anyone's been
looking for a relatively self-contained bit of work they could contribute,
this would be ideal. You can get the paper here:
http://ids.snu.ac.kr/wiki/CST-Trees:_Cache_Sensitive_T-Trees
(Note that these things are only good because they can fit all the keys into a
single node - i.e., they only have their most desirable properties when keys
are small. We could certainly use them for the id2entry index, and maybe a few
others, but in the general case we need to be able to use pointers to
arbitrary data as keys.)
ITS#6301 also fed into this search, but I think we just want to use threaded
AVL here instead of the Red-Black tree code. That would shrink the size of the
patch down considerably, which would also be a good thing. Again, anyone
looking for a relatively small project, this one is pretty straightforward.
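To give a feel for how small that change is, a threaded AVL node only needs a couple of extra bits per node; something along these lines (a generic sketch, not the actual avl.h layout):

/* Generic threaded-AVL node sketch.  A "thread" flag marks a child pointer
 * that points to the in-order predecessor/successor instead of a subtree,
 * giving stackless in-order traversal.  Not the actual OpenLDAP structure. */
typedef struct tavlnode {
	void            *data;
	struct tavlnode *link[2];    /* link[0] = left, link[1] = right */
	unsigned char    thread[2];  /* 1 if link[i] is a thread, not a child */
	signed char      balance;    /* AVL balance factor: -1, 0, +1 */
} TAvlnode;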
Also a completely fringe topic, but still interesting - we've got OpenLDAP
running well on Android now, on a G1 phone. But it's still just a set of
command-line tools. UnboundID showed off their Java LDAP SDK in a simple app
also running on Android. A combination of these two would make an extremely
powerful package:
Develop a canned slapd config that uses syncrepl to sync with a remote LDAP
address book. Use proxied multimaster to allow local changes to be propagated
back to the remote directory. Write a Java app that uses LDAP for the G1's
contact book backend instead of the SQLite stuff it's currently using.
Accomplishing this will necessitate writing a few GUI menus for configuring
the handful of variables needed for the slapd config.
This also brings me to another topic - adopting features from OpenDS... They
expose a cn=Tasks tree which can be used for submitting tasks via LDAP.
Currently we expose our runqueue under cn=Monitor, but that's read-only.
It would be nice to be able to submit/schedule/trigger tasks on the fly... In
particular, it would be nice to have a defined task for triggering a syncrepl
refresh.
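Nothing like this exists in slapd today, but just to make the idea concrete, submitting such a task might look something like the following LDIF; every object class and attribute name here is invented for illustration:

# Hypothetical: adding an entry under cn=Tasks triggers (or schedules) a
# syncrepl refresh for the consumer with rid=001.
dn: cn=refresh rid 001,cn=Tasks
objectClass: slapdTask
cn: refresh rid 001
slapdTaskType: syncreplRefresh
slapdTaskSyncreplRid: 001
slapdTaskScheduledTime: 20091101120000Z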
The Android slapd config would be a refreshOnly syncrepl consumer. It would
also have a hidden back-ldap refreshOnly consumer replicating from its local
database and writing to the remote master. This way both push and pull would
be under control of the Android device. (You have no idea how much of a bill
you can rack up with fully automatic background synching. It's much better to
have this under complete user control.) The GUI would just submit a request to
cn=Tasks to trigger a synch.
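The "pull" half of such a canned config might look roughly like this; suffix, URL, rid, interval and credentials are placeholders. The hidden back-ldap database that pushes local changes back to the master would be a second database set up along the same lines, with ldap://127.0.0.1 as its syncrepl provider and the remote master as its back-ldap target:

# Local copy of the address book, pulled on demand (refreshOnly).  syncprov
# is loaded so the hidden "push" database can replicate from this one.
database   bdb
suffix     "ou=contacts,dc=example,dc=com"
rootdn     "ou=contacts,dc=example,dc=com"
directory  /data/openldap/contacts
overlay    syncprov
syncrepl   rid=001
           provider=ldap://directory.example.com
           type=refreshOnly
           interval=00:01:00:00
           searchbase="ou=contacts,dc=example,dc=com"
           bindmethod=simple
           binddn="uid=phone1,ou=devices,dc=example,dc=com"
           credentials=secret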
OpenDS also has matching rules defined for comparing timestamp attributes to
"current server time". This is extremely handy for a lot of things. Again,
this is a small, self-contained project that should be simple for someone to
jump in on.
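As an illustration of why that's handy: with such a rule a client could ask for, say, entries whose password changed more than 90 days ago without computing the cutoff timestamp itself. The matching rule name in this filter is made up here, not taken from OpenDS or from any existing slapd schema:

(pwdChangedTime:relativeTimeLessThanMatch:=-90d)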
More ideas later...
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
RE24 call for testing
by Quanah Gibson-Mount
Please test RE24 for 2.4.19 preparation. Thanks!
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
tls error messages
by Ralf Haferkamp
Hi,
In case of certificate verification failures I'd like to include the
verification error message ("certificate has expired", "unable to get issuer
certificate", ...) in the diagnostic error message.
For that I need to pass the tls_session* as an extra argument to the
TI_session_errmsg functions (for OpenSSL I need the SSL* handle to get the
verification error). Does anyone see a problem with this?
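For the OpenSSL case that would boil down to something like the sketch below. SSL_get_verify_result() and X509_verify_cert_error_string() are standard OpenSSL calls; the wrapper name, the way tls_session* unwraps to an SSL*, and the message wording are only illustrative:

#include <stdio.h>
#include <openssl/ssl.h>
#include <openssl/x509_vfy.h>

/* Sketch: append the certificate verification failure reason to the
 * diagnostic message.  Only the two OpenSSL calls are real; the rest is
 * illustrative. */
static void
tls_session_errmsg( void *session, char *buf, size_t len )
{
	SSL *ssl = (SSL *) session;	/* however tls_session* maps to SSL* */
	long verify = SSL_get_verify_result( ssl );

	if ( verify != X509_V_OK ) {
		snprintf( buf, len, "TLS: peer certificate verification failed (%s)",
			X509_verify_cert_error_string( verify ) );
	}
}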
--
Ralf
Problems compiling OpenLDAP in 64 bit due to old libtool version included
by Dagobert Michelsen
Hi,
I have a problem compiling the current OpenLDAP (2.4.18) in 64 bit
on Solaris. The problem occurs when building with modules and
enabling 64-bit mode via CFLAGS rather than by setting CC to a
compiler invocation that already includes the flags. The source of
the problem is the old libtool version 1.5.x included in OpenLDAP.
The libtool maintainers recommended upgrading to at least libtool 2.2,
where the problem was fixed. It would be nice if it could be updated,
or is there a specific reason why the old version is kept?
See
<http://lists.gnu.org/archive/html/bug-libtool/2009-09/msg00017.html>
for details on identifying the bug.
Best regards
-- Dago
slapo-accesslog: Preserve some attributes of deleted entries in auditDelete entries
by Michael Ströder
HI!
When using slapo-accesslog in a meta-directory environment you might want to
query the accesslog database for quickly detecting deleted entries with
(&(objectClass=auditDelete)(reqResult=0)(<time-interval-filter>)) and act
accordingly. Now when receiving such an auditDelete entry, the entry
referenced by 'reqDN' is already gone. But the primary key used for
synchronization might be some attribute within the deleted entry and not
part of the DN.
So it would be helpful to preserve a set of configurable attributes of the
deleted entry in those entries of object class 'auditDelete' in the accesslog
database just like attribute 'reqOld' for modify and modifyDN requests
(configurable with logold/logoldattr).
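In config terms that would be something analogous to the existing logold/logoldattr handling, extended to deletes; the last directive below is purely hypothetical, the rest is the overlay as it exists today:

overlay     accesslog
logdb       cn=accesslog
logops      writes
logsuccess  TRUE
# existing: record reqOld for modify/modifyDN of entries matching the filter
logold      (objectClass=inetOrgPerson)
logoldattr  entryUUID employeeNumber
# proposed (does not exist today): also copy these attributes of a deleted
# entry into its auditDelete record
logdelattr  entryUUID employeeNumber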
Ciao, Michael.
gnutls viewpoint?
by Gavin Henry
Hi All,
Are we still recommending OpenSSL over gnutls? Is it still insecurely coded? I'm just digging out the past threads we have.
Thanks.
--
Kind Regards,
Gavin Henry.
Managing Director.
T +44 (0) 1224 279484
M +44 (0) 7930 323266
F +44 (0) 1224 824887
E ghenry(a)suretecsystems.com
Open Source. Open Solutions(tm).
http://www.suretecsystems.com/
Suretec Systems is a limited company registered in Scotland. Registered
number: SC258005. Registered office: 13 Whiteley Well Place, Inverurie,
Aberdeenshire, AB51 4FP.
Subject to disclaimer at http://www.suretecgroup.com/disclaimer.html
libldap_r
by Howard Chu
It seems the dichotomy between libldap and libldap_r is a relic from the bad
old days of dcethreads / cmathreads when linking a threaded library into an
otherwise non-threaded program would cause all sorts of strange and wonderful
failures. Unless anyone knows of any current platform where this is unsafe, I
think it's time we dropped this distinction, and just use libldap_r (until we
get to writing a completely new C API).
libldap_r is still missing some thread-specific features though - we should
wrap all library initialization in a pthread_once() call, and we should be
using thread-specific data for the LDAP* errno value.
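A minimal sketch of those two pieces with plain pthreads; the function names here are illustrative rather than the actual libldap symbols:

#include <pthread.h>
#include <stdlib.h>

static pthread_once_t ldap_init_once = PTHREAD_ONCE_INIT;
static pthread_key_t  ldap_errno_key;

static void
ldap_init_once_fn( void )
{
	/* all one-time library initialization goes here; pthread_once()
	 * guarantees it runs exactly once no matter how many threads race
	 * into the library */
	pthread_key_create( &ldap_errno_key, free );
}

void
ldap_int_initialize( void )
{
	pthread_once( &ldap_init_once, ldap_init_once_fn );
}

/* per-thread storage for the LDAP* errno value instead of a global */
int *
ldap_errno_addr( void )
{
	int *p = pthread_getspecific( ldap_errno_key );
	if ( p == NULL ) {
		p = calloc( 1, sizeof(int) );
		pthread_setspecific( ldap_errno_key, p );
	}
	return p;
}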
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
back-mdb locking strategies
by Howard Chu
There are two main problems being addressed by the mdb approach - eliminating
multiple layers of cache copying, and eliminating multiple layers of locking.
Currently we have per-entry mutexes and also BDB locks for every cached item.
The mutex overhead is ridiculous, 40-some bytes per mutex x 10 million entries
is a lot of wasted storage. Especially since a single slapd can only actually
lock 2-3 items per thread - i.e., we only need on the order of 100 locks,
ever. The sane thing to do here is continue to use the BDB approach to
locking, but with fewer options.
By default BDB allocates a new locker for every database handle and every
cursor that's created. This actually causes us a lot of trouble since it
allows two cursors created by the same thread to deadlock each other. (And
it's why we had to go to the trouble of wrapping read operations in a TXN,
even though we don't need any of the ACID properties for reads.) For mdb we
will want an API for which (essentially) threadID == lockerID and cursors and
such will not have their own lockerIDs.
Actual locks will be managed in separate tables. The most common case will be
locks indexed by entryID. For this case we can easily structure the table to
minimize lock and cache contention. E.g., for 16 threads, use a 16x16 array of
structures. Each element contains a vector of 16 (mutex,locker,entryID)
tuples. The element to lock for a given entryID is ((entryID & 0xff0) >> 4).
In the worst case of all 16 threads going after the same entry, they will all
be waiting on a single mutex which is no worse than what we have now. For all
16 threads going after the same slot, they will simply have to use other slots
in the lock vector.
But in the typical case, all threads will be going after different mutexes
residing in different processor cachelines. And, a thread that is iterating
sequentially through the database will be re-using the same mutex/cacheline
for 16 entries at a time.
(16 was chosen just for the purpose of this example. The actual numbers should
be derived from the maximum number of threads allowed to access the database.)
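A sketch of that layout and the slot computation, using the example numbers above (names and struct layout are illustrative only):

#include <pthread.h>

typedef unsigned long ID;

/* one lockable slot: which entry is held, and by which thread/locker */
typedef struct mdb_lockslot {
	pthread_mutex_t mutex;
	int             locker;	/* owning threadID, or -1 if free */
	ID              entryid;
} mdb_lockslot;

/* each element holds a vector of 16 slots; 16x16 = 256 elements overall */
typedef struct mdb_lockelem {
	mdb_lockslot slot[16];
} mdb_lockelem;

static mdb_lockelem lock_table[16][16];

/* The element for a given entryID is ((entryID & 0xff0) >> 4), so a thread
 * scanning the DB in entryID order reuses the same element (and cacheline)
 * for 16 consecutive entries before moving to the next one. */
static mdb_lockelem *
lock_elem_for_id( ID id )
{
	unsigned idx = (unsigned)((id & 0xff0) >> 4);	/* 0..255 */
	return &lock_table[idx >> 4][idx & 0xf];
}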
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/