With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
We haven't really laid out a formal roadmap for 2.5 yet, but I think most of it has been discussed here or in Development ITSs already.
I would like to be able to resolve all outstanding Development ITSs - we will either implement them or reject/close them. There are 42 outstanding at the moment.
Likewise for all outstanding ITSs in Software Bugs - many of them have been deferred because a proper fix would require invasive changes to large parts of the code base. There are 26 outstanding. With 2.5 beginning we are free to make these large scale changes.
We should also walk thru the Software Enhancement requests and decide which to accept and which to reject. Currently there are 37 outstanding.
I also have a number of specific areas I want to see worked on; some of these are included in the above ITSs but I'll outline them here:
syncrepl
  config - this is pretty unwieldy already; syncrepl needs to be moved out of the slapd core and into an overlay. That will allow a lot more flexibility in configuration while also eliminating a lot of redundant parsing code.
  suffixmassage - at the very least, we need to be able to point a consumer at a non-homogeneous suffix on the provider. We may go for complete librewrite support as well, although at this point I don't see as strong a need.
config
  TLS certs and keys should be stored as LDAP attributes, not pointers to filesystem locations. This is a prerequisite for using some of the dynamic cert generation features of the CA overlay. (This can be troublesome, as there may not be clean APIs for reading certs from memory in all of the TLS implementations we support.)
  Disabling individual config attribute values and entries. At the moment I'm thinking of adding an ";x-disabled" tag to those values.
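For illustration only: a value tagged this way (e.g. "olcDbIndex;x-disabled: ...") would still be parsed but skipped at apply time. A hypothetical tag check, not existing slapd code, could be as simple as:

    #include <string.h>
    #include <strings.h>

    /* Hypothetical helper, not existing slapd code: report whether an
     * attribute description such as "olcDbIndex;x-disabled" carries
     * the proposed x-disabled option. */
    static int
    is_disabled( const char *attrdesc )
    {
        const char *opt = strchr( attrdesc, ';' );

        while ( opt != NULL ) {
            opt++;    /* step past the ';' */
            if ( strncasecmp( opt, "x-disabled", 10 ) == 0
                && ( opt[10] == '\0' || opt[10] == ';' ) )
                return 1;
            opt = strchr( opt, ';' );
        }
        return 0;
    }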
back-mdb
  Using a single-level store for Entries will impact the entire schema engine as well. I think the simplest solution here is going to be using an mmap'd file for all of the schema elements.

  The actual design of back-mdb still needs to be defined in several areas. The single-level store approach exposes us to some new failure modes that the current multi-level backends don't have. (E.g., corruption due to bad RAM or wild pointer writes is very likely to get persisted to disk, implicitly.)

  The solution I'm considering is based on a mirroring strategy. Every database will be stored twice on disk: once in the file that is actively mmap'd into the process, and once in a write-only mirror file. On every intentional update of a memory page, we will also store a checksum of the page, and manually write the page to the mirror. If we detect a checksum failure on any in-memory page, we can still retrieve a valid copy from the mirror file. This of course doubles our potential I/O load, but I don't believe it's any worse than the load from performing write-ahead logging on a traditional database. (And yes, mirroring will take the place of writing transaction log files.)

  Some of these same considerations apply to the schema storage, but not entirely. At runtime, the schema is effectively read-only; when we do dynamic schema changes thru cn=config, all other threads are suspended. For mmap purposes, we can mark all of the schema pages read-only during runtime, and only make them read-write when cn=config is actually performing an update. As such, the only sticky issue is dealing with changes made to the back-config internal files by plain text editors and such.
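To make the mirroring concrete, here is a minimal sketch of the page-update path. Purely illustrative: the page layout, the checksum choice, and all of the names below are assumptions, not a committed design.

    #include <stdint.h>
    #include <unistd.h>

    #define PGSIZE 4096

    /* Hypothetical page layout: payload plus a checksum trailer. */
    typedef struct page {
        uint8_t  data[PGSIZE - sizeof(uint32_t)];
        uint32_t checksum;
    } page_t;

    static uint32_t
    page_checksum( const page_t *pg )
    {
        /* placeholder; a real design would use CRC32 or similar */
        uint32_t sum = 0;
        size_t i;
        for ( i = 0; i < sizeof(pg->data); i++ )
            sum = ( sum << 1 ) ^ ( sum >> 31 ) ^ pg->data[i];
        return sum;
    }

    /* On each intentional update: stamp the checksum into the mmap'd
     * copy, then write the same page to the write-only mirror file. */
    static int
    page_update( page_t *pg, int mirror_fd, off_t offset )
    {
        pg->checksum = page_checksum( pg );
        return pwrite( mirror_fd, pg, sizeof(*pg), offset )
            == (ssize_t)sizeof(*pg) ? 0 : -1;
    }

    /* On a checksum mismatch, recover the page from the mirror. */
    static int
    page_verify( page_t *pg, int mirror_fd, off_t offset )
    {
        if ( page_checksum( pg ) == pg->checksum )
            return 0;
        return pread( mirror_fd, pg, sizeof(*pg), offset )
            == (ssize_t)sizeof(*pg) ? 0 : -1;
    }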
These are the things I'm interested in. But as always, this Project is driven forward by the particular interests of each individual contributor. If you have other ideas you want to pursue, speak up.
Howard Chu wrote:
We should also walk thru the Software Enhancement requests and decide which to accept and which to reject. Currently there are 37 outstanding.
1. I'd hope to see DIT structure rules and name forms implemented.
2. If support for slapd.conf is completely dropped, would it also be possible to implement a more client-friendly back-config schema which does not have to be backward compatible with slapd.conf?
Ciao, Michael.
Michael Ströder wrote:
Howard Chu wrote:
We should also walk thru the Software Enhancement requests and decide which to accept and which to reject. Currently there are 37 outstanding.
- I'd hope to see DIT structure rules and name forms implemented.
Sure. There's already an ITS for that. Really not hard to do; DITStructureRules just need to be checked in the frontend do_add(), and not much else. Of course, this would be a good opportunity to take advantage of the per-DB subschemaSubentry, which till now has not served any purpose.
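Roughly, the do_add() check only has to answer "may an entry governed by rule X sit beneath an entry governed by rule Y?". A sketch, with all names hypothetical (these are not actual slapd types):

    /* Hypothetical types, not actual slapd structures. */
    typedef struct dit_structure_rule {
        int         dsr_ruleid;
        const char *dsr_nameform;   /* name form naming the structural OC */
        const int  *dsr_superiors;  /* permitted superior rule ids, -1 ends */
    } dit_structure_rule;

    /* In do_add(): look up the rule governing the parent entry and the
     * rule matching the new entry's structural class + RDN form, then: */
    static int
    dsr_allows( const dit_structure_rule *parent,
                const dit_structure_rule *child )
    {
        const int *sup;
        for ( sup = child->dsr_superiors; *sup != -1; sup++ )
            if ( *sup == parent->dsr_ruleid )
                return 1;
        return 0;   /* reject with namingViolation */
    }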
- If support for slapd.conf is completely dropped would it also be possible
to implement a more client-friendly back-config schema which does not have to be backward compatible with slapd.conf?
Describe what would be more client-friendly.
These are the things I'm interested in. But as always, this Project is
driven forward by the particular interests of each individual contributor. If you have other ideas you want to pursue, speak up.
Project related, rather than 2.5 specific:
* A website theme competition for an overhaul to bring us into the 2010s
* A wiki for more web-based options for user contributions, like http://wiki.samba.org
* Bringing the build farm server back
Thanks.
We should also walk thru the Software Enhancement requests and decide which to accept and which to reject. Currently there are 37 outstanding.
Here's a few things we talked about on IRC:
* Re-design and re-implementation of the C LDAP (and LBER) API.
per http://scratchpad.wikia.com/wiki/LDAP_C_API
** No global state; everything in app or connection handles. No exceptions, no mercy!
** Use function pointers to allow override of:
*** Memory allocation
*** Non-reentrant functions
*** Have sane internal defaults as well as defaults for nspr, apr, glib, etc.
** Better defaults! (v3 etc.)
** Simple function alternatives for simple apps
** Use structures instead of many arguments.
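For flavor, the sort of shape this might take. Entirely illustrative; none of these names are a committed design:

    #include <stddef.h>

    /* No global state: everything hangs off an environment/handle.
     * Allocator and friends are overridable function pointers with
     * sane internal defaults.  All names here are hypothetical. */
    typedef struct ldap_env {
        void *(*le_malloc)( size_t len, void *ctx );
        void  (*le_free)( void *ptr, void *ctx );
        void   *le_ctx;       /* e.g. an apr pool or glib context */
        int     le_version;   /* defaults to LDAPv3 */
    } LDAPEnv;

    /* Structures instead of many arguments. */
    typedef struct ldap_search_params {
        const char  *sp_base;
        int          sp_scope;
        const char  *sp_filter;
        const char **sp_attrs;
        int          sp_sizelimit;
    } LDAPSearchParams;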
* Overhaul Debug():
** Clean up multi-arg issues
** Revise (create) log formatting guidelines
** Better macros to normalize function trace enter/leave
** Support adding an optional high-res timestamp
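The timestamp piece might look something like this (sketch only; a real version would feed the existing Debug() plumbing, and ##__VA_ARGS__ is a GNU extension):

    #include <stdio.h>
    #include <time.h>

    /* Sketch only; uses the GNU ##__VA_ARGS__ extension. */
    #define LOG_TS( fmt, ... ) do { \
        struct timespec ts_; \
        clock_gettime( CLOCK_REALTIME, &ts_ ); \
        fprintf( stderr, "%ld.%09ld " fmt "\n", \
            (long)ts_.tv_sec, ts_.tv_nsec, ##__VA_ARGS__ ); \
    } while (0)

    #define TRACE_ENTER()    LOG_TS( "=> %s", __func__ )
    #define TRACE_LEAVE(rc)  LOG_TS( "<= %s: %d", __func__, (rc) )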
* Revised memory allocation:
** Common memory allocator interface with explicit scoping/extent tags
** Consider built-in substitutes for malloc with better performance
** Possible interactions/conflicts/advantages of back-mdb integration
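A sketch of what the scoped-allocator interface could look like, with invented names:

    #include <stddef.h>

    /* Invented names: an allocator handle with an explicit extent tag
     * saying when its memory is reclaimed en masse. */
    typedef enum {
        SCOPE_OPERATION,    /* freed when the operation completes */
        SCOPE_CONNECTION,   /* freed when the connection closes */
        SCOPE_GLOBAL        /* plain malloc/free semantics */
    } alloc_scope_t;

    typedef struct allocator {
        void *(*ax_alloc)( struct allocator *ax, size_t len );
        void  (*ax_free)( struct allocator *ax, void *ptr );
        alloc_scope_t ax_scope;
        void *ax_arena;     /* pool state for scoped allocators */
    } allocator_t;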
* Back-MDB concerns:
** Schema info in non-mmap RAM pointed to from an mmap()'d database? (No, and you mentioned this, but we'll need to re-arrange things, as discussed on IRC.)
** Per-DB schemas?
* Ensure fork() can be used safely in more places - essential for a fully operational slapo-(shell, perl, lua, python, etc.)
* With API cleanups here and there, start supporting more languages like those.
* liblforth for dynamic syntax extensions and other features. (back-forth just sounds like an April 1st proposal...)
* Entry cache overhaul
** The right fix for this is MDB instead, but...
** Would be very nice to specify limits in terms of memory size rather than object count. Yes, this is hard.
   - Allocation of entry blocks into usage pools with generational mark/sweep collections?
- If support for slapd.conf is completely dropped would it also
be possible to implement a more client-friendly back-config schema which does not have to be backward compatible with slapd.conf?
Describe what would be more client-friendly.
* Make database suffixes be the RDN for database configuration
  - Use unique RDNs with inherently ordered characteristics wherever possible. Obviously not for ACLs, but many other things are fixable.
* Similarly, drop the concept of backend ordering entirely and move slapo-glue into more generic VFS-like support of the DIT.
* Databases should only store one suffix. No, I'm not sure how that should integrate with people putting "" in a database yet. Need to figure that out.
* slapmodify - too useful not to implement now that we have a configdb. Bonus points if it (and slapadd) can be made safe to use with a running directory in all cases. (Perhaps a cache-invalidation-hint unix domain socket?)
* Full two-phase commit support at all layers, including read-transaction support. Back-MDB should make this feasible, and it offers lots of nice data reliability guarantees.
** Use multiple dbroots in MDB to track read/write transactions as they develop.
** Committed data in MDB is effectively read-only. Read transactions just hold their relevant generational root open.
** Write transactions hold open a parent generation and maintain a log of updates as well as a local difference-tree holding just the modified branch/leaf nodes. On prepare, this is reconciled with the new master-committed root and entered into the mmap zone (or rejected). On commit, that new root is promoted to master-committed root. (Copy-on-write.)
** Roots disappear when no one is watching them anymore. If all roots and temp-roots are known, we can easily remove internal nodes and leaves at the same time, so this is much simpler than full garbage collection.
** Root cleanup can be done asynchronously with respect to the triggering reads or writes, potentially in another thread.
** This method is equivalent to, and replaces, a journal file. Committed MDB tree pages could be mprotect()ed if desirable.
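In data-structure terms, roughly (hypothetical names, not actual MDB internals):

    #include <stdint.h>

    typedef uint64_t txnid_t;
    typedef uint64_t pgno_t;

    /* A committed generation: a root page plus the txn that made it.
     * Read txns pin a root via the refcount; once it drops to zero
     * and newer roots exist, the pages only this root reaches can be
     * reclaimed - no full garbage collection needed. */
    typedef struct db_root {
        pgno_t   dr_rootpg;
        txnid_t  dr_txnid;
        int      dr_refcount;     /* open read txns on this root */
        struct db_root *dr_next;  /* older generations */
    } db_root;

    /* A write txn builds a private difference-tree of dirty pages;
     * commit publishes a new db_root atomically (copy-on-write) and
     * never touches pages reachable from older roots. */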
Matthew Backes
Symas Corporation
mbackes@symas.com
On Wed, Dec 23, 2009 at 11:46:54AM -0800, Matthew Backes wrote:
We should also walk thru the Software Enhancement requests and decide which to accept and which to reject. Currently there are 37 outstanding.
Here's a few things we talked about on IRC:
- Re-design and re-implementation of the C LDAP (and LBER) API.
per http://scratchpad.wikia.com/wiki/LDAP_C_API
** No global state; everything in app or connection handles. No exceptions, no mercy!
** Use function pointers to allow override of:
*** Memory allocation
*** Non-reentrant functions
*** Have sane internal defaults as well as defaults for nspr, apr, glib, etc.
** Better defaults! (v3 etc.)
** Simple function alternatives for simple apps
** Use structures instead of many arguments.
For the "thread-free" piece I might throw in
http://git.samba.org/?p=samba.git;a=blob;f=source3/include/tldap.h;h=cd50298...
and
http://git.samba.org/?p=samba.git;a=blob;f=source3/lib/tldap.c;h=fa56763a335...
It is really far from complete and very much tied to Samba APIs like talloc and tevent, but the _send and _recv call model (multiple _recv calls for a search request) might provide an alternative API that I've found pretty usable so far.
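The call shape, paraphrased from memory (these are not tldap's literal signatures; see the files above for the real thing):

    /* Forward declarations for the paraphrased sketch. */
    struct async_req;
    struct event_ctx;
    struct ldap_conn;
    struct ldap_msg;

    /* _send queues the request and returns at once; _recv is called
     * as replies arrive - repeatedly for a search, once per entry,
     * referral, or final result. */
    struct async_req *ldap_search_send( void *mem_ctx,
                                        struct event_ctx *ev,
                                        struct ldap_conn *conn,
                                        const char *base, int scope,
                                        const char *filter );

    int ldap_search_recv( struct async_req *req, void *mem_ctx,
                          struct ldap_msg **msg );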
Volker
--On Tuesday, December 22, 2009 7:40 PM -0800 Howard Chu hyc@symas.com wrote:
With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
Yeah! One question from the RE side is how best to handle 2.4 fixes as 2.4 and HEAD drift further & further apart.
At Zimbra, what we do is integrate from HEAD -> branch while development/features are in parallel. Once the branch is feature frozen, the integration is from branch -> HEAD. I've already run into a few cases with RE24 where I had to ask Howard to do the integration, because the code was very different, and it's only going to increase from here on out.
One other thing Zimbra does is create specific branches for every release. The equivalent for OpenLDAP would be a 2.4 branch plus a branch cloned off of it for every release (like a 2.4.20 branch, etc). We only make the release branch when we're at "code freeze" for a given release. This allows developers to continue to make changes to the trunk of that branch without affecting the upcoming release. Any fixes we deem "release critical" then get integrated into the newly formed release branch.
That may be a bit overkill for this project, but thought I'd note it.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
On Jan 5, 2010, at 10:33 AM, Quanah Gibson-Mount wrote:
--On Tuesday, December 22, 2009 7:40 PM -0800 Howard Chu hyc@symas.com wrote:
With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
Yeah! One question from the RE side is how best to handle 2.4 fixes as 2.4 and HEAD drift further & further apart.
At Zimbra, what we do is integrate from HEAD -> branch while development/features are in parallel. Once the branch is feature frozen, the integration is from branch -> HEAD.
This implies that only features suitable for the branch can be committed, because you would never commit code to a branch unless it was intended to be released there. This approach also tends to lead to regressions.
Our current practices allow developers to generally move forward without regard to what's happening on release branches. Release engineers can pick and choose what to bring over on a per branch basis. And, yes, as branches diverge from trunk, back porting gets harder and harder. The response to this is to cut off support for older branches as they diverge too far. This is why we've always had the "at most two release branches at once" rule. One branch, the young branch, would be mostly following trunk. The old branch would be feature frozen and, at some point, restricted to only security and other critical bug fixes (and eventually moved to historic).
I've already run into a few cases with RE24 where I had to ask Howard to do the integration, because the code was very different, and it's only going to increase from here on out.
The significant divergence between 2.4 and HEAD is a sign it's time to start a 2.5 branch, so that one can further restrict what goes into 2.4. These restrictions both stabilize 2.4 and reduce the back-porting burden.
One other thing Zimbra does is create specific branches for every release. The equivalent for OpenLDAP would be a 2.4 branch plus a branch cloned off of it for every release (like a 2.4.20 branch, etc).
Why? You only need a branch if you're going to do releases off of it.
We only make the release branch when we're at "code freeze" for a given release. This allows developers to continue to make changes to the trunk of that branch without affecting the upcoming release.
We simply allow developers to continue work on HEAD while releases are being prepared. There's little need to freeze trunk.
Any fixes we deem "release critical" then get integrated into the newly formed release branch.
That may be a bit overkill for this project, but thought I'd note it.
The approach is so 1970s.
-- Kurt
--On Friday, January 08, 2010 5:18 PM -0800 Kurt Zeilenga Kurt@OpenLDAP.org wrote:
On Jan 5, 2010, at 10:33 AM, Quanah Gibson-Mount wrote:
--On Tuesday, December 22, 2009 7:40 PM -0800 Howard Chu hyc@symas.com wrote:
With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
Yeah! One question from the RE side is how best to handle 2.4 fixes as 2.4 and HEAD drift further & further apart.
At Zimbra, what we do is integrate from HEAD -> branch while development/features are in parallel. Once the branch is feature frozen, the integration is from branch -> HEAD.
This implies that only features suitable for the branch can be committed, because you would never commit code to a branch unless it was intended to be released there. This approach also tends to lead to regressions.
No. The integration process from HEAD -> branch is manual. Thus commits for any feature can go to HEAD, but only those features going into the branch get migrated over. Once the branch is frozen for features, then fixes go from branch -> HEAD.
Our current practices allow developers to generally move forward without regard to what's happening on release branches.
Same here. Except that we sometimes have 3 branches of work occurring (HEAD, most current release, release - 1) due to support reasons.
One other thing Zimbra does is create specific branches for every release. The equivalent for OpenLDAP would be a 2.4 branch plus a branch cloned off of it for every release (like a 2.4.20 branch, etc).
Why? You only need a branch if you're going to do releases off of it.
Because we have customers to support, which sometimes requires hot fixes to an old release (sometimes several releases back) where we integrate a single fix to that release branch, rebuild the binaries, and give it to them. As I noted, this type of stuff is likely very overkill for the project.
We only make the release branch when we're at "code freeze" for a given release. This allows developers to continue to make changes to the trunk of that branch without affecting the upcoming release.
We simply allow developers to continue work on HEAD while releases are being prepared. There's little need to freeze trunk.
Any fixes we deem "release critical" then get integrated into the newly formed release branch.
That may be a bit overkill for this project, but thought I'd note it.
The approach is so 1970s.
There are a number of reasons why this approach is taken. We didn't use to make release branches, and it was a nightmare not to. The development requirements and process for Zimbra are substantially different from what is needed for OpenLDAP, which is why I noted this was likely a bit overkill.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra :: the leader in open source messaging and collaboration
On Jan 11, 2010, at 10:49 AM, Quanah Gibson-Mount wrote:
--On Friday, January 08, 2010 5:18 PM -0800 Kurt Zeilenga Kurt@OpenLDAP.org wrote:
On Jan 5, 2010, at 10:33 AM, Quanah Gibson-Mount wrote:
--On Tuesday, December 22, 2009 7:40 PM -0800 Howard Chu hyc@symas.com wrote:
With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
Yeah! One question from the RE side is how best to handle 2.4 fixes as 2.4 and HEAD drift further & further apart.
At Zimbra, what we do is integrate from HEAD -> branch while development/features are in parallel. Once the branch is feature frozen, the integration is from branch -> HEAD.
This implies that only features suitable for the branch can be committed, because you would never commit code to a branch unless it was intended to be released there. This approach also tends to lead to regressions.
No. The integration process from HEAD -> branch is manual. Thus commits for any feature can go to HEAD, but only those features going into the branch get migrated over. Once the branch is frozen for features, then fixes go from branch -> HEAD.
Regression risk is huge here. If you always commit to trunk first, your regression risk is low.
Also, you never have to forward port, only back port.
And you can test on trunk and only bring onto the branch after the patch has been determined to be releasable. This improves stability of the release branches.
Our current practices allow developers to generally move forward without regard to what's happening on release branches.
Same here. Except that we sometimes have 3 branches of work occurring (HEAD, most current release, release - 1) due to support reasons.
One other thing Zimbra does is create specific branches for every release. The equivalent for OpenLDAP would be a 2.4 branch plus a branch cloned off of it for every release (like a 2.4.20 branch, etc).
Why? You only need a branch if you're going to do releases off of it.
Because we have customers to support, which sometimes requires hot fixes to an old release (sometimes several releases back) where we integrate a single fix to that release branch, rebuild the binaries, and give it to them.
Our hot fixes are just another release.
As I noted, this type of stuff is likely very overkill for the project.
I think it's a problematic model (release-branch first) in general. With today's concurrent development practices and shared engineering responsibilities, trunk first (IMO) is better.
We only make the release branch when we're at "code freeze" for a given release. This allows developers to continue to make changes to the trunk of that branch without affecting the upcoming release.
We simply allow developers to continue work on HEAD while releases are being prepared. There's little need to freeze trunk.
Any fixes we deem "release critical" then get integrated into the newly formed release branch.
That may be a bit overkill for this project, but thought I'd note it.
The approach is so 1970s.
There are a number of reasons why this approach is taken. We didn't use to make release branches, and it was a nightmare not to.
That's because you (not us) want to be able to do hot fixes for any release. But as long as you have a tag, you can subsequently branch. We always tag releases. We've never had a need to branch off those tags.
The development requirements and process for Zimbra are substantially different from what is needed for OpenLDAP, which is why I noted this was likely a bit overkill.
Right. But you did suggest we consider it. I note that I did consider this ages ago (when I set up the repo) and rejected it.
--Quanah
--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
On Wed, Dec 23, 2009 at 1:40 PM, Howard Chu hyc@symas.com wrote:
With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
One feature might be some sort of internal thread or process API.
Perhaps the ability to run backend(s) in a separate thread of execution, and also maybe the ability to run arbitrary threads in the server core for things such as cache management, without having to implement a backend.
Cheers,
Brett
Brett @Google wrote:
On Wed, Dec 23, 2009 at 1:40 PM, Howard Chu <hyc@symas.com> wrote:
With 2.4.21 out, and hopefully stable enough to promote to the next Stable release, it's time to feature-freeze 2.4 and prepare for the 2.5 branch. As I already announced to the OpenLDAP-Committers, we're also planning to switch from CVS to GIT in mid-January. Commits for 2.5 will begin after we've settled into GIT.
One feature might be some sort of internal thread or process API.
We already have the runqueue manager and thread pool for using internal threads.
Perhaps the ability to run backend(s) in a separate thread of execution, and also maybe the ability to run arbitrary threads in the server core for things such as cache management, without having to implement a backend.
We tried cache pruning in a separate thread before. It doesn't make anything faster. Threads aren't a magic solution to everything. Frequently they're not the solution to *anything*.
Note that this mailing list is for discussion between actual developers of OpenLDAP software - a basic precondition is thus that you've actually spent time developing the code, and know what it already implements. Another obvious precondition is that you know the LDAP specs well enough to actually write code that implements the specs correctly.
Howard Chu wrote:
Brett @Google wrote:
Perhaps the ability to run backend(s) in a separate thread of execution, and also maybe the ability to run arbitrary threads in the server core for things such as cache management, without having to implement a backend.
We tried cache pruning in a separate thread before. It doesn't make anything faster. Threads aren't a magic solution to everything. Frequently they're not the solution to *anything*.
Just to clarify - in back-bdb/hdb, we tried putting the cache pruning in a separate thread. The problem was that if the cache was full and the slapd was fully loaded, it was very likely that the cache thread would not get a chance to run until long after the cache had grown too large. Thread scheduling is non-deterministic, and if there are a lot of LDAP requests going into the thread pool, you can't guess when the cache thread will get to start, and have no idea when it will finish.
In the pcache overlay, cache pruning still runs in a separate thread. While the same constraints apply, the actual job is different, so it works fine.
As for "cache management, without having to implement a backend" - isn't that exactly what the pcache overlay provides? What are you asking about?
Howard Chu wrote:
These are the things I'm interested in. But as always, this Project is driven forward by the particular interests of each individual contributor. If you have other ideas you want to pursue, speak up.
Hi Howard,
first of all, many thanks for the chance to influence future development!
I cannot estimate how complicated it would be (or whether it is even possible) to provide access to an operation's details during normalization, but I would appreciate seeing such a possibility in 2.5. My question/request relates to "server side current time matching" (ssctm), which I'm experimenting with from time to time (every now and then ;-)). Among other things, ssctm is planned to provide support for filter statements like (timestampAttr>=NOW) and so on.
My (currently very experimental) prototype expands the above example filter's assertion value during its normalization, determining the current time in generalizedTime syntax on demand (using a system call). The result is used for the later matching operation(s). There are two points I currently don't know how to solve properly without many changes to 2.4:
- In my opinion it would be a general advantage to use the constant operation timestamp "op->o_time" instead of a system call, but there is currently no way (at least none that I'm aware of) to access an operation's details from within the normalization (or matching rule) functions.
- Additionally, some kind of reliable evaluation of filter lists, like for example (&(timestampAttr=NOW)(timestampAttr=NOW)), is needed too.
In my current understanding of the 2.4 API, each element of a filter list gets normalized independently and thus potentially sequentially (I have not investigated/tested this in depth). With the above on-demand system calls, the timestamps resulting from equivalent assertion values could differ, which could unexpectedly lead to no, too few, or too many matches (depending on the logical combination and the details of the filter list; the above example is just for demonstration and represents the most obvious potential failure scenario).
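For illustration: if normalization could see the operation, the timestamp would be computed exactly once from op->o_time, and every NOW in the filter list would normalize identically. A sketch (names invented):

    #include <time.h>

    /* Render a time_t (ideally op->o_time) as a generalizedTime
     * value, e.g. "20100111184500Z".  Using the operation's single
     * timestamp makes every NOW in a filter list normalize to the
     * same value. */
    static int
    now_to_generalized_time( time_t op_time, char *buf, size_t buflen )
    {
        struct tm tm;

        if ( gmtime_r( &op_time, &tm ) == NULL )
            return -1;
        return strftime( buf, buflen, "%Y%m%d%H%M%SZ", &tm ) ? 0 : -1;
    }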
Many thanks for your feedback.
Best regards,
Daniel