overlay order important and not documented?


Francis Swasey

16 Mar 2007

3 p.m.

Using OpenLDAP 2.3.34 on a RHEL4 system (a locally built RPM), I have found that the order of the overlays can cause slapd to lock up and stop doing anything.

Configuring syncprov, accesslog, auditlog, and unique, in that order, works fine. But if I add refint to the end of that chain and then do a modrdn that makes refint want to change an attribute in another entry, slapd locks up. One thread never ends, and slapd can only be shut down with "kill -9".

However, if I change the order so that refint comes before the accesslog overlay, it works.

Is this interaction between the two overlays, and the importance of their relative order, known (or at least expected), or should I open an ITS?

-- 
Frank Swasey                    | http://www.uvm.edu/~fcs
Sr Systems Administrator        | Always remember: You are UNIQUE,
University of Vermont           | just like everyone else.
  "I am not young enough to know everything."
                    - Oscar Wilde (1854-1900)


Francis Swasey

16 Mar

4:25 p.m.

On 3/16/07 10:00 AM, Francis Swasey wrote:

...

Is this interaction between the two overlays and the importance of their relative order a known (or at least expected) -- or should I open an ITS?

Before anyone spends a lot of time on this: I have downloaded the version of refint.c from HEAD and am testing it to see whether the fixes already applied for ITS4802 and ITS4853 might have fixed the problem.

Pierangelo Masarati

5:11 p.m.

Francis Swasey wrote:

...

I have found using OpenLDAP 2.3.34 on a RHEL4 system (a locally built RPM) that the order of the overlays can lead to a problem (slapd locks up and doesn't do anything).

I have found that using syncprov, accesslog, auditlog, unique will work fine. But, adding refint to the end of that chain and then do a modrdn that triggers refint to want to change an attribute in another entry and your slapd is locked up. There is now one thread that will not end and you will only be able to shutdown slapd with a "kill -9".

However, if I change the order so that the refint is prior to the accesslog overlay, then it works.

Is this interaction between the two overlays and the importance of their relative order a known (or at least expected) -- or should I open an ITS?

AFAIK, it is not known, although some strange and adverse interaction is possible (and feared). I suggest you file an ITS.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office:   +39.02.23998309
Mobile:   +39.333.4963172
Email:    pierangelo.masarati@sys-net.it
------------------------------------------

Howard Chu

7:36 p.m.

Pierangelo Masarati wrote:

...

Francis Swasey wrote:

...
I have found using OpenLDAP 2.3.34 on a RHEL4 system (a locally built RPM) that the order of the overlays can lead to a problem (slapd locks up and doesn't do anything).

It's already documented that overlays are executed in a specific order. Obviously if the order didn't matter we wouldn't worry about it.

...

...
I have found that using syncprov, accesslog, auditlog, unique will work fine. But, adding refint to the end of that chain and then do a modrdn that triggers refint to want to change an attribute in another entry and your slapd is locked up. There is now one thread that will not end and you will only be able to shutdown slapd with a "kill -9".

However, if I change the order so that the refint is prior to the accesslog overlay, then it works.

Is this interaction between the two overlays and the importance of their relative order a known (or at least expected) -- or should I open an ITS?

AFAIK, it is not known, although some strange and adverse interaction is possible (and feared). I suggest you file an ITS.

Not documented... The accesslog overlay serializes all write operations by taking a lock. This means only one write operation is allowed to progress at any time.

The refint overlay creates multiple write operations from a single write operation. If the accesslog overlay has already locked the current operation, then yes, the refint overlay will deadlock at that point because its write operations still go through the entire overlay stack. I think the fix for this will be to change the refint overlay to bypass any overlays above it when performing its own writes.
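The deadlock described above can be sketched as a toy model. This is only an illustration in Python, not code from slapd: the function names, the single `write_lock`, and the timeout are all assumptions made for the example (the timeout exists only so the sketch reports the problem instead of hanging, as the real server reportedly does).

```python
import threading

# Toy model only: these names are illustrative, not slapd internals.
write_lock = threading.Lock()  # non-reentrant, like a plain mutex


def accesslog_write(op, next_handler, timeout=0.2):
    """Serialize writes: hold the lock while the rest of the stack runs."""
    if not write_lock.acquire(timeout=timeout):
        # Without a timeout, the thread would wait here forever.
        return f"{op}: DEADLOCK (lock already held)"
    try:
        return next_handler(op)
    finally:
        write_lock.release()


def database_write(op):
    """Bottom of the stack: the write itself."""
    return f"{op}: ok"


def refint_write(op, stack_entry):
    """Turn one write into a second write that re-enters the whole stack."""
    nested = stack_entry(f"refint-fixup-for-{op}")
    return f"{op}: ok; nested -> {nested}"


def full_stack(op):
    """Entry point used by the frontend and (wrongly) by refint's own writes."""
    if op.startswith("refint-fixup"):
        return accesslog_write(op, database_write)
    # accesslog takes the lock, then refint runs while it is still held.
    return accesslog_write(op, lambda o: refint_write(o, full_stack))


result = full_stack("modrdn")
print(result)
```

In this model the outer write succeeds, but the nested write that refint generates cannot take the lock a second time, which mirrors the single stuck thread in the original report.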

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  Chief Architect, OpenLDAP     http://www.openldap.org/project/

Pierangelo Masarati

7:42 p.m.

Howard Chu wrote:

...

Not documented... The accesslog overlay serializes all write operations by taking a lock. This means only one write operation is allowed to progress at any time.

The refint overlay creates multiple write operations from a single write operation. If the accesslog overlay has already locked the current operation, then yes, the refint overlay will deadlock at that point because its write operations still go through the entire overlay stack. I think the fix for this will be to change the refint overlay to bypass any overlays above it when performing its own writes.

Wouldn't this prevent writes by refint from being logged by accesslog?

Ing. Pierangelo Masarati OpenLDAP Core Team

Howard Chu

7:53 p.m.

Pierangelo Masarati wrote:

...

Howard Chu wrote:

...
Not documented... The accesslog overlay serializes all write operations by taking a lock. This means only one write operation is allowed to progress at any time.

The refint overlay creates multiple write operations from a single write operation. If the accesslog overlay has already locked the current operation, then yes, the refint overlay will deadlock at that point because its write operations still go through the entire overlay stack. I think the fix for this will be to change the refint overlay to bypass any overlays above it when performing its own writes.

Wouldn't this prevent writes by refint from being logged by accesslog?

Yes, and that's probably the cleanest behavior. The overlay stack is only supposed to be executed completely for operations that came from the frontend. Any overlay on the stack should go through any other overlays below it, but not through any that are above it.


Pierangelo Masarati

7:58 p.m.

Howard Chu wrote:

...

Yes, and that's probably the cleanest behavior. The overlay stack is only supposed to be executed completely for operations that came from the frontend. Any overlay on the stack should go through any other overlays below it, but not through any that are above it.

However, the accesslog would then be unusable for replication. I see the point: refint should be present on the shadow as well and perform the refint modifications locally, right?

Ing. Pierangelo Masarati OpenLDAP Core Team

Howard Chu

8:17 p.m.

Pierangelo Masarati wrote:

...

Howard Chu wrote:

...
Yes, and that's probably the cleanest behavior. The overlay stack is only supposed to be executed completely for operations that came from the frontend. Any overlay on the stack should go through any other overlays below it, but not through any that are above it.

However, the accesslog would be unusable for replication; I see the point: refint should be present on the shadow as well, and do locally refint modifications, right?

Yes, that's one alternative. The other alternative is to put refint above the accesslog, as Frank has already done. In the latter case you replicate the refint ops explicitly, in the other you don't. But the current behavior, where refint executes the whole stack regardless of its position in the stack, is definitely not desired.
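The two alternatives can be illustrated with a small Python sketch of the proposed rule, under which an operation generated by an overlay passes only through the layers below that overlay. The list representation, the `layers_for` helper, and the two example stacks are hypothetical and exist only for this illustration; "above" and "below" refer to positions in the model's list, not to a verified reading of slapd.conf ordering.

```python
def layers_for(stack, origin=None):
    """Return the layers an operation passes through, top of stack first.

    origin=None models an operation arriving from the frontend, which sees
    the whole stack. Otherwise the operation was generated by the overlay
    named `origin` and only sees the layers below it.
    """
    if origin is None:
        return list(stack)
    return stack[stack.index(origin) + 1:]


# Two hypothetical stacks, listed top first, database last.
accesslog_above = ["accesslog", "refint", "database"]
refint_above = ["refint", "accesslog", "database"]

# refint's own writes skip accesslog, so they are not logged.
bypassed = layers_for(accesslog_above, origin="refint")

# refint's own writes still reach accesslog, so they can be logged.
logged = layers_for(refint_above, origin="refint")

print(bypassed)
print(logged)
```

In the first stack the replicas would need their own refint to redo the fix-ups; in the second the fix-ups appear in the log and are replicated explicitly.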


Francis Swasey

10:36 p.m.

On 3/16/07 2:58 PM, Pierangelo Masarati wrote:

...

Howard Chu wrote:

...
Yes, and that's probably the cleanest behavior. The overlay stack is only supposed to be executed completely for operations that came from the frontend. Any overlay on the stack should go through any other overlays below it, but not through any that are above it.

However, the accesslog would be unusable for replication; I see the point: refint should be present on the shadow as well, and do locally refint modifications, right?

I disagree! Since the only modifications that arrive at the replicas (or shadows) come from the master, I should be able to set up the unique and refint overlays on the master and be done with it. I should not have to waste CPU cycles on each replica performing the same operations that were already worked out and applied on the master LDAP server.

Howard Chu

17 Mar

10:52 a.m.

Francis Swasey wrote:

...

On 3/16/07 2:58 PM, Pierangelo Masarati wrote:

...
Howard Chu wrote:

...
Yes, and that's probably the cleanest behavior. The overlay stack is only supposed to be executed completely for operations that came from the frontend. Any overlay on the stack should go through any other overlays below it, but not through any that are above it.

However, the accesslog would be unusable for replication; I see the point: refint should be present on the shadow as well, and do locally refint modifications, right?

I disagree! Since the only modifications that arrive at the replicas (or shadows) came from the master. I should be able to set up the overlays for unique and refint on the master and be done with it. I should not have to waste the cpu cycles on each of the replicas to perform the same operations that were already figured out and done on the master ldap server.

There are valid reasons to do it either way. For one thing, instantiating the refint overlay on the replicas reduces network traffic. Also, if you have a hot standby setup, you want the replicas to be configured as nearly identically to the master as possible. (In mirrormode the replica and the master should be identical, or mirror images, anyway.)


Francis Swasey

19 Mar

2:31 p.m.

On 3/17/07 5:52 AM, Howard Chu wrote:

...

Francis Swasey wrote:

<snip>

...

...
I disagree! Since the only modifications that arrive at the replicas (or shadows) came from the master. I should be able to set up the overlays for unique and refint on the master and be done with it. I should not have to waste the cpu cycles on each of the replicas to perform the same operations that were already figured out and done on the master ldap server.

There are valid reasons to do it either way - for one thing, instantiating the refint overlay on the replicas reduces network traffic. Also if you have a hot standby setup you want the replicas to be configured as nearly identically to the master as possible. (In mirrormode the replica and the master should be identical, or mirror image, anyway.)

Sadly, while there are valid reasons for doing it either way, the existing code forces me to run refint the way I don't want to, namely on every replica.

I do not have, nor do I want, a hot standby configuration. If the master server goes down, no updates are possible until I get it fixed. (Since I'm running the master as a guest under VMware ESX 3 with HA configured, hardware problems on the master are not a big concern to me.)

So, how can the interaction between accesslog (required for syncrepl to work) and refint be resolved so that refint can be configured after accesslog (and so gets executed before accesslog logs the transactions for syncrepl)?

Frank

Michael Ströder

17 Mar

10:53 a.m.

Francis Swasey wrote:

...

On 3/16/07 2:58 PM, Pierangelo Masarati wrote:

...
Howard Chu wrote:

...
Yes, and that's probably the cleanest behavior. The overlay stack is only supposed to be executed completely for operations that came from the frontend. Any overlay on the stack should go through any other overlays below it, but not through any that are above it.

However, the accesslog would be unusable for replication; I see the point: refint should be present on the shadow as well, and do locally refint modifications, right?

I disagree! Since the only modifications that arrive at the replicas (or shadows) came from the master. I should be able to set up the overlays for unique and refint on the master and be done with it.

Ciao, Michael.

Francis Swasey

16 Mar

10:32 p.m.

On 3/16/07 2:36 PM, Howard Chu wrote:

...

Pierangelo Masarati wrote:

...
Francis Swasey wrote:

...
I have found using OpenLDAP 2.3.34 on a RHEL4 system (a locally built RPM) that the order of the overlays can lead to a problem (slapd locks up and doesn't do anything).

It's already documented that overlays are executed in a specific order. Obviously if the order didn't matter we wouldn't worry about it.

Yes, and it is documented that the overlays receive control in the reverse of the order in which they are configured. I therefore thought that, since accesslog was configured in slapd.conf before refint, refint would receive control first and all of refint's operations would then get logged by accesslog.

...

...
...
I have found that using syncprov, accesslog, auditlog, unique will work fine. But, adding refint to the end of that chain and then do a modrdn that triggers refint to want to change an attribute in another entry and your slapd is locked up. There is now one thread that will not end and you will only be able to shutdown slapd with a "kill -9".

However, if I change the order so that the refint is prior to the accesslog overlay, then it works.

Is this interaction between the two overlays and the importance of their relative order a known (or at least expected) -- or should I open an ITS?

AFAIK, it is not known, although some strange and adverse interaction is possible (and feared). I suggest you file an ITS.

Not documented... The accesslog overlay serializes all write operations by taking a lock. This means only one write operation is allowed to progress at any time.

That fact is not obvious from reading the source code, is not mentioned in slapo-accesslog, and is not in any of the brief statements in the admin guide about overlays.

So, I do not believe that Howard's paragraph is clear to anyone using accesslog (other than people who are deeply involved in the code, for whom the variable name li has some special significance).

...

The refint overlay creates multiple write operations from a single write operation. If the accesslog overlay has already locked the current operation, then yes, the refint overlay will deadlock at that point because its write operations still go through the entire overlay stack. I think the fix for this will be to change the refint overlay to bypass any overlays above it when performing its own writes.

But if the refint overlay is specified in slapd.conf before the accesslog overlay, it works, and if accesslog is configured first and refint second, it doesn't. That says (to me) that the slapd.conf manual page's declaration

Overlays are pushed onto a stack over the database, and so they will execute in the reverse of the order in which they were configured and the database itself will receive control last of all.

is wrong.
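Read literally, the quoted manual statement implies the order below for the overlay chain listed at the start of this thread. The Python sketch only restates the manual's wording as a stack; it is not a claim about what slapd 2.3.34 actually does, which is exactly what is in question here.

```python
# Overlays in the order they appear in the configuration from the first post.
configured = ["syncprov", "accesslog", "auditlog", "unique", "refint"]

# "Pushed onto a stack over the database": each overlay lands on top of
# the ones configured before it.
stack = []
for name in configured:
    stack.append(name)

# Control passes from the top of the stack down; the database comes last.
execution_order = []
while stack:
    execution_order.append(stack.pop())
execution_order.append("database")

print(execution_order)
```

Under that reading refint, configured last, receives control first and accesslog only later, which is the expectation that the observed lock-up contradicts.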

Frank


openldap-software@openldap.org
