arun s wrote:
Hi, to try to overcome this issue, we tried two fixes:
- Every time a user was deleted from a group, we force-updated the user object manually to make sure its entryCSN got updated and it got replicated properly. This is an expensive operation; it did not scale well for big group sizes (10-20k) and did not work out.
This is also one of the reasons the code in memberof.c was reverted. Working as you suggest *cannot* scale. It makes no difference whether you do it in your external client or inside slapd, the amount of actual work required to replicate everything quickly grows out of control. This is why the documentation states that memberof must be configured on each replica - the only way to successfully execute the amount of work is to distribute the work evenly to each replica.
- We then tried to do the same thing in OpenLDAP. We noticed in the memberof.c commits that there were a couple of patches to force the entryCSN of the user object to get updated (http://tinyurl.com/8k4qrdj and http://tinyurl.com/9akqgfl). These have since been reverted because of access log and some replication issues, but for us, speed was a higher priority, so I reapplied these patches to the code. This solved the memberOf replication issue, but we noticed that occasionally, under heavy load, OpenLDAP's memory usage suddenly surged until it consumed all available memory, and it finally crashed.
We have gone back to option (1), though (2) would be the preferred option.
Any help on figuring out why (2) caused the memory bloat would be really great. I can provide more details/memory traces if needed.
We will be glad to contribute any fixes once we are able to nail down the issue.
Probably this conversation should continue on the openldap-devel mailing list. It needs some new design work; it is not a simple bugfix.
Thanks, Arunkumar
*From:* Howard Chu hyc@symas.com
*To:* arun s arunkumar_1123@yahoo.com
*Cc:* "openldap-its@openldap.org" openldap-its@openldap.org
*Sent:* Monday, 1 October 2012 6:59 PM
*Subject:* Re: (ITS#7400) Memberof and Syncrepl incompatibility
arun s wrote:
Hi, yes, I am able to reproduce the issue with 2.4.32.
Making sense of the logs for the exact reproduction is hard since it needs a lot of operations in a short time. But this will probably help:
At the start of the test, the group temp_group existed.
I created a user temp_user and added temp_user to temp_group.
During replication, the group was replicated first, and I got an error 32 (NO_SUCH_OBJECT) when it tried to modify the memberOf of the temp_user object (which does not exist on the read slave yet).
- The temp_user object was replicated next, and as you see, querying it does show a memberOf attribute, proving that this field was replicated. (Note that I have run OpenLDAP with debug level -1 and verified that the replicated data has the memberOf field in it. I can provide this too if needed.)
I see. The current code drops the memberOf attribute if it was not explicitly requested in the search. However, by default the consumer requests "+" which means "all operational attributes" and so slapd considers memberOf to have been requested. We need to reconsider this aspect of the design.
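To make the attribute selection described above concrete, here is a minimal Python sketch of the rule. It is not slapd code: the `OPERATIONAL` set and the `is_requested` helper are hypothetical names invented for illustration, and the sketch only models the standard LDAP conventions that "+" selects all operational attributes and that "*" (or an empty list) selects all user attributes.

```python
# Toy model of LDAP attribute selection; NOT the slapd implementation.
# OPERATIONAL and is_requested are hypothetical names used for illustration.
OPERATIONAL = {"memberof", "entrycsn", "entryuuid", "modifytimestamp"}


def is_requested(attr, requested):
    """Return True if `attr` is selected by the search's attribute list."""
    wanted = {a.lower() for a in requested}
    name = attr.lower()
    if name in wanted:
        return True  # named explicitly
    if name in OPERATIONAL:
        return "+" in wanted  # "+" means all operational attributes
    # "*" or an empty list means all user attributes
    return "*" in wanted or not wanted


# A consumer asking for "*" and "+" counts as having requested memberOf,
# while a plain "*" search does not.
print(is_requested("memberOf", ["*", "+"]))
print(is_requested("memberOf", ["*"]))
```

Under this model, a consumer that asks for "+" is indistinguishable from a client that explicitly asked for memberOf, which is why the attribute is not dropped during replication.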
- The more serious problem occurs when the sequence is reversed and the group has been deleted as the last operation: the user is replicated first, but since the group has been deleted, it is never replicated, and a stale memberOf value stays with the user.
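The two orderings described above can be modelled with a small Python sketch. This is a toy simulation under stated assumptions, not slapd or memberof.c logic: the `Replica` class and its methods are hypothetical, and the only detail taken from the report is that result code 32 is NO_SUCH_OBJECT.

```python
# Toy model of a replica that maintains memberOf locally; NOT slapd code.
# Replica, apply_group and apply_user are hypothetical names.
NO_SUCH_OBJECT = 32  # LDAP result code for a missing entry


class Replica:
    def __init__(self):
        self.users = {}   # user name -> set of memberOf values
        self.groups = {}  # group name -> set of member names

    def apply_group(self, group, members):
        """Replicate a group, then try to update memberOf on each member."""
        self.groups[group] = set(members)
        errors = []
        for member in members:
            if member not in self.users:
                # The member entry has not arrived on this replica yet.
                errors.append((member, NO_SUCH_OBJECT))
            else:
                self.users[member].add(group)
        return errors

    def apply_user(self, user, member_of):
        """Replicate a user together with the memberOf values it carries."""
        self.users[user] = set(member_of)


# Case 1: the group change arrives before the user it references.
r1 = Replica()
case1_errors = r1.apply_group("temp_group", ["temp_user"])
r1.apply_user("temp_user", ["temp_group"])

# Case 2: the user arrives carrying memberOf, but the group was deleted
# on the provider and therefore never reaches this replica.
r2 = Replica()
r2.apply_user("temp_user", ["temp_group"])
stale = {g for g in r2.users["temp_user"] if g not in r2.groups}

print(case1_errors)
print(stale)
```

In the first case the replica logs an error but ends up consistent once the user arrives; in the second, nothing ever arrives to clear the memberOf value, so it remains stale.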