quanah@openldap.org wrote:
Full_Name: Quanah Gibson-Mount
Version: 2.4.47
OS: N/A
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (47.208.144.40)
In testing a particular use case/setup scenario, I found that it's possible to cause a replica to slam a provider with unending requests. In this specific case, I was setting up delta-syncrepl MMR, but I believe the issue applies to standard syncrepl as well, and is not MMR-specific. The scenario looks like this:
Initially we have a standalone server, with no overlays in place. The configuration is done via cn=config, which allows us to update the configuration without a server restart.
The configuration is modified to load the syncprov and accesslog overlays, create a new accesslog database, and to send all change data to the accesslog db.
After that is done, a secondary server is brought online with the same configuration other than the serverID being different and the syncrepl statement adjusted.
When the secondary server is started, it pummels the initial provider with queries like:
Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SRCH attr=* +
Apr 23 06:39:06 anvil4 slapd[28967]: conn=1003 op=361868131 SEARCH RESULT tag=101 err=0 nentries=0 text=
(Averaging around 2000 queries/second on my server per syncrepl client).
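For anyone wanting to measure the rate on their own provider, here is a rough sketch that counts SRCH operations per connection per second from stats-level log lines in the format shown above. The function name searches_per_second and the regular expression are my own illustration, not anything shipped with OpenLDAP, and the pattern assumes the traditional syslog timestamp prefix seen in this report.

```python
import re
from collections import Counter

# Matches the syslog prefix and the "SRCH base=" line that opens each search.
# The companion "SRCH attr=" and "SEARCH RESULT" lines are deliberately skipped
# so that each search operation is counted exactly once.
SRCH_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+slapd\[\d+\]:\s+"
    r"conn=(?P<conn>\d+)\s+op=\d+\s+SRCH base="
)


def searches_per_second(lines):
    """Return a Counter keyed by (conn, timestamp) with the number of searches."""
    counts = Counter()
    for line in lines:
        match = SRCH_RE.match(line)
        if match:
            counts[(match.group("conn"), match.group("ts"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python srch_rate.py < /var/log/slapd.log  (log path is an assumption)
    for (conn, ts), n in sorted(searches_per_second(sys.stdin).items()):
        print(f"{ts} conn={conn} searches={n}")
```

A healthy consumer should show at most a handful of searches per retry interval; thousands per second on a single conn is the symptom described here.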
I believe the problem is that the root entry for the database contains no contextCSN. This is likely due to the fact that:
a) There was never a syncprov overlay present until I loaded this one in
b) The serverID was set prior to the syncprov overlay being loaded (So it went from "0" to "1", with no changes ever recorded for "1").
Now there is a trivial way to handle this: make a change on the provider prior to starting up the other servers.
However, I think the overall behavior is undesirable. If there is no contextCSN present, it should not lead to replication clients executing a potential DoS on the provider. It also generated ~60GB of logs at loglevel stats in 1 day.
The consumer should not be reconnecting more frequently than its retry config.
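To illustrate the pacing I would expect, here is a minimal sketch of how a retry setting of the form documented in slapd.conf(5) ("<retry interval> <# of retries>" pairs, with "+" meaning retry indefinitely) translates into a sequence of waits. This is not slapd's actual code; retry_delays is a hypothetical helper written only to show the expected behavior.

```python
from itertools import islice


def retry_delays(spec):
    """Yield wait times in seconds for a syncrepl-style retry spec.

    The spec is a whitespace-separated list of "<interval> <count>" pairs,
    where a count of "+" means "keep retrying at this interval forever".
    """
    tokens = spec.split()
    if not tokens or len(tokens) % 2:
        raise ValueError("expected '<interval> <count>' pairs")
    for interval, count in zip(tokens[0::2], tokens[1::2]):
        seconds = int(interval)
        if count == "+":
            # Indefinite retries: any later pairs are never reached.
            while True:
                yield seconds
        else:
            for _ in range(int(count)):
                yield seconds


# With retry="5 2 300 +", the first four waits would be:
print(list(islice(retry_delays("5 2 300 +"), 4)))  # [5, 5, 300, 300]
```

Under such a setting the consumer would issue at most one attempt every 5 seconds, and later one every 300 seconds, rather than the roughly 2000 searches per second seen in the logs.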