Yes, sync logging is enabled. The entries below are the last records in the log files; nothing is logged after them, and the replica is not attempting to get changes.

Log on the replica:
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: slap_queue_csn: queueing 0x7f67101bc860 20241212135921.652395Z#000000#000#000000
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: slap_graduate_commit_csn: removing 0x7f67101bc860 20241212135921.652395Z#000000#000#000000
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: slap_queue_csn: queueing 0x7f67102311b0 20241212135921.652395Z#000000#000#000000
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: slap_graduate_commit_csn: removing 0x7f67102311b0 20241212135921.652395Z#000000#000#000000

Log on the master:
Dec 12 08:59:30 aaa-prod-master-1 slapd[4072]: conn=936345 op=1 syncprov_sendresp: cookie=rid=129,csn=20241212135921.637305Z#000000#000#000000
Dec 12 08:59:30 aaa-prod-master-1 slapd[4072]: conn=936345 op=1 syncprov_sendresp: cookie=rid=129,csn=20241212135921.652395Z#000000#000#000000

There is nothing in the log on the master or the replica after this entry. 
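
In the excerpts above, the last CSN committed on the replica (20241212135921.652395Z) is the same as the last cookie CSN the master sent, so the replica appears to have applied the final change it received before going quiet. A minimal sketch of that comparison follows. It assumes the log line format shown above and that CSNs are fixed-width, so a plain string comparison reflects their order. The helper names last_csn and caught_up are hypothetical and are not part of OpenLDAP.

```python
import re

# CSN layout as it appears in the logs above: timestamp#count#sid#mod
CSN_RE = re.compile(
    r"\d{14}\.\d{6}Z#[0-9a-fA-F]{6}#[0-9a-fA-F]{3}#[0-9a-fA-F]{6}"
)


def last_csn(log_lines):
    """Return the last CSN found in the given log lines, or None."""
    found = None
    for line in log_lines:
        match = CSN_RE.search(line)
        if match:
            found = match.group(0)
    return found


def caught_up(replica_lines, master_lines):
    """True if the replica's last CSN is at or past the master's last sent CSN.

    Assumption: CSNs are fixed-width strings, so comparing them as plain
    strings matches their chronological order.
    """
    replica = last_csn(replica_lines)
    master = last_csn(master_lines)
    if replica is None or master is None:
        return False
    return replica >= master


# Sample lines copied from the log excerpts in this message.
replica_log = [
    "Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: slap_queue_csn: queueing "
    "0x7f67102311b0 20241212135921.652395Z#000000#000#000000",
    "Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: slap_graduate_commit_csn: "
    "removing 0x7f67102311b0 20241212135921.652395Z#000000#000#000000",
]
master_log = [
    "Dec 12 08:59:30 aaa-prod-master-1 slapd[4072]: conn=936345 op=1 "
    "syncprov_sendresp: cookie=rid=129,"
    "csn=20241212135921.637305Z#000000#000#000000",
    "Dec 12 08:59:30 aaa-prod-master-1 slapd[4072]: conn=936345 op=1 "
    "syncprov_sendresp: cookie=rid=129,"
    "csn=20241212135921.652395Z#000000#000#000000",
]

if __name__ == "__main__":
    print("replica last CSN:", last_csn(replica_log))
    print("master last CSN: ", last_csn(master_log))
    print("replica caught up:", caught_up(replica_log, master_log))
```

This only shows that the two sides agreed at 08:59:30. It says nothing about changes made on the master afterwards, which is where the replica falls behind.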

Regards,
Suresh

On Fri, Dec 20, 2024 at 1:40 PM Quanah Gibson-Mount <quanah@fast-mail.org> wrote:


--On Friday, December 20, 2024 11:22 AM -0500 Suresh Veliveli
<Suresh.Veliveli@georgetown.edu> wrote:

> Hi,
>
> We have a single master with multiple replicas. Our backend is mdb, and
> we are on the latest version, 2.6.9.  The replication type
> is refreshAndPersist. Here is the relevant configuration. 
>
>
> syncrepl rid=141
>         provider=ldaps://ldap-master.georgetown.edu:636/
>         type=refreshAndPersist
>
>          ..
>          ..
>         keepalive=300:5:5
>         retry="5 5 300 +"
>
>
>
> This is now happening at regular intervals. When a consumer replication
> gets stuck, only a service restart seems to restart replication. 


I need more information on what you mean by "getting stuck". Is it actually
attempting to get changes? Do you have sync logging enabled? If so, you
could check whether that particular consumer is hitting an issue during
replication and returning a non-zero result code that explains why it's
unable to proceed. Or is it not establishing a replication connection at
all?

--Quanah




--
Suresh Veliveli
Sr. UNIX Systems Engineer
Georgetown University
University Information Services | Security Infrastructure and Policy-Identity and Collaboration
202-262-6676 (cell) | 202-687-3108 (work)