This is another instance where replication stops.
Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241223060050.594465Z#000000#000#000000 tid 0x7f5aed0fb640 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_search (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 uid=xxxx,ou=People,dc=georgetown,dc=edu Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_modify uid=xxxxx,ou=People,dc=georgetown,dc=edu (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_message_to_entry: rid=143 DN: uid=xxxxx,ou=People,dc=georgetown,dc=edu, UUID: db41232e-527d-103f-995c-4344349468b5 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241223060104.603554Z#000000#000#000000 tid 0x7f5aed0fb640 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_search (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 uid=yyyyy,ou=People,dc=georgetown,dc=edu Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_modify uid=yyyy,ou=People,dc=georgetown,dc=edu (0)
No "syncrep_" messages were logged after the above.
Regards, Suresh
On Mon, Dec 23, 2024 at 2:42 PM Quanah Gibson-Mount quanah@fast-mail.org wrote:
--On Friday, December 20, 2024 5:46 PM -0500 Suresh Veliveli Suresh.Veliveli@georgetown.edu wrote:
I do have these but no syncrepl_sync.
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: syncrepl_message_to_entry: rid=129 DN: uid=xxxx,ou=xxxx,dc=georgetown,dc=edu, UUID: 37387976-3ae0-103f-94fc-3b4be3361dc7 Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: syncrepl_entry: rid=129 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241212135921.652395Z#000000#000#000000 tid 0x7f671b7fe640
Yes sorry, that was a typo on my end.
"syncrepl_" sync messages. Specifically you want to see the ones that have "be_" in them, but not "be_search". I.e., be_add, be_modify, etc. They show the result of the operation, something like:
syncrepl_entry: rid=100 be_modify <DN> (<result code>)
--Quanah
Any thoughts?
Thanks, suresh
On Tue, Dec 24, 2024 at 10:41 AM Suresh Veliveli < Suresh.Veliveli@georgetown.edu> wrote:
This is another instance where replication stops.
Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241223060050.594465Z#000000#000#000000 tid 0x7f5aed0fb640 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_search (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 uid=xxxx,ou=People,dc=georgetown,dc=edu Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_modify uid=xxxxx,ou=People,dc=georgetown,dc=edu (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_message_to_entry: rid=143 DN: uid=xxxxx,ou=People,dc=georgetown,dc=edu, UUID: db41232e-527d-103f-995c-4344349468b5 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241223060104.603554Z#000000#000#000000 tid 0x7f5aed0fb640 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_search (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 uid=yyyyy,ou=People,dc=georgetown,dc=edu Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_modify uid=yyyy,ou=People,dc=georgetown,dc=edu (0)
No "syncrep_" messages were logged after the above.
Regards, Suresh
On Mon, Dec 23, 2024 at 2:42 PM Quanah Gibson-Mount quanah@fast-mail.org wrote:
--On Friday, December 20, 2024 5:46 PM -0500 Suresh Veliveli Suresh.Veliveli@georgetown.edu wrote:
I do have these but no syncrepl_sync.
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: syncrepl_message_to_entry: rid=129 DN: uid=xxxx,ou=xxxx,dc=georgetown,dc=edu, UUID: 37387976-3ae0-103f-94fc-3b4be3361dc7 Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: syncrepl_entry: rid=129 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241212135921.652395Z#000000#000#000000 tid 0x7f671b7fe640
Yes sorry, that was a typo on my end.
"syncrepl_" sync messages. Specifically you want to see the ones that have "be_" in them, but not "be_search". I.e., be_add, be_modify, etc. They show the result of the operation, something like:
syncrepl_entry: rid=100 be_modify <DN> (<result code>)
--Quanah
-- Suresh Veliveli Sr. UNIX Systems Engineer Georgetown University University Information Services | Security Infrastructure and Policy-Identity and Collaboration 202-262-6676 (cell) | 202-687-3108 (work)
Another instance where replication is stuck and not recovering.
# requesting: contextCSN contextCSN: *20241229135907.725117Z#000000#000#000000* aaa-prod-aws-10:2636 # requesting: contextCSN contextCSN:* 20241228185913.665451Z#000000#000#000000*
*Log info:* Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: do_syncrep2: rid=650 cookie=rid=650,csn=20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: syncrepl_entry: rid=650 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241228185913.665451Z#000000#000#000000 tid 0x7f26ee5fd640 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn: removing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn: removing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
Nothing gets logged about replication after the above.
Am I missing something?
Regards, Suresh
On Sat, Dec 28, 2024 at 10:05 AM Suresh Veliveli < Suresh.Veliveli@georgetown.edu> wrote:
Any thoughts?
Thanks, suresh
On Tue, Dec 24, 2024 at 10:41 AM Suresh Veliveli < Suresh.Veliveli@georgetown.edu> wrote:
This is another instance where replication stops.
Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241223060050.594465Z#000000#000#000000 tid 0x7f5aed0fb640 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_search (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 uid=xxxx,ou=People,dc=georgetown,dc=edu Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_modify uid=xxxxx,ou=People,dc=georgetown,dc=edu (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_message_to_entry: rid=143 DN: uid=xxxxx,ou=People,dc=georgetown,dc=edu, UUID: db41232e-527d-103f-995c-4344349468b5 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241223060104.603554Z#000000#000#000000 tid 0x7f5aed0fb640 Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_search (0) Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 uid=yyyyy,ou=People,dc=georgetown,dc=edu Dec 23 01:01:23 aaa-prod-aws-3 slapd[1182444]: syncrepl_entry: rid=143 be_modify uid=yyyy,ou=People,dc=georgetown,dc=edu (0)
No "syncrep_" messages were logged after the above.
Regards, Suresh
On Mon, Dec 23, 2024 at 2:42 PM Quanah Gibson-Mount quanah@fast-mail.org wrote:
--On Friday, December 20, 2024 5:46 PM -0500 Suresh Veliveli Suresh.Veliveli@georgetown.edu wrote:
I do have these but no syncrepl_sync.
Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: syncrepl_message_to_entry: rid=129 DN: uid=xxxx,ou=xxxx,dc=georgetown,dc=edu, UUID: 37387976-3ae0-103f-94fc-3b4be3361dc7 Dec 12 08:59:30 aaa-prod-gcp-9 slapd[4165]: syncrepl_entry: rid=129 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241212135921.652395Z#000000#000#000000 tid 0x7f671b7fe640
Yes sorry, that was a typo on my end.
"syncrepl_" sync messages. Specifically you want to see the ones that have "be_" in them, but not "be_search". I.e., be_add, be_modify, etc. They show the result of the operation, something like:
syncrepl_entry: rid=100 be_modify <DN> (<result code>)
--Quanah
-- Suresh Veliveli Sr. UNIX Systems Engineer Georgetown University University Information Services | Security Infrastructure and Policy-Identity and Collaboration 202-262-6676 (cell) | 202-687-3108 (work)
On Sun, Dec 29, 2024 at 01:28:42PM -0500, Suresh Veliveli wrote:
Another instance where replication is stuck and not recovering.
# requesting: contextCSN contextCSN: *20241229135907.725117Z#000000#000#000000* aaa-prod-aws-10:2636 # requesting: contextCSN contextCSN:* 20241228185913.665451Z#000000#000#000000*
*Log info:* Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: do_syncrep2: rid=650 cookie=rid=650,csn=20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: syncrepl_entry: rid=650 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241228185913.665451Z#000000#000#000000 tid 0x7f26ee5fd640 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn: removing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn: removing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
Nothing gets logged about replication after the above.
Am I missing something?
Hi Suresh, anything in the provider logs around that time? All consumers messages will be tagged with a specific "conn=xxx op=yyy" which you can discover e.g. by looking for the cookie it sends at the beginning of the session.
Couple of other questions: - is the TCP connection alive as far as the OS is concerned (I see in the thread you've confirmed TCP keepalive is enabled, correct?) - could you post the cn=monitor info for the consumer? The objectclass to look for is olmSyncReplInstance
Thanks,
Another instance: Yes, TCP keepalive is enabled.
aaa-prod-aws-7:1636 # requesting: contextCSN *contextCSN: 20250101065905.147164Z#000000#000#000000*
aaa-prod-aws-7:2636 # requesting: contextCSN contextCSN: 20250102140005.217756Z#000000#000#000000
dn: cn=Consumer 147,cn=Database 1,cn=Databases,cn=Monitor objectClass: olmSyncReplInstance cn: Consumer 147
*Consumer logs:*
Jan 1 01:59:12 aaa-prod-aws-7 slapd[1161089]: do_syncrep2: rid=147 cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000 Jan 1 01:59:12 aaa-prod-aws-7 slapd[1161089]: syncrepl_entry: rid=147 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20250101065905.147164Z#000000#000#000000 tid 0x7f8e4a5fd640 Jan 1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_queue_csn: queueing 0x7f8e40152360 20250101065905.147164Z#000000#000#000000 Jan 1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_graduate_commit_csn: removing 0x7f8e40152360 20250101065905.147164Z#000000#000#000000 Jan 1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_queue_csn: queueing 0x7f8e4036c6c0 20250101065905.147164Z#000000#000#000000 Jan 1 01:59:12 aaa-prod-aws-7 slapd[1161089]: slap_graduate_commit_csn: removing 0x7f8e4036c6c0 20250101065905.147164Z#000000#000#000000
(Nothing after the above is logged regarding replication)
*Master:*
Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.124585Z#000000#000#000000 Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000
(Nothing after the above for rid=147)
*After restarting Consumer:*
Jan 2 09:29:54 aaa-prod-aws-7 slapd[2750000]: slap_get_csn: conn=-1 op=0 generated new csn=20250102142954.929120Z#000000#000#000000 manage= 0 Jan 2 09:29:55 aaa-prod-aws-7 slapd[2750000]: do_syncrep1: rid=147 starting refresh (sending cookie=rid=147,csn=20250101065905.147164Z#0000 00#000#000000) Jan 2 09:29:59 aaa-prod-aws-7 slapd[2750000]: syncrepl_entry: rid=147 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) csn=(none) tid 0x7f3288dfd640 Jan 2 09:29:59 aaa-prod-aws-7 slapd[2750000]: syncrepl_entry: rid=147 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) csn=(none) tid 0x7f3288dfd640
Thanks, Suresh
On Thu, Jan 2, 2025 at 6:34 AM Ondřej Kuzník ondra@mistotebe.net wrote:
On Sun, Dec 29, 2024 at 01:28:42PM -0500, Suresh Veliveli wrote:
Another instance where replication is stuck and not recovering.
# requesting: contextCSN contextCSN: *20241229135907.725117Z#000000#000#000000* aaa-prod-aws-10:2636 # requesting: contextCSN contextCSN:* 20241228185913.665451Z#000000#000#000000*
*Log info:* Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: do_syncrep2: rid=650 cookie=rid=650,csn=20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: syncrepl_entry: rid=650 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20241228185913.665451Z#000000#000#000000 tid 0x7f26ee5fd640 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn: removing 0x7f26e0dcee50 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_queue_csn: queueing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000 Dec 28 13:59:21 aaa-prod-aws-10 slapd[1161864]: slap_graduate_commit_csn: removing 0x7f26e0f34360 20241228185913.665451Z#000000#000#000000
Nothing gets logged about replication after the above.
Am I missing something?
Hi Suresh, anything in the provider logs around that time? All consumers messages will be tagged with a specific "conn=xxx op=yyy" which you can discover e.g. by looking for the cookie it sends at the beginning of the session.
Couple of other questions:
- is the TCP connection alive as far as the OS is concerned (I see in the thread you've confirmed TCP keepalive is enabled, correct?)
- could you post the cn=monitor info for the consumer? The objectclass to look for is olmSyncReplInstance
Thanks,
-- Ondřej Kuzník Senior Software Engineer Symas Corporation http://www.symas.com Packaged, certified, and supported LDAP solutions powered by OpenLDAP
On Thu, Jan 02, 2025 at 09:39:34AM -0500, Suresh Veliveli wrote:
Another instance: Yes, TCP keepalive is enabled.
So is the TCP connection still open from the point of both servers? See in netstat or ss.
aaa-prod-aws-7:1636 # requesting: contextCSN *contextCSN: 20250101065905.147164Z#000000#000#000000*
aaa-prod-aws-7:2636 # requesting: contextCSN contextCSN: 20250102140005.217756Z#000000#000#000000
dn: cn=Consumer 147,cn=Database 1,cn=Databases,cn=Monitor objectClass: olmSyncReplInstance cn: Consumer 147
All the data in cn=monitor is contained in the operational attributes, as such, you'll have to request them either by name specifically, objectClass ('@olmSyncReplInstance') or blanket '+', maybe also '*' if you want regular attributes as well.
*Consumer logs:*
[...]
(Nothing after the above is logged regarding replication)
*Master:*
Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.124585Z#000000#000#000000 Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000 (Nothing after the above for rid=147)
This gives you the string to search for: searching for "conn=1034 op=1" here would give you the messages related to the replication session above. You'll see what happens on the provider and correlate that with what the consumer. For every new consumer session there will be a new "conn=xxx op=yyy" to search for.
Regards,
This is another instance where the replication stops.
aaa-prod-aws-12:1636 # requesting: contextCSN contextCSN: *20250102015911.702871Z#000000#000#000000*
All the relevant logs and info:
dn: cn=Consumer 152,cn=Database 1,cn=Databases,cn=Monitor structuralObjectClass: olmSyncReplInstance creatorsName: modifiersName: createTimestamp: 20241209130653Z modifyTimestamp: 20241209130653Z olmSRProviderURIList: ldaps://aaa-master-1.uis.georgetown.edu:636/ olmSRConnection: IP=172.20.86.12:49880 olmSRSyncPhase: Persist olmSRNextConnect: 00000101000000Z olmSRLastConnect: 20241229203510Z olmSRLastContact: 20250102015934Z olmSRLastCookieRcvd: rid=152,csn=*20250102015911.702871Z#000000#000#000000* olmSRLastCookieSent: rid=152,csn=20241229202835.459483Z#000000#000#000000 entryDN: cn=Consumer 152,cn=Database 1,cn=Databases,cn=Monitor subschemaSubentry: cn=Subschema hasSubordinates: FALSE
*Consumer:* netstat -an | grep 49880 tcp 0 0 172.20.86.12:49880 172.17.21.52:636 ESTABLISHED
*Master:* netstat -an | grep 172.20.86.12 tcp 0 0 172.17.21.52:636 172.20.86.12:49880 ESTABLISHED
*Master logs:* Jan 1 20:59:18 aaa-prod-master-1 slapd[3281130]: conn=1035 op=1 syncprov_sendresp: cookie=rid=152,csn=20250102015911.686467Z#000000#000#000000 Jan 1 20:59:18 aaa-prod-master-1 slapd[3281130]: conn=1035 op=1 syncprov_sendresp: cookie=rid=152,csn=20250102015911.702871Z#000000#000#000000
Nothing about rid=152 is logged after the above
*Consumer logs:* Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: do_syncrep2: rid=152 cookie=rid=152,csn=20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: syncrepl_entry: rid=152 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20250102015911.702871Z#000000#000#000000 tid 0x7f7a753fc640 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_queue_csn: queueing 0x7f7a687c6190 20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_graduate_commit_csn: removing 0x7f7a687c6190 20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_queue_csn: queueing 0x7f7a6877d9b0 20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_graduate_commit_csn: removing 0x7f7a6877d9b0 20250102015911.702871Z#000000#000#000000
Nothing about replication is logged after the above.
Thanks, Suresh
On Thu, Jan 2, 2025 at 10:08 AM Ondřej Kuzník ondra@mistotebe.net wrote:
On Thu, Jan 02, 2025 at 09:39:34AM -0500, Suresh Veliveli wrote:
Another instance: Yes, TCP keepalive is enabled.
So is the TCP connection still open from the point of both servers? See in netstat or ss.
aaa-prod-aws-7:1636 # requesting: contextCSN *contextCSN: 20250101065905.147164Z#000000#000#000000*
aaa-prod-aws-7:2636 # requesting: contextCSN contextCSN: 20250102140005.217756Z#000000#000#000000
dn: cn=Consumer 147,cn=Database 1,cn=Databases,cn=Monitor objectClass: olmSyncReplInstance cn: Consumer 147
All the data in cn=monitor is contained in the operational attributes, as such, you'll have to request them either by name specifically, objectClass ('@olmSyncReplInstance') or blanket '+', maybe also '*' if you want regular attributes as well.
*Consumer logs:*
[...]
(Nothing after the above is logged regarding replication)
*Master:*
Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.124585Z#000000#000#000000 Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000 (Nothing after the above for rid=147)
This gives you the string to search for: searching for "conn=1034 op=1" here would give you the messages related to the replication session above. You'll see what happens on the provider and correlate that with what the consumer. For every new consumer session there will be a new "conn=xxx op=yyy" to search for.
Regards,
-- Ondřej Kuzník Senior Software Engineer Symas Corporation http://www.symas.com Packaged, certified, and supported LDAP solutions powered by OpenLDAP
The OS is Rock9, AWS EC2 instance.
On Thu, Jan 2, 2025 at 10:32 PM Suresh Veliveli < Suresh.Veliveli@georgetown.edu> wrote:
This is another instance where the replication stops.
aaa-prod-aws-12:1636 # requesting: contextCSN contextCSN: *20250102015911.702871Z#000000#000#000000*
All the relevant logs and info:
dn: cn=Consumer 152,cn=Database 1,cn=Databases,cn=Monitor structuralObjectClass: olmSyncReplInstance creatorsName: modifiersName: createTimestamp: 20241209130653Z modifyTimestamp: 20241209130653Z olmSRProviderURIList: ldaps://aaa-master-1.uis.georgetown.edu:636/ olmSRConnection: IP=172.20.86.12:49880 olmSRSyncPhase: Persist olmSRNextConnect: 00000101000000Z olmSRLastConnect: 20241229203510Z olmSRLastContact: 20250102015934Z olmSRLastCookieRcvd: rid=152,csn= *20250102015911.702871Z#000000#000#000000* olmSRLastCookieSent: rid=152,csn=20241229202835.459483Z#000000#000#000000 entryDN: cn=Consumer 152,cn=Database 1,cn=Databases,cn=Monitor subschemaSubentry: cn=Subschema hasSubordinates: FALSE
*Consumer:* netstat -an | grep 49880 tcp 0 0 172.20.86.12:49880 172.17.21.52:636 ESTABLISHED
*Master:* netstat -an | grep 172.20.86.12 tcp 0 0 172.17.21.52:636 172.20.86.12:49880 ESTABLISHED
*Master logs:* Jan 1 20:59:18 aaa-prod-master-1 slapd[3281130]: conn=1035 op=1 syncprov_sendresp: cookie=rid=152,csn=20250102015911.686467Z#000000#000#000000 Jan 1 20:59:18 aaa-prod-master-1 slapd[3281130]: conn=1035 op=1 syncprov_sendresp: cookie=rid=152,csn=20250102015911.702871Z#000000#000#000000
Nothing about rid=152 is logged after the above
*Consumer logs:* Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: do_syncrep2: rid=152 cookie=rid=152,csn=20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: syncrepl_entry: rid=152 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_MODIFY) csn=20250102015911.702871Z#000000#000#000000 tid 0x7f7a753fc640 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_queue_csn: queueing 0x7f7a687c6190 20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_graduate_commit_csn: removing 0x7f7a687c6190 20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_queue_csn: queueing 0x7f7a6877d9b0 20250102015911.702871Z#000000#000#000000 Jan 1 20:59:34 aaa-prod-aws-12 slapd[1229307]: slap_graduate_commit_csn: removing 0x7f7a6877d9b0 20250102015911.702871Z#000000#000#000000
Nothing about replication is logged after the above.
Thanks, Suresh
On Thu, Jan 2, 2025 at 10:08 AM Ondřej Kuzník ondra@mistotebe.net wrote:
On Thu, Jan 02, 2025 at 09:39:34AM -0500, Suresh Veliveli wrote:
Another instance: Yes, TCP keepalive is enabled.
So is the TCP connection still open from the point of both servers? See in netstat or ss.
aaa-prod-aws-7:1636 # requesting: contextCSN *contextCSN: 20250101065905.147164Z#000000#000#000000*
aaa-prod-aws-7:2636 # requesting: contextCSN contextCSN: 20250102140005.217756Z#000000#000#000000
dn: cn=Consumer 147,cn=Database 1,cn=Databases,cn=Monitor objectClass: olmSyncReplInstance cn: Consumer 147
All the data in cn=monitor is contained in the operational attributes, as such, you'll have to request them either by name specifically, objectClass ('@olmSyncReplInstance') or blanket '+', maybe also '*' if you want regular attributes as well.
*Consumer logs:*
[...]
(Nothing after the above is logged regarding replication)
*Master:*
Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.124585Z#000000#000#000000 Jan 1 01:59:05 aaa-prod-master-1 slapd[3281130]: conn=1034 op=1 syncprov_sendresp: cookie=rid=147,csn=20250101065905.147164Z#000000#000#000000 (Nothing after the above for rid=147)
This gives you the string to search for: searching for "conn=1034 op=1" here would give you the messages related to the replication session above. You'll see what happens on the provider and correlate that with what the consumer. For every new consumer session there will be a new "conn=xxx op=yyy" to search for.
Regards,
-- Ondřej Kuzník Senior Software Engineer Symas Corporation http://www.symas.com Packaged, certified, and supported LDAP solutions powered by OpenLDAP
-- Suresh Veliveli Sr. UNIX Systems Engineer Georgetown University University Information Services | Security Infrastructure and Policy-Identity and Collaboration 202-262-6676 (cell) | 202-687-3108 (work)
--On Friday, January 3, 2025 7:52 AM -0500 Suresh Veliveli Suresh.Veliveli@georgetown.edu wrote:
The OS is Rock9, AWS EC2 instance.
Question - Was this working on an earlier OpenLDAP version, and just started having issues with 2.6.9? I was re-reading your initial post and that wasn't clear.
Thanks!
--Quanah
openldap-technical@openldap.org