Full_Name: Adrien Futschik
Version: 2.4.15
OS: Linux RHEL 4-5 & Solaris 10
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (192.54.193.59)
Hello,
As suggested by Howard Chu, I am filing an ITS for a problem I encountered when testing N-way multimaster with OpenLDAP 2.4.15.
Here is the situation:
I have been testing N-way multimaster replication with OpenLDAP for a while now (from 2.4.11 to 2.4.15), and just when I thought that everything was working perfectly, I decided to test N-way multimaster not with 2 masters on different servers, but with 4 (all servers are time-synced using NTP).
There are 2 OpenLDAP instances per server.
I have configured syncprov and syncrepl accordingly:
olcServerID: 1 ldap://163.106.38.90:9011/
olcServerID: 2 ldap://163.106.38.92:9012/
olcServerID: 3 ldap://163.106.38.90:9013/
olcServerID: 4 ldap://163.106.38.92:9014/
olcSyncrepl: {0}rid=011 provider=ldap://163.106.38.90:9011/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {1}rid=012 provider=ldap://163.106.38.92:9012/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {2}rid=013 provider=ldap://163.106.38.90:9013/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {3}rid=014 provider=ldap://163.106.38.92:9014/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
I start with all instances in sync and then add entries on all four instances in parallel. When I do so, a few entries are not replicated to the other instances. I get messages like these:
do_syncrep2: cookie=rid=011,sid=002,csn=20090227130003.849482Z#000000#004#000000
do_syncrep2: rid=011 CSN too old, ignoring 20090227130003.849482Z#000000#004#000000
do_syncrep2: cookie=rid=013,sid=002,csn=20090227130003.849482Z#000000#004#000000
do_syncrep2: rid=013 CSN too old, ignoring 20090227130003.849482Z#000000#004#000000
do_syncrep2: cookie=rid=014,sid=002,csn=20090227130003.946474Z#000000#004#000000
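To make these log lines easier to read, here is a minimal Python sketch that splits a cookie line into its parts. The helper names (CSN, parse_csn, parse_cookie_line) are hypothetical, and the field layout (timestamp#count#sid#mod, with the numeric fields in hexadecimal) is an assumption based on the lines above. The third CSN field is, as far as I understand, the server ID of the master where the change originated.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CSN(NamedTuple):
    """One change sequence number, split into its four '#'-separated fields."""
    timestamp: str  # e.g. 20090227130003.849482Z
    count: int      # change counter (assumed hexadecimal)
    sid: int        # server ID of the originating master (assumed hexadecimal)
    mod: int        # modification counter (assumed hexadecimal)


def parse_csn(text: str) -> CSN:
    timestamp, count, sid, mod = text.strip().split("#")
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


# Matches the "cookie=rid=...,sid=...,csn=..." part of a do_syncrep2 line.
COOKIE_RE = re.compile(r"cookie=rid=(\d+),sid=([0-9a-fA-F]+),csn=(\S+)")


def parse_cookie_line(line: str) -> Optional[Tuple[int, int, CSN]]:
    """Return (rid, cookie sid, CSN), or None if the line carries no cookie."""
    match = COOKIE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2), 16), parse_csn(match.group(3))


line = ("do_syncrep2: cookie=rid=011,sid=002,"
        "csn=20090227130003.849482Z#000000#004#000000")
rid, cookie_sid, csn = parse_cookie_line(line)
print(rid, cookie_sid, csn.sid, csn.timestamp)
```

For the first line above this gives rid 11, cookie sid 2 and CSN sid 4, so the ignored change carries the server ID of M4.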
Has anyone faced the same issue?
Here is my configuration (I am using refreshAndPersist mode for both cn=config and olcDatabase={1}bdb):
M1 on IP1 / PORT1:
dn: cn=config
objectClass: olcGlobal
cn: config
structuralObjectClass: olcGlobal
creatorsName: cn=config
olcServerID: 1 ldap://163.106.38.90:9011/
olcServerID: 2 ldap://163.106.38.92:9012/
olcServerID: 3 ldap://163.106.38.90:9013/
olcServerID: 4 ldap://163.106.38.92:9014/
entryUUID: ef89c876-adb3-4dc7-aa7d-024bbc359c98
createTimestamp: 20090227085748Z
entryCSN: 20090227085749.920499Z#000000#004#000000
modifiersName: cn=config
modifyTimestamp: 20090227085749Z
contextCSN: 20090227085752.833630Z#000000#001#000000
dn: olcDatabase={1}bdb
objectClass: olcDatabaseConfig
objectClass: olcBdbConfig
olcDatabase: {1}bdb
olcDbDirectory: ./openldap-data
olcSuffix: c=fr
olcRootDN: cn=admin,c=fr
olcRootPW:: e1NTSEF9WVZNSHJtYTRvUGd4KzFoak9kYWhBcm5NVHJxU1Zmdno=
olcSizeLimit: 100
olcSyncrepl: {0}rid=011 provider=ldap://163.106.38.90:9011/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {1}rid=012 provider=ldap://163.106.38.92:9012/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {2}rid=013 provider=ldap://163.106.38.90:9013/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {3}rid=014 provider=ldap://163.106.38.92:9014/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcTimeLimit: 600
olcMirrorMode: TRUE
olcDbCacheSize: 2000
olcDbCheckpoint: 2000 10
olcDbIndex: default pres,eq
olcDbIndex: cn,sn pres,eq,sub
olcDbIndex: objectClass,entryCSN,entryUUID eq
structuralObjectClass: olcBdbConfig
entryUUID: 00c01e5d-69ee-4baa-8e5a-4ef609dfd958
creatorsName: cn=config
createTimestamp: 20090227085752Z
entryCSN: 20090227085752.729899Z#000000#001#000000
modifiersName: cn=config
modifyTimestamp: 20090227085752Z
M2 on IP2 / PORT2:
dn: cn=config
objectClass: olcGlobal
cn: config
structuralObjectClass: olcGlobal
entryUUID: 8da75037-65e6-4375-8c21-7e5c0194a60b
creatorsName: cn=config
createTimestamp: 20090227085723Z
olcServerID: 1 ldap://163.106.38.90:9011/
olcServerID: 2 ldap://163.106.38.92:9012/
olcServerID: 3 ldap://163.106.38.90:9013/
olcServerID: 4 ldap://163.106.38.92:9014/
entryCSN: 20090227085725.003182Z#000000#002#000000
modifiersName: cn=config
modifyTimestamp: 20090227085725Z
contextCSN: 20090227085752.833630Z#000000#001#000000
dn: olcDatabase={1}bdb
objectClass: olcDatabaseConfig
objectClass: olcBdbConfig
olcDatabase: {1}bdb
olcDbDirectory: ./openldap-data
olcSuffix: c=fr
olcRootDN: cn=admin,c=fr
olcRootPW:: e1NTSEF9WVZNSHJtYTRvUGd4KzFoak9kYWhBcm5NVHJxU1Zmdno=
olcSizeLimit: 100
olcSyncrepl: {0}rid=011 provider=ldap://163.106.38.90:9011/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {1}rid=012 provider=ldap://163.106.38.92:9012/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {2}rid=013 provider=ldap://163.106.38.90:9013/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {3}rid=014 provider=ldap://163.106.38.92:9014/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcTimeLimit: 600
olcMirrorMode: TRUE
olcDbCacheSize: 2000
olcDbCheckpoint: 2000 10
olcDbIndex: default pres,eq
olcDbIndex: cn,sn pres,eq,sub
olcDbIndex: objectClass,entryCSN,entryUUID eq
structuralObjectClass: olcBdbConfig
entryUUID: 00c01e5d-69ee-4baa-8e5a-4ef609dfd958
creatorsName: cn=config
createTimestamp: 20090227085752Z
entryCSN: 20090227085752.729899Z#000000#001#000000
modifiersName: cn=config
modifyTimestamp: 20090227085752Z
M3 on IP1 / PORT3:
dn: cn=config
objectClass: olcGlobal
cn: config
structuralObjectClass: olcGlobal
entryUUID: cf068647-318f-4848-9c72-9c7745a8a4b3
creatorsName: cn=config
createTimestamp: 20090227085742Z
olcServerID: 1 ldap://163.106.38.90:9011/
olcServerID: 2 ldap://163.106.38.92:9012/
olcServerID: 3 ldap://163.106.38.90:9013/
olcServerID: 4 ldap://163.106.38.92:9014/
entryCSN: 20090227085743.825685Z#000000#003#000000
modifiersName: cn=config
modifyTimestamp: 20090227085743Z
contextCSN: 20090227085752.833630Z#000000#001#000000
dn: olcDatabase={1}bdb
objectClass: olcDatabaseConfig
objectClass: olcBdbConfig
olcDatabase: {1}bdb
olcDbDirectory: ./openldap-data
olcSuffix: c=fr
olcRootDN: cn=admin,c=fr
olcRootPW:: e1NTSEF9WVZNSHJtYTRvUGd4KzFoak9kYWhBcm5NVHJxU1Zmdno=
olcSizeLimit: 100
olcSyncrepl: {0}rid=011 provider=ldap://163.106.38.90:9011/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {1}rid=012 provider=ldap://163.106.38.92:9012/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {2}rid=013 provider=ldap://163.106.38.90:9013/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {3}rid=014 provider=ldap://163.106.38.92:9014/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcTimeLimit: 600
olcMirrorMode: TRUE
olcDbCacheSize: 2000
olcDbCheckpoint: 2000 10
olcDbIndex: default pres,eq
olcDbIndex: cn,sn pres,eq,sub
olcDbIndex: objectClass,entryCSN,entryUUID eq
structuralObjectClass: olcBdbConfig
entryUUID: 00c01e5d-69ee-4baa-8e5a-4ef609dfd958
creatorsName: cn=config
createTimestamp: 20090227085752Z
entryCSN: 20090227085752.729899Z#000000#001#000000
modifiersName: cn=config
modifyTimestamp: 20090227085752Z
M4 on IP2 / PORT4:
dn: cn=config
objectClass: olcGlobal
cn: config
structuralObjectClass: olcGlobal
entryUUID: ef89c876-adb3-4dc7-aa7d-024bbc359c98
creatorsName: cn=config
createTimestamp: 20090227085748Z
olcServerID: 1 ldap://163.106.38.90:9011/
olcServerID: 2 ldap://163.106.38.92:9012/
olcServerID: 3 ldap://163.106.38.90:9013/
olcServerID: 4 ldap://163.106.38.92:9014/
entryCSN: 20090227085749.920499Z#000000#004#000000
modifiersName: cn=config
modifyTimestamp: 20090227085749Z
contextCSN: 20090227085752.833630Z#000000#001#000000
dn: olcDatabase={1}bdb
objectClass: olcDatabaseConfig
objectClass: olcBdbConfig
olcDatabase: {1}bdb
olcDbDirectory: ./openldap-data
olcSuffix: c=fr
olcRootDN: cn=admin,c=fr
olcRootPW:: e1NTSEF9WVZNSHJtYTRvUGd4KzFoak9kYWhBcm5NVHJxU1Zmdno=
olcSizeLimit: 100
olcSyncrepl: {0}rid=011 provider=ldap://163.106.38.90:9011/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {1}rid=012 provider=ldap://163.106.38.92:9012/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {2}rid=013 provider=ldap://163.106.38.90:9013/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcSyncrepl: {3}rid=014 provider=ldap://163.106.38.92:9014/ binddn="cn=admin,c=fr" bindmethod=simple credentials=secret searchbase="c=fr" type=refreshAndPersist retry="5 5 300 12 3600 +" timeout=3
olcTimeLimit: 600
olcMirrorMode: TRUE
olcDbCacheSize: 2000
olcDbCheckpoint: 2000 10
olcDbIndex: default pres,eq
olcDbIndex: cn,sn pres,eq,sub
olcDbIndex: objectClass,entryCSN,entryUUID eq
structuralObjectClass: olcBdbConfig
entryUUID: 00c01e5d-69ee-4baa-8e5a-4ef609dfd958
creatorsName: cn=config
createTimestamp: 20090227085752Z
entryCSN: 20090227085752.729899Z#000000#001#000000
modifiersName: cn=config
modifyTimestamp: 20090227085752Z
M1 and M3 run on the same server and therefore share exactly the same clock, as do M2 and M4. If this were a time-related problem, I shouldn't get any "CSN too old" messages between M1 and M3, or between M2 and M4, should I?
I have also noticed that when M1 gets a new entry and passes it to M2, M3 and M4, each of them, on receiving it, also passes it on to the other masters! I don't understand why this happens, but it looks very much like this is what is going on, because sometimes M2 has already passed the entry to M4 before M4 has actually received the add from M1.
As a result, I have noticed that entries sent from M1 are sometimes received in the wrong order by the other masters, and therefore some entries may be skipped!
Here is an example: I add cn=M1client1 and cn=M1client2 on M1.
M1client1 and M1client2 are successfully replicated to M2 and M4, but on M3 only M1client2 is inserted, and I get a "CSN too old" message for M1client1 on M3.
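The effect in this example can be reproduced with a small toy model in Python. This is only a sketch of my understanding, not the actual slapd code: it assumes that a consumer remembers the newest CSN it has accepted per originating server ID and ignores anything that is not newer. The Consumer class and the two CSN values are hypothetical.

```python
def sid_of(csn: str) -> str:
    """Return the server ID field (third '#'-separated field) of a CSN."""
    return csn.split("#")[2]


class Consumer:
    """Toy model of one master receiving replicated adds (an assumption,
    not the real syncrepl logic)."""

    def __init__(self) -> None:
        self.newest = {}   # server ID -> newest CSN accepted so far
        self.entries = []  # DNs that were actually added

    def receive(self, dn: str, csn: str) -> bool:
        sid = sid_of(csn)
        seen = self.newest.get(sid)
        # CSNs are fixed-width strings, so plain string comparison
        # orders them chronologically.
        if seen is not None and csn <= seen:
            print(f"CSN too old, ignoring {csn} ({dn})")
            return False
        self.newest[sid] = csn
        self.entries.append(dn)
        return True


# Hypothetical CSNs for two adds made on M1 (server ID 001).
csn1 = "20090227130003.849482Z#000000#001#000000"  # cn=M1client1
csn2 = "20090227130003.946474Z#000000#001#000000"  # cn=M1client2

m2 = Consumer()  # receives the adds in the order they were made
m2.receive("cn=M1client1", csn1)
m2.receive("cn=M1client2", csn2)

m3 = Consumer()  # receives the second add first, e.g. relayed by another master
m3.receive("cn=M1client2", csn2)
m3.receive("cn=M1client1", csn1)  # older CSN from the same server: skipped

print(m2.entries)
print(m3.entries)
```

In this model m2 ends up with both entries while m3 keeps only cn=M1client2, which matches what I observe on M3.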
I guess that M2 or M4 are not processing their queues in the right order. I also don't understand why M2, M3 and M4 should propagate an entry sent by M1 at all, because each of them will eventually receive that entry from M1 directly.
Adrien Futschik