New subject: OpenLDAP syncrepl message: AttributeDescription contains inappropriate characters

13 Dec 2009


      openldap-2.3.41
db-4.2.52.NC-PLUS_5_PATCHES
SunOS ldapmaster01.unix 5.10 Generic_127128-11
Our LDAP setup has been running rather well for the last 2 years or so, but 
increasingly we have had to restart slapd more and more frequently. Occasional 
core-dumps and "very large" process footprints as well.
I suspect that it is perhaps stuck on replication, this message show up on the 
clients fairly frequently:
Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 190661 local4.debug] <= 
entry_decode: slap_str2undef_ad( 20091027142315Z#000001#00#000000): 
AttributeDescription contains inappropriate characters
Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 818565 local4.debug] 
null_callback: error code 0x50
Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 776556 local4.debug] 
syncrepl_entry: rid 405 be_add failed (80)
Dec 14 11:30:49 forward01.unix slapd[10494]: [ID 747041 local4.debug] 
do_syncrepl: rid 405 retrying (9 retries left)
Which makes me think that it has been stuck since 20091027. Is there a way for 
me to find out which entry 20091027142315Z#000001#00#000000 refers to, so I can 
just delete it? (or fix it).
The pertinent lines in ldapmaster are:
lastmod         on
checkpoint 128 15
directory       /usr/local/var/openldap-data
index   objectClass     eq
index   uid                     eq
index   uidNumber               eq
index   mail                    eq
index   mailAlternateAddress    pres,eq
index   deliveryMode            eq
index   accountStatus           eq
index   gecos                   eq
index   radiusGroupName         eq
index   o                       pres,eq
index   entryCSN,entryUUID      eq
index   gidNumber               eq
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100
And each slave has identical conf, except for the RID which is based on the last 
octet of the IP address:
lastmod         on
checkpoint 128 15
directory       /usr/local/var/openldap-data
index   objectClass     eq
index   uid                     eq
index   uidNumber               eq
index   mail                    eq
index   mailAlternateAddress    pres,eq
index   deliveryMode            eq
index   accountStatus           eq
index   gecos                   eq
index   radiusGroupName         eq
index   o                       pres,eq
index   entryCSN,entryUUID      eq
index   gidNumber               eq
syncrepl   rid=405
                 provider=ldap://172.20.12.113
                 type=refreshAndPersist
                 interval=00:00:00:30
                 searchbase="ou=mail,dc=gmo,dc=jp"
                 filter="(objectClass=*)"
                 attrs="*"
                 scope=sub
                 schemachecking=off
                 updatedn="cn=admin,dc=gmo,dc=jp"
                 bindmethod=simple
                 binddn="cn=admin,dc=gmo,dc=jp"
                 credentials="*censored*"
                 retry="60 10 300 +"
updateref       ldap://172.20.12.113
In general, do they seem reasonable? Everything has been appearing to be 
correct, except for the 20091027 entry. What is the recommended procedure? 
Reading the documentation makes me think I should also index "ContextCSN"?
A recent ldapmaster core was triggered here:
#0  0xce7948da in t_delete () from /lib/libc.so.1
#1  0xce79417f in _malloc_unlocked () from /lib/libc.so.1
#2  0xce794058 in malloc () from /lib/libc.so.1
#3  0x08133175 in ber_memalloc_x ()
#4  0x08133487 in ber_dupbv_x ()
#5  0x0813352b in ber_dupbv ()
#6  0x0807c22e in attr_dup ()
... which I believe is replication-related due to a post made earlier in the 
list. Most likely also fixed in later versions.
Thanks for any reply,
Jorgen Lundman
-- 
Jorgen Lundman       | lundman@lundman.net
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)