Hello dear List,
I tried to import a slapcat backup from our production machine into a test environment and got the following message:

    debld02:~ # time slapadd -w -q -f /etc/openldap/slapd.conf -l /backup.ldif
    50f98421 mdb_monitor_db_open: monitoring disabled; configure monitor database to enable
    -#################### 100.00% eta none elapsed 09m18s spd 4.6 M/s
    Closing DB...Error, entries missing!
    entry 1156449: ou=a,ou=b,ou=c,ou=root

At first I did not notice this message, but now I see that the database is broken, because the node "ou=a" is missing. So my questions:
- What is the origin of such orphaned nodes? (In MMR it happens, and I see a few glue records, but in my backup this one node is completely missing.)
- How can I prevent such entries, and how can I recognize them without importing?
- How can I remove this entry (especially in the production DB, without downtime)? A delete shows the following message:

    ~ # ldapdelete -x -h localhost -w password -D cn=admin,ou=root 'cn=cname,ou=a,ou=b,ou=c,ou=root'
    ldap_delete: Other (e.g., implementation specific) error (80)
    additional info: could not locate parent of entry

and if I try to add the missing node, I get:

    ldapadd -x -h localhost -w password -D"cn=admin,ou=root" -f test.ldif
    adding new entry ou=a,ou=b,ou=c,ou=root
    ldap_add: Already exists (68)

Thanks for help
Meike
Meike Stone wrote:
> Hello dear List,
> I tried to import a slapcat backup from our production machine into a test environment and got the following message:
>
>     debld02:~ # time slapadd -w -q -f /etc/openldap/slapd.conf -l /backup.ldif
>     50f98421 mdb_monitor_db_open: monitoring disabled; configure monitor database to enable
>     -#################### 100.00% eta none elapsed 09m18s spd 4.6 M/s
>     Closing DB...Error, entries missing!
>     entry 1156449: ou=a,ou=b,ou=c,ou=root
>
> At first I did not notice this message, but now I see that the database is broken, because the node "ou=a" is missing. So my questions:
> - What is the origin of such orphaned nodes? (In MMR it happens, and I see a few glue records, but in my backup this one node is completely missing.)
> - How can I prevent such entries, and how can I recognize them without importing?

It's easiest just to let slapadd tell you.

> - How can I remove this entry (especially in the production DB, without downtime)? A delete shows the following message:
>
>     ~ # ldapdelete -x -h localhost -w password -D cn=admin,ou=root 'cn=cname,ou=a,ou=b,ou=c,ou=root'
>     ldap_delete: Other (e.g., implementation specific) error (80)
>     additional info: could not locate parent of entry
>
> and if I try to add the missing node, I get:
>
>     ldapadd -x -h localhost -w password -D"cn=admin,ou=root" -f test.ldif
>     adding new entry ou=a,ou=b,ou=c,ou=root
>     ldap_add: Already exists (68)

Use slapadd to add the missing entry. For back-mdb you don't need to stop slapd while running other slap* tools.

>> - How can I prevent such entries, and how can I recognize them without importing?
> It's easiest just to let slapadd tell you.

So, if I understand correctly, I should make a dry run (slapadd -u) to test the backup?
I tried this but got no error; slapadd throws the error only when I do a real import.
(I tested the following imports with a BDB configuration.)

Real import:
------------

    debld02:~ # slapadd -f /etc/openldap/slapd.conf -q -l /backup.ldif
    _#################### 100.00% eta none elapsed 19m11s spd 2.2 M/s
    Closing DB...Error, entries missing!
    entry 1156449: ou=a,ou=b,ou=c,ou=root
    debld02:~ # echo $?
    1

Quick dry-run import:
---------------------

    debld02:~ # slapadd -f /etc/openldap/slapd.conf -u -q -l /backup.ldif
    .#################### 100.00% eta none elapsed 03m56s spd 10.8 M/s
    debld02:~ # echo $?
    0

Dry-run import without the quick option:
----------------------------------------

    debld02:~ # slapadd -f /etc/openldap/slapd.conf -u -l /backup.ldif
    .#################### 100.00% eta none elapsed 04m04s spd 10.4 M/s
    debld02:~ # echo $?
    0

So I get no error from a dry-run import even though the backup is damaged. Am I misunderstanding something?

>> - How can I remove this entry (especially in the production DB, without downtime)? A delete shows the following message:
>>
>>     ~ # ldapdelete -x -h localhost -w password -D cn=admin,ou=root 'cn=cname,ou=a,ou=b,ou=c,ou=root'
>>     ldap_delete: Other (e.g., implementation specific) error (80)
>>     additional info: could not locate parent of entry
>>
>> and if I try to add the missing node, I get:
>>
>>     ldapadd -x -h localhost -w password -D"cn=admin,ou=root" -f test.ldif
>>     adding new entry ou=a,ou=b,ou=c,ou=root
>>     ldap_add: Already exists (68)
> Use slapadd to add the missing entry. For back-mdb you don't need to stop slapd while running other slap* tools.

I'll try this!
Many thanks, Meike

>> and if I try to add the missing node, I get:
>>
>>     ldapadd -x -h localhost -w password -D"cn=admin,ou=root" -f test.ldif
>>     adding new entry ou=a,ou=b,ou=c,ou=root
>>     ldap_add: Already exists (68)
> Use slapadd to add the missing entry. For back-mdb you don't need to stop slapd while running other slap* tools.

I tried it on the test server successfully. But in the production environment the server is configured as one master in an MMR environment.
What is the best way there? If I add the missing entry on one server, does this entry replicate to the second master? Or is it dangerous to do this, and would it be better to stop all servers and add the entry?
Thanks in advance
Meike
Meike Stone wrote:
>>> and if I try to add the missing node, I get:
>>>
>>>     ldapadd -x -h localhost -w password -D"cn=admin,ou=root" -f test.ldif
>>>     adding new entry ou=a,ou=b,ou=c,ou=root
>>>     ldap_add: Already exists (68)
>> Use slapadd to add the missing entry. For back-mdb you don't need to stop slapd while running other slap* tools.
> I tried it on the test server successfully. But in the production environment the server is configured as one master in an MMR environment.
> What is the best way there? If I add the missing entry on one server, does this entry replicate to the second master? Or is it dangerous to do this, and would it be better to stop all servers and add the entry?

What was "dangerous" was to run all of your production servers with bogus data in the first place.
Is the entry missing on all of the masters?
Are you using regular syncrepl or delta syncrepl?
There are any number of approaches here, use your imagination.
The easiest may just be to slapadd the entry so that it exists on one master, and then run an ldapmodify on the entry. From that point on any regular syncrepl consumers will receive the entry.
Think about it, think about how replication works, come up with your own solution.
Meike Stone writes:
> - What is the origin of such orphaned nodes? (In MMR it happens, and I see a few glue records, but in my backup this one node is completely missing.)

Do you check the exit code from slapcat before saving its output? If slapcat (well, any program) fails, discard the output file.
Corollary: When you depend on correct output from a program, don't pipe it to e.g. gzip unless your shell can be told to check both exit codes. Normally 'false | true' succeeds, 'true | false' fails.
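
The pipeline behaviour described above can be sketched in bash. This is only an illustration and not part of the original mails: it assumes bash (for `PIPESTATUS` and `set -o pipefail`), and it uses `false` and `true` as stand-ins for a failing slapcat and a succeeding gzip.

```shell
#!/usr/bin/env bash
# Stand-ins: 'false' plays a failing slapcat, 'true' plays a succeeding gzip.

# Default: a pipeline reports only the status of its LAST command,
# so the failure of the first stage is hidden.
false | true
default_status=$?
echo "default status:  $default_status"

# PIPESTATUS (bash-specific) records the status of every stage.
false | true
stage_status=("${PIPESTATUS[@]}")
echo "per-stage:       ${stage_status[*]}"

# With pipefail the whole pipeline fails if ANY stage fails.
set -o pipefail
if false | true; then
    pipefail_status=0
else
    pipefail_status=$?
fi
set +o pipefail
echo "pipefail status: $pipefail_status"
```

In a backup script this would mean enabling `pipefail` (or checking `PIPESTATUS[0]`) before a line such as `slapcat ... | gzip > file`, and discarding the file when the status is non-zero.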
2013/1/24 Hallvard Breien Furuseth h.b.furuseth@usit.uio.no:
> Meike Stone writes:
>> - What is the origin of such orphaned nodes? (In MMR it happens, and I see a few glue records, but in my backup this one node is completely missing.)
> Do you check the exit code from slapcat before saving its output? If slapcat (well, any program) fails, discard the output file.

Hello,
Yes, every time I make a backup, the exit code from slapcat is evaluated. If any error occurs, I write a message to syslog. I searched more than a year of syslog and found no error, in particular not on the date when the backup that I am now using for tests was created.
So I tried it again:

Load the broken DB into slapd via slapadd (the messages prove that it is broken):
---------------------------------------------------------------------------------

    debld02:~ # slapadd -f /etc/openldap/slapd.conf -q -l /backup.ldif
    _#################### 100.00% eta none elapsed 19m11s spd 2.2 M/s
    Closing DB...Error, entries missing!
    entry 1156449: ou=a,ou=b,ou=c,ou=root

Check that the "parent entry" does not exist:
---------------------------------------------

    ~ # ldapdelete -x -h localhost -w password -D cn=admin,ou=root "cn=cname,ou=a,ou=b,ou=c,ou=root"
    ldap_delete: Other (e.g., implementation specific) error (80)
    additional info: could not locate parent of entry
    ~ # echo $?
    80

Check the exit code from slapcat after backing up the broken DB:
----------------------------------------------------------------

    ~ # slapcat -f /etc/openldap/slapd.conf >/backup.ldif; echo $?
    0

It seems to me that in such a case slapcat does not throw an error?!
Thanks Meike
Meike Stone wrote:
> 2013/1/24 Hallvard Breien Furuseth h.b.furuseth@usit.uio.no:
>> Meike Stone writes:
>>> - What is the origin of such orphaned nodes? (In MMR it happens, and I see a few glue records, but in my backup this one node is completely missing.)
>> Do you check the exit code from slapcat before saving its output? If slapcat (well, any program) fails, discard the output file.
> Hello,
> Yes, every time I make a backup, the exit code from slapcat is evaluated. If any error occurs, I write a message to syslog. I searched more than a year of syslog and found no error, in particular not on the date when the backup that I am now using for tests was created.
> So I tried it again:
> Load the broken DB into slapd via slapadd (the messages prove that it is broken):
>
>     debld02:~ # slapadd -f /etc/openldap/slapd.conf -q -l /backup.ldif
>     _#################### 100.00% eta none elapsed 19m11s spd 2.2 M/s
>     Closing DB...Error, entries missing!
>     entry 1156449: ou=a,ou=b,ou=c,ou=root
>
> Check that the "parent entry" does not exist:
>
>     ~ # ldapdelete -x -h localhost -w password -D cn=admin,ou=root "cn=cname,ou=a,ou=b,ou=c,ou=root"
>     ldap_delete: Other (e.g., implementation specific) error (80)
>     additional info: could not locate parent of entry
>     ~ # echo $?
>     80
>
> Check the exit code from slapcat after backing up the broken DB:
>
>     ~ # slapcat -f /etc/openldap/slapd.conf >/backup.ldif; echo $?
>     0
>
> It seems to me that in such a case slapcat does not throw an error?!

slapcat doesn't check for missing entries. Its only job is to dump out the contents of what is in the DB. It doesn't try to tell you what isn't in the DB. Your DB must have been in this state for a long time, probably ever since its initial import.

>>     ~ # slapcat -f /etc/openldap/slapd.conf >/backup.ldif; echo $?
>>     0
>>
>> It seems to me that in such a case slapcat does not throw an error?!
> slapcat doesn't check for missing entries. Its only job is to dump out the contents of what is in the DB. It doesn't try to tell you what isn't in the DB. Your DB must have been in this state for a long time, probably ever since its initial import.

Hello Howard,
thanks for the answer, I presumed something like this!
I think it would be a great thing to test the slapcat file right after dumping it.
So, as reported in http://www.openldap.org/lists/openldap-technical/201301/msg00254.html, I tried to do this with a broken backup and the dry-run switch, but slapadd did not report the error; it is reported only with a real import. Are there other possibilities, or can slapadd be modified so that it reports such an error?
Thanks in advance
Meike
On Mon, Jan 28, 2013 at 12:15:19PM +0100, Meike Stone wrote:
> I think it would be a great thing to test the slapcat file right after dumping it.
Testing backups is always wise...
> So, as reported in http://www.openldap.org/lists/openldap-technical/201301/msg00254.html, I tried to do this with a broken backup and the dry-run switch, but slapadd did not report the error; it is reported only with a real import. Are there other possibilities, or can slapadd be modified so that it reports such an error?
Dryrun won't be able to detect missing structural entries: that requires a database. Even an internal list of DNs is not enough, as the actual entries have to be available in order to check things like schema and content rules.
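
A much narrower check than a real import is still possible on the LDIF alone: listing DNs whose direct parent DN never appears in the file. The sketch below is not from the thread and is no substitute for slapadd, since it checks nothing about schema or content rules. The function name `find_orphans` is made up for illustration, and the script assumes plain `dn: ` lines (base64 `dn:: ` lines are ignored) and DNs without escaped commas.

```shell
#!/usr/bin/env bash
# find_orphans LDIF SUFFIX   (hypothetical helper, not an OpenLDAP tool)
# Prints every DN in LDIF whose direct parent DN is absent from the file.
# SUFFIX is the database suffix, which legitimately has no parent.
find_orphans() {
    local ldif=$1 suffix=$2
    awk -v suffix="$suffix" '
        # Remember the DN collected so far (keyed case-insensitively).
        function flush() { if (in_dn) { seen[tolower(dn)] = dn; in_dn = 0 } }

        # LDIF folding: a line starting with one space continues the previous line.
        /^ / { if (in_dn) dn = dn substr($0, 2); next }

        # Any other line ends a DN that was being collected.
        { flush() }
        /^dn: / { dn = substr($0, 5); in_dn = 1 }

        END {
            flush()
            suffix = tolower(suffix)
            for (key in seen) {
                if (key == suffix) continue
                comma = index(key, ",")
                parent = (comma > 0) ? substr(key, comma + 1) : ""
                if (!(parent in seen)) print seen[key]
            }
        }
    ' "$ldif" | sort
}
```

For the backup discussed here, a call such as `find_orphans /backup.ldif ou=root` would be expected to list the children of the missing "ou=a" node, because the comparison is done over the whole set of DNs and not in file order.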
To be a valid test you really have to import the data into a server with configuration identical to the production server. If that would take too long then a reasonable compromise might be to import to a server set up to write data as fast as possible. You could reasonably turn off all database-safety functions to do that, or put the database on a ramdisk. Even if you use hdb on the production servers you might consider trying mdb for the backup tests.
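
A test import along these lines could be wrapped as below. This is a rough sketch under assumptions, not a tested procedure: `verify_backup` is a made-up name, the configuration file passed to it is assumed to point at an empty scratch database directory (never the production one), and the `SLAPADD` variable exists only so that the command can be overridden. The slapadd options used (-q, -f, -l) are the ones shown earlier in this thread.

```shell
#!/usr/bin/env bash
# verify_backup LDIF TEST_CONF   (hypothetical helper)
# TEST_CONF: a slapd.conf whose database directory is an EMPTY scratch
# location; this is an assumption of the sketch.
SLAPADD=${SLAPADD:-slapadd}

verify_backup() {
    local ldif=$1 conf=$2 log status=0
    log=$(mktemp) || return 2

    # A real import (no -u): only this builds the tree and reveals holes.
    "$SLAPADD" -q -f "$conf" -l "$ldif" >"$log" 2>&1 || status=$?

    # Fail on a non-zero exit code or on the message seen in this thread.
    if [ "$status" -ne 0 ] || grep -q 'entries missing' "$log"; then
        echo "backup FAILED verification (slapadd exit code $status):" >&2
        cat "$log" >&2
        rm -f "$log"
        return 1
    fi
    rm -f "$log"
    echo "backup imported cleanly"
}
```

A backup job could then call `verify_backup /backup.ldif /path/to/test-slapd.conf` (both paths are placeholders) right after slapcat and raise an alarm when it returns non-zero.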
Andrew
Hello Andrew,
> Dryrun won't be able to detect missing structural entries: that requires a database. Even an internal list of DNs is not enough, as the actual entries have to be available in order to check things like schema and content rules.
> To be a valid test you really have to import the data into a server with configuration identical to the production server. If that would take too long then a reasonable compromise might be to import to a server set up to write data as fast as possible. You could reasonably turn off all database-safety functions to do that, or put the database on a ramdisk. Even if you use hdb on the production servers you might consider trying mdb for the backup tests.

Thanks for enlightening me. I have a separate backup server (a read-only slave) where I can do this. So I'll try to get money from the FC for more RAM to run the test on a ramdisk ^^
Kind regards
Meike
On Wed, Jan 30, 2013 at 04:19:17PM +0100, Meike Stone wrote:
> Thanks for enlightening me. I have a separate backup server (a read-only slave) where I can do this. So I'll try to get money from the FC for more RAM to run the test on a ramdisk ^^
You may not need to expand the physical RAM. Most Linux systems these days have a tmpfs mounted on /tmp or on something like /dev/shm - this is a 'virtual ramdisk' which will use RAM and swapspace as appropriate. Its size is fixed at mount time, but can be configured to be larger than the physical RAM if you have enough swapspace to support it. Even when extended into swap, I would expect this to be faster than a normal filesystem as it does not have to take precautions to recover after a crash.
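
Before pointing a test import at such a tmpfs, a rough capacity check might look like this. It is only a sketch: `fits_in` is a made-up helper, and the default factor of 3 between LDIF size and required space is an arbitrary assumption, since the real database size depends on the backend and the configured indexes.

```shell
#!/usr/bin/env bash
# fits_in DIR FILE [FACTOR]   (hypothetical helper)
# Succeeds if DIR has at least FACTOR times the size of FILE available.
# FACTOR defaults to 3, which is a guess and not a measured ratio.
fits_in() {
    local dir=$1 file=$2 factor=${3:-3} avail_kb file_kb
    # df -P keeps each filesystem on one line; column 4 is available KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    # File size in KiB, rounded up, taken from the byte count.
    file_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))
    [ "$avail_kb" -ge $(( file_kb * factor )) ]
}
```

Usage could be as simple as `fits_in /dev/shm /backup.ldif && echo "enough room for a test import"`, with `/dev/shm` standing for whichever tmpfs mount is used.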
Andrew
openldap-technical@openldap.org