https://bugs.openldap.org/show_bug.cgi?id=10047
Issue ID: 10047
Summary: slapd SEGV after slapindex -q
Product: OpenLDAP
Version: 2.6.3
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: slapd
Assignee: bugs@openldap.org
Reporter: quanah@openldap.org
Target Milestone: ---
In an environment where cn=config is replicated:
a) Added an equality index for an existing attribute
b) Stopped slapd after the change to the configuration had been replicated to the server. The indexing process that was automatically kicked off by this had been running for ~30 minutes before I stopped slapd
c) ran: slapindex -q -F /path/to/config -b <base> <attr>
d) started slapd
e) segfault
To verify it wasn't an overall data issue, I then:
a) slapcat the database
b) moved the problem database files aside for debugging
c) reloaded the database with slapadd -q
d) everything works fine
I will attach the gdb output to this ticket momentarily
--- Comment #1 from Quanah Gibson-Mount quanah@openldap.org ---
Created attachment 964
  --> https://bugs.openldap.org/attachment.cgi?id=964&action=edit
GDB backtrace of SEGV
Quanah Gibson-Mount quanah@openldap.org changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
        Assignee|bugs@openldap.org           |hyc@openldap.org
Target Milestone|---                         |2.6.5
        Keywords|needs_review                |
--- Comment #2 from Howard Chu hyc@openldap.org ---
I've identified two potential problems and we need to check that both get fixed.
1) slapindex should delete the index checkpoint table after it completes, but doesn't
You can verify this by doing an `mdb_dump -s ixck` on the offending DB; on a normal DB that table should be empty or absent.
2) since the index checkpoint says there's still work to do, slapd will restart the online indexer on next startup. It looks like it's giving it an invalid backend struct when init.c:mdb_db_open() calls mdb_start_index_task(). Probably it used a fake be struct from an overlay instead of the real be struct.
For (2) you can change the `mdb_start_index_task( be )` to `mdb_start_index_task( be->be_self )`
That should fix the crash, but slapd shouldn't even need to start the indexer at all. Please verify that the crash is fixed, and I'll patch the slapindex code in a further update.
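To illustrate why passing `be->be_self` matters, here is a minimal sketch. It assumes the "fake be struct" is a short-lived copy of the backend struct, which is only the probable cause suggested above and has not been confirmed against the slapd source. The `Backend` and `IndexTask` types and both functions are hypothetical stand-ins, not the real slapd definitions; only the `be_self` field name comes from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for slapd's backend struct (the real one has many more
 * fields). Only be_self is taken from the thread; the rest is hypothetical. */
typedef struct Backend {
    struct Backend *be_self;  /* points at the long-lived instance */
    const char *suffix;
} Backend;

/* Hypothetical stand-in for the indexer task, which keeps the pointer around. */
typedef struct IndexTask {
    Backend *be;
} IndexTask;

static IndexTask start_index_task(Backend *be)
{
    IndexTask task;
    assert(be != NULL);
    task.be = be;
    return task;
}

/* Mimics a caller that works on a temporary copy of the backend struct.
 * Returns 1 if the task ends up holding the long-lived instance, else 0.
 * The comparison happens while the copy is still alive. */
static int task_targets_real_backend(Backend *real, int use_be_self)
{
    Backend copy = *real;  /* short-lived copy on the stack */
    Backend *be = &copy;
    IndexTask task = start_index_task(use_be_self ? be->be_self : be);
    return task.be == real;
}
```

With `use_be_self` set, the task holds the long-lived struct. Without it, the task holds the address of the copy, which would dangle as soon as the caller returns, and a background task dereferencing it later could crash in the way reported here.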
Howard Chu hyc@openldap.org changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
  Ever confirmed|0                           |1
          Status|UNCONFIRMED                 |IN_PROGRESS
--- Comment #3 from Howard Chu hyc@openldap.org ---
https://git.openldap.org/openldap/openldap/-/merge_requests/621
Both fixes are in this MR, but you should test the crash fix on its own before testing the idxckp fix.
--- Comment #4 from Quanah Gibson-Mount quanah@openldap.org ---
(In reply to Howard Chu from comment #2)
> I've identified two potential problems and we need to check that both get fixed.
>
> 1) slapindex should delete the index checkpoint table after it completes, but doesn't
>
> You can verify this by doing an `mdb_dump -s ixck` on the offending DB; on a normal DB that table should be empty or absent.

Status of ixck
  Tree depth: 1
  Branch pages: 0
  Leaf pages: 1
  Overflow pages: 0
  Entries: 2

> 2) since the index checkpoint says there's still work to do, slapd will restart the online indexer on next startup. It looks like it's giving it an invalid backend struct when init.c:mdb_db_open() calls mdb_start_index_task(). Probably it used a fake be struct from an overlay instead of the real be struct.
>
> For (2) you can change the `mdb_start_index_task( be )` to `mdb_start_index_task( be->be_self )`
>
> That should fix the crash, but slapd shouldn't even need to start the indexer at all. Please verify that the crash is fixed, and I'll patch the slapindex code in a further update.

Ok!
--- Comment #5 from Quanah Gibson-Mount quanah@openldap.org ---
(In reply to Quanah Gibson-Mount from comment #4)
> > For (2) you can change the `mdb_start_index_task( be )` to `mdb_start_index_task( be->be_self )`
> >
> > That should fix the crash, but slapd shouldn't even need to start the indexer at all. Please verify that the crash is fixed, and I'll patch the slapindex code in a further update.
>
> Ok!

I can confirm slapd can start with just this fix.
Quanah Gibson-Mount quanah@openldap.org changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
      Resolution|---                         |FIXED
          Status|IN_PROGRESS                 |RESOLVED
--- Comment #6 from Quanah Gibson-Mount quanah@openldap.org ---
head:
• 3271bfa1 by Howard Chu at 2023-05-15T17:55:46+00:00 ITS#10047 back-mdb: delete idxckp table after slapindex
• ec3fafd1 by Howard Chu at 2023-05-15T17:55:46+00:00 ITS#10047 back-mdb: fix indexer resume on slapd restart
RE26:
• 13093c92 by Howard Chu at 2023-05-15T19:21:44+00:00 ITS#10047 back-mdb: delete idxckp table after slapindex
• 0f2433e5 by Howard Chu at 2023-05-15T19:21:49+00:00 ITS#10047 back-mdb: fix indexer resume on slapd restart
Quanah Gibson-Mount quanah@openldap.org changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
          Status|RESOLVED                    |VERIFIED