Crashing database


Carlos Parada

4 Jan 2008

7:24 p.m.

Hi all,

(First of all, sorry for this "generic" thread)

I'm using OpenLDAP 2.3.34 (I have also used earlier versions).

My usage is a bit different from the usual: I have a lot of write operations (adds and deletes), tens per second, and also a lot of reads.

My database crashes very often. I'm using the bdb backend. Sometimes restarting the server is enough to corrupt the database (or at least to leave it unable to start again). On other occasions I corrupt the database by running the slapcat command, which I know is not recommended, but it is also not disallowed by the application (and I think it was in previous versions).

This buggy behavior should not be normal. I have already searched, and it seems to be a common problem with bdb, reported many times.

My question is: what is really happening? Am I misconfiguring the server, the BDB parameters, or anything else? Or is there a systematic bug that has not been solved yet? How can I solve it?

A secondary issue is recovery. Most of the time I am not able to recover my database using either db_recover or slapd_db_recover. Is there any trick for a more effective recovery?

Regards,

Carlos Parada


Pierangelo Masarati

4 Jan 2008

8:25 p.m.

Carlos Parada wrote:

...

I'm using the openldap 2.3.34 version (I've already used previous ones).

The current 2.3 release is 2.3.40; since you don't point out any specific issue, I strongly suggest you update to the latest release, so that we don't risk chasing something that has already been fixed.

...

My usage is a bit different from the usual. I have a lot of write operations - tens per second - (adds, deletes) and also a lot of reads.

This is by no means a heavy write load.

...

My database crashes very often. I'm using the bdb backend. Sometimes it is enough to restart my server to corrupt the database (at least

"Crash" is rather generic. Can you be more specific?

...

to not be able to restart it). On other occasions, I corrupt my database using the slapcat command, which I know is not recommended, but it is also not disallowed by the application (and I think it was in previous versions).

slapcat is 100% recommended, otherwise it wouldn't be distributed at all.

...

That buggy behavior should not be normal. I've already searched and it seems to be a common problem with bdb, reported many times.

If by "bdb" you mean "back-bdb", it's the default backend, it's the recommended one and it's considered very reliable. The problem you describe is by no means typical: most users do not experience issues with it over days, months and even years of continuous service.

...

My question is: what is really happening? Am I misconfiguring the server, the BDB parameters, or anything else? Or is there a systematic bug that has not been solved yet? How can I solve it?

A secondary issue is recovery. Most of the time I am not able to recover my database using either db_recover or slapd_db_recover. Is there any trick for a more effective recovery?

With OpenLDAP 2.3 you shouldn't need to run any of those tools, as slapd (and the other slaptools) can detect by themselves when a recovery is needed. Since you don't share your slapd and your Berkeley DB configuration, there's very little to comment on.

As you're not pointing out any specific issue, I suggest you post your comments to the openldap-software list, until anything indicating a specific bug surfaces.

Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
Office: +39 02 23998309
Mobile: +39 333 4963172
Email: pierangelo.masarati@sys-net.it

Carlos Parada

5 Jan 2008

6:24 p.m.

Carlos Parada wrote:

...

I'm using the openldap 2.3.34 version (I've already used previous ones).

...

My usage is a bit different from the usual. I have a lot of write operations - tens per second - (adds, deletes) and also a lot of reads.

This is by no means a heavy write load.

[CP] Indeed, but it is a very critical use case.

...

My database crashes very often. I'm using the bdb backend. Sometimes it is enough to restart my server to corrupt the database (at least

"Crash" is rather generic. Can you be more specific?

[CP] Yes. It starts returning error code 80:

Jan 3 23:47:39 linux slapd[14817]: conn=8284139 op=1 SEARCH RESULT tag=101 err=80 nentries=0 text=internal error

When you try to restart it to fix this, it doesn't start and suggests using db_recover.
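To put a number on this symptom, here is a minimal sketch that tallies LDAP result codes from slapd syslog lines. It assumes the lines have the same shape as the one quoted above; the function name `count_result_codes` and the regular expression are illustrative and not part of OpenLDAP. Result code 80 is the generic LDAP "other" code, so the count only shows how often the failure occurs, not why.

```python
import re
from collections import Counter

# Matches the result part of a slapd log line, for example:
# "... SEARCH RESULT tag=101 err=80 nentries=0 text=internal error"
RESULT_RE = re.compile(r"\bRESULT tag=(?P<tag>\d+) err=(?P<err>\d+)")


def count_result_codes(lines):
    """Return a Counter mapping LDAP result code -> number of log lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            counts[int(match.group("err"))] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the single line quoted in this thread.
    sample = [
        "Jan 3 23:47:39 linux slapd[14817]: conn=8284139 op=1 "
        "SEARCH RESULT tag=101 err=80 nentries=0 text=internal error",
    ]
    for code, count in sorted(count_result_codes(sample).items()):
        print(f"err={code}: {count}")
```

Running it over the full log from before the first err=80 line would show whether the errors start abruptly or build up over time.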

...

to not be able to restart it). On other occasions, I corrupt my database using the slapcat command, which I know is not recommended, but it is also not disallowed by the application (and I think it was in previous versions).

slapcat is 100% recommended, otherwise it wouldn't be distributed at all.

[CP] According to the slapcat manual, it is not recommended while the server is running.

...

That buggy behavior should not be normal. I've already searched and it seems to be a common problem with bdb, reported many times.

[CP] Yes. But sometimes this happens. Try running slapcat to provoke it.

Perhaps I'm too demanding, but I am very familiar with SQL database engines (like postgres), where this NEVER happens (they are quite reliable). That doesn't seem to be the case with openldap.

...

My question is: what is really happening? Am I misconfiguring the server, the BDB parameters, or anything else? Or is there a systematic bug that has not been solved yet? How can I solve it?

A secondary issue is recovery. Most of the time I am not able to recover my database using either db_recover or slapd_db_recover. Is there any trick for a more effective recovery?

[CP] The slapd.conf is the following (includes DB_CONFIG):

database bdb
#dbnosync
#dbsync 60 10 30
cachesize 500000
dbcachesize 1000000000
suffix "dc=ptin"
rootdn "dc=ptin"
rootpw xxxxxx

directory /mnt/storage/wifi

index objectClass pres,eq
index uid eq,pres
index dc eq,pres

reverse-lookup off
lastmod off
dbnosync

#DB_CONFIG
# one 0.25 GB cache
dbconfig set_cachesize 0 268435456 1

# Data Directory
#set_data_dir db

# Transaction Log settings
dbconfig set_lg_regionmax 262144
dbconfig set_lg_bsize 2097152
#set_lg_dir logs

#remove log lines
dbconfig set_flags DB_LOG_AUTOREMOVE

# locks
dbconfig set_lk_max_locks 5000
dbconfig set_lk_max_lockers 5000
dbconfig set_lk_max_objects 5000
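As a quick check of the "0.25 GB" comment in the configuration above, the sketch below converts a `set_cachesize` line into a total byte count. It assumes the three arguments mean gigabytes, additional bytes, and number of cache regions, with a gigabyte taken as 2^30 bytes; the helper name `parse_set_cachesize` is hypothetical.

```python
GIGABYTE = 1 << 30  # assumption: Berkeley DB counts a gigabyte as 2**30 bytes


def parse_set_cachesize(line):
    """Return (total_bytes, ncache) for a 'set_cachesize G B N' line.

    Also accepts the slapd.conf form prefixed with 'dbconfig'.
    """
    parts = line.split()
    if parts and parts[0] == "dbconfig":
        parts = parts[1:]
    if len(parts) != 4 or parts[0] != "set_cachesize":
        raise ValueError(f"not a set_cachesize line: {line!r}")
    gbytes, extra_bytes, ncache = (int(p) for p in parts[1:])
    return gbytes * GIGABYTE + extra_bytes, ncache


if __name__ == "__main__":
    total, ncache = parse_set_cachesize("dbconfig set_cachesize 0 268435456 1")
    print(f"{total} bytes = {total / GIGABYTE} GB in {ncache} region(s)")
```

Under these assumptions the configured value is exactly a quarter of a gigabyte in a single region, so the comment in the configuration matches the number.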

As you're not pointing out any specific issue, I suggest you post your comments to the openldap-software list, until anything indicating a specific bug surfaces.

[CP] That's the reason for my initial '(First of all, sorry for this "generic" thread)' :)

I think that there is either a misconfiguration on my side or a bug that I cannot replicate, but that happens sometimes.

Many thanks anyway.

Pierangelo Masarati

6 Jan 2008

3:12 p.m.

...

Carlos Parada wrote:

...

the slapcat command, which I know is not recommended, but it is also not disallowed by the application (and I think it was in previous versions).

slapcat is 100% recommended, otherwise it wouldn't be distributed at all.

[CP] According to the slapcat manual, it is not recommended while the server is running.

For some unreliable database types. The man page clearly states it is 100% reliable for back-bdb and back-hdb (the back-null is not a real database).

...

[CP] The slapd.conf is the following (includes DB_CONFIG):

database bdb
#dbnosync
#dbsync 60 10 30
cachesize 500000
dbcachesize 1000000000
suffix "dc=ptin"
rootdn "dc=ptin"
rootpw xxxxxx

directory /mnt/storage/wifi

index objectClass pres,eq
index uid eq,pres
index dc eq,pres

reverse-lookup off
lastmod off
dbnosync

#DB_CONFIG
# one 0.25 GB cache
dbconfig set_cachesize 0 268435456 1

# Data Directory
#set_data_dir db

# Transaction Log settings
dbconfig set_lg_regionmax 262144
dbconfig set_lg_bsize 2097152
#set_lg_dir logs

#remove log lines
dbconfig set_flags DB_LOG_AUTOREMOVE

# locks
dbconfig set_lk_max_locks 5000
dbconfig set_lk_max_lockers 5000
dbconfig set_lk_max_objects 5000

As you're not pointing out any specific issue, I suggest you post your comments to the openldap-software list, until anything indicating a specific bug surfaces.

[CP] That's the reason for my initial '(First of all, sorry for this "generic" thread)' :)

I think that there is either a misconfiguration on my side or a bug that I cannot replicate, but that happens sometimes.

Many thanks anyway.

Looking at your configuration, I note "dbnosync", which is asking for trouble. Also, you placed your data in "/mnt/storage/wifi"; does that indicate some remote storage? That's also asking for trouble. Please don't complain about software reliability when you violate the essential rules of reliability: know what you're doing, and remember that one unreliable point in the chain makes the whole chain unreliable.
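The two points raised here can be checked mechanically. The sketch below scans slapd.conf-style text for an active (uncommented) `dbnosync` line and reports the configured `directory` so its storage can be verified by hand. The function names are hypothetical, the parsing is deliberately simplified (continuation lines are not handled), and lower-casing keywords assumes that their case does not matter.

```python
def active_directives(conf_text):
    """Yield (name, args) for each non-blank, uncommented line."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        name = parts[0].lower()  # assumption: keyword case is not significant
        args = parts[1] if len(parts) > 1 else ""
        yield name, args


def review_points(conf_text):
    """Return human-readable notes on the settings questioned in this thread."""
    points = []
    for name, args in active_directives(conf_text):
        if name == "dbnosync":
            points.append("dbnosync is active: changes are not flushed to disk immediately")
        elif name == "directory":
            points.append(f"verify that {args} is on local, reliable storage")
    return points


if __name__ == "__main__":
    # A few lines taken from the configuration posted in this thread.
    conf = "\n".join([
        "database bdb",
        "#dbnosync",
        "directory /mnt/storage/wifi",
        "lastmod off",
        "dbnosync",
    ])
    for point in review_points(conf):
        print(point)
```

The commented-out `#dbnosync` near the top of the posted configuration is ignored. Only the uncommented `dbnosync` after `lastmod off` is flagged, which is the line that takes effect.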

This all seems to come down to not having checked the documentation carefully enough, and to having used some unreliable options. So I strongly recommend you stop posting to openldap-bugs, and start posting to openldap-software (after you re-read the relevant documentation, of course).

Ing. Pierangelo Masarati
OpenLDAP Core Team


openldap-bugs@openldap.org
