Hello,
I am trying to use a slapd BDB database created on a 32-bit Intel system on a 64-bit (amd64/EM64T) system. Running slapcat on the database works fine, but when the server is started, ldapsearch is unable to find anything.
I understand that there are architectural differences between 32-bit and 64-bit systems, but I don't think Berkeley DB has any problems with its data being interchanged between the platforms.
We are running OpenLDAP in a distributed environment where every server has a local copy of the LDAP database for performance and reliability reasons. The database is 6 GB, so it is non-trivial to export it to LDIF and import it again on the 64-bit systems. (It takes way too long.)
I am setting up one server to be just a spare LDAP slave so we can easily take it down and copy the database to any new system added to the cluster without causing any downtime anywhere. But this is not possible as long as I cannot use the same database on 32-bit and 64-bit systems.
Would it be possible to make this work at all?
Could this be caused by a platform-dependent variable type being used somewhere, rather than a fixed-size variable type, making slapd interpret the same data differently on different platforms?
-- Frode Nordahl
On 10/15/06, Frode Nordahl frode@nordahl.net wrote:
Hello,
I am trying to use a slapd BDB database created on a 32-bit Intel system on a 64-bit (amd64/EM64T) system. Running slapcat on the database works fine, but when the server is started, ldapsearch is unable to find anything.
I understand that there are architectural differences between 32-bit and 64-bit systems, but I don't think Berkeley DB has any problems with its data being interchanged between the platforms.
We are running OpenLDAP in a distributed environment where every server has a local copy of the LDAP database for performance and reliability reasons. The database is 6 GB, so it is non-trivial to export it to LDIF and import it again on the 64-bit systems. (It takes way too long.)
I am setting up one server to be just a spare LDAP slave so we can easily take it down and copy the database to any new system added to the cluster without causing any downtime anywhere. But this is not possible as long as I cannot use the same database on 32-bit and 64-bit systems.
Would it be possible to make this work at all?
Could this be caused by a platform-dependent variable type being used somewhere, rather than a fixed-size variable type, making slapd interpret the same data differently on different platforms?
Are you using the same binary on both machines? If it's a 32-bit-compiled slapd/bdb on both, then I don't think you would have any problems based on the underlying architecture. BDB does have an endian-independent option, but I don't think you're running into that. Maybe you just have to reindex/recover after the copy.
And you can definitely add new replicas without causing any downtime by using the strategy you're suggesting, even using slapcat/slapadd.
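The slapcat/slapadd route described above could be sketched as follows. All hosts, paths, and filenames are placeholders, and the commands are only printed rather than executed, so the sketch is safe to run as-is:

```shell
#!/bin/sh
# Hypothetical zero-downtime bootstrap of a new replica from a spare slave.
# Hosts, paths, and filenames are placeholders; the commands are printed
# here -- remove the echo (or pipe the output to sh) to actually run them.
dump_cmd="slapcat -f /etc/openldap/slapd.conf -l /tmp/export.ldif"
copy_cmd="scp /tmp/export.ldif newhost:/tmp/export.ldif"
load_cmd="slapadd -q -f /etc/openldap/slapd.conf -l /tmp/export.ldif"
echo "$dump_cmd"   # 1. dump the spare slave to LDIF
echo "$copy_cmd"   # 2. copy the LDIF to the new node
echo "$load_cmd"   # 3. quick-mode load on the new node
```

The -q (quick) flag skips some consistency checks and is intended exactly for this kind of bulk initial load.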
On 15. okt. 2006, at 18.52, matthew sporleder wrote:
On 10/15/06, Frode Nordahl frode@nordahl.net wrote:
Hello,
I am trying to use a slapd BDB database created on a 32-bit Intel system on a 64-bit (amd64/EM64T) system. Running slapcat on the database works fine, but when the server is started, ldapsearch is unable to find anything.
I understand that there are architectural differences between 32-bit and 64-bit systems, but I don't think Berkeley DB has any problems with its data being interchanged between the platforms.
We are running OpenLDAP in a distributed environment where every server has a local copy of the LDAP database for performance and reliability reasons. The database is 6 GB, so it is non-trivial to export it to LDIF and import it again on the 64-bit systems. (It takes way too long.)
I am setting up one server to be just a spare LDAP slave so we can easily take it down and copy the database to any new system added to the cluster without causing any downtime anywhere. But this is not possible as long as I cannot use the same database on 32-bit and 64-bit systems.
Would it be possible to make this work at all?
Could this be caused by a platform-dependent variable type being used somewhere, rather than a fixed-size variable type, making slapd interpret the same data differently on different platforms?
Are you using the same binary on both machines? If it's a 32-bit-compiled slapd/bdb on both, then I don't think you would have any problems based on the underlying architecture. BDB does have an endian-independent option, but I don't think you're running into that. Maybe you just have to reindex/recover after the copy.
Different binary. The 64-bit computer is running a 64-bit binary, and I am trying to use the BDB database created on a 32-bit computer with a 32-bit binary.
I know I may be asking for trouble trying this, but I really would like it to work.
If there are deliberate differences I would like to know about it; if not, I will try to help find the source of the problem.
And you can definitely add new replicas without causing any downtime by using the strategy you're suggesting, even using slapcat/slapadd.
Yes, this would be possible, but as the slapadd step may take several hours I cannot rely on this. I have tested it with slapadd -q and it still takes too long. There's just a lot of data that must go in there :-)
-- Frode Nordahl
--On Sunday, October 15, 2006 7:52 PM +0200 Frode Nordahl frode@nordahl.net wrote:
Yes, this would be possible, but as the slapadd step may take several hours I cannot rely on this. I have tested it with slapadd -q and it still takes too long. There's just a lot of data that must go in there :-)
Well, how much RAM do you have on the system? What is the size of your *.bdb files? To have optimal slapadd -q performance, the size of your BDB cache defined in the DB_CONFIG file should be the sum of all your *.bdb files.
For example on my database:
du -c -h *.bdb
3.3G total
cat DB_CONFIG
set_cachesize 4 0 1
set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_dir /var/log/bdb
set_lk_max_locks 3000
set_lk_max_objects 1500
set_lk_max_lockers 1500
set_flags DB_LOG_AUTOREMOVE
set_tas_spins 1
--Quanah
--
Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
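Quanah's sizing rule (BDB cache roughly equal to the sum of the *.bdb files) can be turned into the three-argument form that set_cachesize expects. A hypothetical sizing helper, using his 3.3G example rounded up to 4 GB (in practice the total would come from `du -c -b *.bdb`):

```shell
# Convert a total on-disk size in bytes into the "gigabytes bytes ncache"
# form used by BDB's set_cachesize directive.  The 4 GB figure below is
# Quanah's 3.3G of *.bdb files rounded up; substitute your own total.
total=4294967296                     # 4 GB in bytes
gb=$((total / 1073741824))           # whole gigabytes
rest=$((total % 1073741824))         # leftover bytes
echo "set_cachesize $gb $rest 1"     # → set_cachesize 4 0 1
```

The trailing 1 is the number of cache regions; for caches under a few gigabytes a single region is the usual choice.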
On 16. okt. 2006, at 00.15, Quanah Gibson-Mount wrote:
--On Sunday, October 15, 2006 7:52 PM +0200 Frode Nordahl frode@nordahl.net wrote:
Yes, this would be possible, but as the slapadd step may take several hours I cannot rely on this. I have tested it with slapadd -q and it still takes too long. There's just a lot of data that must go in there :-)
Well, how much RAM do you have on the system? What is the size of your *.bdb files? To have optimal slapadd -q performance, the size of your BDB cache defined in the DB_CONFIG file should be the sum of all your *.bdb files.
For example on my database:
du -c -h *.bdb
3.3G total
cat DB_CONFIG
set_cachesize 4 0 1
set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_dir /var/log/bdb
set_lk_max_locks 3000
set_lk_max_objects 1500
set_lk_max_lockers 1500
set_flags DB_LOG_AUTOREMOVE
set_tas_spins 1
Thank you for your comments. I will revisit slapadd -q and see if I can make it go fast enough and escape my wish for binary copies across platforms.
-- Frode Nordahl
On 10/15/06, Frode Nordahl frode@nordahl.net wrote:
On 15. okt. 2006, at 18.52, matthew sporleder wrote:
On 10/15/06, Frode Nordahl frode@nordahl.net wrote:
Hello,
I am trying to use a slapd BDB database created on a 32-bit Intel system on a 64-bit (amd64/EM64T) system. Running slapcat on the database works fine, but when the server is started, ldapsearch is unable to find anything.
I understand that there are architectural differences between 32-bit and 64-bit systems, but I don't think Berkeley DB has any problems with its data being interchanged between the platforms.
We are running OpenLDAP in a distributed environment where every server has a local copy of the LDAP database for performance and reliability reasons. The database is 6 GB, so it is non-trivial to export it to LDIF and import it again on the 64-bit systems. (It takes way too long.)
I am setting up one server to be just a spare LDAP slave so we can easily take it down and copy the database to any new system added to the cluster without causing any downtime anywhere. But this is not possible as long as I cannot use the same database on 32-bit and 64-bit systems.
Would it be possible to make this work at all?
Could this be caused by a platform-dependent variable type being used somewhere, rather than a fixed-size variable type, making slapd interpret the same data differently on different platforms?
Are you using the same binary on both machines? If it's a 32-bit-compiled slapd/bdb on both, then I don't think you would have any problems based on the underlying architecture. BDB does have an endian-independent option, but I don't think you're running into that. Maybe you just have to reindex/recover after the copy.
Different binary. The 64-bit computer is running a 64-bit binary, and I am trying to use the BDB database created on a 32-bit computer with a 32-bit binary.
I know I may be asking for trouble trying this, but I really would like it to work.
If there are deliberate differences I would like to know about it; if not, I will try to help find the source of the problem.
And you can definitely add new replicas without causing any downtime by using the strategy you're suggesting, even using slapcat/slapadd.
Yes, this would be possible, but as the slapadd step may take several hours I cannot rely on this. I have tested it with slapadd -q and it still takes too long. There's just a lot of data that must go in there :-)
Use the same binary set on all of your replicas, otherwise you can't reasonably expect the same database files to work.
I also have a large database (my slapcat-ed file is over 4 GB), but I don't see how it's more reliable to shut down a spare for one hour while you scp versus four hours while you slapadd. What's the difference? A minute's worth of replication to catch up with?
On 16. okt. 2006, at 03.08, matthew sporleder wrote:
And you can definitely add new replicas without causing any downtime by using the strategy you're suggesting, even using slapcat/slapadd.
Yes, this would be possible, but as the slapadd step may take several hours I cannot rely on this. I have tested it with slapadd -q and it still takes too long. There's just a lot of data that must go in there :-)
Use the same binary set on all of your replicas, otherwise you can't reasonably expect the same database files to work.
There is nothing in Berkeley DB that prevents you from doing this, hence the question. But I have learned that there is a deliberate decision behind this not working, and I am content with knowing that.
I also have a large database (my slapcat-ed file is over 4 GB), but I don't see how it's more reliable to shut down a spare for one hour while you scp versus four hours while you slapadd. What's the difference? A minute's worth of replication to catch up with?
I would have to take the slave down to do the slapcat as well, and I guess the time difference between slapcat and tar of the binary files is next to nil.
The whole point of having the spare slave is to take it down at the same time as I add a new replica to the master server's configuration. That way the slapcat / binary copy will be in a good state, and the replication will start at the right spot.
As soon as the slapcat or binary copy is done the replication can start on the slave again.
I can probably avoid this by using syncrepl instead?
-- Frode Nordahl
On Monday 16 October 2006 09:56, Frode Nordahl wrote:
On 16. okt. 2006, at 03.08, matthew sporleder wrote:
And you can definitely add new replicas without causing any downtime by using the strategy you're suggesting, even using slapcat/slapadd.
Yes, this would be possible, but as the slapadd step may take several hours I cannot rely on this. I have tested it with slapadd -q and it still takes too long. There's just a lot of data that must go in there :-)
Use the same binary set on all of your replicas, otherwise you can't reasonably expect the same database files to work.
There is nothing in Berkeley DB that prevents you from doing this, hence the question. But I have learned that there is a deliberate decision behind this not working, and I am content with knowing that.
I also have a large database (my slapcat-ed file is over 4 GB), but I don't see how it's more reliable to shut down a spare for one hour while you scp versus four hours while you slapadd. What's the difference? A minute's worth of replication to catch up with?
I would have to take the slave down to do the slapcat as well,
No, you can slapcat while the slave is running.
and I guess the time difference between slapcat and tar of the binary files is next to nil.
The whole point of having the spare slave is to take it down at the same time as I add a new replica to the master server's configuration. That way the slapcat / binary copy will be in a good state, and the replication will start at the right spot.
As soon as the slapcat or binary copy is done the replication can start on the slave again.
I can probably avoid this by using syncrepl instead?
Yes, if you're using sync-repl, there is no need for this. Take any valid snapshot of the database, slapadd on the consumer, start it up, and it will catch up. Or, if you can wait a little bit longer, skip the whole slapcat/slapadd step entirely.
Regards, Buchan
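For reference, a minimal syncrepl consumer stanza for slapd.conf might look like the following sketch. All hostnames, DNs, and credentials are placeholders, and the provider side additionally needs the syncprov overlay configured:

```
# Consumer (slave) side -- inside the database section of slapd.conf.
# All names and credentials below are placeholders.
syncrepl rid=001
        provider=ldap://master.example.com
        type=refreshAndPersist
        searchbase="dc=example,dc=com"
        bindmethod=simple
        binddn="cn=replica,dc=example,dc=com"
        credentials=secret
        retry="60 +"

# Provider (master) side needs the sync provider overlay:
# overlay syncprov
```

With refreshAndPersist the consumer stays connected and pulls itself up to date automatically after any downtime, which is exactly the catch-up behaviour discussed above.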
On 16. okt. 2006, at 12.08, Buchan Milne wrote:
On Monday 16 October 2006 09:56, Frode Nordahl wrote:
On 16. okt. 2006, at 03.08, matthew sporleder wrote:
I also have a large database (my slapcat-ed file is over 4 GB), but I don't see how it's more reliable to shut down a spare for one hour while you scp versus four hours while you slapadd. What's the difference? A minute's worth of replication to catch up with?
I would have to take the slave down to do the slapcat as well,
No, you can slapcat while the slave is running.
Wouldn't that leave the LDIF in an inconsistent state? Or is slapcat protected by a transaction?
and I guess the time difference between slapcat and tar of the binary files is next to nil.
The whole point of having the spare slave is to take it down at the same time as I add a new replica to the master server's configuration. That way the slapcat / binary copy will be in a good state, and the replication will start at the right spot.
As soon as the slapcat or binary copy is done the replication can start on the slave again.
I can probably avoid this by using syncrepl instead?
Yes, if you're using sync-repl, there is no need for this. Take any valid snapshot of the database, slapadd on the consumer, start it up, and it will catch up. Or, if you can wait a little bit longer, skip the whole slapcat/slapadd step entirely.
Thanks, I will look into converting to syncrepl.
-- Frode Nordahl
Frode Nordahl wrote:
On 16. okt. 2006, at 12.08, Buchan Milne wrote:
On Monday 16 October 2006 09:56, Frode Nordahl wrote:
On 16. okt. 2006, at 03.08, matthew sporleder wrote:
I also have a large database (my slapcat-ed file is over 4 GB), but I don't see how it's more reliable to shut down a spare for one hour while you scp versus four hours while you slapadd. What's the difference? A minute's worth of replication to catch up with?
I would have to take the slave down to do the slapcat as well,
No, you can slapcat while the slave is running.
Wouldn't that leave the LDIF in an inconsistent state? Or is slapcat protected by a transaction?
All write operations in back-bdb/hdb are transactional, so they are fully isolated. slapcat will only see consistent data. That was one of the reasons for writing back-bdb in the first place...
Yes, if you're using sync-repl, there is no need for this. Take any valid snapshot of the database, slapadd on the consumer, start it up, and it will catch up. Or, if you can wait a little bit longer, skip the whole slapcat/slapadd step entirely.
--On Monday, October 16, 2006 6:13 AM -0700 Howard Chu hyc@symas.com wrote:
Frode Nordahl wrote:
On 16. okt. 2006, at 12.08, Buchan Milne wrote:
On Monday 16 October 2006 09:56, Frode Nordahl wrote:
On 16. okt. 2006, at 03.08, matthew sporleder wrote:
I also have a large database (my slapcat-ed file is over 4 GB), but I don't see how it's more reliable to shut down a spare for one hour while you scp versus four hours while you slapadd. What's the difference? A minute's worth of replication to catch up with?
I would have to take the slave down to do the slapcat as well,
No, you can slapcat while the slave is running.
Wouldn't that leave the LDIF in an inconsistent state? Or is slapcat protected by a transaction?
All write operations in back-bdb/hdb are transactional, so they are fully isolated. slapcat will only see consistent data. That was one of the reasons for writing back-bdb in the first place...
I'd think that if one is using slurpd, there's the possibility that the replica could receive changes to entries that had already been dumped while the remaining entries are still being dumped, meaning the new replica, when loaded, wouldn't have those changes, and would have no way to get them. Another excellent reason for not using slurpd. ;)
--Quanah
--
Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
Frode Nordahl wrote:
Hello,
I am trying to use a slapd BDB database created on a 32-bit Intel system on a 64-bit (amd64/EM64T) system. Running slapcat on the database works fine, but when the server is started, ldapsearch is unable to find anything.
I understand that there are architectural differences between 32-bit and 64-bit systems, but I don't think Berkeley DB has any problems with its data being interchanged between the platforms.
No, this won't work. On a 64-bit build the entryIDs in the database are 64 bits wide. You must export to LDIF and re-import the DB.
I spent a lot of time thinking over this decision: whether to keep entryIDs 32 bits wide on 64-bit builds. Ultimately I decided that 64 bits was better. We have a deployment that has upwards of 3 billion entries, and they add/delete millions of entries per day, so restricting things to only 4 billion entryIDs would have been too much of a restriction.
I note that most other directory servers are restricted to 32 bits here, and they are simply unable to scale out to the numbers that we manage.
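A quick way to see the width difference behind this is to check the native word size on each host. Note this only reports what the OS userland is built for, not what a particular slapd binary was compiled for, so it is a sanity check under the assumption of native builds:

```shell
# Native word size of the build environment: prints 64 on amd64 userlands
# and 32 on i386 ones -- matching the entryID width of a native slapd build.
bits=$(getconf LONG_BIT)
echo "native word size: $bits bits"
```

A database whose keys were written with 32-bit entryIDs is therefore silently misread by a binary expecting 64-bit ones, which matches the symptom of slapd finding no entries.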
On 16. okt. 2006, at 00.01, Howard Chu wrote:
Frode Nordahl wrote:
Hello, I am trying to use a slapd BDB database created on a 32-bit Intel system on a 64-bit (amd64/EM64T) system. Running slapcat on the database works fine, but when the server is started, ldapsearch is unable to find anything. I understand that there are architectural differences between 32-bit and 64-bit systems, but I don't think Berkeley DB has any problems with its data being interchanged between the platforms.
No, this won't work. On a 64-bit build the entryIDs in the database are 64 bits wide. You must export to LDIF and re-import the DB.
Thank you for your reply!
I spent a lot of time thinking over this decision: whether to keep entryIDs 32 bits wide on 64-bit builds. Ultimately I decided that 64 bits was better. We have a deployment that has upwards of 3 billion entries, and they add/delete millions of entries per day, so restricting things to only 4 billion entryIDs would have been too much of a restriction.
I think you made a wise decision. 5-10 years from now most servers will have been replaced by newer systems capable of running 64-bit operating systems, and this will be a thing of the past.
-- Frode Nordahl
Frode Nordahl wrote:
We are running OpenLDAP in a distributed environment where every server has a local copy of the LDAP database for performance and reliability reasons. The database is 6 GB, so it is non-trivial to export it to LDIF and import it again on the 64-bit systems. (It takes way too long.)
If this is taking too long, then you're doing something wrong. With OpenLDAP 2.3 and -q (quick mode) we can slapadd a 1 terabyte database in only 8 hours. Your 6GB database could be loaded in just 15 minutes or so.
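Taking the 1 TB / 8 hour figure at face value, a back-of-envelope scaling check (integer shell arithmetic, linear scaling assumed) shows why "15 minutes or so" is if anything a conservative estimate for 6 GB:

```shell
# Rough throughput implied by "1 TB in 8 hours", and the projected time
# for a 6 GB load at the same rate.  Linear scaling assumed; integer math.
rate=$(( (1024 * 1024) / (8 * 3600) ))   # MB per second, ~36
mins=$(( (6 * 1024) / rate / 60 ))       # minutes for 6 GB, ~2
echo "~${rate} MB/s, 6 GB in ~${mins} minutes"
```

In practice a smaller database cannot amortize start-up and indexing costs the way a terabyte load can, so a few minutes to a quarter hour is the realistic range.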
On 16. okt. 2006, at 00.04, Howard Chu wrote:
Frode Nordahl wrote:
We are running OpenLDAP in a distributed environment where every server has a local copy of the LDAP database for performance and reliability reasons. The database is 6 GB, so it is non-trivial to export it to LDIF and import it again on the 64-bit systems. (It takes way too long.)
If this is taking too long, then you're doing something wrong. With OpenLDAP 2.3 and -q (quick mode) we can slapadd a 1 terabyte database in only 8 hours. Your 6GB database could be loaded in just 15 minutes or so.
Ok, thanks. I will revisit this problem and try to find out how fast we can make it go, and see if it is sufficient for disaster recovery and adding new systems.
-- Frode Nordahl