Hi,
I have a huge LDIF file from OpenLDAP 2.3.30, which I am trying to load into a 2.4.19 mirrormode setup.
I've tried different ways to load it.
1) Load the LDIF on server 1 and wait for server 2 to replicate it. - This takes several days, and server 2 never seems to catch up with server 1.
2) Load the LDIF on both servers and start slapd. - Afterwards, not all entries created on server 1 are replicated to server 2.
So the first thing to rule out would be to ensure that I've loaded the LDIF correctly:
The entryCSN from the 2.3.30 server is in this format: 20071214130312Z#000000#00#000000
The two servers (server1/server2) in the 2.4.19 setup have sid 1, 2 and rid 3,4 (cn=config replication has rid 1,2).
I loaded the LDIF like this on both servers: $ slapadd -q -w -l backup-2.3.30.ldif
Is that enough to make replication start from a known state?
Could someone please exemplify a scenario where the -S option for slapadd is needed?
/Peter
- load server 1 using slapadd with option -S (the SID of server 1) and -w
- slapcat server 1
- slapadd server 2 using the slapcat from server 1
this ensures you have consistent entryCSN and contextCSN
p.
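The three steps above could be scripted roughly as follows. This is only a sketch: the function names and the DRY_RUN switch are made up for illustration, and it assumes slapd is stopped while the tools run and that each function is called on the right host. Note that, per the slapadd manpage, -S only sets the SID used in entryCSN values that slapadd generates; CSNs already present in the LDIF are kept.

```shell
# run CMD...: print the command when DRY_RUN is set, otherwise execute it.
# (Hypothetical helper, only here so the sequence can be reviewed first.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# load_first_server SID LDIF
# Load the first server: -S sets the SID for generated entryCSNs,
# -w writes the contextCSN, -q is quick mode.
load_first_server() {
    run slapadd -S "$1" -q -w -l "$2"
}

# dump_server OUT
# Dump the freshly loaded database to an LDIF file.
dump_server() {
    run slapcat -l "$1"
}

# load_other_server LDIF
# Load the dump from the first server on the second server.
load_other_server() {
    run slapadd -q -l "$1"
}
```

With DRY_RUN=1 set, calling load_first_server 1 backup-2.3.30.ldif, then dump_server server1.ldif, then load_other_server server1.ldif only prints the commands, which makes it easy to check the order before touching a 22G database.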
masarati@aero.polimi.it wrote:
load server 1 using slapadd with option -S (the SID of server 1) and -w
slapcat server 1
slapadd server 2 using the slapcat from server 1
this ensures you have consistent entryCSN and contextCSN
Ahh... That's of course right. But that will also more than double the time needed to load a backup on a mirrormode setup. It already takes over 2 hours to load the database on server 1. Is there really no way around this load-dump-load procedure?
/Peter
If the two servers are identical, and you can afford stopping server 1, you can probably copy the database files.
p.
--On Wednesday, November 11, 2009 9:09 PM +0100 Peter Mogensen apm@mutex.dk wrote:
Ahh... That's of course right. But that will also more than double the time needed to load a backup on a mirrormode setup. It already takes over 2 hours to load the database on server 1. Is there really no way around this load-dump-load procedure?
Two hours? Are you sure your DB_CONFIG file is tuned, tool-threads is set to cpu#, and -q is being used with slapadd? Must be one heck of a large DB otherwise. :P It takes me two hours to load a 3 million entry DB that ends up being a 12GB db on disk.
--Quanah
--
Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc -------------------- Zimbra :: the leader in open source messaging and collaboration
Quanah Gibson-Mount wrote:
Two hours? Are you sure your DB_CONFIG file is tuned, tool-threads is set to cpu#, and -q is being used with slapadd?
yes.
Must be one heck of a large DB otherwise. :P It takes me two hours to load a 3 million entry DB that ends up being a 12GB db on disk.
In the BerkeleyDB directory:
# du -sh .
22G
/Peter
Peter Mogensen wrote:
Ahh... That's of course right. But that will also more than double the time needed to load a backup on a mirrormode setup.
This procedure should only be needed if the LDIF doesn't already contain correct CSNs. If you're loading a backup from a 2.4 slapcat you can just slapadd it on all servers at once.
Howard Chu wrote:
This procedure should only be needed if the LDIF doesn't already contain correct CSNs. If you're loading a backup from a 2.4 slapcat you can just slapadd it on all servers at once.
My understanding is that he was loading LDIF from 2.3, which has a different format for CSN. So the first run with -S and -w was intended to initialize CSN info in 2.4 format with the SID of the first master. This would probably require removing the entryCSN values from the original LDIF.
p.
Pierangelo Masarati wrote:
My understanding is that he was loading LDIF from 2.3, which has a different format for CSN. So the first run with -S and -w was intended to initialize CSN info in 2.4 format with the SID of the first master. This would probably require removing the entryCSN values from the original LDIF.
I've done as above. "slapadd -S 1 -q -w" on server-1 (Server-ID 1)
Then slapcat on server-1
I would have expected the entryCSN values in the output to now be with SID 1, but they look like this: entryCSN: 20071214130312.000000Z#000000#000#000000
Then contextCSN is also with SID 0: contextCSN: 20090929120520.000000Z#000000#000#000000
Though that surprised me, I imported the LDIF into server-2 (SID 2), and replication seems to work. However, after the first change from server-1 has been replicated to server-2, there are now two contextCSN values on server-2:
entryCSN: 20071214130312.000000Z#000000#000#000000
contextCSN: 20090929120520.000000Z#000000#000#000000
contextCSN: 20091112161735.074445Z#000000#001#000000
... the last one with SID 1.
This is not the behaviour I would have expected.
/Peter
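A quick way to see which SIDs actually ended up in a slapcat dump is to count entryCSN values per SID. The sketch below assumes the 2.4 CSN layout shown above (timestamp#count#SID#mod, so the SID is the third '#'-separated field) and that entryCSN lines are not folded; sid_histogram is a made-up name.

```shell
# sid_histogram FILE
# Print "<sid> <count>" for every SID found in the entryCSN values of an
# LDIF file. With '#' as field separator, the SID is field 3.
sid_histogram() {
    awk -F'#' '/^entryCSN: / { n[$3]++ }
               END { for (s in n) print s, n[s] }' "$1" | sort
}
```

If every line of the output starts with 000, the load kept the old single-master CSNs and nothing was generated with the SID passed to -S.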
What happened is that slapadd simply converted the existing 2.3 CSNs to 2.4 format while keeping their value. The fact the first contextCSN (generated by slapadd) has SID 000 is expected, since the contextCSN is computed as the largest entryCSN (one for each SID that appears in the database's entryCSN).
At this point, you should:
- take the LDIF slapcat from server-1
- manually modify all SIDs to 001 (e.g. using sed or whatever)
- reload the LDIF into server-1
Now you have a properly initialized server-1.
p.
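The "modify all SIDs" step could look something like the following sed filter. It is a sketch under a few assumptions: the dump uses the 2.4 CSN format (three hex digits for the SID), CSN lines are not folded, and every entryCSN and contextCSN should get the same SID regardless of its old value. The function name set_csn_sid is invented here.

```shell
# set_csn_sid SID < in.ldif > out.ldif
# Rewrite the SID field (third '#'-separated field, three hex digits) of
# every entryCSN and contextCSN line. All other lines pass through unchanged.
set_csn_sid() {
    case "$1" in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
        *) echo "set_csn_sid: SID must be three hex digits, e.g. 001" >&2
           return 1 ;;
    esac
    sed -E "s/^((entry|context)CSN: [^#]+#[0-9a-fA-F]{6}#)[0-9a-fA-F]{3}(#[0-9a-fA-F]{6})\$/\\1$1\\3/"
}
```

Usage would be along the lines of set_csn_sid 001 < server1.ldif > server1-sid1.ldif, followed by reloading the result into server-1. It is worth diffing a few entries before and after to confirm that only the SID field changed.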
Ahh... -S is only for generated CSN's.
But if I'm loading the same data into both servers in a mirrormode setup, then I shouldn't really have any use for the old CSN values, should I? So instead of changing the CSNs with sed/perl, I could just remove them from the LDIF and let slapadd generate new ones?
It strikes me that there should be an FAQ about this (loading a backup from one server setup into another with different SID/RIDs). Have I missed it?
/Peter
Peter Mogensen wrote:
Ahh... -S is only for generated CSN's.
Correct. The idea is that if an entry already has a CSN, you'd like to preserve it, at least in the portion that indicates when it was last changed. Having entries whose CSN has a SID of 0 in your setup should not be an issue by itself; my fear is that it may result in some "not mine" issues, that's why I'd suggest to turn single-master entryCSN into MM entryCSN by forcing their SID to that of the first server.
But if I'm loading the same data into both servers in a mirrormode setup, then I shouldn't really have any use for the old CSN values, should I? So instead of changing the CSNs with sed/perl, I could just remove them from the LDIF and let slapadd generate new ones?
That's another option; you'd lose the real modification date, but this might be a minor issue as long as you intend to start with a fresh system.
It strikes my that there should be an FAQ about this (loading a backup from one server setup into another with different SID/RIDs).
There should be some discussion in the mailing lists (option -S was added based on something related to this, namely the need to use a specific SID to initialize entryCSN during LDIF import). I'm not aware of specific FAQs on this exact topic.
p.
Pierangelo Masarati wrote:
But if I'm loading the same data into both servers in a mirrormode setup, then I shouldn't really have any use for the old CSN values, should I? So instead of changing the CSNs with sed/perl, I could just remove them from the LDIF and let slapadd generate new ones?
That's another option; you'd lose the real modification date, but this might be a minor issue as long as you intend to start with a fresh system.
But I'll only lose it for syncrepl purposes. For application purposes I still have modifyTimestamp. And since I start both servers with the exact same LDIF, they will not need to know in what sequence that data came to be.
/Peter
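Removing the old CSNs before the load can be done with a simple filter. This is a sketch, not a tested migration recipe: strip_csn is a made-up name, it assumes CSN lines are not folded across lines, and dropping contextCSN along with entryCSN is an assumption made here so that slapadd -w writes a fresh one.

```shell
# strip_csn < in.ldif > out.ldif
# Drop entryCSN and contextCSN lines so that slapadd has to generate new
# CSNs (using the SID given with -S). Everything else, including
# modifyTimestamp, is passed through untouched.
strip_csn() {
    grep -v -E '^(entryCSN|contextCSN):'
}
```

For example, strip_csn < backup-2.3.30.ldif > backup-nocsn.ldif, then load backup-nocsn.ldif with slapadd -S 1 -q -w.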
Howard Chu wrote:
This procedure should only be needed if the LDIF doesn't already contain correct CSNs. If you're loading a backup from a 2.4 slapcat you can just slapadd it on all servers at once.
But I would guess I could do it like this to save time:
1) slapadd -S 1 -q -w -l data.ldif on server 1
2) slapcat server 1 > newdata.ldif
3a) Start server 1
3b) slapadd -q -l newdata.ldif on server 2
4) Start server 2
?? I would expect server 2 to catch up quickly with server 1, and of course I would not have mirroring initially while server 2 is loading.
/Peter
On Thu, Nov 12, 2009 at 4:26 AM, masarati@aero.polimi.it wrote:
Could someone please exemplify a scenario where the -S option for slapadd is needed?
load server 1 using slapadd with option -S (the SID of server 1) and -w
slapcat server 1
slapadd server 2 using the slapcat from server 1
this ensures you have consistent entryCSN and contextCSN
So can i say that in mirrormode, there is never a need to provide -S <sid> unless you are initializing the first server (sid 1) from a ldif without valid CSN's etc, eg. by providing the "-S 1 -w" options to slapadd ?
If this is true, then would I also be correct in saying that -S <sid> only sets a parameter to enable the -w option to set the SID correctly in CSNs etc., and that the value provided by -S has no effect if -w is not provided?
I imagine therefore the same thing would apply when initializing a set of multimaster servers, where the <sid> provided by -S just happened to be the first one you are loading. The others 2-N nodes would be initialized from a slapcat of the first server you loaded (using the contextCSN's etc., slapadd "-S <sid> -w" created), still using slapadd but without needing to provide either the -S <sid> option or the -w option.
Just trying to clarify my understanding before contributing a paragraph of doc to explain this.
Cheers Brett
Brett @Google wrote:
On Thu, Nov 12, 2009 at 4:26 AM, masarati@aero.polimi.it wrote:
Could someone please exemplify a scenario where the -S option for slapadd is needed?
- load server 1 using slapadd with option -S (the SID of server 1) and -w
- slapcat server 1
- slapadd server 2 using the slapcat from server 1
this ensures you have consistent entryCSN and contextCSN
So can i say that in mirrormode, there is never a need to provide -S <sid> unless you are initializing the first server (sid 1) from a ldif without valid CSN's etc, eg. by providing the "-S 1 -w" options to slapadd ?
OK.
If this is true, then would I also be correct in saying that -S <sid> only sets a parameter to enable the -w option to set the SID correctly in CSNs etc., and that the value provided by -S has no effect if -w is not provided?
No.
I imagine therefore the same thing would apply when initializing a set of multimaster servers, where the <sid> provided by -S just happened to be the first one you are loading. The others 2-N nodes would be initialized from a slapcat of the first server you loaded (using the contextCSN's etc., slapadd "-S <sid> -w" created), still using slapadd but without needing to provide either the -S <sid> option or the -w option.
Yes.
Just trying to clarify my understanding before contributing a paragraph of doc to explain this.
On Sat, Nov 21, 2009 at 7:10 PM, Howard Chu hyc@symas.com wrote:
If this is true, then would I also be correct in saying that -S <sid> only sets a parameter to enable the -w option to set the SID correctly in CSNs etc., and that the value provided by -S has no effect if -w is not provided?
No.
hmmm.. csnsid in slapadd.c defaults to 0, unless the -S option is given, but let us assume it is given -S 1, so csnsid=1
if the -w option is specified, then update_ctxcsn is >0
it looks like slapadd will create the berval csn (in memory) which takes the value of csnsid, but it will only write related data if (update_ctxcsn)
but for multimaster the csnsid is used (thus must be provided) to find its own CSN, but unless update_ctxcsn is >0, it won't change any state?
so maybe -S <sid> is required (with or without -w) to initialize multimaster to find the correct CSN (as other masters' CSNs could be present? - I don't quite get why though, if we are re-loading the database), but for mirrormode -S <sid> is not required (if there is no -w), but only because any other <sid>s are likely to be shadow contexts in the initial case of a data load?
Cheers Brett
Brett @Google wrote:
On Sat, Nov 21, 2009 at 7:10 PM, Howard Chu hyc@symas.com wrote:
If this is true, then would I also be correct in saying that -S <sid> only sets a parameter to enable the -w option to set the SID correctly in CSNs etc., and that the value provided by -S has no effect if -w is not provided?
No.
hmmm.. csnsid in slapadd.c defaults to 0, unless the -S option is given, but let us assume it is given -S 1, so csnsid=1
if the -w option is specified, then update_ctxcsn is >0
it looks like slapadd will create the berval csn (in memory) which takes the value of csnsid,
Yes.
but it will only write related data if (update_ctxcsn)
The manpage clearly states that -S sets the SID used in generated entryCSNs.
It also clearly states that the -w option writes the contextCSN based on the greatest entryCSN in the database.
Period, end of story.
but for multimaster the csnsid is used (thus must be provided) to find its own CSN, but unless update_ctxcsn is >0, it won't change any state?
Huh?
so maybe -S <sid> is required (with or without -w) to initialize multimaster to find the correct CSN (as other masters' CSNs could be present? - I don't quite get why though, if we are re-loading the database), but for mirrormode -S <sid> is not required (if there is no -w), but only because any other <sid>s are likely to be shadow contexts in the initial case of a data load?
Again, huh?
On Sun, Nov 22, 2009 at 5:42 AM, Howard Chu hyc@symas.com wrote:
> but it will only write related data if (update_ctxcsn)
The manpage clearly states that -S sets the SID used in generated entryCSNs.
It also clearly states that the -w option writes the contextCSN based on the greatest entryCSN in the database.
Period, end of story.
Indeed.
So the consequence is that the -S <sid> option is used to initialize the entryCSNs, regardless of whether the -w option is provided.
If the -w option is provided, then the entryCSN's are searched for the maximal entryCSN, which becomes the contextCSN (either recently generated, or provided as part of the LDIF being loaded)
If the -S is not provided, it defaults to 0.
Thanks for your patience..
Cheers Brett
Just to clarify (your first statement sounds a bit ambiguous to me): -w and -S are sort of orthogonal.
If the database has "lastmod on", slapadd adds entryCSN (and entryUUID, createTimestamp, modifyTimestamp and so on) unless already present in the LDIF. Values that are already present are left untouched.
The entryCSN is generated using the SID passed with -S; it defaults to 0.
When -w is used, the largest entryCSN for each independent SID is collected, regardless of being generated or already present in the LDIF, and written in the contextCSN (as soon as slapadd's execution is successful, of course). The contextCSN is multi-valued.
So, for example, an LDIF like this (skipping unnecessary details)
dn: dc=example,dc=com
entryCSN: 20091122093849.380000Z#000000#000#000000

dn: ou=People,dc=example,dc=com
entryCSN: 20091122093850.380000Z#000000#001#000000

dn: ou=Groups,dc=example,dc=com
entryCSN: 20091122093851.380000Z#000000#002#000000

dn: cn=Someone,ou=People,dc=example,dc=com
# no entryCSN

run with slapadd -w -S 3 will result in

dn: dc=example,dc=com
entryCSN: 20091122093849.380000Z#000000#000#000000
# gathered by slapadd run with -w
contextCSN: 20091122093849.380000Z#000000#000#000000
contextCSN: 20091122093850.380000Z#000000#001#000000
contextCSN: 20091122093851.380000Z#000000#002#000000
contextCSN: 20091122093852.380000Z#000000#003#000000

dn: ou=People,dc=example,dc=com
entryCSN: 20091122093850.380000Z#000000#001#000000

dn: ou=Groups,dc=example,dc=com
entryCSN: 20091122093851.380000Z#000000#002#000000

dn: cn=Someone,ou=People,dc=example,dc=com
# added by slapadd, with SID=3 as passed by -S 3
entryCSN: 20091122093852.380000Z#000000#003#000000
Hope this clarifies. Feel free to turn this into a FAQ entry, or better an example for the Admin Guide :)
p.
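The "largest entryCSN for each SID" rule can be imitated outside slapadd to predict which contextCSN values a dump should carry. The awk sketch below is not slapadd's code, just an illustration of the rule as described above. It assumes 2.4-format CSNs (fixed width, so plain string comparison orders them by time) and unfolded entryCSN lines; context_csns is an invented name.

```shell
# context_csns FILE
# Print one "contextCSN: <csn>" line per SID, holding the largest entryCSN
# found for that SID in the LDIF file. Output is sorted by SID.
context_csns() {
    awk '/^entryCSN: / {
             csn = $2
             split(csn, f, "#")      # f[3] is the SID
             # appending "" forces a string comparison
             if (!(f[3] in max) || (csn "") > (max[f[3]] "")) max[f[3]] = csn
         }
         END { for (s in max) print "contextCSN: " max[s] }' "$1" |
        sort -t'#' -k3,3
}
```

Running it on a slapcat dump and comparing the result with the contextCSN values stored in the suffix entry is a cheap sanity check after a load with -w.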
openldap-technical@openldap.org