--001a11c2ed64c222e504e2c14561 Content-Type: text/plain; charset=UTF-8
Hi,
All writes occur in the parent process only. The child (normally) only reopens the environment and performs a few short reads.
However, it is the opening of the env in the forked child itself that causes the database growth. I tried closing the env immediately after opening it in the child (without performing any reads) and saw the same growth.
Hope that makes sense,
Dimitrij

On 30 Jul 2013 21:19, "Howard Chu" <hyc@symas.com> wrote:

dimitrij.denissenko@blacksquaremedia.com wrote:

Full_Name: Dimitrij Denissenko
Version:
OS: Ubuntu 12.04
URL:
Submission from: (NULL) (62.30.100.0)
Hi,
I found an interesting issue with LMDB. I populated the DB with a bunch of records and it uses ~30M on disk (after sync). Then I added a background process to my app and populated the database again with the same record set. Surprisingly, the resulting size on disk was >70M.
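To make the "~30M" versus ">70M" comparison repeatable, the size can be read programmatically rather than by eye. The following is only a minimal sketch, assuming Linux (where st_blocks is counted in 512-byte units); the helper names file_apparent_bytes and file_allocated_bytes are hypothetical and not part of LMDB. It would be pointed at the environment's data file, normally data.mdb inside the environment directory.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: apparent length of the file in bytes,
 * or -1 if stat() fails (for example, the file does not exist). */
static long long file_apparent_bytes(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Hypothetical helper: bytes actually allocated on disk, or -1 on failure.
 * Assumes Linux, where st_blocks counts 512-byte units. */
static long long file_allocated_bytes(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_blocks * 512;
}
```

Logging both values after each load helps rule out a difference that comes only from holes in the file, since the apparent length and the allocated size can diverge in that case.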
The background process is forked periodically to perform some maintenance tasks, here is my (simplified) code:
/* Close env before forking */
mdb_env_close(env);

if ((childpid = fork()) == 0) {
     /* Child */
     rc = mdb_env_open(env, ".", MDB_NOSYNC, 0644);
     ...
} else {
     /* Parent */
     rc = mdb_env_open(env, ".", MDB_NOSYNC, 0644);
     ...
}
I could narrow it down to the mdb_env_open call in the child. If I add exit(0) before the mdb_env_open line, the DB size remains consistently at ~30M. The data size seems to grow proportionally to the number of forks performed during data load. What could be causing the growth? What can I do to prevent it?
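One way to check the "proportional to the number of forks" observation is to run the child's work synchronously and measure the file after each fork. This is a rough sketch under stated assumptions, not the code from the application: run_in_child and child_task are hypothetical names, and child_task is only a placeholder for whatever the child really does (here, the mdb_env_open and mdb_env_close calls).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for the child's work. In the real test this is
 * where the child would open and close the environment. It returns the
 * integer pointed to by arg, or 0 if arg is NULL. */
static int child_task(void *arg)
{
    return arg ? *(int *)arg : 0;
}

/* Run fn(arg) in a forked child and wait for it to finish.
 * Returns the child's exit status (0..255), or -1 if fork() or waitpid()
 * fails or the child did not exit normally. */
static int run_in_child(int (*fn)(void *), void *arg)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: _exit() skips the parent's atexit handlers and
         * avoids flushing inherited stdio buffers a second time. */
        _exit(fn(arg) & 0xff);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the parent waits for each child before continuing, the data file size can be recorded once per fork, which makes it easier to see whether the growth is a fixed amount per fork or depends on how much was written in between.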
Thanks in advance
PS: I tried it with MDB_FIXEDMAP and without, same result.
Without seeing more of your code, it's impossible to tell. Are you adding the data on both sides of the fork? In the above code snippet, where are your mdb_put calls occurring? Are both the parent and child processes writing identical data?
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
--001a11c2ed64c222e504e2c14561--