openldap-bugs October 2014

openldap-bugs@openldap.org

14 participants
76 discussions

Re: (ITS#7960) [LMDB] wrong pointer used in mdb_cassert()
by jcd＠tribudubois.net 03 Oct '14

On 10/03/2014 10:19 PM, Howard Chu wrote:
> jcd(a)tribudubois.net wrote:
>> Full_Name: Jean-Christophe Dubois
>> Version: 2.4.40
>> OS: Linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (78.235.240.156)
>>
>>
>> In mdb_node_move() csrc is passed to mdb_cassert() at line 7396 when it seems it
>> should be cdst (as the operation is on cdst).
>>
>> https://gitorious.org/mdb/mdb/source/56c2c160be19c555e4c42e459c8608ffaae7b150:libraries/liblmdb/mdb.c#L7396
>>
>
> Irrelevant. The cursor is only passed to provide an env pointer, and
> both cursors point to the same env. Closing this ITS.

Right.

It is just that it would be nicer and more logical, as it is not clear beforehand what the pointer is passed for.

And who knows what additional things could be done in mdb_cassert in the future.

JC

>>
>> Patch available at URL below:
>>
>> https://github.com/jcdubois/lmdb/commit/41ed03c4584ac46dc233dcf60f93addb09962093
>>
>>
>> JC

Re: (ITS#7960) [LMDB] wrong pointer used in mdb_cassert()
by hyc＠symas.com 03 Oct '14

jcd(a)tribudubois.net wrote:
> Full_Name: Jean-Christophe Dubois
> Version: 2.4.40
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (78.235.240.156)
>
>
> In mdb_node_move() csrc is passed to mdb_cassert() at line 7396 when it seems it
> should be cdst (as the operation is on cdst).
>
> https://gitorious.org/mdb/mdb/source/56c2c160be19c555e4c42e459c8608ffaae7b1…

Irrelevant. The cursor is only passed to provide an env pointer, and both cursors point to the same env. Closing this ITS.

>
> Patch available at URL below:
>
> https://github.com/jcdubois/lmdb/commit/41ed03c4584ac46dc233dcf60f93addb099…
>
> JC

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
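The point made in the reply above (the cursor argument only serves to reach the environment) can be sketched in a few lines of C. This is an illustration under stated assumptions, not LMDB's actual code: `DemoEnv`, `DemoTxn`, `DemoCursor`, `demo_assert_fail` and `demo_cassert` are hypothetical stand-ins invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the env/txn/cursor chain. */
typedef struct DemoEnv { const char *name; int failed; } DemoEnv;
typedef struct DemoTxn { DemoEnv *env; } DemoTxn;
typedef struct DemoCursor { DemoTxn *txn; } DemoCursor;

/* Record a failed assertion against the environment and report it. */
void demo_assert_fail(DemoEnv *env, const char *expr) {
    env->failed++;
    fprintf(stderr, "%s: assertion failed: %s\n", env->name, expr);
}

/* The cursor is dereferenced only to find its environment. */
#define demo_cassert(mc, expr) \
    ((expr) ? (void)0 : demo_assert_fail((mc)->txn->env, #expr))

/* Two cursors opened in the same transaction share one environment,
 * so a failing assertion lands in the same place through either. */
int demo_shared_env(void) {
    DemoEnv env = { "demo-env", 0 };
    DemoTxn txn = { &env };
    DemoCursor csrc = { &txn };
    DemoCursor cdst = { &txn };

    assert(csrc.txn->env == cdst.txn->env);
    demo_cassert(&csrc, 1 + 1 == 3);   /* deliberately false */
    demo_cassert(&cdst, 1 + 1 == 3);   /* deliberately false */
    return env.failed;                 /* both failures counted on env */
}
```

In this sketch the choice between `csrc` and `cdst` changes nothing observable, which matches the reply; the reporter's argument is about readability, and about what might happen if the macro ever used more of the cursor than its environment.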

Re: (ITS#7958) LMDB: LIFO-reclaiming, write-performance improvement & bugfixes
by leo＠yuriev.ru 03 Oct '14

2014-10-03 3:13 GMT+04:00 Howard Chu <hyc(a)symas.com>:
>> commit fc409d89e0d9dde20f612e34c2a463c8a81ea000
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-20 06:51:04 +0400
>>
>> EXTENSION - lmdb: more usefull info from mdb_stat tool.
>
>
> A bit ambiguous. me_tail_txnid is actually the ID of the oldest reader, not
> the "last" reader. I'm not convinced of the value of this patch, since you
> can already view the readers list.

I agree that "tail" is NOT the best choice.
But the main value of this patch is not to show the txn of the oldest reader, but to show information about page usage: in particular, the number of pages that are "blocked" by the oldest (laggard) reader, and how many pages are actually available.

2014-10-04 0:04 GMT+04:00 Леонид Юрьев <leo(a)yuriev.ru>:
> Fwd: (ITS#7841) high disk utilization
>
> 2014-10-03 3:13 GMT+04:00 Howard Chu <hyc(a)symas.com>:
>>> commit 841059330fd44769e93eb4b937c3ce42654fad6f
>>> Author: Leo Yuriev <leo(a)yuriev.ru>
>>> Date: 2014-09-20 07:16:15 +0400
>>>
>>> BUGFIX - lmdb: lock meta-pages in writemap-mode to avoid unexpected
>>> write,
>>> before the data pages would be synchronized.
>>>
>>> Without locking the meta-pages may be writen by OS before other
>>> data,
>>> in this case database would be inconsistent.
>>
>>
>> Seems unnecessary. Won't happen by default; could happen with MDB_NOSYNC but
>> that risk is already documented.
>
> We are using the combination:
> envflags writemap nosync lifo
> checkpoint 0 1
>
> If the checkpoint is set in seconds, it gives us the assurance
> consistent state database on disk.
> However, without this patch meta-pages can be written by the kernel
> before the data.
>
> In fact, for a full guarantee in case of death slapd process,
> meta-page should be written explicitly.
> But it requires a lot of changes and I do not do that.
>
>>> commit 0c168d0e63ed78d13df3fc8a42f3667335678639
>>> Author: Leo Yuriev <leo(a)yuriev.ru>
>>> Date: 2014-09-20 10:13:28 +0400
>>>
>>> FEATURE - lmdb: MDB_LIFORECLAIM & MDB_COALESCE modes.
>>>
>>> Reclaim FreeDB in LIFO order - this is a main feature.
>>> Also aim to coalesce small FreeDFB records.
>>
>> Will spend more time looking at this closer.
>
> I would be suggested, but do not insist, review this patch on github.
>
>>> commit 8ddd63161aeb2689822d1a8d27385d62e4e341ae
>>> Author: Leo Yuriev <leo(a)yuriev.ru>
>>> Date: 2014-09-19 22:47:19 +0400
>>>
>>> BUGFIX - lmdb: properly sync meta-pages in mdb_sync_env().
>>>
>>> Meta-pages may be updated during data-syncing in mdb_sync_env(),
>>> in this case database would be inconsistent.
>>>
>>> Check-and-retry if lead txn-id changed during flushing data in
>>> mdb_sync_env().
>>
>> Probably could simplify this, just obtain the write mutex unconditionally,
>> then there's no need to loop or retry. But also, this depends on MDB_NOLOCK
>> - if that's set, then do no locking at all.
>
> I did so for reasons of performance and less a lock retention time.
>
> Retries will be if there an intensive flow of changes.
> In this case it will be a lot of updated pages, the record which will
> take some time.
>
> However, in subsequent iterations (if a transactions had committed
> while there was a record),
> the modified pages will be much fewer, and the sync will be quick.
>
> Thus (and it was seen in tests) even when a substantial amount of the
> transactions,
> usually only two iterations of the cycle,
> without locking and flow of changes are not suspended.
>
>>> commit 147f41a8110f28456bc32123bde86d47183f9c0a
>>> Author: Leo Yuriev <leo(a)yuriev.ru>
>>> Date: 2014-09-04 16:01:15 +0400
>>>
>>> FEATURE - lmdb: implementation of "checkpoint kbytes".
>>>
>>> Force flush when volume of the changes reached a configurable
>>> threshold.
>>
>>
>> Probably OK. Needs some typographical cleanup. Not sure "syncbytes" is a
>> good name.
>
> Agree.
> I just took the first choice and try to retaining the style.
> Ideas?
>
>>> commit fb82a0b688f4c31313d0790415feda8aaa18651c
>>> Author: Leo Yuriev <leo(a)yuriev.ru>
>>> Date: 2014-09-04 15:18:16 +0400
>>>
>>> CHANGE - lmdb-backend: checkpoint-interval in seconds instead of
>>> minutes.
>>
>>
>> Gratuitous change. We used minutes since the BDB backend uses minutes, and
>> the intention was to maintain parallel functionality. What's the
>> justification for this change?
>
> As I had wrote above, we are using the combination:
> envflags writemap nosync lifo
> checkpoint 0 1
>
> If the interval is specified in minutes, then it can not be set less
> than one minute.
> But it's too big amount of time to allow lost the updates.
>
> However, setting the synchronization interval of one second,
> we reduce the amount of losses in the event of an accident to an
> acceptable level,
> while the load on the storage system is acceptable even for a large
> flow of updates.
>
> As a result, I have not found a better solution than simply replace
> the minutes by the seconds.
>
>>> commit fc409d89e0d9dde20f612e34c2a463c8a81ea000
>>> Author: Leo Yuriev <leo(a)yuriev.ru>
>>> Date: 2014-09-20 06:51:04 +0400
>>>
>>> EXTENSION - lmdb: more usefull info from mdb_stat tool.
>>
>>
>> A bit ambiguous. me_tail_txnid is actually the ID of the oldest reader, not
>> the "last" reader. I'm not convinced of the value of this patch, since you
>> can already view the readers list.
>
> I am agree then "tail" is a best choice.
> But the main value of this patch is not to show a txn of oldest
> reader, but to show an info about pages usage.
> Especially the amount of pages which are "blocked" by oldest (laggard)
> reader, and how much pages are actually available.
>
>> --
>> -- Howard Chu
>> CTO, Symas Corp. http://www.symas.com
>> Director, Highland Sun http://highlandsun.com/hyc/
>> Chief Architect, OpenLDAP http://www.openldap.org/project/
>
> Thank you in advance.
> BR.
> Leonid Yuriev.

(ITS#7960) [LMDB] wrong pointer used in mdb_cassert()
by jcd＠tribudubois.net 03 Oct '14

Full_Name: Jean-Christophe Dubois
Version: 2.4.40
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (78.235.240.156)

In mdb_node_move() csrc is passed to mdb_cassert() at line 7396 when it seems it should be cdst (as the operation is on cdst).

https://gitorious.org/mdb/mdb/source/56c2c160be19c555e4c42e459c8608ffaae7b1…

Patch available at URL below:

https://github.com/jcdubois/lmdb/commit/41ed03c4584ac46dc233dcf60f93addb099…

JC

Re: (ITS#7841) high disk utilization
by leo＠yuriev.ru 03 Oct '14

As directed by Kurt Zeilenga (Executive Director, Kurt(a)openldap.org), I have re-submitted this as the new ITS#7958 with an updated IPR statement.

http://www.openldap.org/its/index.cgi/Incoming?id=7958;selectid=7958

Best regards,
Leonid.

Re: (ITS#7959) [LMDB] check fstat return value.
by hyc＠symas.com 03 Oct '14

jcd(a)tribudubois.net wrote:
> Full_Name: Jean-Christophe Dubois
> Version: 2.4.40
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (78.235.240.156)
>
>
> in mdb_env_copyfd0() the function fstat() is called but the return value is not
> checked.
>
> There is no reason not to check if the system call is successful as the result
> is used just after.
>
> Patch available at URL below:
>
> https://github.com/jcdubois/lmdb/commit/fda581a6dd2e56fac4cab2aa872753f6f…

Thanks, fixed in mdb.master. Please read http://www.openldap.org/devel/contributing.html re: submission formats.

>
> JC

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: (ITS#7958) LMDB: LIFO-reclaiming, write-performance improvement & bugfixes
by leo＠yuriev.ru 03 Oct '14

Fwd: (ITS#7841) high disk utilization

2014-10-03 3:13 GMT+04:00 Howard Chu <hyc(a)symas.com>:
>> commit 841059330fd44769e93eb4b937c3ce42654fad6f
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-20 07:16:15 +0400
>>
>> BUGFIX - lmdb: lock meta-pages in writemap-mode to avoid unexpected
>> write,
>> before the data pages would be synchronized.
>>
>> Without locking the meta-pages may be writen by OS before other
>> data,
>> in this case database would be inconsistent.
>
>
> Seems unnecessary. Won't happen by default; could happen with MDB_NOSYNC but
> that risk is already documented.

We are using the combination:
envflags writemap nosync lifo
checkpoint 0 1

If the checkpoint interval is set in seconds, it gives us assurance of a consistent database state on disk.
However, without this patch meta-pages can be written by the kernel before the data.

In fact, for a full guarantee in case the slapd process dies, the meta-page should be written explicitly.
But that requires a lot of changes, and I have not done that.

>> commit 0c168d0e63ed78d13df3fc8a42f3667335678639
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-20 10:13:28 +0400
>>
>> FEATURE - lmdb: MDB_LIFORECLAIM & MDB_COALESCE modes.
>>
>> Reclaim FreeDB in LIFO order - this is a main feature.
>> Also aim to coalesce small FreeDFB records.
>
> Will spend more time looking at this closer.

I would suggest, but do not insist on, reviewing this patch on GitHub.

>> commit 8ddd63161aeb2689822d1a8d27385d62e4e341ae
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-19 22:47:19 +0400
>>
>> BUGFIX - lmdb: properly sync meta-pages in mdb_sync_env().
>>
>> Meta-pages may be updated during data-syncing in mdb_sync_env(),
>> in this case database would be inconsistent.
>>
>> Check-and-retry if lead txn-id changed during flushing data in
>> mdb_sync_env().
>
> Probably could simplify this, just obtain the write mutex unconditionally,
> then there's no need to loop or retry. But also, this depends on MDB_NOLOCK
> - if that's set, then do no locking at all.

I did it this way for performance reasons and to shorten the time the lock is held.

Retries happen only when there is an intensive flow of changes.
In that case there will be a lot of updated pages, and writing them out will take some time.

However, in subsequent iterations (if transactions committed while the write was in progress), there will be far fewer modified pages, and the sync will be quick.

Thus (and this was seen in tests), even with a substantial number of transactions, usually only two iterations of the loop are needed, with no locking, and the flow of changes is not suspended.

>> commit 147f41a8110f28456bc32123bde86d47183f9c0a
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-04 16:01:15 +0400
>>
>> FEATURE - lmdb: implementation of "checkpoint kbytes".
>>
>> Force flush when volume of the changes reached a configurable
>> threshold.
>
>
> Probably OK. Needs some typographical cleanup. Not sure "syncbytes" is a
> good name.

Agreed.
I just took the first option and tried to retain the style.
Ideas?

>> commit fb82a0b688f4c31313d0790415feda8aaa18651c
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-04 15:18:16 +0400
>>
>> CHANGE - lmdb-backend: checkpoint-interval in seconds instead of
>> minutes.
>
>
> Gratuitous change. We used minutes since the BDB backend uses minutes, and
> the intention was to maintain parallel functionality. What's the
> justification for this change?

As I wrote above, we are using the combination:
envflags writemap nosync lifo
checkpoint 0 1

If the interval is specified in minutes, it cannot be set to less than one minute.
But that is too long a window in which updates could be lost.

However, by setting the synchronization interval to one second, we reduce the amount of loss in the event of a crash to an acceptable level, while the load on the storage system remains acceptable even for a large flow of updates.

As a result, I have not found a better solution than simply replacing minutes with seconds.

>> commit fc409d89e0d9dde20f612e34c2a463c8a81ea000
>> Author: Leo Yuriev <leo(a)yuriev.ru>
>> Date: 2014-09-20 06:51:04 +0400
>>
>> EXTENSION - lmdb: more usefull info from mdb_stat tool.
>
>
> A bit ambiguous. me_tail_txnid is actually the ID of the oldest reader, not
> the "last" reader. I'm not convinced of the value of this patch, since you
> can already view the readers list.

I agree that "tail" is not the best choice.
But the main value of this patch is not to show the txn of the oldest reader, but to show information about page usage: in particular, the number of pages that are "blocked" by the oldest (laggard) reader, and how many pages are actually available.

> --
> -- Howard Chu
> CTO, Symas Corp. http://www.symas.com
> Director, Highland Sun http://highlandsun.com/hyc/
> Chief Architect, OpenLDAP http://www.openldap.org/project/

Thank you in advance.
BR.
Leonid Yuriev.
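The check-and-retry idea discussed in this message can be sketched as follows. This is a single-threaded simulation under stated assumptions, not the patch itself: `SyncEnv`, `flush_data` and `sync_with_retry` are hypothetical names, and a commit that races with the flush is modelled by a counter rather than by a real writer thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an environment; not LMDB's MDB_env. */
typedef struct SyncEnv {
    size_t last_txnid;       /* id of the most recently committed txn */
    size_t synced_txnid;     /* id known to be fully flushed to disk */
    int    flushes;          /* number of data flushes performed */
    int    pending_commits;  /* simulated commits landing during a flush */
} SyncEnv;

/* Simulated data flush; a writer may commit while it is running. */
void flush_data(SyncEnv *env) {
    env->flushes++;
    if (env->pending_commits > 0) {
        env->pending_commits--;
        env->last_txnid++;
    }
}

/* Flush without taking the writer lock, and flush again whenever the
 * lead txn id changed while the previous flush was in progress. */
void sync_with_retry(SyncEnv *env) {
    size_t seen;
    do {
        seen = env->last_txnid;            /* snapshot before flushing */
        flush_data(env);
    } while (seen != env->last_txnid);     /* a commit raced with us */
    env->synced_txnid = seen;
}
```

With one commit racing the first flush, the loop runs twice, which is the "usually only two iterations" behaviour described above. The alternative suggested in the quoted reply, taking the write mutex unconditionally, would remove the loop at the cost of blocking writers for the duration of the flush.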

(ITS#7959) [LMDB] check fstat return value.
by jcd＠tribudubois.net 03 Oct '14

Full_Name: Jean-Christophe Dubois
Version: 2.4.40
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (78.235.240.156)

in mdb_env_copyfd0() the function fstat() is called but the return value is not checked.

There is no reason not to check if the system call is successful as the result is used just after.

Patch available at URL below:

https://github.com/jcdubois/lmdb/commit/fda581a6dd2e56fac4cab2aa872753f6f…

JC
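The kind of check the report asks for can be sketched with plain POSIX calls. This is a minimal illustration, not the patch referenced above: `file_size` is a hypothetical helper, and returning the `errno` value is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: store the size of the file behind fd in *size.
 * Returns 0 on success, or the errno value set by fstat() on failure. */
int file_size(int fd, off_t *size) {
    struct stat st;

    if (fstat(fd, &st) != 0)
        return errno;   /* st is not valid on failure: do not read it */
    *size = st.st_size;
    return 0;
}
```

Calling `file_size(-1, &size)` fails with `EBADF` and leaves `size` untouched, whereas unchecked code would go on to use whatever happened to be in the uninitialised `struct stat`.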

(ITS#7958) LMDB: LIFO-reclaiming, write-performance improvement & bugfixes
by leo＠yuriev.ru 03 Oct '14

Full_Name: Leonid Yuriev
Version: 2.4.40
OS: RHEL7
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (31.130.36.33)

Solution for: ITS#7841 and "OpenLDAP + LMDB Back-End - request 300719-14-EXO"

When LMDB is used as a backend under heavy load with add/modify/delete transactions, a huge number of disk writes is generated.

In general, this patchset improves write performance by a factor of 10-100, at the cost of on-disk consistency lagging by up to one second.

1. Adds a configurable LIFO policy for reclaiming FreeDB records.
   As a result, only a small subset of pages is updated and repeatedly rewritten on disk, which allows the storage subsystem to combine these writes effectively. Write performance improves by up to 100 times with a write-back cache or in "writemap" mode.

2. Consistent checkpoints with one-second granularity.
   For example, the following settings become possible and are very useful:
   envflags writemap nosync lifo
   checkpoint 0 1

3. Related bugfixes and minor extensions.

--

The attached files is derived from OpenLDAP Software. All of the modifications to OpenLDAP Software represented in the following patch(es) were developed by Peter-Service LLC, Moscow, Russia. Peter-Service LLC has not assigned rights and/or interest in this work to any party. I, Leonid Yuriev am authorized by Peter-Service LLC, my employer, to release this work under the following terms.

Peter-Service LLC hereby places the following modifications to OpenLDAP Software (and only these modifications) into the public domain. Hence, these modifications may be freely used and/or redistributed for any purpose with or without attribution and/or other notice.

commit 841059330fd44769e93eb4b937c3ce42654fad6f
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-20 07:16:15 +0400

    BUGFIX - lmdb: lock meta-pages in writemap-mode to avoid unexpected write, before the data pages would be synchronized.

    Without locking the meta-pages may be writen by OS before other data, in this case database would be inconsistent.

commit 6240c3350e8bd86337c7e41722cf6a38881f15e7
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-12 01:32:13 +0400

    BUGFIX - lmdb: reordering of instructions which update the txn in a meta-page.

    Without "volatile" or memory-barrier compiler may reorder instructions for update the "mm_txnid" field in meta-page in "writemap" mode. From the reader's point of view this cause a short time interval when the transaction is corrupted.

commit accef62de7fe5660f870f4c5da319a2a8098b2fb
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-21 02:29:50 +0400

    BUGFIX - lmdb: 'volatile' to important fields, which may be updated by readers asynchronously.

    Without 'volatile' compiler may eliminate a mdb_find_oldest() calls.

commit bb83e03cf1b8bceee64550229c3becbdd5400680
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-19 20:18:17 +0400

    FEATURE - lmdb-backend: support config for 'lifo' and 'coalesce' envflags.

commit 0c168d0e63ed78d13df3fc8a42f3667335678639
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-20 10:13:28 +0400

    FEATURE - lmdb: MDB_LIFORECLAIM & MDB_COALESCE modes.

    Reclaim FreeDB in LIFO order - this is a main feature.
    Also aim to coalesce small FreeDFB records.

commit 8ddd63161aeb2689822d1a8d27385d62e4e341ae
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-19 22:47:19 +0400

    BUGFIX - lmdb: properly sync meta-pages in mdb_sync_env().

    Meta-pages may be updated during data-syncing in mdb_sync_env(), in this case database would be inconsistent.

    Check-and-retry if lead txn-id changed during flushing data in mdb_sync_env().

commit 908677f989588d06b9f00620576dea3c5c8675d7
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-04 16:10:05 +0400

    FEATURE - lmdb-backend: support for "checkpoint kbytes" config-option.

commit 147f41a8110f28456bc32123bde86d47183f9c0a
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-04 16:01:15 +0400

    FEATURE - lmdb: implementation of "checkpoint kbytes".

    Force flush when volume of the changes reached a configurable threshold.

commit fb82a0b688f4c31313d0790415feda8aaa18651c
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-04 15:18:16 +0400

    CHANGE - lmdb-backend: checkpoint-interval in seconds instead of minutes.

commit fc409d89e0d9dde20f612e34c2a463c8a81ea000
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-20 06:51:04 +0400

    EXTENSION - lmdb: more usefull info from mdb_stat tool.

commit ccc7da690ffbff440643295b945fdf7886f48c97
Author: Leo Yuriev <leo(a)yuriev.ru>
Date: 2014-09-05 00:19:16 +0400

    TRIVIA - lmdb: clean testdb-dir while "make test".
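The claim that LIFO reclaiming keeps rewriting "only a small subset of pages" can be illustrated with a toy model. This is a sketch under loud assumptions and not the patch's algorithm: it models one live page updated copy-on-write, a flat free list of single pages, no readers, and no txn-id eligibility rules. `ReclaimPolicy` and `distinct_pages_written` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { NPAGES = 9 };   /* page 0 starts live, pages 1..8 start free */
typedef enum { RECLAIM_FIFO, RECLAIM_LIFO } ReclaimPolicy;

/* Run ntxns copy-on-write updates of a single page and return how
 * many distinct pages were written under the given reclaim policy. */
size_t distinct_pages_written(ReclaimPolicy policy, size_t ntxns) {
    size_t freelist[NPAGES];
    size_t nfree = 0;
    int written[NPAGES];
    size_t live = 0, distinct = 0;

    memset(written, 0, sizeof written);
    for (size_t p = 1; p < NPAGES; p++)
        freelist[nfree++] = p;

    for (size_t t = 0; t < ntxns; t++) {
        size_t page;
        if (policy == RECLAIM_LIFO) {
            page = freelist[--nfree];   /* most recently freed page */
        } else {
            page = freelist[0];         /* oldest freed page */
            memmove(freelist, freelist + 1,
                    (nfree - 1) * sizeof freelist[0]);
            nfree--;
        }
        if (!written[page]) {
            written[page] = 1;
            distinct++;
        }
        freelist[nfree++] = live;       /* the old copy becomes free */
        live = page;
    }
    return distinct;
}
```

In this model 100 updates touch all 9 pages under FIFO but only 2 under LIFO, so a write-back cache or the kernel's page cache has far fewer distinct dirty pages to flush. Real reclaiming is constrained by active readers, so the actual ratio depends on the workload.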

Re: (ITS#7957) [LMDB] critical error after compacting an empty database
by quanah＠zimbra.com 02 Oct '14

--On Friday, October 03, 2014 2:45 AM +0000 engin.lee(a)gmail.com wrote:

> Full_Name: Engin Lee
> Version: LMDB 0.9.14 2014/9/30
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (59.124.230.221)

Likely <http://www.openldap.org/its/index.cgi/?findid=7956>? This was just fixed in mdb.master.

--Quanah

--
Quanah Gibson-Mount
Server Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
