openldap-bugs

openldap-bugs@openldap.org


Re: (ITS#7842) mdb readers/writer exclusion protocol, as implemented, is racy.
by rsbx＠acm.org 17 May '14

On 05/16/2014 08:09 PM, Howard Chu wrote:
> rsbx(a)acm.org wrote:
>> Full_Name: Raymond S Brand
>> Version: mdb.master
>> OS: linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (107.145.137.13)
>>
>>
>> [I would have uploaded this to the ftp.openldap.org but there was no
>> space.]
>>
>> As implemented, the mdb readers/writer exclusion protocol has a race
>> condition that could result in a writer reclaiming and over-writing
>> pages still in use by a reader.
>
> Yes, you're basically correct. But this is unlikely in practice because
> the reader thread has no blocking calls in its codepath, while the
> writer must acquire locks etc. Generally the only way for two write txns
> to complete while a reader thread is stalled is if you explicitly send a
> SIGSTOP to the reader thread while it's in its critical section.

Or just unlucky process scheduling.

>> Pseudo code of the snapshot locking protocols of reader and writer
>> transactions. The labeled sections, for the purposes of this analysis,
>> can be assumed to execute atomically.
>>
>> READER
>> ======
>>
>> *R1*    t0 = meta[0].txid; t1 = meta[1].txid
>>         if (t1 > t0)
>>                 t0 = t1
>>
>> *R2*    readers[me].txid = t0
>>
>> *R3*    snapshot_root = meta[t0&1].snapshot
>>
>> *R4*    /* lookup data */
>>
>> *R5*    readers[me].txid = -1   // Release snapshot
>>
>> WRITER
>> ======
>>
>> *W1*    lock(writer, EXCLUSIVE)
>>         curr_txid = meta[0].txid
>>         if (meta[1].txid > curr_txid)
>>                 curr_txid = meta[1].txid
>>
>> *W2*    oldest = curr_txid
>>         for (i=0; i<reader_slots; i++)
>>                 t = readers[i].txid
>>                 if (t != -1 && t < oldest)
>>                         oldest = t
>>
>> *W3*    reclaim_old_pages(oldest)
>>
>>         /*
>>         ** Commit new transaction
>>         */
>>
>> *W4*    unlock(writer)
>>
>> ---------------------------------------------------------------------------
>>
>> Adversarial scheduling analysis:
>>
>> The following timeline demonstrates that a writer can reclaim pages in
>> use by a reader.
>>
>> T0   T1   T2   T3   T4      T5   T6   T7   T8   T9   T10
>> R1                                    R2   R3        R4
>>      W1   W2   W3   W4   *  W1   W2             W3
>>
>> T0      Reader has found the latest txid, Xn.
>>
>> T1->T4  Reader is not scheduled to run and a write transaction
>>         commits, Xn+1.
>>
>> T5      Reader is still not scheduled and another write transaction
>>         starts.
>>
>> T6      Writer finds that the oldest referenced transaction is the last
>>         committed transaction, Xn+1.
>>
>> T7      Reader records Xn in the reader table.
>>
>> T8      Reader gets the snapshot root page for transaction Xn.
>>
>> T9      Writer, believing that Xn+1 is the oldest reference transaction,
>>         reclaims the pages of transaction Xn.
>>
>> T10     Reader attempts to navigate pages of a transaction, Xn, that
>>         has been reclaimed and "Bad Things Happen".
>>
>> In fact, bad things will happen if an even number of write
>> transactions commit between W4 and W1, represented by the '*', in the
>> timeline above.
>>
>> The fix is to search for the highest txid in R1 and between R2 and R3.
>> Optionally, followed by recording the txid, of the snapshot root found
>> in R3, in the reader's reader table slot to, possibly, increase the
>> number of reclaimable transactions.
>>
>> The lack of compiler and memory barriers in the implementation of the
>> locking protocol is also of concern.
>>
>> Beyond the above, the code in mdb_txn_renew0() after the
>> "/* Copy the DB info and flags */"
>> comment appears to have a number of data races.
>>
>> ---
>>  libraries/liblmdb/mdb.c | 24 +++++++++++++++---------
>>  1 file changed, 15 insertions(+), 9 deletions(-)
>>
>> diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
>> index e0f551e..908417c 100644
>> --- a/libraries/liblmdb/mdb.c
>> +++ b/libraries/liblmdb/mdb.c
>> @@ -533,11 +533,11 @@ typedef struct MDB_rxbody {
>>       *  started from so we can avoid overwriting any data used in that
>>       *  particular version.
>>       */
>> -    txnid_t     mrb_txnid;
>> +    volatile txnid_t        mrb_txnid;
>>      /** The process ID of the process owning this reader txn. */
>> -    MDB_PID_T   mrb_pid;
>> +    volatile MDB_PID_T  mrb_pid;
>>      /** The thread ID of the thread owning this txn. */
>> -    pthread_t   mrb_tid;
>> +    volatile pthread_t  mrb_tid;
>>  } MDB_rxbody;
>>
>>  /** The actual reader record, with cacheline padding. */
>> @@ -585,12 +585,12 @@ typedef struct MDB_txbody {
>>       *  This is recorded here only for convenience; the value can always
>>       *  be determined by reading the main database meta pages.
>>       */
>> -    txnid_t     mtb_txnid;
>> +    volatile txnid_t        mtb_txnid;
>>      /** The number of slots that have been used in the reader table.
>>       *  This always records the maximum count, it is not decremented
>>       *  when readers release their slots.
>>       */
>> -    unsigned    mtb_numreaders;
>> +    volatile unsigned   mtb_numreaders;
>>  } MDB_txbody;
>>
>>  /** The actual reader table definition. */
>> @@ -854,7 +854,7 @@ typedef struct MDB_meta {
>>      /** Any persistent environment flags. @ref mdb_env */
>>  #define mm_flags    mm_dbs[0].md_flags
>>      pgno_t      mm_last_pg;     /**< last used page in file */
>> -    txnid_t     mm_txnid;       /**< txnid that committed this page */
>> +    volatile txnid_t    mm_txnid;   /**< txnid that committed this page */
>>  } MDB_meta;
>>
>>  /** Buffer for a stack-allocated meta page.
>> @@ -2303,10 +2303,12 @@ mdb_txn_renew0(MDB_txn *txn)
>>
>>      if (txn->mt_flags & MDB_TXN_RDONLY) {
>>          if (!ti) {
>> +            /* No readers table; app responsible for locking */
>>              meta = env->me_metas[ mdb_env_pick_meta(env) ];
>>              txn->mt_txnid = meta->mm_txnid;
>>              txn->mt_u.reader = NULL;
>>          } else {
>> +            /* Has readers table */
>>              MDB_reader *r = (env->me_flags & MDB_NOTLS) ? txn->mt_u.reader :
>>                  pthread_getspecific(env->me_txkey);
>>              if (r) {
>> @@ -2347,8 +2349,12 @@ mdb_txn_renew0(MDB_txn *txn)
>>                      return rc;
>>                  }
>>              }
>> -            txn->mt_txnid = r->mr_txnid = ti->mti_txnid;
>>              txn->mt_u.reader = r;
>> +            r->mr_txnid = ti->mti_txnid;
>> +
>> +            /* Really need a memory barrier here */
>> +
>> +            txn->mt_txnid = r->mr_txnid = ti->mti_txnid;
>>              meta = env->me_metas[txn->mt_txnid & 1];
>>          }
>>      } else {
>> @@ -2376,10 +2382,10 @@ mdb_txn_renew0(MDB_txn *txn)
>>      }
>>
>>      /* Copy the DB info and flags */
>> -    memcpy(txn->mt_dbs, meta->mm_dbs, 2 * sizeof(MDB_db));
>> +    memcpy(txn->mt_dbs, meta->mm_dbs, 2 * sizeof(MDB_db)); /* Racy */
>>
>>      /* Moved to here to avoid a data race in read TXNs */
>> -    txn->mt_next_pgno = meta->mm_last_pg+1;
>> +    txn->mt_next_pgno = meta->mm_last_pg+1;         /* Racy */
>>
>>      for (i=2; i<txn->mt_numdbs; i++) {
>>          x = env->me_dbflags[i];
>>

Re: (ITS#7842) mdb readers/writer exclusion protocol, as implemented, is racy.
by hyc＠symas.com 16 May '14

rsbx(a)acm.org wrote:
> Full_Name: Raymond S Brand
> Version: mdb.master
> OS: linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (107.145.137.13)
>
>
> [I would have uploaded this to the ftp.openldap.org but there was no space.]
>
> As implemented, the mdb readers/writer exclusion protocol has a race condition
> that could result in a writer reclaiming and over-writing pages still in use
> by a reader.

Yes, you're basically correct. But this is unlikely in practice because the
reader thread has no blocking calls in its codepath, while the writer must
acquire locks etc. Generally the only way for two write txns to complete while
a reader thread is stalled is if you explicitly send a SIGSTOP to the reader
thread while it's in its critical section.

> Pseudo code of the snapshot locking protocols of reader and writer
> transactions. The labeled sections, for the purposes of this analysis, can be
> assumed to execute atomically.
>
> READER
> ======
>
> *R1*    t0 = meta[0].txid; t1 = meta[1].txid
>         if (t1 > t0)
>                 t0 = t1
>
> *R2*    readers[me].txid = t0
>
> *R3*    snapshot_root = meta[t0&1].snapshot
>
> *R4*    /* lookup data */
>
> *R5*    readers[me].txid = -1   // Release snapshot
>
> WRITER
> ======
>
> *W1*    lock(writer, EXCLUSIVE)
>         curr_txid = meta[0].txid
>         if (meta[1].txid > curr_txid)
>                 curr_txid = meta[1].txid
>
> *W2*    oldest = curr_txid
>         for (i=0; i<reader_slots; i++)
>                 t = readers[i].txid
>                 if (t != -1 && t < oldest)
>                         oldest = t
>
> *W3*    reclaim_old_pages(oldest)
>
>         /*
>         ** Commit new transaction
>         */
>
> *W4*    unlock(writer)
>
> ---------------------------------------------------------------------------
>
> Adversarial scheduling analysis:
>
> The following timeline demonstrates that a writer can reclaim pages in use by a
> reader.
>
> T0   T1   T2   T3   T4      T5   T6   T7   T8   T9   T10
> R1                                    R2   R3        R4
>      W1   W2   W3   W4   *  W1   W2             W3
>
> T0      Reader has found the latest txid, Xn.
>
> T1->T4  Reader is not scheduled to run and a write transaction commits, Xn+1.
>
> T5      Reader is still not scheduled and another write transaction starts.
>
> T6      Writer finds that the oldest referenced transaction is the last
>         committed transaction, Xn+1.
>
> T7      Reader records Xn in the reader table.
>
> T8      Reader gets the snapshot root page for transaction Xn.
>
> T9      Writer, believing that Xn+1 is the oldest reference transaction,
>         reclaims the pages of transaction Xn.
>
> T10     Reader attempts to navigate pages of a transaction, Xn, that has been
>         reclaimed and "Bad Things Happen".
>
> In fact, bad things will happen if an even number of write transactions commit
> between W4 and W1, represented by the '*', in the timeline above.
>
> The fix is to search for the highest txid in R1 and between R2 and R3.
> Optionally, followed by recording the txid, of the snapshot root found in R3,
> in the reader's reader table slot to, possibly, increase the number of
> reclaimable transactions.
>
> The lack of compiler and memory barriers in the implementation of the locking
> protocol is also of concern.
>
> Beyond the above, the code in mdb_txn_renew0() after the
> "/* Copy the DB info and flags */"
> comment appears to have a number of data races.
>
> ---
>  libraries/liblmdb/mdb.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
> index e0f551e..908417c 100644
> --- a/libraries/liblmdb/mdb.c
> +++ b/libraries/liblmdb/mdb.c
> @@ -533,11 +533,11 @@ typedef struct MDB_rxbody {
>       *  started from so we can avoid overwriting any data used in that
>       *  particular version.
>       */
> -    txnid_t     mrb_txnid;
> +    volatile txnid_t        mrb_txnid;
>      /** The process ID of the process owning this reader txn. */
> -    MDB_PID_T   mrb_pid;
> +    volatile MDB_PID_T  mrb_pid;
>      /** The thread ID of the thread owning this txn. */
> -    pthread_t   mrb_tid;
> +    volatile pthread_t  mrb_tid;
>  } MDB_rxbody;
>
>  /** The actual reader record, with cacheline padding. */
> @@ -585,12 +585,12 @@ typedef struct MDB_txbody {
>       *  This is recorded here only for convenience; the value can always
>       *  be determined by reading the main database meta pages.
>       */
> -    txnid_t     mtb_txnid;
> +    volatile txnid_t        mtb_txnid;
>      /** The number of slots that have been used in the reader table.
>       *  This always records the maximum count, it is not decremented
>       *  when readers release their slots.
>       */
> -    unsigned    mtb_numreaders;
> +    volatile unsigned   mtb_numreaders;
>  } MDB_txbody;
>
>  /** The actual reader table definition. */
> @@ -854,7 +854,7 @@ typedef struct MDB_meta {
>      /** Any persistent environment flags. @ref mdb_env */
>  #define mm_flags    mm_dbs[0].md_flags
>      pgno_t      mm_last_pg;     /**< last used page in file */
> -    txnid_t     mm_txnid;       /**< txnid that committed this page */
> +    volatile txnid_t    mm_txnid;   /**< txnid that committed this page */
>  } MDB_meta;
>
>  /** Buffer for a stack-allocated meta page.
> @@ -2303,10 +2303,12 @@ mdb_txn_renew0(MDB_txn *txn)
>
>      if (txn->mt_flags & MDB_TXN_RDONLY) {
>          if (!ti) {
> +            /* No readers table; app responsible for locking */
>              meta = env->me_metas[ mdb_env_pick_meta(env) ];
>              txn->mt_txnid = meta->mm_txnid;
>              txn->mt_u.reader = NULL;
>          } else {
> +            /* Has readers table */
>              MDB_reader *r = (env->me_flags & MDB_NOTLS) ? txn->mt_u.reader :
>                  pthread_getspecific(env->me_txkey);
>              if (r) {
> @@ -2347,8 +2349,12 @@ mdb_txn_renew0(MDB_txn *txn)
>                      return rc;
>                  }
>              }
> -            txn->mt_txnid = r->mr_txnid = ti->mti_txnid;
>              txn->mt_u.reader = r;
> +            r->mr_txnid = ti->mti_txnid;
> +
> +            /* Really need a memory barrier here */
> +
> +            txn->mt_txnid = r->mr_txnid = ti->mti_txnid;
>              meta = env->me_metas[txn->mt_txnid & 1];
>          }
>      } else {
> @@ -2376,10 +2382,10 @@ mdb_txn_renew0(MDB_txn *txn)
>      }
>
>      /* Copy the DB info and flags */
> -    memcpy(txn->mt_dbs, meta->mm_dbs, 2 * sizeof(MDB_db));
> +    memcpy(txn->mt_dbs, meta->mm_dbs, 2 * sizeof(MDB_db)); /* Racy */
>
>      /* Moved to here to avoid a data race in read TXNs */
> -    txn->mt_next_pgno = meta->mm_last_pg+1;
> +    txn->mt_next_pgno = meta->mm_last_pg+1;         /* Racy */
>
>      for (i=2; i<txn->mt_numdbs; i++) {
>          x = env->me_dbflags[i];
>

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: (ITS#7844) LMDB Delete Cursor inconsistencies
by armon.dadgar＠gmail.com 16 May '14

I checked on gitorious only hours previously and fca18d2 was the latest version.
However, I now see 2764360. I’ve updated to the latest version and can no longer
reproduce the issue.

Thanks!

Best Regards,
Armon Dadgar

From: Howard Chu hyc@symas.com
Reply: Howard Chu hyc@symas.com
Date: May 16, 2014 at 1:54:18 PM
To: Armon Dadgar armon.dadgar@gmail.com, openldap-its@openldap.org openldap-its@openldap.org
Subject: Re: (ITS#7844) LMDB Delete Cursor inconsistencies

Armon Dadgar wrote:
> When I run the attached test case on my machine, I’m hitting the failing case.
>
> Here is the test output: https://gist.github.com/armon/e529d7909fe301126fc6

You seem to be running obsolete code. I checked your gomdb github repo, you're
using mdb.c at rev fca18d2. Current mdb.master is 4844a72.

> My steps:
> $ clang 7844.c mdb.c midl.c
> $ mkdir testdb
> $ ./a.out
>
> $ clang -v
> Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
> Target: x86_64-apple-darwin13.1.0
> Thread model: posix
>
> $ uname -a
> Darwin Armons-MacBook-Air.local 13.1.0 Darwin Kernel Version 13.1.0: Thu Jan
> 16 19:40:37 PST 2014; root:xnu-2422.90.20~2/RELEASE_X86_64 x86_64
>
> Best Regards,
> Armon Dadgar
>
> From: Howard Chu hyc@symas.com <mailto:hyc@symas.com>
> Reply: Howard Chu hyc@symas.com <mailto:hyc@symas.com>
> Date: May 16, 2014 at 11:06:49 AM
> To: armon.dadgar@gmail.com armon.dadgar@gmail.com
> <mailto:armon.dadgar@gmail.com>, openldap-its@openldap.org
> openldap-its@openldap.org <mailto:openldap-its@openldap.org>
> Subject: Re: (ITS#7844) LMDB Delete Cursor inconsistencies
>
>> armon.dadgar@gmail.com wrote:
>> > For now, we have application code to retry the delete until no further
>> > rows are removed.
>> > Still, it would be nice to have this resolved (and tested) in master!
>>
>> Unable to reproduce the issue. I've attached my test program based on your
>> description.
>>
>> --
>> -- Howard Chu
>> CTO, Symas Corp. http://www.symas.com
>> Director, Highland Sun http://highlandsun.com/hyc/
>> Chief Architect, OpenLDAP http://www.openldap.org/project/
>> ------------------------------------------------------------------------------

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: (ITS#7844) LMDB Delete Cursor inconsistencies
by hyc＠symas.com 16 May '14

Armon Dadgar wrote:
> When I run the attached test case on my machine, I'm hitting the failing case.
>
> Here is the test output: https://gist.github.com/armon/e529d7909fe301126fc6

You seem to be running obsolete code. I checked your gomdb github repo, you're
using mdb.c at rev fca18d2. Current mdb.master is 4844a72.

> My steps:
> $ clang 7844.c mdb.c midl.c
> $ mkdir testdb
> $ ./a.out
>
> $ clang -v
> Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
> Target: x86_64-apple-darwin13.1.0
> Thread model: posix
>
> $ uname -a
> Darwin Armons-MacBook-Air.local 13.1.0 Darwin Kernel Version 13.1.0: Thu Jan
> 16 19:40:37 PST 2014; root:xnu-2422.90.20~2/RELEASE_X86_64 x86_64
>
> Best Regards,
> Armon Dadgar
>
> From: Howard Chu hyc(a)symas.com <mailto:hyc@symas.com>
> Reply: Howard Chu hyc(a)symas.com <mailto:hyc@symas.com>
> Date: May 16, 2014 at 11:06:49 AM
> To: armon.dadgar(a)gmail.com armon.dadgar(a)gmail.com
> <mailto:armon.dadgar@gmail.com>, openldap-its(a)openldap.org
> openldap-its(a)openldap.org <mailto:openldap-its@openldap.org>
> Subject: Re: (ITS#7844) LMDB Delete Cursor inconsistencies
>
>> armon.dadgar(a)gmail.com wrote:
>> > For now, we have application code to retry the delete until no further
>> > rows are removed.
>> > Still, it would be nice to have this resolved (and tested) in master!
>>
>> Unable to reproduce the issue. I've attached my test program based on your
>> description.
>>
>> --
>> -- Howard Chu
>> CTO, Symas Corp. http://www.symas.com
>> Director, Highland Sun http://highlandsun.com/hyc/
>> Chief Architect, OpenLDAP http://www.openldap.org/project/
>> ------------------------------------------------------------------------------

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: (ITS#7844) LMDB Delete Cursor inconsistencies
by armon.dadgar＠gmail.com 16 May '14

When I run the attached test case on my machine, I’m hitting the failing case.

Here is the test output: https://gist.github.com/armon/e529d7909fe301126fc6

My steps:
$ clang 7844.c mdb.c midl.c
$ mkdir testdb
$ ./a.out

$ clang -v
Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix

$ uname -a
Darwin Armons-MacBook-Air.local 13.1.0 Darwin Kernel Version 13.1.0: Thu Jan 16 19:40:37 PST 2014; root:xnu-2422.90.20~2/RELEASE_X86_64 x86_64

Best Regards,
Armon Dadgar

From: Howard Chu hyc@symas.com
Reply: Howard Chu hyc@symas.com
Date: May 16, 2014 at 11:06:49 AM
To: armon.dadgar@gmail.com armon.dadgar@gmail.com, openldap-its@openldap.org openldap-its@openldap.org
Subject: Re: (ITS#7844) LMDB Delete Cursor inconsistencies

armon.dadgar@gmail.com wrote:
> For now, we have application code to retry the delete until no further
> rows are removed.
> Still, it would be nice to have this resolved (and tested) in master!

Unable to reproduce the issue. I've attached my test program based on your
description.

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Re: (ITS#7844) LMDB Delete Cursor inconsistencies
by hyc＠symas.com 16 May '14

armon.dadgar(a)gmail.com wrote:
> For now, we have application code to retry the delete until no further
> rows are removed.
> Still, it would be nice to have this resolved (and tested) in master!

Unable to reproduce the issue. I've attached my test program based on your
description.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Attachment: 7844.c.txt

/* 7844.c - memory-mapped database tester/toy */
/*
 * Copyright 2014 Howard Chu, Symas Corp.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted only as authorized by the OpenLDAP
 * Public License.
 *
 * A copy of this license is available in the file LICENSE in the
 * top-level directory of the distribution or, alternatively, at
 * <http://www.OpenLDAP.org/license.html>.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <uuid/uuid.h>
#include "lmdb.h"

#define E(expr) CHECK((rc = (expr)) == MDB_SUCCESS, #expr)
#define RES(err, expr) ((rc = expr) == (err) || (CHECK(!rc, #expr), 0))
#define CHECK(test, msg) ((test) ? (void)0 : ((void)fprintf(stderr, \
	"%s:%d: %s: %s\n", __FILE__, __LINE__, msg, mdb_strerror(rc)), abort()))

#define UUID_LEN	36
typedef char kval[UUID_LEN*2+2];
char sval[100];

int main(int argc,char * argv[])
{
	int i = 0, j = 0, rc;
	MDB_env *env;
	MDB_dbi kdbi, ddbi;
	MDB_val key, ikey, data;
	MDB_txn *txn;
	MDB_stat mst;
	MDB_cursor *cursor, *cur2;
	int count;
	unsigned long lastid = 0;
	kval *keys;
	char prefix[UUID_LEN+1], *ptr;
	uuid_t uuid;

	count = 128;
	keys = (kval *)malloc(count*sizeof(kval));

	E(mdb_env_create(&env));
	E(mdb_env_set_mapsize(env, 128*1024*1024));
	E(mdb_env_set_maxdbs(env, 4));
	E(mdb_env_open(env, "./testdb", MDB_NOSYNC, 0664));
	E(mdb_txn_begin(env, NULL, 0, &txn));
	E(mdb_open(txn, "kvs", MDB_CREATE|MDB_INTEGERKEY, &ddbi));
	E(mdb_open(txn, "kvs_id_idx", MDB_CREATE, &kdbi));

	key.mv_size = UUID_LEN*2+1;
	data.mv_size = sizeof(sval);
	data.mv_data = sval;
	ikey.mv_size = sizeof(lastid);
	ikey.mv_data = &lastid;

	uuid_generate(uuid);
	uuid_unparse(uuid, prefix);
	printf("Adding %d values\n", count);
	for (i=0;i<count;i++) {
		uuid_generate(uuid);
		ptr = (char *)&keys[i];
		memcpy(ptr, prefix, UUID_LEN);
		ptr[UUID_LEN] = '/';
		uuid_unparse(uuid, ptr+UUID_LEN+1);
		ptr[UUID_LEN*2+1] = '\0';
		printf("\t%s\n", ptr);
		key.mv_data = ptr;
		sprintf(sval, "Some stuff for key %s", ptr);
		lastid++;
		if (RES(MDB_KEYEXIST, mdb_put(txn, kdbi, &key, &ikey, MDB_NOOVERWRITE))) {
			j++;
			ikey.mv_size = sizeof(lastid);
			ikey.mv_data = &lastid;
		}
		E(mdb_put(txn, ddbi, &ikey, &data, MDB_NOOVERWRITE));
	}
	if (j) printf("%d duplicates skipped\n", j);
	E(mdb_txn_commit(txn));
	E(mdb_env_stat(env, &mst));

	uuid_generate(uuid);
	uuid_unparse(uuid, prefix);
	printf("Adding %d values\n", count);
	E(mdb_txn_begin(env, NULL, 0, &txn));
	for (i=0;i<count;i++) {
		uuid_generate(uuid);
		ptr = (char *)&keys[i];
		memcpy(ptr, prefix, UUID_LEN);
		ptr[UUID_LEN] = '/';
		uuid_unparse(uuid, ptr+UUID_LEN+1);
		ptr[UUID_LEN*2+1] = '\0';
		printf("\t%s\n", ptr);
		key.mv_data = ptr;
		sprintf(sval, "Some stuff for key %s", ptr);
		lastid++;
		if (RES(MDB_KEYEXIST, mdb_put(txn, kdbi, &key, &ikey, MDB_NOOVERWRITE))) {
			j++;
			ikey.mv_size = sizeof(lastid);
			ikey.mv_data = &lastid;
		}
		E(mdb_put(txn, ddbi, &ikey, &data, MDB_NOOVERWRITE));
	}
	if (j) printf("%d duplicates skipped\n", j);
	E(mdb_txn_commit(txn));

	E(mdb_txn_begin(env, NULL, 0, &txn));
	E(mdb_cursor_open(txn, kdbi, &cursor));
	E(mdb_cursor_open(txn, ddbi, &cur2));

	key.mv_size = UUID_LEN;
	key.mv_data = prefix;
	j = 0;
	i = count;

	rc = mdb_cursor_get(cursor, &key, &ikey, MDB_SET_RANGE);
	while (rc == 0) {
#if 0
		printf("key: %p %.*s, data: %p %.*s\n",
			key.mv_data,  (int) key.mv_size,  (char *) key.mv_data,
			data.mv_data, (int) data.mv_size, (char *) data.mv_data);
#endif
		E(mdb_cursor_get(cur2, &ikey, &data, MDB_SET));
		E(mdb_cursor_del(cur2, 0));
		E(mdb_cursor_del(cursor, 0));
		j++;
		i--;
		if (i == 0)
			break;
		rc = mdb_cursor_get(cursor, &key, &ikey, MDB_NEXT);
	}
	mdb_cursor_close(cursor);
	mdb_cursor_close(cur2);
	E(mdb_txn_commit(txn));
	printf("Deleted %d values\n", j);

	mdb_env_close(env);

	return 0;
}
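
The workaround mentioned in the quoted report (retrying the delete until no further rows are removed) might look roughly like the sketch below. It deliberately does not call LMDB: `delete_pass_fn`, `delete_until_stable`, `leaky_pass`, and the `max_passes` cap are hypothetical names introduced only for illustration, and a real delete pass would wrap a cursor walk similar to the one in 7844.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: performs one delete pass over the prefix and
 * returns the number of rows it removed, or a negative value on error. */
typedef long (*delete_pass_fn)(void *ctx);

/* Repeat the delete pass until one pass removes nothing.
 * Returns the total number of rows removed, or -1 if a pass failed or
 * max_passes was reached before a pass removed zero rows. */
long delete_until_stable(delete_pass_fn pass, void *ctx, int max_passes)
{
	long total = 0;
	int n;

	assert(pass != NULL);
	for (n = 0; n < max_passes; n++) {
		long removed = pass(ctx);
		if (removed < 0)
			return -1;
		if (removed == 0)
			return total;
		total += removed;
	}
	return -1;
}

/* Stand-in for a delete that misses rows on each pass (assumption, not
 * LMDB behaviour): it removes half, rounded up, of what is left. */
long leaky_pass(void *ctx)
{
	long *remaining = ctx;
	long removed = (*remaining + 1) / 2;

	*remaining -= removed;
	return removed;
}
```

The cap on passes is a design choice of this sketch, so that a delete which never converges is reported as a failure instead of looping forever.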

(ITS#7856) TLS_REQCERT try is same as TLS_REQCERT hard?
by pguenther＠proofpoint.com 16 May '14


Full_Name: Philip Guenther
Version: 2.4.39
OS: OpenBSD
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (76.253.0.176)


The ldap.conf(5) manpage says this about TLS_REQCERT

       TLS_REQCERT <level>
              Specifies what checks to perform on server certificates in a
              TLS session, if any. The <level> can be specified as one of
              the following keywords:
...
              try    The server certificate is requested. If no certificate
                     is provided, the session proceeds normally. If a bad
                     certificate is provided, the session is immediately
                     terminated.

              demand | hard
                     These keywords are equivalent. The server certificate
                     is requested. If no certificate is provided, or a bad
                     certificate is provided, the session is immediately
                     terminated. This is the default setting.

In testing, I can find no difference in behavior between the 'try' and 'hard'
keywords. For the ldap* tools, both 'try' and 'hard' seem to place the same
requirements on the server. What does "if no certificate is provided" *mean*
in terms of server and/or client configuration?
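
As a reading aid, the manpage wording quoted above can be written out as a small decision function. This is a sketch of the documented text only, not of what libldap actually does; the enum and function names are hypothetical, only the two quoted levels are modelled, and the "good certificate proceeds" case is an assumption implied by the text but not stated in it.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the two levels quoted from ldap.conf(5) are modelled here. */
enum reqcert_level { REQCERT_TRY, REQCERT_DEMAND /* same as "hard" */ };

/* What the server presented, from the client's point of view. */
enum server_cert { CERT_NONE, CERT_BAD, CERT_GOOD };

/* Returns true if the session proceeds, false if it is terminated,
 * following the manpage wording (not the library implementation). */
bool session_proceeds(enum reqcert_level level, enum server_cert cert)
{
	assert(level == REQCERT_TRY || level == REQCERT_DEMAND);
	switch (cert) {
	case CERT_GOOD:
		return true;                 /* assumed: implied, not stated */
	case CERT_BAD:
		return false;                /* both levels terminate */
	case CERT_NONE:
		return level == REQCERT_TRY; /* the only documented difference */
	}
	return false;
}
```

Read this way, the two keywords differ only when no certificate is presented at all, which may be why no difference shows up in tests against a server that always sends one.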

Re: (ITS#7705) mdb back-end segfaults sporadically with paged searches
by hyc＠symas.com 15 May '14


openldap(a)semyon.org wrote:
> On 13-09-23 10:16 PM, Howard Chu wrote:
>
>> If you backup the DB with slapcat and reload it on another server with
>> slapadd, can you still reproduce the fault on the copy?
>
> Unfortunately, yes. It breaks trying to retrieve the same record as
> before, on the same instruction in back_mdb:
>
> yesterday:
> [126393.233615] slapd[19244]: segfault at 7f499f38d000 ip
> 00007f4da0e9b6a1 sp 00007f499f1fa540 error 6 in
> back_mdb-2.4.so.2.9.2[7f4da0e80000+34000]
>
> today:
> [517860.953779] slapd[24871]: segfault at 7fdeb50a0000 ip
> 00007fe2b63ad6a1 sp 00007fdeb4f0d540 error 6 in
> back_mdb-2.4.so.2.9.2[7fe2b6392000+34000]
>
> Semyon

A fix is now in git master.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
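
The claim that both crashes hit the same instruction can be checked from the two kernel log lines quoted above: subtracting the mapping base shown in brackets from the `ip` value gives the offset inside back_mdb. A minimal sketch of that arithmetic follows; `fault_offset` is a hypothetical helper name, and the inputs are the values copied from the log lines.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the mapped object:
 * instruction pointer minus the mapping base printed in brackets. */
uint64_t fault_offset(uint64_t ip, uint64_t map_base)
{
	assert(ip >= map_base);
	return ip - map_base;
}
```

With the logged values, `fault_offset(0x00007f4da0e9b6a1, 0x7f4da0e80000)` and `fault_offset(0x00007fe2b63ad6a1, 0x7fe2b6392000)` both evaluate to 0x1b6a1, which is inside the 0x34000-byte mapping and consistent with the same instruction faulting on both days.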

Re: (ITS#7855) Update to latest autotools to support new architecture ppc64le
by hyc＠symas.com 15 May '14


rajesh(a)linux.vnet.ibm.com wrote:
> Full_Name: Rajeshkumar S
> Version: 2.4.39
> OS: Ubuntu 14.04
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (122.248.161.59)
>
>
>
> I am trying to build the openldap package from the source following the
> release tarball from
> ftp://ftp.openldap.org/pub/OpenLDAP/openldap-release/openldap-2.4.39.tgz. in a
> new architecture ppc64le ( IBM PowerPC Little endian ). As the config.guess and
> libtool did not have the required patches to identify this new architecture, I
> did autoreconf -f -i in my build system whose latest automake and libtool has
> the patches
> of ppc64le.
> autoreconf fails with automake errors as shown below
>
> automake: error: no 'Makefile.am' found for any configure output
> autoreconf: automake failed with exit status: 1

That's to be expected, since there certainly are no Makefile.am files in our
source tree. We don't use automake. This is not a bug.

> As discussed in the mailing list, Quanah Gibson-Mount suggested me to request an
> update for auto-tools to support new architectures.
>
> The automake with version >= 1.13.4 has the correct config.guess with the
> required ppc64le fixes.
> Regarding the libtool, the last release is more than 2 years ago I believe, we
> managed to get an alpha source release which has all the latest patches to
> support the new architecture ppc64le
>
> An alpha version of libtool with ppc64le support is available at
> ftp://alpha.gnu.org/gnu/libtool/libtool-2.4.2.418.tar.gz.

Libtool is so perpetually problematic that we will certainly not integrate an
alpha version into our source tree.

> Hence I request for an update of the auto tools in your build system to support
> these new architectures.
>
> Thanks and Regards
> Rajeshkumar S
> Linux Technology Center, IBM
>
>

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

(ITS#7855) Update to latest autotools to support new architecture ppc64le
by rajesh＠linux.vnet.ibm.com 15 May '14


Full_Name: Rajeshkumar S
Version: 2.4.39
OS: Ubuntu 14.04
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (122.248.161.59)


I am trying to build the openldap package from the source following the
release tarball from
ftp://ftp.openldap.org/pub/OpenLDAP/openldap-release/openldap-2.4.39.tgz. in a
new architecture ppc64le ( IBM PowerPC Little endian ). As the config.guess and
libtool did not have the required patches to identify this new architecture, I
did autoreconf -f -i in my build system whose latest automake and libtool has
the patches of ppc64le.
autoreconf fails with automake errors as shown below

automake: error: no 'Makefile.am' found for any configure output
autoreconf: automake failed with exit status: 1

As discussed in the mailing list, Quanah Gibson-Mount suggested me to request an
update for auto-tools to support new architectures.

The automake with version >= 1.13.4 has the correct config.guess with the
required ppc64le fixes.
Regarding the libtool, the last release is more than 2 years ago I believe, we
managed to get an alpha source release which has all the latest patches to
support the new architecture ppc64le

An alpha version of libtool with ppc64le support is available at
ftp://alpha.gnu.org/gnu/libtool/libtool-2.4.2.418.tar.gz.

Hence I request for an update of the auto tools in your build system to support
these new architectures.

Thanks and Regards
Rajeshkumar S
Linux Technology Center, IBM
