openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and I have some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question about the function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with callbacks comes with restrictions[2][3]. I believe it impossible to bind these functions way that is flexible, as one would expect. A potential change to LMDB that would make binding drastically easier is having MDB_cmp_func to take a third "context" argument with type void*. Then a binding could safely use an arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to the MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to be added using a different function type, one accepting three arguments?
Thanks for the consideration.
Cheers, - Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions -- https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md [3] Cgo documentation -- https://golang.org/cmd/cgo/
Bryan Matsuo wrote:
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and I have some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question about the function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with callbacks comes with restrictions[2][3]. I believe it impossible to bind these functions way that is flexible, as one would expect. A potential change to LMDB that would make binding drastically easier is having MDB_cmp_func to take a third "context" argument with type void*. Then a binding could safely use an arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to the MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to be added using a different function type, one accepting three arguments?
This is a recurrent question.
http://www.openldap.org/its/index.cgi/Incoming?id=8276 http://www.openldap.org/its/index.cgi/Incoming?id=8124 http://www.openldap.org/its/index.cgi/Incoming?id=7980
My response now is the same as before - see ITS#8124 for example.
The compare functions are on the critical path for performance. Jumping thru a bunch of interpreter glue layers to invoke them is a really bad idea.
I might be convinced if someone were to post some benchmarks showing that an implementation of an MDB_cmp_func in their chosen language is actually tolerable.
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions -- https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md [3] Cgo documentation -- https://golang.org/cmd/cgo/
Bryan Matsuo wrote:
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and I have some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question about the function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with callbacks comes with restrictions[2][3]. I believe it impossible to bind these functions way that is flexible, as one would expect. A potential change to LMDB that would make binding drastically easier is having MDB_cmp_func to take a third "context" argument with type void*. Then a binding could safely use an arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to the MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to be added using a different function type, one accepting three arguments?
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions -- https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md [3] Cgo documentation -- https://golang.org/cmd/cgo/
I see nothing in these restrictions that requires extra information to be passed from Go to C or from C to Go.
There is a vague mention in [2]
"A particular unsafe area is C code that wants to hold on to Go func and pointer values for future callbacks from C to Go. This works today but is not permitted by the invariant. It is hard to detect. One safe approach is: Go code that wants to preserve funcs/pointers stores them into a map indexed by an int. Go code calls the C code, passing the int, which the C code may store freely. When the C code wants to call into Go, it passes the int to a Go function that looks in the map and makes the call."
But it's nonsense in this case - you want to pass a Go function pointer to C, but the only way for C to use it is to call some *other* Go function? Sorry but there is no other Go function for the mdb_cmp() function to call, the only one it knows about is the function pointer that you pass.
If this is what you're referring to, adding a context pointer doesn't achieve anything. If this isn't what you're referring to, then please explain exactly what you hope to achieve with this context pointer.
Actually I’m not commenting on binding Go but I’m voting for a context passed to the compare function.
I fully agree that the compare function is part of the critical path. But as I need to define custom indexes with compound keys the compare functions varies and it would be impractical to predefine for any compound key combination a c function.
The compare context would be stored on the struct MDB_dbx.
typedef struct MDB_dbx { MDB_val md_name; /**< name of the database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided context for md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided context for md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx;
The following is a draft (not tested yet) of a generic compare function. The context contains a compare specification which is a null terminated list of <type><order> pairs.
// compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data;
while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb;
if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; }
if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long long *)pa; unsigned long long vb = *(unsigned long long *)pb;
if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; }
if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb;
pa += 4; pb += 4;
if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } }
if (result != 0) { break; } else { pa += la; pb += lb; } } } }
return result; }
Regards Juerg
On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu" <openldap-technical-bounces@openldap.org on behalf of hyc@symas.com> wrote:
Bryan Matsuo wrote:
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and I have some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question about the function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with callbacks comes with restrictions[2][3]. I believe it impossible to bind these functions way that is flexible, as one would expect. A potential change to LMDB that would make binding drastically easier is having MDB_cmp_func to take a third "context" argument with type void*. Then a binding could safely use an arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to the MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to be added using a different function type, one accepting three arguments?
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions -- https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md [3] Cgo documentation -- https://golang.org/cmd/cgo/
I see nothing in these restrictions that requires extra information to be passed from Go to C or from C to Go.
There is a vague mention in [2]
"A particular unsafe area is C code that wants to hold on to Go func and pointer values for future callbacks from C to Go. This works today but is not permitted by the invariant. It is hard to detect. One safe approach is: Go code that wants to preserve funcs/pointers stores them into a map indexed by an int. Go code calls the C code, passing the int, which the C code may store freely. When the C code wants to call into Go, it passes the int to a Go function that looks in the map and makes the call."
But it's nonsense in this case - you want to pass a Go function pointer to C, but the only way for C to use it is to call some *other* Go function? Sorry but there is no other Go function for the mdb_cmp() function to call, the only one it knows about is the function pointer that you pass.
If this is what you're referring to, adding a context pointer doesn't achieve anything. If this isn't what you're referring to, then please explain exactly what you hope to achieve with this context pointer.
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Juerg,
That is is interesting proposal. As an alternative to letting users hook up arbitrary Go function for comparison, I have also thought about the possibility of providing a small set of static C functions usable for comparison. A flexible compound key comparison function like this could fit well into that idea.
Howard,
Sorry I did not find the issues mentioned in previous searches.
I understand the concern about such a hot code path. I'm not sure that Go would see acceptable performance.
But, Go is not an interpreted language (though there is glue). And while I'm not positive about the performance of Go in this area you seem to dismiss comparison functions in any other language. Is it unreasonable to think that comparison functions written in other compiled languages like Rust, Nim, or ML variants would also be impractically slow?
I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the wording of that quoted paragraph could be better.
Sorry but there is no other Go function for the mdb_cmp() function to
call, the only one it knows about is the function pointer that you pass.
It may be of benefit to see how the I've used the context argument in a binding being developed for the mdb_reader_list function.
https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb...
The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C requires an annotated static Go function. The context argument allows dispatch to the correct Go function that was configured at runtime. I believe that is the "other" Go function you mentioned.
The implementation would be similar for mdb_set_compare. The callback would always be the same static function which handles the dynamic dispatch.
Cheers, - Bryan
On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher juerg.bircher@helmedica.com wrote:
Actually I’m not commenting on binding Go but I’m voting for a context passed to the compare function.
I fully agree that the compare function is part of the critical path. But as I need to define custom indexes with compound keys the compare functions varies and it would be impractical to predefine for any compound key combination a c function.
The compare context would be stored on the struct MDB_dbx.
typedef struct MDB_dbx { MDB_val md_name; /**< name of the database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided context for md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided context for md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx;
The following is a draft (not tested yet) of a generic compare function. The context contains a compare specification which is a null terminated list of <type><order> pairs.
// compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data;
while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long long *)pa; unsigned long long vb = *(unsigned long long *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb; pa += 4; pb += 4; if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } } if (result != 0) { break; } else { pa += la; pb += lb; } } } } return result;
}
Regards Juerg
On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu" < openldap-technical-bounces@openldap.org on behalf of hyc@symas.com> wrote:
Bryan Matsuo wrote:
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and I
have
some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question
about the
function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C
APIs with
callbacks comes with restrictions[2][3]. I believe it impossible to
bind these
functions way that is flexible, as one would expect. A potential change
to
LMDB that would make binding drastically easier is having MDB_cmp_func
to take
a third "context" argument with type void*. Then a binding could safely
use an
arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to
the
MDB_cmp_func signature? Otherwise would it be acceptable for a variant
API to
be added using a different function type, one accepting three arguments?
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions --
https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
[3] Cgo documentation -- https://golang.org/cmd/cgo/
I see nothing in these restrictions that requires extra information to be passed from Go to C or from C to Go.
There is a vague mention in [2]
"A particular unsafe area is C code that wants to hold on to Go func and pointer values for future callbacks from C to Go. This works today but is
not
permitted by the invariant. It is hard to detect. One safe approach is: Go code that wants to preserve funcs/pointers stores them into a map indexed
by
an int. Go code calls the C code, passing the int, which the C code may
store
freely. When the C code wants to call into Go, it passes the int to a Go function that looks in the map and makes the call."
But it's nonsense in this case - you want to pass a Go function pointer
to C,
but the only way for C to use it is to call some *other* Go function?
Sorry
but there is no other Go function for the mdb_cmp() function to call, the
only
one it knows about is the function pointer that you pass.
If this is what you're referring to, adding a context pointer doesn't
achieve
anything. If this isn't what you're referring to, then please explain
exactly
what you hope to achieve with this context pointer.
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Bryan Matsuo wrote:
I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the wording of that quoted paragraph could be better.
Sorry but there is no other Go function for the mdb_cmp() function to call,
the only one it knows about is the function pointer that you pass.
It may be of benefit to see how the I've used the context argument in a binding being developed for the mdb_reader_list function.
https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb...
The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C requires an annotated static Go function. The context argument allows dispatch to the correct Go function that was configured at runtime. I believe that is the "other" Go function you mentioned.
The implementation would be similar for mdb_set_compare. The callback would always be the same static function which handles the dynamic dispatch.
And this is the part that I really don't understand - why not let the user pass their own static Go function? Bouncing through a dispatcher like this will be even slower.
From the peanut gallery: Small set of static C functions is probably the
way to go. If I understand correctly, which I probablay don't, the mismatch between green threads and OS threads means there's a lot of expensive stack-switching involved in go->C->go execution.
On Thu, Oct 29, 2015 at 5:28 PM, Bryan Matsuo bryan.matsuo@gmail.com wrote:
Juerg,
That is is interesting proposal. As an alternative to letting users hook up arbitrary Go function for comparison, I have also thought about the possibility of providing a small set of static C functions usable for comparison. A flexible compound key comparison function like this could fit well into that idea.
Howard,
Sorry I did not find the issues mentioned in previous searches.
I understand the concern about such a hot code path. I'm not sure that Go would see acceptable performance.
But, Go is not an interpreted language (though there is glue). And while I'm not positive about the performance of Go in this area you seem to dismiss comparison functions in any other language. Is it unreasonable to think that comparison functions written in other compiled languages like Rust, Nim, or ML variants would also be impractically slow?
I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the wording of that quoted paragraph could be better.
Sorry but there is no other Go function for the mdb_cmp() function to
call, the only one it knows about is the function pointer that you pass.
It may be of benefit to see how the I've used the context argument in a binding being developed for the mdb_reader_list function.
https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb...
The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C requires an annotated static Go function. The context argument allows dispatch to the correct Go function that was configured at runtime. I believe that is the "other" Go function you mentioned.
The implementation would be similar for mdb_set_compare. The callback would always be the same static function which handles the dynamic dispatch.
Cheers,
- Bryan
On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher juerg.bircher@helmedica.com wrote:
Actually I’m not commenting on binding Go but I’m voting for a context passed to the compare function.
I fully agree that the compare function is part of the critical path. But as I need to define custom indexes with compound keys the compare functions varies and it would be impractical to predefine for any compound key combination a c function.
The compare context would be stored on the struct MDB_dbx.
typedef struct MDB_dbx { MDB_val md_name; /**< name of the database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided context for md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided context for md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx;
The following is a draft (not tested yet) of a generic compare function. The context contains a compare specification which is a null terminated list of <type><order> pairs.
// compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data;
while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long long *)pa; unsigned long long vb = *(unsigned long long *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb; pa += 4; pb += 4; if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } } if (result != 0) { break; } else { pa += la; pb += lb; } } } } return result;
}
Regards Juerg
On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu" < openldap-technical-bounces@openldap.org on behalf of hyc@symas.com> wrote:
Bryan Matsuo wrote:
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and
I have
some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question
about the
function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C
APIs with
callbacks comes with restrictions[2][3]. I believe it impossible to
bind these
functions way that is flexible, as one would expect. A potential
change to
LMDB that would make binding drastically easier is having MDB_cmp_func
to take
a third "context" argument with type void*. Then a binding could
safely use an
arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to
the
MDB_cmp_func signature? Otherwise would it be acceptable for a variant
API to
be added using a different function type, one accepting three
arguments?
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions --
https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
[3] Cgo documentation -- https://golang.org/cmd/cgo/
I see nothing in these restrictions that requires extra information to be passed from Go to C or from C to Go.
There is a vague mention in [2]
"A particular unsafe area is C code that wants to hold on to Go func and pointer values for future callbacks from C to Go. This works today but
is not
permitted by the invariant. It is hard to detect. One safe approach is:
Go
code that wants to preserve funcs/pointers stores them into a map
indexed by
an int. Go code calls the C code, passing the int, which the C code may
store
freely. When the C code wants to call into Go, it passes the int to a Go function that looks in the map and makes the call."
But it's nonsense in this case - you want to pass a Go function pointer
to C,
but the only way for C to use it is to call some *other* Go function?
Sorry
but there is no other Go function for the mdb_cmp() function to call,
the only
one it knows about is the function pointer that you pass.
If this is what you're referring to, adding a context pointer doesn't
achieve
anything. If this isn't what you're referring to, then please explain
exactly
what you hope to achieve with this context pointer.
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
After digging it seems that it will continue to be possible for users to safely pass static Go function references. It is quite a burden on the user. But I will continue to think about it.
Jay, noted. I am open to exploring that direction. Though as was pointed out earlier a library of static functions can be made more useful (if somewhat slower) when a context object can configure their behavior. Before reading that suggestion I was uncertain how much useful functionality could be exposed as a library. I am writing general purpose bindings, so I would prefer a function library be fairly generic.
Howard, do you have thoughts on the proposal from Juerg regarding a compound-key comparison function implemented using a context value?
On Thu, Oct 29, 2015 at 8:49 PM Jay Booth jaybooth@gmail.com wrote:
From the peanut gallery: Small set of static C functions is probably the way to go. If I understand correctly, which I probablay don't, the mismatch between green threads and OS threads means there's a lot of expensive stack-switching involved in go->C->go execution.
On Thu, Oct 29, 2015 at 5:28 PM, Bryan Matsuo bryan.matsuo@gmail.com wrote:
Juerg,
That is is interesting proposal. As an alternative to letting users hook up arbitrary Go function for comparison, I have also thought about the possibility of providing a small set of static C functions usable for comparison. A flexible compound key comparison function like this could fit well into that idea.
Howard,
Sorry I did not find the issues mentioned in previous searches.
I understand the concern about such a hot code path. I'm not sure that Go would see acceptable performance.
But, Go is not an interpreted language (though there is glue). And while I'm not positive about the performance of Go in this area you seem to dismiss comparison functions in any other language. Is it unreasonable to think that comparison functions written in other compiled languages like Rust, Nim, or ML variants would also be impractically slow?
I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the wording of that quoted paragraph could be better.
Sorry but there is no other Go function for the mdb_cmp() function to
call, the only one it knows about is the function pointer that you pass.
It may be of benefit to see how the I've used the context argument in a binding being developed for the mdb_reader_list function.
https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb...
The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C requires an annotated static Go function. The context argument allows dispatch to the correct Go function that was configured at runtime. I believe that is the "other" Go function you mentioned.
The implementation would be similar for mdb_set_compare. The callback would always be the same static function which handles the dynamic dispatch.
Cheers,
- Bryan
On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher juerg.bircher@helmedica.com wrote:
Actually I’m not commenting on binding Go but I’m voting for a context passed to the compare function.
I fully agree that the compare function is part of the critical path. But as I need to define custom indexes with compound keys the compare functions varies and it would be impractical to predefine for any compound key combination a c function.
The compare context would be stored on the struct MDB_dbx.
typedef struct MDB_dbx { MDB_val md_name; /**< name of the database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided context for md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided context for md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx;
The following is a draft (not tested yet) of a generic compare function. The context contains a compare specification which is a null terminated list of <type><order> pairs.
// compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data;
while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long long *)pa; unsigned long long vb = *(unsigned long long *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb; pa += 4; pb += 4; if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } } if (result != 0) { break; } else { pa += la; pb += lb; } } } } return result;
}
Regards Juerg
On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu" < openldap-technical-bounces@openldap.org on behalf of hyc@symas.com> wrote:
Bryan Matsuo wrote:
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and
I have
some interest in exposing the functionality of mdb_set_compare (and mdb_set_dupsort). But it is proving difficult and I have a question
about the
function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C
APIs with
callbacks comes with restrictions[2][3]. I believe it impossible to
bind these
functions way that is flexible, as one would expect. A potential
change to
LMDB that would make binding drastically easier is having
MDB_cmp_func to take
a third "context" argument with type void*. Then a binding could
safely use an
arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to
the
MDB_cmp_func signature? Otherwise would it be acceptable for a
variant API to
be added using a different function type, one accepting three
arguments?
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go [2] Cgo pointer restrictions --
https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
[3] Cgo documentation -- https://golang.org/cmd/cgo/
I see nothing in these restrictions that requires extra information to
be
passed from Go to C or from C to Go.
There is a vague mention in [2]
"A particular unsafe area is C code that wants to hold on to Go func and pointer values for future callbacks from C to Go. This works today but
is not
permitted by the invariant. It is hard to detect. One safe approach is:
Go
code that wants to preserve funcs/pointers stores them into a map
indexed by
an int. Go code calls the C code, passing the int, which the C code may
store
freely. When the C code wants to call into Go, it passes the int to a Go function that looks in the map and makes the call."
But it's nonsense in this case - you want to pass a Go function pointer
to C,
but the only way for C to use it is to call some *other* Go function?
Sorry
but there is no other Go function for the mdb_cmp() function to call,
the only
one it knows about is the function pointer that you pass.
If this is what you're referring to, adding a context pointer doesn't
achieve
anything. If this isn't what you're referring to, then please explain
exactly
what you hope to achieve with this context pointer.
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Bryan Matsuo wrote:
After digging it seems that it will continue to be possible for users to safely pass static Go function references. It is quite a burden on the user. But I will continue to think about it.
Jay, noted. I am open to exploring that direction. Though as was pointed out earlier a library of static functions can be made more useful (if somewhat slower) when a context object can configure their behavior. Before reading that suggestion I was uncertain how much useful functionality could be exposed as a library. I am writing general purpose bindings, so I would prefer a function library be fairly generic.
Howard, do you have thoughts on the proposal from Juerg regarding a compound-key comparison function implemented using a context value?
I remain unconvinced. key-comparison is still per-DB; a comparison specifier saves some space but at the expense of time - more compare ops, more branching per key compare. Dedicated functions are still the better way to go. Plus, naive constructs like in the emailed example are easy to get wrong - his example will never terminate because he only breaks from the switch statement, nothing breaks from the while(1) loop. It is attempting to be too clever, when a more straightforward approach will be faster and obviously bug-free.
On Thu, Oct 29, 2015 at 8:49 PM Jay Booth <jaybooth@gmail.com mailto:jaybooth@gmail.com> wrote:
From the peanut gallery: Small set of static C functions is probably the way to go. If I understand correctly, which I probablay don't, the mismatch between green threads and OS threads means there's a lot of expensive stack-switching involved in go->C->go execution. On Thu, Oct 29, 2015 at 5:28 PM, Bryan Matsuo <bryan.matsuo@gmail.com <mailto:bryan.matsuo@gmail.com>> wrote: Juerg, That is is interesting proposal. As an alternative to letting users hook up arbitrary Go function for comparison, I have also thought about the possibility of providing a small set of static C functions usable for comparison. A flexible compound key comparison function like this could fit well into that idea. Howard, Sorry I did not find the issues mentioned in previous searches. I understand the concern about such a hot code path. I'm not sure that Go would see acceptable performance. But, Go is not an interpreted language (though there is glue). And while I'm not positive about the performance of Go in this area you seem to dismiss comparison functions in any other language. Is it unreasonable to think that comparison functions written in other compiled languages like Rust, Nim, or ML variants would also be impractically slow? I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the wording of that quoted paragraph could be better. >Sorry but there is no other Go function for the mdb_cmp() function to call, the only one it knows about is the function pointer that you pass. It may be of benefit to see how the I've used the context argument in a binding being developed for the mdb_reader_list function. https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb/lmdbgo.c The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C requires an annotated static Go function. The context argument allows dispatch to the correct Go function that was configured at runtime. I believe that is the "other" Go function you mentioned. The implementation would be similar for mdb_set_compare. The callback would always be the same static function which handles the dynamic dispatch. Cheers, - Bryan On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher <juerg.bircher@helmedica.com <mailto:juerg.bircher@helmedica.com>> wrote: Actually I’m not commenting on binding Go but I’m voting for a context passed to the compare function. I fully agree that the compare function is part of the critical path. But as I need to define custom indexes with compound keys the compare functions varies and it would be impractical to predefine for any compound key combination a c function. The compare context would be stored on the struct MDB_dbx. typedef struct MDB_dbx { MDB_val md_name; /**< name of the database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided context for md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided context for md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx; The following is a draft (not tested yet) of a generic compare function. The context contains a compare specification which is a null terminated list of <type><order> pairs. // compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data; while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long long *)pa; unsigned long long vb = *(unsigned long long *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb; pa += 4; pb += 4; if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } } if (result != 0) { break; } else { pa += la; pb += lb; } } } } return result; } Regards Juerg On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu" <openldap-technical-bounces@openldap.org <mailto:openldap-technical-bounces@openldap.org> on behalf of hyc@symas.com <mailto:hyc@symas.com>> wrote: >Bryan Matsuo wrote: >> openldap-technical, >> >> I am working on some Go (golang) bindings[1] for the LMDB library and I have >> some interest in exposing the functionality of mdb_set_compare (and >> mdb_set_dupsort). But it is proving difficult and I have a question about the >> function(s). >> >> Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with >> callbacks comes with restrictions[2][3]. I believe it impossible to bind these >> functions way that is flexible, as one would expect. A potential change to >> LMDB that would make binding drastically easier is having MDB_cmp_func to take >> a third "context" argument with type void*. Then a binding could safely use an >> arbitrary Go function for comparisons. >> >> Is it possible for future versions of LMDB to add a third argument to the >> MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to >> be added using a different function type, one accepting three arguments? >> >> Thanks for the consideration. >> >> Cheers, >> - Bryan >> >> [1] Go bindings -- https://github.com/bmatsuo/lmdb-go >> [2] Cgo pointer restrictions -- >> https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md >> [3] Cgo documentation -- https://golang.org/cmd/cgo/ > >I see nothing in these restrictions that requires extra information to be >passed from Go to C or from C to Go. > >There is a vague mention in [2] > >"A particular unsafe area is C code that wants to hold on to Go func and >pointer values for future callbacks from C to Go. This works today but is not >permitted by the invariant. It is hard to detect. One safe approach is: Go >code that wants to preserve funcs/pointers stores them into a map indexed by >an int. Go code calls the C code, passing the int, which the C code may store >freely. When the C code wants to call into Go, it passes the int to a Go >function that looks in the map and makes the call." > >But it's nonsense in this case - you want to pass a Go function pointer to C, >but the only way for C to use it is to call some *other* Go function? Sorry >but there is no other Go function for the mdb_cmp() function to call, the only >one it knows about is the function pointer that you pass. > >If this is what you're referring to, adding a context pointer doesn't achieve >anything. If this isn't what you're referring to, then please explain exactly >what you hope to achieve with this context pointer. > >-- > -- Howard Chu > CTO, Symas Corp. http://www.symas.com > Director, Highland Sun http://highlandsun.com/hyc/ > Chief Architect, OpenLDAP http://www.openldap.org/project/
The example is just a draft. It is a suggestion to support compound keys in a generic way with the intention to keep the extra coast as low as possible. I need a generic way as I have to support user-defined compound indices.
Could we not compromise on a #define so those who need the context could compile with SUPPORT_CMP_CONTEXT. And if not needed there is no performance penalty. Therefore the code basis would remain the same.
Thanks for considering…
On 30/10/15 08:43, "Howard Chu" hyc@symas.com wrote:
Bryan Matsuo wrote:
After digging it seems that it will continue to be possible for users to safely pass static Go function references. It is quite a burden on the user. But I will continue to think about it.
Jay, noted. I am open to exploring that direction. Though as was pointed out earlier a library of static functions can be made more useful (if somewhat slower) when a context object can configure their behavior. Before reading that suggestion I was uncertain how much useful functionality could be exposed as a library. I am writing general purpose bindings, so I would prefer a function library be fairly generic.
Howard, do you have thoughts on the proposal from Juerg regarding a compound-key comparison function implemented using a context value?
I remain unconvinced. key-comparison is still per-DB; a comparison specifier saves some space but at the expense of time - more compare ops, more branching per key compare. Dedicated functions are still the better way to go. Plus, naive constructs like in the emailed example are easy to get wrong - his example will never terminate because he only breaks from the switch statement, nothing breaks from the while(1) loop. It is attempting to be too clever, when a more straightforward approach will be faster and obviously bug-free.
On Thu, Oct 29, 2015 at 8:49 PM Jay Booth <jaybooth@gmail.com mailto:jaybooth@gmail.com> wrote:
From the peanut gallery: Small set of static C functions is probably the way to go. If I understand correctly, which I probablay don't, the mismatch between green threads and OS threads means there's a lot of expensive stack-switching involved in go->C->go execution. On Thu, Oct 29, 2015 at 5:28 PM, Bryan Matsuo <bryan.matsuo@gmail.com <mailto:bryan.matsuo@gmail.com>> wrote: Juerg, That is is interesting proposal. As an alternative to letting users hook up arbitrary Go function for comparison, I have also thought about the possibility of providing a small set of static C functions usable for comparison. A flexible compound key comparison function like this could fit well into that idea. Howard, Sorry I did not find the issues mentioned in previous searches. I understand the concern about such a hot code path. I'm not sure that Go would see acceptable performance. But, Go is not an interpreted language (though there is glue). And while I'm not positive about the performance of Go in this area you seem to dismiss comparison functions in any other language. Is it unreasonable to think that comparison functions written in other compiled languages like Rust, Nim, or ML variants would also be impractically slow? I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the wording of that quoted paragraph could be better. >Sorry but there is no other Go function for the mdb_cmp() function to call, the only one it knows about is the function pointer that you pass. It may be of benefit to see how the I've used the context argument in a binding being developed for the mdb_reader_list function. https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb/lmdbgo.c The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C requires an annotated static Go function. The context argument allows dispatch to the correct Go function that was configured at runtime. I believe that is the "other" Go function you mentioned. The implementation would be similar for mdb_set_compare. The callback would always be the same static function which handles the dynamic dispatch. Cheers, - Bryan On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher <juerg.bircher@helmedica.com <mailto:juerg.bircher@helmedica.com>> wrote: Actually I’m not commenting on binding Go but I’m voting for a context passed to the compare function. I fully agree that the compare function is part of the critical path. But as I need to define custom indexes with compound keys the compare functions varies and it would be impractical to predefine for any compound key combination a c function. The compare context would be stored on the struct MDB_dbx. typedef struct MDB_dbx { MDB_val md_name; /**< name of the database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided context for md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided context for md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx; The following is a draft (not tested yet) of a generic compare function. The context contains a compare specification which is a null terminated list of <type><order> pairs. // compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data; while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long long *)pa; unsigned long long vb = *(unsigned long long *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb; pa += 4; pb += 4; if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ? la : lb); if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } } if (result != 0) { break; } else { pa += la; pb += lb; } } } } return result; } Regards Juerg On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu" <openldap-technical-bounces@openldap.org <mailto:openldap-technical-bounces@openldap.org> on behalf of hyc@symas.com <mailto:hyc@symas.com>> wrote: >Bryan Matsuo wrote: >> openldap-technical, >> >> I am working on some Go (golang) bindings[1] for the LMDB library and I have >> some interest in exposing the functionality of mdb_set_compare (and >> mdb_set_dupsort). But it is proving difficult and I have a question about the >> function(s). >> >> Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with >> callbacks comes with restrictions[2][3]. I believe it impossible to bind these >> functions way that is flexible, as one would expect. A potential change to >> LMDB that would make binding drastically easier is having MDB_cmp_func to take >> a third "context" argument with type void*. Then a binding could safely use an >> arbitrary Go function for comparisons. >> >> Is it possible for future versions of LMDB to add a third argument to the >> MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to >> be added using a different function type, one accepting three arguments? >> >> Thanks for the consideration. >> >> Cheers, >> - Bryan >> >> [1] Go bindings -- https://github.com/bmatsuo/lmdb-go >> [2] Cgo pointer restrictions -- >> https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md >> [3] Cgo documentation -- https://golang.org/cmd/cgo/ > >I see nothing in these restrictions that requires extra information to be >passed from Go to C or from C to Go. > >There is a vague mention in [2] > >"A particular unsafe area is C code that wants to hold on to Go func and >pointer values for future callbacks from C to Go. This works today but is not >permitted by the invariant. It is hard to detect. One safe approach is: Go >code that wants to preserve funcs/pointers stores them into a map indexed by >an int. Go code calls the C code, passing the int, which the C code may store >freely. When the C code wants to call into Go, it passes the int to a Go >function that looks in the map and makes the call." > >But it's nonsense in this case - you want to pass a Go function pointer to C, >but the only way for C to use it is to call some *other* Go function? Sorry >but there is no other Go function for the mdb_cmp() function to call, the only >one it knows about is the function pointer that you pass. > >If this is what you're referring to, adding a context pointer doesn't achieve >anything. If this isn't what you're referring to, then please explain exactly >what you hope to achieve with this context pointer. > >-- > -- Howard Chu > CTO, Symas Corp. http://www.symas.com > Director, Highland Sun http://highlandsun.com/hyc/ > Chief Architect, OpenLDAP http://www.openldap.org/project/
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
I think arguments about how error prone a given function is are moot in this case. Tests can be written. And maintaining the number of functions Juerg is talking about is error prone in itself (and would potentially be harder to test). I also believe that asking everyone else to roll there own "simple" compound index functions involving strings will result in its own set of errors.
Simply put, putting the mdb_set_compare and mdb_set_dupsort functions in lmdb.h is letting people to shoot themselves in the foot. Allowing functions to be reused more easily helps reduce that overall.
On Fri, Oct 30, 2015 at 1:29 AM Jürg Bircher juerg.bircher@helmedica.com wrote:
The example is just a draft. It is a suggestion to support compound keys in a generic way with the intention to keep the extra coast as low as possible. I need a generic way as I have to support user-defined compound indices.
Could we not compromise on a #define so those who need the context could compile with SUPPORT_CMP_CONTEXT. And if not needed there is no performance penalty. Therefore the code basis would remain the same.
Thanks for considering…
On 30/10/15 08:43, "Howard Chu" hyc@symas.com wrote:
Bryan Matsuo wrote:
After digging it seems that it will continue to be possible for users to safely pass static Go function references. It is quite a burden on the
user.
But I will continue to think about it.
Jay, noted. I am open to exploring that direction. Though as was
pointed out
earlier a library of static functions can be made more useful (if
somewhat
slower) when a context object can configure their behavior. Before
reading
that suggestion I was uncertain how much useful functionality could be
exposed
as a library. I am writing general purpose bindings, so I would prefer a function library be fairly generic.
Howard, do you have thoughts on the proposal from Juerg regarding a compound-key comparison function implemented using a context value?
I remain unconvinced. key-comparison is still per-DB; a comparison
specifier
saves some space but at the expense of time - more compare ops, more
branching
per key compare. Dedicated functions are still the better way to go. Plus, naive constructs like in the emailed example are easy to get wrong - his example will never terminate because he only breaks from the switch
statement,
nothing breaks from the while(1) loop. It is attempting to be too clever,
when
a more straightforward approach will be faster and obviously bug-free.
On Thu, Oct 29, 2015 at 8:49 PM Jay Booth <jaybooth@gmail.com mailto:jaybooth@gmail.com> wrote:
From the peanut gallery: Small set of static C functions is
probably the
way to go. If I understand correctly, which I probablay don't, the mismatch between green threads and OS threads means there's a lot of expensive stack-switching involved in go->C->go execution. On Thu, Oct 29, 2015 at 5:28 PM, Bryan Matsuo <
bryan.matsuo@gmail.com
<mailto:bryan.matsuo@gmail.com>> wrote: Juerg, That is is interesting proposal. As an alternative to letting
users
hook up arbitrary Go function for comparison, I have also
thought
about the possibility of providing a small set of static C
functions
usable for comparison. A flexible compound key comparison
function
like this could fit well into that idea. Howard, Sorry I did not find the issues mentioned in previous searches. I understand the concern about such a hot code path. I'm not
sure that
Go would see acceptable performance. But, Go is not an interpreted language (though there is glue).
And
while I'm not positive about the performance of Go in this area
you
seem to dismiss comparison functions in any other language. Is
it
unreasonable to think that comparison functions written in other compiled languages like Rust, Nim, or ML variants would also be impractically slow? I also believe you have misunderstood the practical problems of passing Go function pointers to C. But to be fair, I think the
wording
of that quoted paragraph could be better. >Sorry but there is no other Go function for the mdb_cmp()
function to
call, the only one it knows about is the function pointer that
you pass.
It may be of benefit to see how the I've used the context
argument in
a binding being developed for the mdb_reader_list function.
https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb...
The callback passed to mdb_reader_list is always the same static function because correctly calling a Go function from C
requires an
annotated static Go function. The context argument allows
dispatch to
the correct Go function that was configured at runtime. I
believe that
is the "other" Go function you mentioned. The implementation would be similar for mdb_set_compare. The
callback
would always be the same static function which handles the
dynamic
dispatch. Cheers, - Bryan On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher <juerg.bircher@helmedica.com <mailto:
juerg.bircher@helmedica.com>> wrote:
Actually I’m not commenting on binding Go but I’m voting
for a
context passed to the compare function. I fully agree that the compare function is part of the
critical
path. But as I need to define custom indexes with compound
keys
the compare functions varies and it would be impractical to predefine for any compound key combination a c function. The compare context would be stored on the struct MDB_dbx. typedef struct MDB_dbx { MDB_val md_name; /**< name
of the
database */ MDB_cmp_func *md_cmp; /**< function for comparing keys */ void *md_cmpctx; /** user-provided
context for
md_cmp **/ MDB_cmp_func *md_dcmp; /**< function for comparing data items */ void *md_dcmpctx;/** user-provided
context for
md_dcmp **/ MDB_rel_func *md_rel; /**< user relocate function */ void *md_relctx; /**< user-provided context for md_rel */ } MDB_dbx; The following is a draft (not tested yet) of a generic
compare
function. The context contains a compare specification
which is a
null terminated list of <type><order> pairs. // compareSpec <type><order>...<null> int key_comp_generic(const MDB_val *a, const MDB_val *b,
char
*compareSpec) { int result = 0; char *pa = a->mv_data; char *pb = b->mv_data; while (1) { switch (*compareSpec++) { case 0: break; case INT32_KEY : { unsigned int va = *(unsigned int *)pa; unsigned int vb = *(unsigned int *)pb; if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 4; pb += 4; } } case INT64_KEY : { unsigned long long va = *(unsigned long
long *)pa;
unsigned long long vb = *(unsigned long
long *)pb;
if (*compareSpec++ == ASCENDING_ORDER) { result = (va < vb) ? -1 : va > vb; } else { result = (va > vb) ? -1 : va < vb; } if (result != 0) { break; } else { pa += 8; pb += 8; } } case STRING_KEY : { unsigned int la = *(unsigned int *)pa; unsigned int lb = *(unsigned int *)pb; pa += 4; pb += 4; if (*compareSpec++ == ASCENDING_ORDER) { result = strncmp(pa, pb, (la < lb) ?
la : lb);
if (result != 0) { break; } else { result = (la < lb) ? -1 : la > lb; } } else { result = strncmp(pb, pa, (la < lb) ?
la : lb);
if (result != 0) { break; } else { result = (la > lb) ? -1 : la < lb; } } if (result != 0) { break; } else { pa += la; pb += lb; } } } } return result; } Regards Juerg On 29/10/15 10:40, "openldap-technical on behalf of Howard
Chu"
<openldap-technical-bounces@openldap.org <mailto:openldap-technical-bounces@openldap.org> on behalf
of
hyc@symas.com <mailto:hyc@symas.com>> wrote: >Bryan Matsuo wrote: >> openldap-technical, >> >> I am working on some Go (golang) bindings[1] for the
LMDB
library and I have >> some interest in exposing the functionality of
mdb_set_compare
(and >> mdb_set_dupsort). But it is proving difficult and I
have a
question about the >> function(s). >> >> Calling mdb_set_compare from the Go runtime is
challenging.
Using C APIs with >> callbacks comes with restrictions[2][3]. I believe it impossible to bind these >> functions way that is flexible, as one would expect. A potential change to >> LMDB that would make binding drastically easier is
having
MDB_cmp_func to take >> a third "context" argument with type void*. Then a
binding
could safely use an >> arbitrary Go function for comparisons. >> >> Is it possible for future versions of LMDB to add a
third
argument to the >> MDB_cmp_func signature? Otherwise would it be
acceptable for a
variant API to >> be added using a different function type, one accepting
three
arguments? >> >> Thanks for the consideration. >> >> Cheers, >> - Bryan >> >> [1] Go bindings -- https://github.com/bmatsuo/lmdb-go >> [2] Cgo pointer restrictions -- >>
https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
>> [3] Cgo documentation -- https://golang.org/cmd/cgo/ > >I see nothing in these restrictions that requires extra information to be >passed from Go to C or from C to Go. > >There is a vague mention in [2] > >"A particular unsafe area is C code that wants to hold on
to Go
func and >pointer values for future callbacks from C to Go. This
works
today but is not >permitted by the invariant. It is hard to detect. One safe approach is: Go >code that wants to preserve funcs/pointers stores them
into a
map indexed by >an int. Go code calls the C code, passing the int, which
the C
code may store >freely. When the C code wants to call into Go, it passes
the int
to a Go >function that looks in the map and makes the call." > >But it's nonsense in this case - you want to pass a Go
function
pointer to C, >but the only way for C to use it is to call some *other*
Go
function? Sorry >but there is no other Go function for the mdb_cmp()
function to
call, the only >one it knows about is the function pointer that you pass. > >If this is what you're referring to, adding a context
pointer
doesn't achieve >anything. If this isn't what you're referring to, then
please
explain exactly >what you hope to achieve with this context pointer. > >-- > -- Howard Chu > CTO, Symas Corp. http://www.symas.com > Director, Highland Sun http://highlandsun.com/hyc/ > Chief Architect, OpenLDAP
http://www.openldap.org/project/
-- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
openldap-technical@openldap.org