Re: LMDB and text encoding

2 Feb 2015


Timur Kristóf wrote:
...
Hi Everyone,
I've been talking to Howard about this, and he suggested posting it to
this mailing list. I recently noticed two things about how LMDB handles
text encodings that I think are worth discussing.
...

Path names

Functions like mdb_env_open, mdb_env_get_path, mdb_env_copy and the
like take a char* for path names. This is fine on most Unixes, where a
char* path is conventionally a UTF-8 string. On Windows, unfortunately,
these functions call the ANSI variants of the Windows API functions,
which interpret the path in the active code page and so make it
impossible to use arbitrary Unicode path names with them.
I think we should switch to the wide-character APIs instead, but that
would also mean changing the LMDB API to accept a wchar_t* parameter on
Windows instead of char*.
What do you guys think about all this?

I just had a look at how BDB handled this. As you can see, they used a
TO_TSTRING macro to convert incoming path names from UTF-8 to UTF-16.
https://gitorious.org/berkeleydb/berkeleydb/source/347d239a1e44ed4f773ae9274...
https://gitorious.org/berkeleydb/berkeleydb/source/347d239a1e44ed4f773ae9274...
(And a FROM_TSTRING for the reverse, as well.)
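
To make the conversion step concrete, here is a minimal, portable sketch of
what a UTF-8 to UTF-16 helper has to do. It is not the BDB macro and not
LMDB code: the name utf8_to_utf16 and its signature are hypothetical, and it
does not reject overlong encodings. On Windows itself the usual route would
be MultiByteToWideChar with CP_UTF8 rather than a hand-written decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of LMDB or BDB): convert a NUL-terminated
 * UTF-8 string into UTF-16 code units. Returns the number of code units
 * written, excluding the terminating 0, or -1 if the input is malformed
 * or dst is too small. Overlong encodings are not rejected. */
static long utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_len)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    while (*s) {
        uint32_t cp;
        int extra;

        /* Decode the lead byte and work out how many bytes follow. */
        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return -1;               /* stray continuation or bad lead byte */
        s++;

        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)
                return -1;            /* truncated or broken sequence */
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;                /* outside Unicode, or a surrogate */

        if (cp < 0x10000) {
            if (n + 1 >= dst_len)     /* one unit plus the terminator */
                return -1;
            dst[n++] = (uint16_t)cp;
        } else {
            if (n + 2 >= dst_len)     /* surrogate pair plus the terminator */
                return -1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= dst_len)                 /* only reachable for an empty input */
        return -1;
    dst[n] = 0;
    return (long)n;
}
```

A wrapper along these lines would let the public API keep taking char*
(treated as UTF-8 everywhere) and only convert at the point where a
wide-character Windows function is called, which avoids exposing wchar_t*
in the LMDB API.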
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
