Full_Name: Lorenz Bauer Version: OS: Linux URL: Submission from: (NULL) (2a06:98c0:1000:8200:18fd:e772:495e:6163)
This is in some ways a follow up to ITS#8505. In the issue I argued that LMDB did not deal well with fork/exec. Hallvard committed a fix in 04acac634a7b276332e2. I think the fix still has one major weakness: me_fd does not have O_CLOEXEC set.
Now, Hallvard omitted this change on purpose. mdb_env_get_fd() exposes me_fd to the user, so there could conceivably be client code out there which expects the fd to persist across exec(). (Please correct my if I misunderstood your argument.)
I believe that the correct action is to set O_CLOEXEC on me_fd by default, and break such client code.
1. Using mdb_env_get_fd this way is Undefined Behaviour (according to the doc)
Before ITS#8505, the documentation made it sound like ANY use after fork() / exec() is disallowed. At the same time, mdb_env_get_fd made no mention / promises wrt exec() behaviour.
Adopting O_CLOEXEC in my opinion does not violate the API set out so far. We can, in the opposite, clarify what is acceptable use (as Hallvard already has done).
2. mdb_env_get_fd has inconsistent semantics across OSs
When creating file handles on Windows we pass NULL as the lpSecurityAttributes argument to CreateFileW:
If this parameter is NULL, the handle returned by CreateFile cannot be inherited by any child processes the application may create [...]
me_fd already has O_CLOEXEC on Windows.
3. Manually closing me_fd after fork() is not possible in all environments
Hallvard suggests calling fork(), close(mdb_env_get_fd()) and then exec() for users which do not want to leak me_fd to child processes.
We use LMDB from the Go platform, which does not expose fork and exec separately (due to the well-known pitfalls they have). We can't use Hallvard's suggestion therefore.
4. me_fd can still be passed around by dup()ing the handle
If there are legitimate uses of passing the handle around, users can explicitly dup the handle and then pass it on to child processes.
I hope my arguments are convincing enough to make the change before the next release. The benefits are
* Consistent semantics between OS * (In my view) saner defaults * and a solution which works well across environments.
The downside is that we might break unknown client code. I can'c come up with a use case of passing me_fd to children, so I suspect the risk is low.
The patch itself is trivial, and I'm happy to contribute it if desired.
Lorenz