New subject: [Issue 9316] performance issue when writing a high number of large objects

17 Aug 2020


      https://bugs.openldap.org/show_bug.cgi?id=9316
Issue ID: 9316
           Summary: performance issue when writing a high number of large
                    objects
           Product: LMDB
           Version: 0.9.24
          Hardware: x86_64
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: ---
         Component: liblmdb
          Assignee: bugs@openldap.org
          Reporter: JGabler@univa.com
  Target Milestone: ---
Created attachment 755
  --> https://bugs.openldap.org/attachment.cgi?id=755&action=edit
lmdb performance test reproducing the issue
When writing a large number of big objects, we see extreme variation in
performance, from very fast to extremely slow.
In the test scenario we write 10 chunks of 10,000 "jobs" (about 10 kB each) and
their corresponding "job scripts" (about 40 kB each), 200,000 objects in total.
We then delete all objects.
We run 10 iterations of this scenario.
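For a sense of scale, the workload per iteration can be worked out from the
figures above. The following is a minimal sketch: the object sizes are the
approximate values from the description, taken here as 10,000 and 40,000
bytes, and the function names are illustrative, not taken from the attached
test program.

```c
#include <assert.h>
#include <stdint.h>

/* Objects written per iteration: every job comes with one job script. */
uint64_t objects_per_iteration(uint64_t chunks, uint64_t jobs_per_chunk)
{
    return chunks * jobs_per_chunk * 2;
}

/* Approximate payload bytes written per iteration.
 * Keys and database page overhead are not counted (assumption). */
uint64_t bytes_per_iteration(uint64_t chunks, uint64_t jobs_per_chunk,
                             uint64_t job_bytes, uint64_t script_bytes)
{
    assert(chunks > 0 && jobs_per_chunk > 0);
    return chunks * jobs_per_chunk * (job_bytes + script_bytes);
}
```

With 10 chunks of 10,000 jobs this gives 200,000 objects and roughly 5 GB of
payload written, and later deleted, in every iteration.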
When running this scenario as part of Univa Grid Engine with LMDB as database
backend we get the following performance values (rows are the iteration,
columns the chunk of jobs):
Iteration         0         1         2         3         4         5         6         7         8         9
0            21.525    21.250    21.574    21.722    22.693    21.992    22.438    22.650    21.972    22.017
1            22.262    21.656    22.339    22.914    21.549    24.906    23.862  1531.189  1695.041  1491.255
2            36.071    21.619    22.074    22.927    23.455    27.239    22.640    22.802   633.956  1882.008
3            52.163    21.651    21.571    22.686    22.727    22.024    40.980    22.156    22.429   595.362
4            64.977    21.511    22.519    22.148    22.354    23.292    57.740    20.835    37.680   250.594
5            54.724    21.074    21.200    23.744    22.109    21.351    62.225    21.447    91.292   375.260
6            49.065    21.573    22.309    26.084    21.226    21.248    68.580    22.531    59.338   249.936
7            44.666    21.830    21.009    28.760    21.533    21.611    72.291    23.144    86.281   118.326
8            35.486    21.720    21.840    24.729    22.045    20.877    76.473    21.193   120.387   136.836
9            41.159    23.365    21.721    23.024    21.835    20.972    77.409    21.784   193.885   306.158
So writing 10,000 "jobs" plus their "job scripts" usually takes about 22
seconds, but after some time performance degrades massively.
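The size of the variation can be expressed as the ratio between the slowest
and the fastest chunk of an iteration. A minimal sketch; the function name is
illustrative and not part of the attached test program.

```c
#include <assert.h>
#include <stddef.h>

/* Ratio between the slowest and the fastest chunk time of one iteration. */
double slowdown_factor(const double *times, size_t n)
{
    assert(n > 0);
    double min = times[0], max = times[0];
    for (size_t i = 1; i < n; i++) {
        if (times[i] < min) min = times[i];
        if (times[i] > max) max = times[i];
    }
    return max / min;
}
```

For iteration 1 in the table above (fastest chunk 21.549, slowest 1695.041)
this gives a factor of roughly 79.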
With other database backends we do not see this behaviour; see the following
performance data from the same test run with the PostgreSQL backend, which is
slower (as expected, since it goes over the network) but provides constant
throughput:
Iteration         0         1         2         3         4         5         6         7         8         9
0            36.937    37.110    36.952    37.279    37.580    37.364    37.950    37.390    37.682    37.439
1            37.464    38.110    37.679    38.366    37.576    37.624    37.476    37.412    37.265    37.727
2            36.394    37.635    37.347    37.603    37.402    37.515    37.802    37.898    37.355    37.939
3            37.213    37.539    36.771    37.706    37.055    37.780    37.283    37.488    36.955    37.460
4            36.554    37.557    37.368    37.960    37.070    37.892    37.459    37.857    37.228    37.833
5            37.047    38.164    37.167    37.885    37.268    37.676    37.355    37.572    37.347    37.569
6            37.118    37.735    36.857    37.602    36.717    37.716    37.444    37.685    37.085    38.151
7            36.787    37.647    36.844    37.601    36.934    37.440    37.632    37.291    37.174    37.926
8            36.884    37.560    37.117    37.239    37.034    37.748    37.289    37.635    36.822    37.693
9            37.178    37.496    36.849    37.799    37.289    37.644    37.461    37.622    37.022    37.670
We can reproduce the issue with a small C program (see attachment) which does
essentially the same database operations as our database layer in Univa Grid
Engine but depends only on liblmdb.
It simulates the scenario described above and gives us the following
performance data, showing the extreme performance variation:
Iteration         0         1         2         3         4         5         6         7         8         9
0             0.686     0.625     0.660     0.637     0.631     0.741     0.757     0.658     0.651     0.614
1             0.705     0.838     0.690     0.772     0.663     3.248     0.605   542.762  1114.374   898.477
2            13.336     1.299     0.659     0.637     0.626     0.712    11.172     0.663    29.833  1161.884
3            26.774     0.647     0.607     0.586     0.583     0.639    24.893     0.629     3.837   423.248
4            32.802     0.629     0.616     0.560     0.550     0.605    31.133     0.625     6.606   195.150
5            34.819     0.623     0.628     0.582     0.564     0.609    32.275     0.607     7.599   134.106
6            26.319     0.622     0.582     0.548     0.551     0.590    28.536     0.611    36.429   160.781
7            21.878     0.814     0.668     0.736     0.614     0.543    24.355     0.626    36.583   148.337
8             4.129     0.654     5.674     0.596     0.566     0.554     7.158     0.633     0.599    48.799
9            30.278     0.608     0.608     0.560     0.549     0.587    29.253     0.606     9.593   128.339
It can be compiled on 64-bit Linux with
   gcc -I <path to lmdb>/include -L <path to lmdb>/lib -o test_lmdb_perf test_lmdb_perf.c -llmdb
To run the scenario described above, call it with the following parameters:
   ./test_lmdb_perf <path to database directory> 10 10 10000
We built and ran it on
- CentOS Linux release 7.7.1908 (Core)
- Linux biber 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
- built with gcc (GCC) 7.2.1 20170829 (Red Hat 7.2.1-1) from devtoolset-7
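Per-chunk timings like those in the tables above can be taken with a
monotonic clock. This is a minimal sketch, assuming POSIX clock_gettime; how
the attached program actually measures time is not stated here, and the helper
names and the callback signature are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Seconds elapsed between two timestamps. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Run one chunk of work and return its wall-clock duration in seconds.
 * write_chunk is a caller-supplied function (hypothetical) that performs
 * the database writes for one chunk. */
double time_chunk(void (*write_chunk)(void *), void *arg)
{
    struct timespec start, end;
    assert(write_chunk != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    write_chunk(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_seconds(start, end);
}
```

CLOCK_MONOTONIC is used here so that wall-clock adjustments during a long run
do not distort the measured durations.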