Hello list
We use the ppolicy overlay to enforce the password lifecycle. We recently hit the following issue, and I am now looking for countermeasures to minimize the risk of it recurring. We use an OpenLDAP server for user authentication. It stores the objects of real users as well as system users (for daemons and so on). We run a redundant setup with two OpenLDAP servers in mirror mode (multi-master).
- A few days ago I found that I was unable to log into a service that uses this LDAP as its authentication backend.
- I found that BOTH OpenLDAP servers were down (the process simply wasn't listening).
- I checked the LDAP database partition (a dedicated partition storing both the DB and the BDB transaction log files) and it was full on both servers.
- The cause was a number of BDB log files created within the last few minutes before the daemon went down.
- Based on the slapd logs, it seems that one system user (used by Nagios) had an expired password; I had forgotten to disable password expiration for it.
- It seems that the failed authentication attempts caused the transaction logs to fill the partition: for each failed bind, a new "pwdFailureTime" value was added to the object, which is basically a normal LDAP modify operation and therefore goes through the transaction log.
- Because that system user was used by Nagios for various purposes and the LDAP BIND rate was really high, this effectively behaved like a DoS that killed my LDAP servers by exhausting the partition space.
Obviously I have fixed the policy for that system user so that its password never expires. But this DoS can basically be reproduced by any real user from outside to effectively kill those LDAP servers. Redundancy with multiple servers does not provide any benefit, as the pwdFailureTime modification is propagated to all servers in the cluster, with the same effect on disk space. Expanding the partition will not help either: it only extends service availability in proportion to the allocated space, and the BDB log consumption was really huge: 15 log files (10 MB each) were created within just two minutes!
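To put the reported figures in perspective (15 log files of 10 MB each within two minutes), here is a rough back-of-the-envelope sketch. The free-space values are hypothetical and only illustrate why a larger partition merely delays the outage; the function names are made up for this example.

```python
# Figures reported in the message: 15 BDB log files, 10 MB each, in 2 minutes.
LOG_FILES = 15
LOG_FILE_MB = 10
WINDOW_MINUTES = 2


def log_growth_mb_per_minute(files=LOG_FILES, file_mb=LOG_FILE_MB,
                             minutes=WINDOW_MINUTES):
    """Average transaction-log growth in MB per minute."""
    return files * file_mb / minutes


def minutes_until_full(free_mb, growth_mb_per_minute):
    """Minutes until a partition with free_mb of free space is exhausted."""
    return free_mb / growth_mb_per_minute


if __name__ == "__main__":
    rate = log_growth_mb_per_minute()
    print(f"log growth: {rate:.0f} MB/min")
    # Hypothetical amounts of free space, not taken from the thread.
    for free_gb in (10, 20, 40):
        minutes = minutes_until_full(free_gb * 1024, rate)
        print(f"{free_gb} GB free -> full after about {minutes:.0f} minutes")
```

Doubling the free space only doubles the time to failure, which matches the observation that expanding the partition does not solve the problem.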
Now the question: has anybody considered this "effect" of using the "pwdFailureTime" attribute? If so, what can I do to prevent this behavior? Or how do you deal with this kind of potential issue? On the one hand, it is good to see some history of failed attempts. Limiting pwdFailureTime to some maximum number of values will not help either, as the LDAP modify operation has to be done anyway. For me the only useful option is to NOT use the pwdFailureTime attribute at all, but how do I do that? I haven't found any way to disable it.
openldap-2.4.40/Centos6
many thanks for help
michal
Bruncko Michal wrote:
Now the question: has anybody considered this "effect" of using the "pwdFailureTime" attribute? If so, what can I do to prevent this behavior? Or how do you deal with this kind of potential issue? On the one hand, it is good to see some history of failed attempts. Limiting pwdFailureTime to some maximum number of values will not help either, as the LDAP modify operation has to be done anyway. For me the only useful option is to NOT use the pwdFailureTime attribute at all, but how do I do that? I haven't found any way to disable it.
This is ITS#8327. The fix is released in 2.4.44.
You should upgrade.
You should not be using any BerkeleyDB-based backends, use back-mdb which does not need transaction log files.
On Feb 21, 2016, at 11:48, Howard Chu hyc@symas.com wrote:
Bruncko Michal wrote:
Hello list
We use the ppolicy overlay to enforce the password lifecycle. We recently hit the following issue, and I am now looking for countermeasures to minimize the risk of it recurring.
[…]
Now the question: has anybody considered this "effect" of using the "pwdFailureTime" attribute? If so, what can I do to prevent this behavior? Or how do you deal with this kind of potential issue? On the one hand, it is good to see some history of failed attempts. Limiting pwdFailureTime to some maximum number of values will not help either, as the LDAP modify operation has to be done anyway. For me the only useful option is to NOT use the pwdFailureTime attribute at all, but how do I do that? I haven't found any way to disable it.
This is ITS#8327. The fix is released in 2.4.44.
You should upgrade.
You should not be using any BerkeleyDB-based backends, use back-mdb which does not need transaction log files.
If you cannot upgrade for some reason, someone wrote a Perl script that deletes 'excessive' pwdFailureTime attributes:
http://www.openldap.org/lists/openldap-bugs/201507/msg00012.html
Hello David
thanks for the reply. This script can be useful, but it does not improve the situation: for each failed bind attempt, a new pwdFailureTime value will be created anyway, which results in the same modify operation going through the transaction log. The script only helps to reduce the overall number of pwdFailureTime values that are already in the LDAP DB.
What could help as well is a configuration variable that defines the maximum number of pwdFailureTime values, so that once the number of stored values reaches that maximum, the attribute is not updated anymore. Yes, this would keep the NUM oldest failed attempts, but it would ensure that pwdFailureTime is not updated forever. But this seems to be a request for a ppolicy overlay code change rather than for an external script.
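The difference between these approaches can be illustrated with a toy model that counts modify operations. This is not the ppolicy overlay's code; the strategy names, the `simulate` function, and the assumption that purging the oldest value happens in the same modify as the add are all made up for illustration.

```python
def simulate(failed_binds, strategy, cap=5):
    """Return (modify_ops, stored_values) after a run of failed binds.

    Toy model only: each recorded failure counts as one modify operation,
    i.e. one write that would go through the transaction log.
    """
    if strategy not in ("unbounded", "purge_oldest", "freeze_at_cap"):
        raise ValueError(f"unknown strategy: {strategy}")

    stored = []      # pwdFailureTime values currently on the entry
    modify_ops = 0   # writes caused by failed binds

    for attempt in range(failed_binds):
        if strategy == "freeze_at_cap" and len(stored) >= cap:
            # Proposed behaviour: once the cap is reached, leave the entry alone.
            continue
        stored.append(attempt)
        modify_ops += 1
        if strategy == "purge_oldest" and len(stored) > cap:
            # Assumed to happen within the same modify as the add above.
            del stored[0]

    return modify_ops, len(stored)


if __name__ == "__main__":
    for name in ("unbounded", "purge_oldest", "freeze_at_cap"):
        ops, values = simulate(1000, name)
        print(f"{name}: {ops} modify ops, {values} values stored")
```

In this model, purging old values bounds the size of the entry but not the number of writes; only the "stop updating at the cap" variant bounds the writes.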
thanks
michal
On Feb 22, 2016, at 07:22, Bruncko Michal Michal.Bruncko@zssos.sk wrote: […]
What could help as well is a configuration variable that defines the maximum number of pwdFailureTime values, so that once the number of stored values reaches that maximum, the attribute is not updated anymore. Yes, this would keep the NUM oldest failed attempts, but it would ensure that pwdFailureTime is not updated forever. But this seems to be a request for a ppolicy overlay code change rather than for an external script.
It was fixed in 2.4.43 (2015/11/30):
Fixed slapo-ppolicy to allow purging of stale pwdFailureTime attributes (ITS#8185)
http://www.openldap.org/software/release/changes.html
From the bug report:
I've added a pwdMaxRecordedFailure attribute to the policy schema. Overloading pwdMaxFailure would be a mistake.
MaxRecordedFailure will default to MaxFailure if that is set. It defaults to 5 if nothing is set. There's no good reason to allow the timestamps to accumulate without bound.
http://www.openldap.org/its/index.cgi/?findid=8185
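The default rule quoted from the bug report can be sketched as follows. This is only an illustration of the described behaviour, not the overlay's implementation: the function names are hypothetical, treating 0 or None as "not set" is an assumption, and so is keeping the newest timestamps when trimming.

```python
from typing import List, Optional

DEFAULT_RECORDED_FAILURES = 5  # "It defaults to 5 if nothing is set."


def effective_max_recorded(pwd_max_recorded_failure: Optional[int],
                           pwd_max_failure: Optional[int]) -> int:
    """Resolve how many pwdFailureTime values to keep.

    Assumption: None or 0 means the policy attribute is not set.
    """
    if pwd_max_recorded_failure:
        return pwd_max_recorded_failure
    if pwd_max_failure:
        return pwd_max_failure
    return DEFAULT_RECORDED_FAILURES


def trim_failures(timestamps: List[str], limit: int) -> List[str]:
    """Keep only the newest `limit` timestamps (assumed purge order).

    Timestamps are expected to be UTC strings in one common format such as
    YYYYMMDDHHMMSSZ, which sort chronologically as plain strings.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return sorted(timestamps)[-limit:]


if __name__ == "__main__":
    limit = effective_max_recorded(None, 3)
    failures = ["20160221114803Z", "20160221114801Z",
                "20160221114802Z", "20160221114804Z"]
    print(limit, trim_failures(failures, limit))
```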
You will probably need to compile from source (or build an RPM yourself via the spec file).
--On Monday, February 22, 2016 9:01 PM -0500 David Magda dmagda@ee.ryerson.ca wrote:
http://www.openldap.org/its/index.cgi/?findid=8185
You will probably need to compile from source (or build an RPM yourself via the spec file).
"the spec file"? OpenLDAP does not provide a spec file. However, there are good alternatives like the LTB project and Symas, as I already noted, for those who aren't able to compile from source for whatever reasons.
--Quanah
--
Quanah Gibson-Mount Platform Architect Zimbra, Inc. -------------------- Zimbra :: the leader in open source messaging and collaboration A division of Synacor, Inc
On Feb 22, 2016, at 21:24, Quanah Gibson-Mount quanah@zimbra.com wrote:
"the spec file"? OpenLDAP does not provide a spec file. However, there are good alternatives like the LTB project and Symas, as I already noted, for those who aren't able to compile from source for whatever reasons.
Unless one wants to write a spec file from scratch, I find it easier to take the existing one that was used to build the (usually older) version of a software package included in a distro, point to the newer tarball, and go from there.
You get both the newer code as well as packages that are organized around how the distribution expects them.
--On Monday, February 22, 2016 9:49 PM -0500 David Magda dmagda@ee.ryerson.ca wrote:
Unless one wants to write a spec file from scratch, I find it easier to take the existing one that was used to build the (usually older) version of a software package included in a distro, point to the newer tarball, and go from there.
You get both the newer code as well as packages that are organized around how the distribution expects them.
A good reason not to do that is (a) your package should not interfere with the system packages (i.e., it should not be building into /etc, /usr/bin, etc.) and (b) RHEL/CentOS link to MozNSS, which is very problematic and should be avoided. It makes much more sense to start with something like the LTB project, and base anything off their spec if not using their pre-compiled packages.
--Quanah
Hello Howard
On 2016-02-21 17:48, Howard Chu wrote:
This is ITS#8327. The fix is released in 2.4.44.
You should upgrade.
You should not be using any BerkeleyDB-based backends, use back-mdb which does not need transaction log files.
Many thanks for this. It is a bit odd that even the latest CentOS 7 (which we wanted to use for the upgrade) ships an old version, so the only option would be to build from scratch.
Is there any option to stop using the pwdFailureTime attribute? If I set a global ACL rule like this:
access to attrs=pwdFailureTime by * none
...will it work? I assume not, as the ppolicy overlay is not represented by any DN during the modification.
thanks
michal
--On Monday, February 22, 2016 1:13 PM +0100 Bruncko Michal Michal.Bruncko@zssos.sk wrote:
Many thanks for this. It is a bit odd that even the latest CentOS 7 (which we wanted to use for the upgrade) ships an old version, so the only option would be to build from scratch.
No, it is not odd. As has been mentioned hundreds if not thousands of times on this list, in general, distribution packages of OpenLDAP should *not* be used for running production instances of OpenLDAP. The packages provided by RHEL/CentOS have additional issues such as being linked to MozNSS which has a number of problems.
Please see: http://www.openldap.org/faq/data/cache/1456.html
I would note that if building and maintaining OpenLDAP yourself is not something you wish to do, the LTB project (http://ltb-project.org/wiki/download#openldap) and Symas (https://symas.com/products/openldap-directory/) both offer prebuilt packages of OpenLDAP. Both of these sanely link to OpenSSL.
Regards, Quanah
Thanks Quanah, it looks like this is the way I should go for our LDAP deployment.
br,
michal
openldap-technical@openldap.org