Обсуждение: On the stability of TAP tests for LDAP
Hi all, (Peter Eisentraut in CC) crake has just complained about a failure with the LDAP test suite: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-07-19%2001%3A33%3A31 # Running: /usr/sbin/slapd -f /home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/slapd.conf -h ldap://localhost:55306 ldaps://localhost:55307 # loading LDAP data # Running: ldapadd -x -y /home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/ldappassword -f authdata.ldif ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1) And FWIW, when running a parallel check-world to make my laptop busy enough, it is rather usual to face failures with this test, which is annoying. Shouldn't we have at least a number of retries with intermediate sleeps for the commands run in 001_auth.pl? As far as I recall, I think that we can run into failures when calling ldapadd and ldappasswd. Thanks, -- Michael
Вложения
On Fri, Jul 19, 2019 at 3:30 PM Michael Paquier <michael@paquier.xyz> wrote: > # Running: /usr/sbin/slapd -f > /home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/slapd.conf > -h ldap://localhost:55306 ldaps://localhost:55307 > # loading LDAP data > # Running: ldapadd -x -y > /home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/ldappassword > -f authdata.ldif > ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1) > > And FWIW, when running a parallel check-world to make my laptop busy > enough, it is rather usual to face failures with this test, which is > annoying. Shouldn't we have at least a number of retries with > intermediate sleeps for the commands run in 001_auth.pl? As far as I > recall, I think that we can run into failures when calling ldapadd and > ldappasswd. Yeah, it seems we need to figure out a way to wait for it to be ready to accept connections. I wondered how other people do this, and found one example that polls for the .pid file: https://github.com/tiredofit/docker-openldap/blob/master/install/etc/cont-init.d/10-openldap#L347 That looks nice and tidy but I'm not sure it can be trusted, given that OpenLDAP's own tests poll a trivial ldapsearch (also based on a cursory glance at slapd/main.c which appears to write the .pid file a bit too early, though I may have misread): https://github.com/openldap/openldap/blob/master/tests/scripts/test039-glue-ldap-concurrency#L59 I guess we should do that too. I don't know how to write Perl but I'll try... -- Thomas Munro https://enterprisedb.com
On Wed, Jul 24, 2019 at 3:52 PM Thomas Munro <thomas.munro@gmail.com> wrote: > I guess we should do that too. I don't know how to write Perl but I'll try... Does this look about right? -- Thomas Munro https://enterprisedb.com
Вложения
On Wed, Jul 24, 2019 at 04:41:05PM +1200, Thomas Munro wrote: > On Wed, Jul 24, 2019 at 3:52 PM Thomas Munro <thomas.munro@gmail.com> wrote: > > I guess we should do that too. I don't know how to write Perl but I'll try... > > Does this look about right? Some comments from here. I have not tested the patch. I would recommend using TestLib::system_log instead of plain system(). The command should be a list of arguments with one element per argument (see call of system_log in PostgresNode.pm for example). The indentation is incorrect, and that I would make the retry longer as I got the feeling that on slow machines we could still have issues. We also usually tend to increase the timeout up to 5 minutes, and the sleep phases make use of Time::HiRes::usleep. -- Michael
Вложения
On Wed, Jul 24, 2019 at 5:26 PM Michael Paquier <michael@paquier.xyz> wrote: > > Does this look about right? > > Some comments from here. I have not tested the patch. > > I would recommend using TestLib::system_log instead of plain system(). > The command should be a list of arguments with one element per > argument (see call of system_log in PostgresNode.pm for example). The > indentation is incorrect, and that I would make the retry longer as I > got the feeling that on slow machines we could still have issues. We > also usually tend to increase the timeout up to 5 minutes, and the > sleep phases make use of Time::HiRes::usleep. Thanks, here's v2. -- Thomas Munro https://enterprisedb.com
Вложения
On Wed, Jul 24, 2019 at 05:47:13PM +1200, Thomas Munro wrote: > Thanks, here's v2. Perhaps this worked on freebsd? Now that I test it, the test gets stuck on my Debian box: # waiting for slapd to accept requests... # Running: ldapsearch -h localhost -p 49534 -s base -b dc=example,dc=net -n 'objectclass=*' SASL/DIGEST-MD5 authentication started Please enter your password: ldap_sasl_interactive_bind_s: Invalid credentials (49) additional info: SASL(-13): user not found: no secret in database pgperltidy complains about the patch indentation using perltidy v20170521 (version mentioned in tools/pgindent/README). -- Michael
Вложения
On Wed, Jul 24, 2019 at 7:50 PM Michael Paquier <michael@paquier.xyz> wrote: > Perhaps this worked on freebsd? Now that I test it, the test gets > stuck on my Debian box: > # waiting for slapd to accept requests... > # Running: ldapsearch -h localhost -p 49534 -s base -b > dc=example,dc=net -n 'objectclass=*' > SASL/DIGEST-MD5 authentication started > Please enter your password: > ldap_sasl_interactive_bind_s: Invalid credentials (49) > additional info: SASL(-13): user not found: no secret in > database Huh, yeah, I don't know why slapd requires credentials on Debian, when the version that ships with FreeBSD is OK with an anonymous connection. Rather than worrying about that, I just adjusted it to supply the credentials. It works on both for me. > pgperltidy complains about the patch indentation using perltidy > v20170521 (version mentioned in tools/pgindent/README). Fixed. -- Thomas Munro https://enterprisedb.com
Вложения
On Wed, Jul 24, 2019 at 09:01:47PM +1200, Thomas Munro wrote: > Huh, yeah, I don't know why slapd requires credentials on Debian, when > the version that ships with FreeBSD is OK with an anonymous > connection. Rather than worrying about that, I just adjusted it to > supply the credentials. It works on both for me. Thanks for the updated patch, this looks good. I have done a series of tests keeping my laptop busy and I haven't seen a failure where I usually see problems 10%~20% of the time. So that seems to help, thanks! -- Michael
Вложения
On Thu, Jul 25, 2019 at 12:51 PM Michael Paquier <michael@paquier.xyz> wrote: > Thanks for the updated patch, this looks good. I have done a series > of tests keeping my laptop busy and I haven't seen a failure where I > usually see problems 10%~20% of the time. So that seems to help, > thanks! Pushed, thanks. -- Thomas Munro https://enterprisedb.com
On Fri, Jul 26, 2019 at 10:18:48AM +1200, Thomas Munro wrote: > Pushed, thanks. Thanks for fixing! I'll update this thread if there are still some problems. -- Michael