Re: Finding cause of test fails on the cfbot site

From Thomas Munro
Subject Re: Finding cause of test fails on the cfbot site
Msg-id CA+hUKGLPBXo7FFUgitTOBwMn7vdR4zp0N-VTA9PbeBwR_NPVHA@mail.gmail.com
In response to Re: Finding cause of test fails on the cfbot site  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Finding cause of test fails on the cfbot site  (Thomas Munro <thomas.munro@gmail.com>)
Re: Finding cause of test fails on the cfbot site  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Thu, Feb 18, 2021 at 9:18 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2/17/21 11:06 AM, Tom Lane wrote:
> > Peter Smith <smithpb2250@gmail.com> writes:
> >> I saw that one of our commitfest entries (32/2914) is recently
> >> reporting a fail on the cfbot site [1]. I thought this was all ok a
> >> few days ago.
> >> ...
> >> Is there any other detailed information available anywhere, e.g.
> >> logs?, which might help us work out what was the cause of the test
> >> failure?
> > AFAIK the cfbot doesn't capture anything beyond the session typescript.
> > However, this doesn't look that hard to reproduce locally ... have you
> > tried, using similar configure options to what that cfbot run did?
> > Once you did reproduce it, there'd be logs under
> > contrib/test_decoding/tmp_check/.
>
> yeah. The cfbot runs check-world which makes it difficult for it to know
> which log files to show when there's an error. That's a major part of
> the reason the buildfarm runs a much finer grained set of steps.

Yeah, it's hard to make it print out just the right logs without
dumping so much stuff that it's hard to see the wood for the trees;
perhaps if the Makefile had an option to dump relevant stuff for the
specific tests that failed, or perhaps the buildfarm is already better
at that and cfbot should just use the buildfarm client directly.  Hmm.
Another idea would be to figure out how to make a tarball of all log
files that you can download for inspection with better tools at home
when things go wrong.  It would rapidly blow through the 1GB limit for
stored "artefacts" on open source/community Cirrus accounts though, so
we'd need to figure out how to manage retention carefully.
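
To make the tarball idea concrete, here is a rough sketch in Python. It is
only an illustration, not what cfbot does: the function name collect_logs,
the file suffixes, and the size budget are all assumptions, and the only
detail taken from this thread is that logs end up under tmp_check
directories. It skips any file that would push the archive contents past
the budget, which is one crude way of staying under an artefact limit.

```python
import os
import tarfile


def collect_logs(root, out_path, max_bytes=50 * 1024 * 1024):
    """Pack log-like files found under tmp_check directories into a tarball.

    Files that would push the total uncompressed size past max_bytes are
    skipped. Returns the archived paths, relative to root.
    """
    # Hypothetical choice of suffixes; adjust to whatever the tests leave behind.
    wanted_suffixes = (".log", ".diffs", ".out")
    archived = []
    total = 0
    with tarfile.open(out_path, "w:gz") as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit directories in a stable order
            if "tmp_check" not in dirpath.split(os.sep):
                continue
            for name in sorted(filenames):
                if not name.endswith(wanted_suffixes):
                    continue
                full = os.path.join(dirpath, name)
                size = os.path.getsize(full)
                if total + size > max_bytes:
                    continue  # over budget, leave this file out
                rel = os.path.relpath(full, root)
                tar.add(full, arcname=rel)
                archived.append(rel)
                total += size
    return archived
```

Something like collect_logs(".", "/tmp/logs.tar.gz") run from the top of the
source tree after a failed run would then give one file to upload. It does
nothing about retention, which would still need separate handling.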

For what it's worth, I tried to reproduce this particular failure on a couple
of systems, many times, with no luck.  It doesn't look like a freak CI
failure (there have been a few random terminations I can't explain
recently, but they look different, I think there was a Google Compute
Engine outage that might explain that), and it failed in exactly the
same way on Linux and FreeBSD.  I tried locally on FreeBSD, on top of
commit a975ff4980d60f8cbd8d8cbcff70182ea53e787a (which is what the
last cfbot run did), because it conflicts with a recent change so it
doesn't apply on the tip of master right now.
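
For anyone who wants to repeat that attempt, the steps can be sketched as
below. This is a hedged outline, not the exact procedure used: the commit
hash and the check-world target come from this thread, while the patch file
name, the use of "git apply", and the configure arguments are assumptions
that you would replace with the real patch and the options from the cfbot
run.

```python
import subprocess

# Commit the last cfbot run was based on, as quoted in this message.
COMMIT = "a975ff4980d60f8cbd8d8cbcff70182ea53e787a"


def repro_commands(patch_file, configure_args=()):
    """Return the commands to run, in order, as argument lists."""
    return [
        ["git", "checkout", COMMIT],
        ["git", "apply", patch_file],      # assumption: patch applies with git apply
        ["./configure", *configure_args],  # assumption: copy options from the cfbot log
        ["make", "check-world"],
    ]


def run_repro(source_dir, patch_file, configure_args=()):
    """Run each step inside source_dir, stopping at the first failure."""
    for cmd in repro_commands(patch_file, configure_args):
        subprocess.run(cmd, cwd=source_dir, check=True)
```

If check-world fails, run_repro raises subprocess.CalledProcessError, and
the logs should then be under the relevant tmp_check directory.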


