RE: Random pg_upgrade test failure on drongo

From Hayato Kuroda (Fujitsu)
Subject RE: Random pg_upgrade test failure on drongo
Date
Msg-id TY3PR01MB9889CD6B11182AEBDA95B798F582A@TY3PR01MB9889.jpnprd01.prod.outlook.com
In reply to Re: Random pg_upgrade test failure on drongo  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers
Dear Alexander, Andrew,
 
Thanks for your analysis!
 
> I see that behavior on:
> Windows 10 Version 1607 (OS Build 14393.0)
> Windows Server 2016 Version 1607 (OS Build 14393.0)
> Windows Server 2019 Version 1809 (OS Build 17763.1)
>
> But it's not reproduced on:
> Windows 10 Version 1809 (OS Build 17763.1) (triple-checked)
> Windows Server 2019 Version 1809 (OS Build 17763.592)
> Windows 10 Version 22H2 (OS Build 19045.3693)
> Windows 11 Version 21H2 (OS Build 22000.613)
>
> So it looks like the failure occurs depending not on Windows edition, but
> rather on its build. For Windows Server 2019 the "good" build is
> somewhere between 17763.1 and 17763.592, but for Windows 10 it's between
> 14393.0 and 17763.1.
> (Maybe there was some change related to
> FILE_DISPOSITION_POSIX_SEMANTICS/
> FILE_DISPOSITION_ON_CLOSE implementation; I don't know where to find
> information about that change.)
>
> It's also interesting, what is full version/build of OS on drongo and
> fairywren.
 
Thanks for your interest in the issue. I have been tracking the failure, but it has not occurred again.
Your analysis suggests that the BF (buildfarm) failures could be resolved by updating the OSes.
 
> I think that's because unlink() is performed asynchronously on those old
> Windows versions, but rename() is always synchronous.
 
OK. Actually, I could not find any documentation describing this behavior, but your experiment demonstrates it.
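
To make the suspected behavior concrete, here is a minimal sketch of what "wait until an asynchronous unlink() has really taken effect" could look like. It is not taken from the patch under discussion: unlink_and_wait() and its parameters are hypothetical names, and the code uses plain POSIX calls, so it only illustrates the polling idea rather than the actual Windows code path.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): request removal of a
 * file, then poll until the name has really disappeared or we run out of
 * attempts.  Returns 0 once the path no longer exists, -1 otherwise.
 */
static int
unlink_and_wait(const char *path, int max_tries, long delay_ms)
{
    struct stat st;
    struct timespec delay;
    int         i;

    assert(path != NULL);

    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    /* An already-missing file is treated as success. */
    if (unlink(path) != 0 && errno != ENOENT)
        return -1;

    for (i = 0; i < max_tries; i++)
    {
        /* The removal is complete once the name can no longer be found. */
        if (stat(path, &st) != 0 && errno == ENOENT)
            return 0;
        nanosleep(&delay, NULL);
    }

    return -1;                  /* still visible after all attempts */
}
```

On a system where unlink() is synchronous the loop returns on its first iteration, so the extra check costs one stat() call. If the removal is deferred, as suspected on the older Windows builds, a caller would need a wait of this kind before recreating the same path.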
 
> I've managed to reproduce that issue (or at least a situation that
> manifested similarly) with a sleep added in miscinit.c:
>          ereport(IsPostmasterEnvironment ? LOG : NOTICE,
>                         (errmsg("database system is shut down")));
> +       pg_usleep(500000L);
>
> With this change, I get the same warning as in [1] when running in
> parallel 10 tests 002_pg_upgrade with a minimal olddump (on iterations
> 33, 46, 8). And with my PoC patch applied, I could see the same warning
> as well (on iteration 6).
>
> I believe that's because rename() can't rename a directory containing an
> open file, just as unlink() can't remove it.
>
> In the light of the above, I think that the issue in question should be
> fixed in accordance with/as a supplement to [2].
 
OK, I understand that more needs to be fixed in this area. For now, we should stay focused on the issue at hand.
 
Your patch looks good, but it needs more review from developers familiar with Windows.
What do others think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

