RE: Random pg_upgrade test failure on drongo

From Hayato Kuroda (Fujitsu)
Subject RE: Random pg_upgrade test failure on drongo
Date
Msg-id TY3PR01MB988963F49BF9528CD8DEED3CF5B9A@TY3PR01MB9889.jpnprd01.prod.outlook.com
In reply to Re: Random pg_upgrade test failure on drongo  (Alexander Lakhin <exclusion@gmail.com>)
Responses Re: Random pg_upgrade test failure on drongo  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers
Dear Alexander,

> 
> I can easily reproduce this failure on my workstation by running 5 tests
> 003_logical_slots in parallel inside Windows VM with its CPU resources
> limited to 50%, like so:
> VBoxManage controlvm "Windows" cpuexecutioncap 50
> 
> set PGCTLTIMEOUT=180
> python3 -c "NUMITERATIONS=20;NUMTESTS=5;import os;tsts='';exec('for i in
> range(1,NUMTESTS+1):
> tsts+=f\"pg_upgrade_{i}/003_logical_slots \"'); exec('for i in
> range(1,NUMITERATIONS+1):print(f\"iteration {i}\");
> assert(os.system(f\"meson test --num-processes {NUMTESTS} {tsts}\") == 0)')"
> ...
> iteration 2
> ninja: Entering directory `C:\src\postgresql\build'
> ninja: no work to do.
> 1/5 postgresql:pg_upgrade_2 / pg_upgrade_2/003_logical_slots
> ERROR            60.30s   exit status 25
> ...
> pg_restore: error: could not execute query: ERROR:  could not create file
> "base/1/2683": File exists
> ...

Great. I do not have such an environment, so I could not reproduce it. This seems
to suggest that the failure occurred because the system was busy.

> I agree with your analysis and would like to propose a PoC fix (see
> attached). With this patch applied, 20 iterations succeeded for me.

Thanks, here are my comments. I'm not very familiar with Windows, so I may be
wrong about something.

* I'm not sure why the file/directory name is changed before doing an unlink.
  Could you add a description?
* IIUC, the important point is the latter part, which waits until the status
  changes. Based on that, can we remove the double rmtree() from cleanup_output_dirs()?
  They seem to have been added for a similar reason.

```
+    loops = 0;
+    while (lstat(curpath, &st) < 0 && lstat_error_was_status_delete_pending())
+    {
+        if (++loops > 100)        /* time out after 10 sec */
+            return -1;
+        pg_usleep(100000);        /* us */
+    }
```
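
For illustration only, the bounded wait quoted above can be sketched in isolation roughly as follows. This is not the patch itself. `wait_for_delete()`, `pending_check_fn` and `fake_pending()` are hypothetical names invented for this sketch, the Windows-specific `lstat()` / `lstat_error_was_status_delete_pending()` condition is replaced by a caller-supplied callback, and POSIX `nanosleep()` stands in for `pg_usleep()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true while the path is still "delete pending". */
typedef bool (*pending_check_fn) (const char *path, void *arg);

/*
 * Poll until check() stops reporting the pending state.
 * Returns 0 once the state has cleared, or -1 if it is still pending
 * after max_loops sleeps of sleep_us microseconds each.
 */
static int
wait_for_delete(const char *path, pending_check_fn check, void *arg,
                int max_loops, long sleep_us)
{
    int         loops = 0;

    assert(check != NULL);

    while (check(path, arg))
    {
        struct timespec ts;

        if (++loops > max_loops)
            return -1;          /* timed out */

        ts.tv_sec = sleep_us / 1000000L;
        ts.tv_nsec = (sleep_us % 1000000L) * 1000L;
        nanosleep(&ts, NULL);   /* assumption: POSIX stand-in for pg_usleep() */
    }
    return 0;
}

/* Test double: reports "pending" for the next *arg polls, then "gone". */
static bool
fake_pending(const char *path, void *arg)
{
    int        *remaining = arg;

    (void) path;
    if (*remaining > 0)
    {
        (*remaining)--;
        return true;
    }
    return false;
}
```

With `max_loops = 100` and `sleep_us = 100000`, this gives up after about 10 seconds, the same budget as the loop in the patch. Passing `fake_pending` with a counter lets the timeout path be exercised without a Windows machine.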

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
