Re: Testing autovacuum wraparound (including failsafe)
От | Alexander Lakhin |
---|---|
Тема | Re: Testing autovacuum wraparound (including failsafe) |
Дата | |
Msg-id | 5811175c-1a31-4869-032f-7af5e3e4506a@gmail.com обсуждение исходный текст |
Ответ на | Re: Testing autovacuum wraparound (including failsafe) (Masahiko Sawada <sawada.mshk@gmail.com>) |
Список | pgsql-hackers |
Hello, 30.11.2023 10:35, Masahiko Sawada wrote: > >> I've attached new version patches (0002 and 0003 are unchanged except >> for the commit message). I'll push them, barring any objections. >> > Pushed. I've discovered that the test 001_emergency_vacuum.pl can fail due to a race condition. I can't see the server log at [1], but I reproduced the failure locally and with additional logging and log_min_messages = DEBUG3, the log shows: ... 2024-05-22 11:46:28.125 UTC [21256:2853] DEBUG: SlruScanDirectory invoking callback on pg_xact/0690 2024-05-22 11:46:28.125 UTC [21256:2854] DEBUG: transaction ID wrap limit is 2147484396, limited by database with OID 5 2024-05-22 11:46:28.126 UTC [21256:2855] LOG: !!!SendPostmasterSignal| PMSIGNAL_START_AUTOVAC_LAUNCHER 2024-05-22 11:46:28.135 UTC [14871:20077] DEBUG: postmaster received pmsignal signal 2024-05-22 11:46:28.137 UTC [21257:1] DEBUG: autovacuum launcher started 2024-05-22 11:46:28.137 UTC [21257:2] DEBUG: InitPostgres 2024-05-22 11:46:28.138 UTC [21257:3] LOG: !!!AutoVacLauncherMain| !AutoVacuumingActive() && !ShutdownRequestPending; before do_start_worker() 2024-05-22 11:46:28.138 UTC [21257:4] LOG: !!!do_start_worker| return quickly when there are no free workers 2024-05-22 11:46:28.138 UTC [21257:5] DEBUG: shmem_exit(0): 4 before_shmem_exit callbacks to make 2024-05-22 11:46:28.138 UTC [21257:6] DEBUG: shmem_exit(0): 6 on_shmem_exit callbacks to make 2024-05-22 11:46:28.138 UTC [21257:7] DEBUG: proc_exit(0): 1 callbacks to make 2024-05-22 11:46:28.138 UTC [21257:8] DEBUG: exit(0) 2024-05-22 11:46:28.138 UTC [21257:9] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2024-05-22 11:46:28.138 UTC [21257:10] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2024-05-22 11:46:28.138 UTC [21257:11] DEBUG: proc_exit(-1): 0 callbacks to make 2024-05-22 11:46:28.146 UTC [21256:2856] DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 5 2024-05-22 11:46:28.146 UTC [21256:2857] DEBUG: MultiXact member stop limit is now 4294914944 based on MultiXact 1 2024-05-22 11:46:28.146 UTC [21256:2858] DEBUG: shmem_exit(0): 4 before_shmem_exit callbacks to make 2024-05-22 11:46:28.147 UTC [21256:2859] DEBUG: shmem_exit(0): 7 on_shmem_exit callbacks to make 2024-05-22 11:46:28.147 UTC [21256:2860] DEBUG: proc_exit(0): 1 callbacks to make 2024-05-22 11:46:28.147 UTC [21256:2861] DEBUG: exit(0) 2024-05-22 11:46:28.147 UTC [21256:2862] DEBUG: shmem_exit(-1): 0 before_shmem_exit callbacks to make 2024-05-22 11:46:28.147 UTC [21256:2863] DEBUG: shmem_exit(-1): 0 on_shmem_exit callbacks to make 2024-05-22 11:46:28.147 UTC [21256:2864] DEBUG: proc_exit(-1): 0 callbacks to make 2024-05-22 11:46:28.151 UTC [14871:20078] DEBUG: forked new backend, pid=21258 socket=8 2024-05-22 11:46:28.171 UTC [14871:20079] DEBUG: server process (PID 21256) exited with exit code 0 2024-05-22 11:46:28.152 UTC [21258:1] [unknown] LOG: connection received: host=[local] 2024-05-22 11:46:28.171 UTC [21258:2] [unknown] DEBUG: InitPostgres 2024-05-22 11:46:28.172 UTC [21258:3] [unknown] LOG: connection authenticated: user="vagrant" method=trust (/pgtest/postgresql.git/src/test/modules/xid_wraparound/tmp_check/t_001_emergency_vacuum_main_data/pgdata/pg_hba.conf:117) 2024-05-22 11:46:28.172 UTC [21258:4] [unknown] LOG: connection authorized: user=vagrant database=postgres application_name=001_emergency_vacuum.pl 2024-05-22 11:46:28.175 UTC [21258:5] 001_emergency_vacuum.pl LOG: statement: INSERT INTO small(data) SELECT 1 That is, autovacuum worker (21256) sent PMSIGNAL_START_AUTOVAC_LAUNCHER, postmaster started autovacuum launcher, which could not start new autovacuum worker due to the process 21256 not exited yet. The failure can be reproduced easily with the sleep added inside SetTransactionIdLimit(): if (TransactionIdFollowsOrEquals(curXid, xidVacLimit) && IsUnderPostmaster && !InRecovery) SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_LAUNCHER); +pg_usleep(10000L); By the way I also discovered that rather resource-intensive xid_wraparound tests executed twice during the buildfarm "make" run (on dodo and perentie (see [2]) animals), at stage module-xid_wraparound-check and then at stage testmodules-install-check-C. [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dodo&dt=2024-05-19%2006%3A33%3A34 [2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=perentie&dt=2024-05-22%2000%3A02%3A19 Best regards, Alexander
В списке pgsql-hackers по дате отправления: