Обсуждение: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals
BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals
От
"Valentine Gogichashvili"
Дата:
The following bug has been logged online: Bug reference: 3942 Logged by: Valentine Gogichashvili Email address: valgog@gmail.com PostgreSQL version: 8.2.3 Operating system: Linux 2.6.8 SMP Description: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals Details: We are running several servers with the 'same' configuration. Recently two of many servers started to experience similar problems. It is quite difficult to reproduce. So I am providing the information that I have managed to get by now. The problem on the second server was to be seen only once. The problem on the first server is happening once a day or so. It does not seem to actually depend on the server load. The problem is normally noticed when the update sessions on one of the biggest table are hanging. The server has a 'statement_timeout' parameter set globally to 20 seconds and cronjob vacuum jobs are being run every 2 hours like: "set statement_timeout to 0; VACUUM VERBOSE ANALYZE;", but never the less the update sessions are not killed for hours. I cannot kill VACUUM session explicitly as well. More info: rumata@dbserver-04:~$ ps -ef | grep VACUUM | grep -v grep postgres 24359 21321 0 13:47 ? 00:00:12 postgres: postgres rumataindex [local] VACUUM rumata@dbserver-04:~$ sudo su - postgres postgres@dbserver-04:~$ psql rumataindex Welcome to psql 8.2.3, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit rumataindex=# select * from pg_locks where pid = 24359; locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid | mode | granted ---------------+----------+----------+------+-------+---------------+------- --+-------+----------+-------------+-------+--------------------------+----- ---- relation | 17717 | 18917 | | | | | | | 1045306585 | 24359 | ShareUpdateExclusiveLock | t relation | 17717 | 1892961 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94216 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94213 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94214 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94206 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 185865 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 952119 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94212 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94215 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94207 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 185867 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94217 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t transactionid | | | | | 1045306585 | | | | 1045306585 | 24359 | ExclusiveLock | t relation | 17717 | 94208 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94211 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t (16 rows) rumataindex=# \q postgres@dbserver-04:~$ kill 24359 postgres@dbserver-04:~$ psql rumataindex Welcome to psql 8.2.3, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit rumataindex=# select * from pg_locks where pid = 24359; locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid | mode | granted ---------------+----------+----------+------+-------+---------------+------- --+-------+----------+-------------+-------+--------------------------+----- ---- relation | 17717 | 18917 | | | | | | | 1045306585 | 24359 | ShareUpdateExclusiveLock | t relation | 17717 | 1892961 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94216 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94213 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94214 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94206 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 185865 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 952119 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94212 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94215 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94207 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 185867 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94217 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t transactionid | | | | | 1045306585 | | | | 1045306585 | 24359 | ExclusiveLock | t relation | 17717 | 94208 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t relation | 17717 | 94211 | | | | | | | 1045306585 | 24359 | RowExclusiveLock | t (16 rows) rumataindex=# \q In addition, I want to mention, that "VACUUM VERBOSE ANALYZE;" session owned locks are the oldest ones, when the issue is occurring. The only way to restart the server is to force immediate server shutdown like: postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f waiting for server to shut down............................................................... failed postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f waiting for server to shut down............................................................... failed pg_ctl: server does not shut down postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m i waiting for server to shut down.... done server stopped postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . start server starting There are no errors or warning related to VACUUM to be noticed in the log files. Or any logging related to the sessions, that are still hanging. With best regards, -- Valentine Gogichashvili
Re: BUG #3942: Related to BUG #2568 and BUG #2859: VACUUM process hangs and does not respond to the kill signals
От
"Heikki Linnakangas"
Дата:
Valentine Gogichashvili wrote: > The following bug has been logged online: > > Bug reference: 3942 > Logged by: Valentine Gogichashvili > Email address: valgog@gmail.com > PostgreSQL version: 8.2.3 You should upgrade to the latest 8.2.X minor release, regardless of this issue... > Operating system: Linux 2.6.8 SMP > Description: Related to BUG #2568 and BUG #2859: VACUUM process hangs > and does not respond to the kill signals > Details: > > We are running several servers with the 'same' configuration. > > Recently two of many servers started to experience similar problems. It is > quite difficult to reproduce. So I am providing the information that I have > managed to get by now. The problem on the second server was to be seen only > once. The problem on the first server is happening once a day or so. It does > not seem to actually depend on the server load. > > The problem is normally noticed when the update sessions on one of the > biggest table are hanging. The server has a 'statement_timeout' parameter > set globally to 20 seconds and cronjob vacuum jobs are being run every 2 > hours like: > "set statement_timeout to 0; VACUUM VERBOSE ANALYZE;", but never the less > the update sessions are not killed for hours. Sounds like you have a client connected to the database, but doing nothing. statement_timeout won't do anything to a backend that isn't actively running a query, even if it has a transaction open. You could do "SELECT * FROM pg_stat_activity" and look for any backends in "idle in transaction" state. > I cannot kill VACUUM session explicitly as well. How are you trying to kill it? > > More info: > > rumata@dbserver-04:~$ ps -ef | grep VACUUM | grep -v grep > postgres 24359 21321 0 13:47 ? 00:00:12 postgres: postgres > rumataindex [local] VACUUM > rumata@dbserver-04:~$ sudo su - postgres > postgres@dbserver-04:~$ psql rumataindex > Welcome to psql 8.2.3, the PostgreSQL interactive terminal. > > Type: \copyright for distribution terms > \h for help with SQL commands > \? for help with psql commands > \g or terminate with semicolon to execute query > \q to quit > > rumataindex=# select * from pg_locks where pid = 24359; > locktype | database | relation | page | tuple | transactionid | > classid | objid | objsubid | transaction | pid | mode > | granted > ---------------+----------+----------+------+-------+---------------+------- > --+-------+----------+-------------+-------+--------------------------+----- > ---- > relation | 17717 | 18917 | | | | > | | | 1045306585 | 24359 | ShareUpdateExclusiveLock | t > relation | 17717 | 1892961 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94216 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94213 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94214 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94206 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 185865 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 952119 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94212 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94215 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94207 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 185867 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94217 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > transactionid | | | | | 1045306585 | > | | | 1045306585 | 24359 | ExclusiveLock | t > relation | 17717 | 94208 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94211 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > (16 rows) > > rumataindex=# \q > postgres@dbserver-04:~$ kill 24359 > postgres@dbserver-04:~$ psql rumataindex > Welcome to psql 8.2.3, the PostgreSQL interactive terminal. > > Type: \copyright for distribution terms > \h for help with SQL commands > \? for help with psql commands > \g or terminate with semicolon to execute query > \q to quit > > rumataindex=# select * from pg_locks where pid = 24359; > locktype | database | relation | page | tuple | transactionid | > classid | objid | objsubid | transaction | pid | mode > | granted > ---------------+----------+----------+------+-------+---------------+------- > --+-------+----------+-------------+-------+--------------------------+----- > ---- > relation | 17717 | 18917 | | | | > | | | 1045306585 | 24359 | ShareUpdateExclusiveLock | t > relation | 17717 | 1892961 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94216 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94213 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94214 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94206 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 185865 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 952119 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94212 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94215 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94207 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 185867 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94217 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > transactionid | | | | | 1045306585 | > | | | 1045306585 | 24359 | ExclusiveLock | t > relation | 17717 | 94208 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > relation | 17717 | 94211 | | | | > | | | 1045306585 | 24359 | RowExclusiveLock | t > (16 rows) > > rumataindex=# \q > > In addition, I want to mention, that "VACUUM VERBOSE ANALYZE;" session owned > locks are the oldest ones, when the issue is occurring. > > The only way to restart the server is to force immediate server shutdown > like: > > postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f > waiting for server to shut > down............................................................... failed > postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m f > waiting for server to shut > down............................................................... failed > pg_ctl: server does not shut down > postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . stop -m i > waiting for server to shut down.... done > server stopped > postgres@dbserver-04:~/pgsql/data$ pg_ctl -D . start > server starting > > There are no errors or warning related to VACUUM to be noticed in the log > files. Or any logging related to the sessions, that are still hanging. Is there anything in the log indicating why it refuses to shut down? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com