Re: BUG #6183: FATAL: canceling authentication due to timeout

From: Thorvald Natvig
Subject: Re: BUG #6183: FATAL: canceling authentication due to timeout
Message-ID: 4E5C3B6B.70101@medallia.com
In reply to: Re: BUG #6183: FATAL: canceling authentication due to timeout  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs
On 8/29/11 5:50 PM, Tom Lane wrote:
> "Thorvald Natvig" <thorvald@medallia.com> writes:
>> We get a lot of "FATAL:  canceling authentication due to timeout" in the
>> log, with accompanying closed connections to clients.
> Well, the only known cause of that (other than genuine timeout
> conditions) is in fact fixed in 9.1rc1.  You have not provided any
> information that would permit anyone to look for another cause.
This is a database server with fairly high traffic to multiple
databases. It seems to be related to multiple concurrent connections,
but I haven't had time to isolate a repeatable minimal testcase yet. I
was hoping that whatever was wrong was related to something obvious, or
that someone else had seen similar issues and was able to help with
isolating it.
Since this problem is affecting the usability of the machine, I've
disabled the issuing of 'vacuumdb' for now (which "fixes" the issue).

>> There does indeed seem to be a correlation between doing vacuum and seeing
>> this error.
> Are you doing VACUUM FULLs on pg_authid (and if so, why)?  If you are,
> is it possible that those are queuing up behind other queries that
> access pg_authid, and for some reason aren't releasing their locks
> promptly?
>
>             regards, tom lane

Databases are created from plain-text backups with createdb and psql,
minimal modifications are done to a few rows, and then
vacuumdb -q -z ${db}

A bit later, this database is renamed, a copy of it is created with
'createdb -T olddb newdb', a lot of deletions (between 0 and 90% of the
rows) are performed and then
vacuumdb -q -f -z ${newdb}
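
For reference, the two steps above can be sketched roughly as the shell
functions below. This is only an illustration, not the actual script: the
function names, the dump file, the table names and the UPDATE/DELETE
statements are placeholders, and renaming via ALTER DATABASE from the
"postgres" maintenance database is an assumption. By default the sketch only
prints the commands it would run.

```shell
#!/bin/sh
# Rough sketch of the described workflow. Every database name, file name
# and SQL statement here is a placeholder, not taken from the real script.

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: create from a plain-text backup, modify a few rows,
# then plain vacuum plus analyze.
load_db() {
    db="$1"; dump="$2"
    run createdb "$db"
    run psql -q -d "$db" -f "$dump"
    run psql -q -d "$db" -c "UPDATE some_table SET col = 1 WHERE id = 1"  # placeholder
    run vacuumdb -q -z "$db"
}

# Step 2: rename, copy via template, delete many rows,
# then full vacuum plus analyze.
clone_db() {
    db="$1"; olddb="$2"; newdb="$3"
    # Assumption: the rename is issued while connected to "postgres".
    run psql -q -d postgres -c "ALTER DATABASE \"$db\" RENAME TO \"$olddb\""
    run createdb -T "$olddb" "$newdb"
    run psql -q -d "$newdb" -c "DELETE FROM some_table WHERE obsolete"  # placeholder
    run vacuumdb -q -f -z "$newdb"
}
```

Calling `load_db mydb dump.sql` followed by `clone_db mydb olddb newdb` prints
the command sequence; setting DRY_RUN=0 would execute it instead.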

The script doing this is run from several machines working on different
databases, all hosted on the same server. So it's possible there are
multiple full vacuums issued at the same time. However, there are no
users connected to the databases being vacuumed during this time, but
there are hundreds of connections to other databases on the same server;
these are the ones that fail. All of these databases have at one point
been created with -T on a database from the above process. As far as I
know, there are no direct queries to pg_ tables. All operations are
performed over tcp with the same user.
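
One way to check the lock question raised above would be to look at pg_locks
while a full vacuum is running. The sketch below is a hypothetical diagnostic,
not something that was run for this report; the function names and the use of
"postgres" as the database to connect to are assumptions. Rows with
granted = f would be sessions waiting for a lock on pg_authid.

```shell
#!/bin/sh
# Hypothetical diagnostic, not part of the original report: list lock
# holders and waiters on the shared catalog pg_authid.

# Print the query; kept separate so it can be inspected or reused.
authid_locks_sql() {
    cat <<'EOF'
SELECT pid, mode, granted
FROM pg_locks
WHERE locktype = 'relation'
  AND relation = 'pg_authid'::regclass
ORDER BY granted, pid;
EOF
}

# Feed the query to psql on standard input. pg_authid is shared across
# the cluster, so any database will do; "postgres" is an assumed default.
show_authid_locks() {
    authid_locks_sql | psql -X -q -d "${1:-postgres}"
}
```

Running `show_authid_locks` repeatedly during the vacuum window would show
whether waiters pile up behind an exclusive lock.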

I don't know if this helps with where to look. If it doesn't, I'll try
to make a repeatable testcase on the weekend, when this server isn't
quite so essential.

Regards,
Thorvald
