Re: It happened again: Server hung up solid

From: Tom Lane
Subject: Re: It happened again: Server hung up solid
Msg-id: 25204.957746133@sss.pgh.pa.us
In reply to: It happened again: Server hung up solid  (The Hermit Hacker <scrappy@hub.org>)
Responses: Re: It happened again: Server hung up solid  (The Hermit Hacker <scrappy@hub.org>)
Re: It happened again: Server hung up solid  (The Hermit Hacker <scrappy@hub.org>)
List: pgsql-hackers
The Hermit Hacker <scrappy@hub.org> writes:
> Okay, this is with code of ~May 4th ... a 'psql' connection to the
> database hangs solid.

Do you mean you can't make a connection at all?  Is there any indication
that the postmaster is lighting off a backend for you?  Since you show
a couple of zombie backends hanging around, it would seem like a good
bet that the postmaster itself is wedged and not responding to events,
but I'm not sure.

> errout is dated:

> pgsql% !ls
> ls -lt
> total 13324
> -rw-------   1 pgsql  pgsql  4842715 May  7 10:57 errout.5432

> and the last few lines contain:

> ERROR:  parser: parse error at or near "vpti"
> pq_recvbuf: unexpected EOF on client connection
> pq_flush: send() failed: Broken pipe
> pq_recvbuf: recv() failed: Connection reset by peer
> pq_recvbuf: unexpected EOF on client connection
> pq_recvbuf: unexpected EOF on client connection
> pq_flush: send() failed: Broken pipe
> pq_recvbuf: recv() failed: Connection reset by peer

> But, of course, no date/time ...

Given that the file mod time is considerably before the hang (right?)
the messages in it are probably unrelated.  It does seem odd that you
have so many clients disconnecting ungracefully; what client apps are
you running?
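
The modification-time reasoning above can be checked mechanically. The following is only a rough sketch in Python: the helper name `log_written_since`, the throwaway file, and the ten-minute hang time are all made up for illustration and are not taken from the report.

```python
import os
import tempfile
from datetime import datetime, timedelta


def log_written_since(path: str, hang_started: datetime) -> bool:
    """True if the log file was modified at or after the time the hang began."""
    last_write = datetime.fromtimestamp(os.path.getmtime(path))
    return last_write >= hang_started


# Demo with a throwaway file whose mtime is forced two hours into the past.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    stale_log = tmp.name
two_hours_ago = (datetime.now() - timedelta(hours=2)).timestamp()
os.utime(stale_log, (two_hours_ago, two_hours_ago))

hang_started = datetime.now() - timedelta(minutes=10)  # hypothetical hang time
stale = not log_written_since(stale_log, hang_started)
os.remove(stale_log)
print("log predates the hang:", stale)
```

If the log was last written well before the hang started, its final lines say little about what the server was doing when it stopped responding.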

> Since this is a production server, I can't just leave it there hung like
> that, but if someone wants to give some instructions on what to do the
> next time this happens, please feel free to do so, and I'll add that to my
> list ... maybe run a gdb command on it, since truss doesn't appear to
> help?

Try killing the postmaster itself in such a way as to produce a coredump
(kill -ABRT ought to do) and get a backtrace from that.  It might also
be worth running the postmaster with connection tracing turned on (I
forget the incantation for that, but it should be in TFM).
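
As a rough illustration of the abort-signal step, here is a minimal Python sketch. It assumes a POSIX system and uses a sleeping child process as a stand-in for a wedged postmaster; the helper names `core_dumps_enabled` and `abort_process` are invented for this example. Whether a core file is actually written depends on the core size limit in effect when the target process was started.

```python
import os
import resource
import signal
import subprocess
import sys


def core_dumps_enabled() -> bool:
    """True if this process (and children it starts) may write a core file."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


def abort_process(pid: int) -> None:
    """Send SIGABRT to the process, as `kill -ABRT <pid>` would."""
    os.kill(pid, signal.SIGABRT)


# Stand-in for a wedged postmaster: a child process that only sleeps.
victim = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
abort_process(victim.pid)
victim.wait(timeout=10)

# A negative return code -N means "terminated by signal N".
killed_by_abort = victim.returncode == -signal.SIGABRT
print("core dumps enabled:", core_dumps_enabled())
print("terminated by SIGABRT:", killed_by_abort)
```

With a real postmaster, the resulting core file would then be loaded into gdb together with the postmaster executable, and `bt` at the gdb prompt prints the backtrace.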

> At this time, I consider this to be a show-stopper on the release ... this
> is what happened the last time when the result appeared to be the index
> corruption

If the postmaster is hanging then it's almost certainly unrelated to
index corruption...
        regards, tom lane

