Обсуждение: server closed the connection unexpectedly

Поиск
Список
Период
Сортировка

server closed the connection unexpectedly

От
"Mark Greenbank"
Дата:
Hi,

I get this error when accessing the postgresl database  -- any ideas? What should I look at?

I can query all of the other tables in the database, just not the email_queue table. Weird!

Thanks in advance,
Mark

# psql --version
psql (PostgreSQL) 7.3.2
contains support for command-line editing

# psql
mncis2-rel=# select count(*) from members;
count
-------
22676
(1 row)

mncis2-rel=# select count(*) from email_queue;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!# \q

Re: server closed the connection unexpectedly

От
Richard Huxton
Дата:
Mark Greenbank wrote:
> # psql --version
> psql (PostgreSQL) 7.3.2

May I be the first to say "GODS ALIVE MAN! WHAT ARE YOU STILL DOING
RUNNING 7.3.2!". Even if you can't upgrade from 7.3, the latest release
is 7.3.15 - that's 13 sets of bug-fixes you're ignoring. There's a
passing chance one of those bugs might be responsible for your error.

> contains support for command-line editing
>
> # psql
> mncis2-rel=# select count(*) from members;
> count
> -------
> 22676
> (1 row)
>
> mncis2-rel=# select count(*) from email_queue;
> server closed the connection unexpectedly
>        This probably means the server terminated abnormally
>        before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

What do the logs say?

--
   Richard Huxton
   Archonet Ltd

Re: server closed the connection unexpectedly

От
"Mark Greenbank"
Дата:
Hi Richard,

Thanks for the reply -- yeah, I know it's and old version but management has been reluctant to update a production database.
As for the logs, I looked around an didn't see any. pg_ctl start is not using the -l option for logging and output is redirected to /dev/null :(

Anyway, now I have some room to update, which leads to the following questions:

1) I'm assuming that if I update 7.3.2 to 7.3.15 I can leave the data in place (that is, without doing a dump/restore) -- is this correct?
2) If I up upgrade to 8.x can just copy the data files or do I have to do a dump/restore? The latter would be hard since pg_dump also fails on this same table and pg_dump doesn't seem to have an 'ignore table' option.

Mark

On 10/6/06, Richard Huxton <dev@archonet.com> wrote:
Mark Greenbank wrote:
> # psql --version
> psql (PostgreSQL) 7.3.2

May I be the first to say "GODS ALIVE MAN! WHAT ARE YOU STILL DOING
RUNNING 7.3.2!". Even if you can't upgrade from 7.3, the latest release
is 7.3.15 - that's 13 sets of bug-fixes you're ignoring. There's a
passing chance one of those bugs might be responsible for your error.

> contains support for command-line editing
>
> # psql
> mncis2-rel=# select count(*) from members;
> count
> -------
> 22676
> (1 row)
>
> mncis2-rel=# select count(*) from email_queue;
> server closed the connection unexpectedly
>        This probably means the server terminated abnormally
>        before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

What do the logs say?

--
   Richard Huxton
   Archonet Ltd

Re: server closed the connection unexpectedly

От
Geoffrey
Дата:
Mark Greenbank wrote:
> Hi,
>
> I get this error when accessing the postgresl database  -- any ideas? What
> should I look at?
>
> I can query all of the other tables in the database, just not the
> email_queue table. Weird!
>
> Thanks in advance,
> Mark
>
> # psql --version
> psql (PostgreSQL) 7.3.2
> contains support for command-line editing

I will tell you this.  We were seeing a similar problem on 7.4.6 and we
upgraded this past Thursday morning to 7.4.13 and thus far we have not
seen another of these errors.  Thursday and Friday are our busiest days
and we saw this error more on these days then any other.

As far as we are concerned, the jury is still out until we have more
time to verify this resolved the issue.

I'd suggest you upgrade to the latest 7.4, although from 7.3.2 that may
not be as simple as it was for us.

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
  - Benjamin Franklin

Re: server closed the connection unexpectedly

От
Douglas McNaught
Дата:
"Mark Greenbank" <mark.greenbank@gmail.com> writes:

> 1) I'm assuming that if I update 7.3.2 to 7.3.15 I can leave the data in place
> (that is, without doing a dump/restore) -- is this correct?

Yes--minor releases don't require dump/restore.

> 2) If I up upgrade to 8.x can just copy the data files or do I have to do a
> dump/restore? The latter would be hard since pg_dump also fails on this same
> table and pg_dump doesn't seem to have an 'ignore table' option.

You would use the 8.x pg_dump in this case (against the 7.3 database),
which may or may not help if you have corrupted table data.  If that's
the problem, it's best to fix the corruption before trying to upgrade.

-Doug

Re: server closed the connection unexpectedly

От
"Mark Greenbank"
Дата:
I've enabled logging, having upgraded to 7.3.4 (since that was the simplest yum updated :) and I see the following messages:

FATAL:  unsupported frontend protocol
LOG:  server process (pid 26548) was terminated by signal 11
LOG:  terminating any other active server processes
FATAL:  The database system is in recovery mode
WARNING:  Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
LOG:  all server processes terminated; reinitializing shared memory and semaphores
LOG:  database system was interrupted at 2006-10-06 11:34:17 EDT
LOG:  checkpoint record is at D/72C79A94
LOG:  redo record is at D/72C79A94; undo record is at 0/0; shutdown TRUE
LOG:  next transaction id: 74267383; next oid: 89469046
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  ReadRecord: record with zero length at D/72C79AD4
LOG:  redo is not required
FATAL:  unsupported frontend protocol
FATAL:  The database system is starting up
FATAL:  unsupported frontend protocol
FATAL:  The database system is starting up
FATAL:  unsupported frontend protocol
FATAL:  The database system is starting up
FATAL:  unsupported frontend protocol
FATAL:  The database system is starting up
FATAL:  unsupported frontend protocol
FATAL:  The database system is starting up
LOG:  database system is ready

The worrying bit (at least to my uneducated eye) is the line:
LOG:  ReadRecord: record with zero length at D/72C79AD4


Thanks,
Mark

On 10/6/06, Mark Greenbank <mark.greenbank@gmail.com> wrote:
Hi Richard,

Thanks for the reply -- yeah, I know it's and old version but management has been reluctant to update a production database.
As for the logs, I looked around an didn't see any. pg_ctl start is not using the -l option for logging and output is redirected to /dev/null :(

Anyway, now I have some room to update, which leads to the following questions:

1) I'm assuming that if I update 7.3.2 to 7.3.15 I can leave the data in place (that is, without doing a dump/restore) -- is this correct?
2) If I up upgrade to 8.x can just copy the data files or do I have to do a dump/restore? The latter would be hard since pg_dump also fails on this same table and pg_dump doesn't seem to have an 'ignore table' option.

Mark


On 10/6/06, Richard Huxton < dev@archonet.com> wrote:
Mark Greenbank wrote:
> # psql --version
> psql (PostgreSQL) 7.3.2

May I be the first to say "GODS ALIVE MAN! WHAT ARE YOU STILL DOING
RUNNING 7.3.2!". Even if you can't upgrade from 7.3, the latest release
is 7.3.15 - that's 13 sets of bug-fixes you're ignoring. There's a
passing chance one of those bugs might be responsible for your error.

> contains support for command-line editing
>
> # psql
> mncis2-rel=# select count(*) from members;
> count
> -------
> 22676
> (1 row)
>
> mncis2-rel=# select count(*) from email_queue;
> server closed the connection unexpectedly
>        This probably means the server terminated abnormally
>        before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.

What do the logs say?

--
   Richard Huxton
   Archonet Ltd


Re: server closed the connection unexpectedly

От
"Mark Greenbank"
Дата:
Hi Doug,

Are there any pointers as to how to fix corrupted data? Is seems like that might be the problem as I'm seeing a zero-length read in the log file (see my previous email).

Thanks again,
Mark

On 10/6/06, Douglas McNaught <doug@mcnaught.org> wrote:
"Mark Greenbank" <mark.greenbank@gmail.com> writes:

> 1) I'm assuming that if I update 7.3.2 to 7.3.15 I can leave the data in place
> (that is, without doing a dump/restore) -- is this correct?

Yes--minor releases don't require dump/restore.

> 2) If I up upgrade to 8.x can just copy the data files or do I have to do a
> dump/restore? The latter would be hard since pg_dump also fails on this same
> table and pg_dump doesn't seem to have an 'ignore table' option.

You would use the 8.x pg_dump in this case (against the 7.3 database),
which may or may not help if you have corrupted table data.  If that's
the problem, it's best to fix the corruption before trying to upgrade.

-Doug

Re: server closed the connection unexpectedly

От
Tom Lane
Дата:
"Mark Greenbank" <mark.greenbank@gmail.com> writes:
> [ select count(*) from email_queue leads to ]
> LOG:  server process (pid 26548) was terminated by signal 11

If this is repeatable then it probably indicates corrupt data on-disk
(which 7.3.2 is mostly lacking any defenses for).  The trick is to find
and delete the corrupted rows or pages.  You can generally get somewhere
by a process of binary search, ie, find the largest N such that
"select ctid,* from email_queue limit N" doesn't fail.  The last ctid
you can extract this way will be one or two before the (first) bad row.

Given that it's a count(*) that's failing, the problem is probably with
a page or tuple header and not any individual data field, so you might
not have any good recovery option beyond zeroing the whole page
containing the error.  See the mailing list archives for past
discussions of recovering from data corruption for more info.

> The worrying bit (at least to my uneducated eye) is the line:
> LOG:  ReadRecord: record with zero length at D/72C79AD4

That's normal operation, don't worry about it.

            regards, tom lane

Re: server closed the connection unexpectedly

От
Jeff Davis
Дата:
On Fri, 2006-10-06 at 11:20 -0400, Mark Greenbank wrote:
> Hi Richard,
>
> Thanks for the reply -- yeah, I know it's and old version but
> management has been reluctant to update a production database.

If there exists a patchlevel (the third component of the version) higher
than the one you're using, generally it's safer to upgrade the
production system than not. Updating the patchlevel does not require a
dump/restore, just a restart. Just schedule a minute or two of downtime.

> 1) I'm assuming that if I update 7.3.2 to 7.3.15 I can leave the data
> in place (that is, without doing a dump/restore) -- is this correct?

Correct, no dump/reload necessary, just upgrade and restart postgres.

> 2) If I up upgrade to 8.x can just copy the data files or do I have to
> do a dump/restore? The latter would be hard since pg_dump also fails
> on this same table and pg_dump doesn't seem to have an 'ignore table'
> option.

Upgrading the first or second component of the version number (i.e. 7.3
to 7.4 or 7.4 to 8.0) requires a full dump/reload. However, it might be
worth considering since you're using a version that's been obsolete for
years.

Regards,
    Jeff Davis