Discussion: server process (PID 27884) was terminated by signal 4 (SIGILL)

server process (PID 27884) was terminated by signal 4 (SIGILL)

From
mljv@planwerk6.de
Date:
Hi,

I had a rather strange crash of my server (log file at the end of my mail),
and I was googling for signal 4 and read http://en.wikipedia.org/wiki/SIGILL

I am running Linux 2.6.18-5-686 and postgresql-8.1.9-0etch2.

Most (all?) other processes on this machine (postfix, munin, etc.) also got
signal 4 at this time.

As I understood from my reading about signal 4, there must be some kind of
hardware failure, as the postgresql log says, too. I rebooted the server and
since then everything works fine. But I am going to drop this server and
replace it with a new one, of course. I just want to ask if there is something
else besides hardware failure which could force a signal 4 (ILL)?

I am asking because I did a lot of CREATE/DROP TABLE before, to move some data
back and forth between two schemas. Is it possible that massive creating and
dropping of tables generated this error? If so, I have to rethink my SQL
script. Otherwise I would just replace the hardware.

Here is my log:

LOG:  server process (PID 27884) was terminated by signal 4
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat
your command.
WARNING:  terminating connection because of crash of another server process
[...] 26 times
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2007-12-31 13:43:49 CET
LOG:  checkpoint record is at 4/D401C5FC
LOG:  redo record is at 4/D401C5FC; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 22378390; next OID: 373396
LOG:  next MultiXactId: 2; next MultiXactOffset: 3
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 4/D401C640
LOG:  record with zero length at 4/D401E88C
LOG:  redo done at 4/D401E864
LOG:  database system is ready
LOG:  transaction ID wrap limit is 2147484146, limited by database "postgres"
LOG:  database system was interrupted at 2007-12-31 13:43:59 CET
LOG:  checkpoint record is at 4/D401E88C
LOG:  redo record is at 4/D401E88C; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 22378414; next OID: 373396
LOG:  next MultiXactId: 2; next MultiXactOffset: 3
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  incomplete startup packet
LOG:  record with zero length at 4/D401E8D0
LOG:  redo is not required
LOG:  database system is ready
LOG:  transaction ID wrap limit is 2147484146, limited by database "postgres"
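
The signal number in the first log line can be turned into a signal name programmatically. The following is only a minimal sketch, assuming Python 3 on Linux; the function name `signals_in_log` and the regular expression are illustrative and not part of PostgreSQL.

```python
import re
import signal

# Matches the postmaster message quoted in the log above.
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")

def signals_in_log(text):
    """Return (pid, signal number, signal name) for each crash line found."""
    hits = []
    for m in CRASH_RE.finditer(text):
        pid, signum = int(m.group(1)), int(m.group(2))
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Number is not a named signal on this platform.
            name = "unknown"
        hits.append((pid, signum, name))
    return hits

sample = "LOG:  server process (PID 27884) was terminated by signal 4"
print(signals_in_log(sample))
```

Run against a whole log file, the same function would list every backend crash together with the signal that caused it.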

kind regards,
janning


Re: server process (PID 27884) was terminated by signal 4 (SIGILL)

From
Steve Atkins
Date:
On Jan 4, 2008, at 12:44 AM, mljv@planwerk6.de wrote:

> Hi,
>
> I had a rather strange crash of my server (log file at the end of my mail),
> and I was googling for signal 4 and read http://en.wikipedia.org/wiki/SIGILL
>
> I am running Linux 2.6.18-5-686 and postgresql-8.1.9-0etch2.
>
> Most (all?) other processes on this machine (postfix, munin, etc.) also got
> signal 4 at this time.
>
> As I understood from my reading about signal 4, there must be some kind of
> hardware failure, as the postgresql log says, too. I rebooted the server and
> since then everything works fine. But I am going to drop this server and
> replace it with a new one, of course. I just want to ask if there is something
> else besides hardware failure which could force a signal 4 (ILL)?

Software bugs can cause it on rare occasions (by overwriting return stack
data and heading off into the weeds, say), but there's no way they'd do
that in multiple processes simultaneously.

It's conceivable it was just a transient problem ("a cosmic ray" - they
really do happen occasionally), but it's much more likely you have bad
hardware. Probably either RAM, disk or disk controller.
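
To illustrate what a parent process such as the postmaster sees when a child dies from signal 4, here is a minimal sketch, assuming Python 3 on a POSIX system. The child sends SIGILL to itself rather than executing a truly illegal instruction, and the helper name `run_child_that_gets_sigill` is made up for this example.

```python
import signal
import subprocess
import sys

def run_child_that_gets_sigill():
    """Start a child Python process that sends SIGILL to itself; return its exit status."""
    child_code = "import os, signal; os.kill(os.getpid(), signal.SIGILL)"
    proc = subprocess.run([sys.executable, "-c", child_code])
    # On POSIX, a negative return code -N means "terminated by signal N".
    return proc.returncode

rc = run_child_that_gets_sigill()
if rc < 0:
    print("child terminated by signal", -rc, signal.Signals(-rc).name)
```

The parent only learns the signal number; it cannot tell from that alone whether the cause was a software bug or faulty hardware, which is why the number of affected processes matters.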

Cheers,
   Steve


Re: server process (PID 27884) was terminated by signal 4 (SIGILL)

From
mljv@planwerk6.de
Date:
On Friday, 4 January 2008 10:03, Steve Atkins wrote:
> On Jan 4, 2008, at 12:44 AM, mljv@planwerk6.de wrote:
> > I just want to ask if there is something else besides hardware failure
> > which could force a signal 4 (ILL)?
>
> Software bugs can cause it on rare occasions (by overwriting return stack
> data and heading off into the weeds, say), but there's no way they'd do
> that in multiple processes simultaneously.
>
> It's conceivable it was just a transient problem ("a cosmic ray" - they
> really do happen occasionally), but it's much more likely you have bad
> hardware. Probably either RAM, disk or disk controller.

Thanks a lot for clarifying this so quickly.

kind regards,
Janning