Discussion: server process (PID 27884) was terminated by signal 4 (SIGILL)
Hi,

I had a rather strange crash of my server (log file at the end of this mail), and I googled for signal 4 and read http://en.wikipedia.org/wiki/SIGILL

I am running Linux 2.6.18-5-686 and postgresql-8.1.9-0etch2.

Most (all?) other processes on this machine got signal 4 at the same time as well (postfix, munin, etc.).

As I understand from my reading about signal 4, there must be some kind of hardware failure, as the postgresql log says, too. I rebooted the server and since then everything works fine. But I am going to drop this server and replace it with a new one, of course. I just want to ask whether there is something else besides hardware failure that could cause a signal 4 (ILL).

I am asking because I did a lot of CREATE/DROP TABLE statements beforehand to move some data back and forth between two schemas. Is it possible that massive creating and dropping of tables generated this error? If so, I have to rethink my SQL script. Otherwise I would just replace the hardware.

Here is my log:

LOG: server process (PID 27884) was terminated by signal 4
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
[...] 26 times
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2007-12-31 13:43:49 CET
LOG: checkpoint record is at 4/D401C5FC
LOG: redo record is at 4/D401C5FC; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 22378390; next OID: 373396
LOG: next MultiXactId: 2; next MultiXactOffset: 3
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 4/D401C640
LOG: record with zero length at 4/D401E88C
LOG: redo done at 4/D401E864
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database "postgres"
LOG: database system was interrupted at 2007-12-31 13:43:59 CET
LOG: checkpoint record is at 4/D401E88C
LOG: redo record is at 4/D401E88C; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 22378414; next OID: 373396
LOG: next MultiXactId: 2; next MultiXactOffset: 3
LOG: database system was not properly shut down; automatic recovery in progress
LOG: incomplete startup packet
LOG: record with zero length at 4/D401E8D0
LOG: redo is not required
LOG: database system is ready
LOG: transaction ID wrap limit is 2147484146, limited by database "postgres"

kind regards,
janning
On Jan 4, 2008, at 12:44 AM, mljv@planwerk6.de wrote:

> Hi,
>
> I had a rather strange crash of my server (log file at the end of
> this mail), and I googled for signal 4 and read
> http://en.wikipedia.org/wiki/SIGILL
>
> I am running Linux 2.6.18-5-686 and postgresql-8.1.9-0etch2.
>
> Most (all?) other processes on this machine got signal 4 at the
> same time as well (postfix, munin, etc.).
>
> As I understand from my reading about signal 4, there must be some
> kind of hardware failure, as the postgresql log says, too. I rebooted
> the server and since then everything works fine. But I am going to
> drop this server and replace it with a new one, of course. I just
> want to ask whether there is something else besides hardware failure
> that could cause a signal 4 (ILL).

Software bugs can on rare occasions (by overwriting return stack data and heading off into the weeds, say), but there's no way they'd do that in multiple processes simultaneously.

It's conceivable it was just a transient problem ("a cosmic ray" - they really do happen occasionally), but it's much more likely you have bad hardware. Probably either RAM, disk or disk controller.

Cheers,
Steve
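For anyone wondering where a line like "was terminated by signal 4" comes from: a parent process learns from the operating system how each child ended, including the number of the signal that killed it. The following is only a minimal sketch of that mechanism in Python, assuming a POSIX system; it is not how the postmaster is implemented, and the names `run_and_report` and `CHILD` are made up for illustration.

```python
import signal
import subprocess
import sys


def run_and_report(child_code: str) -> str:
    """Run a Python child process and describe how it ended.

    On POSIX, subprocess reports a child killed by signal N
    as returncode == -N.
    """
    proc = subprocess.run([sys.executable, "-c", child_code])
    if proc.returncode < 0:
        signum = -proc.returncode
        name = signal.Signals(signum).name
        return f"terminated by signal {signum} ({name})"
    return f"exited with code {proc.returncode}"


# Hypothetical child: it sends SIGILL to itself to imitate a process
# that hit an illegal instruction. No real hardware fault is involved.
CHILD = "import os, signal; os.kill(os.getpid(), signal.SIGILL)"

if __name__ == "__main__":
    # On Linux, where SIGILL is signal 4, this prints:
    # terminated by signal 4 (SIGILL)
    print(run_and_report(CHILD))
```

The sketch only shows how the signal number reaches the parent. It says nothing about the cause; as noted above, many unrelated processes receiving SIGILL at the same moment points to the machine rather than to any one program.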
On Friday, 4 January 2008 10:03, Steve Atkins wrote:

> On Jan 4, 2008, at 12:44 AM, mljv@planwerk6.de wrote:
> > I just want to ask whether there is something else besides
> > hardware failure that could cause a signal 4 (ILL).
>
> Software bugs can on rare occasions (by overwriting return stack data
> and heading off into the weeds, say), but there's no way they'd do
> that in multiple processes simultaneously.
>
> It's conceivable it was just a transient problem ("a cosmic ray" -
> they really do happen occasionally), but it's much more likely you
> have bad hardware. Probably either RAM, disk or disk controller.

Thanks a lot for clarifying this so quickly.

kind regards,
Janning