Discussion: always forced restart after status 139?
Hi all, We are using Postgres 7.1 on RedHat Linux 7.1. When calling a C function in a shared library (*.so), if you get a segmentation fault (status 139), the log indicates that the database will shut down and then restart in a few seconds. My question is, does this always have to happen? Is postgres capable of just logging the seg fault, but not affecting all the users on the database by restarting? Thanks, Jason jwilliams@wc-group.com
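A side note on the number itself: 139 fits a process killed by signal 11 (SIGSEGV) with the core-dump flag set, since 139 = 128 + 11. The sketch below decodes it that way. It assumes the logged value is a raw wait() status in the traditional Linux layout (low 7 bits for the signal, bit 0x80 for the core dump); the shell's "128 + signal" convention yields the same signal number. `decode_wait_status` is a hypothetical helper written for this note, not anything shipped with PostgreSQL.

```python
import signal


def decode_wait_status(status):
    """Split a raw wait() status into (signal number, signal name, core dumped).

    Assumes the traditional Linux layout: low 7 bits hold the terminating
    signal, bit 0x80 is the core-dump flag. Returns None when the low bits
    do not describe death by signal (0 = normal exit, 0x7f = stopped).
    """
    signo = status & 0x7F
    if signo in (0, 0x7F):
        return None
    core_dumped = bool(status & 0x80)
    return signo, signal.Signals(signo).name, core_dumped


if __name__ == "__main__":
    # On Linux this prints (11, 'SIGSEGV', True)
    print(decode_wait_status(139))
```

So a "status 139" in the log and a "segmentation fault" are the same event seen from two sides.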
On Mon, 18 Mar 2002, Jason Williams wrote: > We are using Postgres 7.1 on RedHat Linux 7.1. > > When calling a C function in a shared library (*.so), if you get a > segmentation fault (status 139), the log indicates that the database will > shut down and then restart in a few seconds. > > My question is, does this always have to happen? Is postgres capable of > just logging the seg fault, but not affecting all the users on the database > by restarting? Because of the nature of a SIGSEGV, you can't trust any data remaining in memory - what if the crash was caused by corrupt data in memory? This is why PostgreSQL completely shuts down and then restarts. Allowing any part of PostgreSQL to continue (especially since there's data in SHM that's important) would be a bad idea, since you have no idea who caused the SIGSEGV. -- Dominic J. Eidson "Baruk Khazad! Khazad ai-menu!" - Gimli ------------------------------------------------------------------------------- http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/
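To make the point about SHM concrete, here is a toy sketch (plain Python on a Unix system, nothing to do with PostgreSQL's actual code) of why a crashed process cannot simply be logged and ignored. A forked child stands in for a backend running a buggy C function: it overwrites part of a shared mapping and then dies from SIGSEGV. The parent survives, but the memory it shares with the child has already been changed. The function name and the buffer contents are made up for illustration.

```python
import mmap
import os
import signal


def crash_after_scribbling():
    """Fork a child that overwrites shared memory, then dies from SIGSEGV.

    Returns the child's raw wait status and the shared bytes as the parent
    sees them afterwards.
    """
    shared = mmap.mmap(-1, 16)        # anonymous mapping, shared across fork()
    shared[:] = b"good-page-data!!"   # exactly 16 bytes of "clean" data

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the backend with the buggy extension.
        try:
            shared[0:4] = b"XXXX"                  # scribble on shared memory
            os.kill(os.getpid(), signal.SIGSEGV)   # then crash
        finally:
            os._exit(1)  # only reached if the signal did not kill the child

    # Parent: plays the role of the rest of the server.
    _, status = os.waitpid(pid, 0)
    data = bytes(shared)
    shared.close()
    return status, data


if __name__ == "__main__":
    status, data = crash_after_scribbling()
    print("killed by signal:", os.WTERMSIG(status))
    print("shared memory now:", data)
```

The parent has no way to tell from the exit status alone which bytes the child touched before it died, which is the argument for throwing the shared state away and restarting.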
Thanks Dominic. Point taken and understood. The problem is this: we are planning to make this database for a commercial website that will handle financial transactions. What will happen if the database receives a seg fault and another user of the database is in the middle of submitting a "critical" update? I'm assuming it will rollback gracefully? Does anyone know what the exact behavior is in this situation? Thanks, Jason
On Mon, 18 Mar 2002, Jason Williams wrote: > Point taken and understood. You might wanna fix those extensions so they don't crash, btw :) > The problem is this: we are planning to make this database for a commercial > website that will handle financial transactions. What will happen if the > database receives a seg fault and another user of the database is in the > middle of submitting a "critical" update? I'm assuming it will rollback > gracefully? It should roll back to its pre-transaction state. -- Dominic J. Eidson
"Jason Williams" <jwilliams@wc-group.com> writes: > The problem is this: we are planning to make this database for a commercial > website that will handle financial transactions. What will happen if the > database receives a seg fault and another user of the database is in the > middle of submitting a "critical" update? I'm assuming it will rollback > gracefully? The seg fault as such is not a problem. What concerns me a tad is that your buggy C extension may scribble on shared-memory disk buffers at some point before it causes an outright crash. If corrupted data manages to get written to disk before the backend crash and ensuing restart, there's no guarantee we can clean it up. The odds of this are probably not high (assuming you use conservatively-sized shared buffers, rather than a large fraction of your address space as some here have been known to suggest) ... but they're not zero. I concur with Dominic: fix your extension *before* you put it in production, not after. If you don't have confidence in your ability to get the bugs out then maybe you shouldn't be writing C functions. The interpreted PLs are a great deal safer. regards, tom lane
Did not mean to give the impression I was going to put a buggy extension into production. Relax guys. I've fixed the bug that was causing the initial seg fault. Just trying to cover all the bases here and understand what we need to do in our front end "if" we've overlooked something in one of our extensions. Thanks for the help and clarifications. Jason
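On the front-end side, one common pattern is to treat a dropped connection as "this transaction did not happen", wait for the server to come back, and run the whole transaction again. The sketch below shows only the shape of that loop. Everything in it is hypothetical: `ConnectionLost` stands in for whatever error your driver raises when the server goes away, `connect` is whatever opens a new connection, and `work` is a callable that does BEGIN, the updates, and COMMIT on that connection. It also assumes `work` is safe to repeat, because if the connection drops while the COMMIT is in flight the client cannot tell whether it went through.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for a driver's 'server closed the connection' error."""


def run_with_retry(connect, work, attempts=3, delay=2.0):
    """Run work(conn) as one whole transaction, retrying on a lost connection.

    connect  -- callable returning a fresh connection (hypothetical)
    work     -- callable doing BEGIN ... COMMIT on that connection (hypothetical)
    attempts -- total number of tries before giving up
    delay    -- seconds to wait between tries, to let the server restart
    """
    last_error = None
    for _ in range(attempts):
        try:
            conn = connect()      # may itself fail while the server restarts
            return work(conn)     # the whole transaction runs again on retry
        except ConnectionLost as exc:
            last_error = exc
            time.sleep(delay)
    raise last_error
```

A caller would wrap each "critical" update in a single `work` function and report an error to the user only if `run_with_retry` finally raises.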
Jason Williams wrote: > Thanks Dominic. > > Point taken and understood. > > The problem is this: we are planning to make this database for a commercial > website that will handle financial transactions. What will happen if the > database receives a seg fault and another user of the database is in the > middle of submitting a "critical" update? I'm assuming it will rollback > gracefully? First of all, you don't allow development work on the same system your production runs on. Doing so implies that the data is not critical to you. > Does anyone know what the exact behavior is in this situation? Nobody can tell for sure. In almost all cases, yes, the rollback would be graceful. But the fault could've corrupted the stack of the failing backend, causing it to execute arbitrary code. How does someone predict what arbitrary code will do? Jan -- It's easier to get forgiveness for being wrong than for being right. Let's break this rule - forgive me. JanWieck@Yahoo.com