Thread: always forced restart after status 139?

always forced restart after status 139?

From: "Jason Williams"
Date: Mon, 18 Mar 2002

Hi all,

We are using Postgres 7.1 on RedHat Linux 7.1.

When calling a C function in a shared library (*.so), if you get a
segmentation fault (status 139), the log indicates that the database will
shut down and then restart in a few seconds.

My question is, does this always have to happen?  Is postgres capable of
just logging the seg fault, but not affecting all the users on the database
by restarting?

Thanks,

Jason
jwilliams@wc-group.com
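
[Editor's note] For readers wondering where "139" comes from: a minimal sketch, assuming the logged number is either a raw wait(2) status on Linux or a shell-style exit code. Both readings point at signal 11 (SIGSEGV). The function name `decode_status` is made up for illustration and is not part of PostgreSQL.

```python
import signal


def decode_status(status: int) -> dict:
    """Decode a logged child status such as 139 under two common conventions."""
    # Convention A (assumption): raw wait(2) status as laid out on Linux.
    # The low 7 bits hold the terminating signal, bit 0x80 is the core-dump flag.
    raw_signal = status & 0x7F
    core_dumped = bool(status & 0x80)
    # Convention B: shell-style exit code, 128 + signal number.
    shell_signal = status - 128 if status > 128 else 0
    return {
        "raw_signal": raw_signal,
        "core_dumped": core_dumped,
        "shell_signal": shell_signal,
    }


info = decode_status(139)
print(info)
print(signal.Signals(info["raw_signal"]).name)
```

Under either convention the backend was killed by a segmentation fault; the raw-status reading additionally says a core file was written.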




Re: always forced restart after status 139?

From: "Dominic J. Eidson"
Date: Mon, 18 Mar 2002

On Mon, 18 Mar 2002, Jason Williams wrote:

> We are using Postgres 7.1 on RedHat Linux 7.1.
>
> When calling a C function in a shared library (*.so), if you get a
> segmentation fault (status 139), the log indicates that the database will
> shut down and then restart in a few seconds.
>
> My question is, does this always have to happen?  Is postgres capable of
> just logging the seg fault, but not affecting all the users on the database
> by restarting?

Because of the nature of a SIGSEGV, you can't trust any data remaining in
memory. What if the crash was caused by corrupt data in memory?

This is why PostgreSQL shuts down completely and then restarts.

Allowing any part of PostgreSQL to continue (especially since there's
important data in shared memory) would be a bad idea, since you have no idea
what caused the SIGSEGV.


--
Dominic J. Eidson
                                        "Baruk Khazad! Khazad ai-menu!" - Gimli
-------------------------------------------------------------------------------
http://www.the-infinite.org/              http://www.the-infinite.org/~dominic/



Re: always forced restart after status 139?

From: "Jason Williams"
Date: Mon, 18 Mar 2002

Thanks Dominic.

Point taken and understood.

The problem is this:  we are planning to make this database for a commercial
website that will handle financial transactions.  What will happen if the
database receives a seg fault and another user of the database is in the
middle of submitting a "critical" update?  I'm assuming it will rollback
gracefully?

Does anyone know what the exact behavior is in this situation?

Thanks,

Jason

-----Original Message-----
From: Dominic J. Eidson [mailto:sauron@the-infinite.org]
Sent: Monday, March 18, 2002 1:28 PM
To: Jason Williams
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] always forced restart after status 139?






Re: always forced restart after status 139?

From: "Dominic J. Eidson"
Date:

On Mon, 18 Mar 2002, Jason Williams wrote:

> Point taken and understood.

You might wanna fix those extensions so they don't crash, btw :)

> The problem is this:  we are planning to make this database for a commercial
> website that will handle financial transactions.  What will happen if the
> database receives a seg fault and another user of the database is in the
> middle of submitting a "critical" update?  I'm assuming it will rollback
> gracefully?

It should roll back to its pre-transaction state.


--
Dominic J. Eidson
                                        "Baruk Khazad! Khazad ai-menu!" - Gimli
-------------------------------------------------------------------------------
http://www.the-infinite.org/              http://www.the-infinite.org/~dominic/
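
[Editor's note] A small illustration of "rolls back to its pre-transaction state". This is only a sketch: it uses Python's bundled sqlite3 module as a stand-in, not PostgreSQL, and it simulates a lost session by closing the connection without committing rather than by crashing a backend. The table and function names are made up for the example.

```python
import os
import sqlite3
import tempfile


def balance_after_lost_session() -> int:
    """Return the balance other sessions see after an uncommitted update is lost."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        # Committed starting state: one account holding 100.
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
        setup.execute("INSERT INTO account VALUES (1, 100)")
        setup.commit()
        setup.close()

        # A session starts a "critical" update but goes away before COMMIT.
        writer = sqlite3.connect(path)
        writer.execute("UPDATE account SET balance = 0 WHERE id = 1")
        writer.close()  # no commit: the open transaction is discarded

        # A fresh session sees the pre-transaction state.
        reader = sqlite3.connect(path)
        (balance,) = reader.execute(
            "SELECT balance FROM account WHERE id = 1"
        ).fetchone()
        reader.close()
        return balance


print(balance_after_lost_session())
```

The point is only that work which never reached COMMIT is not visible afterwards; it says nothing about the memory-corruption risk raised elsewhere in this thread.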


Re: always forced restart after status 139?

From: Tom Lane
Date: Mon, 18 Mar 2002

"Jason Williams" <jwilliams@wc-group.com> writes:
> The problem is this:  we are planning to make this database for a commercial
> website that will handle financial transactions.  What will happen if the
> database receives a seg fault and another user of the database is in the
> middle of submitting a "critical" update?  I'm assuming it will rollback
> gracefully?

The seg fault as such is not a problem.  What concerns me a tad is that
your buggy C extension may scribble on shared-memory disk buffers at
some point before it causes an outright crash.  If corrupted data
manages to get written to disk before the backend crash and ensuing
restart, there's no guarantee we can clean it up.  The odds of this are
probably not high (assuming you use conservatively-sized shared buffers,
rather than a large fraction of your address space as some here have
been known to suggest) ... but they're not zero.

I concur with Dominic: fix your extension *before* you put it in
production, not after.  If you don't have confidence in your ability
to get the bugs out then maybe you shouldn't be writing C functions.
The interpreted PLs are a great deal safer.

            regards, tom lane

Re: always forced restart after status 139?

From: "Jason Williams"
Date:

Did not mean to give the impression I was going to put a buggy extension
into production.  Relax guys.  I've fixed the bug that was causing the
initial seg fault.  Just trying to cover all the bases here and understand
what we need to do in our front end "if" we've overlooked something in one
of our extensions.  Thanks for the help and clarifications.

Jason
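
[Editor's note] On the front-end side: since the server comes back after a few seconds and in-flight transactions are rolled back, one common pattern is to reconnect and resubmit the whole transaction. The following is a rough sketch under stated assumptions: `ConnectionLost`, `run_transaction` and `FlakyServer` are hypothetical names, and a real client would catch whatever error its database driver raises when the connection drops.

```python
import time
from typing import Any, Callable


class ConnectionLost(Exception):
    """Hypothetical error standing in for a driver's 'connection dropped' exception."""


def run_transaction(
    connect: Callable[[], Any],
    work: Callable[[Any], Any],
    attempts: int = 5,
    delay: float = 0.0,
) -> Any:
    """Run work(conn) as one unit, reconnecting and retrying if the server goes away."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            conn = connect()
            return work(conn)  # work() is expected to commit before returning
        except ConnectionLost as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # give the server time to restart
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


class FlakyServer:
    """Test double: refuses the first `failures` connection attempts."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def connect(self) -> object:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionLost("server is restarting")
        return object()


server = FlakyServer(failures=2)
result = run_transaction(server.connect, lambda conn: "committed")
print(result, server.calls)
```

Blind retries are only safe when the work is idempotent or the client can check whether the earlier attempt actually committed; a connection lost right at commit time leaves that uncertain.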

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 18, 2002 2:47 PM
To: Jason Williams
Cc: Dominic J. Eidson; pgsql-general@postgresql.org
Subject: Re: [GENERAL] always forced restart after status 139?


Re: always forced restart after status 139?

From: Jan Wieck
Date:

Jason Williams wrote:
> Thanks Dominic.
>
> Point taken and understood.
>
> The problem is this:  we are planning to make this database for a commercial
> website that will handle financial transactions.  What will happen if the
> database receives a seg fault and another user of the database is in the
> middle of submitting a "critical" update?  I'm assuming it will rollback
> gracefully?

    First of all, you don't allow development work on the same
    system your production runs on. Doing so implies that the
    data is not critical to you.

> Does anyone know what the exact behavior is in this situation?

    Nobody can tell for sure. In almost all cases, yes, the rollback
    would be graceful. But the fault could have corrupted the stack of
    the failing backend, causing it to execute arbitrary code. How does
    someone predict what arbitrary code will do?

Jan



--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


