Thread: BUG #12918: Segfault in BackendIdGetTransactionIds

BUG #12918: Segfault in BackendIdGetTransactionIds

From: root@simply.name

The following bug has been logged on the website:

Bug reference:      12918
Logged by:          Vladimir
Email address:      root@simply.name
PostgreSQL version: 9.4.1
Operating system:   RHEL 6.6
Description:

Hello.

After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
yum.postgresql.org) we have started getting segfaults of different backends.
Backtraces of all coredumps look similar:
(gdb) bt
#0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
#1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
#2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
#3  0x00000000006f9d5a in pg_stat_get_db_numbackends (fcinfo=<value
optimized out>) at pgstatfuncs.c:1080
#4  0x000000000059c345 in ExecMakeFunctionResultNoSets (fcache=0x1f4c270,
econtext=0x1f4bbe0, isNull=0x1f5e588 "", isDone=<value optimized out>) at
execQual.c:2023
#5  0x00000000005981a3 in ExecTargetList (projInfo=<value optimized out>,
isDone=0x0) at execQual.c:5304
#6  ExecProject (projInfo=<value optimized out>, isDone=0x0) at
execQual.c:5519
#7  0x00000000005a458d in advance_aggregates (aggstate=0x1f4bdc0,
pergroup=0x1f5e380) at nodeAgg.c:556
#8  0x00000000005a4da5 in agg_retrieve_direct (node=<value optimized out>)
at nodeAgg.c:1223
#9  ExecAgg (node=<value optimized out>) at nodeAgg.c:1115
#10 0x0000000000597638 in ExecProcNode (node=0x1f4bdc0) at
execProcnode.c:476
#11 0x0000000000596252 in ExecutePlan (queryDesc=0x1eae6d0, direction=<value
optimized out>, count=0) at execMain.c:1486
#12 standard_ExecutorRun (queryDesc=0x1eae6d0, direction=<value optimized
out>, count=0) at execMain.c:319
#13 0x0000000000686797 in PortalRunSelect (portal=0x1ea5660, forward=<value
optimized out>, count=0, dest=<value optimized out>) at pquery.c:946
#14 0x00000000006879c1 in PortalRun (portal=0x1ea5660,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x1f5a528,
altdest=0x1f5a528, completionTag=0x7fff277b3b80 "") at pquery.c:790
#15 0x000000000068404e in exec_simple_query (query_string=0x1e989d0 "SELECT
sum(numbackends) FROM pg_stat_database;") at postgres.c:1072
#16 0x00000000006856c8 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, dbname=0x1e7f398 "postgres", username=<value
optimized out>) at postgres.c:4074
#17 0x0000000000632d7d in BackendRun (argc=<value optimized out>,
argv=<value optimized out>) at postmaster.c:4155
#18 BackendStartup (argc=<value optimized out>, argv=<value optimized out>)
at postmaster.c:3829
#19 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at
postmaster.c:1597
#20 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>)
at postmaster.c:1244
#21 0x00000000005cadb8 in main (argc=3, argv=0x1e7e5e0) at main.c:228
(gdb)

Unfortunately, I can't give a clear sequence of steps to reproduce the
problem; the segfaults happen at quite random times and under random
workloads :( So I'm trying to reproduce it on a test stand where PostgreSQL
is built with the --enable-debug flag to give you more information (but still
no luck for the last two weeks).

The common conditions are:
  1. it happens only on master hosts (never on any of the streaming
replicas),
  2. it happens on simple queries to pg_catalog or system views, as shown in
the backtrace above,
  3. it happens only with direct connections to PostgreSQL
(production queries go through pgbouncer, and no coredumps contain production
queries). And so far it has happened only with python-psycopg2 (we have tried
versions 2.5.3-1.rhel6 with postgresql93-libs, 2.5.4-1.rhel6 and 2.6-1.rhel6
with postgresql94-libs). I've asked about it on psycopg-list [0] but it
doesn't seem to be a client problem.

[0] http://www.postgresql.org/message-id/flat/CA+mi_8a246TK6YBLzf_7c5sc+XuiMaGafG0mhrFbp6Nq+SQt3w@mail.gmail.com#CA+mi_8a246TK6YBLzf_7c5sc+XuiMaGafG0mhrFbp6Nq+SQt3w@mail.gmail.com

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Tom Lane

root@simply.name writes:
> After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
> yum.postgresql.org) we have started getting segfaults of different backends.
> Backtraces of all coredumps look similar:
> (gdb) bt
> #0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
> optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
> #1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
> #2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342

Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
It supposes that there are no inactive entries in the sinval array
within the range 0 .. lastBackend.  But there can be, in which case
dereferencing stateP->proc crashes.  The reason it's hard to reproduce
is the relatively narrow window between where pgstat_read_current_status
saw the backend as active and where we're inspecting its sinval entry.

            regards, tom lane
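
To illustrate the failure mode described above, here is a minimal stand-alone sketch. It is not PostgreSQL source: the Registry, Slot and Proc types and the registry_* functions are hypothetical stand-ins for the sinval array, ProcState and PGPROC, and locking is left out. It assumes, as in the diagnosis above, that lastBackend is only a high-water mark, so a slot in the middle of the range can be inactive (proc == NULL) while its index still passes the range check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical stand-ins for PGPROC and ProcState; not the real definitions. */
typedef struct Proc { int xid; int xmin; } Proc;
typedef struct Slot { Proc *proc; } Slot;      /* proc == NULL: slot inactive */

typedef struct Registry {
    Slot slots[MAX_SLOTS];
    int  last_backend;   /* highest 1-based slot number in use, 0 if none */
} Registry;

/* Claim the lowest free slot; returns its 1-based id, or 0 if all are taken. */
int registry_attach(Registry *r, Proc *p)
{
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (r->slots[i].proc == NULL) {
            r->slots[i].proc = p;
            if (i + 1 > r->last_backend)
                r->last_backend = i + 1;
            return i + 1;
        }
    }
    return 0;
}

/* Release a slot. last_backend only shrinks past trailing inactive slots,
 * so a hole in the middle stays inside the range 1 .. last_backend. */
void registry_detach(Registry *r, int id)
{
    assert(id > 0 && id <= MAX_SLOTS);
    r->slots[id - 1].proc = NULL;
    while (r->last_backend > 0 && r->slots[r->last_backend - 1].proc == NULL)
        r->last_backend--;
}

/* Guarded lookup: the range check alone is not enough, so test proc too.
 * Outputs are reset to 0 (standing in for "invalid") before anything else. */
bool registry_get_xids(const Registry *r, int id, int *xid, int *xmin)
{
    *xid = 0;
    *xmin = 0;
    if (id > 0 && id <= r->last_backend) {
        const Proc *p = r->slots[id - 1].proc;

        if (p != NULL) {
            *xid = p->xid;
            *xmin = p->xmin;
            return true;
        }
    }
    return false;
}
```

With three attached entries, detaching the second leaves last_backend at 3, so id 2 still passes the range check; only the proc test keeps the lookup from dereferencing a NULL pointer.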

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Vladimir Borodin

> On 30 March 2015, at 19:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> root@simply.name writes:
>> After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
>> yum.postgresql.org) we have started getting segfaults of different backends.
>> Backtraces of all coredumps look similar:
>> (gdb) bt
>> #0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
>> optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
>> #1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
>> #2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
>
> Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> It supposes that there are no inactive entries in the sinval array
> within the range 0 .. lastBackend.  But there can be, in which case
> dereferencing stateP->proc crashes.  The reason it's hard to reproduce
> is the relatively narrow window between where pgstat_read_current_status
> saw the backend as active and where we're inspecting its sinval entry.

I’ve also tried to revert dd1a3bcc where this function appeared but couldn’t do it :( If you would be able to make a build without this commit (if it is easier than fix it in right way), I could install it on several production hosts to test it.

>
>             regards, tom lane


--
May the force be with you…
https://simply.name

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Stephen Frost

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> root@simply.name writes:
> > After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
> > yum.postgresql.org) we have started getting segfaults of different backends.
> > Backtraces of all coredumps look similar:
> > (gdb) bt
> > #0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
> > optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
> > #1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
> > #2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
>
> Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> It supposes that there are no inactive entries in the sinval array
> within the range 0 .. lastBackend.  But there can be, in which case
> dereferencing stateP->proc crashes.  The reason it's hard to reproduce
> is the relatively narrow window between where pgstat_read_current_status
> saw the backend as active and where we're inspecting its sinval entry.

As an immediate short-term workaround, from what I can tell,
disabling calls to pg_stat_activity, and pg_stat_database (views), and
pg_stat_get_activity, pg_stat_get_backend_idset, and
pg_stat_get_db_numbackends (functions) should prevent triggering this
bug.

These are likely being run by a monitoring system (eg: check_postgres
from Nagios).

    Thanks!

        Stephen

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Vladimir Borodin

> On 30 March 2015, at 19:44, Stephen Frost <sfrost@snowman.net> wrote:
>
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> root@simply.name writes:
>>> After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
>>> yum.postgresql.org) we have started getting segfaults of different backends.
>>> Backtraces of all coredumps look similar:
>>> (gdb) bt
>>> #0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
>>> optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
>>> #1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
>>> #2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
>>
>> Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
>> It supposes that there are no inactive entries in the sinval array
>> within the range 0 .. lastBackend.  But there can be, in which case
>> dereferencing stateP->proc crashes.  The reason it's hard to reproduce
>> is the relatively narrow window between where pgstat_read_current_status
>> saw the backend as active and where we're inspecting its sinval entry.
>
> As an immediate short-term workaround, from what I can tell,
> disabling calls to pg_stat_activity, and pg_stat_database (views), and
> pg_stat_get_activity, pg_stat_get_backend_idset, and
> pg_stat_get_db_numbackends (functions) should prevent triggering this
> bug.

I suppose pg_stat_replication should not be queried either. We have already done that on the most critical databases, but it is hard to be blind :(

>
> These are likely being run by a monitoring system (eg: check_postgres
> from Nagios).
>
>     Thanks!
>
>         Stephen


--
May the force be with you…
https://simply.name

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Stephen Frost

* Vladimir Borodin (root@simply.name) wrote:
> > On 30 March 2015, at 19:44, Stephen Frost <sfrost@snowman.net> wrote:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> root@simply.name writes:
> >>> After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
> >>> yum.postgresql.org) we have started getting segfaults of different backends.
> >>> Backtraces of all coredumps look similar:
> >>> (gdb) bt
> >>> #0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
> >>> optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
> >>> #1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
> >>> #2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
> >>
> >> Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> >> It supposes that there are no inactive entries in the sinval array
> >> within the range 0 .. lastBackend.  But there can be, in which case
> >> dereferencing stateP->proc crashes.  The reason it's hard to reproduce
> >> is the relatively narrow window between where pgstat_read_current_status
> >> saw the backend as active and where we're inspecting its sinval entry.
> >
> > As an immediate short-term workaround, from what I can tell,
> > disabling calls to pg_stat_activity, and pg_stat_database (views), and
> > pg_stat_get_activity, pg_stat_get_backend_idset, and
> > pg_stat_get_db_numbackends (functions) should prevent triggering this
> > bug.
>
> I suppose pg_stat_replication should not be queried either. We have already done that on the most critical databases, but it is hard to be blind :(

Ah, yes, not sure where I dropped that; it was in my initial list but
didn't make it into the final email.

I would expect a fix to be included in the next point release, hopefully
released in the next couple of months.

    Thanks!

        Stephen

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Stephen Frost

* Vladimir Borodin (root@simply.name) wrote:
>
> > On 30 March 2015, at 19:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > root@simply.name writes:
> >> After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
> >> yum.postgresql.org) we have started getting segfaults of different backends.
> >> Backtraces of all coredumps look similar:
> >> (gdb) bt
> >> #0  0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
> >> optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
> >> #1  0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
> >> #2  0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
> >
> > Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> > It supposes that there are no inactive entries in the sinval array
> > within the range 0 .. lastBackend.  But there can be, in which case
> > dereferencing stateP->proc crashes.  The reason it's hard to reproduce
> > is the relatively narrow window between where pgstat_read_current_status
> > saw the backend as active and where we're inspecting its sinval entry.
>
> I’ve also tried to revert dd1a3bcc where this function appeared but couldn’t do it :( If you would be able to make a build without this commit (if it is easier than fix it in right way), I could install it on several production hosts to test it.

Hopefully a fix will be forthcoming shortly.  Reverting it won't work
though, no, as it included a catalog bump.

    Thanks,

        Stephen
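
Some background on why a catalog bump rules out a simple revert: a data directory records the catalog version it was initialized with, and the server refuses to start when that value differs from the one compiled into the binary. A minimal sketch of that comparison follows. The struct, the function name and the version numbers are made up for illustration; in PostgreSQL the compiled-in constant is CATALOG_VERSION_NO in catversion.h and the stored value lives in pg_control.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the control-file contents. */
typedef struct ControlData {
    uint32_t catalog_version_no;   /* version recorded when the cluster was initialized */
} ControlData;

/* True only if the binary's catalog version matches the one stored on disk. */
bool catalog_compatible(const ControlData *control, uint32_t binary_catalog_version)
{
    return control->catalog_version_no == binary_catalog_version;
}
```

Under this model, a binary built from a tree whose catalog version predates the bump would be rejected by any cluster initialized after it, which is why a targeted fix is more practical than a revert.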

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Tom Lane

Vladimir Borodin <root@simply.name> writes:
> I’ve also tried to revert dd1a3bcc where this function appeared but couldn’t do it :( If you would be able to
> make a build without this commit (if it is easier than fix it in right way), I could install it on several production
> hosts to test it.

Try this.

            regards, tom lane

diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 81b85c0..a2fde89 100644
*** a/src/backend/storage/ipc/sinvaladt.c
--- b/src/backend/storage/ipc/sinvaladt.c
*************** BackendIdGetProc(int backendID)
*** 403,411 ****
  void
  BackendIdGetTransactionIds(int backendID, TransactionId *xid, TransactionId *xmin)
  {
-     ProcState  *stateP;
      SISeg       *segP = shmInvalBuffer;
-     PGXACT       *xact;

      *xid = InvalidTransactionId;
      *xmin = InvalidTransactionId;
--- 403,409 ----
*************** BackendIdGetTransactionIds(int backendID
*** 415,425 ****

      if (backendID > 0 && backendID <= segP->lastBackend)
      {
!         stateP = &segP->procState[backendID - 1];
!         xact = &ProcGlobal->allPgXact[stateP->proc->pgprocno];

!         *xid = xact->xid;
!         *xmin = xact->xmin;
      }

      LWLockRelease(SInvalWriteLock);
--- 413,428 ----

      if (backendID > 0 && backendID <= segP->lastBackend)
      {
!         ProcState  *stateP = &segP->procState[backendID - 1];
!         PGPROC       *proc = stateP->proc;

!         if (proc != NULL)
!         {
!             PGXACT       *xact = &ProcGlobal->allPgXact[proc->pgprocno];
!
!             *xid = xact->xid;
!             *xmin = xact->xmin;
!         }
      }

      LWLockRelease(SInvalWriteLock);

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Vladimir Borodin

> On 30 March 2015, at 20:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Vladimir Borodin <root@simply.name> writes:
>> I’ve also tried to revert dd1a3bcc where this function appeared but couldn’t do it :( If you would be able to make a build without this commit (if it is easier than fix it in right way), I could install it on several production hosts to test it.
>
> Try this.

38 minutes from a bug report to a patch with a fix! You are fantastic. Thanks.

It compiles and passes 'make check' and 'make check-world' (I think you have checked it, but just in case...). I’ve built a package and installed it on one host. If everything is OK, tomorrow I will install it on several hosts and then slowly on more. The problem reproduces across our set of hosts approximately once a week. If the problem disappears I will let you know in a couple of weeks.

Thanks again.


--
May the force be with you…
https://simply.name

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: David Gould

On Mon, 30 Mar 2015 13:00:01 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Vladimir Borodin <root@simply.name> writes:
> > I’ve also tried to revert dd1a3bcc where this function appeared but couldn’t do it :( If you would be able to make a build without this commit (if it is easier than fix it in right way), I could install it on several production hosts to test it.
>
> Try this.

Nice to see a patch, in advance of need ;-) Thanks!

We have had a couple of segfaults recently, but once we enabled core files it
stopped happening. Until just now. I can build with the
patch, but if a 9.4.2 is imminent it would be nice to know before
scheduling an extra round of downtimes.

This is apparently from a python trigger calling get_app_name(). I
can provide the rest of the stack if it would be useful.

Program terminated with signal 11, Segmentation fault.
#0  0x000000000066148b in BackendIdGetTransactionIds (backendID=<value optimized out>, xid=0x7f5d56ae1598, xmin=0x7f5d56ae159c)
    at sinvaladt.c:426
426     sinvaladt.c: No such file or directory.
        in sinvaladt.c
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6_6.5.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x000000000066148b in BackendIdGetTransactionIds (backendID=<value optimized out>, xid=0x7f5d56ae1598, xmin=0x7f5d56ae159c)
    at sinvaladt.c:426
#1  0x000000000061f064 in pgstat_read_current_status () at pgstat.c:2871
#2  0x000000000061f0e9 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
#3  0x00000000006ef373 in pg_stat_get_activity (fcinfo=0x7fffd2e78f50) at pgstatfuncs.c:591
#4  0x00000000005977ec in ExecMakeTableFunctionResult (funcexpr=0x17fdae0, econtext=0x17fd770, argContext=<value optimized out>,
    expectedDesc=0x17ffd70, randomAccess=0 '\000') at execQual.c:2193
#5  0x00000000005a91f2 in FunctionNext (node=0x17fd660) at nodeFunctionscan.c:95
#6  0x00000000005982ce in ExecScanFetch (node=0x17fd660, accessMtd=0x5a8f40 <FunctionNext>, recheckMtd=0x5a8870 <FunctionRecheck>)
    at execScan.c:82
#7  ExecScan (node=0x17fd660, accessMtd=0x5a8f40 <FunctionNext>, recheckMtd=0x5a8870 <FunctionRecheck>) at execScan.c:167
#8  0x00000000005913c8 in ExecProcNode (node=0x17fd660) at execProcnode.c:426
#9  0x000000000058ff32 in ExecutePlan (queryDesc=0x17f81f0, direction=<value optimized out>, count=1) at execMain.c:1486
#10 standard_ExecutorRun (queryDesc=0x17f81f0, direction=<value optimized out>, count=1) at execMain.c:319
#11 0x00007f69a7d3867b in explain_ExecutorRun (queryDesc=0x17f81f0, direction=ForwardScanDirection, count=1) at auto_explain.c:243
#12 0x00007f69a7b33965 in pgss_ExecutorRun (queryDesc=0x17f81f0, direction=ForwardScanDirection, count=1)
    at pg_stat_statements.c:873
#13 0x000000000059bd6c in postquel_getnext (fcinfo=<value optimized out>) at functions.c:853
#14 fmgr_sql (fcinfo=<value optimized out>) at functions.c:1148
#15 0x0000000000595f85 in ExecMakeFunctionResultNoSets (fcache=0x17ed920, econtext=0x17ed730, isNull=0x17ee2a8 " ",
    isDone=<value optimized out>) at execQual.c:2023
#16 0x0000000000591e53 in ExecTargetList (projInfo=<value optimized out>, isDone=0x7fffd2e798fc) at execQual.c:5304
#17 ExecProject (projInfo=<value optimized out>, isDone=0x7fffd2e798fc) at execQual.c:5519
#18 0x00000000005a98fb in ExecResult (node=0x17ed620) at nodeResult.c:155
#19 0x0000000000591478 in ExecProcNode (node=0x17ed620) at execProcnode.c:373
#20 0x000000000058ff32 in ExecutePlan (queryDesc=0x166c610, direction=<value optimized out>, count=0) at execMain.c:1486
#21 standard_ExecutorRun (queryDesc=0x166c610, direction=<value optimized out>, count=0) at execMain.c:319
#22 0x00007f69a7d3867b in explain_ExecutorRun (queryDesc=0x166c610, direction=ForwardScanDirection, count=0) at auto_explain.c:243
#23 0x00007f69a7b33965 in pgss_ExecutorRun (queryDesc=0x166c610, direction=ForwardScanDirection, count=0)
    at pg_stat_statements.c:873
#24 0x00000000005b39d0 in _SPI_pquery (plan=0x7fffd2e79d10, paramLI=0x0, snapshot=<value optimized out>, crosscheck_snapshot=0x0,
    read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:2372
#25 _SPI_execute_plan (plan=0x7fffd2e79d10, paramLI=0x0, snapshot=<value optimized out>, crosscheck_snapshot=0x0,
    read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:2160
#26 0x00000000005b4076 in SPI_execute (src=0x15f6054 "SELECT get_app_name() AS a", read_only=0 '\000', tcount=0) at spi.c:386
#27 0x00007f5d5672f702 in PLy_spi_execute_query (query=0x15f6054 "SELECT get_app_name() AS a", limit=0) at plpy_spi.c:357

-dg

--
David Gould              510 282 0869         daveg@sonic.net
If simplicity worked, the world would be overrun with insects.

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Tom Lane

David Gould <daveg@sonic.net> writes:
> We have had a couple segfaults recently but once we enabled core files it
> stopped happening. Until just now. I can build with the
> patch, but if a 9.4.2 is imminent it would be nice to know before
> scheduling an extra round of downtimes.

No plans for an imminent 9.4.2.  There's been some discussion about a set
of releases in May; the only way something happens sooner than that is
if we find a staggeringly-bad bug.

            regards, tom lane

Re: BUG #12918: Segfault in BackendIdGetTransactionIds

From: Vladimir Borodin

> On 30 March 2015, at 20:54, Vladimir Borodin <root@simply.name> wrote:
>
>>
>> On 30 March 2015, at 20:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Vladimir Borodin <root@simply.name> writes:
>>> I’ve also tried to revert dd1a3bcc where this function appeared but couldn’t do it :( If you would be able to make a build without this commit (if it is easier than fix it in right way), I could install it on several production hosts to test it.
>>
>> Try this.
>
> 38 minutes from a bug report to a patch with a fix! You are fantastic. Thanks.
>
> It compiles and passes 'make check' and 'make check-world' (I think you have checked it, but just in case...). I’ve built a package and installed it on one host. If everything is OK, tomorrow I will install it on several hosts and then slowly on more. The problem reproduces across our set of hosts approximately once a week. If the problem disappears I will let you know in a couple of weeks.

No segfaults for more than a week since I upgraded all hosts. It seems that the patch is good. Thank you very much.

--
May the force be with you…
https://simply.name