Обсуждение: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

Поиск
Список
Период
Сортировка

Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Lawrence Oluyede"
Дата:
While developing a Python program I encountered a situation that makes
on of the PostgreSQL 8.3b4's processes crash badly with a segfault.
Let me explain.

After enabling "debug5" as log level I was able to locate and
reproduce the exact sequence of SQL queries made against my test
database
and crash postgres again using only "psql". See for yourself:


foodb=# \o out.txt
foodb=# SELECT * FROM foobaz.instrument WHERE code = 'TEST0118';
LOG:  duration: 3.683 ms  statement: SELECT * FROM foobaz.instrument
WHERE code = 'TEST0118';
foodb=# SELECT xpath('//sp:description/text()', content,
ARRAY[ARRAY['sp', 'http://www.foobaz.com/']])
foodb-#        FROM foobaz.instrument WHERE code = 'TEST0018';
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

The pgsql log file contains these error logs (the pid 753 is a
postgres child process):

"""""""""
[loluyede:foodb:2007-12-11 14:38:20.992 CET]LOG:  00000: duration:
0.090 ms  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ
COMMITTED
[loluyede:foodb:2007-12-11 14:38:20.992 CET]LOCATION:
exec_simple_query, postgres.c:1040
[::2007-12-11 14:38:20.993 CET]LOG:  00000: server process (PID 753)
was terminated by signal 11: Segmentation fault
[::2007-12-11 14:38:20.993 CET]LOCATION:  LogChildExit, postmaster.c:2510
[::2007-12-11 14:38:20.993 CET]LOG:  00000: terminating any other
active server processes
[::2007-12-11 14:38:20.993 CET]LOCATION:  HandleChildCrash, postmaster.c:2355
[::2007-12-11 14:38:20.997 CET]LOG:  00000: all server processes
terminated; reinitializing
[::2007-12-11 14:38:20.997 CET]LOCATION:  PostmasterStateMachine,
postmaster.c:2663
[::2007-12-11 14:38:21.000 CET]LOG:  00000: database system was
interrupted; last known up at 2007-12-11 14:38:02 CET
[::2007-12-11 14:38:21.000 CET]LOCATION:  StartupXLOG, xlog.c:4789
[::2007-12-11 14:38:21.000 CET]DEBUG:  00000: checkpoint record is at 0/2AB634A0
""""""""

The out.txt file contains the output of the "SELECT *" statement

   code   |    kind    |         last_modified         |
created            | modified_by | created_by |
       content

----------+------------+-------------------------------+------------------------------+-------------+------------+----------------------------------------------------------------------
 TEST0118 | bazfoo | 2007-12-12 08:17:01.846919+01 | 2007-12-12
08:17:00.32337+01 | anonymous   | anonymous  | <xml ... >
(1 row)

The "instrument" table has the following schema:

"""
CREATE TABLE foobaz.instrument
(
-- Inherited:   code character varying(32) NOT NULL,
-- Inherited:   kind character varying(32) NOT NULL DEFAULT
'bazfoo'::character varying,
  last_modified timestamp with time zone NOT NULL,
  created timestamp with time zone NOT NULL DEFAULT
('now'::text)::timestamp(3) with time zone,
  modified_by character varying(64),
  created_by character varying(64) NOT NULL DEFAULT
'anonymous'::character varying,
  content xml NOT NULL,
  CONSTRAINT instrument_pkey PRIMARY KEY (code),
  CONSTRAINT instrument_kind_check CHECK (kind::text = 'bazfoo'::text)
)
INHERITS (foobaz.instrument_base)
WITH (OIDS=FALSE);
"""

Its "parent" table is instrument_base and has the following schema:

"""
CREATE TABLE foobaz.instrument_base
(
  code character varying(32) NOT NULL,
  kind character varying(32) NOT NULL,
  CONSTRAINT instrument_base_pkey PRIMARY KEY (code)
)
WITH (OIDS=FALSE);
"""

PostgreSQL is again 8.3beta 4 on Ubuntu Linux 7.10.
It's compiled with the following flags:
./configure --with-python --with-openssl --with-pam --with-libxml
--with-libxslt --enable-thread-safety --enable-debug

libxml is 2.6.30, libxslt is 1.1.21

I think that is all.

Is it a bug or am I doing something wrong?

--
Lawrence, oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Pavel Stehule"
Дата:
Hello

this is bug. Please send backtrace from core file.

Regards
Pavel Stehule

On 12/12/2007, Lawrence Oluyede <l.oluyede@gmail.com> wrote:
> While developing a Python program I encountered a situation that makes
> on of the PostgreSQL 8.3b4's processes crash badly with a segfault.
> Let me explain.
>
> After enabling "debug5" as log level I was able to locate and
> reproduce the exact sequence of SQL queries made against my test
> database
> and crash postgres again using only "psql". See for yourself:
>
>
> foodb=# \o out.txt
> foodb=# SELECT * FROM foobaz.instrument WHERE code = 'TEST0118';
> LOG:  duration: 3.683 ms  statement: SELECT * FROM foobaz.instrument
> WHERE code = 'TEST0118';
> foodb=# SELECT xpath('//sp:description/text()', content,
> ARRAY[ARRAY['sp', 'http://www.foobaz.com/']])
> foodb-#        FROM foobaz.instrument WHERE code = 'TEST0018';
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> The pgsql log file contains these error logs (the pid 753 is a
> postgres child process):
>
> """""""""
> [loluyede:foodb:2007-12-11 14:38:20.992 CET]LOG:  00000: duration:
> 0.090 ms  statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ
> COMMITTED
> [loluyede:foodb:2007-12-11 14:38:20.992 CET]LOCATION:
> exec_simple_query, postgres.c:1040
> [::2007-12-11 14:38:20.993 CET]LOG:  00000: server process (PID 753)
> was terminated by signal 11: Segmentation fault
> [::2007-12-11 14:38:20.993 CET]LOCATION:  LogChildExit, postmaster.c:2510
> [::2007-12-11 14:38:20.993 CET]LOG:  00000: terminating any other
> active server processes
> [::2007-12-11 14:38:20.993 CET]LOCATION:  HandleChildCrash, postmaster.c:2355
> [::2007-12-11 14:38:20.997 CET]LOG:  00000: all server processes
> terminated; reinitializing
> [::2007-12-11 14:38:20.997 CET]LOCATION:  PostmasterStateMachine,
> postmaster.c:2663
> [::2007-12-11 14:38:21.000 CET]LOG:  00000: database system was
> interrupted; last known up at 2007-12-11 14:38:02 CET
> [::2007-12-11 14:38:21.000 CET]LOCATION:  StartupXLOG, xlog.c:4789
> [::2007-12-11 14:38:21.000 CET]DEBUG:  00000: checkpoint record is at 0/2AB634A0
> """"""""
>
> The out.txt file contains the output of the "SELECT *" statement
>
>    code   |    kind    |         last_modified         |
> created            | modified_by | created_by |
>        content
>
----------+------------+-------------------------------+------------------------------+-------------+------------+----------------------------------------------------------------------
>  TEST0118 | bazfoo | 2007-12-12 08:17:01.846919+01 | 2007-12-12
> 08:17:00.32337+01 | anonymous   | anonymous  | <xml ... >
> (1 row)
>
> The "instrument" table has the following schema:
>
> """
> CREATE TABLE foobaz.instrument
> (
> -- Inherited:   code character varying(32) NOT NULL,
> -- Inherited:   kind character varying(32) NOT NULL DEFAULT
> 'bazfoo'::character varying,
>   last_modified timestamp with time zone NOT NULL,
>   created timestamp with time zone NOT NULL DEFAULT
> ('now'::text)::timestamp(3) with time zone,
>   modified_by character varying(64),
>   created_by character varying(64) NOT NULL DEFAULT
> 'anonymous'::character varying,
>   content xml NOT NULL,
>   CONSTRAINT instrument_pkey PRIMARY KEY (code),
>   CONSTRAINT instrument_kind_check CHECK (kind::text = 'bazfoo'::text)
> )
> INHERITS (foobaz.instrument_base)
> WITH (OIDS=FALSE);
> """
>
> Its "parent" table is instrument_base and has the following schema:
>
> """
> CREATE TABLE foobaz.instrument_base
> (
>   code character varying(32) NOT NULL,
>   kind character varying(32) NOT NULL,
>   CONSTRAINT instrument_base_pkey PRIMARY KEY (code)
> )
> WITH (OIDS=FALSE);
> """
>
> PostgreSQL is again 8.3beta 4 on Ubuntu Linux 7.10.
> It's compiled with the following flags:
> ./configure --with-python --with-openssl --with-pam --with-libxml
> --with-libxslt --enable-thread-safety --enable-debug
>
> libxml is 2.6.30, libxslt is 1.1.21
>
> I think that is all.
>
> Is it a bug or am I doing something wrong?
>
> --
> Lawrence, oluyede.org - neropercaso.it
> "It is difficult to get a man to understand
> something when his salary depends on not
> understanding it" - Upton Sinclair
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Lawrence Oluyede"
Дата:
Here it is:

postgres$ gdb /usr/local/pgsql/bin/postgres core.1600
(gdb) bt
#0  0x082c101c in pfree (pointer=0x8472f00) at mcxt.c:591
#1  0xb7e46513 in xmlCleanupCharEncodingHandlers () from /usr/lib/libxml2.so.2
#2  0xb7e4f091 in xmlCleanupParser () from /usr/lib/libxml2.so.2
#3  0x082940e4 in xpath (fcinfo=0xbfee59c4) at xml.c:3441
#4  0x0817107f in ExecMakeFunctionResult (fcache=0x84723d0,
econtext=0x8472348, isNull=0x8472fa8 "", isDone=0x8472fb8) at
execQual.c:1351
#5  0x0816f522 in ExecProject (projInfo=0x8472e98, isDone=0xbfee5c78)
at execQual.c:4601
#6  0x08175987 in ExecScan (node=0x84722c0, accessMtd=0x817e310
<IndexNext>) at execScan.c:143
#7  0x0817e2e7 in ExecIndexScan (node=0x84722c0) at nodeIndexscan.c:147
#8  0x0816ee7a in ExecProcNode (node=0x84722c0) at execProcnode.c:338
#9  0x0816e113 in ExecutorRun (queryDesc=0x845e908,
direction=ForwardScanDirection, count=0) at execMain.c:1233
#10 0x08204ea0 in PortalRunSelect (portal=0x84577a0, forward=<value
optimized out>, count=0, dest=0x845d1c8) at pquery.c:943
#11 0x08205eaa in PortalRun (portal=0x84577a0, count=2147483647,
isTopLevel=1 '\001', dest=0x845d1c8, altdest=0x845d1c8,
completionTag=0xbfee5ee8 "") at pquery.c:797
#12 0x082013c3 in exec_simple_query (
    query_string=0x8453790 "SELECT xpath('//sp:description/text()',
content, ARRAY[ARRAY['sp',
'http://www.statpro.net/xml/structure/1.0']])\n       FROM instrument
WHERE code = 'TEST0018';") at postgres.c:963
#13 0x08202ebc in PostgresMain (argc=4, argv=<value optimized out>,
username=0x83d2330 "postgres") at postgres.c:3531
#14 0x081d76cf in ServerLoop () at postmaster.c:3180
#15 0x081d83a5 in PostmasterMain (argc=3, argv=0x83ccb08) at postmaster.c:1028
#16 0x081909a0 in main (argc=3, argv=0x82c0c00) at main.c:188
(gdb) l
586             header = (StandardChunkHeader *)
587                     ((char *) pointer - STANDARDCHUNKHEADERSIZE);
588
589             AssertArg(MemoryContextIsValid(header->context));
590
591             (*header->context->methods->free_p) (header->context, pointer);
592     }
593
594     /*
595      * repalloc

I also uploaded the entire core file on the net, if you want to inspect it:
http://www.oluyede.org/temp/core.1600 (it's ~ 30M)

thank you



--
Lawrence, oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Pavel Stehule"
Дата:
Hello

I cannot repeat this bug on my test date. Can you send some small set
that reproduce the bug?

Regards
Pavel

On 12/12/2007, Lawrence Oluyede <l.oluyede@gmail.com> wrote:
> Here it is:
>
> postgres$ gdb /usr/local/pgsql/bin/postgres core.1600
> (gdb) bt
> #0  0x082c101c in pfree (pointer=0x8472f00) at mcxt.c:591
> #1  0xb7e46513 in xmlCleanupCharEncodingHandlers () from /usr/lib/libxml2.so.2
> #2  0xb7e4f091 in xmlCleanupParser () from /usr/lib/libxml2.so.2
> #3  0x082940e4 in xpath (fcinfo=0xbfee59c4) at xml.c:3441
> #4  0x0817107f in ExecMakeFunctionResult (fcache=0x84723d0,
> econtext=0x8472348, isNull=0x8472fa8 "", isDone=0x8472fb8) at
> execQual.c:1351
> #5  0x0816f522 in ExecProject (projInfo=0x8472e98, isDone=0xbfee5c78)
> at execQual.c:4601
> #6  0x08175987 in ExecScan (node=0x84722c0, accessMtd=0x817e310
> <IndexNext>) at execScan.c:143
> #7  0x0817e2e7 in ExecIndexScan (node=0x84722c0) at nodeIndexscan.c:147
> #8  0x0816ee7a in ExecProcNode (node=0x84722c0) at execProcnode.c:338
> #9  0x0816e113 in ExecutorRun (queryDesc=0x845e908,
> direction=ForwardScanDirection, count=0) at execMain.c:1233
> #10 0x08204ea0 in PortalRunSelect (portal=0x84577a0, forward=<value
> optimized out>, count=0, dest=0x845d1c8) at pquery.c:943
> #11 0x08205eaa in PortalRun (portal=0x84577a0, count=2147483647,
> isTopLevel=1 '\001', dest=0x845d1c8, altdest=0x845d1c8,
> completionTag=0xbfee5ee8 "") at pquery.c:797
> #12 0x082013c3 in exec_simple_query (
>     query_string=0x8453790 "SELECT xpath('//sp:description/text()',
> content, ARRAY[ARRAY['sp',
> 'http://www.statpro.net/xml/structure/1.0']])\n       FROM instrument
> WHERE code = 'TEST0018';") at postgres.c:963
> #13 0x08202ebc in PostgresMain (argc=4, argv=<value optimized out>,
> username=0x83d2330 "postgres") at postgres.c:3531
> #14 0x081d76cf in ServerLoop () at postmaster.c:3180
> #15 0x081d83a5 in PostmasterMain (argc=3, argv=0x83ccb08) at postmaster.c:1028
> #16 0x081909a0 in main (argc=3, argv=0x82c0c00) at main.c:188
> (gdb) l
> 586             header = (StandardChunkHeader *)
> 587                     ((char *) pointer - STANDARDCHUNKHEADERSIZE);
> 588
> 589             AssertArg(MemoryContextIsValid(header->context));
> 590
> 591             (*header->context->methods->free_p) (header->context, pointer);
> 592     }
> 593
> 594     /*
> 595      * repalloc
>
> I also uploaded the entire core file on the net, if you want to inspect it:
> http://www.oluyede.org/temp/core.1600 (it's ~ 30M)
>
> thank you
>
>
>
> --
> Lawrence, oluyede.org - neropercaso.it
> "It is difficult to get a man to understand
> something when his salary depends on not
> understanding it" - Upton Sinclair
>

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Pavel Stehule"
Дата:
sorry

s/date/data/g

Pavel

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Lawrence Oluyede"
Дата:
On Dec 12, 2007 1:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I cannot repeat this bug on my test date. Can you send some small set
> that reproduce the bug?

I dumped the entire table here: http://www.oluyede.org/temp/instrument.txt

Let me know if you need anything else


--
Lawrence, oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
Tom Lane
Дата:
"Lawrence Oluyede" <l.oluyede@gmail.com> writes:
> On Dec 12, 2007 1:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> I cannot repeat this bug on my test date. Can you send some small set
>> that reproduce the bug?

> I dumped the entire table here: http://www.oluyede.org/temp/instrument.txt

I didn't have any luck reproducing the crash either.

Please try to put together a SQL script that reproduces the crash for
you, starting from an empty database.  It's not exactly clear to me
whether I'm doing the same things you did.

            regards, tom lane

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
Tom Lane
Дата:
"Lawrence Oluyede" <l.oluyede@gmail.com> writes:
> Here it is:
> postgres$ gdb /usr/local/pgsql/bin/postgres core.1600
> (gdb) bt
> #0  0x082c101c in pfree (pointer=0x8472f00) at mcxt.c:591
> #1  0xb7e46513 in xmlCleanupCharEncodingHandlers () from /usr/lib/libxml2.so.2
> #2  0xb7e4f091 in xmlCleanupParser () from /usr/lib/libxml2.so.2
> #3  0x082940e4 in xpath (fcinfo=0xbfee59c4) at xml.c:3441

I think we finally found the problem.  Please see if things are more
stable with 8.3RC1 plus this patch:

http://archives.postgresql.org/pgsql-committers/2008-01/msg00190.php

            regards, tom lane

Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)

От
"Lawrence Oluyede"
Дата:
> I think we finally found the problem.  Please see if things are more
> stable with 8.3RC1 plus this patch:
>
> http://archives.postgresql.org/pgsql-committers/2008-01/msg00190.php

Sorry for being so late. I'm currently using PostgreSQL RC2 at work
and I notice no crashes.
Thanks!


--
Lawrence, stacktrace.it - oluyede.org - neropercaso.it
"It is difficult to get a man to understand
something when his salary depends on not
understanding it" - Upton Sinclair