Обсуждение: Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

Поиск
Список
Период
Сортировка

Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

От
"Matt Magoffin"
Дата:
Hello, I'm using 8.3b4 and keep experiencing server crash when I execute
various queries using XML functions. The crash backtraces look like this:

---
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x3f847ae1
0x004b140c in xmlCleanupCharEncodingHandlers ()
(gdb) bt
#0  0x004b140c in xmlCleanupCharEncodingHandlers ()
#1  0x004d01fc in xmlCleanupParser ()
#2  0x0026d514 in xpath (fcinfo=<value temporarily unavailable, due to
optimizations>) at xml.c:3441
#3  0x0010b908 in ExecMakeFunctionResult (fcache=0xa08474,
econtext=0x7fe9ec, isNull=0xbfffdb08 "", isDone=0xbfffd8c8) at
execQual.c:1351
#4  0x0010a120 in ExecEvalArrayCoerceExpr (astate=0xa08428,
econtext=0x7fe9ec, isNull=0xbfffdb08 "", isDone=0xbfffd8c8) at
execQual.c:3633
#5  0x0010b4c0 in ExecEvalFuncArgs (fcinfo=0xbfffd968, argList=<value
temporarily unavailable, due to optimizations>, econtext=0x7fe9ec) at
execQual.c:1111
#6  0x0010b5fc in ExecMakeFunctionResult (fcache=0xa0801c,
econtext=0x7fe9ec, isNull=0xbfffdbf8 "", isDone=0x0) at execQual.c:1169
#7  0x0010c120 in ExecQual (qual=<value temporarily unavailable, due to
optimizations>, econtext=0x7fe9ec, resultForNull=<value temporarily
unavailable, due to optimizations>) at execQual.c:4501
#8  0x0010fcb8 in ExecScan (node=0x7fe960, accessMtd=0x119a20 <IndexNext>)
at execScan.c:131
#9  0x00107e80 in ExecProcNode (node=0x7fe960) at execProcnode.c:338
#10 0x0011c818 in ExecNestLoop (node=0x107bb8c) at nodeNestloop.c:170
#11 0x00107ee0 in ExecProcNode (node=0x107bb8c) at execProcnode.c:367
#12 0x0011e158 in ExecSort (node=0x107bb00) at nodeSort.c:102
#13 0x00107f20 in ExecProcNode (node=0x107bb00) at execProcnode.c:386
#14 0x00105f28 in ExecutorRun (queryDesc=<value temporarily unavailable,
due to optimizations>, direction=ForwardScanDirection, count=0) at
execMain.c:1233
#15 0x001c6834 in PortalRunSelect (portal=0x104be1c, forward=<value
temporarily unavailable, due to optimizations>, count=0, dest=0x103a228)
at pquery.c:943
#16 0x001c7c4c in PortalRun (portal=0x104be1c, count=2147483647,
isTopLevel=0 '\0', dest=0x103a228, altdest=0x103a228,
completionTag=0xbfffe3c4 "") at pquery.c:769
#17 0x001c59b4 in PostgresMain (argc=<value temporarily unavailable, due
to optimizations>, argv=<value temporarily unavailable, due to
optimizations>, username=<value temporarily unavailable, due to
optimizations>) at postgres.c:1844
#18 0x0018a468 in ServerLoop () at postmaster.c:3180
#19 0x0018b974 in PostmasterMain (argc=4, argv=0x800550) at postmaster.c:1028
#20 0x0012e2fc in main (argc=4, argv=0x800550) at main.c:188

---
#0  0x95c59ca8 in __kill ()
#1  0x95cfe7b8 in abort ()
#2  0x00281338 in ExceptionalCondition (conditionName=<value temporarily
unavailable, due to optimizations>, errorType=<value temporarily
unavailable, due to optimizations>, fileName=<value temporarily
unavailable, due to optimizations>, lineNumber=<value temporarily
unavailable, due to optimizations>) at assert.c:57
#3  0x002a1460 in pfree (pointer=0x207) at mcxt.c:581
#4  0x004b1428 in xmlCleanupCharEncodingHandlers ()
#5  0x004d01fc in xmlCleanupParser ()
#6  0x0026d514 in xpath (fcinfo=<value temporarily unavailable, due to
optimizations>) at xml.c:3441
#7  0x0010b908 in ExecMakeFunctionResult (fcache=0xa08474,
econtext=0x7fe9ec, isNull=0xbfffdb08 "", isDone=0xbfffd8c8) at
execQual.c:1351
#8  0x0010a120 in ExecEvalArrayCoerceExpr (astate=0xa08428,
econtext=0x7fe9ec, isNull=0xbfffdb08 "", isDone=0xbfffd8c8) at
execQual.c:3633
#9  0x0010b4c0 in ExecEvalFuncArgs (fcinfo=0xbfffd968, argList=<value
temporarily unavailable, due to optimizations>, econtext=0x7fe9ec) at
execQual.c:1111
#10 0x0010b5fc in ExecMakeFunctionResult (fcache=0xa0801c,
econtext=0x7fe9ec, isNull=0xbfffdbf8 "", isDone=0x0) at execQual.c:1169
#11 0x0010c120 in ExecQual (qual=<value temporarily unavailable, due to
optimizations>, econtext=0x7fe9ec, resultForNull=<value temporarily
unavailable, due to optimizations>) at execQual.c:4501
#12 0x0010fcb8 in ExecScan (node=0x7fe960, accessMtd=0x119a20 <IndexNext>)
at execScan.c:131
#13 0x00107e80 in ExecProcNode (node=0x7fe960) at execProcnode.c:338
#14 0x0011c818 in ExecNestLoop (node=0x107c58c) at nodeNestloop.c:170
#15 0x00107ee0 in ExecProcNode (node=0x107c58c) at execProcnode.c:367
#16 0x0011e158 in ExecSort (node=0x107c500) at nodeSort.c:102
#17 0x00107f20 in ExecProcNode (node=0x107c500) at execProcnode.c:386
#18 0x00105f28 in ExecutorRun (queryDesc=<value temporarily unavailable,
due to optimizations>, direction=ForwardScanDirection, count=0) at
execMain.c:1233
#19 0x001c6834 in PortalRunSelect (portal=0x104781c, forward=<value
temporarily unavailable, due to optimizations>, count=0, dest=0x102d828)
at pquery.c:943
#20 0x001c7c4c in PortalRun (portal=0x104781c, count=2147483647,
isTopLevel=0 '\0', dest=0x102d828, altdest=0x102d828,
completionTag=0xbfffe3c4 "") at pquery.c:769
#21 0x001c59b4 in PostgresMain (argc=<value temporarily unavailable, due
to optimizations>, argv=<value temporarily unavailable, due to
optimizations>, username=<value temporarily unavailable, due to
optimizations>) at postgres.c:1844
#22 0x0018a468 in ServerLoop () at postmaster.c:3180
#23 0x0018b974 in PostmasterMain (argc=4, argv=0x800550) at postmaster.c:1028
#24 0x0012e2fc in main (argc=4, argv=0x800550) at main.c:188

The queries are executing via JDBC. When I then manually run the same
query in psql, filling in the statement parameters that were passed to the
query in Java, I seem to be able to run the query without error. An
example query plan is this:

 Sort  (cost=1815.26..1815.26 rows=1 width=58)
   Sort Key: m.range
   ->  Nested Loop  (cost=1371.75..1815.25 rows=1 width=58)
         ->  Unique  (cost=1371.75..1372.15 rows=53 width=40)
               ->  Sort  (cost=1371.75..1371.88 rows=53 width=40)
                     Sort Key: s.vin, m.lost_sale, m.lead_pos
                     ->  Hash Join  (cost=768.80..1370.23 rows=53 width=40)
                           Hash Cond: (m.sale_id = s.id)
                           ->  Seq Scan on mb_lead m  (cost=0.00..561.25
rows=10573 width=26)
                                 Filter: (NOT lost_sale)
                           ->  Hash  (cost=760.44..760.44 rows=669 width=30)
                                 ->  Index Scan using
mb_sale_sale_date_idx on mb_sale s
(cost=0.00..760.44 rows=669 width=30)
                                       Index Cond: ((sale_date >=
'2007-09-01'::date) AND (sale_date
< '2007-10-01'::date))
                                       Filter: (sale_type = 'd'::bpchar)
         ->  Index Scan using lead_pkey on lead l  (cost=0.00..8.34 rows=1
width=16)
               Index Cond: (l.id = m.lead_id)
               Filter:
((xpath('/als:auto-lead-service/als:meta[@key="com.autoleadservice.TypeFlag"]/text()'::text,
l.xml,
'{{als,http://autoleadservice.com/xml/als}}'::text[]))::text[]
&& '{foo,bar}'::text[])

Any ideas?

-- m@

Re: Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

От
Tom Lane
Дата:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
> Hello, I'm using 8.3b4 and keep experiencing server crash when I execute
> various queries using XML functions. The crash backtraces look like this:

This was reported before,
http://archives.postgresql.org/pgsql-general/2007-12/msg00716.php
but neither he nor you have provided anything approximating a
reproducible test case.  The interactions with libxml are messy
enough that I'm not even going to think about fixing this without
a test case to trace through.

            regards, tom lane

Re: Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

От
"Matt Magoffin"
Дата:
> This was reported before,
> http://archives.postgresql.org/pgsql-general/2007-12/msg00716.php
> but neither he nor you have provided anything approximating a
> reproducible test case.  The interactions with libxml are messy
> enough that I'm not even going to think about fixing this without
> a test case to trace through.

I'll try to get a test case to you. I've had trouble getting it to
consistently crash, except via JDBC. Do you still happen to have that
database dump I provided to you previously? I can try to get the crash to
occur on that table.

I also noticed these in my log file, don't know if this is helpful:

TRAP: FailedAssertion("!(pointer == (void *) (((long) ((pointer)) + ((4) -
1)) & ~((long) ((4) - 1))))", File: "mcxt.c", Line: 581)
LOG:  server process (PID 714) was terminated by signal 6: Abort trap

TRAP: BadArgument("!(((header->context) != ((void *)0) &&
(((((Node*)((header->context)))->type) == T_AllocSetContext))))", File:
"mcxt.c", Line: 589)
LOG:  server process (PID 633) was terminated by signal 6: Abort trap

-- m@



Re: Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

От
Tom Lane
Дата:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
> Do you still happen to have that
> database dump I provided to you previously?

Not sure --- when are you thinking of, and what was the context?
I don't usually keep sample data unless the issue still seems open.

> I also noticed these in my log file, don't know if this is helpful:
> TRAP: FailedAssertion("!(pointer == (void *) (((long) ((pointer)) + ((4) -
> 1)) & ~((long) ((4) - 1))))", File: "mcxt.c", Line: 581)
> LOG:  server process (PID 714) was terminated by signal 6: Abort trap
> TRAP: BadArgument("!(((header->context) != ((void *)0) &&
> (((((Node*)((header->context)))->type) == T_AllocSetContext))))", File:
> "mcxt.c", Line: 589)
> LOG:  server process (PID 633) was terminated by signal 6: Abort trap

These are consistent with the idea that we've got a memory-allocation
problem, ie, libxml is trying to access data that was already freed.
But exactly where and how is not any more clear than before.

FWIW, I think it's unlikely that a single query will reproduce this,
because the problem looks to be an expectation that leftover data is
still valid when it ain't.  What you need to be looking for is a series
of two or more queries that crash PG.  Possibly it'll be easier to
reproduce with that in mind ...

            regards, tom lane

Re: Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

От
"Matt Magoffin"
Дата:
> Not sure --- when are you thinking of, and what was the context?
> I don't usually keep sample data unless the issue still seems open.

I was referring to a dump I provided a link to you called
"pg83-leads-sanitized.db" which was around 20 Dec, with email subject "Re:
[GENERAL] 8.3b2 XPath-based function index server crash".

>> I also noticed these in my log file, don't know if this is helpful:
>> TRAP: FailedAssertion("!(pointer == (void *) (((long) ((pointer)) + ((4)
>> -
>> 1)) & ~((long) ((4) - 1))))", File: "mcxt.c", Line: 581)
>> LOG:  server process (PID 714) was terminated by signal 6: Abort trap
>> TRAP: BadArgument("!(((header->context) != ((void *)0) &&
>> (((((Node*)((header->context)))->type) == T_AllocSetContext))))", File:
>> "mcxt.c", Line: 589)
>> LOG:  server process (PID 633) was terminated by signal 6: Abort trap
>
> These are consistent with the idea that we've got a memory-allocation
> problem, ie, libxml is trying to access data that was already freed.
> But exactly where and how is not any more clear than before.
>
> FWIW, I think it's unlikely that a single query will reproduce this,
> because the problem looks to be an expectation that leftover data is
> still valid when it ain't.  What you need to be looking for is a series
> of two or more queries that crash PG.  Possibly it'll be easier to
> reproduce with that in mind ...

Thanks for the tips. I am trying to get some sort of reproducible series
of queries, but so far no luck. I'll let you know if I find anything.

-- m@



Re: Many 8.3 server crashes in xmlCleanupCharEncodingHandlers()

От
Tom Lane
Дата:
"Matt Magoffin" <postgresql.org@msqr.us> writes:
> Hello, I'm using 8.3b4 and keep experiencing server crash when I execute
> various queries using XML functions.

Please see if it's better with 8.3RC1 plus this patch:

http://archives.postgresql.org/pgsql-committers/2008-01/msg00190.php

            regards, tom lane