Обсуждение: BUG #3860: xpath crashes backend when is querying xmlagg result
The following bug has been logged online: Bug reference: 3860 Logged by: Sokolov Yura Email address: funny.falcon@gmail.com PostgreSQL version: 8.3beta4 Operating system: Debian Linux 4.0r2 both 32bit and amd64 Description: xpath crashes backend when is querying xmlagg result Details: xpath() crashes backend when is querying particular xmlagg results. Behavior is unstable, but it is reproduced with command line: psql < bg***.sql and initially observed with pgadmin3 1.8 -------- bg1.sql--------------- -- crashed, when queries xmlelement with containment given by xmlagg -- and type of aggregated elements is different select xpath('any_non_empty_expression', xmlelement(name a, xmlagg(el) )) as el from ( values ( xmlelement(name b, 0) ), -- integer ( xmlelement(name c, 0.1) ) -- numeric, string crashe backend -- on 32bit Debian float4 and float8 does not -- on 64 bit Debian float4 does but float8 not -- order is irrelevant ) as t(el); -----end bg1.sql -------------- -------- bg2.sql--------------- -- crashed, when queries xml given by xmlagg -- and types of elements text() is different select xpath('any_non_empty_expression', xmlagg(el) ) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1) ) ) as t(el); -----end bg2.sql -------------- -------- bg3.sql--------------- -- this crashes 64bit machine select xpath('/a/c', xmlagg(el)) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1::float4) ) ) as t(el); -----end bg3.sql -------------- -------- bg-not31.sql ---------- -- but this works on 64bit machine select xpath('/b', xmlagg(el)) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1::float8) ) ) as t(el); -----end bg-not31.sql ---------- -------- bg-not32.sql ---------- -- and, hardly believed, combined file works too select xpath('/b', xmlagg(el)) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1::float8) ) ) as t(el); select xpath('/c', xmlagg(el)) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1::float4) ) ) as t(el); -----end bg-not32.sql ---------- -------- bg-not.sql ----------- -- Empty xpath expression catched - no backend crash select xpath('',xmlelement(name a, xmlagg(el) )) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1) ) ) as t(el); -----end bg-not.sql ----------- -------- bg-not1.sql ----------- -- when types are same no backend crash -- with numeric works on 32bit machine, but crashes on 64bit select xpath('/a/b',xmlelement(name a, xmlagg(el) )) as el from ( values ( xmlelement(name b, 0.1) ), ( xmlelement(name c, 0.1) ) ) as t(el); -----end bg-not1.sql ----------- -------- bg-not2.sql ----------- -- when elements given explicitly then no backend crash select xpath('/a/c', xmlelement(name a, xmlelement(name b, 0), xmlelement(name c, 0.1) )); -----end bg-not2.sql ----------- -------- bg-not3.sql ----------- -- xmlelement itself not crashed select xmlelement(name a, xmlagg(el) ) as el from ( values ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1) ) ) as t(el); -----end bg-not3.sql ----------- -------- bg-not4.sql ----------- -- when elements given with xmlconcat2 then no backend crash select xpath('/a/c', xmlelement(name a, xmlconcat2( xmlelement(name b, 0), xmlelement(name c, 0.1)) )); -----end bg-not4.sql -----------
Sokolov Yura escribió: > -------- bg1.sql--------------- > -- crashed, when queries xmlelement with containment given by xmlagg > -- and type of aggregated elements is different > select xpath('any_non_empty_expression', xmlelement(name a, xmlagg(el) )) as > el > from ( values > ( xmlelement(name b, 0) ), -- integer > ( xmlelement(name c, 0.1) ) -- numeric, string crashe backend > -- on 32bit Debian float4 and float8 does > not > -- on 64 bit Debian float4 does but float8 > not > -- order is irrelevant > ) as t(el); > -----end bg1.sql -------------- I can reproduce this crash. The backtrace looks like this: (gdb) bt #0 0x000000000076a06e in pfree (pointer=0xc4b1c8) at /pgsql/source/00head/src/backend/utils/mmgr/mcxt.c:589 #1 0x000000000072c773 in xml_pfree (ptr=0xc4b1c8) at /pgsql/source/00head/src/backend/utils/adt/xml.c:1342 #2 0x00002b7c3b3ecb12 in xmlCleanupCharEncodingHandlers () from /usr/lib/libxml2.so.2 #3 0x00002b7c3b3f4d75 in xmlCleanupParser () from /usr/lib/libxml2.so.2 #4 0x0000000000730778 in xpath (fcinfo=0x7fff6f909d70) at /pgsql/source/00head/src/backend/utils/adt/xml.c:3441 #5 0x000000000058dc89 in ExecMakeFunctionResult (fcache=0xc40780, econtext=0xc40688, isNull=0xc416f8 "", isDone=0xc417b0) at /pgsql/source/00head/src/backend/executor/execQual.c:1351 #6 0x000000000058e587 in ExecEvalFunc (fcache=0xc40780, econtext=0xc40688, isNull=0xc416f8 "", isDone=0xc417b0) at /pgsql/source/00head/src/backend/executor/execQual.c:1753 #7 0x0000000000594ed6 in ExecTargetList (targetlist=0xc40f08, econtext=0xc40688, values=0xc416d8, isnull=0xc416f8 "", itemIsDone=0xc417b0, isDone=0x0) at /pgsql/source/00head/src/backend/executor/execQual.c:4601 #8 0x00000000005953a9 in ExecProject (projInfo=0xc41718, isDone=0x0) at /pgsql/source/00head/src/backend/executor/execQual.c:4802 #9 0x000000000059b5e4 in agg_retrieve_direct (aggstate=0xc40378) at /pgsql/source/00head/src/backend/executor/nodeAgg.c:989 #10 0x000000000059b2e0 in ExecAgg (node=0xc40378) at /pgsql/source/00head/src/backend/executor/nodeAgg.c:816 #11 0x000000000058b34b in ExecProcNode (node=0xc40378) at /pgsql/source/00head/src/backend/executor/execProcnode.c:394 #12 0x0000000000588a60 in ExecutePlan (estate=0xc3fe48, planstate=0xc40378, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection, dest=0xc30fc0) at /pgsql/source/00head/src/backend/executor/execMain.c:1233 #13 0x000000000058721a in ExecutorRun (queryDesc=0xc08ed8, direction=ForwardScanDirection, count=0) at /pgsql/source/00head/src/backend/executor/execMain.c:267 #14 0x000000000066692b in PortalRunSelect (portal=0xc37228, forward=1 '\001', count=0, dest=0xc30fc0) at /pgsql/source/00head/src/backend/tcop/pquery.c:943 #15 0x000000000066657b in PortalRun (portal=0xc37228, count=9223372036854775807, isTopLevel=1 '\001', dest=0xc30fc0, altdest=0xc30fc0, completionTag=0x7fff6f90a6e0 "") at /pgsql/source/00head/src/backend/tcop/pquery.c:769 #16 0x0000000000660a1f in exec_simple_query ( query_string=0xbf3868 "select xpath('any_non_empty_expression', xmlelement(name a, xmlagg(el) )) as el\nfrom ( v alues ( xmlelement(name b, 0) ), ( xmlelement(name c, 0.1) ) ) as t(el);") at /pgsql/source/00head/src/backend/tcop/postgres.c:963 #17 0x0000000000664917 in PostgresMain (argc=4, argv=0xb4ca28, username=0xb4c9d8 "alvherre") at /pgsql/source/00head/src/backend/tcop/postgres.c:3535 #18 0x000000000062609c in BackendRun (port=0xb72400) at /pgsql/source/00head/src/backend/postmaster/postmaster.c:3180 #19 0x00000000006255de in BackendStartup (port=0xb72400) at /pgsql/source/00head/src/backend/postmaster/postmaster.c:2803 -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > I can reproduce this crash. The backtrace looks like this: Ah, good. This looks like it matches a previous report, but we didn't have a reproducible test case: http://archives.postgresql.org/pgsql-general/2007-12/msg01086.php regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > I can reproduce this crash. The backtrace looks like this: Hmm, what I'm seeing is that libxml is apparently trying to pfree something that it didn't allocate via palloc ... -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Hmm, what I'm seeing is that libxml is apparently trying to pfree > something that it didn't allocate via palloc ... Not totally surprising --- when we call xmlMemSetup() we are telling libxml to start using xml_palloc etc rather than its default of malloc etc. We had some bugs earlier involving not calling xmlMemSetup soon enough during some seqauence of operations; this may be another. Another likely theory is that libxml is trying to pfree something that had been palloc'd in a previous cycle of function calls, and then went away in a memory context reset. We try to shut the library down completely at exit from xml.c functions, so that it doesn't think it has any open allocated memory, buut there may be something we've missed doing. See the note at lines 27ff of xml.c. If nothing else comes to mind, try instrumenting xml_palloc and friends to print a trace of what they're doing, and match up the calls ... regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Hmm, what I'm seeing is that libxml is apparently trying to pfree > > something that it didn't allocate via palloc ... > > Not totally surprising --- when we call xmlMemSetup() we are telling > libxml to start using xml_palloc etc rather than its default of malloc > etc. We had some bugs earlier involving not calling xmlMemSetup soon > enough during some seqauence of operations; this may be another. The crash is on free of the character encoding stuff in libxml. So I traced its initialization and xmlMemSetup, and my conclusion that this is not the problem -- xmlMemSetup is called earlier. > Another likely theory is that libxml is trying to pfree something that > had been palloc'd in a previous cycle of function calls, and then went > away in a memory context reset. We try to shut the library down > completely at exit from xml.c functions, so that it doesn't think it has > any open allocated memory, buut there may be something we've missed > doing. See the note at lines 27ff of xml.c. [ ... a lot of playing with gdb later, including clearing up confusion between tracepoints and watchpoints ... ] What's happening is that libxml encoding handler table is being allocated in an ExprContext which apparently is too short-lived. I'm not seeing very well how to handle this -- one idea would be to manually call xmlInitCharEncodingHandlers (which is public but supposed to be called internally by libxml) with a longer-lived context, but I wonder whether there's some other initialization that will come bite us. Another idea would be to initialize and then destroy the libxml state separately for each expression, which perhaps doesn't really work at all. Perhaps a better idea is to create a separate LibxmlContext memcxt, child of QueryContext, and have xml_palloc etc always use that. That way it won't be reset between calls. It probably also means we could wire xml reset to transaction abort, making it a bit simpler. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > What's happening is that libxml encoding handler table is being > allocated in an ExprContext which apparently is too short-lived. > I'm not seeing very well how to handle this -- one idea would be to > manually call xmlInitCharEncodingHandlers (which is public but supposed > to be called internally by libxml) with a longer-lived context, but I > wonder whether there's some other initialization that will come bite us. Ugh. This seems like about the worst-case scenario: we don't know when libxml chooses to allocate or free this data structure, and apparently we have no way to force it to be freed. > Another idea would be to initialize and then destroy the libxml state > separately for each expression, which perhaps doesn't really work at > all. That's what we're trying to do now, I thought. > Perhaps a better idea is to create a separate LibxmlContext memcxt, > child of QueryContext, and have xml_palloc etc always use that. That > way it won't be reset between calls. It probably also means we could > wire xml reset to transaction abort, making it a bit simpler. Might as well go back to letting it use malloc :-(. I actually don't see a problem with letting it use malloc for stuff that it is going to manage for itself. I guess the problem comes with transient return values of libxml functions; we'd want to explicitly move those into palloc-based storage and then free() them. This looks like a rather large rewrite though. Peter, have you any better ideas? regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > What's happening is that libxml encoding handler table is being > > allocated in an ExprContext which apparently is too short-lived. > > > I'm not seeing very well how to handle this -- one idea would be to > > manually call xmlInitCharEncodingHandlers (which is public but supposed > > to be called internally by libxml) with a longer-lived context, but I > > wonder whether there's some other initialization that will come bite us. > > Ugh. This seems like about the worst-case scenario: we don't know when > libxml chooses to allocate or free this data structure, and apparently > we have no way to force it to be freed. Yeah, we can free it with xmlCleanupParser. Too blunt an instrument, perhaps? > > Another idea would be to initialize and then destroy the libxml state > > separately for each expression, which perhaps doesn't really work at > > all. > > That's what we're trying to do now, I thought. Yeah, apparently it doesn't work very well. > I actually don't see a problem with letting it use malloc for stuff that > it is going to manage for itself. I guess the problem comes with > transient return values of libxml functions; we'd want to explicitly > move those into palloc-based storage and then free() them. I guess the problem is that the objects can contain pointers and we have no way of following those to copy them. Perhaps we can make it use malloc, let it create the object, switch to palloc, create a clone, revert to malloc, free the original. What problem you have with it having its own memory context? I don't think it has any particular disadvantage. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > What problem you have with it having its own memory context? I don't > think it has any particular disadvantage. I don't object to that per se; I'm just saying it doesn't do much to fix the problem. The difficulty we've got here ultimately comes down to separating long-lived objects from not-long-lived ones. It seems libxml doesn't have adequate controls to let us do that. If we run libxml in a context we don't ever reset, then we leak memory if an error occurs while we're fooling around with a function result. If we do reset the context, then we've got exactly this problem again of possibly deleting objects that libxml thinks are still there. We might be able to compromise by only resetting the context after an error, but this is still only possible if we have a way to make libxml let go of *all* pointers to alloc'd objects. I don't understand your comment that xmlCleanupParser solves it --- we call that already, and it doesn't seem to be preventing the problem. regards, tom lane
Tom Lane escribió: > We might be able to compromise by only resetting the context after > an error, but this is still only possible if we have a way to make > libxml let go of *all* pointers to alloc'd objects. I don't understand > your comment that xmlCleanupParser solves it --- we call that already, > and it doesn't seem to be preventing the problem. We call it *after* the context has been reset -- or at least that's what gdb is showing me. Hardware watchpoint 3: handlers[0] Old value = (xmlCharEncodingHandlerPtr) 0xc43580 New value = (xmlCharEncodingHandlerPtr) 0x7f7f7f7f7f7f7f7f 0x00002b579ad84283 in memset () from /lib/libc.so.6 (gdb) bt #0 0x00002b579ad84283 in memset () from /lib/libc.so.6 #1 0x0000000000768b69 in AllocSetReset (context=0xc1d970) at /pgsql/source/00head/src/backend/utils/mmgr/aset.c:456 #2 0x0000000000769f42 in MemoryContextReset (context=0xc1d970) at /pgsql/source/00head/src/backend/utils/mmgr/mcxt.c:130 #3 0x000000000059791c in ReScanExprContext (econtext=0xc42050) at /pgsql/source/00head/src/backend/executor/execUtils.c:447 #4 0x00000000005a6ed8 in ValuesNext (node=0xc41f38) at /pgsql/source/00head/src/backend/executor/nodeValuesscan.c:106 #5 0x00000000005957ec in ExecScan (node=0xc41f38, accessMtd=0x5a6d98 <ValuesNext>) at /pgsql/source/00head/src/backend/executor/execScan.c:68 #6 0x00000000005a706c in ExecValuesScan (node=0xc41f38) at /pgsql/source/00head/src/backend/executor/nodeValuesscan.c:173 #7 0x000000000058b6bd in ExecProcNode (node=0xc41f38) at /pgsql/source/00head/src/backend/executor/execProcnode.c:360 #8 0x000000000059b86b in agg_retrieve_direct (aggstate=0xc41448) at /pgsql/source/00head/src/backend/executor/nodeAgg.c:921 #9 0x000000000059b6d0 in ExecAgg (node=0xc41448) at /pgsql/source/00head/src/backend/executor/nodeAgg.c:816 #10 0x000000000058b73b in ExecProcNode (node=0xc41448) at /pgsql/source/00head/src/backend/executor/execProcnode.c:394 #11 0x0000000000588e50 in ExecutePlan (estate=0xc40f18, planstate=0xc41448, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection, dest=0xc31748) at /pgsql/source/00head/src/backend/executor/execMain.c:1233 #12 0x000000000058760a in ExecutorRun (queryDesc=0xc09f78, direction=ForwardScanDirection, count=0) at /pgsql/source/00head/src/backend/executor/execMain.c:267 #13 0x000000000066715b in PortalRunSelect (portal=0xc382f8, forward=1 '\001', count=0, dest=0xc31748) at /pgsql/source/00head/src/backend/tcop/pquery.c:943 #14 0x0000000000666dab in PortalRun (portal=0xc382f8, count=9223372036854775807, isTopLevel=1 '\001', dest=0xc31748, altdest=0xc31748, completionTag=0x7fff10f94cf0 "") at /pgsql/source/00head/src/backend/tcop/pquery.c:769 #15 0x000000000066125f in exec_simple_query ( query_string=0xbf4938 "select xpath('any_non_empty_expression', xmlagg(el) ) as el\nfrom ( values\n ( xmlelement(nameb, 0) ),\n ( xmlelement(name c, 0.1) )\n) as t(el);") at /pgsql/source/00head/src/backend/tcop/postgres.c:963 [...] -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera escribió: > Tom Lane escribió: > > > We might be able to compromise by only resetting the context after > > an error, but this is still only possible if we have a way to make > > libxml let go of *all* pointers to alloc'd objects. I don't understand > > your comment that xmlCleanupParser solves it --- we call that already, > > and it doesn't seem to be preventing the problem. > > We call it *after* the context has been reset -- or at least that's what > gdb is showing me. > > Hardware watchpoint 3: handlers[0] I forgot to mention that I had a breakpoint on xmlCleanupParser that didn't fire. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Tom Lane escribió: > We might be able to compromise by only resetting the context after > an error, but this is still only possible if we have a way to make > libxml let go of *all* pointers to alloc'd objects. I don't understand > your comment that xmlCleanupParser solves it --- we call that already, > and it doesn't seem to be preventing the problem. With the attached patch, it doesn't crash, but I see the added WARNING four times in the log, which is proof that the cleanup thing is not called as the code seems to think. I wonder -- is this thing supposed to be reentrant? I think that's the whole problem with it. (I think what I'm doing in xml_init in the non-first case is bogus anyway -- but I post the patch to show my point.) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Вложения
I wonder whether the real issue here isn't that we have some functions that invoke libxml without ultimately doing xmlCleanupParser() --- xml_in being the first obvious candidate. Maybe that is the mechanism through which libxml ends up with dangling pointers. regards, tom lane
Tom Lane escribió: > I wonder whether the real issue here isn't that we have some functions > that invoke libxml without ultimately doing xmlCleanupParser() --- > xml_in being the first obvious candidate. Maybe that is the mechanism > through which libxml ends up with dangling pointers. Hmm. I see that xml_in uses xml_parse, which does call xmlCleanupParser. However, I see that xml_in calls xmlFreeDoc _after_ xmlCleanupParser. Maybe that's a bug in itself. However, I doubt that explains the issue at hand, because as far as the segmentation fault, nobody has called xmlCleanupParser yet. Perhaps look for another candidate. Another thing I'm wondering is why xml_parse is wasting time returning the parsed document, if all callers just call xmlFreeDoc() on it immediately. Hmm ... parse_xml_decl is seen calling xml_init and libxml functions, but not xmlCleanupParser. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera escribió: > Tom Lane escribió: > > > We might be able to compromise by only resetting the context after > > an error, but this is still only possible if we have a way to make > > libxml let go of *all* pointers to alloc'd objects. I don't understand > > your comment that xmlCleanupParser solves it --- we call that already, > > and it doesn't seem to be preventing the problem. > > With the attached patch, it doesn't crash, but I see the added WARNING > four times in the log, which is proof that the cleanup thing is not > called as the code seems to think. Update: it does crash for another of the test cases by Sokolov, if executed more than once. Reason: it calls xml_init after xmlCleanupParser has been called, so the memory context is created again, only to be reset later. The culprit seems to be xml_out_internal. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Perhaps a better idea is to create a separate LibxmlContext memcxt, > > child of QueryContext, and have xml_palloc etc always use that. That > > way it won't be reset between calls. It probably also means we could > > wire xml reset to transaction abort, making it a bit simpler. > > Might as well go back to letting it use malloc :-(. > > I actually don't see a problem with letting it use malloc for stuff that > it is going to manage for itself. I guess the problem comes with > transient return values of libxml functions; we'd want to explicitly > move those into palloc-based storage and then free() them. > > This looks like a rather large rewrite though. Peter, have you any > better ideas? OK, after a lot of research I think the best way to deal with this is what I suggest above. With the attached patch, I cannot cause the system to crash with any of the given examples. Furthermore this may help clean up the PG_TRY blocks that are currently sprinkled through the xml.c code. Handling of subtransactions is no good at this point, but I think that could easily be improved. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > + void > + AtEOXact_xml(void) > + { > + if (LibxmlContext == NULL) > + return; > + > + MemoryContextDelete(LibxmlContext); > + LibxmlContext = NULL; > + > + xmlCleanupParser(); > + } [ blink... ] Shouldn't that be the other way around? It looks to me like xmlCleanupParser will be pfree'ing already-deleted data. Did you test this with CLOBBER_FREED_MEMORY active? Also, I don't see how this patch fixes what I believe to be the fundamental problem, which is that we have code paths that invoke libxml without having previously initialized the alloc support. I think the sequence if (LibxmlContext == NULL) { create LibxmlContext; xmlMemSetup(...); } ought to be executed before *any* use of libxml stuff (which suggests factoring it out as a subroutine...) We also need to do some testing to see if letting the thing go until transaction end is OK, or whether we will see unacceptable memory leaks over long series of XML calls. (In particular I suspect that we'd better zap the context at subtransaction abort; but even without any error, are we sure that normal operations don't leak memory?) One thing I was wondering about earlier today is whether libxml isn't expecting NULL-return-on-failure from the malloc-substitute routine. If we take control away from it unexpectedly, I wouldn't be a bit surprised if its data structures are left corrupt. This might lead to failures during cleanup. I do like the idea of getting rid of the PG_TRY blocks and the associated cleanup attempts; with the approach you're converging on here, we need only assume that xmlCleanupParser() will get rid of all staticly-allocated pointers and not crash, whereas right now we are assuming a great deal of libxml stuff still works after an ereport (which makes throwing ereport from xml_palloc even more risky). regards, tom lane
Tom Lane escribió: > One thing I was wondering about earlier today is whether libxml isn't > expecting NULL-return-on-failure from the malloc-substitute routine. > If we take control away from it unexpectedly, I wouldn't be a bit > surprised if its data structures are left corrupt. This might lead to > failures during cleanup. Hmm, this is a very good point. I quick look at the source shows that they are not very consistent on its own checking for memory allocation errors. For example, see a bug I just reported: http://bugzilla.gnome.org/show_bug.cgi?id=508662 The problem is that many routines look like this: xmlXPathNewNodeSet(xmlNodePtr val) { xmlXPathObjectPtr ret; ret = (xmlXPathObjectPtr) xmlMalloc(sizeof(xmlXPathObject)); if (ret == NULL) { xmlXPathErrMemory(NULL, "creating nodeset\n"); return(NULL); } and others would call this code and then happily use the return value without checking for null. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane escribi�: >> One thing I was wondering about earlier today is whether libxml isn't >> expecting NULL-return-on-failure from the malloc-substitute routine. >> If we take control away from it unexpectedly, I wouldn't be a bit >> surprised if its data structures are left corrupt. This might lead to >> failures during cleanup. > Hmm, this is a very good point. I quick look at the source shows that > they are not very consistent on its own checking for memory allocation > errors. For example, see a bug I just reported: > http://bugzilla.gnome.org/show_bug.cgi?id=508662 Ugh. So we're pretty much damned if we do and damned if we don't. Given what you showed, it is certain that we are at risk if we return NULL, whereas it is merely hypothetical that we are at risk if we longjmp. So let's stick to the palloc infrastructure for now. regards, tom lane
"Sokolov Yura" <funny.falcon@gmail.com> writes: > xpath() crashes backend when is querying particular xmlagg results. Please see whether this patch fixes it: http://archives.postgresql.org/pgsql-committers/2008-01/msg00190.php The specific examples you give seem to be fixed, but that's not very much of a test case for such a low-level issue. regards, tom lane
Tom Lane wrote: > "Sokolov Yura" <funny.falcon@gmail.com> writes: > >> xpath() crashes backend when is querying particular xmlagg results. >> > > Please see whether this patch fixes it: > > http://archives.postgresql.org/pgsql-committers/2008-01/msg00190.php > > The specific examples you give seem to be fixed, but that's not very > much of a test case for such a low-level issue. > > regards, tom lane > > I have no other tests. I was just playing with the cool new feature.