Re: [HACKERS] Postgres dies in the rules regression test (64-bit problem)

Поиск
Список
Период
Сортировка
От Pedro J. Lobo
Тема Re: [HACKERS] Postgres dies in the rules regression test (64-bit problem)
Дата
Msg-id Pine.OSF.4.05.9906111335120.5407-100000@haddock.euitt.upm.es
обсуждение исходный текст
Ответ на Re: [HACKERS] Postgres dies in the rules regression test (64-bit problem)  ("Pedro J. Lobo" <pjlobo@euitt.upm.es>)
Ответы Re: [HACKERS] Postgres dies in the rules regression test (64-bit problem)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Fri, 11 Jun 1999, Pedro J. Lobo wrote:

>On Thu, 10 Jun 1999, Tom Lane wrote:
>
>>"Pedro J. Lobo" <pjlobo@euitt.upm.es> writes:
>>>
>>> If someone points me to the right place to look, I can play a bit more
>>> with gdb and try to find the cause.
>>
>>The Expr node was presumably made by make_op() in
>>backend/parser/parse_oper.c, although the tree might have been copied at
>>least once using the functions in backend/nodes/copyfuncs.c.  Good luck!
>
>Ok, I'll let you know if/when I find something (or, more probably, when I
>have more questions ;-)

Well, I have found something. As you said, the node is made by make_op()
(which, btw, is in parse_node.c ;-) and later copied using copyObject().
But those functions are correct and work as expected.

I have located the point where the pointer is overwritten. Here is the
stack trace:

(gdb) where
#0  ResolveNew (info=0x140169ac0, targetlist=0x140169f70, nodePtr=0x14016aac8,    sublevels_up=0) at
rewriteManip.c:719
#1  0x12013ea38 in ResolveNew (info=0x140169ac0, targetlist=0x140169f70,    nodePtr=0x14016ab38, sublevels_up=0) at
rewriteManip.c:670
#2  0x12013ea58 in ResolveNew (info=0x140169ac0, targetlist=0x140169f70,    nodePtr=0x140169b80, sublevels_up=0) at
rewriteManip.c:730
#3  0x12013ead0 in FixNew (info=0x140169ac0, parsetree=0x1401692d0)   at rewriteManip.c:753
#4  0x12013cc5c in fireRules (parsetree=0x1401692d0, rt_index=1,    event=CMD_UPDATE, instead_flag=0x11fffa6c0 "\001",
locks=0x14016a8c0,   qual_products=0x11fffa6b8) at rewriteHandler.c:2612
 
#5  0x12013ce90 in RewriteQuery (parsetree=0x1401692d0,    instead_flag=0x11fffa6c0 "\001", qual_products=0x11fffa6b8)
at rewriteHandler.c:2697
 
#6  0x12013cf38 in deepRewriteQuery (parsetree=0x1401692d0)   at rewriteHandler.c:2742
#7  0x12013d008 in QueryRewriteOne (parsetree=0x1401692d0)   at rewriteHandler.c:2791
#8  0x12013d128 in BasicQueryRewrite (parsetree=0x1401692d0)   at rewriteHandler.c:2862
#9  0x12013d230 in QueryRewrite (parsetree=0x1401692d0)   at rewriteHandler.c:2916
#10 0x1200b3ce4 in ExplainQuery (query=0x1401692d0, verbose=0 '\000',    dest=Remote) at explain.c:68
#11 0x12015dce0 in ProcessUtility (parsetree=0x140169110, dest=Remote)   at utility.c:651
#12 0x12015a138 in pg_exec_query_dest (   query_string=0x11fffa918 "explain update rtest_v1 set a = rtest_t3.a +   20
whereb = rtest_t3.b;", dest=Remote, aclOverride=0 '\000') at   postgres.c:727
 
#13 0x120159f80 in pg_exec_query (   query_string=0x11fffa918 "explain update rtest_v1 set a = rtest_t3.a +   20 where
b= rtest_t3.b;") at postgres.c:656
 
#14 0x12015baa0 in PostgresMain (argc=10, argv=0x11fffee90, real_argc=9,    real_argv=0x11ffffc28) at postgres.c:1658
#15 0x12012d02c in DoBackend (port=0x1400d9a00) at postmaster.c:1628
#16 0x12012c778 in BackendStartup (port=0x1400d9a00) at postmaster.c:1373
#17 0x12012b5d8 in ServerLoop () at postmaster.c:823
#18 0x12012ae00 in PostmasterMain (argc=9, argv=0x11ffffc28)   at postmaster.c:616
#19 0x1200e0b30 in main (argc=9, argv=0x11ffffc28) at main.c:93
(gdb)

ResolveNew() is a recursive function in backend/rewrite/rewriteManip.c.
The problem appears one time that it is called with nodePtr pointing to a
Var node. This is the node:

(gdb) p *((Var *) *nodePtr)
$4 = {type = T_Var, varno = 4, varattno = 1, vartype = 23, vartypmod = -1,  varlevelsup = 0, varnoold = 4, varoattno =
1}

In the beginning of the function, there is a 'switch(nodeTag(node))'
('node' is assigned the value of '*nodePtr'), that jumps to the
corresponding 'case T_Var:' at line 695. There you see a call to
FindMatchingNew() (a static function of the same c file), that returns a
pointer to a node that, from what I have seen in that function, is
expected to be of type Expr, and indeed it is. This pointer is stored in
a variable named 'n'. Right after that, you can see this code:

if (n == NULL)
{   ... some code that isn't executed because n != NULL
}
else
{       *nodePtr = copyObject(n);       ((Var *) *nodePtr)->varlevelsup = this_varlevelsup;
}

Well, this *can't* be correct. 'n' points to a node of type Expr, and of
course copyObject returns a node of that type. But that node is treated in
the following line as if it were of type Var! The value of
'this_varlevelsup' is 0, and the offset of the field 'varlevelsup' of
node Var is the same as that of the field 'oper' of node Expr, and the 
pointer is overwritten.

I'd bet that the copy of the Expr node is not supposed to be stored in
*nodePtr (which points to a node of type Var), but I don't know what would
be the right action. I've found the bug, now I'll let the wise ones fix it
;-)

This problem has been hidden until now (for me, at least) because of a bug
in the 'money' data type that broke the rules test before it reached that
point. That bug sowed up only when you had locale enabled and your
currency didn't use cents (like spanish pesetas). On 32-bit systems it
is likely that the pointer isn't overwritten, so the bug didn't showed up,
either.

Cheers,
Pedro.

-- 
-------------------------------------------------------------------
Pedro José Lobo Perea                   Tel:    +34 91 336 78 19
Centro de Cálculo                       Fax:    +34 91 331 92 29
E.U.I.T. Telecomunicación               e-mail: pjlobo@euitt.upm.es
Universidad Politécnica de Madrid
Ctra. de Valencia, Km. 7                E-28031 Madrid - España / Spain



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Remigiusz Sokolowski
Дата:
Сообщение: Re: [HACKERS] postgres processes
Следующее
От: Dmitry Samersoff
Дата:
Сообщение: UnixWare