Обсуждение: seg regression failures
contrib/seg fails regression tests on my msvc build (reason Ididn't notice earlier is probably that you couldn't run contribcheck on the vcbuild). regression.diff attached. Anybody see a smoking gun? I don't really know what seg is supposed to be doing, so I don't know where to start. (had to send this to -patches, because -hackers denied the attachment..) //Magnus
Вложения
> Anybody see a smoking gun? I don't really know what seg is supposed to be > doing, so I don't know where to start. As crazy idea only: try to modify seg_overlap to bool seg_overlap(SEG * a, SEG * b) { return ( ((a->upper >= b->upper) && (a->lower <= b->upper)) || ((b->upper >= a->upper) && (b->lower <= a->upper)) ) ? true : false; } -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Fri, Mar 23, 2007 at 01:15:10PM +0300, Teodor Sigaev wrote: > >Anybody see a smoking gun? I don't really know what seg is supposed to be > >doing, so I don't know where to start. > > As crazy idea only: try to modify seg_overlap to > bool > seg_overlap(SEG * a, SEG * b) > { > return ( > ((a->upper >= b->upper) && (a->lower <= b->upper)) > || > ((b->upper >= a->upper) && (b->lower <= a->upper)) > ) ? true : false; > } Nope, no difference. //Magnus
Magnus Hagander wrote: > contrib/seg fails regression tests on my msvc build (reason Ididn't notice > earlier is probably that you couldn't run contribcheck on the vcbuild). > regression.diff attached. > > Anybody see a smoking gun? I don't really know what seg is supposed to be > doing, so I don't know where to start. > > (had to send this to -patches, because -hackers denied the attachment..) > > [snip] I think I would have put the diff up somewhere on a web site, although i guess this way we have a record of it. Anyway, I think you probably need to load up the old debugger and put a break point in the overlap function ... surely the error can't be in float4in or else we'd have seen other regression problems. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Anyway, I think you probably need to load up the old debugger and put a > break point in the overlap function ... surely the error can't be in > float4in or else we'd have seen other regression problems. One of the failing test cases is for seg_over_left, which is so trivial that it's pretty hard to believe even a Microsoft compiler could get it wrong ;-). My bet is something wrong in segparse.y or possibly segscan.l, leading to garbage seg values being produced. We've seen segparse.y trigger compiler bugs before --- look at its CVS history. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Anyway, I think you probably need to load up the old debugger and put a >> break point in the overlap function ... surely the error can't be in >> float4in or else we'd have seen other regression problems. > > One of the failing test cases is for seg_over_left, which is so trivial > that it's pretty hard to believe even a Microsoft compiler could get it > wrong ;-). My bet is something wrong in segparse.y or possibly > segscan.l, leading to garbage seg values being produced. We've seen > segparse.y trigger compiler bugs before --- look at its CVS history. Loading up the debugger (I'm so happy not to have gdb-win32 now :-P), I get some interesting stuff. My test case is "SELECT '1'::seg && '2'::seg" which should return false, but returns true. I also tried seg_overlap('1'::seg,'2'::seg) to make the path easier, but same problem - still returns 't' when it should return'f'. The SEG parameters going into seg_overlap() look perfectly correct, and seg_overlap() actually returns 0. But this is somehow later turned into 't'. Any pointers for where to look for how that happens? For example, calling seg_same('1'::seg,'2'::seg) works fine, so it's not something in general with bool functions, or with the input functions. In fact, those two functions looks very very similar to me. //Magnus
Magnus Hagander <magnus@hagander.net> writes: > The SEG parameters going into seg_overlap() look perfectly correct, and > seg_overlap() actually returns 0. But this is somehow later turned into > 't'. Any pointers for where to look for how that happens? I'll betcha that MSVC is generating code that only sets the low-order byte of the return register (EAX likely) where GCC tends to set the whole register. So when the returned value is taken as a Datum, it might contain some garbage. Seems like we need to either reconsider the definition of DatumGetBool, or decree that old-style functions returning bool are broken. I'm a bit surprised this hasn't come up before, actually, since it seems like it could happen on a lot of architectures. Fixing DatumGetBool is probably the right thing to do. -#define DatumGetBool(X) ((bool) (((Datum) (X)) != 0)) +#define DatumGetBool(X) ((bool) (((bool) (X)) != 0)) regards, tom lane
Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> The SEG parameters going into seg_overlap() look perfectly correct, and >> seg_overlap() actually returns 0. But this is somehow later turned into >> 't'. Any pointers for where to look for how that happens? > > I'll betcha that MSVC is generating code that only sets the low-order > byte of the return register (EAX likely) where GCC tends to set the > whole register. So when the returned value is taken as a Datum, it > might contain some garbage. That looks very consistent with the data in my debugger (returned value is not zero for the whole datum, but casted to bool it's fine). I'm just going to clean up a compile error I introduced and try your suggestion below, and I'll let you know how it works out. //Magnus
Magnus Hagander wrote: > Tom Lane wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>> The SEG parameters going into seg_overlap() look perfectly correct, and >>> seg_overlap() actually returns 0. But this is somehow later turned into >>> 't'. Any pointers for where to look for how that happens? >> I'll betcha that MSVC is generating code that only sets the low-order >> byte of the return register (EAX likely) where GCC tends to set the >> whole register. So when the returned value is taken as a Datum, it >> might contain some garbage. > > That looks very consistent with the data in my debugger (returned value > is not zero for the whole datum, but casted to bool it's fine). > > I'm just going to clean up a compile error I introduced and try your > suggestion below, and I'll let you know how it works out. Yup, that was indeed the problem. I assume you'll do the honors. //Magnus
Magnus Hagander <magnus@hagander.net> writes: > Yup, that was indeed the problem. I assume you'll do the honors. Yeah, the comments need fixed too. I'll do it later today. regards, tom lane
Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Yup, that was indeed the problem. I assume you'll do the honors. > > Yeah, the comments need fixed too. I'll do it later today. Thanks. //Magnus