Обсуждение: Oddity with NOT IN

Поиск
Список
Период
Сортировка

Oddity with NOT IN

От
Jim Nasby
Дата:
I've got a customer that discovered something odd...

SELECT f1 FROM v1 WHERE f2 not in (SELECT bad FROM v2 WHERE f3 = 1);

does not error, even though bad doesn't exist, but

SELECT bad FROM v2 WHERE f3 = 1;
gives

ERROR:  column "bad" does not exist

Is that expected?

This is on 9.4.8, and both v1 and v2 are views. The only "odd" thing 
that I see is that v1 is a UNION ALL and v2 is a UNION. I don't think 
there's any tables in common between the two views.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Oddity with NOT IN

От
Marko Tiikkaja
Дата:
On 2016-08-04 11:23 PM, Jim Nasby wrote:
> I've got a customer that discovered something odd...
>
> SELECT f1 FROM v1 WHERE f2 not in (SELECT bad FROM v2 WHERE f3 = 1);
>
> does not error, even though bad doesn't exist, but

I'm guessing there's a v1.bad?

This is a common mistake, and also why I recommend always table 
qualifying column references when there's more than one table in scope.


.m



Re: Oddity with NOT IN

От
Jim Nasby
Дата:
On 8/4/16 4:53 PM, Marko Tiikkaja wrote:
> On 2016-08-04 11:23 PM, Jim Nasby wrote:
>> I've got a customer that discovered something odd...
>>
>> SELECT f1 FROM v1 WHERE f2 not in (SELECT bad FROM v2 WHERE f3 = 1);
>>
>> does not error, even though bad doesn't exist, but
>
> I'm guessing there's a v1.bad?
>
> This is a common mistake, and also why I recommend always table
> qualifying column references when there's more than one table in scope.

Well now I feel dumb...

It would be very useful if we had some way to warn users about stuff 
like this. Emitting a NOTICE comes to mind.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Oddity with NOT IN

От
Pavel Stehule
Дата:


2016-08-06 18:53 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 8/4/16 4:53 PM, Marko Tiikkaja wrote:
On 2016-08-04 11:23 PM, Jim Nasby wrote:
I've got a customer that discovered something odd...

SELECT f1 FROM v1 WHERE f2 not in (SELECT bad FROM v2 WHERE f3 = 1);

does not error, even though bad doesn't exist, but

I'm guessing there's a v1.bad?

This is a common mistake, and also why I recommend always table
qualifying column references when there's more than one table in scope.

Well now I feel dumb...

It would be very useful if we had some way to warn users about stuff like this. Emitting a NOTICE comes to mind.

This can be valid query

Regards

Pavel
 
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Oddity with NOT IN

От
Andrew Gierth
Дата:
>>>>> "Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:
>> Well now I feel dumb...>> >> It would be very useful if we had some way to warn users about stuff>> like this.
Emittinga NOTICE comes to mind.
 
Pavel> This can be valid query

It can be, but it essentially never is. The cases where you genuinely
want a correlated IN query are rare, but even then there would be
something in the targetlist that referenced the inner query.

The easy to catch case, I think, is when the targetlist of the IN or NOT
IN subquery contains vars of the outer query level but no vars of the
inner one and no volatile functions. This can be checked for with a
handful of lines in the parser or a couple of dozen lines in a plugin
module (though one would have to invent an error code, none of the
existing WARNING sqlstates would do).

Maybe David Fetter's suggested module for catching missing WHERE clauses
could be expanded into a more general SQL-'Lint' module?

-- 
Andrew (irc:RhodiumToad)



Re: Oddity with NOT IN

От
Jim Nasby
Дата:
On 8/6/16 12:03 PM, Pavel Stehule wrote:
>     It would be very useful if we had some way to warn users about stuff
>     like this. Emitting a NOTICE comes to mind.
>
>
> This can be valid query

Right, but in my experience it's an extremely uncommon pattern and much 
more likely to be a mistake (that ends up being very time consuming to 
debug). That's why I think something like a NOTICE or even a WARNING 
would be useful. The only thing I don't like about that idea is if you 
ever did actually want this behavior you'd have to do something to 
squash the ereport. Though, that's a problem we already have in some 
places, so perhaps not worth worrying about.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Oddity with NOT IN

От
Pavel Stehule
Дата:


2016-08-06 20:01 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 8/6/16 12:03 PM, Pavel Stehule wrote:
    It would be very useful if we had some way to warn users about stuff
    like this. Emitting a NOTICE comes to mind.


This can be valid query

Right, but in my experience it's an extremely uncommon pattern and much more likely to be a mistake (that ends up being very time consuming to debug). That's why I think something like a NOTICE or even a WARNING would be useful. The only thing I don't like about that idea is if you ever did actually want this behavior you'd have to do something to squash the ereport. Though, that's a problem we already have in some places, so perhaps not worth worrying about.

I worked for company where they generated sets of SQL queries as result of transformation from multidimensional query language. Some similar queries are possible there.

I don't thing so using NOTICE or WARNING for any valid query is good idea.

I like the idea of some special extension than can block or raises warning for some strange plans like this or with Cartesian product ...

Regards

Pavel
 

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

Re: Oddity with NOT IN

От
Andrew Gierth
Дата:
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> The easy to catch case, I think, is when the targetlist of theAndrew> IN or NOT IN subquery contains vars of
theouter query levelAndrew> but no vars of the inner one and no volatile functions. ThisAndrew> can be checked for with
ahandful of lines in the parser or aAndrew> couple of dozen lines in a plugin module (though one would haveAndrew> to
inventan error code, none of the existing WARNING sqlstatesAndrew> would do).
 

Actually thinking about this, if you did it in a module, you'd probably
want to make it an ERROR not a WARNING, because you'd want to actually
stop queries like

delete from t1 where x in (select x from table_with_no_column_x);

rather than let them run.

-- 
Andrew (irc:RhodiumToad)



Re: Oddity with NOT IN

От
Jim Nasby
Дата:
On 8/6/16 12:57 PM, Andrew Gierth wrote:
> The easy to catch case, I think, is when the targetlist of the IN or NOT
> IN subquery contains vars of the outer query level but no vars of the
> inner one and no volatile functions. This can be checked for with a
> handful of lines in the parser or a couple of dozen lines in a plugin
> module (though one would have to invent an error code, none of the
> existing WARNING sqlstates would do).

I would still like to warn on any outer vars show up in the target list 
(other than as function params), because it's still very likely to be a 
bug. But I agree that what you describe is even more certain to be one.

> Maybe David Fetter's suggested module for catching missing WHERE clauses
> could be expanded into a more general SQL-'Lint' module?

Possibly, though I hadn't really considered treating this condition as 
an error.

Also, there's some other common gotchas that we could better warn users 
about, some of which involve DDL. One example is accidentally defining 
duplicate indexes.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461



Re: Oddity with NOT IN

От
Corey Huinker
Дата:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Sat, Aug 6, 2016 at 2:13 PM, Jim Nasby <span
dir="ltr"><<ahref="mailto:Jim.Nasby@bluetreble.com" target="_blank">Jim.Nasby@bluetreble.com</a>></span>
wrote:<br/><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"><spanclass="">On 8/6/16 12:57 PM, Andrew Gierth wrote:<br /><blockquote
class="gmail_quote"style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> The easy
tocatch case, I think, is when the targetlist of the IN or NOT<br /> IN subquery contains vars of the outer query level
butno vars of the<br /> inner one and no volatile functions. This can be checked for with a<br /> handful of lines in
theparser or a couple of dozen lines in a plugin<br /> module (though one would have to invent an error code, none of
the<br/> existing WARNING sqlstates would do).<br /></blockquote><br /></span> I would still like to warn on any outer
varsshow up in the target list (other than as function params), because it's still very likely to be a bug. But I agree
thatwhat you describe is even more certain to be one.<span class=""><br /><br /><blockquote class="gmail_quote"
style="margin:0px0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Maybe David Fetter's suggested
modulefor catching missing WHERE clauses<br /> could be expanded into a more general SQL-'Lint' module?<br
/></blockquote><br/></span> Possibly, though I hadn't really considered treating this condition as an error.<br /><br
/>Also, there's some other common gotchas that we could better warn users about, some of which involve DDL. One example
isaccidentally defining duplicate indexes.<span class="im"><br /> -- <br /> Jim Nasby, Data Architect, Blue Treble
Consulting,Austin TX<br /> Experts in Analytics, Data Architecture and PostgreSQL<br /> Data in Trouble? Get it in
Treble!<a href="http://BlueTreble.com" rel="noreferrer" target="_blank">http://BlueTreble.com</a><br /> 855-TREBLE2 <a
href="tel:%28855-873-2532"target="_blank" value="+18558732532">(855-873-2532</a>)   mobile: <a href="tel:512-569-9461"
target="_blank"value="+15125699461">512-569-9461</a><br /><br /><br /></span><div class=""><div class="h5"> -- <br />
Sentvia pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org"
target="_blank">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your subscription:<br /><a
href="http://www.postgresql.org/mailpref/pgsql-hackers"rel="noreferrer"
target="_blank">http://www.postgresql.org/mail<wbr/>pref/pgsql-hackers</a><br /></div></div></blockquote></div><br
/></div><divclass="gmail_extra">If we are contemplating a setting wherein we issue debug/notice/warning messages for
potentiallyerroneous SQL, I would suggest a simple test would be any reference to a column without the a corresponding
table/alias.<br /><br />This is fine:<br />    SELECT a.x, b.y FROM table_that_has_x a JOIN table_that_has_y b ON <a
href="http://a.id">a.id</a>= b.foreign_id<br />This gives the notice/warning:<br />    SELECT x, b.y FROM
table_that_has_xa JOIN table_that_has_y b ON <a href="http://a.id">a.id</a> = b.foreign_id<br /><br />We'd have to
suppressthe warning in cases where no tables are mentioned (no table to alias, i.e. "SELECT 'a_constant' as
config_var"),and I could see a reason for suppressing it where only one table is mentioned, though I often urge table
aliasingand full references because it makes it easier when you modify the query to add another table.<br /><br />Some
settingname suggestions:</div><div class="gmail_extra"><br /></div><blockquote style="margin:0 0 0
40px;border:none;padding:0px"><divclass="gmail_extra">notify_vague_column_reference = (on,off)</div><div
class="gmail_extra">pedantic_column_identifiers= (off,debug,notice,warn,error)</div></blockquote><div
class="gmail_extra"><br/></div></div>