Обсуждение: aliases break my query

Поиск
Список
Период
Сортировка

aliases break my query

От
Joseph Shraibman
Дата:
These two queries are exactly alike. The first one uses aliases except
for the order by. The second uses aliases also for the order by. The
third uses whole names.  The third has the behavior I want.

Someone please tell me what I am doing wrong. I don't want to have to
use whole names for my query.

The data for the tables are at the end.


playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by tablea.a;
a|b|c|?column?
-+-+-+--------
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       1
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       1
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       1
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       1
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       1
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       1
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       1
3|4|5|       0
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       1
4|5|4|       0
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       1
1|2| |       0
2|3|4|       0
3|4|5|       0
4|5|4|       0
(80 rows)

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
ta.a) from tablea ta, tableb tb order by ta.a;
a|b|c|?column?
-+-+-+--------
1|2| |       0
1|2| |       0
1|2| |       0
1|2| |       0
1|2| |       0
2|3|4|       1
2|3|4|       1
2|3|4|       0
2|3|4|       0
2|3|4|       0
3|4|5|       0
3|4|5|       0
3|4|5|       1
3|4|5|       0
3|4|5|       0
4|5|4|       0
4|5|4|       0
4|5|4|       0
4|5|4|       1
4|5|4|       0
(20 rows)

playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
where tableb.yy = tablea.a) order by tablea.a;
a|b|c|?column?
-+-+-+--------
1|2| |       0
2|3|4|       2
3|4|5|       1
4|5|4|       1
(4 rows)

playpen=> 
playpen=> select * from tablea;
a|b|c
-+-+-
1|2| 
2|3|4
3|4|5
4|5|4
(4 rows)

playpen=> select * from tableb;
yy|zz
--+--2| 42| 53| 94|145|15
(5 rows)


Re: aliases break my query

От
Tom Lane
Дата:
Joseph Shraibman <jks@selectacast.net> writes:
> These two queries are exactly alike. The first one uses aliases except
> for the order by. The second uses aliases also for the order by. The
> third uses whole names.  The third has the behavior I want.

I think you are confusing yourself by leaving out FROM clauses.
In particular, with no FROM for the inner SELECT it's not real clear
what should happen there.  I can tell you what *is* happening, but
who's to say if it's right or wrong?

> playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> ta.a) from tablea ta, tableb tb order by tablea.a;
[ produces 80 rows ]

> playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> ta.a) from tablea ta, tableb tb order by ta.a;
[ produces 20 rows ]

The difference between these two is that by explicitly specifying
"tablea" in the order-by clause, you've created a three-way join,
as if you had written "from tablea ta, tableb tb, tablea tablea".
Once you write an alias in a from-clause entry, you must refer to
that from-clause entry by its alias, not by its true table name.

Meanwhile, what of the inner select?  It has no FROM clause *and*
no valid table names.  The only way to interpret the names in it
is as references to the outer select.  So, on any given iteration
of the outer select, the inner select collapses to constants.
It looks like "SELECT count(constant1) WHERE constant2 = constant3"
and so you get either 0 or 1 depending on whether tb.yy and ta.a
from the outer scan are different or equal.

> playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
> where tableb.yy = tablea.a) order by tablea.a;
[ produces 4 rows ]

Here the outer select is not a join at all --- it mentions only tablea,
so you are going to get one output for each tablea row.  The inner
select looks like "select count (zz) FROM tableb WHERE yy = <constant>",
so you get an actual scan of tableb for each iteration of the outer
scan.

It's not very clear from these examples what you actually wanted to have
happen, but I suggest that you will have better luck if you specify
explicit FROM lists in both the inner and outer selects, and be careful
that each variable you use clearly refers to exactly one of the
FROM-list entries.
        regards, tom lane


Re: aliases break my query

От
Peter Eisentraut
Дата:
> > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> > ta.a) from tablea ta, tableb tb order by tablea.a;
> [ produces 80 rows ]

> > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> > ta.a) from tablea ta, tableb tb order by ta.a;
> [ produces 20 rows ]

> > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
> > where tableb.yy = tablea.a) order by tablea.a;
> [ produces 4 rows ]

Once again, I think that we *really* need to discuss whether implicit
range table entries in SELECT are a good idea. We invariably get a
question like this every week and invariably the answer is "if you give a
table an alias you *must* refer to it by that alias". (I'm sure Tom has
this reply automated by now.) I claim the only thing that buys is
confusion for very little convenience at the other end.

Stop the madness! :)

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: aliases break my query

От
Tom Lane
Дата:
Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
> Once again, I think that we *really* need to discuss whether implicit
> range table entries in SELECT are a good idea. We invariably get a
> question like this every week and invariably the answer is "if you give a
> table an alias you *must* refer to it by that alias". (I'm sure Tom has
> this reply automated by now.)

No, this one was actually a pretty original way of shooting oneself in
the foot ;-).  I thought the interesting point was the confusion between
whether variables in the inner select were supposed to be local to the
inner select or references to the outer select.  I'm not sure getting
rid of implicit rangetable entries would've helped prevent that.

> I claim the only thing that buys is
> confusion for very little convenience at the other end.
>
> Stop the madness! :)

I doubt that it's worth breaking a lot of existing applications for.

At one time Bruce had made some patches to emit informative notice
messages about implicit FROM entries, but that got turned off again
for reasons that I forget...
        regards, tom lane


Re: aliases break my query

От
Joseph Shraibman
Дата:
Peter Eisentraut wrote:
> 
> > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> > > ta.a) from tablea ta, tableb tb order by tablea.a;
> > [ produces 80 rows ]
> 
> > > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> > > ta.a) from tablea ta, tableb tb order by ta.a;
> > [ produces 20 rows ]
> 
> > > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
> > > where tableb.yy = tablea.a) order by tablea.a;
> > [ produces 4 rows ]
> 
> Once again, I think that we *really* need to discuss whether implicit
> range table entries in SELECT are a good idea.

What is an "implicit range table entry"?
We invariably get a
> question like this every week and invariably the answer is "if you give a
> table an alias you *must* refer to it by that alias".

Hey, I *did* do that in the second query, and that still produced extra
results. I tried putting the aliases in the inner select too but that
didn't help. In fact the inner select always is 4 in that case. Unless I
only alias tableb in the inner query, and let it get the definition of
tablea from the outer query.

(I'm sure Tom has
> this reply automated by now.) I claim the only thing that buys is
> confusion for very little convenience at the other end.
> 
> Stop the madness! :)
> 
> --
> Peter Eisentraut                  Sernanders väg 10:115
> peter_e@gmx.net                   75262 Uppsala
> http://yi.org/peter-e/            Sweden


Re: aliases break my query

От
Joseph Shraibman
Дата:
Tom Lane wrote:
> 
> Joseph Shraibman <jks@selectacast.net> writes:
> > These two queries are exactly alike. The first one uses aliases except
> > for the order by. The second uses aliases also for the order by. The
> > third uses whole names.  The third has the behavior I want.
> 
> I think you are confusing yourself by leaving out FROM clauses.
> In particular, with no FROM for the inner SELECT it's not real clear
> what should happen there.  I can tell you what *is* happening, but
> who's to say if it's right or wrong?
> 
Well I assumed that the aliases would be inerited from the outer query.

> > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> > ta.a) from tablea ta, tableb tb order by tablea.a;
> [ produces 80 rows ]
> 
> > playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) where tb.yy =
> > ta.a) from tablea ta, tableb tb order by ta.a;
> [ produces 20 rows ]
> 
> The difference between these two is that by explicitly specifying
> "tablea" in the order-by clause, you've created a three-way join,
> as if you had written "from tablea ta, tableb tb, tablea tablea".
> Once you write an alias in a from-clause entry, you must refer to
> that from-clause entry by its alias, not by its true table name.

I guess I made the mistake of assuming that SQL is logical. I don't know
what I was thinking. ;)

> 
> Meanwhile, what of the inner select?  It has no FROM clause *and*
> no valid table names.  The only way to interpret the names in it
> is as references to the outer select.  So, on any given iteration
> of the outer select, the inner select collapses to constants.
> It looks like "SELECT count(constant1) WHERE constant2 = constant3"
> and so you get either 0 or 1 depending on whether tb.yy and ta.a
> from the outer scan are different or equal.

OK that sorta makes sense to be. What I want is the behavior I got with
the third query (below). I want the values in table a, and then a count
of how many entries in tableb have the yy field of tableb that matches
that entry in tablea's a field.

playpen=> select ta.a,ta.b,ta.c, (select count (tb.zz) from tableb tb
where tb.yy = ta.a) from tablea ta, tableb tb group by ta.a, ta.b, ta.c
order by ta.a;
a|b|c|?column?
-+-+-+--------
1|2| |       0
2|3|4|       2
3|4|5|       1
4|5|4|       1
(4 rows)

... which is what I want. Thanks.

> 
> > playpen=> select tablea.a,tablea.b,tablea.c, (select count (tableb.zz)
> > where tableb.yy = tablea.a) order by tablea.a;
> [ produces 4 rows ]
> 
> Here the outer select is not a join at all --- it mentions only tablea,
> so you are going to get one output for each tablea row.  The inner
> select looks like "select count (zz) FROM tableb WHERE yy = <constant>",
> so you get an actual scan of tableb for each iteration of the outer
> scan.
> 
> It's not very clear from these examples what you actually wanted to have
> happen, but I suggest that you will have better luck if you specify
> explicit FROM lists in both the inner and outer selects, and be careful
> that each variable you use clearly refers to exactly one of the
> FROM-list entries.
> 
>                         regards, tom lane


Re: [HACKERS] Re: aliases break my query

От
Thomas Lockhart
Дата:
> At one time Bruce had made some patches to emit informative notice
> messages about implicit FROM entries, but that got turned off again
> for reasons that I forget...

It was triggered with common cases from the "outer join" syntax. It took
a while to track down since it was introduced while I was working on the
syntax feature :(

If it *really* needs to be put back in, then we should do so with a flag
so we can disable the warning at compile time, run time, and/or in the
outer join parser area. But imho sprinkling the parser with warnings for
allowed syntax is heading the wrong direction. If it is legal, allow it.
If it is illegal, disallow it. If it is confusing for some, but works
fine for others, it shouldn't become "sort of legal" with a warning.
                  - Thomas