Re: [HACKERS] please help on query

Поиск
Список
Период
Сортировка
От Manfred Koizar
Тема Re: [HACKERS] please help on query
Дата
Msg-id gu9riu8qp28ifp40qsva7b9puif47ihi5b@4ax.com
обсуждение исходный текст
Список pgsql-sql
[moving to pgsql-sql]
On Thu, 11 Jul 2002 17:22:14 +0200, "Luis Alberto Amigo Navarro"
<lamigo@atc.unican.es> wrote:
>I can't improve performance on this query:
>
>SELECT
> supplier.name,
> supplier.address
>FROM
> supplier,
> nation
>WHERE
> supplier.suppkey IN(
>  SELECT
>   partsupp.suppkey
>  FROM
>   partsupp
>  WHERE
>   partsupp.partkey IN(
>    SELECT
>     part.partkey
>    FROM
>     part
>    WHERE
>     part.name like 'forest%'
>     )
>   AND partsupp.availqty>(
>    SELECT
>     0.5*(sum(lineitem.quantity)::FLOAT)
>    FROM
>     lineitem
>    WHERE
>     lineitem.partkey=partsupp.partkey
>     AND lineitem.suppkey=partsupp.partkey                                   ^^^^^^^
suppkey???
 
>     AND lineitem.shipdate>=('1994-01-01')::DATE
>     AND lineitem.shipdate<(('1994-01-01')::DATE+('1 year')::INTERVAL)::DATE
>     )
>  )
> AND supplier.nationkey=nation.nationkey
> AND nation.name='CANADA'
>ORDER BY
> supplier.name;

Luis,
rules of thumb: "Avoid subselects; use joins!" and "If you have to use
subselects, avoid IN, use EXISTS!"

Let's try.  If partkey is unique in part, then
|   FROM partsupp
|   WHERE partsupp.partkey IN (SELECT part.partkey

can be replaced by   FROM partsupp ps, part p   WHERE ps.partkey = p.partkey

or   partsupp ps INNER JOIN part p     ON (ps.partkey = p.partkey AND p.name LIKE '...')

When we ignore "part" for now, your subselect boils down to

|  SELECT partsupp.suppkey
|  FROM partsupp
|  WHERE partsupp.availqty > (
|    SELECT 0.5*(sum(lineitem.quantity)::FLOAT)
|    FROM lineitem
|    WHERE lineitem.partkey=partsupp.partkey
|      AND lineitem.suppkey=partsupp.suppkey
|      AND lineitem.shipdate BETWEEN ... AND ...
|     )

which can be rewritten to (untested)
 SELECT ps.suppkey FROM partsupp ps, lineitem li WHERE li.partkey=ps.partkey   AND li.suppkey=ps.suppkey   AND
lineitem.shipdateBETWEEN ... AND ... GROUP BY ps.partkey, ps.suppkey HAVING min(ps.availqty) >
0.5*(sum(lineitem.quantity)::FLOAT)       ^^^        As all ps.availqty are equal in one group, you can as well
 
use max() or avg().

Now we have left only one IN:
| WHERE supplier.suppkey IN (
|  SELECT partsupp.suppkey FROM partsupp WHERE <condition> )

Being to lazy to find out, if this can be rewritten to a join, let`s
apply rule 2 here:
 WHERE EXISTS (   SELECT ... FROM partsupp ps   WHERE supplier.suppkey = ps.suppkey     AND <condition> )

HTH, but use with a grain of salt ...

>Sort  (cost=2777810917708.17..2777810917708.17 rows=200 width=81)            ^^^^^^^^^^^^^^^^
BTW, how many years are these? :-)

ServusManfred


В списке pgsql-sql по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: how to inherits the references...
Следующее
От: Jan Wieck
Дата:
Сообщение: Re: list of tables ? -update to question ...