Обсуждение: "AND", "OR" and Materialize :((((

Поиск
Список
Период
Сортировка

"AND", "OR" and Materialize :((((

От
Meszaros Attila
Дата:
Hi all,

We've got the following 3 tables and 2 simple queries. The only difference
lies in the join condition: the first uses OR, the second uses AND.

I expected some difference in the performace according to the
difference in the evaluation of the logical form, but not
3 magnitudes !!!

So the question is:

WHY SHALL IT MATERIALIZE A CONSTANT RESULT IN A LOOP OF 120000 TIMES ??
[it would be enough to materialize only once, or even never, because
the size of the materialized table is not larger than 1Mb... ]

ps.: there are indeces on all referenced fields.


atti=# explain select count(*) from _108 left join (_111 cross join _110) on (_108.objectid=_111._108objectid OR
_108.objectid=_110._108objectid);
 
NOTICE:  QUERY PLAN:

Aggregate  (cost=5017202.06..5017202.06 rows=1 width=24) ->  Nested Loop  (cost=0.00..5016900.05 rows=120806 width=24)
    ->  Seq Scan on _108  (cost=0.00..44.70 rows=1670 width=8)       ->  Materialize  (cost=2097.79..2097.79 rows=60421
width=16)            ->  Nested Loop  (cost=0.00..2097.79 rows=60421 width=16)                   ->  Seq Scan on _110
(cost=0.00..1.37rows=37 width=8)                   ->  Seq Scan on _111  (cost=0.00..40.33 rows=1633 width=8)
 

EXPLAIN
atti=# explain select count(*) from _108 left join (_111 cross join _110) on _108.objectid=_111._108objectid AND
_108.objectid=_110._108objectid;
 
NOTICE:  QUERY PLAN:

Aggregate  (cost=7965.68..7965.68 rows=1 width=24) ->  Merge Join  (cost=7030.14..7961.51 rows=1670 width=24)       ->
Sort (cost=134.09..134.09 rows=1670 width=8)             ->  Seq Scan on _108  (cost=0.00..44.70 rows=1670 width=8)
 ->  Sort  (cost=6896.05..6896.05 rows=60421 width=16)             ->  Nested Loop  (cost=0.00..2097.79 rows=60421
width=16)                  ->  Seq Scan on _110  (cost=0.00..1.37 rows=37 width=8)                   ->  Seq Scan on
_111 (cost=0.00..40.33 rows=1633 width=8)
 

Attila




Re: "AND", "OR" and Materialize :((((

От
Tom Lane
Дата:
Meszaros Attila <tilla@draconis.elte.hu> writes:
> WHY SHALL IT MATERIALIZE A CONSTANT RESULT IN A LOOP OF 120000 TIMES ??
> [it would be enough to materialize only once,

Which in fact is exactly what the materialize node is for.  The reported
costs are pretty bogus, but AFAICT the plan is the right thing.
        regards, tom lane


Re: "AND", "OR" and Materialize :((((

От
Meszaros Attila
Дата:
Hi,

> > WHY SHALL IT MATERIALIZE A CONSTANT RESULT IN A LOOP OF 120000 TIMES ??
> > [it would be enough to materialize only once,
> 
> Which in fact is exactly what the materialize node is for.  The reported
> costs are pretty bogus, but AFAICT the plan is the right thing.
Thanx for the answer.I've thought the same (eg. materialize should reduce the amount    of work to be done, but I
haven'tfelt this in the result)
 
Unfortunatelly the cost prediction in line 2 may be close tothe real cost. According to some measures:    time for the
querywith 'AND':      2 sec    time for the query with 'OR':    421 sec
 
So the question is what to do? Can I speed up the second one?
[vacuum analyze and indices are done, postgres version is 7.1.2]

1:Aggregate  (cost=5017202.06..5017202.06 rows=1 width=24)
2:  ->  Nested Loop  (cost=0.00..5016900.05 rows=120806 width=24)
3:        ->  Seq Scan on _108  (cost=0.00..44.70 rows=1670 width=8)
4:        ->  Materialize  (cost=2097.79..2097.79 rows=60421 width=16)
5:              ->  Nested Loop  (cost=0.00..2097.79 rows=60421 width=16)
6:                    ->  Seq Scan on _110  (cost=0.00..1.37 rows=37 width=8)
7:                    ->  Seq Scan on _111  (cost=0.00..40.33 rows=1633 width=8)

Attila