Re: [HACKERS] nested loops in joins, ambiguous rewrite rules

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] nested loops in joins, ambiguous rewrite rules
Дата
Msg-id 7950.917809074@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] nested loops in joins, ambiguous rewrite rules  (Bruce Momjian <maillist@candle.pha.pa.us>)
Ответы Re: [HACKERS] nested loops in joins, ambiguous rewrite rules  (Bruce Momjian <maillist@candle.pha.pa.us>)
Список pgsql-hackers
>> Well isn't that something!  Thanks so much for your help!
>> I set the GEQO variable to 4 and now the 11.5 minute query
>> executes in 6 seconds with this query plan:

Maybe this was already obvious to everyone else, but I just now got it
through my head that Charlie was not complaining about the time to
*execute* his seven-way join, but about the time to *plan* it.

(In other words, EXPLAIN was taking about as long as actually doing
the SELECT, right?)

The problem here is explained in the Postgres "Developer's Guide",
in the chapter titled "Genetic Query Optimization in Database Systems".
The regular planner/optimizer does a nearly exhaustive search through
all possible ways to execute the query --- and the number of possible
ways grows exponentially in the number of joins.

So you need a heuristic shortcut when there are lots of tables.  It may
not find the best possible execution plan, but it should find a pretty
good one in not too much time.  That's what the GEQO code does.

>> Are there any recommendations about what value *ought* to be
>> set for GEQO?  It seems to me like 'ON=8' is pretty high

Bruce Momjian <maillist@candle.pha.pa.us> writes:
> We have toyed with lowering it to 6.  I think someone said that was too
> low, but obviously, some queries need a value of 6 or lower.  I don't
> understand when that is the case, though.  Any comments from anyone?

The docs say that the number being used is just the number of base
tables mentioned in the query.  I wonder whether that is the right
number to look at.  In particular, the number of cases that the
exhaustive searcher would have to look at is not only dependent on the
number of underlying tables, but also on how many indexes there are to
consider using.

Perhaps if we develop a slightly more complex measure to compare against
the GEQO threshold, we can do better in deciding when to kick in GEQO.
A brain-dead suggestion is number of tables + number of indexes ...
does anyone know whether that is a good measure of the number of cases
considered in the optimizer?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: [HACKERS] Patches
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [HACKERS] nested loops in joins, ambiguous rewrite rules