Re: Hash join and picking which result set to build the hash table with.

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Hash join and picking which result set to build the hash table with.
Дата
Msg-id 14387.1716411233@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Hash join and picking which result set to build the hash table with.  ("Dirschel, Steve" <steve.dirschel@thomsonreuters.com>)
Список pgsql-general
"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
> The query and execution plan are shown below.  My question is
> related to the result set the optimizer is choosing to build the
> hash table from.  My understanding is for a hash join you want to
> build the hash table out of the smaller result set.

That's *a* consideration, but not the only one.  We also consider
whether the hash key has a flat distribution; if it is too skewed,
we might find specific hash chains getting too long.

> When running some tests I forgot to create the PK on table
> docloc_test.  When the PK was not on the table the optimizer decided
> to create the hash table off the 1000 rows from collection.  But as
> soon as I put the PK on that table it then decides to use
> docloc_test to build the hash table.

I think that the presence of a unique index overrides the statistics
(or the lack of any) so that the planner knows that the column is
unique and thus safe to use as a hash key.  Now, it should have
known that anyway, unless maybe this is a freshly-built table that
auto-analyze hasn't gotten to yet?

            regards, tom lane



В списке pgsql-general по дате отправления:

Предыдущее
От: "Dirschel, Steve"
Дата:
Сообщение: Hash join and picking which result set to build the hash table with.
Следующее
От: sud
Дата:
Сообщение: Long running query causing XID limit breach