Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?
Дата
Msg-id 8747.1302716131@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?  (Scott Carey <scott@richrelevance.com>)
Ответы Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?  (Scott Carey <scott@richrelevance.com>)
Список pgsql-performance
Scott Carey <scott@richrelevance.com> writes:
>> On Oct 27, 2010, at 12:56 PM, Tom Lane wrote:
>> Because a poorly distributed inner relation leads to long hash chains.
>> In the very worst case, all the keys are on the same hash chain and it
>> degenerates to a nested-loop join.

> A pathological skew case (all relations with the same key), should be
> _cheaper_ to probe.

I think you're missing the point, which is that all the hash work is
just pure overhead in such a case (and it is most definitely not
zero-cost overhead).  You might as well just do a nestloop join.
Hashing is only beneficial to the extent that it allows a smaller subset
of the inner relation to be compared to each outer-relation tuple.
So I think biasing against skew-distributed inner relations is entirely
appropriate.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Scott Carey
Дата:
Сообщение: Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?
Следующее
От: Scott Carey
Дата:
Сообщение: Re: HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?