Costing bug in hash join logic for semi joins

Поиск
Список
Период
Сортировка
От RK
Тема Costing bug in hash join logic for semi joins
Дата
Msg-id CA+SNy03bhq0fodsfOkeWDCreNjJVjsdHwUsb7AG=jpe0PtZc_g@mail.gmail.com
обсуждение исходный текст
Ответы Re: Costing bug in hash join logic for semi joins  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
There is a costing bug in hash join logic seems to have been introduced by the patch related to inner_unique enhancements(commit: 9c7f5229ad68d7e0e4dd149e3f80257893e404d4). Specifically, "hashjointuples" which tracks the number of matches for hash clauses is computed wrong for inner unique scenario. This leads to lot of semi-joins  incorrectly turn to inner joins with unique on inner side. Function "final_cost_hashjoin" has special handling to cost semi/anti joins to account for early stop after the first match. This is enhanced by the above said commit to be used for inner_unique scenario also. However, "hashjointuples" computation is not fixed to handle this new scenario which is incorrectly stepping into the anti join logic and assuming unmatched rows. Fix is to handle inner_unique case when computing "hashjointuples".  Here is the outline of the code that shows the bug.

void
final_cost_hashjoin(PlannerInfo *root, HashPath *path,
                              JoinCostWorkspace *workspace,
                            JoinPathExtraData *extra)
{
         .....
        /* CPU costs */
        if (path->jpath.jointype == JOIN_SEMI ||
            path->jpath.jointype == JOIN_ANTI ||
            extra->inner_unique)

       {
                 ......
                                           
                 /* Get # of tuples that will pass the basic join */
               if (path->jpath.jointype == JOIN_SEMI)
                  hashjointuples = outer_matched_rows;
              else
                 hashjointuples = outer_path_rows - outer_matched_rows; 

        }
       else
       {
         .....
       }
}

Thanks, RK (Salesforce)

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Usage of epoch in txid_current
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Usage of epoch in txid_current