Обсуждение: what is the purpose to use 3 function to compare cost add_path/set_cheapest/get_cheapest_fractional_path

Поиск
Список
Период
Сортировка
Hello Hackers:
   I'm reading the code of optimizer and get confused about the 3 functions. add_path/set_cheapest/get_cheapest_fractional_path


add_(partial_)path:
For every relations,  optimizer will build path for it and add then call add_path to the rel->pathlist.  during this stage,  it compare the current rel->pathlist with the new one,  then it may discard it or add it into the pathlist.   finally we may have multi path now. 

set_cheapest
after we finished the add_path for each relation,  we will call set_cheapest,  the cheapest is the one with the lowest total cost plus the parameterized path.  

get_cheapest_fractional_path
after we build the paths for all the relation,  it calls get_cheapest_fractional_path to get the best path and then build the plan. 

so my question is why do we need to have the 3 cost compare function? 

and i have another real case,  and hope you can provide some hints:

select * from t1 where slow_func(t1.a,  3) == true;  ==>  parallel bitmap index scan. 

select * from t1, t2 where t2.pk = 1 and t1.b = t2.b and slow_function(t1.a,  t2.a) == true; ==> I get nest loop without parallel scan.  what I hope is  after I get the t2.a (only 1 value),  I want to go with parallel index scan as well. 

In this case,  which part should I focus on to figure out the issue? 


Thanks
On Fri, Dec 6, 2019 at 11:26 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Hello Hackers:
>    I'm reading the code of optimizer and get confused about the 3 functions.
add_path/set_cheapest/get_cheapest_fractional_path
>
>
> add_(partial_)path:
> For every relations,  optimizer will build path for it and add then call add_path to the rel->pathlist.  during this
stage, it compare the current rel->pathlist with the new one,  then it may discard it or add it into the pathlist.
finallywe may have multi path now. 
>
> set_cheapest
> after we finished the add_path for each relation,  we will call set_cheapest,  the cheapest is the one with the
lowesttotal cost plus the parameterized path. 
>
> get_cheapest_fractional_path
> after we build the paths for all the relation,  it calls get_cheapest_fractional_path to get the best path and then
buildthe plan. 
>
> so my question is why do we need to have the 3 cost compare function?
>
> and i have another real case,  and hope you can provide some hints:
>
> select * from t1 where slow_func(t1.a,  3) == true;  ==>  parallel bitmap index scan.
>
> select * from t1, t2 where t2.pk = 1 and t1.b = t2.b and slow_function(t1.a,  t2.a) == true; ==> I get nest loop
withoutparallel scan.  what I hope is  after I get the t2.a (only 1 value),  I want to go with parallel index scan as
well.
>

It might be that the scan size of index is too small that we didn't
consider to pick parallel index scan.  You might want to tweak
min_parallel_index_scan_size and see how it behaves.  Just start by
setting it to 0 and see if it leads to what you want and then you can
increase its value and then see the behaviour.  I think additionally,
you might want to debug build_index_paths and see the cost it
generates for parallel path (the relevant code is near comment "If
appropriate, consider parallel index scan. ..") as compared to the
non-parallel path.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com