Re: cost_sort() improvements
From | Teodor Sigaev |
---|---|
Subject | Re: cost_sort() improvements |
Date | |
Msg-id | ce8eff53-52f2-e7e6-0059-8527c3f2892d@sigaev.ru |
In response to | Re: cost_sort() improvements (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
> OK, so Fi is pretty much whatever CREATE FUNCTION ... COST says, right?

Exactly.

> Hmm, makes sense. But doesn't that mean it's mostly a fixed per-tuple
> cost, not directly related to the comparison? For example, why should it
> be multiplied by C0? That is, if I create a very expensive comparator
> (say, with cost 100), why should it increase the cost for transferring
> the tuple to CPU cache, unpacking it, etc.?
>
> I'd say those costs are rather independent of the function cost, and
> remain rather fixed, no matter what the function cost is.
>
> Perhaps you haven't noticed that, because the default funcCost is 1?

Maybe, but see my email
https://www.postgresql.org/message-id/ee14392b-d753-10ce-f5ed-7b2f7e277512%40sigaev.ru
about an additional term proportional to N.

> The number of new magic constants introduced by this patch is somewhat
> annoying. 2.0, 1.5, 0.125, ... :-(

2.0 is removed in the last patch; 1.5 is left and could be removed once I
understand your letter about group size estimation :) 0.125 should be checked,
and I suppose we can't remove it at all, because it is the "average over
whole word" constant.

>> - Final cost is cpu_operator_cost * N * sum(per column costs described
>>   above).
>>   Note, for single column with width <= sizeof(datum) and F1 = 1 this
>>   formula gives exactly the same result as current one.
>> - for Top-N sort empiric is close to old one: use 2.0 multiplier as
>>   constant under log2, and use log2(Min(NGi, output_tuples)) for second
>>   and following columns.
>
> I think compute_cpu_sort_cost is somewhat confused whether
> per_tuple_cost is directly a cost, or a coefficient that will be
> multiplied with cpu_operator_cost to get the actual cost.
>
> At the beginning it does this:
>
>     per_tuple_cost = comparison_cost;
>
> so it inherits the value passed to cost_sort(), which is supposed to be
> cost. But then it does the work, which includes things like this:
>
>     per_tuple_cost += 2.0 * funcCost * LOG2(tuples);
>
> where funcCost is pretty much pg_proc.procost. AFAIK that's meant to be
> a value in units of cpu_operator_cost. And at the end it does this
>
>     per_tuple_cost *= cpu_operator_cost;
>
> I.e. it gets multiplied with another cost. That doesn't seem right.

Huh, you are right, will fix in v8.

> Also, why do we need this?
>
>     if (sortop != InvalidOid)
>     {
>         Oid funcOid = get_opcode(sortop);
>
>         funcCost = get_func_cost(funcOid);
>     }

Safety first :). Will remove.

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                      WWW: http://www.sigaev.ru/
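
The unit mismatch raised in the quoted review can be sketched in C. This is a minimal illustration under stated assumptions, not code from the patch: the function names `sketch_mixed_units_cost` and `sketch_per_tuple_cost` are hypothetical, the number of comparisons per tuple (for example LOG2(tuples)) is passed in as a plain argument, and 0.0025 is used because it is PostgreSQL's default `cpu_operator_cost`. How v8 actually resolves the issue is not shown in this message.

```c
#include <assert.h>

/* Default value of the cpu_operator_cost planner setting. */
#define DEFAULT_CPU_OPERATOR_COST 0.0025

/*
 * Hypothetical reproduction of the arithmetic quoted in the review:
 * comparison_cost is already a cost, func_cost is in units of
 * cpu_operator_cost (like pg_proc.procost), yet both end up being
 * multiplied by cpu_operator_cost.
 */
double
sketch_mixed_units_cost(double comparison_cost, double func_cost,
                        double ncomparisons, double cpu_operator_cost)
{
    double per_tuple_cost = comparison_cost;          /* cost units */

    per_tuple_cost += 2.0 * func_cost * ncomparisons; /* operator units */
    per_tuple_cost *= cpu_operator_cost;              /* scales both terms */
    return per_tuple_cost;
}

/*
 * One possible way to keep the units apart (an assumption, not the
 * patch's fix): accumulate the dimensionless coefficient separately,
 * convert it to cost units once, and only then add the term that was
 * a cost from the start.
 */
double
sketch_per_tuple_cost(double comparison_cost, double func_cost,
                      double ncomparisons, double cpu_operator_cost)
{
    double coeff;

    assert(ncomparisons >= 0.0 && func_cost >= 0.0);

    coeff = 2.0 * func_cost * ncomparisons;           /* operator units */
    return comparison_cost + coeff * cpu_operator_cost;
}
```

With `comparison_cost = 0` the two functions agree, which may be why the mismatch is easy to miss; with a non-zero `comparison_cost` the first variant shrinks that term by a factor of `cpu_operator_cost`.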