Обсуждение: cost_agg() with AGG_HASHED does not account for startup costs

Поиск
Список
Период
Сортировка

cost_agg() with AGG_HASHED does not account for startup costs

От
David Rowley
Дата:
During working on allowing the planner to perform GROUP BY before joining I've noticed that cost_agg() completely ignores input_startup_cost when aggstrategy == AGG_HASHED.

I can see at least 3 call to cost_agg() which pass aggstrategy as AGG_HASHED that are also passing a possible non-zero startup cost.

The attached changes cost_agg() to include the startup cost, but does nothing for the changed plans in the regression tests.

Is this really intended?

Regards

David Rowley

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Вложения

Re: cost_agg() with AGG_HASHED does not account for startup costs

От
Tom Lane
Дата:
David Rowley <david.rowley@2ndquadrant.com> writes:
> During working on allowing the planner to perform GROUP BY before joining
> I've noticed that cost_agg() completely ignores input_startup_cost
> when aggstrategy == AGG_HASHED.

Isn't your proposed patch double-counting the input startup cost?
input_total_cost already includes that charge.  The calculation
reflects the fact that we have to read all of the input before we
can deliver any aggregated results, so the time to get the first
input row isn't really interesting.

If this were wrong, the PLAIN costing path would also be wrong, but
I don't think that either one is.
        regards, tom lane



Re: cost_agg() with AGG_HASHED does not account for startup costs

От
David Rowley
Дата:
On 5 August 2015 at 01:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <david.rowley@2ndquadrant.com> writes:
> During working on allowing the planner to perform GROUP BY before joining
> I've noticed that cost_agg() completely ignores input_startup_cost
> when aggstrategy == AGG_HASHED.

Isn't your proposed patch double-counting the input startup cost?
input_total_cost already includes that charge.  The calculation
reflects the fact that we have to read all of the input before we
can deliver any aggregated results, so the time to get the first
input row isn't really interesting.

If this were wrong, the PLAIN costing path would also be wrong, but
I don't think that either one is.


Sorry, false alarm, you're right. 

This was a bug in my code where I was adding disable_cost to the startup_cost, but not to the total_cost.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services