Обсуждение: estimated_count() implementation

Поиск
Список
Период
Сортировка

estimated_count() implementation

От
Volkan YAZICI
Дата:
Hi,

I'm trying to implement estimated_count() function that's mentioned in
the TODO list. First of all, I wanted to learn if this TODO item is
still valid? I looked at the related -hackers discussions, does anybody
want to say more sth related with the implementation?

Also I've some questions. I'd be appreciated if somebody would answer
any of the below questions to help me find my way.

1. I'm planning to use same method as ExplainOneQuery() does in  backend/commands/explain.c. (Using Plan->plan_rows
thatwill be  returned from planner(query, isCursor, cursorOptions, params)  function.) Is this the way to go, or should
Ilook for another  method to aggregate estimated row count.
 

2. I've been also considering getting called from a nodeAgg. In such a  case, it shouldn't be a problem for me to use
sameway as above to  retrieve Query, ParamListInfo and TupOutputState. Right?
 

3. I was looking at int8inc() and backend/executor/nodeAgg.c and  couldn't find anything special to count() aggregate.
AmI looking  at the right place? For instance, for my case, I won't need any  transition function call. How should I
modifynodeAgg.c to skip  transfn calls for estimated_count()?
 

4. Related with the problem, any question I missed.


Regards.


Re: estimated_count() implementation

От
Michael Fuhr
Дата:
On Sat, Oct 21, 2006 at 11:44:19PM +0300, Volkan YAZICI wrote:
> I'm trying to implement estimated_count() function that's mentioned in
> the TODO list. First of all, I wanted to learn if this TODO item is
> still valid? I looked at the related -hackers discussions, does anybody
> want to say more sth related with the implementation?

I hadn't noticed the TODO item but about a year ago I posted a
cursor_plan_rows() function and asked for comments.  The only reply
was from Tom, who said, "Given how far off it frequently is, I can't
believe that any of the people who ask for the feature would find
this a satisfactory answer :-("

http://archives.postgresql.org/pgsql-hackers/2005-11/msg00579.php
http://archives.postgresql.org/pgsql-hackers/2005-11/msg00580.php

-- 
Michael Fuhr


Re: estimated_count() implementation

От
Volkan YAZICI
Дата:
On Oct 21 05:09, Michael Fuhr wrote:
> I hadn't noticed the TODO item but about a year ago I posted a
> cursor_plan_rows() function and asked for comments.

Ah! I didn't see this.

> The only reply was from Tom, who said, "Given how far off it
> frequently is, I can't believe that any of the people who ask for the
> feature would find this a satisfactory answer :-("

AFAIU, cursor_plan_rows() has some serious limitations like requiring to
be executed for a portal. I was planning to make estimated_count() work
for nodeAgg and custom calls too - as count() does.

But OTOH, Tom's complaints look like still applicable for my
estimated_count() too. Does this TODO need a little bit more
clarification or we can count is a redundant one?


Regards.


Re: estimated_count() implementation

От
"Simon Riggs"
Дата:
On Sun, 2006-10-22 at 12:07 +0300, Volkan YAZICI wrote:
> On Oct 21 05:09, Michael Fuhr wrote:
> > I hadn't noticed the TODO item but about a year ago I posted a
> > cursor_plan_rows() function and asked for comments.
> 
> Ah! I didn't see this.
> 
> > The only reply was from Tom, who said, "Given how far off it
> > frequently is, I can't believe that any of the people who ask for the
> > feature would find this a satisfactory answer :-("
> 
> AFAIU, cursor_plan_rows() has some serious limitations like requiring to
> be executed for a portal. I was planning to make estimated_count() work
> for nodeAgg and custom calls too - as count() does.
> 
> But OTOH, Tom's complaints look like still applicable for my
> estimated_count() too. Does this TODO need a little bit more
> clarification or we can count is a redundant one?

http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php

Is the source of the TODO item, though please read the upthread messages
as to how we got there...

I think it would be a useful function...

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: estimated_count() implementation

От
Tom Lane
Дата:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> On Sun, 2006-10-22 at 12:07 +0300, Volkan YAZICI wrote:
>> But OTOH, Tom's complaints look like still applicable for my
>> estimated_count() too. Does this TODO need a little bit more
>> clarification or we can count is a redundant one?

> http://archives.postgresql.org/pgsql-hackers/2005-11/msg00943.php

> Is the source of the TODO item, though please read the upthread messages
> as to how we got there...

I think there is a use-case for something like
select estimated_count('select * from ... where ...');

i.e. there are applications where a possibly-bad estimate is enough.
What I'm concerned about is the level of complaints from newbies who'll
expect it to be dead accurate all the time ...

BTW, you can build estimated_count() today in a few lines of plpgsql:

create or replace function estimated_count(text) returns float8 as $$
declare r text;
begin for r in execute 'explain ' || $1 loop   return substring(r from 'rows=([0-9]+) '); end loop;
end$$ language plpgsql strict;

I don't see that it really justifies any more work than that.
        regards, tom lane