Re: Estimating geometric distributions

Поиск
Список
Период
Сортировка
От Stephen Denne
Тема Re: Estimating geometric distributions
Дата
Msg-id F0238EBA67824444BC1CB4700960CB4804DD9399@dmpeints002.isotach.com
обсуждение исходный текст
Ответ на Estimating geometric distributions  ("Stephen Denne" <Stephen.Denne@datamail.co.nz>)
Список pgsql-hackers
I wrote:
> > I have a field whose distribution of frequencies of values is
> > roughly geometric, rather than flat.


> My problem is frequent
> > over-estimation of rows when restricting by this field with
> > values not known at plan time.


> Is there any facility already in PostgreSQL to help me here?
>
> Hopefully an index type that I don't know about yet?
> (Geometric distributions are similar to those found in word
> count distributions).
>
> If not... is there any merit in this idea:
>
> During the analyze process, the geometric mean of sampled
> rows was calculated, and if determined to be significantly
> different from the arithmetic mean, stored in a new stats
> column. When estimating the number of rows that will be
> returned by queries of the form shown above, if there is a
> geometric mean stored, use it instead of the arithmetic mean.

I came up with another (much easier) means of adjusting the planners estimation of how many rows will be returned:

Increase the number of distinct values in the statistics.
For example:
update pg_statistic set stadistinct=2691 where starelid=29323 and staattnum=2;

I can then pick a number of distinct values such that the effective arithmetic mean is equal to what I calculated the
geometricmean to be. 

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any
attachmentsis confidential and may be subject to legal privilege.  If it is not intended for you please advise by reply
immediately,destroy it and do not copy, disclose or use it in any way. 

__________________________________________________________________ This email has been scanned by the DMZGlobal
BusinessQuality              Electronic Messaging Suite. 
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Autovacuum vs statement_timeout
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Autovacuum vs statement_timeout