Re: Expected accuracy of planner statistics

From	Tom Lane
Subject	Re: Expected accuracy of planner statistics
Date	29 September 2006, 03:51:28
Msg-id	9347.1159501861@sss.pgh.pa.us
In response to	Expected accuracy of planner statistics (Casey Duncan <casey@pandora.com>)
Responses	Re: Expected accuracy of planner statistics (Casey Duncan <casey@pandora.com>) Re: Expected accuracy of planner statistics ("John D. Burger" <john@mitre.org>)
List	pgsql-general

Casey Duncan <casey@pandora.com> writes:
> I was also trying to figure out how big the sample really is. Does a
> stats target of 1000 mean 1000 rows sampled?

No.  From memory, the sample size is 300 times the stats target (eg,
3000 rows sampled for the default target of 10).  This is based on some
math that says that's enough for a high probability of getting good
histogram estimates.  Unfortunately that math promises nothing about
n_distinct.
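
To make the arithmetic concrete, here is a minimal sketch of the rule of thumb quoted above. The factor of 300 is taken from the message (which gives it "from memory"); the function name `sample_rows` is made up for illustration and is not a PostgreSQL API.

```python
# Multiplier quoted in the message ("from memory"); treat it as an assumption.
ROWS_PER_TARGET_UNIT = 300


def sample_rows(stats_target):
    """Approximate number of rows sampled for a given stats target."""
    return ROWS_PER_TARGET_UNIT * stats_target


# Default target of 10, plus the target of 1000 asked about in the question.
for target in (10, 100, 1000):
    print(target, sample_rows(target))
```

Under that rule, a stats target of 1000 would mean roughly 300,000 sampled rows rather than 1000.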

The information we've seen says that the only statistically reliable way
to arrive at an accurate n_distinct estimate is to examine most of the
table :-(.  Which seems infeasible for extremely large tables, which is
exactly where the problem is worst.  Marginal increases in the sample
size seem unlikely to help much ... as indeed your experiment shows.
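
A rough way to see why a small sample says so little about n_distinct: the sketch below draws 3000 rows from two synthetic one-million-row columns, one with 200,000 distinct values and one where every value is unique. The column layout (value = row index modulo the number of distinct values), the seed, and the function name are assumptions made for illustration only.

```python
import random


def distinct_in_sample(n_rows, n_distinct, sample_size, seed=0):
    """Count distinct values seen in a random sample of row positions.

    The synthetic column holds the value (row_index % n_distinct), so each
    value appears about n_rows / n_distinct times in the full table.
    """
    rng = random.Random(seed)
    rows = rng.sample(range(n_rows), sample_size)  # sample without replacement
    return len({row % n_distinct for row in rows})


# Same table size and sample size; true n_distinct differs by a factor of 5.
few = distinct_in_sample(1_000_000, 200_000, 3000)
many = distinct_in_sample(1_000_000, 1_000_000, 3000)
print(few, many)
```

Both samples consist almost entirely of values seen once, so any estimator has to infer the fivefold difference from a small number of repeated values.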

We could also diddle the estimator equation to inflate the estimate.
I'm not sure whether such a cure would be worse than the disease, but
certainly the current code was not given to us on stone tablets.
IIRC I picked an equation out of the literature partially on the basis
of it being simple and fairly cheap to compute...
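
The message does not name the equation. The comments in PostgreSQL's analyze.c attribute the estimator to Haas and Stokes, in the form n*d / (n - f1 + f1*n/N), where n is the sample size, N the total row count, d the number of distinct values in the sample, and f1 the number of values seen exactly once. Assuming that is the equation meant here, a minimal sketch (not the actual PostgreSQL code, which applies further adjustments) could look like this:

```python
from collections import Counter


def estimate_n_distinct(sample, total_rows):
    """Haas-Stokes style estimate: n*d / (n - f1 + f1*n/N).

    sample     -- list of sampled column values (n = len(sample))
    total_rows -- N, the number of rows in the whole table
    """
    n = len(sample)
    if n == 0:
        return 0.0  # nothing sampled, nothing to estimate
    counts = Counter(sample)
    d = len(counts)                                  # distinct values in sample
    f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    return n * d / (n - f1 + f1 * n / total_rows)


# If every sampled value is unique, the estimate scales up to the table size.
print(estimate_n_distinct(list(range(100)), 10_000))
# If no value occurs exactly once, the estimate is just the sample's d.
print(estimate_n_distinct([1, 1, 2, 2], 1_000))
```

The formula is cheap because it needs only d and f1 from the sample, which fits the "simple and fairly cheap to compute" criterion mentioned above.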

            regards, tom lane
