Re: More stable query plans via more predictable column statistics

Поиск

Список

Период

Сортировка

От	Shulgin, Oleksandr
Тема	Re: More stable query plans via more predictable column statistics
Дата	8 марта 2016 г. 21:25:41
Msg-id	CACACo5RP3aO-vQxB+10-iGJiEGgPeHPyugDQbLcRdBOaxzmEZg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: More stable query plans via more predictable column statistics (Joel Jacobson <joel@trustly.com>)
Ответы	Re: More stable query plans via more predictable column statistics (Joel Jacobson <joel@trustly.com>)
Список	pgsql-hackers

Дерево обсуждения

On Tue, Mar 8, 2016 at 3:36 PM, Joel Jacobson <joel@trustly.com> wrote:

Hi Alex,

Thanks for excellent research.

Joel,

Thank you for spending your time to run these :-)

I've ran your queries against Trustly's production database and I can
confirm your findings, the results are similar:

WITH ...
SELECT count(1),
min(hist_ratio)::real,
avg(hist_ratio)::real,
max(hist_ratio)::real,
stddev(hist_ratio)::real
FROM stats2
WHERE histogram_bounds IS NOT NULL;

-[ RECORD 1 ]----
count | 2814
min | 0.193548
avg | 0.927357
max | 1
stddev | 0.164134

WHERE distinct_hist < num_hist
-[ RECORD 1 ]----
count | 624
min | 0.193548
avg | 0.672407
max | 0.990099
stddev | 0.194901

WITH ..
SELECT schemaname ||'.'|| tablename ||'.'|| attname || (CASE inherited
WHEN TRUE THEN ' (inherited)' ELSE '' END) AS columnname,
n_distinct, null_frac,
num_mcv, most_common_vals, most_common_freqs,
mcv_frac, (mcv_frac / (1 - null_frac))::real AS nonnull_mcv_frac,
distinct_hist, num_hist, hist_ratio,
histogram_bounds
FROM stats2
ORDER BY hist_ratio
LIMIT 1;

-[ RECORD 1 ]-----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
columnname | public.x.y
n_distinct | 103
null_frac | 0
num_mcv | 10
most_common_vals | {0,1,2,3,4,5,6,7,8,9}
most_common_freqs |
{0.4765,0.141733,0.1073,0.0830667,0.0559667,0.0373333,0.0251,0.0188,0.0141,0.0113667}
mcv_frac | 0.971267
nonnull_mcv_frac | 0.971267
distinct_hist | 18
num_hist | 93
hist_ratio | 0.193548387096774
histogram_bounds |
{10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,11,12,12,12,12,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,13,13,14,14,14,14,14,15,15,15,15,16,16,16,16,21,23,5074,5437,5830,6049,6496,7046,7784,14629,21285}

I don't want to be asking for too much here, but is there a chance you could try the effects of the proposed patch on an offline copy of your database?

Do you envision or maybe have experienced problems with query plans referring to the columns that are near the top of the above hist_ratio report? In other words: what are the practical implications for you with the values being duplicated rather badly throughout the histogram like in the example you shown?

Thank you!

Alex

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Robert Haas
Дата: 08 марта 2016 г., 21:21:11
Сообщение: Re: Freeze avoidance of very large table.

Следующее

От: Robert Haas
Дата: 08 марта 2016 г., 21:38:28
Сообщение: Re: Freeze avoidance of very large table.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: More stable query plans via more predictable column statistics

Предыдущее

Следующее