Re: Should heapam_estimate_rel_size consider fillfactor?

Поиск

Список

Период

Сортировка

От	Andres Freund
Тема	Re: Should heapam_estimate_rel_size consider fillfactor?
Дата	14 июня 2023 г. 22:53:53
Msg-id	20230614195353.kke26ol7vyrykdad@awork3.anarazel.de обсуждение исходный текст
Ответ на	Should heapam_estimate_rel_size consider fillfactor? (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список	pgsql-hackers

Дерево обсуждения

Hi,

On 2023-06-11 14:41:27 +0200, Tomas Vondra wrote:
> While testing some stuff, I noticed heapam_estimate_rel_size (or rather
> table_block_relation_estimate_size it calls) ignores fillfactor, so that
> for a table without statistics it ends up with reltuple estimate much
> higher than reality. For example, with fillfactor=10 the estimate is
> about 10x higher.
> 
> I ran into this while doing some tests with hash indexes, where I use
> fillfactor to make the table look bigger (as if the tuples were wider),
> and I ran into this:
> 
>   drop table hash_test;
>   create table hash_test (a int, b text) with (fillfactor=10);
>   insert into hash_test select 1 + ((i - 1) / 10000), md5(i::text)
>          from generate_series(1, 1000000) s(i);
>   -- analyze hash_test;
>   create index hash_test_idx on hash_test using hash (a);
> 
>   select pg_size_pretty(pg_relation_size('hash_test_idx'));
> 
> If you run it like this (without the analyze), the index will be 339MB.
> With the analyze, it's 47MB.
> 
> This only happens if there are no relpages/reltuples statistics yet, in
> which case table_block_relation_estimate_size estimates density from
> tuple width etc.
> 
> So it seems the easiest "fix" is to do ANALYZE before creating the index
> (and not after it, as I had in my scripts). But I wonder how many people
> fail to realize this - it sure took me a while to realize the indexes
> are too large and even longer what is causing it. I wouldn't be very
> surprised if many people had bloated hash indexes after bulk loads.
> 
> So maybe we should make table_block_relation_estimate_size smarter to
> also consider the fillfactor in the "no statistics" branch, per the
> attached patch.

Seems like a good idea - I can't think of a reason why we shouldn't do so.

Greetings,

Andres Freund

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Andres Freund
Дата: 14 июня 2023 г., 22:51:39
Сообщение: Re: Let's make PostgreSQL multi-threaded

Следующее

От: Robert Haas
Дата: 14 июня 2023 г., 22:56:44
Сообщение: Re: Let's make PostgreSQL multi-threaded

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Should heapam_estimate_rel_size consider fillfactor?

Предыдущее

Следующее