Re: n_distinct off by a factor of 1000

Поиск
Список
Период
Сортировка
От Pavel Luzanov
Тема Re: n_distinct off by a factor of 1000
Дата
Msg-id b750644a-9bad-d4fe-0b85-7ebaa6053f00@postgrespro.ru
обсуждение исходный текст
Ответ на n_distinct off by a factor of 1000  (Klaudie Willis <Klaudie.Willis@protonmail.com>)
Ответы Re: n_distinct off by a factor of 1000
Список pgsql-general

Hello,

I got my first hint of why this problem occurs when I looked at the statistics.  For the column in question, "instrument_ref" the statistics claimed it to be:

The default_statistics_target=500, and analyze has been run.
select * from pg_stats where attname like 'instr%_ref'; -- Result: 40.000
select count(distinct instrumentid_ref) from bigtable -- Result: 33 385 922 (!!)

That is an astonishing difference of almost a 1000X. 

I have tried to increase the statistics target to 5000, and it helps, but it reduces the error to 100X.  Still crazy high.

As far as I know, increasing default_statistics_target will not help. [1]

I have considered these fixes:
- hardcode the statistics to a particular ratio of the total number of rows

You can hardcode the percentage of distinct values:
ALTER TABLE bigtable ALTER COLUMN instrument_ref SET ( n_distinct=-0.06 ); /* -1 * (33385922 / 500000000) */


[1] https://www.postgresql.org/message-id/4136ffa0812111823u645b6ec9wdca60b3da4b00499%40mail.gmail.com
-----
Pavel Luzanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

В списке pgsql-general по дате отправления:

Предыдущее
От: Matthias Apitz
Дата:
Сообщение: Re: error messages (autovaccum canceled and syntax errors) whileloading a DUMP
Следующее
От: "Wolff, Ken L"
Дата:
Сообщение: RE: EXTERNAL: Re: Netapp SnapCenter