Yep, roughly 420 million of the 540 million records have a titleid
of 1. There are about 880,000 other unique values, but most of the
records are 1. Yet n_distinct is only 292. I'm surprised
it's not eliminating the duplicates while it builds that hash table.
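For reference, this is roughly how I compared the real data to the
planner's statistics (a sketch; the count(DISTINCT) pass is slow on
540M rows, and pg_stats is the standard statistics view):

```sql
-- Actual number of distinct titleids (full scan, slow on a big table)
SELECT count(DISTINCT titleid) FROM log_raw;

-- What the planner believes after ANALYZE
SELECT n_distinct, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'log_raw'
   AND attname   = 'titleid';
```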
Here is what I'm doing as a workaround right now. Getting
n_distinct right seems to keep the planner from breaking.
It will be executed once a week during the weekly maintenance.
It sets the n_distinct of each column to the number of rows in
the associated lookup table.
CREATE OR REPLACE FUNCTION patch_ndistinct(_table varchar, _column varchar,
                                           _string_table varchar)
RETURNS real AS
$$
DECLARE
    _cnt real;
BEGIN
    -- reltuples is the planner's row-count estimate from the last
    -- ANALYZE/VACUUM; casting through regclass resolves the schema
    -- instead of matching any relation with that name
    SELECT reltuples INTO _cnt
      FROM pg_class
     WHERE oid = _string_table::regclass;
    -- A positive n_distinct is taken as an absolute distinct count;
    -- format() with %I quotes the identifiers safely
    EXECUTE format('ALTER TABLE %I ALTER COLUMN %I SET (n_distinct = %s)',
                   _table, _column, _cnt);
    RETURN _cnt;
END
$$ LANGUAGE plpgsql;
select patch_ndistinct('log_raw', 'titleid', 'titles');
select patch_ndistinct('log_raw', 'urlid', 'urls');
select patch_ndistinct('log_raw', 'hostid', 'hosts');
ANALYZE log_raw;
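To confirm the overrides stuck after the maintenance run, something
like this works (a sketch; attoptions on pg_attribute is where the
per-column override is stored, and pg_stats shows what ANALYZE
actually recorded):

```sql
-- The stored per-column override
SELECT attname, attoptions
  FROM pg_attribute
 WHERE attrelid = 'log_raw'::regclass
   AND attname IN ('titleid', 'urlid', 'hostid');

-- What the planner will now use
SELECT attname, n_distinct
  FROM pg_stats
 WHERE tablename = 'log_raw'
   AND attname IN ('titleid', 'urlid', 'hostid');
```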
On Thu, Feb 16, 2017 at 10:54 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Hinkle <hinkle@cipafilter.com> writes:
>> Tom, there are three columns in this table that exhibit the problem,
>> here is the statistics data after an analyze, and the real data to
>> compare it to.
>
>> attname | n_distinct | most_common_freqs
>
>> titleid | 292 | {0.767167}
>
> Ouch. That's saying there's some single value of titleid that accounts
> for more than three-quarters of the entries ... does that square with
> reality? That'd certainly explain why a hash join goes nuts.
>
> regards, tom lane
--
David Hinkle
Senior Software Developer
Phone: 800.243.3729x3000
Email: hinkle@cipafilter.com
Hours: Mon-Fri 8:00AM-5:00PM (CT)