Re: [GENERAL] Bad planning data resulting in OOM killing of postgres

From: Pavel Stehule
Subject: Re: [GENERAL] Bad planning data resulting in OOM killing of postgres
Date:
Msg-id: CAFj8pRCQ+kiqyUU9Zh8UnAoSt9n0dLY0OgrszyBk9sj3ByiKVw@mail.gmail.com
In reply to: [GENERAL] Bad planning data resulting in OOM killing of postgres  (David Hinkle <hinkle@cipafilter.com>)
List: pgsql-general
Hi

2017-02-13 18:40 GMT+01:00 David Hinkle <hinkle@cipafilter.com>:
I'm having trouble with purges related to a large table. The delete
query consumes RAM until postgres crashes due to OOM. I have a very
large table called log_raw, and half a dozen related side tables such
as 'urls' and 'titles'. For example, log_raw.urlid = urls.urlid, and
urls contains the text of the various URLs.

Each time I try to purge these side tables, the unit OOMs.

psql:postgres@cipafilter = explain DELETE FROM titles WHERE NOT EXISTS
( SELECT 1 FROM log_raw WHERE log_raw.titleid = titles.titleid );
                                       QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────
 Delete on titles  (cost=22166473.44..24850954.67 rows=870382 width=12)
   ->  Hash Anti Join  (cost=22166473.44..24850954.67 rows=870382 width=12)
         Hash Cond: (titles.titleid = log_raw.titleid)
         ->  Seq Scan on titles  (cost=0.00..17871.64 rows=870664 width=10)
         ->  Hash  (cost=12744792.64..12744792.64 rows=542011264 width=10)
         ->  Seq Scan on log_raw  (cost=0.00..12744792.64 rows=542011264 width=10)
(6 rows)


What is your work_mem setting?

Regards

Pavel
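
The work_mem question matters here because the plan above builds an in-memory hash table over the 542M-row seq scan of log_raw; the hash table is sized from the planner's row and distinct-value estimates, with work_mem as the per-operation budget before the join spills to disk batches. A minimal sketch of how to inspect and adjust it per session (the '64MB' value is illustrative, not from the thread):

```sql
-- Show the current per-operation memory budget for sorts and hashes
SHOW work_mem;

-- Change it for this session only (value illustrative); a smaller
-- budget makes the hash join split into more on-disk batches
SET work_mem = '64MB';

-- Re-check the plan with the new setting
EXPLAIN DELETE FROM titles
WHERE NOT EXISTS (SELECT 1 FROM log_raw
                  WHERE log_raw.titleid = titles.titleid);
```

Note that work_mem is a per-operation limit, so a query with several sorts or hashes can use a multiple of it.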
psql:postgres@cipafilter = select count(*) from (select titleid from
log_raw group by titleid) as a;
 count
────────
 872210
(1 row)

cipafilter=# select n_distinct from pg_stats where tablename =
'log_raw' and attname = 'titleid';
 n_distinct
------------
        282
(1 row)

The planning data is wildly low for each of these fields, and I wonder
if, because of that error, the planner thinks it can keep all these IDs
in RAM while it works. ANALYZE doesn't fix it. Increasing the
statistics target improves the n_distinct figure, but not
considerably: increasing it 3 or 4 fold still leaves the estimate
wildly off. ALTER TABLE ... SET n_distinct doesn't seem to be used by
the planner, as it doesn't change any of the plans I've generated or
seem to be taken into account in the row estimates. I'm out of ideas.
Anybody have any ideas?
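
One detail worth noting about the per-column override: PostgreSQL only consults an n_distinct value set via ALTER TABLE when statistics are next gathered, so the override has no effect on plans until ANALYZE runs again on the table. A sketch combining both knobs, using the 872210 distinct titleids counted above (the statistics target of 10000 is the maximum, chosen here for illustration):

```sql
-- Raise the per-column statistics target (default 100, max 10000)
-- so ANALYZE samples more rows for this column
ALTER TABLE log_raw ALTER COLUMN titleid SET STATISTICS 10000;

-- Pin the distinct estimate directly; a positive value is an
-- absolute count, a negative value a fraction of the row count
ALTER TABLE log_raw ALTER COLUMN titleid SET (n_distinct = 872210);

-- The override is only picked up the next time ANALYZE runs
ANALYZE log_raw;
```

Afterwards, `SELECT n_distinct FROM pg_stats WHERE tablename = 'log_raw' AND attname = 'titleid';` should report the pinned value if the override took effect.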


--
David Hinkle

Senior Software Developer

Phone:  800.243.3729x3000

Email:  hinkle@cipafilter.com

Hours:  Mon-Fri   8:00AM-5:00PM (CT)


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
