Re: [GENERAL] Bad planning data resulting in OOM killing of postgres

From: Jeff Janes
Subject: Re: [GENERAL] Bad planning data resulting in OOM killing of postgres
Date:
Msg-id: CAMkU=1zdL_04dNCZMgxRYs9w4-tJHumj+o+_3-uGVf584f7wWw@mail.gmail.com
In response to: Re: [GENERAL] Bad planning data resulting in OOM killing of postgres  (David Hinkle <hinkle@cipafilter.com>)
List: pgsql-general
On Thu, Feb 16, 2017 at 3:27 PM, David Hinkle <hinkle@cipafilter.com> wrote:
I guess this doesn't work; the latest test run crashed.  It still uses the
bad plan for the hostid column even after n_distinct is updated.

cipafilter=# select attname, n_distinct from pg_stats where tablename
cipafilter-# = 'log_raw' and (attname = 'urlid' or attname = 'titleid'
cipafilter(# or attname = 'hostid');
 attname | n_distinct
---------+-------------
 urlid   | 1.51625e+08
 hostid  |      304710
 titleid |      886499
(3 rows)

cipafilter=# explain DELETE FROM hosts WHERE NOT EXISTS ( SELECT
log_raw.hostid FROM log_raw WHERE log_raw.hostid = hosts.hostid );
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Delete on hosts  (cost=22249475.67..74844813.47 rows=1 width=12)
   ->  Hash Anti Join  (cost=22249475.67..74844813.47 rows=1 width=12)
         Hash Cond: (hosts.hostid = log_raw.hostid)
         ->  Seq Scan on hosts  (cost=0.00..5017.10 rows=304710 width=10)
         ->  Hash  (cost=12799395.52..12799395.52 rows=543645052 width=10)
               ->  Seq Scan on log_raw  (cost=0.00..12799395.52 rows=543645052 width=10)
(6 rows)

I guess I will also try throwing in 'set enable_hashjoin = false;' and
see if that gets these purges to go through.



Another option would be to force the de-dup to happen, with:

explain with t as (select distinct hostid from log_raw)
delete from hosts
where not exists (select 1 from t where t.hostid = hosts.hostid);

That way you can use the hash join without running out of memory, since the hash table is built over the distinct hostids rather than all 543 million log_raw rows, in case the hash join is actually faster than the merge join.  It also seems cleaner than fiddling with enable_* parameters and then having to remember to reset them when done.
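(If you do end up going the enable_* route, a transaction-scoped SET LOCAL sidesteps the "remember to reset" problem, since the setting reverts automatically at commit or rollback.  A generic sketch, not taken from the thread:)

```sql
BEGIN;
-- SET LOCAL only lasts for the current transaction,
-- so there is nothing to reset afterwards.
SET LOCAL enable_hashjoin = off;
DELETE FROM hosts
WHERE NOT EXISTS (
    SELECT 1 FROM log_raw WHERE log_raw.hostid = hosts.hostid
);
COMMIT;
```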

Cheers,

Jeff
