Hash Aggregate plan picked for very large table == out of memory

Поиск
Список
Период
Сортировка
От Mason Hale
Тема Hash Aggregate plan picked for very large table == out of memory
Дата
Msg-id 8bca3aa10706141315k38d89d74occ317907f68ed54d@mail.gmail.com
обсуждение исходный текст
Ответы Re: Hash Aggregate plan picked for very large table == out of memory  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Hash Aggregate plan picked for very large table == out of memory  (Gregory Stark <stark@enterprisedb.com>)
Список pgsql-general
With Postgresql 8.1.9 -- I have a simple group by query:

SELECT target_page_id, min(created_at)
FROM page_page_link
GROUP BY 1;

The page_page_link table has ~130 million rows.

After analyzing the table, the planner picks a hash aggregate plan, which results in an out of memory error.

crystal=> analyze page_page_link;
ANALYZE
crystal=> explain
crystal-> SELECT target_page_id as page_id, min(created_at) as created_at
crystal-> FROM page_page_link
crystal-> GROUP By 1
crystal-> ;
                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
 HashAggregate  (cost=3663517.88..3670393.09 rows=550017 width=12)
   ->  Seq Scan on page_page_link  (cost=0.00..2993649.92 rows=133973592 width=12)
(2 rows)

The default_statistics_target was originally 200.
I upped it to 1000 and still get the same results.

crystal=> show default_statistics_target;
 default_statistics_target
---------------------------
 1000
(1 row)

crystal=> set enable_hashagg = off;
SET
crystal=> explain
crystal-> SELECT target_page_id as page_id, min(created_at) as created_at
crystal-> FROM page_page_link
crystal-> GROUP BY 1
crystal-> ;
                                       QUERY PLAN                                       
-----------------------------------------------------------------------------------------
 GroupAggregate  (cost=27240841.37..28252518.53 rows=550017 width=12)
   ->  Sort  (cost=27240841.37..27575775.35 rows=133973592 width=12)
         Sort Key: target_page_id
         ->  Seq Scan on page_page_link  (cost= 0.00..2993649.92 rows=133973592 width=12)
(4 rows)

crystal=>

I am working around this by setting enable_hashagg = off  -- but it just seems like a case where the planner is not picking the strategy?

Is there another setting I can change to help make better decisions?

thanks in advance,

Mason

В списке pgsql-general по дате отправления:

Предыдущее
От: "Shoaib Mir"
Дата:
Сообщение: Re: Function with COPY command?
Следующее
От: Francisco Reyes
Дата:
Сообщение: pg_restore out of memory