Re: Odd Row Estimates in Query Plan (rows=75)

Поиск

Список

Период

Сортировка

От	Don Seiler
Тема	Re: Odd Row Estimates in Query Plan (rows=75)
Дата	16 августа 2018 г. 02:03:14
Msg-id	CAHJZqBBqL0xPtz0CEG4r1J8T4X1LML=eTZaHPVw92uKcmGzHdg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Odd Row Estimates in Query Plan (rows=75) (Adrian Klaver <adrian.klaver@aklaver.com>)
Ответы	Re: Odd Row Estimates in Query Plan (rows=75) (Adrian Klaver <adrian.klaver@aklaver.com>)
Список	pgsql-general

Дерево обсуждения

Here's the query, obfuscated manually by me:

SELECT

'Foo' as system_function,

stores.name as store,

lt.owner,

lt.minute_of_day,

lt.records

FROM

foo.stores

LEFT OUTER JOIN

(SELECT

lts.store_pkey,

lts.owner,

date_trunc('minute', lts.date_gifted) as minute_of_day,

count(*) as records

FROM foo.gifts lts

WHERE

lts.date_added > '2017-07-14 11:13:05'

AND lts.date_added < '2017-08-13 14:14:21'

AND lts.date_gifted >= '2017-08-13 11:13:05'

AND lts.date_gifted < '2017-08-13 14:14:21'

GROUP BY 1,2,3

ORDER BY 1

) lt ON lt.store_pkey = stores.pkey

WHERE lt.records IS NOT NULL;

The foo.gifts table is pretty much the core table of our database. It's big and very active. There is an index on date_added but not yet on date_gifted.

I'm working to re-write the query while the dev sees if we even need this query anymore.

On Wed, Aug 15, 2018 at 2:39 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 08/15/2018 12:31 PM, Don Seiler wrote:
PostgreSQL 9.6.6 on CentOS.

We have a report query that has gone from maybe a few seconds to run to a few minutes to run since mid-July. Looking at the output of EXPLAIN ANALYZE, the row count estimates are way off, even though this table was just analyzed a day or so ago. What's more bizarre to me is that the row count esimate is *always* 75 for every node of the plan, where the actual rows is in the hundreds or thousands. This table is one of the busiest tables in our production database (many inserts and updates). It is autovacuumed and autoanalyzed a few times per week, although I'm looking to change it to a nightly manual schedule to avoid daytime autovacuums.

Hash Join (cost=1869142.34..1869146.15 rows=75 width=88) (actual time=179877.869..179878.011 rows=759 loops=1)
   Hash Cond: (stores.pkey = lt.store_pkey)
   Buffers: shared hit=1654593 read=331897 dirtied=249
   -> Seq Scan on stores (cost=0.00..2.77 rows=77 width=22) (actual time=0.007..0.023 rows=78 loops=1)
   Buffers: shared hit=2
   -> Hash (cost=1869141.40..1869141.40 rows=75 width=50) (actual time=179877.847..179877.847 rows=759 loops=1)
   Buckets: 1024 Batches: 1 Memory Usage: 73kB
   Buffers: shared hit=1654591 read=331897 dirtied=249
   -> Subquery Scan on lt (cost=1869138.59..1869141.40 rows=75 width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
   Buffers: shared hit=1654591 read=331897 dirtied=249
   -> GroupAggregate (cost=1869138.59..1869140.65 rows=75 width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
   Group Key: lts.store_pkey, lts.owner, (date_trunc('minute'::text, lts.date_gifted))
   Filter: (count(*) IS NOT NULL)
   Buffers: shared hit=1654591 read=331897 dirtied=249
   -> Sort (cost=1869138.59..1869138.78 rows=75 width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
   Sort Key: lts.store_pkey, lts.entry_source_owner, (date_trunc('minute'::text, lts.date_gifted))
   Sort Method: quicksort Memory: 757kB
   Buffers: shared hit=1654591 read=331897 dirtied=249
   -> Index Scan using gifts_date_added on gifts lts (cost=0.56..1869136.25 rows=75 width=42) (actual time=190.657..179870.165 rows=6731 loops=1)
   Index Cond: ((date_added > '2018-07-14 11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13 14:14:21'::timestamp without time zone))
   Filter: ((date_gifted >= '2018-08-13 11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13 14:14:21'::timestamp without time zone))
   Rows Removed by Filter: 938197
   Buffers: shared hit=1654591 read=331897 dirtied=249
  Planning time: 0.426 ms
  Execution time: 179893.894 ms

I don't have a version of this query from prior to this summer, but getting explain plan for older data from older sandboxes show a similar plan.

I don't have an answer, just a question:

Can you provide the actual query and the table schema?

Sidenote: I am suggesting that an index be added on the date_gifted field as that is far more selective and avoids throwing rows away. However I'm very interested in why every node dealing with the gifts table thinks rows=75 when the actual is much, much higher. And 75 seems like too round of a number to be random?

--
Don Seiler
www.seiler.us <http://www.seiler.us>

--
Adrian Klaver
adrian.klaver@aklaver.com

Don Seiler
www.seiler.us

В списке pgsql-general по дате отправления:

Предыдущее

От: Adrian Klaver
Дата: 16 августа 2018 г., 01:39:45
Сообщение: Re: Odd Row Estimates in Query Plan (rows=75)

Следующее

От: "Phil Endecott"
Дата: 16 августа 2018 г., 02:25:19
Сообщение: During promotion, new master tries to archive same segment twice

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Odd Row Estimates in Query Plan (rows=75)

Предыдущее

Следующее