Re: [PERFORM] Execution plan analysis

Поиск

Список

Период

Сортировка

От	Tomas Vondra
Тема	Re: [PERFORM] Execution plan analysis
Дата	25 августа 2017 г. 21:33:54
Msg-id	9fdf19f5-a49f-2d85-1e9e-a3cdb335eb80@2ndquadrant.com обсуждение исходный текст
Ответ на	Re: [PERFORM] Execution plan analysis (Neto pr <netopr9@gmail.com>)
Список	pgsql-performance

Дерево обсуждения

Hi,

So looking at the plans, essentially the only part that is different is
the scan node at the very bottom - in one case it's a sequential scan,
in the other case (the slow one) it's the bitmap index scan.

Essentially it's this:

    ->  Seq Scan on lineitem
        (cost=0.00..2624738.17 ...)
        (actual time=0.839..74391.087 ...)

vs. this:

    ->  Bitmap Heap Scan on lineitem
        (cost=336295.10..1970056.39 ...)
        (actual time=419620.817..509685.421 ...)
        ->  Bitmap Index Scan on idx_l_shipmodelineitem000
            (cost=0.00..336227.49 ...)
            (actual time=419437.172..419437.172 ...)

All the nodes are the same and perform about the same in both cases, so
you can ignore them. This difference it the the root cause you need to
investigate.

The question is why is the sequential scan so much faster than bitmap
index scan? Ideally, the bitmap heap scan should scan the index (in a
mostly sequential way), build a bitmap, and then read just the matching
part of the table (sequentially, by skipping some of the pages).

Now, there are a few reasons why this might not work that well.

Perhaps the table fits into RAM, but table + index does not. That would
make the sequential scan much faster than the index path. Not sure if
this is the case, as you haven't mentioned which TPC-H scale are you
testing, but you only have 4GB of RAM which if fairly low.

Another bit is prefetching - with sequential scans, the OS is able to
prefetch the next bit of data automatically (read-ahead). With bitmap
index scans that's not the case, producing a lot of individual
synchronous I/O requests. See if increasing effective_cache_size (from
default 1 to 16 or 32) helps.

Try generating the plans with EXPLAIN (ANALYZE, BUFFERS), that should
tell us more about how many blocks are found in shared buffers, etc.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

В списке pgsql-performance по дате отправления:

Предыдущее

От: Tom Lane
Дата: 25 августа 2017 г., 21:07:21
Сообщение: Re: [PERFORM] 10x faster sort performance on Skylake CPU vs Ivy Bridge

Следующее

От: Aniko Belim
Дата: 25 августа 2017 г., 21:36:29
Сообщение: [PERFORM] Partitioned table - scans through every partitions

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [PERFORM] Execution plan analysis

Предыдущее

Следующее