Re: [PERFORM] Execution plan analysis

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: [PERFORM] Execution plan analysis
Дата
Msg-id 9fdf19f5-a49f-2d85-1e9e-a3cdb335eb80@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: [PERFORM] Execution plan analysis  (Neto pr <netopr9@gmail.com>)
Список pgsql-performance
Hi,

So looking at the plans, essentially the only part that is different is
the scan node at the very bottom - in one case it's a sequential scan,
in the other case (the slow one) it's the bitmap index scan.

Essentially it's this:

    ->  Seq Scan on lineitem
        (cost=0.00..2624738.17 ...)
        (actual time=0.839..74391.087 ...)

vs. this:

    ->  Bitmap Heap Scan on lineitem
        (cost=336295.10..1970056.39 ...)
        (actual time=419620.817..509685.421 ...)
        ->  Bitmap Index Scan on idx_l_shipmodelineitem000
            (cost=0.00..336227.49 ...)
            (actual time=419437.172..419437.172 ...)

All the nodes are the same and perform about the same in both cases, so
you can ignore them. This difference it the the root cause you need to
investigate.

The question is why is the sequential scan so much faster than bitmap
index scan? Ideally, the bitmap heap scan should scan the index (in a
mostly sequential way), build a bitmap, and then read just the matching
part of the table (sequentially, by skipping some of the pages).

Now, there are a few reasons why this might not work that well.

Perhaps the table fits into RAM, but table + index does not. That would
make the sequential scan much faster than the index path. Not sure if
this is the case, as you haven't mentioned which TPC-H scale are you
testing, but you only have 4GB of RAM which if fairly low.

Another bit is prefetching - with sequential scans, the OS is able to
prefetch the next bit of data automatically (read-ahead). With bitmap
index scans that's not the case, producing a lot of individual
synchronous I/O requests. See if increasing effective_cache_size (from
default 1 to 16 or 32) helps.

Try generating the plans with EXPLAIN (ANALYZE, BUFFERS), that should
tell us more about how many blocks are found in shared buffers, etc.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [PERFORM] 10x faster sort performance on Skylake CPU vs Ivy Bridge
Следующее
От: Aniko Belim
Дата:
Сообщение: [PERFORM] Partitioned table - scans through every partitions