Обсуждение: bitmap heap scan exact/lossy blocks and row removal

Поиск
Список
Период
Сортировка

bitmap heap scan exact/lossy blocks and row removal

От
Augusto Callejas
Дата:
I created a GIN index on the following relation and ran an EXPLAIN query on a query, and noticed that despite all heap blocks being exact, that the outermost bitmap heap scan removed 62 rows after recheck. My understanding (mainly from https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-lossyexact-pages-for-bitmap-heap-scan) is that if there are only exact pages, then there are only tuples in the bitmap, so I wouldn't expect to see rows being removed by the recheck. I maxed out the work memory just in case the bitmap was hitting a memory threshold where it would have to switch to lossy mode. Why would rows be removed with only exact pages?

For reference, I'm running PostgreSQL 11.9.

===
SHOW work_mem;
   work_mem   
--------------
 2147483647kB

CREATE INDEX trgm_idx ON outcomes_development_shard_2.outcomes USING gin ((description || ' ' || title || ' ' || label) gin_trgm_ops);
EXPLAIN ANALYSE SELECT COUNT(*) FROM outcomes_development_shard_2.outcomes WHERE (description || ' ' || title || ' ' || label) %> 'multiplicatio';
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=126.70..126.71 rows=1 width=8) (actual time=9.197..9.198 rows=1 loops=1)
   ->  Bitmap Heap Scan on outcomes  (cost=116.02..126.70 rows=3 width=0) (actual time=0.469..9.186 rows=49 loops=1)
         Recheck Cond: ((((((description)::text || ' '::text) || (title)::text) || ' '::text) || (label)::text) %> 'multiplicatio'::text)
         Rows Removed by Index Recheck: 62
         Heap Blocks: exact=59
         ->  Bitmap Index Scan on trgm_idx  (cost=0.00..116.02 rows=3 width=0) (actual time=0.303..0.303 rows=111 loops=1)
               Index Cond: ((((((description)::text || ' '::text) || (title)::text) || ' '::text) || (label)::text) %> 'multiplicatio'::text)
 Planning Time: 0.111 ms
 Execution Time: 9.232 ms
(9 rows)
===

Thanks,
Augusto

--
--
Augusto Callejas | Senior Software Engineer

Re: bitmap heap scan exact/lossy blocks and row removal

От
Tom Lane
Дата:
Augusto Callejas <acallejas@instructure.com> writes:
> I created a GIN index on the following relation and ran an EXPLAIN query on
> a query, and noticed that despite all heap blocks being exact, that the
> outermost bitmap heap scan removed 62 rows after recheck. My understanding
> (mainly from
> https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-lossyexact-pages-for-bitmap-heap-scan)
> is that if there are only exact pages, then there are only tuples in the
> bitmap, so I wouldn't expect to see rows being removed by the recheck.

"exact" in that context just means that the bitmap mechanism didn't
forget any details about which row TIDs were returned by the index.
But the index itself could be lossy, ie it can return some rows that only
*might* match the query.  The recheck mechanism exists mainly for that
purpose --- backstopping a lossy bitmap was something we tacked on later.

An example here is that a trigram index will only tell you which rows
contain the same trigrams that the query string does.  Whether they are
in the right order has to be verified by a recheck of the original query
operator applied to the real data.

            regards, tom lane