Обсуждение: bitmap index scan problem?

Поиск
Список
Период
Сортировка

bitmap index scan problem?

От
stig erikson
Дата:
hi
we have a table with some 30M records.
running PG8.1.4. on linux.

when we run with enable_bitmapscan true, PG begins by doing a bitmap index scan plus a BitmapAnd.

when we run with enable_bitmapscan false, PG finds a better index directly and chooses a much better plan.


below is some data, the query and the plans.
as you can see, when using the Index Scan PG finds 62 rows.
but when using Bitmap Index Scan it finds 2563790 + 506 rows which should never be better!?

the question is simply why the planner is not smart enough to skip the bitmap scan if normal operation is faster.




stat=# \d stats
                      Table "public.stats"
   Column   |            Type             |     Modifiers
-----------+-----------------------------+--------------------
  id        | bigint                      | not null default 0
  timestamp | timestamp without time zone |
  aid       | integer                     |
  i         | integer                     |
  ct        | integer                     |
  total     | bigint                      |
  bid       | integer                     | not null default 0
Indexes:
     "id_idx" btree (id)
     "bid_index" btree (bid)
     "ct_index" btree (ct)



--GOOD PLAN FIRST--
stat=# set enable_bitmapscan to false;
SET
Time: 0.645 ms
stat=# explain analyze select aid, ct, sum(total) from stats where ct='90' and bid=17675 GROUP BY aid, ct;
                                                           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=28713.95..28714.63 rows=54 width=16) (actual time=1.072..1.080 rows=3 loops=1)
    ->  Index Scan using bid_index on stats  (cost=0.00..28709.92 rows=538 width=16) (actual time=0.100..0.804 rows=62
loops=1)
          Index Cond: (bid = 17675)
          Filter: (ct = 90)
  Total runtime: 1.163 ms
(5 rows)

Time: 2.692 ms




--NOW THE BAD PLAN--
stat=# set enable_bitmapscan to true;
SET
Time: 2.775 ms
stat=# explain analyze select aid, ct, sum(total) from stats where ct='90' and bid=17675 GROUP BY aid, ct;
                                                                      QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate  (cost=15848.76..15849.44 rows=54 width=16) (actual time=13210.811..13210.818 rows=3 loops=1)
    ->  Bitmap Heap Scan on stats  (cost=13754.80..15844.73 rows=538 width=16) (actual time=13206.714..13210.525
rows=62loops=1) 
          Recheck Cond: ((bid = 17675) AND (ct = 90))
          ->  BitmapAnd  (cost=13754.80..13754.80 rows=538 width=0) (actual time=13206.659..13206.659 rows=0 loops=1)
                ->  Bitmap Index Scan on bid_index  (cost=0.00..44.51 rows=7576 width=0) (actual time=0.137..0.137
rows=506loops=1) 
                      Index Cond: (bid = 17675)
                ->  Bitmap Index Scan on ct_index  (cost=0.00..13710.04 rows=2409440 width=0) (actual
time=13206.216..13206.216rows=2563790 loops=1) 
                      Index Cond: (ct = 90)
  Total runtime: 13210.918 ms
(9 rows)

Time: 13212.121 ms




/stig

Re: bitmap index scan problem?

От
Tom Lane
Дата:
stig erikson <stigerikson_nospam_@yahoo.se> writes:
> the question is simply why the planner is not smart enough to skip the bitmap scan if normal operation is faster.

Probably because it hasn't got good statistics about the distribution of
"bid":

>                 ->  Bitmap Index Scan on bid_index  (cost=0.00..44.51 rows=7576 width=0) (actual time=0.137..0.137
rows=506loops=1) 
>                       Index Cond: (bid = 17675)

When the rowcount estimate is off by more than a factor of 10, the costs
are going to be wrong too.  Try increasing the statistics target for this
table.

            regards, tom lane

Re: bitmap index scan problem?

От
stig erikson
Дата:
Tom Lane wrote:
> stig erikson <stigerikson_nospam_@yahoo.se> writes:
>> the question is simply why the planner is not smart enough to skip the bitmap scan if normal operation is faster.
>
> Probably because it hasn't got good statistics about the distribution of
> "bid":
>
>>                 ->  Bitmap Index Scan on bid_index  (cost=0.00..44.51 rows=7576 width=0) (actual time=0.137..0.137
rows=506loops=1) 
>>                       Index Cond: (bid = 17675)
>
> When the rowcount estimate is off by more than a factor of 10, the costs
> are going to be wrong too.  Try increasing the statistics target for this
> table.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

Hi.
thank you for your answer.
the last example was not 100% according to what my intention was, even though it shows the problem.
here is a better example.
i have not yet changed the default statistics target, how do i find out what i should change it to?

again i am quite confused that PG does not use the bid_index when enable_bitmapscan is true.

some other information:
- there are around 30 000 000 rows in total.

- there are usually 28-35 different values for ct at any given time, the number of times each value occurs varies from
   less then 10 to over 10 000 000.

- there are usually 23 000 different values for bid at any given time. the number of times each value occurs varies
from
   100 to 20 000.




stat=# show default_statistics_target;
  default_statistics_target
---------------------------
  10
(1 row)

stat=# VACUUM FULL ANALYZE stats ;
VACUUM

stat=# set enable_bitmapscan to 1;
SET
stat=# explain analyze select aid, ct, sum(total) from stats where (ct='90' OR ct='212') and bid=17675 GROUP BY aid,
ct;
                                                                         QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=18149.28..18149.94 rows=53 width=16) (actual time=14458.638..14458.644 rows=3 loops=1)
    ->  Bitmap Heap Scan on stats  (cost=15786.50..18144.74 rows=605 width=16) (actual time=14033.693..14458.260
rows=62loops=1) 
          Recheck Cond: ((bid = 17675) AND ((ct = 90) OR (ct = 212)))
          ->  BitmapAnd  (cost=15786.50..15786.50 rows=608 width=0) (actual time=14026.953..14026.953 rows=0 loops=1)
                ->  Bitmap Index Scan on bid_index  (cost=0.00..44.10 rows=7456 width=0) (actual time=79.293..79.293
rows=506loops=1) 
                      Index Cond: (bid = 17675)
                ->  BitmapOr  (cost=15742.16..15742.16 rows=2766331 width=0) (actual time=13947.348..13947.348 rows=0
loops=1)
                      ->  Bitmap Index Scan on ct_index  (cost=0.00..14675.92 rows=2579119 width=0) (actual
time=13526.774..13526.774rows=2563790 loops=1) 
                            Index Cond: (ct = 90)
                      ->  Bitmap Index Scan on ct_index  (cost=0.00..1066.24 rows=187212 width=0) (actual
time=420.564..420.564rows=374354 loops=1) 
                            Index Cond: (ct = 212)
  Total runtime: 14458.747 ms
(12 rows)



stat=# set enable_bitmapscan to 0;
SET
stat=# explain analyze select aid, ct, sum(total) from stats where (ct='90' OR ct='212') and bid=17675 GROUP BY aid,
ct;
                                                           QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=28152.82..28153.48 rows=53 width=16) (actual time=7.759..7.768 rows=3 loops=1)
    ->  Index Scan using bid_index on stats  (cost=0.00..28148.28 rows=605 width=16) (actual time=0.100..7.483 rows=62
loops=1)
          Index Cond: (bid = 17675)
          Filter: ((ct = 90) OR (ct = 212))
  Total runtime: 7.858 ms
(5 rows)




/stig