Inoptimal query plan for max() and multicolumn index

Поиск
Список
Период
Сортировка
От Vladimir Kulev
Тема Inoptimal query plan for max() and multicolumn index
Дата
Msg-id BANLkTimsNMhnAHk6Wz7gARfdkTmz2QKJ7g@mail.gmail.com
обсуждение исходный текст
Ответы Re: Inoptimal query plan for max() and multicolumn index  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: Inoptimal query plan for max() and multicolumn index  (Gaetano Mendola <mendola@gmail.com>)
Re: Inoptimal query plan for max() and multicolumn index  (Gaetano Mendola <mendola@gmail.com>)
Список pgsql-performance
Hi all!
Please, just look at these query explanations and try to explain why
planner does so (PostgreSQL 8.4).
There is an index on table sms (number, timestamp).

And three fast & simple queries:
=# explain analyze select max(timestamp) from sms where number='5502712';

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=3.79..3.80 rows=1 width=0) (actual time=0.269..0.270
rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..3.79 rows=1 width=8) (actual
time=0.259..0.260 rows=1 loops=1)
           ->  Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.253..0.253
rows=1 loops=1)
                 Index Cond: ((number)::text = '5502712'::text)
                 Filter: ("timestamp" IS NOT NULL)
 Total runtime: 0.342 ms

=# explain analyze select max(timestamp) from sms where number='5802693';

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=3.79..3.80 rows=1 width=0) (actual time=0.425..0.426
rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..3.79 rows=1 width=8) (actual
time=0.413..0.414 rows=1 loops=1)
           ->  Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.409..0.409
rows=1 loops=1)
                 Index Cond: ((number)::text = '5802693'::text)
                 Filter: ("timestamp" IS NOT NULL)
 Total runtime: 0.513 ms

=# explain analyze select max(timestamp) from sms where number='5802693';

-------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=3.79..3.80 rows=1 width=0) (actual time=0.425..0.426
rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.00..3.79 rows=1 width=8) (actual
time=0.413..0.414 rows=1 loops=1)
           ->  Index Scan Backward using sms_number_timestamp on sms
(cost=0.00..5981.98 rows=1579 width=8) (actual time=0.409..0.409
rows=1 loops=1)
                 Index Cond: ((number)::text = '5802693'::text)
                 Filter: ("timestamp" IS NOT NULL)
 Total runtime: 0.513 ms



But this does not work:
# explain analyze select max(timestamp) from sms where number in
('5502712','5802693','5801981');

------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=15912.30..15912.31 rows=1 width=8) (actual
time=587.952..587.954 rows=1 loops=1)
   ->  Bitmap Heap Scan on sms  (cost=1413.02..15758.71 rows=61432
width=8) (actual time=34.266..491.853 rows=59078 loops=1)
         Recheck Cond: ((number)::text = ANY
('{5502712,5802693,5801981}'::text[]))
         ->  Bitmap Index Scan on sms_number_timestamp
(cost=0.00..1397.67 rows=61432 width=0) (actual time=30.778..30.778
rows=59078 loops=1)
               Index Cond: ((number)::text = ANY
('{5502712,5802693,5801981}'::text[]))
 Total runtime: 588.199 ms

And this too:
# explain analyze select max(timestamp) from sms where
number='5502712' or number='5802693' or number='5801981';

------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=16205.75..16205.76 rows=1 width=8) (actual
time=851.204..851.205 rows=1 loops=1)
   ->  Bitmap Heap Scan on sms  (cost=1473.31..16052.17 rows=61432
width=8) (actual time=68.233..745.004 rows=59090 loops=1)
         Recheck Cond: (((number)::text = '5502712'::text) OR
((number)::text = '5802693'::text) OR ((number)::text =
'5801981'::text))
         ->  BitmapOr  (cost=1473.31..1473.31 rows=61592 width=0)
(actual time=64.992..64.992 rows=0 loops=1)
               ->  Bitmap Index Scan on sms_number_timestamp
(cost=0.00..40.27 rows=1579 width=0) (actual time=0.588..0.588 rows=59
loops=1)
                     Index Cond: ((number)::text = '5502712'::text)
               ->  Bitmap Index Scan on sms_number_timestamp
(cost=0.00..40.27 rows=1579 width=0) (actual time=0.266..0.266 rows=59
loops=1)
                     Index Cond: ((number)::text = '5802693'::text)
               ->  Bitmap Index Scan on sms_number_timestamp
(cost=0.00..1346.69 rows=58434 width=0) (actual time=64.129..64.129
rows=58972 loops=1)
                     Index Cond: ((number)::text = '5801981'::text)
 Total runtime: 853.176 ms


According to planner cost estimations - it has enough data to
understand that it is better to aggregate maximum from three
subqueries. I suppose it's not a bug but not implemented feature -
maybe there is already something about it on roadmap?


--
Vladimir Kulev
Mobile: +7 (921) 555-44-22

Jabber: me@lightoze.net

Skype: lightoze

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jose Ildefonso Camargo Tolosa
Дата:
Сообщение: Re: Large rows number, and large objects
Следующее
От: Julius Tuskenis
Дата:
Сообщение: Re: generating a large XML document