text search query planning

Поиск
Список
Период
Сортировка
От Henk van Lingen
Тема text search query planning
Дата
Msg-id 20091119110447.GA17878@uu.nl
обсуждение исходный текст
Список pgsql-general
Hi,

I've a problem understanding of slow query. I use 8.4 for storing
syslog messages, on which i want to do text searching. This table:

syslog=# \d fw_msgs
   Table "public.fw_msgs"
  Column  | Type | Modifiers
----------+------+-----------
 datetime | text |
 facility | text |
 level    | text |
 host     | text |
 program  | text |
 pid      | text |
 message  | text |
Indexes:
    "fw_msgs_datetime_idx" btree (datetime)
    "fw_msgs_facility_idx" btree (facility)
    "fw_msgs_host_idx" btree (host)
    "fw_msgs_idx" gin (to_tsvector('english'::regconfig, message))
    "fw_msgs_program_idx" btree (program)

contains millions of records.

This query is fast:

syslog=# explain select datetime,facility,level,host,program,pid,message from fw_msgs where to_tsvector('english',
message)@@ to_tsquery ('131.211.1.124')  limit 3000; 
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Limit  (cost=17658.01..25225.08 rows=3000 width=176)
   ->  Bitmap Heap Scan on fw_msgs  (cost=17658.01..181537.99 rows=64971 width=176)
         Recheck Cond: (to_tsvector('english'::regconfig, message) @@ to_tsquery('131.211.1.124'::text))
         ->  Bitmap Index Scan on fw_msgs_idx  (cost=0.00..17641.77 rows=64971 width=0)
               Index Cond: (to_tsvector('english'::regconfig, message) @@ to_tsquery('131.211.1.124'::text))
(5 rows)

However, when I want to order by datetime, it takes for ever. It seems the db
insist on first sorting on datetime instead of using the gin index:

syslog=# explain select datetime,facility,level,host,program,pid,message from fw_msgs where to_tsvector('english',
message)@@ to_tsquery ('131.211.1.124')  order by datetime limit 3000; 
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..44985.68 rows=3000 width=176)
   ->  Index Scan using fw_msgs_datetime_idx on fw_msgs  (cost=0.00..974644.66 rows=64997 width=176)
         Filter: (to_tsvector('english'::regconfig, message) @@ to_tsquery('131.211.1.124'::text))
(3 rows)

doing something like

select datetime,facility,level,host,program,pid,message from fw_msgs where message in (select message from fw_msgs
whereto_tsvector('english', message) @@ to_tsquery ('131.211.1.124') ) order by datetime desc limit 3000; 


doesn't help.

Any ideas?

Regards,
--
Henk van Lingen,                                              (o-      -+
Netwerk & Telefonie, ICT Service Center                       /\        |
Universiteit Utrecht, Jenalaan 18a, kamer 0.12                v_/_
http://henk.vanlingen.net/             http://www.tuxtown.net/netiquette/

В списке pgsql-general по дате отправления:

Предыдущее
От: Richard Huxton
Дата:
Сообщение: Re: Can anyone help setting up pgbouncer?
Следующее
От: Pedro Doria Meunier
Дата:
Сообщение: obtaining ARRAY position for a given match