How to get around LIKE inefficiencies?

Поиск
Список
Период
Сортировка
От The Hermit Hacker
Тема How to get around LIKE inefficiencies?
Дата
Msg-id Pine.BSF.4.21.0011052045440.494-100000@thelab.hub.org
обсуждение исходный текст
Ответы Re: How to get around LIKE inefficiencies?  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: How to get around LIKE inefficiencies?  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Список pgsql-hackers
I'm tryin to figure out how to speed up udmsearch when run under
postgresql, and am being hit by atrocious performance when using a LIKE
query ... the query looks like:

SELECT ndict.url_id,ndict.intag  FROM ndict,url WHERE ndict.word_id=1971739852   AND url.rec_id=ndict.url_id    AND
(url.urlLIKE 'http://www.postgresql.org/%');
 

Take off the AND ( LIKE ) part of the query, finishes almost as soon as
you hit return.  Put it back in, and you can go for coffee before it
finishes ...
If I do 'SELECT url_id FROM ndict WHERE word_id=1971739852', there
are 153 records returned ... is there some way, that I'm not thinking, of
re-writing the above so that it 'resolves' the equality before the LIKE in
order to reduce the number of tuples that it has to do the LIKE on?  Is
there some way of writing the above so that it doesn't take forever to
execute?
I'm running this on a Dual-PIII 450 Server, 512Meg of RAM, zero
swap space being used ... the database has its indices on one hard drive,
the tables themselves are on a second one ... its PgSQL 7.0.2 (Tom,
anything in v7.0.3 that might improve this?) and startup is as:

#!/bin/tcsh
setenv PORT 5432
setenv POSTMASTER /pgsql/bin/postmaster
unlimit
${POSTMASTER} -B 384 -N 192 \             -o "-F -S 32768" \             -i -p ${PORT} -D/pgsql/data >&
/pgsql/logs/postmaster.${PORT}.$$ &
So its not like I'm not throwing alot of resources at this ...
Is there anything that we can do to improve this?  I was trying to
think of some way to use a subselect to narrow the search results, or
something ...
Oh, the above explains down to:

NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..1195.14 rows=1 width=10) ->  Index Scan using url_url on url  (cost=0.00..2.73 rows=1 width=4)
-> Index Scan using n_word on ndict  (cost=0.00..1187.99 rows=353 width=6)
 

EXPLAIN
ndict: 663018 tuples  url:  29276 tuples

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tatsuo Ishii
Дата:
Сообщение: Re: Unicode conversion (Re: [COMMITTERS] pgsql (configure.in))
Следующее
От: Tom Lane
Дата:
Сообщение: Re: How to get around LIKE inefficiencies?