Hi,
I tried to implement fulltext search using linguistic approach,
for example, using ispell like udmsearch does. We also save position
information of each lexem in document to calculate relevancy
(it's C-function using SPI-interface). We're still testing different
strategies but found several problems with optimizer, just look at plan -
very strange numbers and no indices used) (I did run vacuume analyze)
explain
select txt.tid
from txt, txt_lexem1 tl1_0, txt_lexem11 tl11_0
where
tl1_0.lid =17700
OR
tl11_0.lid =172751
;
NOTICE: QUERY PLAN:
Nested Loop (cost=0.00..16275952180.00 rows=512819420786 width=12) -> Nested Loop (cost=0.00..891369556.42
rows=512819421width=8) -> Seq Scan on txt_lexem11 tl11_0 (cost=0.00..2596.92 rows=132292 width=4) -> Seq
Scanon txt_lexem1 tl1_0 (cost=0.00..3815.95 rows=194795 width=4) -> Seq Scan on txt (cost=0.00..20.00 rows=1000
width=4)
EXPLAIN
fulltext=# \d txt Table "txt"Attribute | Type | Modifier
-----------+---------+----------tid | integer | not null
Index: txt_pkey
tables txt_lexemX look like:
fulltext=# \d txt_lexem1 Table "txt_lexem1"Attribute | Type | Modifier
-----------+-----------+----------tid | integer | not nulllid | integer | not nulldid | integer
|not nullcount | integer | not nullpos | integer[] | not null
Index: txt_lexem1_key
We have rewrite using EXISTS and plan looks better !
select txt.tid
from txt
where
EXISTS ( select tid from txt_lexem1 tl1_0 where tl1_0.lid=17700 and tl1_0.did=0
and txt.tid=tl1_0.tid )
OR
EXISTS ( select tid from txt_lexem11 tl11_0 where tl11_0.lid=172751 and
tl11_0.did=0 and txt.tid=tl11_0.tid )
;
NOTICE: QUERY PLAN:
Seq Scan on txt (cost=0.00..7416.48 rows=1000 width=4) SubPlan -> Index Scan using txt_lexem1_key on txt_lexem1
tl1_0 (cost=0.00..3.95 rows=1 width=4) -> Index Scan using txt_lexem11_key on txt_lexem11 tl11_0 (cost=0.00..3.45
rows=1width=4)
EXPLAIN
I've tested on plain 7.0.2 and CVS version.
I remind there was old problem with OR. Does optimizer still has
such problem ?
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83