Re: phrase search

From: Teodor Sigaev
Subject: Re: phrase search
Msg-id: 488629FB.2030501@sigaev.ru
In reply to: Re: phrase search (Oleg Bartunov <oleg@sai.msu.su>)
List: pgsql-hackers
>> 1. What is the meaning of such a query operator?
>>
>> foo #5 bar -> true if the document has word "foo" followed by "bar" at
>> 5th position.
>>
>> foo #<5 bar -> true if the document has word "foo" followed by "bar" within
>> 5 positions
>>
>> foo #>5 bar -> true if document has word "foo" followed by "bar" after 5
>> positions

Sounds good, but maybe it's overkill.
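
To make the three proposed operators concrete, here is a minimal sketch in Python of how they could be evaluated over word-position lists such as those stored in a tsvector. The function name `distance_match` and the operator codes are hypothetical. It also assumes that "#5" means a distance of exactly 5, "#<5" a distance of at most 5, and "#>5" a distance greater than 5; the proposal leaves these boundaries open.

```python
def distance_match(left_pos, right_pos, op, n):
    """Return True if some occurrence of the right word follows some
    occurrence of the left word at a distance that satisfies the operator.

    left_pos, right_pos: 1-based word positions of each lexeme in a document.
    op: "=" for '#n', "<" for '#<n', ">" for '#>n' (hypothetical codes).
    """
    if op not in ("=", "<", ">"):
        raise ValueError(f"unknown operator: {op!r}")
    for l in left_pos:
        for r in right_pos:
            d = r - l
            if d <= 0:
                continue  # "followed by": the right word must come later
            if op == "=" and d == n:
                return True
            if op == "<" and d <= n:  # assumption: "within n" includes n
                return True
            if op == ">" and d > n:
                return True
    return False


# "foo" at position 1, "bar" at position 6: exactly 5 positions apart.
print(distance_match([1], [6], "=", 5))
# "bar" precedes "foo", so no operator matches.
print(distance_match([6], [1], "<", 5))
```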

>> etc .....
>>
>> 2. How to implement such query operators?
>>
>> Should we modify QueryItem to include additional distance information or
>> is there any other way to accomplish it?
>>
>> Is the following list sufficient to accomplish this?
>> a. Modify to_tsquery
>> b. Modify TS_execute in tsvector_op.c to check new operator
Exactly

>>
>> Is there anything needed in rewrite subsystem?
Yes, of course: the rewrite system should support that operation.

>>
>> 3. Are these valid uses of the operators and if yes what would they
>> mean?
>>
>> foo #5 (bar & cup)
It must be supported, because lexize might return a sub-tsquery. For example, the
Russian ispell dictionary can return several lexemes: "adfg" can become 'adf | adfs |
ad'. Norwegian and German are more complicated: "abc" -> "(ab & c) |
(a & bc) | abc".
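
One possible way to handle a sub-query operand is to push the distance operator down onto each lexeme. This is only an assumption for illustration, since the message says such operands must be supported but not how. In the Python sketch below, the tuple-based query representation and the names `push_down` and `show` are hypothetical.

```python
def push_down(node):
    """Distribute a distance operator over '&' and '|' operands.

    Nodes are tuples (a hypothetical representation):
      ("word", text), ("and", a, b), ("or", a, b), ("dist", n, left, right)
    """
    kind = node[0]
    if kind == "word":
        return node
    if kind in ("and", "or"):
        return (kind, push_down(node[1]), push_down(node[2]))
    if kind != "dist":
        raise ValueError(f"unknown node kind: {kind!r}")
    _, n, left, right = node
    left, right = push_down(left), push_down(right)
    if right[0] in ("and", "or"):
        # foo #n (a & b)  ->  (foo #n a) & (foo #n b), likewise for '|'
        return (right[0],
                push_down(("dist", n, left, right[1])),
                push_down(("dist", n, left, right[2])))
    if left[0] in ("and", "or"):
        # (a | b) #n foo  ->  (a #n foo) | (b #n foo), likewise for '&'
        return (left[0],
                push_down(("dist", n, left[1], right)),
                push_down(("dist", n, left[2], right)))
    return ("dist", n, left, right)


def show(node):
    """Render a node in the notation used in this thread."""
    kind = node[0]
    if kind == "word":
        return node[1]
    if kind == "dist":
        return f"{show(node[2])} #{node[1]} {show(node[3])}"
    sym = "&" if kind == "and" else "|"
    return f"({show(node[1])}) {sym} ({show(node[2])})"


query = ("dist", 5, ("word", "foo"), ("and", ("word", "bar"), ("word", "cup")))
print(show(push_down(query)))  # prints "(foo #5 bar) & (foo #5 cup)"
```

After this rewrite every distance operator has plain lexemes on both sides, so the executor only has to compare two position lists. Whether the same distance should apply to each part of a split compound word is a separate question that this sketch does not answer.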


>> 4. If the operator only applies to two query items can we create an
>> index such that (foo, bar)-> documents[min distance, max distance]
>> How difficult it is to implement an index like this?
No, the index should execute the query 'foo & bar' and set the recheck flag to
true, so that 'foo #<5 bar' is then executed on the original tsvector from the table.
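
The two-phase approach described above can be sketched in Python as follows. This is a toy model under stated assumptions, not PostgreSQL code: the "index" is a plain dictionary from lexeme to document ids, each document's tsvector is modeled as a dictionary from lexeme to positions, and the names `build_index`, `within`, and `search_within` are hypothetical. "Within n" is assumed to include a distance of exactly n.

```python
def build_index(docs):
    """Map each lexeme to the set of document ids containing it.
    Positions are deliberately not stored in the index."""
    index = {}
    for doc_id, vector in docs.items():
        for lexeme in vector:
            index.setdefault(lexeme, set()).add(doc_id)
    return index


def within(vector, left, right, n):
    """Recheck step: is `left` followed by `right` within n positions?"""
    return any(0 < r - l <= n
               for l in vector.get(left, ())
               for r in vector.get(right, ()))


def search_within(docs, index, left, right, n):
    # Phase 1: the index answers the weaker query 'left & right'.
    candidates = index.get(left, set()) & index.get(right, set())
    # Phase 2: recheck the distance condition on the stored vectors.
    return sorted(d for d in candidates if within(docs[d], left, right, n))


docs = {
    1: {"foo": [1], "bar": [3]},   # bar follows foo within 5 positions
    2: {"foo": [1], "bar": [30]},  # both words present, but too far apart
    3: {"foo": [2]},               # no bar at all
}
index = build_index(docs)
print(search_within(docs, index, "foo", "bar", 5))  # prints [1]
```

Document 2 passes the index phase and is only rejected on recheck, which is the cost of keeping distance information out of the index.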

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 

