Discussion: Problem with tsearch and stopwords

Problem with tsearch and stopwords

From
Joerg Erdmenger
Date:
Hi there,

I have an issue with tsearch. I'm using it as the search mechanism on a
website, building several queries from the words a user has entered: one
query with all the words combined with a logical 'and', and another with all
the words combined with a logical 'or'. In the 'and' case stopwords are
correctly ignored, whereas in the 'or' case they apparently are not, e.g.

select 'tax&and&work'::mquery_txt;
   mquery_txt
----------------
 'tax' & 'work'
(1 row)

select 'tax|and|work'::mquery_txt;
NOTICE:  Query contains only stopword(s) or doesn't contain lexem(s), ignored
 mquery_txt
------------

(1 row)

Am I missing something? Is the only workaround to filter the stopwords myself
before I pass the query to tsearch?

Thanks

Joerg


Re: Problem with tsearch and stopwords

From
Oleg Bartunov
Date:
Well, it seems we treat stop words too formally. A stop word is, by
definition, a word contained in all documents, so a search for a stop word
is always 'true', and any logical 'or' that includes a stop word is therefore
always 'true' as well. That means no useful searching can be done. tsearch2
behaves the same way. We'll fix the problem by ignoring stop words in the
query. I'd recommend using tsearch2
(http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/) instead of tsearch,
which is deprecated and will be made obsolete in a future version of pgsql.
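
Until stop words are ignored in 'or' queries, the workaround raised in the original question (filtering stop words on the application side before the query reaches tsearch) might look roughly like the sketch below. It is only an illustration under stated assumptions: the `STOPWORDS` set and the `build_queries` helper are hypothetical names, the word list is a tiny stand-in for whatever stop-word file the tsearch installation actually uses, and no stemming is attempted.

```python
import re

# Hypothetical stop-word list for illustration only; a real application
# should mirror the stop-word file configured for tsearch.
STOPWORDS = {"a", "an", "and", "of", "or", "the"}


def build_queries(user_input, stopwords=STOPWORDS):
    """Return (and_query, or_query) strings with stop words removed.

    Returns (None, None) when nothing searchable is left, so the caller
    can skip the database round trip instead of sending an empty query.
    """
    # Keep only alphanumeric runs, so operator characters typed by the
    # user (&, |, !, parentheses) never end up in the query string.
    words = re.findall(r"[a-z0-9]+", user_input.lower())
    kept = [w for w in words if w not in stopwords]
    if not kept:
        return None, None
    return "&".join(kept), "|".join(kept)


and_query, or_query = build_queries("tax and work")
print(and_query)  # tax&work
print(or_query)   # tax|work
```

The resulting strings would then be passed to the database as bound parameters and cast to the query type there, rather than being spliced into the SQL text.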

    Oleg
On Thu, 28 Aug 2003, Joerg Erdmenger wrote:

> Hi there,
>
> I have an issue with tsearch. I'm using tsearch as a search mechanism on a
> website making various queries created from the words that a user has put in.
> So I query for all the words put in combined with a logical 'and' and also a
> query with all the words combined with a logical 'or'. In the 'and' case
> stopwords are correctly ignored whereas in the 'or' case they are not
> apparently, e.g.
>
> select 'tax&and&work'::mquery_txt;
>    mquery_txt
> ----------------
>  'tax' & 'work'
> (1 row)
>
> select 'tax|and|work'::mquery_txt;
> NOTICE:  Query contains only stopword(s) or doesn't contain lexem(s), ignored
>  mquery_txt
> ------------
>
> (1 row)
>
> Am I missing something? Is the only workaround to filter the stopwords myself
> before I pass the query to tsearch?
>
> Thanks
>
> Joerg

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83