Re: fts, compond words?

From Teodor Sigaev
Subject Re: fts, compond words?
Date
Msg-id 43980BE7.9000601@sigaev.ru
In reply to Re: fts, compond words?  (Mike Rylander <mrylander@gmail.com>)
Responses Re: fts, compond words?  (Mike Rylander <mrylander@gmail.com>)
List pgsql-general
> hrm... that is a problem.  Though, I think that's a case of how the
> compiled expression is built from user input.  Unless I'm mistaken
>
>   a + ( foo1 | foo2 )
>
> is exactly equal to
>
>   (a + foo1) | (a + foo2)
>
>
> Ahhh... but then there is the more complex example of
>
>   a + foonish + bar
>
> becoming
>
>   a + (foo1 | foo2) + bar
>
> .... but I guess that could be
>
> (a + foo1 + bar) | (a + foo2 + bar)

That's a simple case, but what about languages such as Norwegian or German?
They have compound words, and the ispell dictionary can split them into lexemes.
But usually there is more than one way to split a word:

forbruksvaremerkelov
    forbruk vare merke lov
    forbruk vare merkelov
    forbruk varemerke lov
    forbruk varemerkelov
    forbruksvare merke lov
    forbruksvare merkelov
(Note: I don't know the translation, it's just an example. When we were working
on compound word support we found a word that has 24 variants of separation!)
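
To make the splitting concrete, here is a toy sketch in Python of how such
variants can be enumerated. It is not the ispell dictionary's actual algorithm:
the stem list and the optional linking "s" are assumptions chosen only so that
the output reproduces the six splits listed above.

```python
# Hypothetical stem list; a real ispell dictionary would supply these.
STEMS = {
    "forbruk", "forbruksvare",
    "vare", "varemerke", "varemerkelov",
    "merke", "merkelov", "lov",
}

def splits(word, stems, linkers=("s",)):
    """Return every way to cut `word` into known stems.

    A linker (assumed here to be a single "s") may be skipped between
    two stems, but never at the very end of the word.
    """
    if not word:
        return [[]]
    results = []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head not in stems:
            continue
        rest = word[end:]
        # Continue either directly after the stem or after a linker.
        tails = [rest]
        for linker in linkers:
            if rest.startswith(linker) and len(rest) > len(linker):
                tails.append(rest[len(linker):])
        for tail in tails:
            for parts in splits(tail, stems, linkers):
                results.append([head] + parts)
    return results

variants = splits("forbruksvaremerkelov", STEMS)
for parts in variants:
    print(" ".join(parts))
```

Every extra stem in the dictionary that happens to match a prefix adds another
branch to the recursion, which is how a single word can end up with dozens of
variants.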

So the query 'a + forbruksvaremerkelov' becomes awful:

a + ( (forbruk & vare & merke & lov) | (forbruk & vare & merkelov) | ... )

Of course, that is just an example off the top of my head, but a solution for
phrase search should work reasonably with such corner cases.
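
A rough Python sketch of the blow-up, under two assumptions that are not
settled anywhere in this thread: that '+' distributes over '|' as in the quoted
message, and that the parts of a split compound are treated as consecutive
phrase positions. The names expand_phrase and render are made up for the
illustration.

```python
from itertools import product

# The six splits from the example above, written out by hand.
SPLITS = [
    ["forbruk", "vare", "merke", "lov"],
    ["forbruk", "vare", "merkelov"],
    ["forbruk", "varemerke", "lov"],
    ["forbruk", "varemerkelov"],
    ["forbruksvare", "merke", "lov"],
    ["forbruksvare", "merkelov"],
]

def expand_phrase(positions):
    """Distribute '+' over '|'.

    `positions` holds one entry per query word; each entry is a list of
    alternatives, and each alternative is a list of lexemes. The result
    has one flat lexeme sequence per combination of alternatives.
    """
    return [
        [lexeme for alternative in combo for lexeme in alternative]
        for combo in product(*positions)
    ]

def render(sequences):
    """Format the expanded sequences as an OR of '+' phrases."""
    return " | ".join("(" + " + ".join(seq) + ")" for seq in sequences)

# 'a + forbruksvaremerkelov': one plain word, one compound with six splits.
expanded = expand_phrase([[["a"]], SPLITS])
print(len(expanded))
print(render(expanded[:2]))

# Two compounds with 24 splits each multiply to 24 * 24 alternatives.
many = [[str(i)] for i in range(24)]
print(len(expand_phrase([many, many])))
```

The number of alternatives is the product of the split counts of all words in
the phrase, so expanding the query up front gets expensive quickly.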



--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
