On Sat, 18 Aug 2007, Mike Rylander wrote:
> On 8/18/07, tomas@tuxteam.de <tomas@tuxteam.de> wrote:
>> On Fri, Aug 17, 2007 at 04:06:15PM -0700, Josh Berkus wrote:
>>> Bruce,
>>>
>>>> Oh, so you want the config inside each tsvector value. Interesting
>>>> idea.
>>>
>>> Yeah, hasn't anyone suggested this before? It seems like the obvious
>>> solution. A TSvector constructed with en_US is NOT the same as a vector
>>> constructed with fr_FR and it's silly to pretend that they are comparable.
>>
>> Except that (as I understand Oleg) it even seems to make sense sometimes
>> to compare tsvectors constructed with different configs -- so it might
>> be important not to prevent this use case either. Oleg?
>
> Configs are not simply about languages, they are also about stopword
> lists and stemmers and parsers, and there's no reason to think that
> one would be using only one configuration to create a single tsvector.
>
> Different fields from within one document may require different
> treatment. Take for instance title, with stopwords included, and
> body, with them removed. Those two initial tsvectors can then be
> concatenated together with different weights to provide a very rich,
> and simple (relatively speaking) search infrastructure.
I couldn't have said it better, Mike!
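
To make the title/body case concrete, here is a minimal sketch. It assumes
the two-argument to_tsvector(config, text) form, with the 'simple' config
(no stopword removal) for the title and the 'english' config (stopwords
removed, words stemmed) for the body. The table "docs" and its contents
are hypothetical, made up only for illustration.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TEMP TABLE docs (title text, body text);
INSERT INTO docs VALUES
    ('To Be or Not to Be', 'The prince is considering the question.');

-- Title: 'simple' keeps every word, including would-be stopwords.
-- Body:  'english' drops stopwords and stems the remaining words.
-- setweight() labels each part, and || concatenates the two tsvectors
-- into a single value that was built with two different configs.
SELECT setweight(to_tsvector('simple',  title), 'A') ||
       setweight(to_tsvector('english', body),  'B') AS tsv
FROM docs;
```

The resulting value has no single config it could be tagged with, which
is the point Mike makes above.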
Regards, Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83