Re: text search: restricting the number of parsed words in headline generation

From	Sushant Sinha
Subject	Re: text search: restricting the number of parsed words in headline generation
Date	August 24, 2011, 01:28:57
Msg-id	1314149321.1819.11.camel@dragflick
In reply to	Re: text search: restricting the number of parsed words in headline generation (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: text search: restricting the number of parsed words in headline generation (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

> > Here is a simple patch that limits the number of words during the
> > tokenization phase and puts an upper-bound on the headline generation.
> 
> Doesn't this force the headline to be taken from the first N words of
> the document, independent of where the match was?  That seems rather
> unworkable, or at least unhelpful.
> 
>             regards, tom lane

The headline generation function has no index or other knowledge of
where the match is. We discover the matches by first tokenizing the
document and then comparing its tokens with the query tokens. So it is
hard to do anything better than the first N words.
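
To make the trade-off concrete, here is a toy sketch in Python of capping
the tokenization phase. It is not the PostgreSQL code or the patch itself:
the regex tokenizer, the function names and the max_words parameter are all
assumptions made for illustration.

```python
import re

def tokenize(text, max_words=None):
    """Yield (position, lowercased word) pairs, stopping after max_words tokens.

    Hypothetical stand-in for the real parser; max_words=None means no limit.
    """
    for pos, m in enumerate(re.finditer(r"\w+", text)):
        if max_words is not None and pos >= max_words:
            break
        yield pos, m.group(0).lower()

def match_positions(text, query_terms, max_words=None):
    """Return token positions that equal a query term.

    Matches are only discovered by comparing tokens with the query terms,
    so anything beyond the first max_words tokens is never seen.
    """
    terms = {t.lower() for t in query_terms}
    return [pos for pos, word in tokenize(text, max_words) if word in terms]

doc = "alpha beta gamma delta epsilon zeta match eta"
unlimited = match_positions(doc, ["match"])             # whole document scanned
limited = match_positions(doc, ["match"], max_words=5)  # match lies past the cap
```

With the cap in place the work is bounded, but a match that occurs after
the first max_words tokens is simply not found, which is the concern raised
in the quoted text.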


One option would be to start looking for a "good match" while
tokenizing and to stop once we have found one. Currently the
algorithms that decide what a good match is operate independently of the
tokenization, and there are two of them, so integrating them would not be
easy.
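
As a rough illustration of that option, the following Python sketch stops
reading tokens as soon as it has seen a "good match". The definition used
here (every query term within a span of at most `window` tokens) is only an
assumption for the example; it is not either of the existing algorithms,
and the function name is hypothetical.

```python
import re

def find_cover_while_tokenizing(text, query_terms, window=10):
    """Scan tokens once and stop at the first "good match".

    Returns (start, end, tokens_read) for the first span of at most
    `window` tokens that contains every query term, or None if the
    whole text is read without finding one.
    """
    terms = {t.lower() for t in query_terms}
    last_seen = {}  # query term -> most recent token position
    # re.finditer is lazy, so returning early skips the rest of the text.
    for pos, m in enumerate(re.finditer(r"\w+", text)):
        word = m.group(0).lower()
        if word in terms:
            last_seen[word] = pos
            if len(last_seen) == len(terms):
                start = min(last_seen.values())
                if pos - start + 1 <= window:
                    return start, pos, pos + 1
    return None

text = "a b cat c dog d e f cat g"
found = find_cover_while_tokenizing(text, ["cat", "dog"], window=3)
missed = find_cover_while_tokenizing(text, ["cat", "dog"], window=2)
```

In this toy version the match test and the tokenizer share one loop, which
is exactly the kind of coupling that would have to be introduced between
the tokenization and the match-selection code.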

The patch is very helpful if you accept the common-case assumption
that most of the time a good match is at the top of the document.
Typically a search application generates headlines for the top matches of
a query, i.e., those in which the query terms appear frequently. So
there should be at least one or two good text excerpt matches at the top
of the document.



-Sushant.
