Re: BUG #15172: Postgresql ts_headline with <-> operator does not highlight text properly

From: Pavel Borisov
Subject: Re: BUG #15172: Postgresql ts_headline with <-> operator does not highlight text properly
Msg-id: CALT9ZEGMS_U-dLQLROg5op9va7kjA8qQMRKbUROSWsv2sYec5w@mail.gmail.com
In reply to: Re: BUG #15172: Postgresql ts_headline with <-> operator does not highlight text properly  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #15172: Postgresql ts_headline with <-> operator does not highlight text properly  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-bugs

Hi, Bruce and Tom!

On Sun, 29 Oct 2023 at 00:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > Is this documented somewhere?
>
> The docs [1] only say that ts_headline "returns an excerpt from the
> document in which terms from the query are highlighted".  This
> behavior does not violate that admittedly-weak contract.
>
> IIRC, ts_headline does attempt to find a text fragment or fragments
> that fully satisfy the query (e.g., include an exact phrase match)
> but it will then highlight all the matching words in the fragment,
> not only the location of the phrase match.  I do not agree with the
> OP's opinion that that's wrong.  The highlight-em-all approach has its
> own value, and in any case it may not be possible to find a full match
> that satisfies the function's other constraints such as MaxWords.
> Refusing to highlight anything in that event would be unhelpful.
>
>                         regards, tom lane

I think the main purpose of ts_headline is to make Postgres friendlier
to a search-engine-like approach, which I feel is too niche a usage
scenario to support as part of the core code. If I remember right, bug
reports from users who assume it has stricter semantics than it
actually has come in regularly. I also remember being puzzled by its
unusual output myself in the past.

If we fiddle with the other parameters of ts_headline, we can easily
get other kinds of output that seem counterintuitive, e.g.:
SELECT ts_headline('English',
                   'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does',
                   ('''equiti'' <-> ''bank''')::tsquery,
                   'MaxWords=30, MinWords=2');
   ts_headline
-----------------
 This Commercial
(1 row)
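
For comparison, here is a minimal sketch of two follow-up queries that
could help separate the effect of the options from the effect of the
phrase query itself. It assumes the stock 'english' configuration and
uses to_tsquery() instead of the hand-written lexemes above; I am not
pasting output, since it may differ between versions. As far as I can
tell, this document never has the two lexemes adjacent, so the second
query is a way to check whether any fragment could fully satisfy the
phrase at all.

```sql
-- Same document and phrase query, but with default ts_headline options
-- (no MaxWords/MinWords override), to see the plain highlighting behavior.
SELECT ts_headline('english',
                   'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does',
                   to_tsquery('english', 'equity <-> bank'));

-- Check whether the document matches the phrase query at all.
-- If this is false, ts_headline has no full match to build a fragment around.
SELECT to_tsvector('english',
                   'This Commercial Bank does not have any Equity in Europe but European Commercial Bank does')
       @@ to_tsquery('english', 'equity <-> bank') AS phrase_matches;
```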

What do you think about clearly deprecating this feature in the docs,
while still leaving it working as it is?

Kind regards,
Pavel Borisov,
Supabase.


