Hi, Bruce and Tom!
On Sun, 29 Oct 2023 at 00:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Bruce Momjian <bruce@momjian.us> writes:
> > Is this documented somewhere?
>
> The docs [1] only say that ts_headline "returns an excerpt from the
> document in which terms from the query are highlighted". This
> behavior does not violate that admittedly-weak contract.
>
> IIRC, ts_headline does attempt to find a text fragment or fragments
> that fully satisfy the query (e.g., include an exact phrase match)
> but it will then highlight all the matching words in the fragment,
> not only the location of the phrase match. I do not agree with the
> OP's opinion that that's wrong. The highlight-em-all approach has its
> own value, and in any case it may not be possible to find a full match
> that satisfies the function's other constraints such as MaxWords.
> Refusing to highlight anything in that event would be unhelpful.
>
> regards, tom lane
I think the main purpose of ts_headline is to make Postgres friendlier
to a search-engine-like approach, which I feel is too niche a usage
scenario to support as part of the core code. If I remember right,
bug reports from users who assume it has stricter semantics than it
really does come in regularly. I also remember being puzzled by its
unusual output myself in the past.
If we fiddle with the other parameters of ts_headline, we can easily
get other kinds of output that seem counterintuitive, e.g.:
SELECT ts_headline('English',
'This Commercial Bank does not have any
Equity in Europe but European Commercial Bank does',
('''equiti'' <-> ''bank''')::tsquery, 'MaxWords=30, MinWords=2');
 ts_headline
-----------------
 This Commercial
(1 row)
What do you think about clearly deprecating this feature in the docs,
while still leaving it working as it is?
Kind regards,
Pavel Borisov,
Supabase.