Re: Website mailing list search enhancement idea - one result per thread
От | Masahiro Ikeda |
---|---|
Тема | Re: Website mailing list search enhancement idea - one result per thread |
Дата | |
Msg-id | 6f68bf53ede1c315ae7f52931653eec1@oss.nttdata.com обсуждение исходный текст |
Ответ на | Re: Website mailing list search enhancement idea - one result per thread (Masahiro Ikeda <ikedamsh@oss.nttdata.com>) |
Список | pgsql-www |
On 2021-02-06 18:05, Masahiro Ikeda wrote: > "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes: >> When searching the mailing lists in advanced search mode we have list >> filtering, timeframe filtering (pretty coarse though) and then sorting >> mode. For moderately long running threads a search term is going to >> return >> many messages from the same thread. It would desirable to ask that >> only >> threads be shown and maybe how many individual messages in that thread >> matched. > > +1. I always think so. > > On 2020-05-01 01:55, Tom Lane wrote: >> Magnus Hagander <magnus@hagander.net> writes: >>> On Thu, Apr 30, 2020 at 2:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> "David G. Johnston" <david.g.johnston@gmail.com> writes: >>>>> It would desirable to ask that only >>>>> threads be shown and maybe how many individual messages in that >>>>> thread >>>>> matched. >> >>>> +1, an option to do that would be handy. No idea how hard it is ... >> >>> Not having actually looked at it, probably not too hard. The hardest >>> part >>> might be bikeshedding about the color (aka user interface). >> >>> Should this replace the current interface, or do we want to make >>> something >>> like this a choice? >> >> No no, not replace. It should be an option, "aggregate into threads" >> or something like that. > > Why don't you implement the following feature? > > (1) add an option on the advanced search page. > > I agree with the idea to add the option to aggregate into threads > on the advanced search page. > > I attached the image("add_option_to_aggregate_into_threads.png"). > What do you think? > > > (2) change the message's URLs on the result pages if the option is > checked. > > I think it's better to change the message's URLs to flatted ones on > result pages > if the messages are aggregated into threads because users want to > access the > discussion pages not per messages but per threads directly. > > ``` > --- a/templates/search/listsearch.html > +++ b/templates/search/listsearch.html > @@ -53,7 +59,12 @@ > <h2>Results {{firsthit}}-{{lasthit}} of {%if hitcount == > 1000%}more than 1000{%else%}{{hitcount}}{%endif%}.</h2> > {%if pagelinks %}Result pages: > {{pagelinks|safe}}<br/><br/>{%endif%} > {%for hit in hits %} > - {{forloop.counter0|add:firsthit}}. <a > href="https://www.postgresql.org/message-id/{{hit.messageid}}">{{hit.subject}}</a> > [{{hit.rank|floatformat:2}}]<br/> > + {{forloop.counter0|add:firsthit}}. > + {%if aggregate%} > + <a > href="https://www.postgresql.org/message-id/flat/{{hit.messageid}}">{{hit.subject}}</a> > [{{hit.rank|floatformat:2}}]<br/> > + {%else%} > + <a > href="https://www.postgresql.org/message-id/{{hit.messageid}}">{{hit.subject}}</a> > [{{hit.rank|floatformat:2}}]<br/> > + {%endif%} > From {{hit.author}} on {{hit.date}}.<br/> > {{hit.abstract|safe}}<br/> > <a > href="https://www.postgresql.org/message-id/{{hit.messageid}}">https://www.postgresql.org/message-id/{{hit.messageid}}</a><br/> > ``` > > > If someone already works for this, it's ok to ignore the following. > Although I'm not familiar with Django, I made the patches for > pgarchives and pgweb. > > - pgweb: > "0001-wip-add-an-search-option-to-aggregate-into-threads.patch" > > This patch changes the user interface the above (1) and (2). > > > - pgarchives: > "0001-wip-add-an-search-parameter-to-aggregate-into-thread.patch" > > This patch changes the search query to the Postgresql. The example > query is the following. > To aggregate into threads, it uses the window function to extract the > highest-ranked > messages in the same thread group. > > > ``` > SELECT messageid, date, subject, _from, ts_rank_cd(fti, > plainto_tsquery('public.pg', 'wal writer')), > ts_headline(bodytxt, plainto_tsquery('public.pg', 'wal > writer'),'StartSel="[[[[[[",StopSel="]]]]]]"') > FROM > (SELECT messageid, date, subject, _from, fti, bodytxt, > -- add a sub query > RANK() OVER (PARTITION BY threadid ORDER BY ts_rank_cd(fti, > plainto_tsquery('wal writer')) DESC, id) AS rank -- this line is > main change > FROM messages m > WHERE > fti @@ plainto_tsquery('public.pg', 'wal writer') > AND EXISTS (SELECT 1 FROM list_threads lt WHERE > lt.threadid=m.threadid AND lt.listid=ANY(ARRAY[2])) > AND m.date > '2020-02-07T08:44:56.738979'::timestamp > ) m1 > WHERE m1.rank = 1 > ORDER BY ts_rank_cd(fti, plainto_tsquery('wal writer')) DESC > LIMIT 1000; > ``` > > If you have any good ideas, please let me know. I fixed hard-coded value in "0001-wip-add-an-search-parameter- to-aggregate-into-thread.patch" for pgarchives. Regards, -- Masahiro Ikeda NTT DATA CORPORATION
Вложения
В списке pgsql-www по дате отправления: