Discussion: ts_count
One of our PostgreSQL Experts Inc customers wanted a function to count all the occurrences of terms in a tsquery in a tsvector. This has been written as a loadable module function, and initial testing shows it is working well. With the client's permission we are releasing the code - it's available at <https://github.com/pgexperts/ts_count>. The actual new code involved here is tiny; some of the code is copied and pasted from tsrank.c and much of the rest is boilerplate.

A snippet from the regression test:

    select ts_count(to_tsvector('managing managers manage peons managerially'),
                    to_tsquery('managers | peon'));
     ts_count
    ----------
            4

We'd like to add something like this for 9.2, so I'd like to get the API agreed and then I'll prepare a patch and submit it for the next CF.

Comments?

cheers

andrew
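To make the counting rule concrete, here is a minimal Python sketch of what ts_count appears to compute, judging only from the regression output above. It is not the module's C code. The dict is a toy stand-in for a tsvector (lexeme to positions), the stems 'manag', 'peon' and 'manageri' are assumed for illustration, and the helper name count_occurrences is hypothetical. Query operators are ignored; only the query's lexemes are considered.

```python
# Toy model of a tsvector: each lexeme maps to the positions where it occurs.
# The stems are assumptions standing in for what to_tsvector would produce.
tsvector = {
    "manag": [1, 2, 3],   # managing, managers, manage
    "peon": [4],          # peons
    "manageri": [5],      # managerially (assumed to stem differently)
}

# Toy model of the tsquery 'managers | peon' after stemming: just its lexemes.
tsquery_lexemes = ["manag", "peon"]


def count_occurrences(vector, query_lexemes):
    """Sum the number of positions at which each query lexeme occurs."""
    return sum(len(vector.get(lexeme, [])) for lexeme in query_lexemes)


print(count_occurrences(tsvector, tsquery_lexemes))  # 4
```

Under these assumptions 'manag' contributes three positions and 'peon' one, which matches the 4 in the regression test.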
Well, there are several functions available around tsearch2, so I suggest that somebody collect all of them and create one extension - ts_addon. For example, these are the ones I remember:

1. tsvector2array
2. noccurences(tsvector, tsquery) - like your ts_count
3. nmatches(tsvector, tsquery) - # of matched lexemes in query

Of course, we need to think about better names for the functions, since ts_count is a bit ambiguous.

Oleg

On Sat, 4 Jun 2011, Andrew Dunstan wrote:
> One of our PostgreSQL Experts Inc customers wanted a function to count all
> the occurrences of terms in a tsquery in a tsvector.
> [...]
> We'd like to add something like this for 9.2, so I'd like to get the API
> agreed and then I'll prepare a patch and submit it for the next CF.
>
> Comments?

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
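The three functions Oleg lists count different things, which may be why the name ts_count reads as ambiguous. The Python sketch below is one possible reading of his one-line descriptions, not the behaviour of any existing tsearch2 code. The function names come from his message, while the bodies, the toy tsvector (lexeme to positions), the assumed stems and the extra query lexeme 'boss' are all illustrative assumptions.

```python
# Toy tsvector (lexeme -> positions); the stems are assumed, not real
# to_tsvector output.
vector = {"manag": [1, 2, 3], "peon": [4], "manageri": [5]}

# Hypothetical stemmed query lexemes; 'boss' does not occur in the vector.
query = ["manag", "peon", "boss"]


def tsvector2array(vec):
    """Return the vector's lexemes as a sorted list."""
    return sorted(vec)


def noccurences(vec, query_lexemes):
    """Total number of positions at which any query lexeme occurs."""
    return sum(len(vec.get(lex, [])) for lex in query_lexemes)


def nmatches(vec, query_lexemes):
    """Number of query lexemes that occur in the vector at least once."""
    return sum(1 for lex in query_lexemes if lex in vec)


print(tsvector2array(vector))      # ['manag', 'manageri', 'peon']
print(noccurences(vector, query))  # 4
print(nmatches(vector, query))     # 2
```

With this reading, noccurences counts positions (three for 'manag' plus one for 'peon'), whereas nmatches counts query lexemes that were found at all, so the two diverge as soon as a lexeme occurs more than once.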
On 06/04/2011 04:51 PM, Oleg Bartunov wrote:
> Well, there are several functions available around tsearch2, so I suggest
> that somebody collect all of them and create one extension - ts_addon.
> For example, these are the ones I remember:
> 1. tsvector2array
> 2. noccurences(tsvector, tsquery) - like your ts_count
> 3. nmatches(tsvector, tsquery) - # of matched lexemes in query
> Of course, we need to think about better names for the functions, since
> ts_count is a bit ambiguous.

Getting agreed names was one reason for posting.

I don't know why these need to be an extension. I think they are of sufficiently general interest (and sufficiently lightweight) that we could just build them in.

cheers

andrew
Excerpts from Andrew Dunstan's message of Sat Jun 04 08:47:02 -0400 2011:
> A snippet from the regression test:
>
>     select ts_count(to_tsvector('managing managers manage peons managerially'),
>                     to_tsquery('managers | peon'));
>      ts_count
>     ----------
>             4

Err, shouldn't this return 5?

--
Álvaro Herrera
Command Prompt, Inc.
On 06/04/2011 08:59 PM, Alvaro Herrera wrote:
> Excerpts from Andrew Dunstan's message of Sat Jun 04 08:47:02 -0400 2011:
>> A snippet from the regression test:
>>
>>     select ts_count(to_tsvector('managing managers manage peons managerially'),
>>                     to_tsquery('managers | peon'));
>>      ts_count
>>     ----------
>>             4
> Err, shouldn't this return 5?

No. 'managerially' doesn't get the same stemming.

cheers

andrew
On 06/04/2011 04:51 PM, Oleg Bartunov wrote:
> Well, there are several functions available around tsearch2, so I suggest
> that somebody collect all of them and create one extension - ts_addon.
> For example, these are the ones I remember:
> 1. tsvector2array
> 2. noccurences(tsvector, tsquery) - like your ts_count
> 3. nmatches(tsvector, tsquery) - # of matched lexemes in query
> Of course, we need to think about better names for the functions, since
> ts_count is a bit ambiguous.

Oleg, are you doing this? I'd rather this stuff didn't get dropped on the floor.

cheers

andrew