Re: Full text search on partial URLs

Поиск
Список
Период
Сортировка
От bricklen
Тема Re: Full text search on partial URLs
Дата
Msg-id CAGrpgQ_QoB4fYMPSZEb_N77ai6DtPknLmbLqMtc04=o_j0-qMA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Full text search on partial URLs  (Zev Benjamin <zev-pgsql@strangersgate.com>)
Ответы Re: Full text search on partial URLs  (Zev Benjamin <zev-pgsql@strangersgate.com>)
Список pgsql-general

On Wed, Nov 6, 2013 at 10:53 AM, Zev Benjamin <zev-pgsql@strangersgate.com> wrote:
On 11/06/2013 01:47 PM, bricklen wrote:

On Wed, Nov 6, 2013 at 10:45 AM, Zev Benjamin
<zev-pgsql@strangersgate.com <mailto:zev-pgsql@strangersgate.com>> wrote:

    Hi,

    I have Postgres full text search set up for my application and it's
    been working great!  However, my users would like their searches to
    turn up parts of URLs.  For example, they would like a search for
    "foobar" to turn up a document that contains the string
    "http://example.com/foobar/__blah <http://example.com/foobar/blah>"
    (and similarly for queries like "example" and "blah).  With the
    default dictionaries for host, url, and url_path, the search query
    would have to contain the complete host or url path.

    What is the best way to accomplish this?  Should I be looking at
    building a custom dictionary that breaks down hosts and urls or is
    there something simpler I can do?


Have you looked into trigrams?
http://www.postgresql.org/docs/current/static/pgtrgm.html

I've looked at it in the context of adding fuzzy search.  But my understanding is that doing a fuzzy search here would only work if the query were a significant fraction of, say, the url path.  For example, I would expect a fuzzy search of "foobar" on "/foobar/x" to return a high similarity, but a fuzzy search of "foobar" on "/foobar/some/very/long/path/x" to have a low similarity.

Or are you suggesting using trigrams in a different way?

Yeah, I was thinking more along the lines of allowing wildcard searching, not similarity.

Eg.
CREATE INDEX yourtable_yourcol_gist_fbi ON yourtable using GIST ( yourcol gist_trgm_ops );
select * from yourtable where yourcol ~~ '%foobar%';

В списке pgsql-general по дате отправления:

Предыдущее
От: Zev Benjamin
Дата:
Сообщение: Re: Full text search on partial URLs
Следующее
От: Zev Benjamin
Дата:
Сообщение: Re: Full text search on partial URLs