Re: [pg_trgm] Making similarity(?, ?) < ? use an index

Поиск

Список

Период

Сортировка

От	Artur Zakirov
Тема	Re: [pg_trgm] Making similarity(?, ?) < ? use an index
Дата	3 июня 2016 г. 15:00:11
Msg-id	c1b61e4b-0a1c-41ac-43a2-ead19f641606@postgrespro.ru обсуждение исходный текст
Ответ на	Re: [pg_trgm] Making similarity(?, ?) < ? use an index (Greg Navis <contact@gregnavis.com>)
Список	pgsql-general

Дерево обсуждения

Oh, I understand. It is because you want different limits for
restaurants and cinemas?

I see only one solution. It is custom extension, which will create
operator class similar to gin_trgm_ops and will depends on pg_trgm. In
gin_trgm_consistent() you can use your own limit variable.

As I know functions do not use indexes.

Of course I may be wrong. And somebody knows a better solution.

On 03.06.2016 14:24, Greg Navis wrote:
> Artur, thanks for your reply. That's right, `%` does use the index. The
> goal of using `similarity(lhs, rhs) >= show_limit()` was to replace
> `show_limit()` with a custom, per-query limit. I noticed that the latter
> approach does _not_ use the index, hence my question:
>
> grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE city % 'warsw';
>                                                                   QUERY
> PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on restaurants  (cost=24.28..1319.36 rows=515
> width=10) (actual time=96.081..96.456 rows=400 loops=1)
>    Recheck Cond: ((city)::text % 'warsw'::text)
>    Heap Blocks: exact=359
>    ->  Bitmap Index Scan on restaurants_city_gist_trgm_idx
>  (cost=0.00..24.15 rows=515 width=0) (actual time=96.030..96.030
> rows=400 loops=1)
>          Index Cond: ((city)::text % 'warsw'::text)
>  Planning time: 0.211 ms
>  Execution time: 96.528 ms
> (7 rows)
>
> grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE similarity(city,
> 'warsw') >= show_limit();
>                                                      QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------
>  Seq Scan on restaurants  (cost=0.00..11692.81 rows=171825 width=10)
> (actual time=14.520..692.520 rows=400 loops=1)
>    Filter: (similarity((city)::text, 'warsw'::text) >= show_limit())
>    Rows Removed by Filter: 515075
>  Planning time: 0.109 ms
>  Execution time: 692.560 ms
> (5 rows)
>
> If this functionality isn't supported then it might be a good idea for a
> contribution.
>
> Best regards
>
> On Fri, Jun 3, 2016 at 12:51 PM, Artur Zakirov <a.zakirov@postgrespro.ru
> <mailto:a.zakirov@postgrespro.ru>> wrote:
>
>     Hello.
>
>     As I know 'lhs % rhs' is equivalent to 'similarity(lhs, rhs) >=
>     show_limit()'.
>
>     And so your query should looks like this:
>
>     SELECT * FROM restaurants WHERE city % 'warsw';
>
>     And it should use index.
>
>
>     On 03.06.2016 13:35, Greg Navis wrote:
>
>         Hey!
>
>         I'm playing with pg_trgm. It seems that `lhs % rhs` is _almost_
>         equivalent to `similarity(lhs, rhs) < show_limit()`. The
>         difference that
>         I noticed is that `%` uses a GIN index while `similarity` does not.
>
>         ```
>         grn=# \d restaurants
>                  Table "public.restaurants"
>          Column |          Type          | Modifiers
>         --------+------------------------+-----------
>          city   | character varying(255) | not null
>         Indexes:
>             "restaurants_city_trgm_idx" gin (city gin_trgm_ops)
>
>         grn=# SELECT COUNT(*) FROM restaurants;
>          count
>         --------
>          515475
>         (1 row)
>
>         Time: 45.964 ms
>         grn=# EXPLAIN ANALYZE SELECT * FROM restaurants WHERE
>         similarity(city,
>         'warsw') > show_limit();
>                                                              QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------
>          Seq Scan on restaurants  (cost=0.00..11692.81 rows=171825 width=10)
>         (actual time=16.436..665.062 rows=360 loops=1)
>            Filter: (similarity((city)::text, 'warsw'::text) > show_limit())
>            Rows Removed by Filter: 515115
>          Planning time: 0.139 ms
>          Execution time: 665.105 ms
>         (5 rows)
>
>         Time: 665.758 ms
>         ```
>
>         My question is: is it possible to make `similarity` use the
>         index? If
>         not, is there a way to speed up the query above?
>
>         Best regards
>         --
>         Greg Navis
>         I help tech companies to scale Heroku-hosted Rails apps.
>         Free, biweekly scalability newsletter for SaaS CEOs
>         <http://www.gregnavis.com/newsletter/>
>
>
>
>     --
>     Artur Zakirov
>     Postgres Professional: http://www.postgrespro.com
>     Russian Postgres Company
>
>
>
>
> --
> Greg Navis
> I help tech companies to scale Heroku-hosted Rails apps.
> Free, biweekly scalability newsletter for SaaS CEOs
> <http://www.gregnavis.com/newsletter/>
>


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

В списке pgsql-general по дате отправления:

Предыдущее

От: Greg Navis
Дата: 03 июня 2016 г., 14:24:13
Сообщение: Re: [pg_trgm] Making similarity(?, ?) < ? use an index

Следующее

От: "David G. Johnston"
Дата: 03 июня 2016 г., 15:05:43
Сообщение: Re: [pg_trgm] Making similarity(?, ?) < ? use an index

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [pg_trgm] Making similarity(?, ?) < ? use an index

Предыдущее

Следующее