Обсуждение: Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

Поиск
Список
Период
Сортировка

Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

От
Loles
Дата:
Hi!

A question about query optimization.

We have a simple query with several tables joined.

We have seen performance down when we use WHERE UPPER(name) LIKE UPPER('%Alice%') in the condition.

If we compare with WHERE UPPER(name) = UPPER('Alice') it runs very fast.

We have tried an index on the name field, an UPPER(name) expression index, a GIN index, etc. but nothing seems to work.

We need to optimize the condition WHERE UPPER(name) LIKE UPPER('%p_name%)

Any ideas?

Thanks!!

RE: Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

От
HALAT-PRUVOT SYLVIE
Дата:

Hi,

 

You can try with extension pg_tgrm to create gin index to use LIKE with ‘%expression%’

 

https://www.postgresql.org/docs/current/pgtrgm.html

https://mazeez.dev/posts/pg-trgm-similarity-search-and-fast-like

 

Sylvie Halat

De : Loles <lolesft@gmail.com>
Envoyé : lundi 30 octobre 2023 14:28
À : pgsql-admin@postgresql.org
Objet : Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

 

Hi!

 

A question about query optimization.

We have a simple query with several tables joined.

We have seen performance down when we use WHERE UPPER(name) LIKE UPPER('%Alice%') in the condition.

If we compare with WHERE UPPER(name) = UPPER('Alice') it runs very fast.

We have tried an index on the name field, an UPPER(name) expression index, a GIN index, etc. but nothing seems to work.

We need to optimize the condition WHERE UPPER(name) LIKE UPPER('%p_name%)

Any ideas?

 

Thanks!!

Re: Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

От
Tom Lane
Дата:
Loles <lolesft@gmail.com> writes:
> We have seen performance down when we use WHERE UPPER(name) LIKE
> UPPER('%Alice%') in the condition.
> If we compare with WHERE UPPER(name) = UPPER('Alice') it runs very fast.

Presumably, that works well because you have a btree index on UPPER(name).
But btree can't optimize a pattern with a leading '%' --- see

https://www.postgresql.org/docs/current/indexes-types.html#INDEXES-TYPES-BTREE

You could possibly use a pg_trgm index instead, see

https://www.postgresql.org/docs/current/pgtrgm.html

The performance properties will be somewhat different from btree,
but it might get the job done.

            regards, tom lane



Re: Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

От
Jeff Janes
Дата:
On Mon, Oct 30, 2023 at 9:28 AM Loles <lolesft@gmail.com> wrote:
Hi!

A question about query optimization.

We have a simple query with several tables joined.

We have seen performance down when we use WHERE UPPER(name) LIKE UPPER('%Alice%') in the condition.

If we compare with WHERE UPPER(name) = UPPER('Alice') it runs very fast.

We have tried an index on the name field, an UPPER(name) expression index, a GIN index, etc. but nothing seems to work.

A gin_trgm_ops expression index on UPPER(name) should work.  We can't tell if that is one of the things you tried.

But it would probably be better to use ILIKE and omit the UPPER calls, in which case a gin_trgm_ops index on the column itself should work.
 
Cheers,

Jeff

Re: Optimized WHERE UPPER(name) LIKE UPPER('%p_name%')

От
Dan Smith
Дата:
Hello!

You might want to check that query plan; functions in where and join clauses can lead to running the function per row which will be a resource intensive process.  My guess is that you are touching an awful lot of data you don't need in the query.

What I would normally recommend is to add a column on these tables (or to a materialized view) to support the join where data is stored as you want to use it (store the value of UPPER(name) vs running on demand).  Alternatively, you could test creating an index where the function is already applied to see if the query planner would use it.  Further, I doubt the double wildcard like filter is ever going to be super efficient in the query (regardless of gin index).  These kinds of search operations are seldom optimal and in my experience have a tendency to decrease in performance with the volume of data.  Acceptable performance will probably depend on your specific use case and data.

Where the statement of direct equality can be supported by a b-tree index and would certainly be optimized.

If you really need to support this exact functionality, you may want to look in to the following features and extensions:
Hope this helps!  I'm certain others will have great ideas and comments as well.


Cheers!

Dan Smith