Discussion: Postgres 15 SELECT query doesn't use index under RLS


Postgres 15 SELECT query doesn't use index under RLS

From
Alexander Okulovich
Date:

Hello everyone!


Recently, we upgraded the AWS RDS instance from Postgres 12.14 to 15.4 and noticed extremely high disk consumption on the following query execution:

select (exists (select 1 as "one" from "public"."indexed_commit" where "public"."indexed_commit"."repo_id" in (964992,964994,964999, ...);

For some reason, the query planner starts using a Seq Scan instead of the index on the "repo_id" column when the query runs as a user restricted by RLS. On prod, this happens when there are more than 316 IDs in the IN list; on stage, more than 3. If we execute the request as a superuser, the planner always uses the "repo_id" index.

Luckily, we can easily reproduce this on our stage database (which is smaller). If we add a multicolumn "repo_id, tenant_id" index, the planner uses it (Index Only Scan) under RLS with any number of IN parameters. Could you please clarify whether this is a Postgres bug? Should we include the "tenant_id" column in all our indexes to make them work under RLS?
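For reference, the multicolumn index described above could be created along these lines (the index name here is illustrative):

```sql
-- Illustrative name; CONCURRENTLY avoids blocking writes on a live table
CREATE INDEX CONCURRENTLY indexed_commit_repo_tenant_idx
    ON public.indexed_commit (repo_id, tenant_id);
```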


Postgres version / Operating system+version

PostgreSQL 15.4 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit

Full Table and Index Schema

\d indexed_commit
                        Table "public.indexed_commit"
    Column     |            Type             | Collation | Nullable | Default
---------------+-----------------------------+-----------+----------+---------
 id            | bigint                      |           | not null |
 commit_hash   | character varying(40)       |           | not null |
 parent_hash   | text                        |           |          |
 created_ts    | timestamp without time zone |           | not null |
 repo_id       | bigint                      |           | not null |
 lines_added   | bigint                      |           |          |
 lines_removed | bigint                      |           |          |
 tenant_id     | uuid                        |           | not null |
 author_id     | uuid                        |           | not null |
Indexes:
    "indexed-commit-repo-idx" btree (repo_id)
    "indexed_commit_commit_hash_repo_id_key" UNIQUE CONSTRAINT, btree (commit_hash, repo_id) REPLICA IDENTITY
    "indexed_commit_repo_id_without_loc_idx" btree (repo_id) WHERE lines_added IS NULL OR lines_removed IS NULL
Policies:
    POLICY "commit_isolation_policy"
      USING ((tenant_id = (current_setting('app.current_tenant_id'::text))::uuid))
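For reference, a policy like the one above expects the application to set the tenant per session before querying; a minimal sketch with a placeholder UUID:

```sql
-- Placeholder UUID; the application would set its real tenant id
-- on each connection/session before running queries
SET app.current_tenant_id = '00000000-0000-0000-0000-000000000000';
```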

Table Metadata

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='indexed_commit';
    relname     | relpages |  reltuples   | relallvisible | relkind | relnatts | relhassubclass |                                                                 reloptions                                                                  | pg_table_size
----------------+----------+--------------+---------------+---------+----------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------+---------------
 indexed_commit | 18170522 | 7.451964e+08 |      18104744 | r       |        9 | f              | {autovacuum_vacuum_scale_factor=0,autovacuum_analyze_scale_factor=0,autovacuum_vacuum_threshold=200000,autovacuum_analyze_threshold=100000} |  148903337984

EXPLAIN (ANALYZE, BUFFERS), not just EXPLAIN

Production queries:

316 ids under RLS limited user

392 ids under RLS limited user

392 ids under Superuser

History

It became slow after the upgrade to 15.4. We never had any issues before.

Hardware

AWS DB class db.t4g.large + GP3 400GB disk

Maintenance Setup

Are you running autovacuum? Yes

If so, with what settings?

autovacuum_vacuum_scale_factor=0,autovacuum_analyze_scale_factor=0,autovacuum_vacuum_threshold=200000,autovacuum_analyze_threshold=100000

SELECT * FROM pg_stat_user_tables WHERE relname='indexed_commit';
 relid | schemaname |    relname     | seq_scan | seq_tup_read | idx_scan  | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | n_ins_since_vacuum | last_vacuum |        last_autovacuum        | last_analyze |       last_autoanalyze        | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
-------+------------+----------------+----------+--------------+-----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+--------------------+-------------+-------------------------------+--------------+-------------------------------+--------------+------------------+---------------+-------------------
 24662 | public     | indexed_commit |     2485 |  49215378424 | 374533865 |    4050928807 | 764089750 |   2191615 |  18500311 |             0 |  745241398 |        383 |               46018 |              45343 |             | 2023-10-11 23:51:29.170378+00 |              | 2023-10-11 23:50:18.922351+00 |            0 |              672 |             0 |               753

WAL Configuration

For data writing queries: have you moved the WAL to a different disk? Changed the settings? No.

GUC Settings

What database configuration settings have you changed? We use default settings.

What are their values?

SELECT * FROM pg_settings WHERE name IN ('effective_cache_size', 'shared_buffers', 'work_mem');
         name         | setting | unit |               category                |                               short_desc                               |                                                                              extra_desc                                                                               |  context   | vartype |       source       | min_val |  max_val   | enumvals | boot_val | reset_val | sourcefile | sourceline | pending_restart
----------------------+---------+------+---------------------------------------+------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+---------+--------------------+---------+------------+----------+----------+-----------+------------+------------+-----------------
 effective_cache_size | 494234  | 8kB  | Query Tuning / Planner Cost Constants | Sets the planner's assumption about the total size of the data caches. | That is, the total size of the caches (kernel cache and shared buffers) used for PostgreSQL data files. This is measured in disk pages, which are normally 8 kB each. | user       | integer | configuration file | 1       | 2147483647 |          | 524288   | 494234    |            |            | f
 shared_buffers       | 247117  | 8kB  | Resource Usage / Memory               | Sets the number of shared memory buffers used by the server.           |                                                                                                                                                                       | postmaster | integer | configuration file | 16      | 1073741823 |          | 16384    | 247117    |            |            | f
 work_mem             | 4096    | kB   | Resource Usage / Memory               | Sets the maximum memory to be used for query workspaces.               | This much memory can be used by each internal sort operation and hash table before switching to temporary disk files.                                                 | user       | integer | default            | 64      | 2147483647 |          | 4096     | 4096      |            |            | f


Statistics: n_distinct, MCV, histogram

Useful to check statistics leading to bad join plan. SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='...' AND tablename='...' ORDER BY 1 DESC;

Returns 0 rows.


Kind regards,

Alexander

Re: Postgres 15 SELECT query doesn't use index under RLS

From
Tom Lane
Date:
Alexander Okulovich <aokulovich@stiltsoft.com> writes:
> Recently, we upgraded the AWS RDS instance from Postgres 12.14 to 15.4 
> and noticed extremely high disk consumption on the following query 
> execution:
> select (exists (select 1 as "one" from "public"."indexed_commit" where 
> "public"."indexed_commit"."repo_id" in (964992,964994,964999, ...);
> For some reason, the query planner starts using Seq Scan instead of the 
> index on the "repo_id" column when requesting under user limited with 
> RLS. On prod, it happens when there are more than 316 IDs in the IN part 
> of the query, on stage - 3. If we execute the request from Superuser, 
> the planner always uses the "repo_id" index.

The superuser bypasses the RLS policy.  When that's enforced, the
query can no longer use an index-only scan (because it needs to fetch
tenant_id too).  Moreover, it may be that only a small fraction of the
rows fetched via the index will satisfy the RLS condition.  So the
estimated cost of an indexscan query could be high enough to persuade
the planner that a seqscan is a better idea.

> Luckily, we can easily reproduce this on our stage database (which is 
> smaller). If we add a multicolumn "repo_id, tenant_id" index, the 
> planner uses it (Index Only Scan) with any IN params count under RLS.

Yeah, that would be the obvious way to ameliorate both problems.

If in fact you were getting decent performance from an indexscan plan
before, the only explanation I can think of is that the repo_ids you
are querying for are correlated with the tenant_id, so that the RLS
filter doesn't eliminate very many rows from the index result.  The
planner wouldn't realize that by default, but if you create extended
statistics on repo_id and tenant_id then it might do better.  Still,
you probably want the extra index.
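A sketch of the extended statistics described above (the statistics name is illustrative):

```sql
-- Functional-dependency statistics let the planner learn that a given
-- repo_id is (mostly) tied to a single tenant_id
CREATE STATISTICS indexed_commit_repo_tenant_stats (dependencies)
    ON repo_id, tenant_id FROM public.indexed_commit;
ANALYZE public.indexed_commit;  -- populate the new statistics object
```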

> Could you please clarify if this is a Postgres bug or not?

You haven't shown any evidence suggesting that.

> Should we 
> include the "tenant_id" column in all our indexes to make them work 
> under RLS?

Adding tenant_id is going to bloat your indexes quite a bit,
so I wouldn't do that except in cases where you've demonstrated
it's important.

            regards, tom lane



Re: Postgres 15 SELECT query doesn't use index under RLS

From
Alexander Okulovich
Date:

Hi Oscar,

Thank you for the suggestion.

Unfortunately, I didn't mention that on prod we performed the upgrade from Postgres 12 to 15 by replicating to another instance with pglogical, so I assume the index was built from scratch by Postgres 15.

We upgraded the stage instance in place, changing only the Postgres version, so we could potentially have run into the index issue there. I've tried REINDEX CONCURRENTLY, but the performance issue hasn't gone away; the problem is probably somewhere else. However, we don't rule out running REINDEX on prod.

Kind regards,

Alexander

On 13.10.2023 11:44, Oscar van Baten wrote:

Hi Alexander,

I think this is caused by the deduplication of B-tree index entries that was added to Postgres in version 13:
https://www.postgresql.org/docs/release/13.0/

"
More efficiently store duplicates in B-tree indexes (Anastasia Lubennikova, Peter Geoghegan)
This allows efficient B-tree indexing of low-cardinality columns by storing duplicate keys only once. Users upgrading with pg_upgrade will need to use REINDEX to make an existing index use this feature.
"

When we upgraded from 12 to 13 we had a similar issue. We had to rebuild the indexes, and that fixed it.
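On Postgres 12 and later, such a rebuild can be done without taking an exclusive lock, e.g.:

```sql
-- Rebuild one index, or all of the table's indexes, without blocking writes
REINDEX INDEX CONCURRENTLY "indexed-commit-repo-idx";
REINDEX TABLE CONCURRENTLY public.indexed_commit;
```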


regards,
Oscar


On Thu, 12 Oct 2023 at 18:41, Alexander Okulovich <aokulovich@stiltsoft.com> wrote:


Re: Postgres 15 SELECT query doesn't use index under RLS

From
Alexander Okulovich
Date:
Hi Tom,

> If in fact you were getting decent performance from an indexscan plan
> before, the only explanation I can think of is that the repo_ids you
> are querying for are correlated with the tenant_id, so that the RLS
> filter doesn't eliminate very many rows from the index result.  The
> planner wouldn't realize that by default, but if you create extended
> statistics on repo_id and tenant_id then it might do better.  Still,
> you probably want the extra index.

Do you have any idea how to measure that correlation?
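One way to check it would be a query along these lines (a sketch, assuming the table and columns shown earlier):

```sql
-- If almost every repo_id belongs to exactly one tenant_id, the RLS
-- filter removes almost nothing from the index result
SELECT count(*) AS repos,
       count(*) FILTER (WHERE tenants > 1) AS repos_with_multiple_tenants
  FROM (SELECT repo_id, count(DISTINCT tenant_id) AS tenants
          FROM indexed_commit
         GROUP BY repo_id) s;
```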

> You haven't shown any evidence suggesting that.
My suggestion is based on the following backward reasoning.

We used the product with the default settings. The requests are simple. We didn't change the hardware (actually, we now use even more performant hardware because of this issue) or the DDL. I've compared the query on the old and new databases: queries that rely on this index take more than 10 times longer. The planner indeed used an Index Scan before, but now it doesn't.

So, from my perspective, the only remaining explanation is a change in the database logic. We could probably try to reproduce the issue on different Postgres versions and find the specific version that introduced it.

> Adding tenant_id is going to bloat your indexes quite a bit,
> so I wouldn't do that except in cases where you've demonstrated
> it's important.

Any recommendations from the Postgres team on how to use indexes under RLS would help a lot here, but I couldn't find any.

Kind regards,

Alexander

On 13.10.2023 22:26, Tom Lane wrote:



Re: Postgres 15 SELECT query doesn't use index under RLS

From
Tom Lane
Date:
Alexander Okulovich <aokulovich@stiltsoft.com> writes:
> We used the product with the default settings. The requests are simple. 
> We didn't change the hardware (actually, we use even more performant 
> hardware because of that issue) and DDL. I've checked the request on old 
> and new databases. Requests that rely on this index execute more than 10 
> times longer. Planner indeed used Index Scan before, but now it doesn't.

> So, from my perspective, the only reason we experience that is database 
> logic change.

[ shrug... ]  Maybe, but it's still not clear if it's a bug, or an
intentional change, or just a cost estimate that was on the hairy
edge before and your luck ran out.

If you could provide a self-contained test case that performs 10x worse
under v15 than v12, we'd surely take a look at it.  But with the
information you've given so far, little is possible beyond speculation.

            regards, tom lane



Re: Postgres 15 SELECT query doesn't use index under RLS

From
Tomek
Date:
Hi Alexander!
Apart from the problem you are writing about, I'd like to ask how you interpret the computed frac_MCV: to me it makes no sense to sum most_common_freqs. Please rethink it and explain the idea behind such a SUM. I understand it can serve as a measure related to the ratio of NULL values, but only in some cases, when n_distinct is small.

regards


Re: Postgres 15 SELECT query doesn't use index under RLS

From
Alexander Okulovich
Date:

Hi Tomek,

Unfortunately, I didn't dig into this. This query is part of the recommended report template for slow-query issues, but it seems to relate to JOINs, which our query doesn't have.

Kind regards,

Alexander

On 19.10.2023 09:43, Tomek wrote:

Re: Postgres 15 SELECT query doesn't use index under RLS

From
Alexander Okulovich
Date:

Hi Tom,

I've attempted to reproduce this on my PC in Docker from the stage database dump, but with no luck. The first query execution on Postgres 15 behaves as it does on the real stage, but subsequent executions use the index. They also run much faster. It looks like the hardware and/or the on-disk data layout matters.

Here is the Docker Compose sample config:

version: '2.4'

services:
  database-15:
    image: postgres:15.4
    ports:
      - "7300:5432"
    environment:
      POSTGRES_DB: stage_db
      POSTGRES_USER: stage
      POSTGRES_PASSWORD: stage
    volumes:
      - "./init.sql:/docker-entrypoint-initdb.d/init.sql"
      - "./pgdb/aws-15:/var/lib/postgresql/data"
    mem_limit: 512M
    cpus: 2
    blkio_config:
      device_read_bps:
        - path: /dev/nvme0n1
          rate: '10mb'
      device_read_iops:
        - path: /dev/nvme0n1
          rate: 2000
      device_write_bps:
        - path: /dev/nvme0n1
          rate: '10mb'
      device_write_iops:
        - path: /dev/nvme0n1
          rate: 2000
I performed tests only with CPU and memory limits. If I try to limit the disk (blkio_config), my system hangs on container startup after a while.

Could you please share your thoughts on how to create such a self-contained test case?

Kind regards,

Alexander

On 18.10.2023 22:35, Tom Lane wrote:
> If you could provide a self-contained test case that performs 10x worse
> under v15 than v12, we'd surely take a look at it. But with the
> information you've given so far, little is possible beyond speculation.

Re: Postgres 15 SELECT query doesn't use index under RLS

From
Tom Lane
Date:
Alexander Okulovich <aokulovich@stiltsoft.com> writes:
> I've attempted to reproduce this on my PC in Docker from the stage 
> database dump, but no luck. The first query execution on Postgres 15 
> behaves like on the real stage, but subsequent ones use the index.

Can you force it in either direction with "set enable_seqscan = off"
(resp. "set enable_indexscan = off")?  If so, how do the estimated
costs compare for the two plan shapes?
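For example (session-local settings, using a few of the IDs from the original query; enable_bitmapscan is an extra knob worth toggling, since a bitmap scan can stand in for the plain index scan):

```sql
SET enable_seqscan = off;   -- force an index-based plan
EXPLAIN SELECT exists (SELECT 1 FROM indexed_commit
                        WHERE repo_id IN (964992, 964994, 964999));

RESET enable_seqscan;
SET enable_indexscan = off; -- force the seq scan instead
SET enable_bitmapscan = off;
EXPLAIN SELECT exists (SELECT 1 FROM indexed_commit
                        WHERE repo_id IN (964992, 964994, 964999));
RESET ALL;
```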

> Also, 
> they execute much faster. Looks like the hardware and(or) the data 
> structure on disk matters.

Maybe your prod installation has a bloated index, and that's driving
up the estimated cost enough to steer the planner away from it.
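One quick way to gauge that is to compare the index's on-disk size before and after a rebuild (the pgstattuple extension gives more detail where it's available):

```sql
-- Size of the repo_id index as currently stored
SELECT pg_size_pretty(pg_relation_size('"indexed-commit-repo-idx"'));
```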

            regards, tom lane



Re: Postgres 15 SELECT query doesn't use index under RLS

From
Alexander Okulovich
Date:

Hi Tom,

> Can you force it in either direction with "set enable_seqscan = off"
> (resp. "set enable_indexscan = off")?  If so, how do the estimated
> costs compare for the two plan shapes?
Here are the results from the prod instance:

seqscan off

indexscan_off

Just noticed that the WHERE clause differs from the initial one (392 ids under RLS); that's probably why the execution time isn't so catastrophic. Please let me know if this matters, and I'll rerun with the initial query. As for the stage instance vs. the local Docker instance, the execution time on stage is so short (0.1 ms with a seq scan, 0.195 ms with an index scan) that we probably shouldn't compare them. But I'll execute the queries if necessary.

> Maybe your prod installation has a bloated index, and that's driving
> up the estimated cost enough to steer the planner away from it.

We tried REINDEX CONCURRENTLY on a prod copy, but the planner still used a Seq Scan instead of an Index Scan afterward.

Kind regards,

Alexander

Postgres Locking

From
"Dirschel, Steve"
Date:

Relatively new to Postgres. Running into a locking situation, and I need to make sure I understand the output. I found this query to show a lock tree:

 

WITH RECURSIVE
  l AS (
      SELECT pid, locktype, mode, granted,
             ROW(locktype,database,relation,page,tuple,virtualxid,transactionid,classid,objid,objsubid) obj
        FROM pg_locks),
  pairs AS (
      SELECT w.pid waiter, l.pid locker, l.obj, l.mode
        FROM l w
        JOIN l ON l.obj IS NOT DISTINCT FROM w.obj
              AND l.locktype=w.locktype
              AND NOT l.pid=w.pid
              AND l.granted
       WHERE NOT w.granted),
  tree AS (
      SELECT l.locker pid, l.locker root, NULL::record obj, NULL AS mode, 0 lvl,
             locker::text path, array_agg(l.locker) OVER () all_pids
        FROM ( SELECT DISTINCT locker FROM pairs l
                WHERE NOT EXISTS (SELECT 1 FROM pairs WHERE waiter=l.locker) ) l
       UNION ALL
      SELECT w.waiter pid, tree.root, w.obj, w.mode, tree.lvl+1,
             tree.path||'.'||w.waiter, all_pids || array_agg(w.waiter) OVER ()
        FROM tree
        JOIN pairs w ON tree.pid=w.locker
                    AND NOT w.waiter = ANY ( all_pids ))
SELECT path,
       repeat(' .', lvl)||' '||tree.pid AS pid_tree,
       tree.pid,
       (clock_timestamp() - a.xact_start)::interval(3) AS ts_age,
       replace(a.state, 'idle in transaction', 'idletx') state,
       wait_event_type wait_type,
       wait_event,
       (clock_timestamp() - state_change)::interval(3) AS change_age,
       lvl,
       (SELECT count(*) FROM tree p WHERE p.path ~ ('^'||tree.path) AND NOT p.path=tree.path) blocked,
       repeat(' .', lvl)||' '||left(query,100) query
  FROM tree
  JOIN pg_stat_activity a USING (pid)
 ORDER BY path;

   path    | pid_tree | pid  |    ts_age    | state  | wait_type |  wait_event   |  change_age  | lvl | blocked |               query

-----------+----------+------+--------------+--------+-----------+---------------+--------------+-----+---------+------------------------------------

3740      |  3740    | 3740 | 01:23:03.294 | idletx | Client    | ClientRead    | 00:00:00.004 |   0 |       1 |  update "wln_mart"."ee_fact" set  +

           |          |      |              |        |           |               |              |     |         |     "changed_on" = $1             +

           |          |      |              |        |           |               |              |     |         | where "ee_fact_id" = $2

3740.3707 |  . 3707  | 3707 | 01:23:03.294 | active | Lock      | transactionid | 01:23:03.29  |   1 |       0 |  . update "wln_mart"."ee_fact" set+

           |          |      |              |        |           |               |              |     |         |     "changed_on" = $1             +

           |          |      |              |        |           |               |              |     |         | where "ee_fact_id" = $2

(2 rows)

 

Above I can see that PID 3740 is blocking PID 3707. The PK on table wln_mart.ee_fact is ee_fact_id. I assume PID 3740 has updated a row (but not yet committed) that PID 3707 is also trying to update. But I am being told those two sessions should not be processing the same PK rows.

 

Here is output from pg_locks for those 2 sessions:

 

select * from pg_locks where pid in (3740,3707) order by pid;

   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid  |       mode       | granted | fastpath |           waitstart
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+------+------------------+---------+----------+-------------------------------
 transactionid |          |          |      |       |            |     251189989 |         |       |          | 54/196626          | 3707 | ExclusiveLock    | t       | f        |
 relation      |    91999 |    94619 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94615 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94611 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94610 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94609 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94569 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    93050 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 virtualxid    |          |          |      |       | 54/196626  |               |         |       |          | 54/196626          | 3707 | ExclusiveLock    | t       | t        |
 transactionid |          |          |      |       |            |     251189988 |         |       |          | 54/196626          | 3707 | ExclusiveLock    | t       | f        |
 transactionid |          |          |      |       |            |     251189986 |         |       |          | 54/196626          | 3707 | ShareLock        | f       | f        | 2023-10-31 14:40:21.837507-05
 tuple         |    91999 |    93050 |    0 |     1 |            |               |         |       |          | 54/196626          | 3707 | ExclusiveLock    | t       | f        |
 relation      |    91999 |   308853 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94693 |      |       |            |               |         |       |          | 54/196626          | 3707 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94693 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94619 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94615 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94611 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94610 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94609 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    94569 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 relation      |    91999 |    93050 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
 virtualxid    |          |          |      |       | 60/259887  |               |         |       |          | 60/259887          | 3740 | ExclusiveLock    | t       | t        |
 transactionid |          |          |      |       |            |     251189986 |         |       |          | 60/259887          | 3740 | ExclusiveLock    | t       | f        |
 relation      |    91999 |   308853 |      |       |            |               |         |       |          | 60/259887          | 3740 | RowExclusiveLock | t       | t        |
(25 rows)

 

 

I believe locktype "relation" points to the table and its indexes. Which data point(s) above indicate this is row-level locking and not some other level of locking? I am very familiar with Oracle locking and its different levels and am trying to get up to speed quickly on Postgres locking. I am continuing to google this, but figured I could post here to see if someone can provide a quick answer.

 

Thanks

Steve

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html

RE: Postgres Locking

From
"Smith, Travis"
Date:

Hi Steve,

 

It's literally the same query. I would try to extract the actual named object in the lock to verify. Is there any partitioning? An explain plan may be helpful.
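A quick way to do that, sketched against the standard pg_locks view (the PIDs are taken from the output above), is to cast the relation OIDs to regclass:

```sql
-- Resolve the relation OIDs held by the two sessions to object names.
SELECT pid, mode, granted, relation::regclass AS object
  FROM pg_locks
 WHERE pid IN (3740, 3707)
   AND locktype = 'relation'
 ORDER BY pid, object;
```

This must be run in the same database as the locked objects, since regclass resolves OIDs against the current database's catalogs.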

 

 

 

Thank you

Travis

 

From: Dirschel, Steve <steve.dirschel@thomsonreuters.com>
Sent: Tuesday, October 31, 2023 4:12 PM
To: pgsql-performance@postgresql.org
Cc: Wong, Kam Fook (TR Technology) <kamfook.wong@thomsonreuters.com>
Subject: Postgres Locking

 


Re: Postgres Locking

From
Tom Lane
Date:
"Dirschel, Steve" <steve.dirschel@thomsonreuters.com> writes:
> Above I can see PID 3740 is blocking PID 3707.  The PK on table
> wln_mart.ee_fact is ee_fact_id.  I assume PID 3740 has updated a row
> (but not committed it yet) that PID 3707 is also trying to update.

Hmm. We can see that 3707 is waiting for 3740 to commit, because it's
trying to take ShareLock on 3740's transactionid:

> transactionid |          |          |      |       |            |     251189986 |         |       |          | 54/196626          | 3707 | ShareLock        | f       | f        | 2023-10-31 14:40:21.837507-05

251189986 is indeed 3740's, because it has ExclusiveLock on that:

> transactionid |          |          |      |       |            |     251189986 |         |       |          | 60/259887          | 3740 | ExclusiveLock    | t       | f        |

There are many reasons why one xact might be waiting on another to commit,
not only that they tried to update the same tuple.  However, in this case
I suspect that that is the problem, because we can also see that 3707 has
an exclusive tuple-level lock:

> tuple         |    91999 |    93050 |    0 |     1 |            |               |         |       |          | 54/196626          | 3707 | ExclusiveLock    | t       | f        |

That kind of lock would only be held while queueing to modify a tuple.
(Basically, it establishes that 3707 is next in line, in case some
other transaction comes along and also wants to modify the same tuple.)
It should be released as soon as the tuple update is made, so 3707 is
definitely stuck waiting to modify a tuple, and AFAICS it must be stuck
because of 3740's uncommitted earlier update.

> But I am being told those 2 sessions should not be trying to process the
> same PK rows.

Perhaps "should not" is wrong.  Or it could be some indirect update
(caused by a foreign key with CASCADE, or the like).

You have here the relation OID (try "SELECT 93050::regclass" to
decode it) and the tuple ID, so it should work to do

SELECT * FROM that_table WHERE ctid = '(0,1)';

to see the previous state of the problematic tuple.  Might
help to decipher the problem.

            regards, tom lane



Performance down with JDBC 42

From
"Abraham, Danny"
Date:
Hi,

Asking for help with a JDBC-related issue.
Environment: Linux 7.9, PG 14.9, a very busy PG server.

A big query (3 unions and about 10 joins) runs in:
- 70 ms in psql, in DBeaver with JDBC 42, and in our server using the old JDBC 9.2 driver
- 2500 ms in our server using the new JDBC 42 driver (and this query runs many times)

Questions: Is there a structured way to identify changes in the optimizer setup (planner method settings)? Are there any known changes specific to JDBC 42? Is there a way to capture a vector of the session's optimization settings? Any other ideas?

Thanks

Danny




Re: Performance down with JDBC 42

From
Laurenz Albe
Date:
On Sat, 2023-11-04 at 19:08 +0000, Abraham, Danny wrote:
> Asking for help with a JDBC related issue.
> Environment: Linux 7.9 PG 14.9 , very busy PG Server.
>
> A big query - 3 unions and about 10 joins runs :
> - 70ms on psql , DBeaver with JDBC 42  and  in our Server using old JDBC 9.2
> - 2500 ms in our Server using new JDBC 42 driver. ( and  this is running many times)
>
> Question: Is there a structured way to identify optimization setup ( Planner Method s ) changes?
> Are there any known changes specific to JDBC 42.

What I would do is enable auto_explain and look at the execution plan
when the statement is run by the JDBC driver.  Then you can compare the
execution plans and spot the difference.
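For example, something along these lines in the JDBC session (or in postgresql.conf for all sessions); these are standard auto_explain parameters:

```sql
-- Load auto_explain in the current session and log the plan of any
-- statement slower than 100 ms, including actual row counts and buffers.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '100ms';
SET auto_explain.log_analyze = on;
SET auto_explain.log_buffers = on;
```

The plans then show up in the server log, which lets you capture exactly what the driver's prepared statements produce rather than what psql produces.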

Yours,
Laurenz Albe



RE: [EXTERNAL] Re: Performance down with JDBC 42

From
"Abraham, Danny"
Date:
Thanks Laurenz,

Traced two huge plans. They differ.
The fast one (the psql one) uses Materialize and Memoize.
Is there something in JDBC 42 that blocks these algorithms?

Thanks again

Danny

-----Original Message-----
From: Laurenz Albe <laurenz.albe@cybertec.at> 
Sent: Saturday, November 4, 2023 11:07 PM
To: Abraham, Danny <danny_abraham@bmc.com>; psql-performance <pgsql-performance@postgresql.org>
Subject: [EXTERNAL] Re: Performance down with JDBC 42

On Sat, 2023-11-04 at 19:08 +0000, Abraham, Danny wrote:
> Asking for help with a JDBC related issue.
> Environment: Linux 7.9 PG 14.9 , very busy PG Server.
> 
> A big query - 3 unions and about 10 joins runs :
> - 70ms on psql , DBeaver with JDBC 42  and  in our Server using old 
> JDBC 9.2
> - 2500 ms in our Server using new JDBC 42 driver. ( and  this is 
> running many times)
> 
> Question: Is there a structured way to identify optimization setup ( Planner Method s ) changes?
> Are there any known changes specific to JDBC 42. 

What I would do is enable auto_explain and look at the execution plan when the statement is run by the JDBC driver. Then you can compare the execution plans and spot the difference.
 

Yours,
Laurenz Albe

Re: [EXTERNAL] Re: Performance down with JDBC 42

From
Andreas Kretschmer
Date:

Am 05.11.23 um 17:20 schrieb Abraham, Danny:
> Thanks Laurenz,
>
> Traced two huge plans. They differ.
> The fast one does use Materialize and Memoize  (the psql).
> Is there something in JDBC 42 that blocks these algorithms?

*maybe* the driver changed some settings. You can check it with

select name, setting from pg_settings where name ~ 'enable';

using the JDBC-connection.


Regards, Andreas


>
> Thanks again
>
> Danny
>
> -----Original Message-----
> From: Laurenz Albe <laurenz.albe@cybertec.at>
> Sent: Saturday, November 4, 2023 11:07 PM
> To: Abraham, Danny <danny_abraham@bmc.com>; psql-performance <pgsql-performance@postgresql.org>
> Subject: [EXTERNAL] Re: Performance down with JDBC 42
>
> On Sat, 2023-11-04 at 19:08 +0000, Abraham, Danny wrote:
>> Asking for help with a JDBC related issue.
>> Environment: Linux 7.9 PG 14.9 , very busy PG Server.
>>
>> A big query - 3 unions and about 10 joins runs :
>> - 70ms on psql , DBeaver with JDBC 42  and  in our Server using old
>> JDBC 9.2
>> - 2500 ms in our Server using new JDBC 42 driver. ( and  this is
>> running many times)
>>
>> Question: Is there a structured way to identify optimization setup ( Planner Method s ) changes?
>> Are there any known changes specific to JDBC 42.
> What I would do is enable auto_explain and look at the execution plan when the statement is run by the JDBC driver. Then you can compare the execution plans and spot the difference.
 
>
> Yours,
> Laurenz Albe

-- 
Andreas Kretschmer - currently still (garden leave)
Technical Account Manager (TAM)
www.enterprisedb.com




Re: [EXTERNAL] Re: Performance down with JDBC 42

From
Frits Hoogland
Date:
Are you absolutely sure that the two databases you're comparing the execution on are identical, and that the objects involved in the query are physically and logically identical?

The planning is done based on the cost/statistics of the objects. If the statistics are different, the planner may come up with another plan.

Frits



> Op 5 nov 2023 om 17:20 heeft Abraham, Danny <danny_abraham@bmc.com> het volgende geschreven:
>
> Thanks Laurenz,
>
> Traced two huge plans. They differ.
> The fast one does use Materialize and Memoize  (the psql).
> Is there something in JDBC 42 that blocks these algorithms?
>
> Thanks again
>
> Danny
>
> -----Original Message-----
> From: Laurenz Albe <laurenz.albe@cybertec.at>
> Sent: Saturday, November 4, 2023 11:07 PM
> To: Abraham, Danny <danny_abraham@bmc.com>; psql-performance <pgsql-performance@postgresql.org>
> Subject: [EXTERNAL] Re: Performance down with JDBC 42
>
>> On Sat, 2023-11-04 at 19:08 +0000, Abraham, Danny wrote:
>> Asking for help with a JDBC related issue.
>> Environment: Linux 7.9 PG 14.9 , very busy PG Server.
>>
>> A big query - 3 unions and about 10 joins runs :
>> - 70ms on psql , DBeaver with JDBC 42  and  in our Server using old
>> JDBC 9.2
>> - 2500 ms in our Server using new JDBC 42 driver. ( and  this is
>> running many times)
>>
>> Question: Is there a structured way to identify optimization setup ( Planner Method s ) changes?
>> Are there any known changes specific to JDBC 42.
>
> What I would do is enable auto_explain and look at the execution plan when the statement is run by the JDBC driver. Then you can compare the execution plans and spot the difference.
>
> Yours,
> Laurenz Albe



Re: [EXTERNAL] Re: Performance down with JDBC 42

From
Jeff Janes
Date:
On Sun, Nov 5, 2023 at 11:20 AM Abraham, Danny <danny_abraham@bmc.com> wrote:
Thanks Laurenz,

Traced two huge plans. They differ.
The fast one does use Materialize and Memoize  (the psql).
Is there something in JDBC 42 that blocks these algorithms?

Directly blocking those is not likely. Maybe the way the drivers fetch partial results is different, such that with one the planner knows to expect only partial results to be fetched and with the other it does not.  So in one case it chooses the fast-start plan, and in the other it doesn't.  But it will be hard to get anywhere if you just dribble information at us a bit at a time.  Can you come up with a self-contained test case?  Or at least show the entirety of both plans?

Cheers,

Jeff

RE: [EXTERNAL] Re: Performance down with JDBC 42

From
"Abraham, Danny"
Date:
Thanks for the help.
Both plans refer to the same DB.

#1 – Fast – using psql or old JDBC driver
==>
Sort  (cost=13113.27..13113.33 rows=24 width=622)
   Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
   Sort Key: dm.calname, dm.jobyear
   ->  HashAggregate  (cost=13112.24..13112.48 rows=24 width=622)
         Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
         Group Key: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
         ->  Append  (cost=4603.96..13112.00 rows=24 width=622)
               ->  Unique  (cost=4603.96..4604.20 rows=19 width=535)
                     Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
                     ->  Sort  (cost=4603.96..4604.01 rows=19 width=535)
                           Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
                           Sort Key: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
                           ->  Nested Loop  (cost=0.00..4603.56 rows=19 width=535)
                                 Output: dm.calname, dm.jobyear, dm.caltype, (dm.daymask)::character varying(400)
                                 Join Filter: (((dm.calname)::text = (jd.dayscal)::text) OR ((dm.calname)::text =
(jd.weekcal)::text)OR ((dm.calname)::text = (jd.confcal)::text))
 
                                 ->  Seq Scan on public.cms_datemm dm  (cost=0.00..16.33 rows=171 width=389)
                                       Output: dm.calname, dm.jobyear, dm.daymask, dm.caltype, dm.caldesc
                                       Filter: ((dm.jobyear >= '2021'::bpchar) AND (dm.jobyear <= '2025'::bpchar))
                                 ->  Materialize  (cost=0.00..4559.84 rows=8 width=3)
                                       Output: jd.dayscal, jd.weekcal, jd.confcal
                                       ->  Seq Scan on public.cms_jobdef jd  (cost=0.00..4559.80 rows=8 width=3)
                                             Output: jd.dayscal, jd.weekcal, jd.confcal
                                             Filter: (((jd.schedtab)::text = 'PZL-ZETA_REDIS_UTILITY_PSE'::text) OR
((jd.schedtab)::text~~ 'PZL-ZETA_REDIS_UTILITY_PSE/%'::text))
 
               ->  Unique  (cost=3857.44..3857.46 rows=1 width=535)
                     Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, ((dm_1.daymask)::character varying(400))
                     ->  Sort  (cost=3857.44..3857.45 rows=1 width=535)
                           Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, ((dm_1.daymask)::character varying(400))
                           Sort Key: dm_1.calname, dm_1.jobyear, dm_1.caltype, ((dm_1.daymask)::character
varying(400))
                           ->  Nested Loop  (cost=0.30..3857.43 rows=1 width=535)
                                 Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, (dm_1.daymask)::character
varying(400)
                                 Join Filter: (((dm_1.calname)::text = (tag.dayscal)::text) OR ((dm_1.calname)::text =
(tag.weekcal)::text)OR ((dm_1.calname)::text = (tag.confcal)::text))
 
                                 ->  Nested Loop  (cost=0.30..3838.11 rows=1 width=3)
                                       Output: tag.dayscal, tag.weekcal, tag.confcal
                                       Inner Unique: true
                                       ->  Seq Scan on public.cms_tag tag  (cost=0.00..30.96 rows=1396 width=7)
                                              Output: tag.tagname, tag.groupid, tag.maxwait, tag.cal_andor, tag.monthstr, tag.dayscal, tag.weekcal, tag.confcal, tag.shift, tag.retro, tag.daystr, tag.wdaystr, tag.tagfrom, tag.tagtill, tag.roworder, tag.exclude_rbc
                                       ->  Memoize  (cost=0.30..4.02 rows=1 width=4)
                                             Output: jd_1.jobno
                                             Cache Key: tag.groupid
                                             Cache Mode: logical
                                              ->  Index Scan using job on public.cms_jobdef jd_1  (cost=0.29..4.01 rows=1 width=4)
                                                    Output: jd_1.jobno
                                                    Index Cond: (jd_1.jobno = tag.groupid)
                                                    Filter: (((jd_1.schedtab)::text = 'PZL-ZETA_REDIS_UTILITY_PSE'::text) OR ((jd_1.schedtab)::text ~~ 'PZL-ZETA_REDIS_UTILITY_PSE/%'::text))
                                 ->  Seq Scan on public.cms_datemm dm_1  (cost=0.00..16.33 rows=171 width=389)
                                       Output: dm_1.calname, dm_1.jobyear, dm_1.daymask, dm_1.caltype, dm_1.caldesc
                                       Filter: ((dm_1.jobyear >= '2021'::bpchar) AND (dm_1.jobyear <= '2025'::bpchar))
               ->  Unique  (cost=4649.93..4649.98 rows=4 width=535)
                     Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, ((dm_2.daymask)::character varying(400))
                     ->  Sort  (cost=4649.93..4649.94 rows=4 width=535)
                           Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, ((dm_2.daymask)::character varying(400))
                            Sort Key: dm_2.calname, dm_2.jobyear, dm_2.caltype, ((dm_2.daymask)::character varying(400))
                           ->  Nested Loop  (cost=0.56..4649.89 rows=4 width=535)
                                  Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, (dm_2.daymask)::character varying(400)
                                  Join Filter: (((dm_2.calname)::text = (tag_1.dayscal)::text) OR ((dm_2.calname)::text = (tag_1.weekcal)::text) OR ((dm_2.calname)::text = (tag_1.confcal)::text))
                                 ->  Seq Scan on public.cms_datemm dm_2  (cost=0.00..16.33 rows=171 width=389)
                                       Output: dm_2.calname, dm_2.jobyear, dm_2.daymask, dm_2.caltype, dm_2.caldesc
                                       Filter: ((dm_2.jobyear >= '2021'::bpchar) AND (dm_2.jobyear <= '2025'::bpchar))
                                 ->  Materialize  (cost=0.56..4626.72 rows=2 width=3)
                                       Output: tag_1.dayscal, tag_1.weekcal, tag_1.confcal
                                       ->  Nested Loop  (cost=0.56..4626.71 rows=2 width=3)
                                             Output: tag_1.dayscal, tag_1.weekcal, tag_1.confcal
                                             Inner Unique: true
                                             ->  Nested Loop  (cost=0.29..4626.32 rows=1 width=5)
                                                   Output: tl.tagname
                                                    ->  Seq Scan on public.cms_jobdef jd_2  (cost=0.00..4559.80 rows=8 width=4)
                                                          Output: jd_2.jobname, jd_2.jobno, jd_2.descript, jd_2.applic, jd_2.applgroup, jd_2.schedtab, jd_2.author, jd_2.owner, jd_2.priority, jd_2.critical, jd_2.cyclic, jd_2.retro, jd_2.autoarch, jd_2.taskclass, jd_2.cyclicint, jd_2.tasktype, jd_2.datemem, jd_2.nodegrp, jd_2.platform, jd_2.nodeid, jd_2.doclib, jd_2.docmem, jd_2.memlib, jd_2.memname, jd_2.overlib, jd_2.cmdline, jd_2.maxrerun, jd_2.maxdays, jd_2.maxruns, jd_2.fromtime, jd_2.until, jd_2.maxwait, jd_2.daystr, jd_2.wdaystr, jd_2.monthstr, jd_2.ajfsonstr, jd_2.conf, jd_2.unknowntim, jd_2.dayscal, jd_2.weekcal, jd_2.confcal, jd_2.cal_andor, jd_2.shift, jd_2.adjust_cond, jd_2.startendcycind, jd_2.creationuserid, jd_2.creationdatetime, jd_2.changeuserid, jd_2.changedatetime, jd_2.relationship, jd_2.groupid, jd_2.tabrowno, jd_2.multiagent, jd_2.appltype, jd_2.timezone, jd_2.statemsk, jd_2.applver, jd_2.timeref, jd_2.actfrom, jd_2.acttill, jd_2.cmver, jd_2.applform, jd_2.instream_ind, jd_2.instream_script, jd_2.run_times, jd_2.interval_sequence, jd_2.tolerance, jd_2.cyclic_type, jd_2.removeatonce, jd_2.dayskeepinnotok, jd_2.delay, jd_2.end_folder, jd_2.is_reference, jd_2.referenced_path
                                                          Filter: (((jd_2.schedtab)::text = 'PZL-ZETA_REDIS_UTILITY_PSE'::text) OR ((jd_2.schedtab)::text ~~ 'PZL-ZETA_REDIS_UTILITY_PSE/%'::text))
                                                    ->  Index Scan using job_tag on public.cms_taglink tl  (cost=0.29..8.30 rows=1 width=9)
                                                          Output: tl.tagname, tl.groupid, tl.jobno, tl.roworder, tl.exclude_rbc
                                                          Index Cond: (tl.jobno = jd_2.jobno)
                                                          Filter: (tl.groupid = 0)
                                              ->  Index Scan using gro_tag on public.cms_tag tag_1  (cost=0.28..0.39 rows=1 width=14)
                                                    Output: tag_1.tagname, tag_1.groupid, tag_1.maxwait, tag_1.cal_andor, tag_1.monthstr, tag_1.dayscal, tag_1.weekcal, tag_1.confcal, tag_1.shift, tag_1.retro, tag_1.daystr, tag_1.wdaystr, tag_1.tagfrom, tag_1.tagtill, tag_1.roworder, tag_1.exclude_rbc
                                                    Index Cond: ((tag_1.groupid = 0) AND ((tag_1.tagname)::text = (tl.tagname)::text))
==>
Slow – when using JDBC 42
==>
Sort  (cost=11316.99..11317.00 rows=3 width=622)
  Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
  Sort Key: dm.calname, dm.jobyear
  ->  HashAggregate  (cost=11316.91..11316.94 rows=3 width=622)
        Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
        Group Key: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
        ->  Append  (cost=10252.89..11316.88 rows=3 width=622)
              ->  Unique  (cost=10252.89..10252.92 rows=1 width=535)
                    Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
                    ->  Sort  (cost=10252.89..10252.89 rows=3 width=535)
                          Output: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
                          Sort Key: dm.calname, dm.jobyear, dm.caltype, ((dm.daymask)::character varying(400))
                          ->  Nested Loop  (cost=0.14..10252.86 rows=3 width=535)
                                Output: dm.calname, dm.jobyear, dm.caltype, (dm.daymask)::character varying(400)
                                 Join Filter: (((dm.calname)::text = (jd.dayscal)::text) OR ((dm.calname)::text = (jd.weekcal)::text) OR ((dm.calname)::text = (jd.confcal)::text))
                                 ->  Index Scan using calendar on public.cms_datemm dm  (cost=0.14..14.38 rows=1 width=389)
                                      Output: dm.calname, dm.jobyear, dm.daymask, dm.caltype, dm.caldesc
                                      Index Cond: ((dm.jobyear >= ($3)::bpchar) AND (dm.jobyear <= ($4)::bpchar))
                                ->  Seq Scan on public.cms_jobdef jd  (cost=0.00..10235.19 rows=188 width=3)
                                      Output: jd.dayscal, jd.weekcal, jd.confcal
                                       Filter: (((jd.schedtab)::text = ($1)::text) OR ((jd.schedtab)::text ~~ ($2)::text))
              ->  Unique  (cost=180.91..180.93 rows=1 width=535)
                    Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, ((dm_1.daymask)::character varying(400))
                    ->  Sort  (cost=180.91..180.92 rows=1 width=535)
                          Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, ((dm_1.daymask)::character varying(400))
                          Sort Key: dm_1.calname, dm_1.jobyear, dm_1.caltype, ((dm_1.daymask)::character varying(400))
                          ->  Nested Loop  (cost=0.56..180.90 rows=1 width=535)
                                 Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, (dm_1.daymask)::character varying(400)
                                Inner Unique: true
                                ->  Nested Loop  (cost=0.14..74.77 rows=18 width=393)
                                      Output: dm_1.calname, dm_1.jobyear, dm_1.caltype, dm_1.daymask, tag.groupid
                                       Join Filter: (((dm_1.calname)::text = (tag.dayscal)::text) OR ((dm_1.calname)::text = (tag.weekcal)::text) OR ((dm_1.calname)::text = (tag.confcal)::text))
                                       ->  Index Scan using calendar on public.cms_datemm dm_1  (cost=0.14..14.38 rows=1 width=389)
                                             Output: dm_1.calname, dm_1.jobyear, dm_1.daymask, dm_1.caltype, dm_1.caldesc
                                             Index Cond: ((dm_1.jobyear >= ($7)::bpchar) AND (dm_1.jobyear <= ($8)::bpchar))
                                      ->  Seq Scan on public.cms_tag tag  (cost=0.00..35.96 rows=1396 width=7)
                                             Output: tag.tagname, tag.groupid, tag.maxwait, tag.cal_andor, tag.monthstr, tag.dayscal, tag.weekcal, tag.confcal, tag.shift, tag.retro, tag.daystr, tag.wdaystr, tag.tagfrom, tag.tagtill, tag.roworder, tag.exclude_rbc
                                ->  Index Scan using job on public.cms_jobdef jd_1  (cost=0.41..5.89 rows=1 width=4)
                                      Output: jd_1.jobno
                                      Index Cond: (jd_1.jobno = tag.groupid)
                                       Filter: (((jd_1.schedtab)::text = ($5)::text) OR ((jd_1.schedtab)::text ~~ ($6)::text))
              ->  Unique  (cost=882.97..882.99 rows=1 width=535)
                    Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, ((dm_2.daymask)::character varying(400))
                    ->  Sort  (cost=882.97..882.98 rows=1 width=535)
                          Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, ((dm_2.daymask)::character varying(400))
                          Sort Key: dm_2.calname, dm_2.jobyear, dm_2.caltype, ((dm_2.daymask)::character varying(400))
                          ->  Nested Loop  (cost=67.06..882.96 rows=1 width=535)
                                 Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, (dm_2.daymask)::character varying(400)
                                Inner Unique: true
                                ->  Hash Join  (cost=66.64..225.90 rows=104 width=393)
                                      Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, dm_2.daymask, tl.jobno
                                      Hash Cond: ((tl.tagname)::text = (tag_1.tagname)::text)
                                       ->  Bitmap Heap Scan on public.cms_taglink tl  (cost=16.79..169.52 rows=1098 width=13)
                                            Output: tl.tagname, tl.groupid, tl.jobno, tl.roworder, tl.exclude_rbc
                                            Recheck Cond: (tl.groupid = 0)
                                            ->  Bitmap Index Scan on tl_groupid  (cost=0.00..16.52 rows=1098 width=0)
                                                  Index Cond: (tl.groupid = 0)
                                      ->  Hash  (cost=49.82..49.82 rows=2 width=404)
                                             Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, dm_2.daymask, tag_1.tagname, tag_1.groupid
                                             ->  Nested Loop  (cost=9.48..49.82 rows=2 width=404)
                                                   Output: dm_2.calname, dm_2.jobyear, dm_2.caltype, dm_2.daymask, tag_1.tagname, tag_1.groupid
                                                   Join Filter: (((dm_2.calname)::text = (tag_1.dayscal)::text) OR ((dm_2.calname)::text = (tag_1.weekcal)::text) OR ((dm_2.calname)::text = (tag_1.confcal)::text))
                                                   ->  Index Scan using calendar on public.cms_datemm dm_2  (cost=0.14..14.38 rows=1 width=389)
                                                         Output: dm_2.calname, dm_2.jobyear, dm_2.daymask, dm_2.caltype, dm_2.caldesc
                                                         Index Cond: ((dm_2.jobyear >= ($11)::bpchar) AND (dm_2.jobyear <= ($12)::bpchar))
                                                   ->  Bitmap Heap Scan on public.cms_tag tag_1  (cost=9.34..33.05 rows=137 width=18)
                                                         Output: tag_1.tagname, tag_1.groupid, tag_1.maxwait, tag_1.cal_andor, tag_1.monthstr, tag_1.dayscal, tag_1.weekcal, tag_1.confcal, tag_1.shift, tag_1.retro, tag_1.daystr, tag_1.wdaystr, tag_1.tagfrom, tag_1.tagtill, tag_1.roworder, tag_1.exclude_rbc
                                                         Recheck Cond: (tag_1.groupid = 0)
                                                         ->  Bitmap Index Scan on gro_tag  (cost=0.00..9.30 rows=137 width=0)
                                                               Index Cond: (tag_1.groupid = 0)
                                ->  Index Scan using job on public.cms_jobdef jd_2  (cost=0.41..6.32 rows=1 width=4)
                                      Output: jd_2.jobno
                                      Index Cond: (jd_2.jobno = tl.jobno)
                                       Filter: (((jd_2.schedtab)::text = ($9)::text) OR ((jd_2.schedtab)::text ~~ ($10)::text))
==>

Re: [EXTERNAL] Re: Performance down with JDBC 42

From
David Rowley
Date:
On Mon, 6 Nov 2023 at 08:37, Abraham, Danny <danny_abraham@bmc.com> wrote:
>
> Both plans refer to the same DB.

JDBC is making use of PREPARE statements, whereas psql, unless you're
using PREPARE, is not.

> #1 – Fast – using psql or old JDBC driver

The absence of any $1 type parameters here shows that's a custom plan
that's planned specifically using the parameter values given.

> Slow – when using JDBC 42

Because this query has $1, $2, etc, that's a generic plan. When
looking up statistics, histogram bounds and MCV slots cannot be
checked. Only ndistinct is used. If you have a skewed dataset, then
this might not be very good.

You might find things run better if you adjust postgresql.conf and set
plan_cache_mode = force_custom_plan then select pg_reload_conf();

Please also check the documentation so that you understand the full
implications for that.

David
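
The behaviour David describes can be observed directly from psql. The sketch below uses the `cms_jobdef` table from the plans above; the exact execution count at which the plan cache switches to a generic plan, and whether it switches at all, depends on the relative plan costs:

```sql
-- Prepare a parameterised version of the problem predicate.
PREPARE q(text) AS
    SELECT count(*) FROM cms_jobdef WHERE schedtab = $1;

-- First executions are planned as custom plans: the literal value
-- appears in the EXPLAIN output.
EXPLAIN EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');

-- After several executions the plan cache may switch to a generic
-- plan: $1 appears in the EXPLAIN output instead of the literal.
EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');
EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');
EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');
EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');
EXPLAIN EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');

-- The workaround suggested above, applied at session level for testing
-- before changing postgresql.conf:
SET plan_cache_mode = force_custom_plan;
EXPLAIN EXECUTE q('PZL-ZETA_REDIS_UTILITY_PSE');
```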



Re: [EXTERNAL] Performance down with JDBC 42

From
Frits Hoogland
Date:
Very good point from David: generic and custom plans.

One thing that is almost certainly not at play here, but is worth mentioning: there are some specific cases where the planner does not optimise for the query as a whole to be executed as fast/cheap as possible, but for the first few rows. One reason for that to happen is when a query is used as a cursor.

(Warning: shameless promotion) I did a writeup on JDBC clientside/serverside prepared statements and custom and generic plans: https://dev.to/yugabyte/postgres-query-execution-jdbc-prepared-statements-51e2
The next obvious question then is whether something material changed between your old and new JDBC versions; I do believe the prepareThreshold did not change.
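
The prepareThreshold Frits mentions can be controlled from application code. A minimal sketch (the JDBC URL and credentials are placeholders, and a live PostgreSQL server is needed to actually run it) of the two pgJDBC knobs:

```java
// Sketch: controlling pgJDBC's switch to server-side prepared statements.
// URL/credentials are hypothetical; requires a running PostgreSQL server.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import org.postgresql.PGStatement;

public class PrepareThresholdDemo {
    public static void main(String[] args) throws Exception {
        // Option 1: disable server-side prepares for the whole connection
        // via the prepareThreshold URL parameter (0 = never use them).
        String url = "jdbc:postgresql://localhost:5432/mydb?prepareThreshold=0";
        try (Connection conn = DriverManager.getConnection(url, "user", "secret")) {
            try (PreparedStatement ps =
                     conn.prepareStatement("SELECT 1 WHERE 1 = ?")) {
                // Option 2: adjust the threshold per statement. The pgJDBC
                // default is 5: the fifth execution switches to a named
                // server-side prepared statement, at which point the server
                // may start considering a generic plan.
                ps.unwrap(PGStatement.class).setPrepareThreshold(0);
                ps.setInt(1, 1);
                ps.executeQuery().close();
            }
        }
    }
}
```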


Frits Hoogland







Performance problems with Postgres JDBC 42.4.2

From
Jose Osinde
Date:

Dear all,

I'm running a query from Java on a Postgres database:

Java version: 17
JDBC version: 42.4.2
Postgres version: 13.1

In parallel I'm testing the same queries from pgAdmin 4 version 6.13

The tables I'm using contain more than 10 million rows each, and I have two questions here:

1. I need to extract the path of a file without the file itself. For this I tried two alternatives, as I found that statement "A" is much faster than statement "B":

Statement "A":

SELECT DISTINCT ( LEFT(opf.file_path, length(opf.file_path) - position('/' in reverse(opf.file_path))) ) AS path
                           FROM product AS op JOIN product_file AS opf ON opf.product_id = op.id
                           WHERE op.proprietary_end_date <= CURRENT_DATE AND op.id LIKE 'urn:esa:psa:%'

Statement "B":

SELECT DISTINCT ( regexp_replace(opf.file_path, '(.*)\/(.*)$', '\1') ) AS path  
                           FROM product AS op JOIN product_file AS opf ON opf.product_id = op.id
                           WHERE op.proprietary_end_date <= CURRENT_DATE AND op.id LIKE 'urn:esa:psa:%'

2. Running statement "A" from the pgAdmin client takes 4-5 minutes to finish, but running it from a Java program it never ends. This is still the case when I limit the output to the first 100 rows, so I assume this is not a problem with the amount of data being transferred but with the way Postgres resolves the query. To make it work in Java I had to define a Postgres function that I call from the Java code instead of running the query directly.

I had a similar problem in the past with a query that performed very poorly from a Java client while it was fine from pgAdmin or a Python script. In that case it was a matter of a column type not compatible with JDBC (citext), resulting in an implicit cast that prevented the Postgres engine from using a given index, or forced it to cast all the values of that column before using it, I'm not sure now. But I don't think that is the case here.

Could anyone help me again?

Many thanks in advance
Jose
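
The function-wrapper workaround Jose describes could look like the following sketch (the function name is hypothetical; table and column names are taken from the statements above). Because the query text is fixed inside the function, the driver's parameter handling no longer influences how it is planned:

```sql
-- Hypothetical wrapper for statement "A"; called from Java as:
--   SELECT * FROM distinct_paths();
CREATE OR REPLACE FUNCTION distinct_paths()
RETURNS TABLE (path text)
LANGUAGE sql STABLE AS
$$
    SELECT DISTINCT left(opf.file_path,
                         length(opf.file_path) - position('/' in reverse(opf.file_path)))
    FROM product AS op
    JOIN product_file AS opf ON opf.product_id = op.id
    WHERE op.proprietary_end_date <= CURRENT_DATE
      AND op.id LIKE 'urn:esa:psa:%';
$$;
```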


Re: Performance problems with Postgres JDBC 42.4.2

From
Dave Cramer
Date:


On Mon, 6 Nov 2023 at 09:59, Jose Osinde <jose.osinde@gmail.com> wrote:


Can you share your Java code?

If you are using a PreparedStatement, the driver will use the extended protocol, which may be slower. Statements use the simple query protocol, which is faster and more like pgAdmin.


Dave
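
Dave's suggestion can be tried two ways: switch to a plain Statement, or keep PreparedStatement but ask pgJDBC for the simple query protocol via the preferQueryMode connection property. A sketch (URL/credentials are placeholders; needs a live server):

```java
// Sketch: making pgJDBC behave more like psql/pgAdmin.
// URL/credentials are hypothetical; requires a running PostgreSQL server.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SimpleProtocolDemo {
    public static void main(String[] args) throws Exception {
        // preferQueryMode=simple tells the driver to use the simple query
        // protocol even for PreparedStatement, so no server-side prepare
        // (and hence no generic plan) is involved.
        String url = "jdbc:postgresql://localhost:5432/mydb?preferQueryMode=simple";
        try (Connection conn = DriverManager.getConnection(url, "user", "secret");
             Statement st = conn.createStatement();   // plain Statement: no PREPARE
             ResultSet rs = st.executeQuery("SELECT now()")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```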

RE: [EXTERNAL] Performance down with JDBC 42

From
"Abraham, Danny"
Date:
Hi guys,
Thanks for the help.
I was able to recreate the problem, on the same DB, with psql only. No JDBC.

A plain run of a complicated query:               50ms
A prepare and then execute of the same query:   2500ms

The plans are different, as discussed above. The fast one is using Materialize and Memoize.

Thanks

Danny