Thread: Distinct performance dropped by multiple times in v16


Distinct performance dropped by multiple times in v16

From: Vitaliy Litovskiy
Date:

Hello,

In our application, after upgrading PostgreSQL from v15 to v16, performance for some queries dropped from 1-2 seconds to over 2 minutes.
After checking out some blog posts about the DISTINCT improvements in v16, I was able to "fix" it by adding ORDER BY clauses in subqueries. Still, this looks more like a workaround to me, and I would like to ask whether it is working as expected now.

I was able to reproduce the issue with the following steps:
1. Start PostgreSQL v16 in Docker:
     docker run --name some-postgres16 -e POSTGRES_PASSWORD=mysecretpassword -d postgres:16
2. Create tables with data.
   - The data originates from the Northwind DB, but the tables were created with a different schema by custom software.
   - Each table has 8k rows.
   - The files _BDN_Terretories__202406100822.sql and _BDN_EmployeeTerritories__202406100822.sql are attached.
3. Execute the following query:

select distinct tbl1."BDN_EmployeeTerritories_ID", tbl2."BDN_Terretories_ID", tbl1."Reference_Date"
from (
    select "BDN_EmployeeTerritories_ID", "Reference_Date", token
    from public."BDN_EmployeeTerritories",
         unnest(string_to_array("EMP_TerretoryID", ';')) s(token)
    --order by token
) tbl1
join (
    select "BDN_Terretories_ID", token
    from public."BDN_Terretories",
         unnest(string_to_array("EMP_TerretoryID", ';')) s(token)
) tbl2 on tbl1.token = tbl2.token

Observations:
1. The query runs for 4-5 seconds on v16 and in less than a second on v15.
2. On v16 it also goes down to less than a second if:
   2.1 DISTINCT is removed;
   2.2 unnest is removed. It is not really needed for this particular data, but this query is autogenerated and unnest makes sense for other data;
   2.3 "order by token" is uncommented. This is my current way of fixing the problem.

I would really appreciate some feedback on whether that is expected behaviour and whether there are better solutions.
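For readers without the attached dumps, the join-and-dedup semantics of the query (split the ';'-delimited column into tokens, join on token, deduplicate the projected triples) can be sketched in plain Python on a couple of invented rows:

```python
# Toy re-implementation of the query's semantics (invented rows; the
# real tables have ~8k rows each).  unnest(string_to_array(col, ';'))
# produces one row per ';'-separated token; the join matches rows that
# share a token, and DISTINCT deduplicates the projected triples.

emp_territories = [  # ("BDN_EmployeeTerritories_ID", "Reference_Date", "EMP_TerretoryID")
    (1, "2024-06-10", "06897;19713"),
    (2, "2024-06-10", "19713"),
]
territories = [  # ("BDN_Terretories_ID", "EMP_TerretoryID")
    (10, "06897"),
    (11, "19713;44122"),
]

def unnest(rows, split_col):
    # replace the ';'-delimited column with one token per output row
    for row in rows:
        for token in row[split_col].split(";"):
            yield row[:split_col] + (token,)

tbl1 = list(unnest(emp_territories, 2))  # (emp_id, ref_date, token)
tbl2 = list(unnest(territories, 1))      # (terr_id, token)

result = sorted({                        # the set plays the role of DISTINCT
    (emp_id, terr_id, ref_date)
    for emp_id, ref_date, t1 in tbl1
    for terr_id, t2 in tbl2
    if t1 == t2
})
print(result)
```
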


--
Best Regards,
Vitalii Lytovskyi
Attachments

Re: Distinct performance dropped by multiple times in v16

From: Greg Sabino Mullane
Date:
On Mon, Jun 10, 2024 at 3:32 AM Vitaliy Litovskiy <vitaliy.litovskiy@gmail.com> wrote:

1. query runs for 4-5 seconds on v16 and less than a second on v15


Yeah, that's a big regression. Seeing the actual EXPLAIN ANALYZE output from both systems would be very helpful to us. It would also be nice to see the plan you get after running
SET enable_incremental_sort = 0;
before the query.

I wonder if materializing things might help, something like

with
  x as (select id, unnest(string_to_array(emp,';')) as token, refdate from terr),
  y as (select id, unnest(string_to_array(emp,';')) as token from employee)
select distinct x.id, y.id, refdate from x join y using (token);

(renamed to avoid all that mixed-case quoting)

Cheers,
Greg

Re: Distinct performance dropped by multiple times in v16

From: Andrei Lepikhov
Date:
On 6/10/24 13:59, Vitaliy Litovskiy wrote:
> ) tbl2 on tbl1.token = tbl2.token
>
> Observations:
> 1. query runs for 4-5 seconds on v16 and less than a second on v15
> 2. in v16 it also goes down to less than a second if
>    2.1 distinct is removed
>    2.2 unnest is removed. it is not really needed for this particular data but this query is autogenerated and unnest makes sense for other data
>    2.3 "order by token" is uncommented, this is my current way of fixing the problem
> I would really appreciate some feedback if that is expected behaviour and if there are better solutions
The reason for this behaviour is simple: commit 3c6fc58 allowed using incremental sort with DISTINCT clauses.
So, in PostgreSQL 16 we have two competing strategies:
1. HashJoin with a HashAggregate at the end;
2. NestLoop, which derives a sort order from the index scan, so the planner can use IncrementalSort + Unique instead of a full sort.

Disabling incremental sort (see explain 2), we get a good plan with HashJoin at the top. There we can see that the HashJoin plan is definitely cheaper by total cost, but it has a big startup cost. The optimiser compares the cost of the incremental sort path and estimates it to be much cheaper than the hash aggregation.

Here we already have a couple of questions:
1. Why does the optimiser overestimate in such a simple situation?
2. Why, for a big number of tuples, does incremental sort come out better than HashAggregate?

Before the next step, let's see what the optimiser decides given correct predictions. I usually use the AQO extension [1] for that. Executing the query twice with AQO in 'learn' mode, we get correct planned-rows numbers in each node of the plan and, as you can see in EXPLAIN 3, the optimiser chooses the good strategy. So, with correct predictions, the optimiser ends up with the optimal plan.
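The effect of such a misestimate can be illustrated with a toy cost model (all numbers invented, not PostgreSQL's real costing): a plan that streams cheap early rows wins on the planner's underestimate, but loses badly at the real row count.

```python
# Toy cost model (all numbers invented): why a wrong row estimate can
# make the planner pick a streaming plan that is terrible at scale.

def plan_cost(startup, per_row, rows):
    # linear model: startup cost plus a per-row run cost
    return startup + per_row * rows

hash_join = {"startup": 500.0, "per_row": 0.01}   # big startup, cheap per row
incr_sort = {"startup": 1.0, "per_row": 0.5}      # streams early, costly per row

planned_rows = 100        # the planner's underestimate
actual_rows = 500_000     # what really flows through the node

est_hash = plan_cost(hash_join["startup"], hash_join["per_row"], planned_rows)
est_incr = plan_cost(incr_sort["startup"], incr_sort["per_row"], planned_rows)
assert est_incr < est_hash   # on the estimate, incremental sort gets picked

real_hash = plan_cost(hash_join["startup"], hash_join["per_row"], actual_rows)
real_incr = plan_cost(incr_sort["startup"], incr_sort["per_row"], actual_rows)
assert real_incr > real_hash  # at the real row count the choice backfires
print(est_incr, est_hash, real_incr, real_hash)
```
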

The origins of the overestimation lie in the internals of unnest; they are not obvious so far and may be discovered later.
The reason why the optimiser likes NestLoop on big data looks enigmatic. Attempting to increase the cost of the unnest routine from 1 to 1E5, we don't see any change in the decision. In my opinion, the key reason here may be triggered by the unusual width of the JOIN result:
Hash Join  (cost=0.24..0.47 rows=10 width=0)
The cost model may assign too low a cost to this join, and that is why the upper NestLoop looks better than the HashJoin.

I don't have any general recommendations to resolve this issue, but this case should be investigated by the core developers.

[1] https://github.com/postgrespro/aqo

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: Distinct performance dropped by multiple times in v16

From: Andrei Lepikhov
Date:
On 6/10/24 13:59, Vitaliy Litovskiy wrote:
> 2.2 unnest is removed. it is not really needed for this particular data but this query is autogenerated and unnest makes sense for other data
>
> 2.3 "order by token" is uncommented, this is my current way of fixing the problem. I would really appreciate some feedback if that is expected behaviour and if there are better solutions
On second thought, I found that the key issue here is the too-cheap cost of cycles of the unnest() routine. unnest's procost is 1, like any other routine's, but it looks like it is a bit more costly than that.
Also, you can tune cpu_operator_cost a bit. Right now it is set to 0.0025 by default. Increasing it to 0.005:
SET cpu_operator_cost = 0.005;
resolves your issue.
I guess the value of cpu_operator_cost is usually not a problem, because table pages cost much more. But here function calls are the main source of load, and because of that they need to be estimated more precisely.
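As a back-of-envelope illustration (a simplified model with assumed per-row call counts, not PostgreSQL's actual costing code), the component of plan cost charged per function call scales linearly with cpu_operator_cost, so doubling the GUC doubles the penalty on function-heavy plans like this one:

```python
# Simplified model of the function-call component of plan cost: each
# call is charged procost * cpu_operator_cost (the call counts below
# are assumptions for illustration).

def function_call_cost(rows, calls_per_row, cpu_operator_cost, procost=1.0):
    return rows * calls_per_row * procost * cpu_operator_cost

rows = 8000          # each table has ~8k rows
calls_per_row = 2    # e.g. string_to_array + unnest (assumption)

default_cost = function_call_cost(rows, calls_per_row, 0.0025)  # default GUC
doubled_cost = function_call_cost(rows, calls_per_row, 0.005)   # tuned GUC

# Doubling cpu_operator_cost doubles this component, which is what can
# flip the planner's choice when function calls dominate the plan.
print(default_cost, doubled_cost)
```
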
I hope this will be helpful for you.

-- 
regards,
Andrei Lepikhov
Postgres Professional