Re: Inconsistent query performance based on relation hit frequency

Поиск
Список
Период
Сортировка
От Laura Hausmann
Тема Re: Inconsistent query performance based on relation hit frequency
Дата
Msg-id CAKGAs9mVM=Qmd4ViNOJ9dk983ez-LEow4zrJpZaDBy1cxReaUA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Inconsistent query performance based on relation hit frequency  (Andrei Lepikhov <lepihov@gmail.com>)
Ответы Re: Inconsistent query performance based on relation hit frequency
Список pgsql-performance
Heya & thank you for the response!

That makes a lot of sense. I'm glad to hear it's on the radar of the team, but I understand that this is a complex task and won't happen anytime soon.

For the meantime, I've tried a couple ways of rewriting the query, sadly none of which seem to translate to the production database:

Simply dropping the or/union clause (and adding a relationship to the user themselves) fixes the problem in the test database (both user 1 (https://explain.depesz.com/s/ZY8l) and user 4 (https://explain.depesz.com/s/Q2Wk) run in 1~15ms, which isn't perfect but good enough), but not the production one (still fast for high frequency (https://explain.depesz.com/s/DixF) and slow for low frequency (https://explain.depesz.com/s/fIKm) users).

I also tried rewriting it as a join (https://explain.depesz.com/s/36Ve), but that also didn't seem to have an effect.

It's very possible I missed one or multiple ways the query could be rewritten in.

I'm sadly not sure how I could generate a test dataset that more closely resembles the production workload. In case that would be helpful in debugging this further, any tips on that would be greatly appreciated.

Thanks in advance,
Laura Hausmann


On Thu, Jun 27, 2024 at 12:31 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 6/27/24 07:50, Laura Hausmann wrote:
> I'd appreciate any and all input on the situation. If I've left out any
> information that would be useful in figuring this out, please tell me.
Thanks for this curious case, I like it!
At first, you can try to avoid "OR" expressions - PostgreSQL has quite
limited set of optimisation/prediction tricks on such expressions.
Second - I see, postgres predicts wrong number of tuples. But using my
typical tool [1] and getting more precise estimations i don't see
significant profit:

  Limit  (cost=10832.85..10838.69 rows=50 width=21)
    ->  Gather Merge  (cost=10832.85..10838.92 rows=52 width=21)
          Workers Planned: 2
          Workers Launched: 2
          ->  Sort  (cost=9832.83..9832.90 rows=26 width=21)
                Sort Key: objects.id DESC
                Sort Method: top-N heapsort  Memory: 32kB
                Worker 0:  Sort Method: quicksort  Memory: 32kB
                Worker 1:  Sort Method: quicksort  Memory: 32kB
                ->  Parallel Seq Scan on objects
                      Filter: ((hashed SubPlan 1) OR ("userId" = 1))
                      Rows Removed by Filter: 183372
                      SubPlan 1
                        ->  Nested Loop
                              ->  Index Only Scan using users_pkey on
                                    Index Cond: (id = 1)
                                    Heap Fetches: 0
                              ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships
                                    Index Cond: ("followerId" = 1)
                                    Heap Fetches: 0
  Planning Time: 0.762 ms
  Execution Time: 43.816 ms

  Limit  (cost=10818.83..10819.07 rows=2 width=21)
    ->  Gather Merge  (cost=10818.83..10819.07 rows=2 width=21)
          Workers Planned: 2
          Workers Launched: 2
          ->  Sort  (cost=9818.81..9818.81 rows=1 width=21)
                Sort Key: objects.id DESC
                Sort Method: quicksort  Memory: 25kB
                Worker 0:  Sort Method: quicksort  Memory: 25kB
                Worker 1:  Sort Method: quicksort  Memory: 25kB
                ->  Parallel Seq Scan on objects
                      Filter: ((hashed SubPlan 1) OR ("userId" = 4))
                      Rows Removed by Filter: 183477
                      SubPlan 1
                        ->  Nested Loop  (cost=0.56..8.61 rows=1 width=4)
                              ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships
                                    Index Cond: ("followerId" = 4)
                                    Heap Fetches: 0
                              ->  Index Only Scan using users_pkey
                                    Index Cond: (id = 4)
                                    Heap Fetches: 0
  Planning Time: 0.646 ms
  Execution Time: 30.824 ms

But this was achieved just because of parallel workers utilisation.
Disabling them we get:

  Limit  (cost=14635.07..14635.08 rows=2 width=21) (actual
time=75.941..75.943 rows=0 loops=1)
    ->  Sort  (cost=14635.07..14635.08 rows=2 width=21) (actual
time=75.939..75.940 rows=0 loops=1)
          Sort Key: objects.id DESC
          Sort Method: quicksort  Memory: 25kB
          ->  Seq Scan on objects  (cost=8.61..14635.06 rows=2 width=21)
(actual time=75.931..75.932 rows=0 loops=1)
                Filter: ((hashed SubPlan 1) OR ("userId" = 4))
                Rows Removed by Filter: 550430
                SubPlan 1
                  ->  Nested Loop  (cost=0.56..8.61 rows=1 width=4)
(actual time=0.039..0.040 rows=0 loops=1)
                        ->  Index Only Scan using
"relationships_followerId_followeeId_idx" on relationships
(cost=0.28..4.29 rows=1 width=8) (actual time=0.038..0.038 rows=0 loops=1)
                              Index Cond: ("followerId" = 4)
                              Heap Fetches: 0
                        ->  Index Only Scan using users_pkey on users
(cost=0.29..4.31 rows=1 width=4) (never executed)
                              Index Cond: (id = 4)
                              Heap Fetches: 0
  Planning Time: 0.945 ms
  Execution Time: 76.123 ms

So, from the optimiser's point of view, it has done the best it could.
Theoretically, if you have a big table with indexes and must select a
small number of tuples, the ideal query plan will include parameterised
NestLoop JOINs. Unfortunately, parameterisation in PostgreSQL can't pass
inside a subquery. It could be a reason for new development because
MSSQL can do such a trick, but it is a long way.
You can try to rewrite your schema and query to avoid subqueries in
expressions at all.
I hope this message gave you some insights.

[1] https://github.com/postgrespro/aqo

--
regards, Andrei Lepikhov

В списке pgsql-performance по дате отправления:

Предыдущее
От: Achilleas Mantzios - cloud
Дата:
Сообщение: Re: Inconsistent query performance based on relation hit frequency
Следующее
От: Achilleas Mantzios
Дата:
Сообщение: Re: Inconsistent query performance based on relation hit frequency