Re: Use virtual tuple slot for Unique node

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Use virtual tuple slot for Unique node
Дата
Msg-id 498cf1d0-ee06-16a7-6322-e2ff77b79bdd@iki.fi
обсуждение исходный текст
Ответ на Re: Use virtual tuple slot for Unique node  (Денис Смирнов <darthunix@gmail.com>)
Ответы Re: Use virtual tuple slot for Unique node  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
I did a little more perf testing with this. I'm seeing the same benefit 
with the query you posted. But can we find a case where it's not 
beneficial? If I understand correctly, when the input slot is a virtual 
slot, it's cheaper to copy it to another virtual slot than to form a 
minimal tuple. Like in your test case. What if the input is a minimial 
tuple?

On master:

postgres=# set enable_hashagg=off;
SET
postgres=# explain analyze select distinct g::text, 'a', 'b', 'c','d', 
'e','f','g','h' from generate_series(1, 5000000) g;
 
QUERY PLAN 


---------------------------------------------------------------------------------------------------------------------------------------------------
  Unique  (cost=2630852.42..2655852.42 rows=200 width=288) (actual 
time=4525.212..6576.992 rows=5000000 loops=1)
    ->  Sort  (cost=2630852.42..2643352.42 rows=5000000 width=288) 
(actual time=4525.211..5960.967 rows=5000000 loops=1)
          Sort Key: ((g)::text)
          Sort Method: external merge  Disk: 165296kB
          ->  Function Scan on generate_series g  (cost=0.00..75000.00 
rows=5000000 width=288) (actual time=518.914..1194.702 rows=5000000 loops=1)
  Planning Time: 0.036 ms
  JIT:
    Functions: 5
    Options: Inlining true, Optimization true, Expressions true, 
Deforming true
    Timing: Generation 0.242 ms (Deform 0.035 ms), Inlining 63.457 ms, 
Optimization 29.764 ms, Emission 20.592 ms, Total 114.056 ms
  Execution Time: 6766.399 ms
(11 rows)


With the patch:

postgres=# set enable_hashagg=off;
SET
postgres=# explain analyze select distinct g::text, 'a', 'b', 'c','d', 
'e','f','g','h' from generate_series(1, 5000000) g;
 
QUERY PLAN 


---------------------------------------------------------------------------------------------------------------------------------------------------
  Unique  (cost=2630852.42..2655852.42 rows=200 width=288) (actual 
time=4563.639..7362.467 rows=5000000 loops=1)
    ->  Sort  (cost=2630852.42..2643352.42 rows=5000000 width=288) 
(actual time=4563.637..6069.000 rows=5000000 loops=1)
          Sort Key: ((g)::text)
          Sort Method: external merge  Disk: 165296kB
          ->  Function Scan on generate_series g  (cost=0.00..75000.00 
rows=5000000 width=288) (actual time=528.060..1191.483 rows=5000000 loops=1)
  Planning Time: 0.720 ms
  JIT:
    Functions: 5
    Options: Inlining true, Optimization true, Expressions true, 
Deforming true
    Timing: Generation 0.406 ms (Deform 0.065 ms), Inlining 68.385 ms, 
Optimization 21.656 ms, Emission 21.033 ms, Total 111.480 ms
  Execution Time: 7585.306 ms
(11 rows)


So not a win in this case. Could you peek at the outer slot type, and 
use the same kind of slot for the Unique's result? Or some more 
complicated logic, like use a virtual slot if all the values are 
pass-by-val? I'd also like to keep this simple, though...

Would this kind of optimization make sense elsewhere?

-- 
Heikki Linnakangas
Neon (https://neon.tech)




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Questions about the new subscription parameter: password_required
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: bug fix and documentation improvement about vacuumdb