Re: [HACKERS] Secondary index access optimizations

Поиск

Список

Период

Сортировка

От	Konstantin Knizhnik
Тема	Re: [HACKERS] Secondary index access optimizations
Дата	22 марта 2018 г. 15:38:33
Msg-id	9d3154bd-5e2d-0e0a-4975-ec76451bae90@postgrespro.ru обсуждение исходный текст
Ответ на	Re: [HACKERS] Secondary index access optimizations (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Ответы	Re: [HACKERS] Secondary index access optimizations (David Rowley <david.rowley@2ndquadrant.com>)
Список	pgsql-hackers

Дерево обсуждения

On 21.03.2018 20:30, Konstantin Knizhnik wrote:

On 01.03.2018 23:15, Andres Freund wrote:
Hi,


This patch seems like quite a good improvement. One thing I've not
really looked at but am slightly concerned in passing: Are there cases
where we now would do a lot of predicate pruning work even though the
overhead of just evaluating the qual is trivial, e.g. because there's
only one row due to a pkey? The predtest code is many things but
lightning fast is not one of them.  Obviously that won't matter for
analytics queries, but I could see it hurt in OLTPish workloads...

Greetings,

Andres Freund
Hi,
I am sorry for delay with answer.

I understand your concern and did the following experiment.
I have initialized the database in the following way:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate series(1,10000), random()*100000);
insert into part2 values (generate_series(10001,20000), random()*100000);
vacuum analyze part1;
vacuum analyze part2;

Vanilla Postgres uses the following plan:

explain select * from base where k between 1 and 20000 and v = 100;
                              QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..16.63 rows=3 width=8)
   -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
         Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))
   -> Index Scan using pi2 on part2 (cost=0.29..8.31 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))
(9 rows)

Execution of this query 100000 times gives is done in 12 seconds.
With applied patch query plan is changed to:

                              QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..16.62 rows=3 width=8)
   -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
         Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   -> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8)
         Index Cond: (v = 100)
   -> Index Scan using pi2 on part2 (cost=0.29..8.30 rows=1 width=8)
         Index Cond: (v = 100)
(7 rows)

Elapsed time of 100000 query executions is 13 seconds.
So you was right that increased query optimization time exceeds advantage of extra checks elimination.
But it is true only if we are not using prepare statements.
With prepared statements results are the following:

Vanilla:       0m3.915s
This patch: 0m3.563s

So improvement is not so large, but it exists.
If somebody wants to repeat my experiments, I attached to this mail small shell script which I used to run query several times.
Non-prepared query is launched using the following command:

time ./repeat-query.sh "select * from base where k between 1 and 20000 and v = 100" 100000

and prepared query:

time ./repeat-query.sh "execute select100" 100000 "prepare select100 as select * from base where k between 1 and 20000 and v = 100"

And once again I want to notice that using prepared statements can increase performance almost 3 times!
As far as using prepared statements is not always possible I want to recall autoprepare patch waiting at the commitfest:

https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b3f3@postgrespro.ru

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Attached please find rebased version of the patch.
Also I do more testing, now using pgbench.
Scripts for initialization of the database and for custom script for pgbench are attached.
Results at my computer are the following:

pgbench options	Vanilla	Patch
-c 1	9208	8289
-c 1 -M prepared	38503	41206
-c 10	39224	34040
-c 10 -M prepared	165465	172874

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Fabien COELHO
Дата: 22 марта 2018 г., 15:32:13
Сообщение: Re: pg_stat_statements HLD for futur developments

Следующее

От: Dean Rasheed
Дата: 22 марта 2018 г., 15:56:11
Сообщение: Re: MCV lists for highly skewed distributions

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [HACKERS] Secondary index access optimizations

Вложения

Предыдущее

Следующее