Re: Performance improvement for joins where outer side is unique

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Performance improvement for joins where outer side is unique
Дата	14 октября 2015 г. 00:28:31
Msg-id	CAKJS1f-t3s5NN-PxNCWoMviOwOFFeNQMefHcQP28iDEqnSGV1g@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Performance improvement for joins where outer side is unique (Robert Haas <robertmhaas@gmail.com>)
Ответы	Re: Performance improvement for joins where outer side is unique
Список	pgsql-hackers

Дерево обсуждения

On 4 September 2015 at 04:50, Robert Haas <robertmhaas@gmail.com> wrote:

Also: very nice performance results.

Thanks.

On following a thread in [General] [1] it occurred to me that this patch can give a massive improvement on Merge joins where the mark and restore causes an index scan to have to skip over many filtered rows again and again.

I mocked up some tables and some data from the scenario on the [General] thread:

create table customers (id bigint, group_id bigint not null);

insert into customers select x.x,x.x%27724+1 from generate_series(1,473733) x(x);

alter table customers add constraint customer_pkey primary key (id);

create table balances (id bigint, balance int not null, tracking_number int not null, customer_id bigint not null);

insert into balances select x.x, 100, 12345, x.x % 45 + 1 from generate_Series(1,16876) x(x);

create index balance_customer_id_index on balances (customer_id);

create index balances_customer_tracking_number_index on balances (customer_id,tracking_number);

analyze;

Unpatched I get:

test=# explain analyze SELECT ac.* FROM balances ac join customers o ON o.id = ac.customer_id WHERE o.group_id = 45;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------

Merge Join (cost=164.87..1868.70 rows=1 width=24) (actual time=6.110..1491.408 rows=375 loops=1)

Merge Cond: (ac.customer_id = o.id)

-> Index Scan using balance_customer_id_index on balances ac (cost=0.29..881.24 rows=16876 width=24) (actual time=0.009..5.206 rows=16876 loops=1)

-> Index Scan using customer_pkey on customers o (cost=0.42..16062.75 rows=17 width=8) (actual time=0.014..1484.382 rows=376 loops=1)

Filter: (group_id = 45)

Rows Removed by Filter: 10396168

Planning time: 0.207 ms

Execution time: 1491.469 ms

(8 rows)

Patched:

test=# explain analyze SELECT ac.* FROM balances ac join customers o ON o.id = ac.customer_id WHERE o.group_id = 45;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------

Merge Join (cost=164.87..1868.70 rows=1 width=24) (actual time=6.037..11.528 rows=375 loops=1)

Merge Cond: (ac.customer_id = o.id)

-> Index Scan using balance_customer_id_index on balances ac (cost=0.29..881.24 rows=16876 width=24) (actual time=0.009..4.978 rows=16876 loops=1)

-> Index Scan using customer_pkey on customers o (cost=0.42..16062.75 rows=17 width=8) (actual time=0.015..5.141 rows=2 loops=1)

Filter: (group_id = 45)

Rows Removed by Filter: 27766

Planning time: 0.204 ms

Execution time: 11.575 ms

(8 rows)

Now it could well be that the merge join costs need a bit more work to avoid a merge join in this case, but as it stands as of today, this is your performance gain.

Regards

David Rowley

[1] http://www.postgresql.org/message-id/CACZYdDiAxEAM2RkMHBMuhwVcM4txH+5E3HQGgGyzfzbn-PnvFg@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Alvaro Herrera
Дата: 14 октября 2015 г., 00:14:31
Сообщение: Re: Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files

Следующее

От: Robert Haas
Дата: 14 октября 2015 г., 01:00:03
Сообщение: Re: Parallel Seq Scan

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Performance improvement for joins where outer side is unique

Предыдущее

Следующее