Обсуждение: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4

Поиск
Список
Период
Сортировка

Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4

От
Chris Hanks
Дата:
Hi everyone -

I have a slow query issue in an app I'm working on. I'm unfortunately
not at liberty to share the query/schema details, but I've put
together a very similar reproduction of the issue:

-----

CREATE TABLE a (id integer primary key, col integer);
CREATE TABLE b (id integer primary key, col integer);
CREATE TABLE c (id integer primary key, col integer);
CREATE TABLE d (id integer primary key, col integer);
CREATE TABLE e (id integer primary key, col integer);

INSERT INTO a (id, col) SELECT i, floor(random() * 100000) FROM
generate_series(1, 100000, 2) i;
INSERT INTO b (id, col) SELECT i, floor(random() * 100000) FROM
generate_series(1, 100000, 2) i;
INSERT INTO c (id, col) SELECT i, floor(random() * 100000) FROM
generate_series(2, 100000, 2) i;
INSERT INTO d (id, col) SELECT i, floor(random() * 100000) FROM
generate_series(2, 100000, 2) i;
INSERT INTO e (id, col) SELECT i, floor(random() * 100000) FROM
generate_series(1, 100000, 1) i;

ANALYZE;

CREATE VIEW tables AS
  SELECT a.*, b.col AS other_col
  FROM a
  LEFT JOIN b ON a.id = b.id
  UNION ALL
  SELECT c.*, d.col AS other_col
  FROM c
  LEFT JOIN d ON c.id = d.id;

EXPLAIN ANALYZE
SELECT *
FROM tables
WHERE id = 89; -- Index scans, as expected.

EXPLAIN ANALYZE
SELECT *
FROM e
JOIN tables ON e.col = tables.id
WHERE e.id = 568; -- Big merge joins, when simple index scans should
be possible?

-----

Would this be considered a deficiency in the optimizer? Is there a simple fix?

Thanks!
Chris


Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4

От
Igor Neyman
Дата:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Chris Hanks
> Sent: Thursday, July 10, 2014 5:02 AM
> To: PostgreSQL General
> Subject: [GENERAL] Joining on a view containing a UNION ALL produces a
> suboptimal plan on 9.3.4
> 
> Hi everyone -
> 
> I have a slow query issue in an app I'm working on. I'm unfortunately not at
> liberty to share the query/schema details, but I've put together a very similar
> reproduction of the issue:
> 
> -----
> 
> CREATE TABLE a (id integer primary key, col integer); CREATE TABLE b (id
> integer primary key, col integer); CREATE TABLE c (id integer primary key, col
> integer); CREATE TABLE d (id integer primary key, col integer); CREATE TABLE
> e (id integer primary key, col integer);
> 
> INSERT INTO a (id, col) SELECT i, floor(random() * 100000) FROM
> generate_series(1, 100000, 2) i; INSERT INTO b (id, col) SELECT i,
> floor(random() * 100000) FROM generate_series(1, 100000, 2) i; INSERT INTO
> c (id, col) SELECT i, floor(random() * 100000) FROM generate_series(2,
> 100000, 2) i; INSERT INTO d (id, col) SELECT i, floor(random() * 100000) FROM
> generate_series(2, 100000, 2) i; INSERT INTO e (id, col) SELECT i,
> floor(random() * 100000) FROM generate_series(1, 100000, 1) i;
> 
> ANALYZE;
> 
> CREATE VIEW tables AS
>   SELECT a.*, b.col AS other_col
>   FROM a
>   LEFT JOIN b ON a.id = b.id
>   UNION ALL
>   SELECT c.*, d.col AS other_col
>   FROM c
>   LEFT JOIN d ON c.id = d.id;
> 
> EXPLAIN ANALYZE
> SELECT *
> FROM tables
> WHERE id = 89; -- Index scans, as expected.
> 
> EXPLAIN ANALYZE
> SELECT *
> FROM e
> JOIN tables ON e.col = tables.id
> WHERE e.id = 568; -- Big merge joins, when simple index scans should be
> possible?
> 
> -----
> 
> Would this be considered a deficiency in the optimizer? Is there a simple fix?
> 
> Thanks!
> Chris
> 

Chris,

" JOIN tables ON e.col = tables.id"  - is this a typo?

Shouldn't it be " JOIN tables ON e.id = tables.id" ?

Or, you need it the way it is?

Regards,
Igor Neyman



Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4

От
Chris Hanks
Дата:
I need it the way it is. It's a foreign key in the actual query.

Thanks!

On Thu, Jul 10, 2014 at 8:31 AM, Igor Neyman <ineyman@perceptron.com> wrote:
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
>> owner@postgresql.org] On Behalf Of Chris Hanks
>> Sent: Thursday, July 10, 2014 5:02 AM
>> To: PostgreSQL General
>> Subject: [GENERAL] Joining on a view containing a UNION ALL produces a
>> suboptimal plan on 9.3.4
>>
>> Hi everyone -
>>
>> I have a slow query issue in an app I'm working on. I'm unfortunately not at
>> liberty to share the query/schema details, but I've put together a very similar
>> reproduction of the issue:
>>
>> -----
>>
>> CREATE TABLE a (id integer primary key, col integer); CREATE TABLE b (id
>> integer primary key, col integer); CREATE TABLE c (id integer primary key, col
>> integer); CREATE TABLE d (id integer primary key, col integer); CREATE TABLE
>> e (id integer primary key, col integer);
>>
>> INSERT INTO a (id, col) SELECT i, floor(random() * 100000) FROM
>> generate_series(1, 100000, 2) i; INSERT INTO b (id, col) SELECT i,
>> floor(random() * 100000) FROM generate_series(1, 100000, 2) i; INSERT INTO
>> c (id, col) SELECT i, floor(random() * 100000) FROM generate_series(2,
>> 100000, 2) i; INSERT INTO d (id, col) SELECT i, floor(random() * 100000) FROM
>> generate_series(2, 100000, 2) i; INSERT INTO e (id, col) SELECT i,
>> floor(random() * 100000) FROM generate_series(1, 100000, 1) i;
>>
>> ANALYZE;
>>
>> CREATE VIEW tables AS
>>   SELECT a.*, b.col AS other_col
>>   FROM a
>>   LEFT JOIN b ON a.id = b.id
>>   UNION ALL
>>   SELECT c.*, d.col AS other_col
>>   FROM c
>>   LEFT JOIN d ON c.id = d.id;
>>
>> EXPLAIN ANALYZE
>> SELECT *
>> FROM tables
>> WHERE id = 89; -- Index scans, as expected.
>>
>> EXPLAIN ANALYZE
>> SELECT *
>> FROM e
>> JOIN tables ON e.col = tables.id
>> WHERE e.id = 568; -- Big merge joins, when simple index scans should be
>> possible?
>>
>> -----
>>
>> Would this be considered a deficiency in the optimizer? Is there a simple fix?
>>
>> Thanks!
>> Chris
>>
>
> Chris,
>
> " JOIN tables ON e.col = tables.id"  - is this a typo?
>
> Shouldn't it be " JOIN tables ON e.id = tables.id" ?
>
> Or, you need it the way it is?
>
> Regards,
> Igor Neyman
>
>


Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4

От
Tom Lane
Дата:
Chris Hanks <christopher.m.hanks@gmail.com> writes:
> CREATE VIEW tables AS
>   SELECT a.*, b.col AS other_col
>   FROM a
>   LEFT JOIN b ON a.id = b.id
>   UNION ALL
>   SELECT c.*, d.col AS other_col
>   FROM c
>   LEFT JOIN d ON c.id = d.id;

> EXPLAIN ANALYZE
> SELECT *
> FROM tables
> WHERE id = 89; -- Index scans, as expected.

> EXPLAIN ANALYZE
> SELECT *
> FROM e
> JOIN tables ON e.col = tables.id
> WHERE e.id = 568; -- Big merge joins, when simple index scans should
> be possible?

> Would this be considered a deficiency in the optimizer? Is there a simple fix?

Don't hold your breath.  To arrive at the
union-on-the-inside-of-a-nestloop plan you're hoping for, the planner
would have to create a "parameterized path" for the UNION ALL structure.
But when you have joins in the arms of the UNION ALL, they are considered
to be independent subqueries, and we currently have a policy decision not
to try to generate parameterized paths for subqueries.  It'd be quite
expensive and I think the planner is probably lacking some necessary
mechanisms anyway.

Given that e.id is unique, you could possibly fake it with something like

select * from tables where id = (select e.col from e where e.id = 568);

            regards, tom lane


Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4

От
Chris Hanks
Дата:
That did the trick! Thanks, Tom!

On Thu, Jul 10, 2014 at 8:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Chris Hanks <christopher.m.hanks@gmail.com> writes:
>> CREATE VIEW tables AS
>>   SELECT a.*, b.col AS other_col
>>   FROM a
>>   LEFT JOIN b ON a.id = b.id
>>   UNION ALL
>>   SELECT c.*, d.col AS other_col
>>   FROM c
>>   LEFT JOIN d ON c.id = d.id;
>
>> EXPLAIN ANALYZE
>> SELECT *
>> FROM tables
>> WHERE id = 89; -- Index scans, as expected.
>
>> EXPLAIN ANALYZE
>> SELECT *
>> FROM e
>> JOIN tables ON e.col = tables.id
>> WHERE e.id = 568; -- Big merge joins, when simple index scans should
>> be possible?
>
>> Would this be considered a deficiency in the optimizer? Is there a simple fix?
>
> Don't hold your breath.  To arrive at the
> union-on-the-inside-of-a-nestloop plan you're hoping for, the planner
> would have to create a "parameterized path" for the UNION ALL structure.
> But when you have joins in the arms of the UNION ALL, they are considered
> to be independent subqueries, and we currently have a policy decision not
> to try to generate parameterized paths for subqueries.  It'd be quite
> expensive and I think the planner is probably lacking some necessary
> mechanisms anyway.
>
> Given that e.id is unique, you could possibly fake it with something like
>
> select * from tables where id = (select e.col from e where e.id = 568);
>
>                         regards, tom lane