Обсуждение: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4
Hi everyone - I have a slow query issue in an app I'm working on. I'm unfortunately not at liberty to share the query/schema details, but I've put together a very similar reproduction of the issue: ----- CREATE TABLE a (id integer primary key, col integer); CREATE TABLE b (id integer primary key, col integer); CREATE TABLE c (id integer primary key, col integer); CREATE TABLE d (id integer primary key, col integer); CREATE TABLE e (id integer primary key, col integer); INSERT INTO a (id, col) SELECT i, floor(random() * 100000) FROM generate_series(1, 100000, 2) i; INSERT INTO b (id, col) SELECT i, floor(random() * 100000) FROM generate_series(1, 100000, 2) i; INSERT INTO c (id, col) SELECT i, floor(random() * 100000) FROM generate_series(2, 100000, 2) i; INSERT INTO d (id, col) SELECT i, floor(random() * 100000) FROM generate_series(2, 100000, 2) i; INSERT INTO e (id, col) SELECT i, floor(random() * 100000) FROM generate_series(1, 100000, 1) i; ANALYZE; CREATE VIEW tables AS SELECT a.*, b.col AS other_col FROM a LEFT JOIN b ON a.id = b.id UNION ALL SELECT c.*, d.col AS other_col FROM c LEFT JOIN d ON c.id = d.id; EXPLAIN ANALYZE SELECT * FROM tables WHERE id = 89; -- Index scans, as expected. EXPLAIN ANALYZE SELECT * FROM e JOIN tables ON e.col = tables.id WHERE e.id = 568; -- Big merge joins, when simple index scans should be possible? ----- Would this be considered a deficiency in the optimizer? Is there a simple fix? Thanks! Chris
Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4
От
Igor Neyman
Дата:
> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Chris Hanks > Sent: Thursday, July 10, 2014 5:02 AM > To: PostgreSQL General > Subject: [GENERAL] Joining on a view containing a UNION ALL produces a > suboptimal plan on 9.3.4 > > Hi everyone - > > I have a slow query issue in an app I'm working on. I'm unfortunately not at > liberty to share the query/schema details, but I've put together a very similar > reproduction of the issue: > > ----- > > CREATE TABLE a (id integer primary key, col integer); CREATE TABLE b (id > integer primary key, col integer); CREATE TABLE c (id integer primary key, col > integer); CREATE TABLE d (id integer primary key, col integer); CREATE TABLE > e (id integer primary key, col integer); > > INSERT INTO a (id, col) SELECT i, floor(random() * 100000) FROM > generate_series(1, 100000, 2) i; INSERT INTO b (id, col) SELECT i, > floor(random() * 100000) FROM generate_series(1, 100000, 2) i; INSERT INTO > c (id, col) SELECT i, floor(random() * 100000) FROM generate_series(2, > 100000, 2) i; INSERT INTO d (id, col) SELECT i, floor(random() * 100000) FROM > generate_series(2, 100000, 2) i; INSERT INTO e (id, col) SELECT i, > floor(random() * 100000) FROM generate_series(1, 100000, 1) i; > > ANALYZE; > > CREATE VIEW tables AS > SELECT a.*, b.col AS other_col > FROM a > LEFT JOIN b ON a.id = b.id > UNION ALL > SELECT c.*, d.col AS other_col > FROM c > LEFT JOIN d ON c.id = d.id; > > EXPLAIN ANALYZE > SELECT * > FROM tables > WHERE id = 89; -- Index scans, as expected. > > EXPLAIN ANALYZE > SELECT * > FROM e > JOIN tables ON e.col = tables.id > WHERE e.id = 568; -- Big merge joins, when simple index scans should be > possible? > > ----- > > Would this be considered a deficiency in the optimizer? Is there a simple fix? > > Thanks! > Chris > Chris, " JOIN tables ON e.col = tables.id" - is this a typo? Shouldn't it be " JOIN tables ON e.id = tables.id" ? Or, you need it the way it is? Regards, Igor Neyman
Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4
От
Chris Hanks
Дата:
I need it the way it is. It's a foreign key in the actual query. Thanks! On Thu, Jul 10, 2014 at 8:31 AM, Igor Neyman <ineyman@perceptron.com> wrote: >> -----Original Message----- >> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- >> owner@postgresql.org] On Behalf Of Chris Hanks >> Sent: Thursday, July 10, 2014 5:02 AM >> To: PostgreSQL General >> Subject: [GENERAL] Joining on a view containing a UNION ALL produces a >> suboptimal plan on 9.3.4 >> >> Hi everyone - >> >> I have a slow query issue in an app I'm working on. I'm unfortunately not at >> liberty to share the query/schema details, but I've put together a very similar >> reproduction of the issue: >> >> ----- >> >> CREATE TABLE a (id integer primary key, col integer); CREATE TABLE b (id >> integer primary key, col integer); CREATE TABLE c (id integer primary key, col >> integer); CREATE TABLE d (id integer primary key, col integer); CREATE TABLE >> e (id integer primary key, col integer); >> >> INSERT INTO a (id, col) SELECT i, floor(random() * 100000) FROM >> generate_series(1, 100000, 2) i; INSERT INTO b (id, col) SELECT i, >> floor(random() * 100000) FROM generate_series(1, 100000, 2) i; INSERT INTO >> c (id, col) SELECT i, floor(random() * 100000) FROM generate_series(2, >> 100000, 2) i; INSERT INTO d (id, col) SELECT i, floor(random() * 100000) FROM >> generate_series(2, 100000, 2) i; INSERT INTO e (id, col) SELECT i, >> floor(random() * 100000) FROM generate_series(1, 100000, 1) i; >> >> ANALYZE; >> >> CREATE VIEW tables AS >> SELECT a.*, b.col AS other_col >> FROM a >> LEFT JOIN b ON a.id = b.id >> UNION ALL >> SELECT c.*, d.col AS other_col >> FROM c >> LEFT JOIN d ON c.id = d.id; >> >> EXPLAIN ANALYZE >> SELECT * >> FROM tables >> WHERE id = 89; -- Index scans, as expected. >> >> EXPLAIN ANALYZE >> SELECT * >> FROM e >> JOIN tables ON e.col = tables.id >> WHERE e.id = 568; -- Big merge joins, when simple index scans should be >> possible? >> >> ----- >> >> Would this be considered a deficiency in the optimizer? Is there a simple fix? >> >> Thanks! >> Chris >> > > Chris, > > " JOIN tables ON e.col = tables.id" - is this a typo? > > Shouldn't it be " JOIN tables ON e.id = tables.id" ? > > Or, you need it the way it is? > > Regards, > Igor Neyman > >
Chris Hanks <christopher.m.hanks@gmail.com> writes: > CREATE VIEW tables AS > SELECT a.*, b.col AS other_col > FROM a > LEFT JOIN b ON a.id = b.id > UNION ALL > SELECT c.*, d.col AS other_col > FROM c > LEFT JOIN d ON c.id = d.id; > EXPLAIN ANALYZE > SELECT * > FROM tables > WHERE id = 89; -- Index scans, as expected. > EXPLAIN ANALYZE > SELECT * > FROM e > JOIN tables ON e.col = tables.id > WHERE e.id = 568; -- Big merge joins, when simple index scans should > be possible? > Would this be considered a deficiency in the optimizer? Is there a simple fix? Don't hold your breath. To arrive at the union-on-the-inside-of-a-nestloop plan you're hoping for, the planner would have to create a "parameterized path" for the UNION ALL structure. But when you have joins in the arms of the UNION ALL, they are considered to be independent subqueries, and we currently have a policy decision not to try to generate parameterized paths for subqueries. It'd be quite expensive and I think the planner is probably lacking some necessary mechanisms anyway. Given that e.id is unique, you could possibly fake it with something like select * from tables where id = (select e.col from e where e.id = 568); regards, tom lane
Re: Joining on a view containing a UNION ALL produces a suboptimal plan on 9.3.4
От
Chris Hanks
Дата:
That did the trick! Thanks, Tom! On Thu, Jul 10, 2014 at 8:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Chris Hanks <christopher.m.hanks@gmail.com> writes: >> CREATE VIEW tables AS >> SELECT a.*, b.col AS other_col >> FROM a >> LEFT JOIN b ON a.id = b.id >> UNION ALL >> SELECT c.*, d.col AS other_col >> FROM c >> LEFT JOIN d ON c.id = d.id; > >> EXPLAIN ANALYZE >> SELECT * >> FROM tables >> WHERE id = 89; -- Index scans, as expected. > >> EXPLAIN ANALYZE >> SELECT * >> FROM e >> JOIN tables ON e.col = tables.id >> WHERE e.id = 568; -- Big merge joins, when simple index scans should >> be possible? > >> Would this be considered a deficiency in the optimizer? Is there a simple fix? > > Don't hold your breath. To arrive at the > union-on-the-inside-of-a-nestloop plan you're hoping for, the planner > would have to create a "parameterized path" for the UNION ALL structure. > But when you have joins in the arms of the UNION ALL, they are considered > to be independent subqueries, and we currently have a policy decision not > to try to generate parameterized paths for subqueries. It'd be quite > expensive and I think the planner is probably lacking some necessary > mechanisms anyway. > > Given that e.id is unique, you could possibly fake it with something like > > select * from tables where id = (select e.col from e where e.id = 568); > > regards, tom lane