Обсуждение: PSQL does not remove obvious useless joins

Поиск
Список
Период
Сортировка

PSQL does not remove obvious useless joins

От
"Sfiligoi, Igor"
Дата:

Hello.

 

We have a view that is very generic, and we noticed that PostgreSQL is not very good at removing useless joins, which makes our queries very slow.

We could change our code to avoid the view and write ad-hoc queries to the underlying tables, but would prefer not to, if there is a way around it.

(BTW: We are currently using psql 9.4)

 

Here is a simplified implementation:

# create table a (id int primary key, name varchar(128));

# create table b (id int primary key, name varchar(128));

# create table c (id int primary key, a_id int references a(id), b1_id int references b(id), b2_id int references b(id), b3_id int references b(id));

# create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id, a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from c,  a, b b1, b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and c.b3_id=b3.id;

 

When I try to get just info from tables c and b1:

# select id, b1_name from v

it still does all the joins (see below).

 

I would expect just one join (due to the request of columns from the two tables),

since all joins are on foreign constrains referencing primary keys,

there are no filters on the other tables,
so it is guaranteed that the useless joins will always return exactly one answer.

 

Is there a way to tweak the PostgreSQL optimizer to do the proper join removal during the planning?

Perhaps tweaking somehow either our schema or our queries (while still keeping a generic view)?

 

Thank you,

  Igor Sfiligoi

 

 

 

# explain select id, b1_name from v;

                                   QUERY PLAN                                  

--------------------------------------------------------------------------------

Nested Loop  (cost=1.02..5.45 rows=1 width=6)

   Join Filter: (c.b3_id = b3.id)

   ->  Nested Loop  (cost=1.02..4.32 rows=1 width=10)

         Join Filter: (c.a_id = a.id)

         ->  Nested Loop  (cost=1.02..3.25 rows=1 width=14)

               Join Filter: (c.b2_id = b2.id)

               ->  Hash Join  (cost=1.02..2.12 rows=1 width=18)

                     Hash Cond: (b1.id = c.b1_id)

                     ->  Seq Scan on b b1  (cost=0.00..1.06 rows=6 width=6)

                     ->  Hash  (cost=1.01..1.01 rows=1 width=20)

                           ->  Seq Scan on c  (cost=0.00..1.01 rows=1 width=20)

               ->  Seq Scan on b b2  (cost=0.00..1.06 rows=6 width=4)

         ->  Seq Scan on a  (cost=0.00..1.03 rows=3 width=4)

   ->  Seq Scan on b b3  (cost=0.00..1.06 rows=6 width=4)

(14 rows)

 

PS: The tables were very small in this example, but are quite big in the production environment.

 

Re: PSQL does not remove obvious useless joins

От
Merlin Moncure
Дата:
On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
> Hello.
>
> We have a view that is very generic, and we noticed that PostgreSQL is not
> very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to the
> underlying tables, but would prefer not to, if there is a way around it.
>
> (BTW: We are currently using psql 9.4)
>
> Here is a simplified implementation:
>
> # create table a (id int primary key, name varchar(128));
>
> # create table b (id int primary key, name varchar(128));
>
> # create table c (id int primary key, a_id int references a(id), b1_id int
> references b(id), b2_id int references b(id), b3_id int references b(id));
>
> # create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id, a.name
> a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from c,  a, b b1,
> b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and
> c.b3_id=b3.id;
>
> When I try to get just info from tables c and b1:
>
> # select id, b1_name from v
>
> it still does all the joins (see below).
>
> I would expect just one join (due to the request of columns from the two
> tables),
>
> since all joins are on foreign constrains referencing primary keys,
>
> there are no filters on the other tables,
> so it is guaranteed that the useless joins will always return exactly one
> answer.

I think what you're asking for is a lot more complex than it sounds,
and incorrect.  The precise state of the data influences how many
records come back (in this case, either 1 or 0), for example if b3_id
is null you get zero rows.  More to the point, you *instructed* the
server to make the join.  There are strategies to make joins
'optional' at run time with respect to a query, but they are more
complicated than simply withdrawing columns from the select list.

Stepping back a bit, the query needs to be planned before peeking at
the data in the tables.  The planner is able to make assumptions
against a statistical picture of the data but shouldn't be expected to
actually inspect precise result data in order to generate a better
plan.

merlin


Re: PSQL does not remove obvious useless joins

От
Kevin Grittner
Дата:
On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:

> We have a view that is very generic, and we noticed that PostgreSQL is not
> very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to the
> underlying tables, but would prefer not to, if there is a way around it.

If it did not do the joins it could not provide the information you
are actually asking to see.  Of course, there is a very good chance
that what you are asking to see is not what you *want* to see.

test=# create table a (id int primary key, name varchar(128));
CREATE TABLE
test=# create table b (id int primary key, name varchar(128));
CREATE TABLE
test=# create table c (id int primary key,
test(#                 a_id int references a(id),
test(#                 b1_id int references b(id),
test(#                 b2_id int references b(id),
test(#                 b3_id int references b(id));
CREATE TABLE
test=#
test=# create view v_broken as
test-#   select
test-#        c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-#        b1.name b1_name, b2.name b2_name, b3.name b3_name
test-#     from c, a, b b1, b b2, b b3
test-#     where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and
c.b3_id=b3.id;
CREATE VIEW
test=#
test=# create view v as
test-#   select
test-#       c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-#       b1.name b1_name, b2.name b2_name, b3.name b3_name
test-#   from c
test-#   left join a    on a.id = c.a_id
test-#   left join b b1 on b1.id = c.b1_id
test-#   left join b b2 on b2.id = c.b2_id
test-#   left join b b3 on b3.id = c.b3_id;
CREATE VIEW
test=#
test=# insert into a values (1, 'a1');
INSERT 0 1
test=# insert into b values (1, 'b1'), (2, 'b2'), (3, 'b3');
INSERT 0 3
test=# insert into c values (1, 1, 1, 2, 3), (2, 1, 1, 2, null);
INSERT 0 2
test=#
test=# select id, b1_name from v_broken;
 id | b1_name
----+---------
  1 | b1
(1 row)

test=# explain analyze select id, b1_name from v_broken;
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=63.40..183.90 rows=1700 width=278) (actual
time=0.049..0.052 rows=1 loops=1)
   Hash Cond: (c.b3_id = b3.id)
   ->  Hash Join  (cost=47.55..144.68 rows=1700 width=282) (actual
time=0.030..0.033 rows=2 loops=1)
         Hash Cond: (c.b2_id = b2.id)
         ->  Hash Join  (cost=31.70..105.45 rows=1700 width=286)
(actual time=0.018..0.020 rows=2 loops=1)
               Hash Cond: (c.b1_id = b1.id)
               ->  Hash Join  (cost=15.85..66.22 rows=1700 width=16)
(actual time=0.010..0.012 rows=2 loops=1)
                     Hash Cond: (c.a_id = a.id)
                     ->  Seq Scan on c  (cost=0.00..27.00 rows=1700
width=20) (actual time=0.001..0.001 rows=2 loops=1)
                     ->  Hash  (cost=12.60..12.60 rows=260 width=4)
(actual time=0.003..0.003 rows=1 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on a  (cost=0.00..12.60
rows=260 width=4) (actual time=0.002..0.003 rows=1 loops=1)
               ->  Hash  (cost=12.60..12.60 rows=260 width=278)
(actual time=0.005..0.005 rows=3 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 9kB
                     ->  Seq Scan on b b1  (cost=0.00..12.60 rows=260
width=278) (actual time=0.002..0.003 rows=3 loops=1)
         ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual
time=0.006..0.006 rows=3 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on b b2  (cost=0.00..12.60 rows=260
width=4) (actual time=0.004..0.004 rows=3 loops=1)
   ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual
time=0.013..0.013 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b b3  (cost=0.00..12.60 rows=260 width=4)
(actual time=0.008..0.009 rows=3 loops=1)
 Planning time: 0.729 ms
 Execution time: 0.153 ms
(23 rows)

test=# select id, b1_name from v;
 id | b1_name
----+---------
  1 | b1
  2 | b1
(2 rows)

test=# explain analyze select id, b1_name from v;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=15.85..66.22 rows=1700 width=278) (actual
time=0.017..0.018 rows=2 loops=1)
   Hash Cond: (c.b1_id = b1.id)
   ->  Seq Scan on c  (cost=0.00..27.00 rows=1700 width=20) (actual
time=0.005..0.005 rows=2 loops=1)
   ->  Hash  (cost=12.60..12.60 rows=260 width=278) (actual
time=0.006..0.006 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b b1  (cost=0.00..12.60 rows=260 width=278)
(actual time=0.002..0.003 rows=3 loops=1)
 Planning time: 0.177 ms
 Execution time: 0.044 ms
(8 rows)

Note the difference in results using inner joins versus left outer joins.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
"Sfiligoi, Igor"
Дата:
Sorry... the example was incomplete.

All the fields are defined as not-null.
So it is guaranteed to always match the join.

And PostgreSQL release notes claim that PGSQL can do at least partial join removal:
https://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.0#Join_Removal 

I was hoping this use case would fit in.

Any suggestions?

Igor

-----Original Message-----
From: Merlin Moncure [mailto:mmoncure@gmail.com] 
Sent: Friday, July 01, 2016 12:42 PM
To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>
Cc: pgsql-general@postgresql.org
Subject: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
> Hello.
>
> We have a view that is very generic, and we noticed that PostgreSQL is 
> not very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to 
> the underlying tables, but would prefer not to, if there is a way around it.
>
> (BTW: We are currently using psql 9.4)
>
> Here is a simplified implementation:
>
> # create table a (id int primary key, name varchar(128));
>
> # create table b (id int primary key, name varchar(128));
>
> # create table c (id int primary key, a_id int references a(id), b1_id 
> int references b(id), b2_id int references b(id), b3_id int references 
> b(id));
>
> # create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id, 
> a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from 
> c,  a, b b1, b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and 
> c.b2_id=b2.id and c.b3_id=b3.id;
>
> When I try to get just info from tables c and b1:
>
> # select id, b1_name from v
>
> it still does all the joins (see below).
>
> I would expect just one join (due to the request of columns from the 
> two tables),
>
> since all joins are on foreign constrains referencing primary keys,
>
> there are no filters on the other tables, so it is guaranteed that the 
> useless joins will always return exactly one answer.

I think what you're asking for is a lot more complex than it sounds, and incorrect.  The precise state of the data
influenceshow many records come back (in this case, either 1 or 0), for example if b3_id is null you get zero rows.
Moreto the point, you *instructed* the server to make the join.  There are strategies to make joins 'optional' at run
timewith respect to a query, but they are more complicated than simply withdrawing columns from the select list.
 

Stepping back a bit, the query needs to be planned before peeking at the data in the tables.  The planner is able to
makeassumptions against a statistical picture of the data but shouldn't be expected to actually inspect precise result
datain order to generate a better plan.
 

merlin


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
"Sfiligoi, Igor"
Дата:
No, I don't want to use LEFT JOINS.
I want to use regular joins.

But (as mentioned in my other follow-up), all the fields are not null (was not in the original email, sorry), and are
foreignkeys, so it is guaranteed to always match.
 

The key part (in my mind) is that I am not filtering on any of the useless tables, and I am not returning any columns
fromthose tables either.
 
Both is known at planning time.

Or is my logic still broken?

Thanks,
 Igor

-----Original Message-----
From: Kevin Grittner [mailto:kgrittn@gmail.com] 
Sent: Friday, July 01, 2016 1:29 PM
To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>
Cc: pgsql-general@postgresql.org
Subject: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:

> We have a view that is very generic, and we noticed that PostgreSQL is 
> not very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to 
> the underlying tables, but would prefer not to, if there is a way around it.

If it did not do the joins it could not provide the information you are actually asking to see.  Of course, there is a
verygood chance that what you are asking to see is not what you *want* to see.
 

test=# create table a (id int primary key, name varchar(128)); CREATE TABLE test=# create table b (id int primary key,
namevarchar(128)); CREATE TABLE test=# create table c (id int primary key,
 
test(#                 a_id int references a(id),
test(#                 b1_id int references b(id),
test(#                 b2_id int references b(id),
test(#                 b3_id int references b(id));
CREATE TABLE
test=#
test=# create view v_broken as
test-#   select
test-#        c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-#        b1.name b1_name, b2.name b2_name, b3.name b3_name
test-#     from c, a, b b1, b b2, b b3
test-#     where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and
c.b3_id=b3.id;
CREATE VIEW
test=#
test=# create view v as
test-#   select
test-#       c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-#       b1.name b1_name, b2.name b2_name, b3.name b3_name
test-#   from c
test-#   left join a    on a.id = c.a_id
test-#   left join b b1 on b1.id = c.b1_id
test-#   left join b b2 on b2.id = c.b2_id
test-#   left join b b3 on b3.id = c.b3_id;
CREATE VIEW
test=#
test=# insert into a values (1, 'a1');
INSERT 0 1
test=# insert into b values (1, 'b1'), (2, 'b2'), (3, 'b3'); INSERT 0 3 test=# insert into c values (1, 1, 1, 2, 3),
(2,1, 1, 2, null); INSERT 0 2 test=# test=# select id, b1_name from v_broken;  id | b1_name
 
----+---------
  1 | b1
(1 row)

test=# explain analyze select id, b1_name from v_broken;
                                                         QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=63.40..183.90 rows=1700 width=278) (actual
time=0.049..0.052 rows=1 loops=1)
   Hash Cond: (c.b3_id = b3.id)
   ->  Hash Join  (cost=47.55..144.68 rows=1700 width=282) (actual
time=0.030..0.033 rows=2 loops=1)
         Hash Cond: (c.b2_id = b2.id)
         ->  Hash Join  (cost=31.70..105.45 rows=1700 width=286) (actual time=0.018..0.020 rows=2 loops=1)
               Hash Cond: (c.b1_id = b1.id)
               ->  Hash Join  (cost=15.85..66.22 rows=1700 width=16) (actual time=0.010..0.012 rows=2 loops=1)
                     Hash Cond: (c.a_id = a.id)
                     ->  Seq Scan on c  (cost=0.00..27.00 rows=1700
width=20) (actual time=0.001..0.001 rows=2 loops=1)
                     ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual time=0.003..0.003 rows=1 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on a  (cost=0.00..12.60
rows=260 width=4) (actual time=0.002..0.003 rows=1 loops=1)
               ->  Hash  (cost=12.60..12.60 rows=260 width=278) (actual time=0.005..0.005 rows=3 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 9kB
                     ->  Seq Scan on b b1  (cost=0.00..12.60 rows=260
width=278) (actual time=0.002..0.003 rows=3 loops=1)
         ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual
time=0.006..0.006 rows=3 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on b b2  (cost=0.00..12.60 rows=260
width=4) (actual time=0.004..0.004 rows=3 loops=1)
   ->  Hash  (cost=12.60..12.60 rows=260 width=4) (actual
time=0.013..0.013 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b b3  (cost=0.00..12.60 rows=260 width=4) (actual time=0.008..0.009 rows=3 loops=1)  Planning
time:0.729 ms  Execution time: 0.153 ms
 
(23 rows)

test=# select id, b1_name from v;
 id | b1_name
----+---------
  1 | b1
  2 | b1
(2 rows)

test=# explain analyze select id, b1_name from v;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=15.85..66.22 rows=1700 width=278) (actual
time=0.017..0.018 rows=2 loops=1)
   Hash Cond: (c.b1_id = b1.id)
   ->  Seq Scan on c  (cost=0.00..27.00 rows=1700 width=20) (actual
time=0.005..0.005 rows=2 loops=1)
   ->  Hash  (cost=12.60..12.60 rows=260 width=278) (actual
time=0.006..0.006 rows=3 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b b1  (cost=0.00..12.60 rows=260 width=278) (actual time=0.002..0.003 rows=3 loops=1)
Planningtime: 0.177 ms  Execution time: 0.044 ms
 
(8 rows)

Note the difference in results using inner joins versus left outer joins.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
Adrian Klaver
Дата:
On 07/01/2016 01:28 PM, Sfiligoi, Igor wrote:
> Sorry... the example was incomplete.
>
> All the fields are defined as not-null.
> So it is guaranteed to always match the join.
>
> And PostgreSQL release notes claim that PGSQL can do at least partial join removal:
> https://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.0#Join_Removal

Those examples use explicit joins, so you might try that in your view
definition.

>
> I was hoping this use case would fit in.
>
> Any suggestions?
>
> Igor
>
> -----Original Message-----
> From: Merlin Moncure [mailto:mmoncure@gmail.com]
> Sent: Friday, July 01, 2016 12:42 PM
> To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>
> Cc: pgsql-general@postgresql.org
> Subject: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins
>
> On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
>> Hello.
>>
>> We have a view that is very generic, and we noticed that PostgreSQL is
>> not very good at removing useless joins, which makes our queries very slow.
>>
>> We could change our code to avoid the view and write ad-hoc queries to
>> the underlying tables, but would prefer not to, if there is a way around it.
>>
>> (BTW: We are currently using psql 9.4)
>>
>> Here is a simplified implementation:
>>
>> # create table a (id int primary key, name varchar(128));
>>
>> # create table b (id int primary key, name varchar(128));
>>
>> # create table c (id int primary key, a_id int references a(id), b1_id
>> int references b(id), b2_id int references b(id), b3_id int references
>> b(id));
>>
>> # create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id,
>> a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from
>> c,  a, b b1, b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and
>> c.b2_id=b2.id and c.b3_id=b3.id;
>>
>> When I try to get just info from tables c and b1:
>>
>> # select id, b1_name from v
>>
>> it still does all the joins (see below).
>>
>> I would expect just one join (due to the request of columns from the
>> two tables),
>>
>> since all joins are on foreign constrains referencing primary keys,
>>
>> there are no filters on the other tables, so it is guaranteed that the
>> useless joins will always return exactly one answer.
>
> I think what you're asking for is a lot more complex than it sounds, and incorrect.  The precise state of the data
influenceshow many records come back (in this case, either 1 or 0), for example if b3_id is null you get zero rows.
Moreto the point, you *instructed* the server to make the join.  There are strategies to make joins 'optional' at run
timewith respect to a query, but they are more complicated than simply withdrawing columns from the select list. 
>
> Stepping back a bit, the query needs to be planned before peeking at the data in the tables.  The planner is able to
makeassumptions against a statistical picture of the data but shouldn't be expected to actually inspect precise result
datain order to generate a better plan. 
>
> merlin
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
"Sfiligoi, Igor"
Дата:
Nope, no difference how I express the joins:
create view v1 as 
select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id, a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name 
from c join  a on (c.a_id=a.id) join  b b1 on (c.b1_id=b1.id) join b b2 on (c.b2_id=b2.id) join b b3 on
(c.b3_id=b3.id);

# explain select id, b1_name from v1;
                                   QUERY PLAN                                   
--------------------------------------------------------------------------------
 Nested Loop  (cost=1.02..5.47 rows=1 width=7)
   Join Filter: (c.b3_id = b3.id)
   ->  Nested Loop  (cost=1.02..4.34 rows=1 width=11)
         Join Filter: (c.a_id = a.id)
         ->  Nested Loop  (cost=1.02..3.25 rows=1 width=15)
               Join Filter: (c.b2_id = b2.id)
               ->  Hash Join  (cost=1.02..2.12 rows=1 width=19)
                     Hash Cond: (b1.id = c.b1_id)
                     ->  Seq Scan on b b1  (cost=0.00..1.06 rows=6 width=7)
                     ->  Hash  (cost=1.01..1.01 rows=1 width=20)
                           ->  Seq Scan on c  (cost=0.00..1.01 rows=1 width=20)
               ->  Seq Scan on b b2  (cost=0.00..1.06 rows=6 width=4)
         ->  Seq Scan on a  (cost=0.00..1.04 rows=4 width=4)
   ->  Seq Scan on b b3  (cost=0.00..1.06 rows=6 width=4)
(14 rows)

Igor

PS: Here are the updated table definitions:
create table a (id int not null primary key, name varchar(128));
create table b (id int not null primary key, name varchar(128));
create table c (id int not null primary key, a_id int not null references a(id), b1_id int not null references b(id),
b2_idint not null references b(id), b3_id int not null references b(id));
 

-----Original Message-----
From: Adrian Klaver [mailto:adrian.klaver@aklaver.com] 
Sent: Friday, July 01, 2016 1:38 PM
To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>; Merlin Moncure <mmoncure@gmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On 07/01/2016 01:28 PM, Sfiligoi, Igor wrote:
> Sorry... the example was incomplete.
>
> All the fields are defined as not-null.
> So it is guaranteed to always match the join.
>
> And PostgreSQL release notes claim that PGSQL can do at least partial join removal:
> https://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.0#Join_Rem
> oval

Those examples use explicit joins, so you might try that in your view definition.

>
> I was hoping this use case would fit in.
>
> Any suggestions?
>
> Igor
>
> -----Original Message-----
> From: Merlin Moncure [mailto:mmoncure@gmail.com]
> Sent: Friday, July 01, 2016 12:42 PM
> To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>
> Cc: pgsql-general@postgresql.org
> Subject: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious 
> useless joins
>
> On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
>> Hello.
>>
>> We have a view that is very generic, and we noticed that PostgreSQL 
>> is not very good at removing useless joins, which makes our queries very slow.
>>
>> We could change our code to avoid the view and write ad-hoc queries 
>> to the underlying tables, but would prefer not to, if there is a way around it.
>>
>> (BTW: We are currently using psql 9.4)
>>
>> Here is a simplified implementation:
>>
>> # create table a (id int primary key, name varchar(128));
>>
>> # create table b (id int primary key, name varchar(128));
>>
>> # create table c (id int primary key, a_id int references a(id), 
>> b1_id int references b(id), b2_id int references b(id), b3_id int 
>> references b(id));
>>
>> # create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id, 
>> a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from 
>> c,  a, b b1, b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and 
>> c.b2_id=b2.id and c.b3_id=b3.id;
>>
>> When I try to get just info from tables c and b1:
>>
>> # select id, b1_name from v
>>
>> it still does all the joins (see below).
>>
>> I would expect just one join (due to the request of columns from the 
>> two tables),
>>
>> since all joins are on foreign constrains referencing primary keys,
>>
>> there are no filters on the other tables, so it is guaranteed that 
>> the useless joins will always return exactly one answer.
>
> I think what you're asking for is a lot more complex than it sounds, and incorrect.  The precise state of the data
influenceshow many records come back (in this case, either 1 or 0), for example if b3_id is null you get zero rows.
Moreto the point, you *instructed* the server to make the join.  There are strategies to make joins 'optional' at run
timewith respect to a query, but they are more complicated than simply withdrawing columns from the select list.
 
>
> Stepping back a bit, the query needs to be planned before peeking at the data in the tables.  The planner is able to
makeassumptions against a statistical picture of the data but shouldn't be expected to actually inspect precise result
datain order to generate a better plan.
 
>
> merlin
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
Kevin Grittner
Дата:
On Fri, Jul 1, 2016 at 3:33 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
> No, I don't want to use LEFT JOINS.
> I want to use regular joins.
>
> But (as mentioned in my other follow-up), all the fields are not
> null (was not in the original email, sorry), and are foreign
> keys, so it is guaranteed to always match.

In that case there is no difference between the inner join and the
left join except that the left join currently supports and
optimization that makes your query faster if the optional table is
not reference.  Whether you want to take advantage of that is up to
you.

> The key part (in my mind) is that I am not filtering on any of
> the useless tables, and I am not returning any columns from those
> tables either.
> Both is known at planning time.

The fact that something can be determined at planning time doesn't
mean that checking for it is free.

> is my logic still broken?

Your logic seems OK with the table definitions you are now showing.

Whether we ever decide it is OK to omit tables which use an inner
join rather than only considering omitting them when the query
specifies that the join is optional is anybody's guess.  If it is
important enough to you you could submit a patch or fund
development of such a feature; but since it would add at least some
small amount of planning time to every inner join just to avoid
specifying that the join is an optional one when writing the query,
it seems to me unlikely to be accepted.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
"Sfiligoi, Igor"
Дата:
OK.  Will change our query generation code to not use the view.
(I have tried the LEFT JOIN approach, but it just does not seem to perform.)

Thanks,
  Igor

PS: Here are the numbers for the real production query (will not provide details):
Original query:                 300s
Query on a manually optimized view:         1ms
Using left joins:                 200s

I would have gladly paid a few ms in additional planning time!

-----Original Message-----
From: Kevin Grittner [mailto:kgrittn@gmail.com] 
Sent: Friday, July 01, 2016 1:57 PM
To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>
Cc: pgsql-general@postgresql.org
Subject: Re: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On Fri, Jul 1, 2016 at 3:33 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
> No, I don't want to use LEFT JOINS.
> I want to use regular joins.
>
> But (as mentioned in my other follow-up), all the fields are not null 
> (was not in the original email, sorry), and are foreign keys, so it is 
> guaranteed to always match.

In that case there is no difference between the inner join and the left join except that the left join currently
supportsand optimization that makes your query faster if the optional table is not reference.  Whether you want to take
advantageof that is up to you.
 

> The key part (in my mind) is that I am not filtering on any of the 
> useless tables, and I am not returning any columns from those tables 
> either.
> Both is known at planning time.

The fact that something can be determined at planning time doesn't mean that checking for it is free.

> is my logic still broken?

Your logic seems OK with the table definitions you are now showing.

Whether we ever decide it is OK to omit tables which use an inner join rather than only considering omitting them when
thequery specifies that the join is optional is anybody's guess.  If it is important enough to you you could submit a
patchor fund development of such a feature; but since it would add at least some small amount of planning time to every
innerjoin just to avoid specifying that the join is an optional one when writing the query, it seems to me unlikely to
beaccepted.
 

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
Kevin Grittner
Дата:
On Fri, Jul 1, 2016 at 7:15 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
> OK.  Will change our query generation code to not use the view.
> (I have tried the LEFT JOIN approach, but it just does not seem to perform.)

> PS: Here are the numbers for the real production query (will not provide details):
> Original query:                          300s
> Query on a manually optimized view:        1ms
> Using left joins:                        200s

Please show a self-contained case (i.e., one that can be run
against an empty database to demonstrate the problem).  You might
start from this one and modify it until you see the problem that
you describe:

create table a (id int primary key, name varchar(128));
create table b (id int primary key, name varchar(128));
create table c (id int primary key,
                a_id int not null references a(id),
                b1_id int not null references b(id),
                b2_id int not null references b(id),
                b3_id int not null references b(id));

create view v as
  select
      c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
      b1.name b1_name, b2.name b2_name, b3.name b3_name
  from c
  left join a    on a.id = c.a_id
  left join b b1 on b1.id = c.b1_id
  left join b b2 on b2.id = c.b2_id
  left join b b3 on b3.id = c.b3_id;

insert into a values (1, 'a1');
insert into b values (1, 'b1'), (2, 'b2'), (3, 'b3');
insert into c values (1, 1, 1, 2, 3);

vacuum analyze a;
vacuum analyze b;
vacuum analyze c;

select id, b1_name from v;
explain (analyze, buffers, verbose) select id, b1_name from v;

I'm seeing the unreferenced tables pruned from the plan, and a 1ms
execution time for the select from the view.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

От
"Sfiligoi, Igor"
Дата:
The best that I can do right now is provide the explain of the three variants (see below).

The use of a left join did indeed remove the useless joins, but the selected plan is just terrible.

Thanks,
  Igor

-----Original Message-----
From: Kevin Grittner [mailto:kgrittn@gmail.com] 
Sent: Saturday, July 02, 2016 6:28 AM
To: Sfiligoi, Igor <Igor.Sfiligoi@ga.com>
Cc: pgsql-general@postgresql.org
Subject: Re: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On Fri, Jul 1, 2016 at 7:15 PM, Sfiligoi, Igor <Igor.Sfiligoi@ga.com> wrote:
> OK.  Will change our query generation code to not use the view.
> (I have tried the LEFT JOIN approach, but it just does not seem to 
> perform.)

> PS: Here are the numbers for the real production query (will not provide details):
> Original query:                          300s
> Query on a manually optimized view:        1ms
> Using left joins:                        200s

Please show a self-contained case (i.e., one that can be run against an empty database to demonstrate the problem).  

---------------------------------------------
Original view, two unused joins
(data_info.true_arrow_id = true_dart.arrow_id && data_info.acl_arrow_id = acl_dart.arrow_id)

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=121449.22..11705013.57 rows=890 width=63) (actual time=326791.858..365059.117 rows=1 loops=1)
   Merge Cond: (locn_info.rock_person_id = bird_rsrc_physical.rock_person_id)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.018..0.020 rows=2 loops=1)
         Sort Key: locn_info.rock_person_id
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on bird_locn_info locn_info  (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=2
loops=1)
   ->  Materialize  (cost=121448.19..14015467.54 rows=890 width=71) (actual time=326791.835..365059.092 rows=1
loops=1)
         ->  Merge Join  (cost=121448.19..14015465.32 rows=890 width=71) (actual time=326791.833..365059.089 rows=1
loops=1)
               Merge Cond: (bird_rsrc_physical.rock_person_id = bird_person_bike.person_id)
               ->  Nested Loop  (cost=121444.20..16786761.58 rows=890 width=67) (actual time=326791.779..365059.033
rows=1loops=1)
 
                     Join Filter: (bird_data_silk.rsrc_id = bird_rsrc_physical.phy_rsrc_id)
                     ->  Nested Loop  (cost=0.13..15.68 rows=1 width=8) (actual time=0.025..0.035 rows=1 loops=1)
                           Join Filter: (bird_rsrc_physical.lock_rock_person_id = lock_info.rock_person_id)
                           Rows Removed by Join Filter: 1
                           ->  Nested Loop  (cost=0.13..14.63 rows=1 width=12) (actual time=0.022..0.031 rows=1
loops=1)
                                 Join Filter: (bird_rsrc_physical.rsrc_class_id = bird_rsrc_class.rsrc_class_id)
                                 Rows Removed by Join Filter: 7
                                 ->  Nested Loop  (cost=0.13..13.45 rows=1 width=16) (actual time=0.017..0.025 rows=1
loops=1)
                                       Join Filter: ((bird_rsrc_physical.rsrc_type_id)::text =
(bird_rsrc_type.rsrc_type_id)::text)
                                       Rows Removed by Join Filter: 11
                                       ->  Index Scan using fidx_1_rsrc_physical on bird_rsrc_physical
(cost=0.13..12.18rows=1 width=24) (actual time=0.009..0.013 rows=1 loops=1)
 
                                             Filter: ((rsrc_name)::text = 'data'::text)
                                             Rows Removed by Filter: 2
                                       ->  Seq Scan on bird_rsrc_type  (cost=0.00..1.12 rows=12 width=8) (actual
time=0.002..0.004rows=12 loops=1)
 
                                 ->  Seq Scan on bird_rsrc_class  (cost=0.00..1.08 rows=8 width=4) (actual
time=0.002..0.003rows=8 loops=1)
 
                           ->  Seq Scan on bird_locn_info lock_info  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.001..0.002rows=2 loops=1)
 
                     ->  Hash Join  (cost=121444.07..16786712.53 rows=2670 width=67) (actual
time=326791.750..365058.993rows=1 loops=1)
 
                           Hash Cond: (data_info.true_data_id = bird_data_silk.data_id)
                           ->  Hash Join  (cost=118019.92..16144855.88 rows=102144928 width=43) (actual
time=1945.934..346470.487rows=102020209 loops=1)
 
                                 Hash Cond: (data_info.sheet_id = bird_sheet.sheet_id)
                                 ->  Hash Join  (cost=118018.36..14740361.56 rows=102144928 width=47) (actual
time=1945.911..310945.806rows=102020209 loops=1)
 
                                       Hash Cond: ((data_info.data_type_id)::text =
(bird_data_type.data_type_id)::text)
                                       ->  Hash Join  (cost=118011.53..13335861.97 rows=102144928 width=61) (actual
time=1945.805..269859.918rows=102020209 loops=1)
 
                                             Hash Cond: (data_info.patron_id = patron.person_id)
                                             ->  Hash Join  (cost=118009.96..11931367.64 rows=102144928 width=65)
(actualtime=1945.778..239667.755 rows=102020209 loops=1)
 
                                                   Hash Cond: (data_info.data_dog_id = dog.person_id)
                                                   ->  Hash Join  (cost=118008.38..10526873.30 rows=102144928 width=69)
(actualtime=1945.755..207642.046 rows=102020209 loops=1)
 
                                                         Hash Cond: (data_info.acl_arrow_id = acl_dart.arrow_id)
                                                         ->  Hash Join  (cost=78672.26..8189276.30 rows=102144928
width=72)(actual time=1325.078..153055.303 rows=102020209 loops=1)
 
                                                               Hash Cond: (data_info.true_arrow_id =
true_dart.arrow_id)
                                                               ->  Hash Join  (cost=39336.13..5851679.29 rows=102144928
width=79)(actual time=706.719..96335.462 rows=102020209 loops=1)
 
                                                                     Hash Cond: (data_info.arrow_id =
dart_info.arrow_id)
                                                                     ->  Seq Scan on bird_data_info data_info
(cost=0.00..3514082.28rows=102144928 width=53) (actual time=0.029..19491.161 rows=102020209 loops=1)
 
                                                                     ->  Hash  (cost=26584.39..26584.39 rows=1020139
width=40)(actual time=706.111..706.111 rows=1020208 loops=1)
 
                                                                           Buckets: 131072  Batches: 1  Memory Usage:
71880kB
                                                                           ->  Seq Scan on bird_dart_info dart_info
(cost=0.00..26584.39rows=1020139 width=40) (actual time=0.007..255.729 rows=1020208 loops=1)
 
                                                               ->  Hash  (cost=26584.39..26584.39 rows=1020139 width=7)
(actualtime=617.809..617.809 rows=1020208 loops=1)
 
                                                                     Buckets: 131072  Batches: 1  Memory Usage:
38895kB
                                                                     ->  Seq Scan on bird_dart_info true_dart
(cost=0.00..26584.39rows=1020139 width=7) (actual time=0.007..252.377 rows=1020208 loops=1)
 
                                                         ->  Hash  (cost=26584.39..26584.39 rows=1020139 width=7)
(actualtime=620.401..620.401 rows=1020208 loops=1)
 
                                                               Buckets: 131072  Batches: 1  Memory Usage: 38895kB
                                                               ->  Seq Scan on bird_dart_info acl_dart
(cost=0.00..26584.39rows=1020139 width=7) (actual time=0.011..251.752 rows=1020208 loops=1)
 
                                                   ->  Hash  (cost=1.29..1.29 rows=23 width=4) (actual
time=0.013..0.013rows=23 loops=1)
 
                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                         ->  Seq Scan on bird_person_bike dog  (cost=0.00..1.29 rows=23
width=4)(actual time=0.002..0.007 rows=23 loops=1)
 
                                                               Filter: (is_primary = 1)
                                             ->  Hash  (cost=1.29..1.29 rows=23 width=4) (actual time=0.016..0.016
rows=23loops=1)
 
                                                   Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                   ->  Seq Scan on bird_person_bike patron  (cost=0.00..1.29 rows=23
width=4)(actual time=0.004..0.012 rows=23 loops=1)
 
                                                         Filter: (is_primary = 1)
                                       ->  Hash  (cost=4.70..4.70 rows=170 width=20) (actual time=0.098..0.098 rows=170
loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                             ->  Seq Scan on bird_data_type  (cost=0.00..4.70 rows=170 width=20)
(actualtime=0.006..0.038 rows=170 loops=1)
 
                                 ->  Hash  (cost=1.25..1.25 rows=25 width=4) (actual time=0.014..0.014 rows=25
loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                       ->  Seq Scan on bird_sheet  (cost=0.00..1.25 rows=25 width=4) (actual
time=0.003..0.007rows=25 loops=1)
 
                           ->  Hash  (cost=3390.77..3390.77 rows=2670 width=38) (actual time=0.033..0.033 rows=1
loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                 ->  Index Scan using idx_0_data_silk on bird_data_silk  (cost=0.57..3390.77 rows=2670
width=38)(actual time=0.030..0.031 rows=1 loops=1)
 
                                       Index Cond: ((data_path)::text = 'bdd1_vault1'::text)
               ->  Sort  (cost=1.81..1.87 rows=23 width=4) (actual time=0.046..0.050 rows=23 loops=1)
                     Sort Key: bird_person_bike.person_id
                     Sort Method: quicksort  Memory: 26kB
                     ->  Seq Scan on bird_person_bike  (cost=0.00..1.29 rows=23 width=4) (actual time=0.007..0.015
rows=23loops=1)
 
                           Filter: (is_primary = 1)
 Planning time: 13.844 ms
 Execution time: 365059.322 ms
(77 rows)


Simplified view, useless joines removed by hand

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=13.36..10692.27 rows=890 width=63) (actual time=0.928..0.938 rows=1 loops=1)
   Hash Cond: (data_info.sheet_id = bird_sheet.sheet_id)
   ->  Hash Join  (cost=11.79..10678.47 rows=890 width=67) (actual time=0.904..0.914 rows=1 loops=1)
         Hash Cond: ((data_info.data_type_id)::text = (bird_data_type.data_type_id)::text)
         ->  Hash Join  (cost=4.97..10659.41 rows=890 width=81) (actual time=0.805..0.815 rows=1 loops=1)
               Hash Cond: (data_info.patron_id = patron.person_id)
               ->  Hash Join  (cost=3.39..10645.60 rows=890 width=85) (actual time=0.783..0.793 rows=1 loops=1)
                     Hash Cond: (data_info.data_dog_id = dog.person_id)
                     ->  Nested Loop  (cost=1.82..10631.79 rows=890 width=89) (actual time=0.765..0.775 rows=1
loops=1)
                           ->  Nested Loop  (cost=1.39..10226.17 rows=890 width=63) (actual time=0.753..0.761 rows=1
loops=1)
                                 ->  Nested Loop  (cost=0.82..2586.39 rows=890 width=34) (actual time=0.736..0.743
rows=1loops=1)
 
                                       ->  Nested Loop  (cost=0.26..17.93 rows=1 width=4) (actual time=0.688..0.695
rows=1loops=1)
 
                                             Join Filter: (bird_rsrc_physical.lock_rock_person_id =
lock_info.rock_person_id)
                                             Rows Removed by Join Filter: 1
                                             ->  Nested Loop  (cost=0.26..16.89 rows=1 width=8) (actual
time=0.685..0.691rows=1 loops=1)
 
                                                   Join Filter: (bird_rsrc_physical.rock_person_id =
locn_info.rock_person_id)
                                                   ->  Nested Loop  (cost=0.13..16.21 rows=1 width=16) (actual
time=0.028..0.034rows=1 loops=1)
 
                                                         Join Filter: (bird_rsrc_physical.rock_person_id =
bird_person_bike.person_id)
                                                         Rows Removed by Join Filter: 22
                                                         ->  Nested Loop  (cost=0.13..14.63 rows=1 width=12) (actual
time=0.017..0.021rows=1 loops=1)
 
                                                               Join Filter: (bird_rsrc_physical.rsrc_class_id =
bird_rsrc_class.rsrc_class_id)
                                                               Rows Removed by Join Filter: 7
                                                               ->  Nested Loop  (cost=0.13..13.45 rows=1 width=16)
(actualtime=0.012..0.016 rows=1 loops=1)
 
                                                                     Join Filter:
((bird_rsrc_physical.rsrc_type_id)::text= (bird_rsrc_type.rsrc_type_id)::text)
 
                                                                     Rows Removed by Join Filter: 11
                                                                     ->  Index Scan using pkey_rsrc_physical on
bird_rsrc_physical (cost=0.13..12.18 rows=1 width=24) (actual time=0.005..0.006 rows=1 loops=1)
 
                                                                           Filter: ((rsrc_name)::text = 'data'::text)
                                                                           Rows Removed by Filter: 2
                                                                     ->  Seq Scan on bird_rsrc_type  (cost=0.00..1.12
rows=12width=8) (actual time=0.002..0.003 rows=12 loops=1)
 
                                                               ->  Seq Scan on bird_rsrc_class  (cost=0.00..1.08 rows=8
width=4)(actual time=0.002..0.003 rows=8 loops=1)
 
                                                         ->  Seq Scan on bird_person_bike  (cost=0.00..1.29 rows=23
width=4)(actual time=0.002..0.008 rows=23 loops=1)
 
                                                               Filter: (is_primary = 1)
                                                   ->  Index Only Scan using pkey_locn_info on bird_locn_info locn_info
(cost=0.13..0.67 rows=1 width=4) (actual time=0.651..0.651 rows=1 loops=1)
 
                                                         Index Cond: (rock_person_id = bird_person_bike.person_id)
                                                         Heap Fetches: 1
                                             ->  Seq Scan on bird_locn_info lock_info  (cost=0.00..1.02 rows=2 width=4)
(actualtime=0.001..0.002 rows=2 loops=1)
 
                                       ->  Index Scan using ukey_2_data_silk on bird_data_silk  (cost=0.57..2555.11
rows=1335width=38) (actual time=0.044..0.044 rows=1 loops=1)
 
                                             Index Cond: ((rsrc_id = bird_rsrc_physical.phy_rsrc_id) AND
((data_path)::text= 'bdd1_vault1'::text))
 
                                 ->  Index Scan using idx_0_data_info on bird_data_info data_info  (cost=0.57..8.57
rows=1width=43) (actual time=0.015..0.016 rows=1 loops=1)
 
                                       Index Cond: (true_data_id = bird_data_silk.data_id)
                           ->  Index Scan using pkey_dart_info on bird_dart_info dart_info  (cost=0.42..0.45 rows=1
width=40)(actual time=0.010..0.010 rows=1 loops=1)
 
                                 Index Cond: (arrow_id = data_info.arrow_id)
                     ->  Hash  (cost=1.29..1.29 rows=23 width=4) (actual time=0.014..0.014 rows=23 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  Seq Scan on bird_person_bike dog  (cost=0.00..1.29 rows=23 width=4) (actual
time=0.002..0.004rows=23 loops=1)
 
                                 Filter: (is_primary = 1)
               ->  Hash  (cost=1.29..1.29 rows=23 width=4) (actual time=0.017..0.017 rows=23 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 1kB
                     ->  Seq Scan on bird_person_bike patron  (cost=0.00..1.29 rows=23 width=4) (actual
time=0.005..0.012rows=23 loops=1)
 
                           Filter: (is_primary = 1)
         ->  Hash  (cost=4.70..4.70 rows=170 width=20) (actual time=0.093..0.093 rows=170 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on bird_data_type  (cost=0.00..4.70 rows=170 width=20) (actual time=0.005..0.042 rows=170
loops=1)
   ->  Hash  (cost=1.25..1.25 rows=25 width=4) (actual time=0.016..0.016 rows=25 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Seq Scan on bird_sheet  (cost=0.00..1.25 rows=25 width=4) (actual time=0.003..0.006 rows=25 loops=1)
 Planning time: 9.906 ms
 Execution time: 1.129 ms
(58 rows)


Like th original view, but with the two unused joins changed to left join

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=42776.96..8431442.43 rows=890 width=63) (actual time=57764.609..209119.185 rows=1 loops=1)
   Merge Cond: (locn_info.rock_person_id = bird_rsrc_physical.rock_person_id)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.013..0.014 rows=2 loops=1)
         Sort Key: locn_info.rock_person_id
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on bird_locn_info locn_info  (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=2
loops=1)
   ->  Materialize  (cost=42775.93..10104638.94 rows=890 width=71) (actual time=57764.591..209119.164 rows=1 loops=1)
         ->  Merge Join  (cost=42775.93..10104636.72 rows=890 width=71) (actual time=57764.587..209119.160 rows=1
loops=1)
               Merge Cond: (bird_rsrc_physical.rock_person_id = bird_person_bike.person_id)
               ->  Nested Loop  (cost=42771.94..12111567.57 rows=890 width=67) (actual time=57764.536..209119.107
rows=1loops=1)
 
                     Join Filter: (bird_data_silk.rsrc_id = bird_rsrc_physical.phy_rsrc_id)
                     ->  Nested Loop  (cost=0.13..15.68 rows=1 width=8) (actual time=0.025..0.035 rows=1 loops=1)
                           Join Filter: (bird_rsrc_physical.lock_rock_person_id = lock_info.rock_person_id)
                           Rows Removed by Join Filter: 1
                           ->  Nested Loop  (cost=0.13..14.63 rows=1 width=12) (actual time=0.022..0.030 rows=1
loops=1)
                                 Join Filter: (bird_rsrc_physical.rsrc_class_id = bird_rsrc_class.rsrc_class_id)
                                 Rows Removed by Join Filter: 7
                                 ->  Nested Loop  (cost=0.13..13.45 rows=1 width=16) (actual time=0.017..0.025 rows=1
loops=1)
                                       Join Filter: ((bird_rsrc_physical.rsrc_type_id)::text =
(bird_rsrc_type.rsrc_type_id)::text)
                                       Rows Removed by Join Filter: 11
                                       ->  Index Scan using fidx_1_rsrc_physical on bird_rsrc_physical
(cost=0.13..12.18rows=1 width=24) (actual time=0.009..0.012 rows=1 loops=1)
 
                                             Filter: ((rsrc_name)::text = 'data'::text)
                                             Rows Removed by Filter: 2
                                       ->  Seq Scan on bird_rsrc_type  (cost=0.00..1.12 rows=12 width=8) (actual
time=0.002..0.002rows=12 loops=1)
 
                                 ->  Seq Scan on bird_rsrc_class  (cost=0.00..1.08 rows=8 width=4) (actual
time=0.002..0.003rows=8 loops=1)
 
                           ->  Seq Scan on bird_locn_info lock_info  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.001..0.001rows=2 loops=1)
 
                     ->  Hash Join  (cost=42771.81..12111518.51 rows=2670 width=67) (actual time=57764.506..209119.066
rows=1loops=1)
 
                           Hash Cond: (data_info.true_data_id = bird_data_silk.data_id)
                           ->  Hash Join  (cost=39347.67..11469661.87 rows=102144928 width=43) (actual
time=719.142..193957.314rows=102020209 loops=1)
 
                                 Hash Cond: (data_info.arrow_id = dart_info.arrow_id)
                                 ->  Hash Join  (cost=11.54..9132064.86 rows=102144928 width=17) (actual
time=0.172..143616.386rows=102020209 loops=1)
 
                                       Hash Cond: (data_info.sheet_id = bird_sheet.sheet_id)
                                       ->  Hash Join  (cost=9.98..7727570.54 rows=102144928 width=21) (actual
time=0.154..115782.879rows=102020209 loops=1)
 
                                             Hash Cond: ((data_info.data_type_id)::text =
(bird_data_type.data_type_id)::text)
                                             ->  Hash Join  (cost=3.15..6323070.95 rows=102144928 width=35) (actual
time=0.052..81218.795rows=102020209 loops=1)
 
                                                   Hash Cond: (data_info.patron_id = patron.person_id)
                                                   ->  Hash Join  (cost=1.58..4918576.62 rows=102144928 width=39)
(actualtime=0.030..54639.659 rows=102020209 loops=1)
 
                                                         Hash Cond: (data_info.data_dog_id = dog.person_id)
                                                         ->  Seq Scan on bird_data_info data_info
(cost=0.00..3514082.28rows=102144928 width=53) (actual time=0.010..18264.383 rows=102020209 loops=1)
 
                                                         ->  Hash  (cost=1.29..1.29 rows=23 width=4) (actual
time=0.015..0.015rows=23 loops=1)
 
                                                               Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                               ->  Seq Scan on bird_person_bike dog  (cost=0.00..1.29
rows=23width=4) (actual time=0.002..0.010 rows=23 loops=1)
 
                                                                     Filter: (is_primary = 1)
                                                   ->  Hash  (cost=1.29..1.29 rows=23 width=4) (actual
time=0.017..0.017rows=23 loops=1)
 
                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                         ->  Seq Scan on bird_person_bike patron  (cost=0.00..1.29
rows=23width=4) (actual time=0.004..0.013 rows=23 loops=1)
 
                                                               Filter: (is_primary = 1)
                                             ->  Hash  (cost=4.70..4.70 rows=170 width=20) (actual time=0.096..0.096
rows=170loops=1)
 
                                                   Buckets: 1024  Batches: 1  Memory Usage: 9kB
                                                   ->  Seq Scan on bird_data_type  (cost=0.00..4.70 rows=170 width=20)
(actualtime=0.005..0.041 rows=170 loops=1)
 
                                       ->  Hash  (cost=1.25..1.25 rows=25 width=4) (actual time=0.013..0.013 rows=25
loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                             ->  Seq Scan on bird_sheet  (cost=0.00..1.25 rows=25 width=4) (actual
time=0.003..0.007rows=25 loops=1)
 
                                 ->  Hash  (cost=26584.39..26584.39 rows=1020139 width=40) (actual
time=718.449..718.449rows=1020208 loops=1)
 
                                       Buckets: 131072  Batches: 1  Memory Usage: 71880kB
                                       ->  Seq Scan on bird_dart_info dart_info  (cost=0.00..26584.39 rows=1020139
width=40)(actual time=0.006..265.742 rows=1020208 loops=1)
 
                           ->  Hash  (cost=3390.77..3390.77 rows=2670 width=38) (actual time=0.041..0.041 rows=1
loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                 ->  Index Scan using idx_0_data_silk on bird_data_silk  (cost=0.57..3390.77 rows=2670
width=38)(actual time=0.038..0.038 rows=1 loops=1)
 
                                       Index Cond: ((data_path)::text = 'bdd1_vault1'::text)
               ->  Sort  (cost=1.81..1.87 rows=23 width=4) (actual time=0.045..0.046 rows=23 loops=1)
                     Sort Key: bird_person_bike.person_id
                     Sort Method: quicksort  Memory: 26kB
                     ->  Seq Scan on bird_person_bike  (cost=0.00..1.29 rows=23 width=4) (actual time=0.005..0.007
rows=23loops=1)
 
                           Filter: (is_primary = 1)
 Planning time: 10.115 ms
 Execution time: 209119.417 ms
(67 rows)