Обсуждение: postgres_fdw: wrong results with self join + enable_nestloop off

Поиск
Список
Период
Сортировка

postgres_fdw: wrong results with self join + enable_nestloop off

От
Nishant Sharma
Дата:
Hi,


We have observed that running the same self JOIN query on postgres FDW setup is returning different results with set enable_nestloop off & on. I am at today's latest commit:- 928e05ddfd4031c67e101c5e74dbb5c8ec4f9e23

I created a local FDW setup. And ran this experiment on the same. Kindly refer to the P.S section for details.

|********************************************************************|
Below is the output difference along with query plan:-
postgres@71609=#set enable_nestloop=off;
SET
postgres@71609=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
 id1 | id2 | id1 | id2
-----+-----+-----+-----
   1 |  10 |   1 |  10
   2 |  20 |   1 |  10
   3 |  30 |   1 |  10
   1 |  10 |   2 |  20
   2 |  20 |   2 |  20
   3 |  30 |   2 |  20
   1 |  10 |   3 |  30
   2 |  20 |   3 |  30
   3 |  30 |   3 |  30
(9 rows)

postgres@71609=#explain (analyze, verbose) select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=100.00..49310.40 rows=2183680 width=16) (actual time=0.514..0.515 rows=9 loops=1)
   Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
   Relations: (public.pg_tbl_foreign tbl1) INNER JOIN (public.pg_tbl_foreign tbl2)
   Remote SQL: SELECT r1.id1, r1.id2, r2.id1, r2.id2 FROM (public.pg_tbl r1 INNER JOIN public.pg_tbl r2 ON (((r1.id1 < 5))))
 Planning Time: 0.139 ms
 Execution Time: 0.984 ms
(6 rows)

postgres@71609=#set enable_nestloop=on;
SET
postgres@71609=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
 id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)

postgres@71609=#explain (analyze, verbose) select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
 Result  (cost=200.00..27644.00 rows=2183680 width=16) (actual time=0.003..0.004 rows=0 loops=1)
   Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
   One-Time Filter: (now() < '2020-02-23 00:00:00'::timestamp without time zone)
   ->  Nested Loop  (cost=200.00..27644.00 rows=2183680 width=16) (never executed)
         Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
         ->  Foreign Scan on public.pg_tbl_foreign tbl2  (cost=100.00..186.80 rows=2560 width=8) (never executed)
               Output: tbl2.id1, tbl2.id2
               Remote SQL: SELECT id1, id2 FROM public.pg_tbl
         ->  Materialize  (cost=100.00..163.32 rows=853 width=8) (never executed)
               Output: tbl1.id1, tbl1.id2
               ->  Foreign Scan on public.pg_tbl_foreign tbl1  (cost=100.00..159.06 rows=853 width=8) (never executed)
                     Output: tbl1.id1, tbl1.id2
                     Remote SQL: SELECT id1, id2 FROM public.pg_tbl WHERE ((id1 < 5))
 Planning Time: 0.178 ms
 Execution Time: 0.292 ms
(15 rows)


|********************************************************************|

I debugged this issue and was able to find a fix for the same. Kindly please refer to the attached fix. With the fix I am able to resolve the issue. But I am not that confident whether this change would affect some other existing functionally but it has helped me resolve this result difference in output.

What is the technical issue?
The problem here is the use of extract_actual_clauses. Because of which the plan creation misses adding the second condition of AND i.e "now() < '23-Feb-2020'::timestamp" in the plan. Because it is not considered a pseudo constant and extract_actual_clause is passed with false as the second parameter and it gets skipped from the list. As a result that condition is never taken into consideration as either one-time filter (before or after) or part of SQL remote execution

Why do I think the fix is correct?
The fix is simple, where we have created a new function similar to extract_actual_clause which just extracts all the conditions from the list with no checks and returns the list to the caller. As a result all conditions would be taken into consideration in the query plan.

After my fix patch:-
postgres@78754=#set enable_nestloop=off;
SET
postgres@78754=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
 id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)
                                                 ^
postgres@78754=#explain (analyze, verbose) select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Foreign Scan  (cost=100.00..49310.40 rows=2183680 width=16) (actual time=0.652..0.652 rows=0 loops=1)
   Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
   Filter: (now() < '2020-02-23 00:00:00'::timestamp without time zone)
   Rows Removed by Filter: 9
   Relations: (public.pg_tbl_foreign tbl1) INNER JOIN (public.pg_tbl_foreign tbl2)
   Remote SQL: SELECT r1.id1, r1.id2, r2.id1, r2.id2 FROM (public.pg_tbl r1 INNER JOIN public.pg_tbl r2 ON (((r1.id1 < 5))))
 Planning Time: 0.133 ms
 Execution Time: 1.127 ms
(8 rows)

postgres@78754=#set enable_nestloop=on;
SET
postgres@78754=#select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
 id1 | id2 | id1 | id2
-----+-----+-----+-----
(0 rows)

postgres@78754=#explain (analyze, verbose) select * from pg_tbl_foreign tbl1 join pg_tbl_foreign tbl2 on tbl1.id1 < 5 and now() < '23-Feb-2020'::timestamp;
                                                      QUERY PLAN                                                      
-----------------------------------------------------------------------------------------------------------------------
 Result  (cost=200.00..27644.00 rows=2183680 width=16) (actual time=0.004..0.005 rows=0 loops=1)
   Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
   One-Time Filter: (now() < '2020-02-23 00:00:00'::timestamp without time zone)
   ->  Nested Loop  (cost=200.00..27644.00 rows=2183680 width=16) (never executed)
         Output: tbl1.id1, tbl1.id2, tbl2.id1, tbl2.id2
         ->  Foreign Scan on public.pg_tbl_foreign tbl2  (cost=100.00..186.80 rows=2560 width=8) (never executed)
               Output: tbl2.id1, tbl2.id2
               Remote SQL: SELECT id1, id2 FROM public.pg_tbl
         ->  Materialize  (cost=100.00..163.32 rows=853 width=8) (never executed)
               Output: tbl1.id1, tbl1.id2
               ->  Foreign Scan on public.pg_tbl_foreign tbl1  (cost=100.00..159.06 rows=853 width=8) (never executed)
                     Output: tbl1.id1, tbl1.id2
                     Remote SQL: SELECT id1, id2 FROM public.pg_tbl WHERE ((id1 < 5))
 Planning Time: 0.134 ms
 Execution Time: 0.347 ms
(15 rows)
|********************************************************************|

Kindly please comment if I am in the correct direction or not?


Regards,
Nishant Sharma.
Developer at EnterpriseDB, Pune, India.



P.S
Steps that I used to create local postgres FDW setup ( followed link - https://www.postgresql.org/docs/current/postgres-fdw.html )

1) ./configure --prefix=/home/edb/POSTGRES_INSTALL/MASTER --with-pgport=9996 --with-openssl --with-libxml --with-zlib --with-tcl --with-perl --with-libxslt --with-ossp-uuid --with-ldap --with-pam --enable-nls --enable-debug --enable-depend --enable-dtrace --with-selinux --with-icu --enable-tap-tests --enable-cassert  CFLAGS="-g -O0"

2) make

3) make install

4) cd contrib/postgres_fdw/

5) make install

6) Start the server

7)
[edb@localhost MASTER]$ bin/psql postgres edb;
psql (16devel)
Type "help" for help.

postgres@70613=#create database remote_db;
CREATE DATABASE
postgres@70613=#quit

[edb@localhost MASTER]$ bin/psql remote_db edb;
psql (16devel)
Type "help" for help.

remote_db@70613=#CREATE USER fdw_user;
CREATE ROLE

remote_db@70613=#GRANT ALL ON SCHEMA public TO fdw_user;
GRANT
remote_db@70613=#quit

[edb@localhost MASTER]$ bin/psql remote_db fdw_user;
psql (16devel)
Type "help" for help.

remote_db@70613=#create table pg_tbl(id1 int, id2 int);
CREATE TABLE
remote_db@70613=#insert into pg_tbl values(1, 10);
INSERT 0 1
remote_db@70613=#insert into pg_tbl values(2, 20);
INSERT 0 1
remote_db@70613=#insert into pg_tbl values(3, 30);
INSERT 0 1

8)
New terminal/Tab:-
[edb@localhost MASTER]$ bin/psql postgres edb;
postgres@71609=#create extension postgres_fdw;
CREATE EXTENSION
postgres@71609=#CREATE SERVER localhost_fdw FOREIGN DATA WRAPPER postgres_fdw  OPTIONS (dbname 'remote_db', host 'localhost', port '9996');
CREATE SERVER
postgres@71609=#CREATE USER MAPPING for edb SERVER localhost_fdw OPTIONS (user 'fdw_user', password '');
CREATE USER MAPPING
postgres@71609=#GRANT ALL ON FOREIGN SERVER localhost_fdw TO edb;
GRANT
postgres@71609=#CREATE FOREIGN TABLE pg_tbl_foreign(id1 int, id2 int) SERVER localhost_fdw OPTIONS (schema_name 'public', table_name 'pg_tbl');
CREATE FOREIGN TABLE
postgres@71609=#select * from pg_tbl_foreign;
 id1 | id2
-----+-----
   1 |  10
   2 |  20
   3 |  30
(3 rows)
Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi Nishant,

On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
> I debugged this issue and was able to find a fix for the same. Kindly please refer to the attached fix. With the fix
Iam able to resolve the issue. 

Thanks for the report and patch!

> What is the technical issue?
> The problem here is the use of extract_actual_clauses. Because of which the plan creation misses adding the second
conditionof AND i.e "now() < '23-Feb-2020'::timestamp" in the plan. Because it is not considered a pseudo constant and
extract_actual_clauseis passed with false as the second parameter and it gets skipped from the list. As a result that
conditionis never taken into consideration as either one-time filter (before or after) or part of SQL remote execution 
>
> Why do I think the fix is correct?
> The fix is simple, where we have created a new function similar to extract_actual_clause which just extracts all the
conditionsfrom the list with no checks and returns the list to the caller. As a result all conditions would be taken
intoconsideration in the query plan. 

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.  Anyway, I will look at your patch
closely, first.

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Nishant Sharma
Дата:
Thanks Etsuro for your response!

One small typo correction in my answer to "What is the technical issue?"
"it is not considered a pseudo constant" --> "it is considered a pseudo constant"


Regards,
Nishant.

On Fri, Apr 14, 2023 at 6:21 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Hi Nishant,

On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
> I debugged this issue and was able to find a fix for the same. Kindly please refer to the attached fix. With the fix I am able to resolve the issue.

Thanks for the report and patch!

> What is the technical issue?
> The problem here is the use of extract_actual_clauses. Because of which the plan creation misses adding the second condition of AND i.e "now() < '23-Feb-2020'::timestamp" in the plan. Because it is not considered a pseudo constant and extract_actual_clause is passed with false as the second parameter and it gets skipped from the list. As a result that condition is never taken into consideration as either one-time filter (before or after) or part of SQL remote execution
>
> Why do I think the fix is correct?
> The fix is simple, where we have created a new function similar to extract_actual_clause which just extracts all the conditions from the list with no checks and returns the list to the caller. As a result all conditions would be taken into consideration in the query plan.

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.  Anyway, I will look at your patch
closely, first.

Best regards,
Etsuro Fujita

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Nishant Sharma
Дата:
Hi Etsuro Fujita,


Any updates? -- did you get a chance to look into this?


Regards,
Nishant.

On Mon, Apr 17, 2023 at 11:00 AM Nishant Sharma <nishant.sharma@enterprisedb.com> wrote:
Thanks Etsuro for your response!

One small typo correction in my answer to "What is the technical issue?"
"it is not considered a pseudo constant" --> "it is considered a pseudo constant"


Regards,
Nishant.

On Fri, Apr 14, 2023 at 6:21 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Hi Nishant,

On Fri, Apr 14, 2023 at 8:39 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
> I debugged this issue and was able to find a fix for the same. Kindly please refer to the attached fix. With the fix I am able to resolve the issue.

Thanks for the report and patch!

> What is the technical issue?
> The problem here is the use of extract_actual_clauses. Because of which the plan creation misses adding the second condition of AND i.e "now() < '23-Feb-2020'::timestamp" in the plan. Because it is not considered a pseudo constant and extract_actual_clause is passed with false as the second parameter and it gets skipped from the list. As a result that condition is never taken into consideration as either one-time filter (before or after) or part of SQL remote execution
>
> Why do I think the fix is correct?
> The fix is simple, where we have created a new function similar to extract_actual_clause which just extracts all the conditions from the list with no checks and returns the list to the caller. As a result all conditions would be taken into consideration in the query plan.

I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.  Anyway, I will look at your patch
closely, first.

Best regards,
Etsuro Fujita

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
On Mon, Apr 24, 2023 at 3:31 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
> Any updates? -- did you get a chance to look into this?

Sorry, I have not looked into this yet, because I have been busy with
some other work recently.  I  plan to do so early next week.

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.

Yes exactly.  In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node.  Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses.  But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch.  IIUC it treats the pseudoconstant clauses
as local conditions.  While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node.  So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.

BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *.  I changed it to NIL.

Thanks
Richard
Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Suraj Kharage
Дата:
+1 for fixing this in the backend code rather than FDW code.

Thanks, Richard, for working on this. The patch looks good to me at a glance.

On Tue, Apr 25, 2023 at 3:36 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.

Yes exactly.  In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node.  Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses.  But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch.  IIUC it treats the pseudoconstant clauses
as local conditions.  While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node.  So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.

BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *.  I changed it to NIL.

Thanks
Richard


--
--

Thanks & Regards, 
Suraj kharage, 

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Nishant Sharma
Дата:
I also agree that Richard's patch is better. As it fixes the issue at the backend and does not treat pseudoconstant as local condition.

I have tested Richard's patch and observe that it is resolving the problem. Patch looks good to me as well.

I only had a minor comment on below change:-

-   gating_clauses = get_gating_quals(root, scan_clauses);
+   if (best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))
+       gating_clauses = get_gating_quals(root, ((ForeignPath *) best_path)->joinrestrictinfo);
+   else
+       gating_clauses = get_gating_quals(root, scan_clauses);


>> Instead of using 'if' and creating a special case here can't we do something in the above switch?


Regards,
Nishant.


P.S
I tried something quickly but I am seeing a crash:-
                case T_IndexOnlyScan:
                        scan_clauses = castNode(IndexPath, best_path)->indexinfo->indrestrictinfo;
                        break;
+               case T_ForeignScan:
+                       /*
+                        * Note that for scans of foreign joins, we do not have restriction clauses
+                        * stored in baserestrictinfo and we do not consider parameterization.
+                        * Instead we need to check against joinrestrictinfo stored in ForeignPath.
+                        */
+                       if (IS_JOIN_REL(rel))
+                               scan_clauses = ((ForeignPath *) best_path)->joinrestrictinfo;
+                       else
+                               scan_clauses = rel->baserestrictinfo;
+                       break;
                default:
                        scan_clauses = rel->baserestrictinfo;
                        break;


On Fri, Jun 2, 2023 at 9:00 AM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:
+1 for fixing this in the backend code rather than FDW code.

Thanks, Richard, for working on this. The patch looks good to me at a glance.

On Tue, Apr 25, 2023 at 3:36 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Apr 14, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I think that the root cause for this issue would be in the
create_scan_plan handling of pseudoconstant quals when creating a
foreign-join (or custom-join) plan.

Yes exactly.  In create_scan_plan, we are supposed to extract all the
pseudoconstant clauses and use them as one-time quals in a gating Result
node.  Currently we check against rel->baserestrictinfo and ppi_clauses
for the pseudoconstant clauses.  But for scans of foreign joins, we do
not have any restriction clauses in these places and thus the gating
Result node as well as the pseudoconstant clauses would just be lost.

I looked at Nishant's patch.  IIUC it treats the pseudoconstant clauses
as local conditions.  While it can fix the wrong results issue, I think
maybe it's better to still treat the pseudoconstant clauses as one-time
quals in a gating node.  So I wonder if we can store the restriction
clauses for foreign joins in ForeignPath, just as what we do for normal
JoinPath, and then check against them for pseudoconstant clauses in
create_scan_plan, something like attached.

BTW, while going through the codes I noticed one place in
add_foreign_final_paths that uses NULL for List *.  I changed it to NIL.

Thanks
Richard


--
--

Thanks & Regards, 
Suraj kharage, 

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Fri, Jun 2, 2023 at 11:30 AM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:
+1 for fixing this in the backend code rather than FDW code.

Thanks, Richard, for working on this. The patch looks good to me at a glance.

Thank you Suraj for the review!

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Fri, Jun 2, 2023 at 8:31 PM Nishant Sharma <nishant.sharma@enterprisedb.com> wrote:
I only had a minor comment on below change:-

-   gating_clauses = get_gating_quals(root, scan_clauses);
+   if (best_path->pathtype == T_ForeignScan && IS_JOIN_REL(rel))
+       gating_clauses = get_gating_quals(root, ((ForeignPath *) best_path)->joinrestrictinfo);
+   else
+       gating_clauses = get_gating_quals(root, scan_clauses);


Instead of using 'if' and creating a special case here can't we do something in the above switch?

I thought about that too.  IIRC I did not do it in that way because
postgresGetForeignPlan expects that there is no scan_clauses for a join
rel.  So doing that would trigger the Assert there.

   /*
    * For a join rel, baserestrictinfo is NIL and we are not considering
    * parameterization right now, so there should be no scan_clauses for
    * a joinrel or an upper rel either.
    */
   Assert(!scan_clauses);

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi,

On Fri, Jun 2, 2023 at 9:31 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
> I also agree that Richard's patch is better. As it fixes the issue at the backend and does not treat pseudoconstant
aslocal condition. 
>
> I have tested Richard's patch and observe that it is resolving the problem. Patch looks good to me as well.

If the patch is intended for HEAD only, I also think it goes in the
right direction.  But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)

To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses.  Attached is a patch for that.

My apologies for not reviewing your patch and the long long delay.

Best regards,
Etsuro Fujita

Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
If the patch is intended for HEAD only, I also think it goes in the
right direction.  But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)

Thanks for pointing this out.  You're right.  The patch has backport
issue because of the ABI breakage.  So it can only be applied on HEAD.
 
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses.  Attached is a patch for that.

I think we can do that in back branches.  But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases.  I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Nishant Sharma
Дата:
Hi,


Etsuro's patch is also showing the correct output for "set enable_nestloop=off". Looks good to me for back branches due to backport issues.

But below are a few observations for the same:-
1) I looked into the query plan for both "set enable_nestloop" on & off case and observe that they are the same. That is, what we see with "set enable_nestloop=on".
2) In back branches for "set enable_nestloop" on & off value, at least this type of query execution won't make any difference. No comparison of plans to be selected based on total cost of two plans old (Nested Loop with Foreign Scans) & new (Only Foreign Scan) will be done, because we are avoiding the call to "postgresGetForeignJoinPaths()" up front when we have pseudo constants.


Regards,
Nishant.

On Tue, Jun 6, 2023 at 8:50 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
If the patch is intended for HEAD only, I also think it goes in the
right direction.  But if it is intended for back branches as well, I
do not think so, because it would cause ABI breakage due to changes
made to the ForeignPath struct and the create_foreign_join_path() API.
(For the former, I think we could avoid doing so by adding the new
member at the end of the struct, not in the middle, though.)

Thanks for pointing this out.  You're right.  The patch has backport
issue because of the ABI breakage.  So it can only be applied on HEAD.
 
To avoid this issue, I am wondering if we should modify
add_paths_to_joinrel() in back branches so that it just disallows the
FDW to consider pushing down joins when the restrictlist has
pseudoconstant clauses.  Attached is a patch for that.

I think we can do that in back branches.  But I'm a little concerned
that we'd miss a better plan if FDW cannot push down joins in such
cases.  I may be worrying over nothing though if it's not common that
the restrictlist has pseudoconstant clauses.

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi Richard,

On Tue, Jun 6, 2023 at 12:20 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Mon, Jun 5, 2023 at 9:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> To avoid this issue, I am wondering if we should modify
>> add_paths_to_joinrel() in back branches so that it just disallows the
>> FDW to consider pushing down joins when the restrictlist has
>> pseudoconstant clauses.  Attached is a patch for that.

> I think we can do that in back branches.  But I'm a little concerned
> that we'd miss a better plan if FDW cannot push down joins in such
> cases.  I may be worrying over nothing though if it's not common that
> the restrictlist has pseudoconstant clauses.

Yeah, it is unfortunate that we would not get better plans.  Given
that it took quite a long time to find this issue, I suppose that
users seldom do foreign joins with pseudoconstant clauses, though.

Anyway thanks for working on this, Richard!

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi,

On Wed, Jun 7, 2023 at 7:28 PM Nishant Sharma
<nishant.sharma@enterprisedb.com> wrote:
> Etsuro's patch is also showing the correct output for "set enable_nestloop=off". Looks good to me for back branches
dueto backport issues. 
>
> But below are a few observations for the same:-
> 1) I looked into the query plan for both "set enable_nestloop" on & off case and observe that they are the same. That
is,what we see with "set enable_nestloop=on". 
> 2) In back branches for "set enable_nestloop" on & off value, at least this type of query execution won't make any
difference.No comparison of plans to be selected based on total cost of two plans old (Nested Loop with Foreign Scans)
&new (Only Foreign Scan) will be done, because we are avoiding the call to "postgresGetForeignJoinPaths()" up front
whenwe have pseudo constants. 

Thanks for looking!

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> To avoid this issue, I am wondering if we should modify
> add_paths_to_joinrel() in back branches so that it just disallows the
> FDW to consider pushing down joins when the restrictlist has
> pseudoconstant clauses.  Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary.  Attached is a
new version of the patch.

Best regards,
Etsuro Fujita

Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Nishant Sharma
Дата:
Looks good to me. Tested on master and it works.
New patch used a bool flag to avoid calls for both FDW and custom hook's call. And a slight change in comment of "has_pseudoconstant_clauses" function.

Regards,
Nishant.

On Wed, Jun 14, 2023 at 12:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> To avoid this issue, I am wondering if we should modify
> add_paths_to_joinrel() in back branches so that it just disallows the
> FDW to consider pushing down joins when the restrictlist has
> pseudoconstant clauses.  Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary.  Attached is a
new version of the patch.

Best regards,
Etsuro Fujita

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Wed, Jun 14, 2023 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> To avoid this issue, I am wondering if we should modify
> add_paths_to_joinrel() in back branches so that it just disallows the
> FDW to consider pushing down joins when the restrictlist has
> pseudoconstant clauses.  Attached is a patch for that.

I think that custom scans have the same issue, so I modified the patch
further so that it also disallows custom-scan providers to consider
join pushdown in add_paths_to_joinrel() if necessary.  Attached is a
new version of the patch.

Good point.  The v2 patch looks good to me for back branches.

I'm wondering what the plan is for HEAD.  Should we also disallow
foreign/custom join pushdown in the case that there is any
pseudoconstant restriction clause, or instead still allow join pushdown
in that case?  If it is the latter, I think we can do something like my
patch upthread does.  But that patch needs to be revised to consider
custom scans, maybe by storing the restriction clauses also in
CustomPath?

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi Richard,

On Sun, Jun 25, 2023 at 3:05 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Wed, Jun 14, 2023 at 2:49 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> On Mon, Jun 5, 2023 at 10:19 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> > To avoid this issue, I am wondering if we should modify
>> > add_paths_to_joinrel() in back branches so that it just disallows the
>> > FDW to consider pushing down joins when the restrictlist has
>> > pseudoconstant clauses.  Attached is a patch for that.
>>
>> I think that custom scans have the same issue, so I modified the patch
>> further so that it also disallows custom-scan providers to consider
>> join pushdown in add_paths_to_joinrel() if necessary.  Attached is a

> Good point.  The v2 patch looks good to me for back branches.

Cool!  Thanks for looking!

> I'm wondering what the plan is for HEAD.  Should we also disallow
> foreign/custom join pushdown in the case that there is any
> pseudoconstant restriction clause, or instead still allow join pushdown
> in that case?  If it is the latter, I think we can do something like my
> patch upthread does.  But that patch needs to be revised to consider
> custom scans, maybe by storing the restriction clauses also in
> CustomPath?

I think we should choose the latter, so I modified your patch as
mentioned, after re-creating it on top of my patch.  Attached is a new
version (0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patch).
I am attaching my patch as well
(0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patch).

Other changes made to your patch:

* I renamed the new member of the ForeignPath struct to
fdw_restrictinfo.  (And I named that of the CustomPath struct
custom_restrictinfo.)

* In your patch, only for create_foreign_join_path(), the API was
modified so that the caller provides the new member of ForeignPath,
but I modified that for
create_foreignscan_path()/create_foreign_upper_path() as well, for
consistency.

* In this bit I changed the last argument to NIL, which would be
nitpicking, though.

@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
  add_path(baserel, (Path *) path);

  /* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);

* I dropped this test case, because it would not be stable if the
system clock was too slow.

+-- bug due to sloppy handling of pseudoconstant clauses for foreign joins
+EXPLAIN (VERBOSE, COSTS OFF)
+  SELECT * FROM ft2 a, ft2 b
+  WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;
+SELECT * FROM ft2 a, ft2 b
+WHERE b.c1 = a.c1 AND now() < '25-April-2023'::timestamp;

That is it.

Sorry for the long long delay.

Best regards,
Etsuro Fujita

Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Fri, Jul 21, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I think we should choose the latter, so I modified your patch as
mentioned, after re-creating it on top of my patch.  Attached is a new
version (0002-Allow-join-pushdown-even-if-pseudoconstant-quals-v2.patch).
I am attaching my patch as well
(0001-Disable-join-pushdown-if-pseudoconstant-quals-v2.patch).

Other changes made to your patch:

* I renamed the new member of the ForeignPath struct to
fdw_restrictinfo.  (And I named that of the CustomPath struct
custom_restrictinfo.)
 
That's much better, and more consistent with other members in
ForeignPath/CustomPath.  Thanks!
 
* In your patch, only for create_foreign_join_path(), the API was
modified so that the caller provides the new member of ForeignPath,
but I modified that for
create_foreignscan_path()/create_foreign_upper_path() as well, for
consistency.

LGTM.
 
* In this bit I changed the last argument to NIL, which would be
nitpicking, though.

@@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
  add_path(baserel, (Path *) path);

  /* Add paths with pathkeys */
- add_paths_with_pathkeys_for_rel(root, baserel, NULL);
+ add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);

Good catch!  This was my oversight.
 
* I dropped this test case, because it would not be stable if the
system clock was too slow.

Agreed.  And the test case from 0001 should be sufficient.

So the two patches both look good to me now.

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi Richard,

On Mon, Jul 24, 2023 at 11:45 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Fri, Jul 21, 2023 at 8:51 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> * In this bit I changed the last argument to NIL, which would be
>> nitpicking, though.
>>
>> @@ -1038,7 +1038,7 @@ postgresGetForeignPaths(PlannerInfo *root,
>>   add_path(baserel, (Path *) path);
>>
>>   /* Add paths with pathkeys */
>> - add_paths_with_pathkeys_for_rel(root, baserel, NULL);
>> + add_paths_with_pathkeys_for_rel(root, baserel, NULL, NULL);

> This was my oversight.

No.  IIUC, I think that that would work well as-proposed, but I
changed it as such, for readability.

> So the two patches both look good to me now.

Cool!  I pushed the first patch after polishing it a little bit, so
here is a rebased version of the second patch, in which I modified the
ForeignPath and CustomPath cases in reparameterize_path_by_child() to
reflect the new members fdw_restrictinfo and custom_restrictinfo, for
safety, and tweaked a comment a bit.

Thanks for looking!

Best regards,
Etsuro Fujita

Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Fri, Jul 28, 2023 at 4:56 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
Cool!  I pushed the first patch after polishing it a little bit, so
here is a rebased version of the second patch, in which I modified the
ForeignPath and CustomPath cases in reparameterize_path_by_child() to
reflect the new members fdw_restrictinfo and custom_restrictinfo, for
safety, and tweaked a comment a bit.

Hmm, it seems that ForeignPath for a foreign join does not support
parameterized paths for now, as in postgresGetForeignJoinPaths() we have
this check:

  /*
   * This code does not work for joins with lateral references, since those
   * must have parameterized paths, which we don't generate yet.
   */
  if (!bms_is_empty(joinrel->lateral_relids))
      return;

And in create_foreign_join_path() we just set the path.param_info to
NULL.

  pathnode->path.param_info = NULL;   /* XXX see above */

So I doubt that it's necessary to adjust fdw_restrictinfo in
reparameterize_path_by_child, because it seems to me that
fdw_restrictinfo must be empty there.  Maybe we can add an Assert there
as below:

-        ADJUST_CHILD_ATTRS(fpath->fdw_restrictinfo);
+
+        /*
+         * Parameterized foreign joins are not supported.  So this
+         * ForeignPath cannot be a foreign join and fdw_restrictinfo
+         * must be empty.
+         */
+        Assert(fpath->fdw_restrictinfo == NIL);

That being said, it's also no harm to handle fdw_restrictinfo in
reparameterize_path_by_child as the patch does.  So I'm OK we do that
for safety.

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi Richard,

On Mon, Jul 31, 2023 at 5:52 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Fri, Jul 28, 2023 at 4:56 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> here is a rebased version of the second patch, in which I modified the
>> ForeignPath and CustomPath cases in reparameterize_path_by_child() to
>> reflect the new members fdw_restrictinfo and custom_restrictinfo, for
>> safety, and tweaked a comment a bit.

> Hmm, it seems that ForeignPath for a foreign join does not support
> parameterized paths for now, as in postgresGetForeignJoinPaths() we have
> this check:
>
>   /*
>    * This code does not work for joins with lateral references, since those
>    * must have parameterized paths, which we don't generate yet.
>    */
>   if (!bms_is_empty(joinrel->lateral_relids))
>       return;
>
> And in create_foreign_join_path() we just set the path.param_info to
> NULL.
>
>   pathnode->path.param_info = NULL;   /* XXX see above */
>
> So I doubt that it's necessary to adjust fdw_restrictinfo in
> reparameterize_path_by_child, because it seems to me that
> fdw_restrictinfo must be empty there.  Maybe we can add an Assert there
> as below:
>
> -        ADJUST_CHILD_ATTRS(fpath->fdw_restrictinfo);
> +
> +        /*
> +         * Parameterized foreign joins are not supported.  So this
> +         * ForeignPath cannot be a foreign join and fdw_restrictinfo
> +         * must be empty.
> +         */
> +        Assert(fpath->fdw_restrictinfo == NIL);
>
> That being said, it's also no harm to handle fdw_restrictinfo in
> reparameterize_path_by_child as the patch does.  So I'm OK we do that
> for safety.

Ok, but maybe my explanation was not good, so let me explain.  The
reason why I modified the code as such is to make the handling of
fdw_restrictinfo consistent with that of fdw_outerpath: we have the
code to reparameterize fdw_outerpath, which should be NULL though, as
we do not currently support parameterized foreign joins.

I modified the code a bit further to use an if-test to avoid a useless
function call, and added/tweaked comments and docs further.  Attached
is a new version of the patch.  I am planning to commit this, if there
are no objections.

Thanks!

Best regards,
Etsuro Fujita

Вложения

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Richard Guo
Дата:

On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
I modified the code a bit further to use an if-test to avoid a useless
function call, and added/tweaked comments and docs further.  Attached
is a new version of the patch.  I am planning to commit this, if there
are no objections.

+1 to the v4 patch.  It looks good to me.

Thanks
Richard

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
On Tue, Aug 8, 2023 at 6:30 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> I modified the code a bit further to use an if-test to avoid a useless
>> function call, and added/tweaked comments and docs further.  Attached
>> is a new version of the patch.  I am planning to commit this, if there
>> are no objections.

> +1 to the v4 patch.  It looks good to me.

Pushed after some copy-and-paste editing of comments/documents.

Thanks!

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Önder Kalacı
Дата:
Hi Etsuro, all

The commit[1] seems to break some queries in Citus[2], which is an extension which relies on set_join_pathlist_hook.

Although the comment says /*Finally, give extensions a chance to manipulate the path list.*/  we use it to extract lots of information about the joins and do the planning based on the information.

Now,  for some joins where consider_join_pushdown=false, we cannot get the information that we used to get, which prevents doing distributed planning for certain queries.

We wonder if it is possible to allow extensions to access the join info under all circumstances, as it used to be? Basically, removing the additional check:

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 03b3185984..080e76cbe9 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -349,8 +349,7 @@ add_paths_to_joinrel(PlannerInfo *root,
        /*
         * 6. Finally, give extensions a chance to manipulate the path list.
         */
-       if (set_join_pathlist_hook &&
-               consider_join_pushdown)
+       if (set_join_pathlist_hook)
                set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
                                                           jointype, &extra);
 }



Thanks,
Onder



Etsuro Fujita <etsuro.fujita@gmail.com>, 15 Ağu 2023 Sal, 11:05 tarihinde şunu yazdı:
On Tue, Aug 8, 2023 at 6:30 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Tue, Aug 8, 2023 at 4:40 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> I modified the code a bit further to use an if-test to avoid a useless
>> function call, and added/tweaked comments and docs further.  Attached
>> is a new version of the patch.  I am planning to commit this, if there
>> are no objections.

> +1 to the v4 patch.  It looks good to me.

Pushed after some copy-and-paste editing of comments/documents.

Thanks!

Best regards,
Etsuro Fujita


Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi,

On Tue, Aug 15, 2023 at 11:02 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
> The commit[1] seems to break some queries in Citus[2], which is an extension which relies on set_join_pathlist_hook.
>
> Although the comment says /*Finally, give extensions a chance to manipulate the path list.*/  we use it to extract
lotsof information about the joins and do the planning based on the information. 
>
> Now,  for some joins where consider_join_pushdown=false, we cannot get the information that we used to get, which
preventsdoing distributed planning for certain queries. 
>
> We wonder if it is possible to allow extensions to access the join info under all circumstances, as it used to be?
Basically,removing the additional check: 
>
> diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
> index 03b3185984..080e76cbe9 100644
> --- a/src/backend/optimizer/path/joinpath.c
> +++ b/src/backend/optimizer/path/joinpath.c
> @@ -349,8 +349,7 @@ add_paths_to_joinrel(PlannerInfo *root,
>         /*
>          * 6. Finally, give extensions a chance to manipulate the path list.
>          */
> -       if (set_join_pathlist_hook &&
> -               consider_join_pushdown)
> +       if (set_join_pathlist_hook)
>                 set_join_pathlist_hook(root, joinrel, outerrel, innerrel,
>                                                            jointype, &extra);

Maybe we could do so by leaving to extensions the decision whether
they replace joins with pseudoconstant clauses, but I am not sure that
that is a good idea, because that would require the authors to modify
and recompile their extensions to fix the issue...  So I fixed the
core side.

I am not familiar with the Citus extension, but such pseudoconstant
clauses are handled within the Citus extension?

Thanks for the report!

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Önder Kalacı
Дата:
Hi Etsuro,

Thanks for the response!
 
Maybe we could do so by leaving to extensions the decision whether
they replace joins with pseudoconstant clauses, but I am not sure that
that is a good idea, because that would require the authors to modify
and recompile their extensions to fix the issue... 

I think I cannot easily follow this argument. The decision to push down the join
(or not) doesn't seem to be related to calling set_join_pathlist_hook. It seems like the
extension should decide what to do with the hook. 

That seems the generic theme of the hooks that Postgres provides. For example, the extension
is allowed to even override the whole planner/executor, and there is no condition that would
prevent it from happening. In other words, an extension can easily return wrong results with the
wrong actions taken with the hooks, and that should be responsibility of the extension, not Postgres


I am not familiar with the Citus extension, but such pseudoconstant
clauses are handled within the Citus extension?


As I noted earlier, Citus relies on this hook for collecting information about all the joins that Postgres
knows about, there is nothing specific to pseudoconstants. Some parts of creating the (distributed) 
plan relies on the information gathered from this hook. So, if information about some of the joins 
are not passed to the extension, then the decisions that the extension gives are broken (and as a result
the queries are broken).

Thanks,
Onder 


  

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi Onder,

On Wed, Aug 16, 2023 at 10:58 PM Önder Kalacı <onderkalaci@gmail.com> wrote:

>> Maybe we could do so by leaving to extensions the decision whether
>> they replace joins with pseudoconstant clauses, but I am not sure that
>> that is a good idea, because that would require the authors to modify
>> and recompile their extensions to fix the issue...

> I think I cannot easily follow this argument. The decision to push down the join
> (or not) doesn't seem to be related to calling set_join_pathlist_hook. It seems like the
> extension should decide what to do with the hook.
>
> That seems the generic theme of the hooks that Postgres provides. For example, the extension
> is allowed to even override the whole planner/executor, and there is no condition that would
> prevent it from happening. In other words, an extension can easily return wrong results with the
> wrong actions taken with the hooks, and that should be responsibility of the extension, not Postgres

>> I am not familiar with the Citus extension, but such pseudoconstant
>> clauses are handled within the Citus extension?

> As I noted earlier, Citus relies on this hook for collecting information about all the joins that Postgres
> knows about, there is nothing specific to pseudoconstants. Some parts of creating the (distributed)
> plan relies on the information gathered from this hook. So, if information about some of the joins
> are not passed to the extension, then the decisions that the extension gives are broken (and as a result
> the queries are broken).

Thanks for the explanation!

Maybe my explanation was not enough, so let me explain:

* I think you could use the set_join_pathlist_hook hook as you like at
your own responsibility, but typical use cases of the hook that are
designed to support in the core system would be just add custom paths
for replacing joins with scans, as described in custom-scan.sgml (this
note is about set_rel_pathlist_hook, but it should also apply to
set_join_pathlist_hook):

    Although this hook function can be used to examine, modify, or remove
    paths generated by the core system, a custom scan provider will typically
    confine itself to generating <structname>CustomPath</structname>
objects and adding
    them to <literal>rel</literal> using <function>add_path</function>.

* The problem we had with the set_join_pathlist_hook hook is that in
such a typical use case, previously, if the replaced joins had any
pseudoconstant clauses, the planner would produce incorrect query
plans, due to the lack of support for handling such quals in
createplan.c.  We could fix the extensions side, as you proposed, but
the cause of the issue is 100% the planner's deficiency, so it would
be unreasonable to force the authors to do so, which would also go
against our policy of ABI compatibility.  So I fixed the core side, as
in the FDW case, so that extensions created for such a typical use
case, which I guess are the majority of the hook extensions, need not
be modified/recompiled.  I think it is unfortunate that that breaks
the use case of the Citus extension, though.

BTW: commit 9e9931d2b removed the restriction on the call to the hook
extensions, so you might want to back-patch it.  Though, I think it
would be better if the hook was well implemented from the beginning.

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Andres Freund
Дата:
Hi,

On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
> Maybe my explanation was not enough, so let me explain:
> 
> * I think you could use the set_join_pathlist_hook hook as you like at
> your own responsibility, but typical use cases of the hook that are
> designed to support in the core system would be just add custom paths
> for replacing joins with scans, as described in custom-scan.sgml (this
> note is about set_rel_pathlist_hook, but it should also apply to
> set_join_pathlist_hook):
> 
>     Although this hook function can be used to examine, modify, or remove
>     paths generated by the core system, a custom scan provider will typically
>     confine itself to generating <structname>CustomPath</structname>
> objects and adding
>     them to <literal>rel</literal> using <function>add_path</function>.

That supports citus' use more than not: "this hook function can be used to
examine ... paths generated by the core system".


> * The problem we had with the set_join_pathlist_hook hook is that in
> such a typical use case, previously, if the replaced joins had any
> pseudoconstant clauses, the planner would produce incorrect query
> plans, due to the lack of support for handling such quals in
> createplan.c.  We could fix the extensions side, as you proposed, but
> the cause of the issue is 100% the planner's deficiency, so it would
> be unreasonable to force the authors to do so, which would also go
> against our policy of ABI compatibility.  So I fixed the core side, as
> in the FDW case, so that extensions created for such a typical use
> case, which I guess are the majority of the hook extensions, need not
> be modified/recompiled.  I think it is unfortunate that that breaks
> the use case of the Citus extension, though.

I'm not neutral - I don't work on citus, but work in the same Unit as
Onder. With that said: I don't think that's really a justification for
breaking a pre-existing, not absurd, use case in a minor release.

Except that this was only noticed after it was released in a set of minor
versions, I would say that 6f80a8d9c should just straight up be reverted.
Skimming the thread there wasn't really any analysis done about breaking
extensions etc - and that ought to be done before a substantial semantics
change in a somewhat commonly used hook.  I'm inclined to think that that
might still be the right path.


> BTW: commit 9e9931d2b removed the restriction on the call to the hook
> extensions, so you might want to back-patch it.

Citus is an extension, not a fork, there's not really a way to just backpatch
a random commit.


> Though, I think it would be better if the hook was well implemented from the
> beginning.

Sure, but that's neither here nor there.

Greetings,

Andres Freund



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi,

On Sun, Aug 20, 2023 at 4:34 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
> > Maybe my explanation was not enough, so let me explain:
> >
> > * I think you could use the set_join_pathlist_hook hook as you like at
> > your own responsibility, but typical use cases of the hook that are
> > designed to support in the core system would be just add custom paths
> > for replacing joins with scans, as described in custom-scan.sgml (this
> > note is about set_rel_pathlist_hook, but it should also apply to
> > set_join_pathlist_hook):
> >
> >     Although this hook function can be used to examine, modify, or remove
> >     paths generated by the core system, a custom scan provider will typically
> >     confine itself to generating <structname>CustomPath</structname>
> > objects and adding
> >     them to <literal>rel</literal> using <function>add_path</function>.
>
> That supports citus' use more than not: "this hook function can be used to
> examine ... paths generated by the core system".
>
>
> > * The problem we had with the set_join_pathlist_hook hook is that in
> > such a typical use case, previously, if the replaced joins had any
> > pseudoconstant clauses, the planner would produce incorrect query
> > plans, due to the lack of support for handling such quals in
> > createplan.c.  We could fix the extensions side, as you proposed, but
> > the cause of the issue is 100% the planner's deficiency, so it would
> > be unreasonable to force the authors to do so, which would also go
> > against our policy of ABI compatibility.  So I fixed the core side, as
> > in the FDW case, so that extensions created for such a typical use
> > case, which I guess are the majority of the hook extensions, need not
> > be modified/recompiled.  I think it is unfortunate that that breaks
> > the use case of the Citus extension, though.
>
> I'm not neutral - I don't work on citus, but work in the same Unit as
> Onder. With that said: I don't think that's really a justification for
> breaking a pre-existing, not absurd, use case in a minor release.
>
> Except that this was only noticed after it was released in a set of minor
> versions, I would say that 6f80a8d9c should just straight up be reverted.
> Skimming the thread there wasn't really any analysis done about breaking
> extensions etc - and that ought to be done before a substantial semantics
> change in a somewhat commonly used hook.  I'm inclined to think that that
> might still be the right path.
>
>
> > BTW: commit 9e9931d2b removed the restriction on the call to the hook
> > extensions, so you might want to back-patch it.
>
> Citus is an extension, not a fork, there's not really a way to just backpatch
> a random commit.
>
>
> > Though, I think it would be better if the hook was well implemented from the
> > beginning.
>
> Sure, but that's neither here nor there.
>
> Greetings,
>
> Andres Freund



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Sorry, I hit the send button by mistake.

On Sun, Aug 20, 2023 at 4:34 AM Andres Freund <andres@anarazel.de> wrote:
> On 2023-08-19 20:09:25 +0900, Etsuro Fujita wrote:
> > * The problem we had with the set_join_pathlist_hook hook is that in
> > such a typical use case, previously, if the replaced joins had any
> > pseudoconstant clauses, the planner would produce incorrect query
> > plans, due to the lack of support for handling such quals in
> > createplan.c.  We could fix the extensions side, as you proposed, but
> > the cause of the issue is 100% the planner's deficiency, so it would
> > be unreasonable to force the authors to do so, which would also go
> > against our policy of ABI compatibility.  So I fixed the core side, as
> > in the FDW case, so that extensions created for such a typical use
> > case, which I guess are the majority of the hook extensions, need not
> > be modified/recompiled.  I think it is unfortunate that that breaks
> > the use case of the Citus extension, though.
>
> I'm not neutral - I don't work on citus, but work in the same Unit as
> Onder. With that said: I don't think that's really a justification for
> breaking a pre-existing, not absurd, use case in a minor release.
>
> Except that this was only noticed after it was released in a set of minor
> versions, I would say that 6f80a8d9c should just straight up be reverted.
> Skimming the thread there wasn't really any analysis done about breaking
> extensions etc - and that ought to be done before a substantial semantics
> change in a somewhat commonly used hook.  I'm inclined to think that that
> might still be the right path.

I think you misread the thread; actually, we did an analysis and
applied a fix that would avoid ABI breakage (see the commit message
for 6f80a8d9c).  It turned out that that breaks the Citus extension,
though.

Also, this is not such a change; it is just an optimization
disablement.  Let me explain.  This is the commit message for
e7cb7ee14, which added the hook we are discussing:

    Allow FDWs and custom scan providers to replace joins with scans.

    Foreign data wrappers can use this capability for so-called "join
    pushdown"; that is, instead of executing two separate foreign scans
    and then joining the results locally, they can generate a path which
    performs the join on the remote server and then is scanned locally.
    This commit does not extend postgres_fdw to take advantage of this
    capability; it just provides the infrastructure.

    Custom scan providers can use this in a similar way.  Previously,
    it was only possible for a custom scan provider to scan a single
    relation.  Now, it can scan an entire join tree, provided of course
    that it knows how to produce the same results that the join would
    have produced if executed normally.

As described in the commit message, we assume that extensions use the
hook in a similar way to FDWs; if they do so, the restriction added by
6f80a8d9c just diables them to add paths for join pushdown, making the
planner use paths involving local joins, so any breakage (other than
plan changes from custom joins to local joins) would never happen.

So my question is: does the Citus extension use the hook like this?
(Sorry, I do not fully understand Onder's explanation.)

> > BTW: commit 9e9931d2b removed the restriction on the call to the hook
> > extensions, so you might want to back-patch it.
>
> Citus is an extension, not a fork, there's not really a way to just backpatch
> a random commit.

Yeah, I was thinking that that would be your concern.

Best regards,
Etsuro Fujita



Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Önder Kalacı
Дата:
Hi,

Thanks for the explanation.

As described in the commit message, we assume that extensions use the
hook in a similar way to FDWs

I'm not sure if it is fair to assume that extensions use any hook in any way.

So my question is: does the Citus extension use the hook like this?
(Sorry, I do not fully understand Onder's explanation.)


I haven't gone into detail about how Citus uses this hook, but I don't think we should 
need to explain it. In general, Citus uses many hooks, and many other extensions 
use this specific hook. With minor version upgrades, we haven't seen this kind of 
behavior change before.

In general, Citus relies on this hook for collecting information about joins across 
relations/ctes/subqueries. So, its scope is bigger than a single join for Citus.

The extension assigns a special marker(s) for RTE Relations, and then checks whether
all relations with these special markers joined transitively across subqueries, such that 
it can decide to pushdown the whole or some parts of the (sub)query.

I must admit, I have not yet looked into whether we can fix the problem within the extension. 
Maybe we can, maybe not.

But the bigger issue is that there has usually been a clear line between the extensions and 
the PG itself when it comes to hooks within the minor version upgrades. Sadly, this change 
breaks that line. We wanted to share our worries here and find out what others think.

>Except that this was only noticed after it was released in a set of minor
> versions, I would say that 6f80a8d9c should just straight up be reverted.

I cannot be the one to ask for reverting a commit in PG, but I think doing it would be a 
fair action. We kindly ask those who handle this to think about it.

Thanks,
Onder
 

Re: postgres_fdw: wrong results with self join + enable_nestloop off

От
Etsuro Fujita
Дата:
Hi,

Thanks for the detailed explanation!

On Mon, Aug 21, 2023 at 10:34 PM Önder Kalacı <onderkalaci@gmail.com> wrote:

>> As described in the commit message, we assume that extensions use the
>> hook in a similar way to FDWs

> I'm not sure if it is fair to assume that extensions use any hook in any way.

I am not sure either, but as for the hook, I think it is an undeniable
fact that the core system assumes that extensions will use it in that
way.

>> So my question is: does the Citus extension use the hook like this?
>> (Sorry, I do not fully understand Onder's explanation.)

> I haven't gone into detail about how Citus uses this hook, but I don't think we should
> need to explain it. In general, Citus uses many hooks, and many other extensions
> use this specific hook. With minor version upgrades, we haven't seen this kind of
> behavior change before.
>
> In general, Citus relies on this hook for collecting information about joins across
> relations/ctes/subqueries. So, its scope is bigger than a single join for Citus.
>
> The extension assigns a special marker(s) for RTE Relations, and then checks whether
> all relations with these special markers joined transitively across subqueries, such that
> it can decide to pushdown the whole or some parts of the (sub)query.

IIUC, I think that that is going beyond what the hook supports.

> But the bigger issue is that there has usually been a clear line between the extensions and
> the PG itself when it comes to hooks within the minor version upgrades. Sadly, this change
> breaks that line. We wanted to share our worries here and find out what others think.

My understanding is: at least for hooks with intended usages, if an
extension uses them as intended, it is guaranteed that the extension
as-is will work correctly with minor version upgrades; otherwise it is
not necessarily.  I think it is unfortunate that my commit broke the
Citus extension, though.

>> >Except that this was only noticed after it was released in a set of minor
>> > versions, I would say that 6f80a8d9c should just straight up be reverted.

> I cannot be the one to ask for reverting a commit in PG, but I think doing it would be a
> fair action. We kindly ask those who handle this to think about it.

Reverting the commit would resolve your issue, but re-introduce the
issue mentioned upthread to extensions that use the hook properly, so
I do not think that reverting the commit would be a fair action.

Sorry for the delay.

Best regards,
Etsuro Fujita