Обсуждение: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
PG Bug reporting form
Дата:
The following bug has been logged on the website: Bug reference: 17344 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 14.1 Operating system: Ubuntu 20.04 Description: When executing the following query: create extension postgres_fdw; do $d$ begin execute $$create server loopback foreign data wrapper postgres_fdw options (dbname '$$||current_database()||$$', port '$$||current_setting('port')||$$', async_capable 'true' )$$; end; $d$; create user mapping for current_user server loopback; create table tab1 (a int); insert into tab1 values(1), (2), (3); create foreign table ftab1 (b int) server loopback options (table_name 'tab1'); create foreign table ftab2 () inherits (ftab1) server loopback; select a from tab1 where a in (select a from ftab1); I get an assertion failed with the stacktrace: Core was generated by `postgres: law regression [local] SELECT '. Program terminated with signal SIGABRT, Aborted. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f2a2f9fa859 in __GI_abort () at abort.c:79 #2 0x0000557dece3c5d8 in ExceptionalCondition (conditionName=conditionName@entry=0x7f2a225a8da7 "areq != pendingAreq", errorType=errorType@entry=0x7f2a225a76aa "FailedAssertion", fileName=0x7ffde2262540 "\271\305\343\354}U", fileName@entry=0x7f2a225a88af "postgres_fdw.c", lineNumber=lineNumber@entry=6974) at assert.c:69 #3 0x00007f2a225a4e34 in produce_tuple_asynchronously (areq=<optimized out>, fetch=<optimized out>) at postgres_fdw.c:6974 #4 0x0000557decb6e152 in ExecAsyncRequest (areq=0x557ded812670) at execAsync.c:38 #5 0x0000557decb94e4e in ExecAppendAsyncBegin (node=0x557ded7f2040) at nodeAppend.c:907 #6 ExecAppend (pstate=0x557ded7f2040) at nodeAppend.c:317 #7 0x0000557decbb0183 in ExecProcNode (node=0x557ded7f2040) at ../../../src/include/executor/executor.h:257 #8 ExecResult (pstate=0x557ded7f1e38) at nodeResult.c:115 #9 0x0000557decbb263d in ExecProcNode (node=0x557ded7f1e38) at ../../../src/include/executor/executor.h:257 #10 ExecScanSubPlan (isNull=0x557ded81ba35, econtext=0x557ded81b7e8, node=0x557ded81c2e0) at nodeSubplan.c:323 #11 ExecSubPlan (node=node@entry=0x557ded81c2e0, econtext=econtext@entry=0x557ded81b7e8, isNull=0x557ded81ba35) at nodeSubplan.c:89 #12 0x0000557decb77beb in ExecEvalSubPlan (econtext=0x557ded81b7e8, op=<optimized out>, state=0x557ded81ba30) at execExprInterp.c:3932 #13 ExecInterpExpr (state=0x557ded81ba30, econtext=0x557ded81b7e8, isnull=<optimized out>) at execExprInterp.c:1564 #14 0x0000557decb865d5 in ExecEvalExprSwitchContext (isNull=0x7ffde2262ad7, econtext=0x557ded81b7e8, state=0x557ded81ba30) at ../../../src/include/executor/executor.h:339 #15 ExecQual (econtext=0x557ded81b7e8, state=0x557ded81ba30) at ../../../src/include/executor/executor.h:408 #16 ExecScan (node=0x557ded7f3748, accessMtd=0x557decbb0ad0 <SeqNext>, recheckMtd=0x557decbb0ac0 <SeqRecheck>) at execScan.c:227 #17 0x0000557decb7cc7a in ExecProcNode (node=0x557ded7f3748) at ../../../src/include/executor/executor.h:257 #18 ExecutePlan (execute_once=<optimized out>, dest=0x557ded80a428, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x557ded7f3748, estate=0x557ded7f1b90) at execMain.c:1551 #19 standard_ExecutorRun (queryDesc=0x557ded800c20, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:361 #20 0x0000557decd18b9c in PortalRunSelect (portal=0x557ded756860, forward=<optimized out>, count=0, dest=<optimized out>) at pquery.c:921 #21 0x0000557decd1a2cb in PortalRun (portal=portal@entry=0x557ded756860, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x557ded80a428, altdest=altdest@entry=0x557ded80a428, qc=0x7ffde2262d80) at pquery.c:765 #22 0x0000557decd1614b in exec_simple_query ( query_string=0x557ded6e7960 "select a from tab1 where a in (select a from ftab1);") at postgres.c:1214 #23 0x0000557decd17d51 in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7ffde22631f0, dbname=<optimized out>, username=<optimized out>) at postgres.c:4486 #24 0x0000557decc83c8d in BackendRun (port=0x557ded70b3d0, port=0x557ded70b3d0) at postmaster.c:4530 #25 BackendStartup (port=0x557ded70b3d0) at postmaster.c:4252 #26 ServerLoop () at postmaster.c:1745 #27 0x0000557decc84c31 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1417 #28 0x0000557dec9b24b2 in main (argc=3, argv=0x557ded6e1990) at main.c:209 With async_capable = 'false' I get expected results. But when ftab1 defined as create foreign table ftab1 (a int) server loopback options (table_name 'tab1'); The select fails with: psql:query.sql:19: ERROR: could not connect to server "loopback" DETAIL: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already CONTEXT: remote SQL command: SELECT a FROM public.ftab2 remote SQL command: SELECT a FROM public.ftab2 remote SQL command: SELECT a FROM public.ftab2 remote SQL command: SELECT a FROM public.ftab2 ... with or without async_capable.
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Dmitry Dolgov
Дата:
> On Sat, Dec 25, 2021 at 10:00:01AM +0000, PG Bug reporting form wrote: > > create extension postgres_fdw; > do $d$ > begin > execute $$create server loopback foreign data wrapper postgres_fdw > options (dbname '$$||current_database()||$$', > port '$$||current_setting('port')||$$', > async_capable 'true' > )$$; > end; > $d$; > create user mapping for current_user server loopback; > > create table tab1 (a int); > insert into tab1 values(1), (2), (3); > create foreign table ftab1 (b int) server loopback options (table_name > 'tab1'); > > create foreign table ftab2 () inherits (ftab1) server loopback; > select a from tab1 where a in (select a from ftab1); > > With async_capable = 'false' I get expected results. > But when ftab1 defined as > create foreign table ftab1 (a int) server loopback options (table_name > 'tab1'); > The select fails with: > psql:query.sql:19: ERROR: could not connect to server "loopback" > DETAIL: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: > sorry, too many clients already > CONTEXT: remote SQL command: SELECT a FROM public.ftab2 > remote SQL command: SELECT a FROM public.ftab2 > remote SQL command: SELECT a FROM public.ftab2 > remote SQL command: SELECT a FROM public.ftab2 Apparently in both cases the second foreign scan of append ends up in recursion: -> Foreign Scan on public.ftab2 ftab1_2 Remote SQL: SELECT NULL FROM public.ftab2 This leads to a chain of the same cursor declarations: DECLARE c1 CURSOR FOR SELECT a FROM public.ftab4 With a pattern: bind -> new backend with the same declare -> execute -> bind -> ... In the case of async foreign scan it hits limitation when no new requests be sent while there is a pending one. In the second case it hits connection limit. Shouldn't this type of foreign paths be avoided by FDW?
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Etsuro Fujita
Дата:
On Sun, Dec 26, 2021 at 8:26 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote: > > On Sat, Dec 25, 2021 at 10:00:01AM +0000, PG Bug reporting form wrote: > > > > create extension postgres_fdw; > > do $d$ > > begin > > execute $$create server loopback foreign data wrapper postgres_fdw > > options (dbname '$$||current_database()||$$', > > port '$$||current_setting('port')||$$', > > async_capable 'true' > > )$$; > > end; > > $d$; > > create user mapping for current_user server loopback; > > > > create table tab1 (a int); > > insert into tab1 values(1), (2), (3); > > create foreign table ftab1 (b int) server loopback options (table_name > > 'tab1'); > > > > create foreign table ftab2 () inherits (ftab1) server loopback; > > select a from tab1 where a in (select a from ftab1); > > > > With async_capable = 'false' I get expected results. > > But when ftab1 defined as > > create foreign table ftab1 (a int) server loopback options (table_name > > 'tab1'); > > The select fails with: > > psql:query.sql:19: ERROR: could not connect to server "loopback" > > DETAIL: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: > > sorry, too many clients already > > CONTEXT: remote SQL command: SELECT a FROM public.ftab2 > > remote SQL command: SELECT a FROM public.ftab2 > > remote SQL command: SELECT a FROM public.ftab2 > > remote SQL command: SELECT a FROM public.ftab2 Reproduced here. > Apparently in both cases the second foreign scan of append ends up in > recursion: > > -> Foreign Scan on public.ftab2 ftab1_2 > Remote SQL: SELECT NULL FROM public.ftab2 > > This leads to a chain of the same cursor declarations: > > DECLARE c1 CURSOR FOR SELECT a FROM public.ftab4 > > With a pattern: > > bind -> new backend with the same declare -> execute -> bind -> ... > > In the case of async foreign scan it hits limitation when no new > requests be sent while there is a pending one. Yeah, but when executing a simple query “select NULL from ftab1”, which would essentially be the same as the sublink subquery used in the first case, I don’t get the assertion failure. (The simple query only throws the connection limit error.) The root cause of the assertion failure in the first case might be something other than the limitation. I’ll look into this in more detail. Thanks, Alexander and Dmitry! Best regards, Etsuro Fujita
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Etsuro Fujita
Дата:
On Tue, Dec 28, 2021 at 10:14 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: > The root cause of the > assertion failure in the first case might be something other than the > limitation. I’ll look into this in more detail. I think the root cause is that we fail to process a pending async request (if any) in postgresReScanForeignScan() in the case where we just reset the next_tuple counter in that function, without destroying/recreating or rewinding the cursor. (In the case where we destroy/recreate or rewind the cursor in that function, the pending async request would be processed by pgfdw_exec_query() called from that function.) This breaks the assumption about ExecAppend() that when called for the first time after ReScan, there are no pending async requests made for async subplans for the Append node, causing the assertion failure in postgresForeignAsyncRequest() called from that ExecAppend(). My oversight in commit 27e1f1456. :-( To fix, I modified postgresReScanForeignScan() so that we always process a pending async request (if any) before restarting the foreign scan. Attached is a patch for that. I tested the patch with the first case, and it addresses the assertion failure. Best regards, Etsuro Fujita
Вложения
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Dmitry Dolgov
Дата:
> On Fri, Dec 31, 2021 at 04:36:42PM +0900, Etsuro Fujita wrote: > On Tue, Dec 28, 2021 at 10:14 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: > > The root cause of the > > assertion failure in the first case might be something other than the > > limitation. I’ll look into this in more detail. > > I think the root cause is that we fail to process a pending async > request (if any) in postgresReScanForeignScan() in the case where we > just reset the next_tuple counter in that function, without > destroying/recreating or rewinding the cursor. (In the case where we > destroy/recreate or rewind the cursor in that function, the pending > async request would be processed by pgfdw_exec_query() called from > that function.) This breaks the assumption about ExecAppend() that > when called for the first time after ReScan, there are no pending > async requests made for async subplans for the Append node, causing > the assertion failure in postgresForeignAsyncRequest() called from > that ExecAppend(). My oversight in commit 27e1f1456. :-( > > To fix, I modified postgresReScanForeignScan() so that we always > process a pending async request (if any) before restarting the foreign > scan. Attached is a patch for that. I tested the patch with the > first case, and it addresses the assertion failure. Yep, makes sense now, thank you. The fix works for me, but I'm curious about the requestee condition: fsstate->conn_state->pendingAreq->requestee == (PlanState *) node) You've mentioned that in those cases where the cursor will be destroyed/recreated or rewind, pgfdw_exec_query will take care of processing pending requests, and looks like it will do this without checking the requestee. Just for me to understand, why is this condition necessary?
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Alexander Lakhin
Дата:
Hello Etsuro-san, 31.12.2021 10:36, Etsuro Fujita wrote: > On Tue, Dec 28, 2021 at 10:14 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: >> The root cause of the >> assertion failure in the first case might be something other than the >> limitation. I’ll look into this in more detail. > To fix, I modified postgresReScanForeignScan() so that we always > process a pending async request (if any) before restarting the foreign > scan. Attached is a patch for that. I tested the patch with the > first case, and it addresses the assertion failure. Thanks for the fix! I can confirm that it eliminates the failure and propose a simple test case for the issue. Maybe you will find it useful. Best regards, Alexander
Вложения
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Etsuro Fujita
Дата:
On Fri, Dec 31, 2021 at 9:32 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote: > > On Fri, Dec 31, 2021 at 04:36:42PM +0900, Etsuro Fujita wrote: > > I think the root cause is that we fail to process a pending async > > request (if any) in postgresReScanForeignScan() in the case where we > > just reset the next_tuple counter in that function, without > > destroying/recreating or rewinding the cursor. (In the case where we > > destroy/recreate or rewind the cursor in that function, the pending > > async request would be processed by pgfdw_exec_query() called from > > that function.) This breaks the assumption about ExecAppend() that > > when called for the first time after ReScan, there are no pending > > async requests made for async subplans for the Append node, causing > > the assertion failure in postgresForeignAsyncRequest() called from > > that ExecAppend(). My oversight in commit 27e1f1456. :-( > > > > To fix, I modified postgresReScanForeignScan() so that we always > > process a pending async request (if any) before restarting the foreign > > scan. Attached is a patch for that. I tested the patch with the > > first case, and it addresses the assertion failure. > > Yep, makes sense now, thank you. The fix works for me, but I'm curious > about the requestee condition: > > fsstate->conn_state->pendingAreq->requestee == (PlanState *) node) > > You've mentioned that in those cases where the cursor will be > destroyed/recreated or rewind, pgfdw_exec_query will take care of > processing pending requests, and looks like it will do this without > checking the requestee. Just for me to understand, why is this condition > necessary? The reason is that the pending request might not need to be processed. Consider this query: alter server loopback1 options (add async_capable 'true'); alter server loopback1 options (add use_remote_estimate 'true'); create table a (aa int, constraint a_pkey primary key (aa)); create table loct (aa int, bb text, constraint loct_pkey primary key (aa)); create foreign table b (bb text) inherits (a) server loopback1 options (table_name 'loct'); insert into a select i from generate_series(1, 10000) i; insert into b select i, to_char(i, 'FM00000') from generate_series(1, 10000) i; analyze loct; analyze a; explain verbose select * from a t1, a t2 where t1.aa = t2.aa limit 1; QUERY PLAN --------------------------------------------------------------------------------------------------- Limit (cost=0.29..54.64 rows=1 width=8) Output: t1.aa, t2.aa -> Nested Loop (cost=0.29..2174178.00 rows=40000 width=8) Output: t1.aa, t2.aa -> Append (cost=0.00..700.00 rows=20000 width=4) -> Seq Scan on public.a t1_1 (cost=0.00..145.00 rows=10000 width=4) Output: t1_1.aa -> Async Foreign Scan on public.b t1_2 (cost=100.00..455.00 rows=10000 width=4) Output: t1_2.aa Remote SQL: SELECT aa FROM public.loct -> Append (cost=0.29..108.65 rows=2 width=4) -> Index Only Scan using a_pkey on public.a t2_1 (cost=0.29..0.31 rows=1 width=4) Output: t2_1.aa Index Cond: (t2_1.aa = t1.aa) -> Async Foreign Scan on public.b t2_2 (cost=100.30..108.33 rows=1 width=4) Output: t2_2.aa Remote SQL: SELECT aa FROM public.loct WHERE (($1::integer = aa)) (17 rows) Note that when starting the scan for the outer side, we’ll begin an asynchronous fetch for the foreign table on the outer side, but we won’t wait for the results of it until after we get all tuples from the local table on the outer side (see the documentation note about the async_capable option for postgres_fdw [1]). So when rescanning the inner side after getting the first tuple from the local table on the outer side, the asynchronous fetch begun for the foreign table on the outer side will still be in progress, and we’ll have it as pendingAreq when called from postgresReScanForeignScan() for rescanning the foreign table on the inner side. BUT it would be useless to process the pending request there, because even without processing the pending request, we can get a matching tuple for the outer tuple from the local table on the inner side, as the local table will be scanned first even in the inner side, and then return the query result. I think the condition mentioned above is useful to avoid processing such a useless pending request and return result tuples more quickly. Thanks for reviewing! Best regards, Etsuro Fujita [1] https://www.postgresql.org/docs/14/postgres-fdw.html#id-1.11.7.44.11
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Etsuro Fujita
Дата:
Hi Alexander, On Mon, Jan 3, 2022 at 4:00 AM Alexander Lakhin <exclusion@gmail.com> wrote: > 31.12.2021 10:36, Etsuro Fujita wrote: > > To fix, I modified postgresReScanForeignScan() so that we always > > process a pending async request (if any) before restarting the foreign > > scan. Attached is a patch for that. I tested the patch with the > > first case, and it addresses the assertion failure. > Thanks for the fix! I can confirm that it eliminates the failure and Good to know. Thanks for testing! > propose a simple test case for the issue. Maybe you will find it useful. Thanks, but in my environment, the test case doesn’t cause any failure. How about something like the attached, which is made by modifying the original test case to avoid the connection-limit-error issue. In the attached, I also added a comment to a function to match other places. Best regards, Etsuro Fujita
Вложения
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Alexander Lakhin
Дата:
07.01.2022 09:15, Etsuro Fujita wrote: > On Mon, Jan 3, 2022 at 4:00 AM Alexander Lakhin <exclusion@gmail.com> wrote: > >> propose a simple test case for the issue. Maybe you will find it useful. > Thanks, but in my environment, the test case doesn’t cause any > failure. It's interesting. I perform on REL_14_STABLE (61c8da50) the following: wget https://www.postgresql.org/message-id/attachment/129455/posgres_fdw.sql.patch -O - | git apply ./configure --enable-debug --enable-cassert >/dev/null && make -j8 >/dev/null && make -j8 -C contrib >/dev/null && make check -C contrib/postgres_fdw/ and get: test postgres_fdw ... FAILED (test process exited with exit code 2) 1116 ms > How about something like the attached, which is made by > modifying the original test case to avoid the connection-limit-error > issue. In the attached, I also added a comment to a function to match > other places. Your modified test fails with unpatched postgres_fdw.c too (as expected). I would prefer your test as more elaborated. Best regards, Alexander
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Etsuro Fujita
Дата:
On Sat, Jan 8, 2022 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > 07.01.2022 09:15, Etsuro Fujita wrote: > > On Mon, Jan 3, 2022 at 4:00 AM Alexander Lakhin <exclusion@gmail.com> wrote: > >> propose a simple test case for the issue. Maybe you will find it useful. > > Thanks, but in my environment, the test case doesn’t cause any > > failure. > It's interesting. I perform on REL_14_STABLE (61c8da50) the following: > wget > https://www.postgresql.org/message-id/attachment/129455/posgres_fdw.sql.patch > -O - | git apply > ./configure --enable-debug --enable-cassert >/dev/null && make -j8 > >/dev/null && make -j8 -C contrib >/dev/null && make check -C > contrib/postgres_fdw/ > and get: > test postgres_fdw ... FAILED (test process exited with > exit code 2) 1116 ms Reproduced. Sorry for that. Maybe, I tested this on a non-assertion-enabled build by mistake. :-( > > How about something like the attached, which is made by > > modifying the original test case to avoid the connection-limit-error > > issue. In the attached, I also added a comment to a function to match > > other places. > Your modified test fails with unpatched postgres_fdw.c too (as > expected). I would prefer your test as more elaborated. Ok, but I modified the added test case a bit further so that foreign tables are well-defined like in your second test case, which I think is good. Attached is an updated version of the patch. Thanks! Best regards, Etsuro Fujita
Вложения
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Alexander Lakhin
Дата:
Hello Etsuro-san, 08.01.2022 11:57, Etsuro Fujita wrote: > Ok, but I modified the added test case a bit further so that foreign > tables are well-defined like in your second test case, which I think > is good. Attached is an updated version of the patch. Thank you! The updated version looks even better to me. Best regards, Alexander
Re: BUG #17344: Assert failed on queiring async_capable foreign table with inheritance
От
Etsuro Fujita
Дата:
On Sun, Jan 9, 2022 at 4:00 AM Alexander Lakhin <exclusion@gmail.com> wrote: > 08.01.2022 11:57, Etsuro Fujita wrote: > > Ok, but I modified the added test case a bit further so that foreign > > tables are well-defined like in your second test case, which I think > > is good. Attached is an updated version of the patch. > Thank you! The updated version looks even better to me. Thanks for reviewing! I've committed the patch. Best regards, Etsuro Fujita