Обсуждение: Segmentation fault on postgresql 13.4, 12.8 with function call in a cursor
Hi,
a recent change in postgresql is causing a segfault when a function is called in filter section, e.g. with a plan like this:
Merge Left Join (cost=0.30..207.75 rows=400 width=7) (actual time=0.006..0.009 rows=0 loops=1)
Merge Cond: (di.itemid = foo.itemid)
Filter: (COALESCE((min(foo.last_update)), '1970-01-01 00:00:00'::timestamp without time zone) < di.download_time)
-> Index Scan using downloaded_images_pkey on downloaded_images di (cost=0.15..29.25 rows=500 width=15) (actual time=0.005..0.005 rows=0 loops=1)
-> Materialize (cost=0.15..135.25 rows=1200 width=40) (never executed)
-> GroupAggregate (cost=0.15..96.25 rows=1200 width=40) (never executed)
Group Key: foo.itemid
-> Index Scan using foo_pkey on foo (cost=0.15..54.25 rows=1200 width=40) (never executed)
Merge Cond: (di.itemid = foo.itemid)
Filter: (COALESCE((min(foo.last_update)), '1970-01-01 00:00:00'::timestamp without time zone) < di.download_time)
-> Index Scan using downloaded_images_pkey on downloaded_images di (cost=0.15..29.25 rows=500 width=15) (actual time=0.005..0.005 rows=0 loops=1)
-> Materialize (cost=0.15..135.25 rows=1200 width=40) (never executed)
-> GroupAggregate (cost=0.15..96.25 rows=1200 width=40) (never executed)
Group Key: foo.itemid
-> Index Scan using foo_pkey on foo (cost=0.15..54.25 rows=1200 width=40) (never executed)
At least 13.4-1.pgdg100+1 and 12.8-1.pgdg100+1 are affected. 13.3-1.pgdg100+1 works fine. Here are steps to reproduce the issue.
cat <<EOF> postgresql-segfault.sql
CREATE SCHEMA debug;
CREATE TABLE debug.downloaded_images (
itemid text NOT NULL,
download_time timestamp,
PRIMARY KEY(itemId)
);
INSERT INTO debug.downloaded_images (itemid, download_time) VALUES ('1190300','2021-09-07 11:00:10.255831');
BEGIN;
CREATE TABLE IF NOT EXISTS "debug"."foo"
(itemId TEXT,
last_update TIMESTAMP,
PRIMARY KEY(itemId)
);
DECLARE "test-cursor-crash" CURSOR WITH HOLD FOR
SELECT di.itemId FROM "debug".downloaded_images di
LEFT JOIN (SELECT itemId, MIN(last_update) as last_update FROM "debug"."foo" GROUP BY itemId) computed ON di.itemId=computed.itemId
WHERE COALESCE(last_update, '1970-01-01') < download_time;
FETCH 10000 IN "test-cursor-crash";
COMMIT;
EOF
createdb testdb
psql -d testdb -f postgresql-segfault.sql
CREATE SCHEMA debug;
CREATE TABLE debug.downloaded_images (
itemid text NOT NULL,
download_time timestamp,
PRIMARY KEY(itemId)
);
INSERT INTO debug.downloaded_images (itemid, download_time) VALUES ('1190300','2021-09-07 11:00:10.255831');
BEGIN;
CREATE TABLE IF NOT EXISTS "debug"."foo"
(itemId TEXT,
last_update TIMESTAMP,
PRIMARY KEY(itemId)
);
DECLARE "test-cursor-crash" CURSOR WITH HOLD FOR
SELECT di.itemId FROM "debug".downloaded_images di
LEFT JOIN (SELECT itemId, MIN(last_update) as last_update FROM "debug"."foo" GROUP BY itemId) computed ON di.itemId=computed.itemId
WHERE COALESCE(last_update, '1970-01-01') < download_time;
FETCH 10000 IN "test-cursor-crash";
COMMIT;
EOF
createdb testdb
psql -d testdb -f postgresql-segfault.sql
here's output from server logs:
[3325] LOG: starting PostgreSQL 12.8 (Debian 12.8-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
[3325] LOG: listening on IPv4 address "127.0.0.1", port 5433
[3325] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5433"
[3326] LOG: database system was shut down at 2021-09-09 10:26:19 UTC
[3325] LOG: database system is ready to accept connections
[3325] LOG: server process (PID 3409) was terminated by signal 11: Segmentation fault
[3325] DETAIL: Failed process was running: COMMIT;
[3325] LOG: terminating any other active server processes
[3330] WARNING: terminating connection because of crash of another server process
[3330] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
[3330] HINT: In a moment you should be able to reconnect to the database and repeat your command.
[3325] LOG: all server processes terminated; reinitializing
[3411] LOG: database system was interrupted; last known up at 2021-09-09 10:27:05 UTC
[3411] LOG: database system was not properly shut down; automatic recovery in progress
[3325] LOG: listening on IPv4 address "127.0.0.1", port 5433
[3325] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5433"
[3326] LOG: database system was shut down at 2021-09-09 10:26:19 UTC
[3325] LOG: database system is ready to accept connections
[3325] LOG: server process (PID 3409) was terminated by signal 11: Segmentation fault
[3325] DETAIL: Failed process was running: COMMIT;
[3325] LOG: terminating any other active server processes
[3330] WARNING: terminating connection because of crash of another server process
[3330] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
[3330] HINT: In a moment you should be able to reconnect to the database and repeat your command.
[3325] LOG: all server processes terminated; reinitializing
[3411] LOG: database system was interrupted; last known up at 2021-09-09 10:27:05 UTC
[3411] LOG: database system was not properly shut down; automatic recovery in progress
I've already filled a debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993848
Best Regards,
Tomas Barton
Re: Segmentation fault on postgresql 13.4, 12.8 with function call in a cursor
От
Sergei Kornilov
Дата:
Hello I am able to reproduce on REL_13_STABLE. With assertions I got SIGABRT on COMMIT with following backtrace: #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007fae23a82535 in __GI_abort () at abort.c:79 #2 0x00005598318b869a in ExceptionalCondition (conditionName=conditionName@entry=0x5598319d44ae "!TTS_EMPTY(slot)", errorType=errorType@entry=0x55983190f028 "FailedAssertion", fileName=fileName@entry=0x5598319f2007 "execTuples.c", lineNumber=lineNumber@entry=693)at assert.c:67 #3 0x000055983160788e in tts_buffer_heap_getsomeattrs (slot=0x559832aeb638, natts=<optimized out>) at execTuples.c:693 #4 0x000055983160a104 in slot_getsomeattrs_int (slot=slot@entry=0x559832aeb638, attnum=2) at execTuples.c:1912 #5 0x00005598315f66e7 in slot_getsomeattrs (attnum=<optimized out>, slot=0x559832aeb638) at ../../../src/include/executor/tuptable.h:344 #6 ExecInterpExpr (state=0x559832aec0d8, econtext=0x559832abba98, isnull=0x7ffc0d62e67f) at execExprInterp.c:482 #7 0x0000559831628da7 in ExecEvalExprSwitchContext (isNull=0x7ffc0d62e67f, econtext=0x559832abba98, state=0x559832aec0d8)at ../../../src/include/executor/executor.h:322 #8 ExecQual (econtext=0x559832abba98, state=0x559832aec0d8) at ../../../src/include/executor/executor.h:391 #9 MJFillInner (node=node@entry=0x559832abb880) at nodeMergejoin.c:494 #10 0x000055983162980e in ExecMergeJoin (pstate=0x559832abb880) at nodeMergejoin.c:1353 #11 0x00005598315fb971 in ExecProcNode (node=0x559832abb880) at ../../../src/include/executor/executor.h:248 #12 ExecutePlan (estate=estate@entry=0x559832abb630, planstate=0x559832abb880, use_parallel_mode=<optimized out>, operation=operation@entry=CMD_SELECT, sendTuples=sendTuples@entry=true, numberTuples=numberTuples@entry=0, direction=ForwardScanDirection, dest=0x559832ae29e0,execute_once=false) at execMain.c:1632 #13 0x00005598315fc653 in standard_ExecutorRun (queryDesc=0x559832ac50e8, direction=ForwardScanDirection, count=0, execute_once=execute_once@entry=false)at execMain.c:350 #14 0x00005598315fc71c in ExecutorRun (queryDesc=queryDesc@entry=0x559832ac50e8, direction=direction@entry=ForwardScanDirection,count=count@entry=0, execute_once=execute_once@entry=false) at execMain.c:294 #15 0x00005598315a5f23 in PersistHoldablePortal (portal=portal@entry=0x559832a1f3c8) at portalcmds.c:407 #16 0x00005598318e66f4 in HoldPortal (portal=portal@entry=0x559832a1f3c8) at portalmem.c:642 #17 0x00005598318e6fd6 in PreCommit_Portals (isPrepare=isPrepare@entry=false) at portalmem.c:738 #18 0x00005598314c0a2a in CommitTransaction () at xact.c:2087 #19 0x00005598314c18b8 in CommitTransactionCommand () at xact.c:2974 #20 0x000055983179342f in finish_xact_command () at postgres.c:2662 #21 0x000055983179590e in exec_simple_query (query_string=query_string@entry=0x5598329bdae0 "COMMIT;") at postgres.c:1264 #22 0x0000559831797b01 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x5598329e9380, dbname=<optimized out>, username=<optimizedout>) at postgres.c:4339 #23 0x0000559831706ff8 in BackendRun (port=port@entry=0x5598329e02e0) at postmaster.c:4526 #24 0x000055983170a0cd in BackendStartup (port=port@entry=0x5598329e02e0) at postmaster.c:4210 #25 0x000055983170a2e8 in ServerLoop () at postmaster.c:1739 #26 0x000055983170b7cf in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1412 #27 0x0000559831654b68 in main (argc=3, argv=0x5598329b8200) at main.c:210 Git bisect points me to a c1fd756fd23f60fcac120c9cd36de2921144e3bd commit ( https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c1fd756fd23f60fcac120c9cd36de2921144e3bd) regards, Sergei
Tomas Barton <barton.tomas@gmail.com> writes: > a recent change in postgresql is causing a segfault when a function is > called in filter section, e.g. with a plan like this: Ugh, thanks. PersistHoldablePortal isn't being cautious about re-executing a plan that already reached completion. Will fix. regards, tom lane