Thread: BUG #17229: Segmentation Fault after upgrading to version 13

BUG #17229: Segmentation Fault after upgrading to version 13

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17229
Logged by:          Efrain Berdecia
Email address:      ejberdecia@yahoo.com
PostgreSQL version: 13.1
Operating system:   Linux  3.10.0-1160.11.1
Description:

Ever since we upgraded to PostgreSQL version 13 we are sporadically getting
segmentation faults for the following process and all existing connections
get killed and our streaming replication process goes down. The SQL below
establishes a dblink connection to a Greenplum cluster. For reference, this is our
log_line_prefix value:
                                  log_line_prefix
------------------------------------------------------------------------------------
 %t [%p]: [%v]-[%x] [%l-1] cluster_name=campaign_service,user=%u,db=%d,hostname=%r


2021-10-13 03:45:24 UTC [31961]: []-[0] [1-1]
cluster_name=campaign_service,user=[unknown],db=[unknown],hostname=10.110.181.113(65011)
LOG:  connection received: host=10.110.181.113 port=65011
2021-10-13 03:45:24 UTC [31961]: [37/1745]-[0] [2-1]
cluster_name=campaign_service,user=repmgr,db=[unknown],hostname=10.110.181.113(65011)
FATAL:  no pg_hba.conf entry for replication connection from host
"10.110.181.113", user "repmgr", SSL off
2021-10-13 08:00:08 UTC [31961]: []-[0] [1-1]
cluster_name=campaign_service,user=[unknown],db=[unknown],hostname=10.110.149.59(46724)
LOG:  connection received: host=10.110.149.59 port=46724
2021-10-13 08:00:08 UTC [31961]: [52/9474]-[0] [2-1]
cluster_name=campaign_service,user=generic_toolkit_utility,db=bi_tools,hostname=10.110.149.59(46724)
LOG:  connection authorized: user=generic_toolkit_utility
database=bi_tools
2021-10-13 08:01:12 UTC [31961]: [52/0]-[0] [3-1]
cluster_name=campaign_service,user=generic_toolkit_utility,db=bi_tools,hostname=10.110.149.59(46724)
LOG:  duration: 63946.616 ms  statement:
            SELECT dblink_connect('gp_metadata_gpso','greenplum_gpso');

            SELECT dblink_exec('gp_metadata_gpso', 'SET statement_timeout = 90000');

            SET statement_timeout = 600000;

            SELECT gp_metadata.load_functions('gp_metadata_gpso');
            SELECT gp_metadata.load_tables('gp_metadata_gpso');
            SELECT gp_metadata.load_views('gp_metadata_gpso');
            SELECT gp_metadata.load_columns('gp_metadata_gpso');
            SELECT gp_metadata.load_dependencies('gp_metadata_gpso');

            SELECT gp_metadata.load_from_tmp_to_main('functions','gp_metadata_gpso');
            SELECT gp_metadata.load_from_tmp_to_main('tables','gp_metadata_gpso');
            SELECT gp_metadata.load_from_tmp_to_main('views','gp_metadata_gpso');
            SELECT gp_metadata.load_from_tmp_to_main('dependencies','gp_metadata_gpso');
            SELECT gp_metadata.load_from_tmp_to_main('columns','gp_metadata_gpso');

            SELECT dblink_disconnect('gp_metadata_gpso');

            SELECT gp_metadata.load_all_objects('gp_metadata_gpso');
            SELECT gp_metadata.load_from_tmp_to_main('all_objects','gp_metadata_gpso');

            UPDATE gp_metadata.status SET last_update=NOW() WHERE enviroment='gp_metadata_gpso';
2021-10-13 08:01:36 UTC [59543]: []-[0] [248-1]
cluster_name=campaign_service,user=,db=,hostname= LOG:  server process (PID
31961) was terminated by signal 11: Segmentation fault

Here are the corresponding /var/log/messages entries:
Oct 13 08:01:36 dtord00pgm39p.dc.dotomi.net kernel: traps: postgres[31961]
general protection ip:7ff03e003140 sp:7fff3d87cd58 error:0 in
libc-2.17.so[7ff03deac000+1c4000]
Oct 13 08:01:36 dtord00pgm39p.dc.dotomi.net kernel: [12628017.578633] traps:
postgres[31961] general protection ip:7ff03e003140 sp:7fff3d87cd58 error:0
in libc-2.17.so[7ff03deac000+1c4000]

On other occasions we see the following messages in /var/log/messages:
Oct 12 22:01:37 dtord00pgm39p.dc.dotomi.net kernel: postgres[14751]:
segfault at 204 ip 00007ff03e003140 sp 00007fff3d87d038 error 4 in
libc-2.17.so[7ff03deac000+1c4000]
Oct 12 22:01:37 dtord00pgm39p.dc.dotomi.net kernel: [12592102.529977]
postgres[14751]: segfault at 204 ip 00007ff03e003140 sp 00007fff3d87d038
error 4 in libc-2.17.so[7ff03deac000+1c4000]


Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> Ever since we upgraded to PostgreSQL version 13 we are sporadically getting
> segmentation faults for the following process and all existing connections
> get killed and our streaming replication process goes down.

Hmm, can you get a stack trace from the crash?  See

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

            regards, tom lane



Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
"Efrain J. Berdecia"
Date:
This is a production server, but I'll check with our server admin group.


Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
"Efrain J. Berdecia"
Date:
Here's the core dump.

Please let me know if there's anything we can provide to troubleshoot this situation.

Efrain J. Berdecia


Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
"Efrain J. Berdecia"
Date:
Trying to attach it again...

Efrain J. Berdecia

Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Tom Lane
Date:
"Efrain J. Berdecia" <ejberdecia@yahoo.com> writes:
> Trying to attach it again...
> core.28085.sig11.zip

A core dump is entirely useless except on the system it was generated
on.  Please inspect the dump with gdb and send the textual stack trace,
per the directions in the wiki page I pointed you to.

            regards, tom lane
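
The request above amounts to running gdb on the machine that produced the core file and sending the text that `bt` prints. A minimal Python sketch of that step, assuming gdb and the matching debuginfo packages are installed on that host; the helper names are hypothetical and the paths passed in are placeholders for your own:

```python
import subprocess


def build_gdb_command(postgres_binary, core_file):
    """Return the argv for a non-interactive gdb run that prints a backtrace."""
    return [
        "gdb",
        "--batch",      # run the -ex commands, then exit
        "-ex", "bt",    # print the stack trace of the crashed process
        postgres_binary,
        core_file,
    ]


def textual_backtrace(postgres_binary, core_file):
    """Run gdb on the host that produced the core and return its text output."""
    result = subprocess.run(
        build_gdb_command(postgres_binary, core_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `textual_backtrace` with the path of the postgres executable and the path of the core file should return text that can be pasted into a reply, provided the binary and libraries on disk are the ones that produced the core.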



Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
"Efrain J. Berdecia"
Date:
This is what we see:

postgres@dtord03pgm25p:/localpart0/db/postgres/13/campaign_service/data>gdb -c /localpart0/db/postgres/13/campaign_service/data/core.28085.sig11.1634328354s /usr/pgsql-13/bin/postgres
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/pgsql-13/bin/postgres...Reading symbols from /usr/lib/debug/usr/pgsql-13/bin/postgres.debug...done.
done.
[New LWP 28085]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

warning: the debug information found in "/usr/lib/debug//usr/pgsql-11/lib/libpq.so.5.11.debug" does not match "/usr/pgsql-11/lib/libpq.so.5" (CRC mismatch).


warning: the debug information found in "/usr/lib/debug/usr/pgsql-11/lib/libpq.so.5.11.debug" does not match "/usr/pgsql-11/lib/libpq.so.5" (CRC mismatch).

Core was generated by `postgres: campaign_service: generic_toolkit_utility bi_tools 10.110.149.55(4096'.
Program terminated with signal 11, Segmentation fault.
#0  __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1852
1852            mov     -12(%rsi), %rdx
Missing separate debuginfos, use: debuginfo-install llvm5.0-libs-5.0.1-7.el7.x86_64 pg_partman13-4.4.0-1.rhel7.x86_64 postgresql11-libs-11.10-1PGDG.rhel7.x86_64 repmgr_13-5.2.1-1.rhel7.x86_64
(gdb) bt
#0  __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1852
#1  0x00007f42aee2488b in memcpy (__len=12, __src=0x7023c19c7, __dest=0x291967d) at /usr/include/bits/string3.h:51
#2  gtrgm_alloc (isalltrue=<optimized out>, siglen=siglen@entry=12, sign=0x7023c19c7 <Address 0x7023c19c7 out of bounds>) at trgm_gist.c:82
#3  0x00007f42aee25d80 in gtrgm_picksplit (fcinfo=<optimized out>) at trgm_gist.c:852
#4  0x00000000008afcfa in FunctionCall2Coll (flinfo=flinfo@entry=0x2bf9400, collation=<optimized out>, arg1=arg1@entry=43095272, arg2=arg2@entry=140737194144480) at fmgr.c:1164
#5  0x00000000004b2d03 in gistUserPicksplit (len=2, giststate=0x2bf75d8, itup=0x29193c8, v=0x7fffee76b2e0, attno=0, entryvec=0x29194e8, r=0x7f42aecd3668) at gistsplit.c:433
#6  gistSplitByKey (r=r@entry=0x7f42aecd3668, page=page@entry=0x2aaad2a7b500 <Address 0x2aaad2a7b500 out of bounds>, itup=itup@entry=0x29193c8, len=len@entry=2, giststate=giststate@entry=0x2bf75d8,
    v=v@entry=0x7fffee76b2e0, attno=attno@entry=0) at gistsplit.c:697
#7  0x00000000004aa045 in gistSplit (r=r@entry=0x7f42aecd3668, page=page@entry=0x2aaad2a7b500 <Address 0x2aaad2a7b500 out of bounds>, itup=itup@entry=0x29193c8, len=2,
    giststate=giststate@entry=0x2bf75d8) at gist.c:1443
#8  0x00000000004aa0fe in gistSplit (r=r@entry=0x7f42aecd3668, page=page@entry=0x2aaad2a7b500 <Address 0x2aaad2a7b500 out of bounds>, itup=itup@entry=0x29190b8, len=<optimized out>,
    giststate=giststate@entry=0x2bf75d8) at gist.c:1473
#9  0x00000000004aa0d6 in gistSplit (r=r@entry=0x7f42aecd3668, page=page@entry=0x2aaad2a7b500 <Address 0x2aaad2a7b500 out of bounds>, itup=itup@entry=0x2918c08, len=<optimized out>,
    giststate=giststate@entry=0x2bf75d8) at gist.c:1458
#10 0x00000000004aa0fe in gistSplit (r=r@entry=0x7f42aecd3668, page=page@entry=0x2aaad2a7b500 <Address 0x2aaad2a7b500 out of bounds>, itup=<optimized out>, len=<optimized out>,
    giststate=giststate@entry=0x2bf75d8) at gist.c:1473
#11 0x00000000004aa3d9 in gistplacetopage (rel=0x7f42aecd3668, freespace=0, giststate=giststate@entry=0x2bf75d8, buffer=15844, itup=itup@entry=0x7fffee76c018, ntup=ntup@entry=1,
    oldoffnum=oldoffnum@entry=0, newblkno=newblkno@entry=0x0, leftchildbuf=leftchildbuf@entry=0, splitinfo=splitinfo@entry=0x7fffee76bf60, markfollowright=markfollowright@entry=true,
    heapRel=0x7f42aecaffb0, is_build=false) at gist.c:303
#12 0x00000000004aaeda in gistinserttuples (state=state@entry=0x7fffee76c050, stack=stack@entry=0x2918a98, giststate=giststate@entry=0x2bf75d8, tuples=tuples@entry=0x7fffee76c018, ntup=ntup@entry=1,
    oldoffnum=oldoffnum@entry=0, leftchild=leftchild@entry=0, rightchild=rightchild@entry=0, unlockbuf=unlockbuf@entry=false, unlockleftchild=unlockleftchild@entry=false) at gist.c:1271
#13 0x00000000004ab5cf in gistinserttuple (oldoffnum=0, tuple=0x2916628, giststate=0x2bf75d8, stack=<optimized out>, state=0x7fffee76c050) at gist.c:1224
#14 gistdoinsert (r=r@entry=0x7f42aecd3668, itup=0x2916628, freespace=freespace@entry=0, giststate=giststate@entry=0x2bf75d8, heapRel=heapRel@entry=0x7f42aecaffb0, is_build=is_build@entry=false)
    at gist.c:880
#15 0x00000000004abe57 in gistinsert (r=0x7f42aecd3668, values=<optimized out>, isnull=<optimized out>, ht_ctid=0x263db48, heapRel=0x7f42aecaffb0, checkUnique=<optimized out>, indexInfo=0x263d850)
    at gist.c:180
#16 0x000000000062be6a in ExecInsertIndexTuples (slot=slot@entry=0x263db18, estate=estate@entry=0x263d1c8, noDupErr=noDupErr@entry=false, specConflict=specConflict@entry=0x0,
    arbiterIndexes=arbiterIndexes@entry=0x0) at execIndexing.c:393
#17 0x000000000065561a in ExecInsert (mtstate=mtstate@entry=0x263d598, slot=0x263db18, planSlot=0x263db18, estate=estate@entry=0x263d1c8, canSetTag=<optimized out>) at nodeModifyTable.c:624
#18 0x00000000006569d9 in ExecModifyTable (pstate=0x263d598) at nodeModifyTable.c:2246
#19 0x000000000062caa2 in ExecProcNode (node=0x263d598) at ../../../src/include/executor/executor.h:248
#20 ExecutePlan (execute_once=<optimized out>, dest=0x9fb620 <spi_printtupDR>, direction=<optimized out>, numberTuples=0, sendTuples=false, operation=CMD_INSERT, use_parallel_mode=<optimized out>,
    planstate=0x263d598, estate=0x263d1c8) at execMain.c:1646
#21 standard_ExecutorRun (queryDesc=0x29ae260, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#22 0x00007f42af46439d in pgss_ExecutorRun (queryDesc=0x29ae260, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:1045
#23 0x0000000000664e17 in _SPI_pquery (tcount=0, fire_triggers=true, queryDesc=<optimized out>) at spi.c:2511
#24 _SPI_execute_plan (plan=plan@entry=0x7fffee76c720, paramLI=paramLI@entry=0x0, snapshot=snapshot@entry=0x0, crosscheck_snapshot=crosscheck_snapshot@entry=0x0, read_only=read_only@entry=false,
    fire_triggers=fire_triggers@entry=true, tcount=tcount@entry=0) at spi.c:2288
#25 0x0000000000665079 in SPI_execute (src=src@entry=0x258c258 " \nDELETE FROM gp_metadata_gpso.tables;\nINSERT INTO gp_metadata_gpso.tables\nSELECT * FROM tables_tmp;\n", read_only=<optimized out>,
    tcount=tcount@entry=0) at spi.c:514
#26 0x00007f42af04138e in exec_stmt_dynexecute (stmt=0x25bdaf8, estate=0x7fffee76cbe0) at pl_exec.c:4429
#27 exec_stmt (estate=estate@entry=0x7fffee76cbe0, stmt=0x25bdaf8) at pl_exec.c:2056
#28 0x00007f42af0428e3 in exec_stmts (estate=0x7fffee76cbe0, stmts=0x25bdb48) at pl_exec.c:1943
#29 0x00007f42af042f22 in exec_stmt_block (estate=estate@entry=0x7fffee76cbe0, block=block@entry=0x25bdb98) at pl_exec.c:1884
#30 0x00007f42af0408ee in exec_stmt (estate=estate@entry=0x7fffee76cbe0, stmt=0x25bdb98) at pl_exec.c:1976
#31 0x00007f42af042448 in plpgsql_exec_function (func=func@entry=0x29636f8, fcinfo=fcinfo@entry=0x25567b0, simple_eval_estate=simple_eval_estate@entry=0x0,
    simple_eval_resowner=simple_eval_resowner@entry=0x0, atomic=atomic@entry=true) at pl_exec.c:610
#32 0x00007f42af04d286 in plpgsql_call_handler (fcinfo=0x25567b0) at pl_handler.c:265
#33 0x000000000062848f in ExecInterpExpr (state=0x25566d8, econtext=0x2556400, isnull=<optimized out>) at execExprInterp.c:675
#34 0x0000000000658a1f in ExecEvalExprSwitchContext (isNull=0x7fffee76ceb7, econtext=0x2556400, state=0x25566d8) at ../../../src/include/executor/executor.h:316
#35 ExecProject (projInfo=0x25566d0) at ../../../src/include/executor/executor.h:350
#36 ExecResult (pstate=<optimized out>) at nodeResult.c:136
#37 0x000000000062caa2 in ExecProcNode (node=0x25562f0) at ../../../src/include/executor/executor.h:248
#38 ExecutePlan (execute_once=<optimized out>, dest=0x2976218, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x25562f0,
    estate=0x25560c8) at execMain.c:1646
#39 standard_ExecutorRun (queryDesc=0x269a448, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364
#40 0x00007f42af46439d in pgss_ExecutorRun (queryDesc=0x269a448, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:1045
---Type <return> to continue, or q <return> to quit---
#41 0x000000000079071b in PortalRunSelect (portal=portal@entry=0x2512588, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x2976218) at pquery.c:912
#42 0x0000000000791a07 in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x2976218, altdest=0x2976218, qc=0x7fffee76d240)
    at pquery.c:756
#43 0x000000000078d6c7 in exec_simple_query (query_string=<optimized out>) at postgres.c:1239
#44 0x000000000078ea37 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4315
#45 0x00000000004879d4 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4536
#46 BackendStartup (port=0x248ff50) at postmaster.c:4220
#47 ServerLoop () at postmaster.c:1739
#48 0x0000000000718598 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2462420) at postmaster.c:1412
#49 0x000000000048890d in main (argc=3, argv=0x2462420) at main.c:210


Thanks,
Efrain J. Berdecia


Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Peter Geoghegan
Date:
On Mon, Oct 18, 2021 at 11:20 AM Efrain J. Berdecia
<ejberdecia@yahoo.com> wrote:
> #0  __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1852
> #1  0x00007f42aee2488b in memcpy (__len=12, __src=0x7023c19c7, __dest=0x291967d) at /usr/include/bits/string3.h:51
> #2  gtrgm_alloc (isalltrue=<optimized out>, siglen=siglen@entry=12, sign=0x7023c19c7 <Address 0x7023c19c7 out of bounds>) at trgm_gist.c:82
> #3  0x00007f42aee25d80 in gtrgm_picksplit (fcinfo=<optimized out>) at trgm_gist.c:852
> #4  0x00000000008afcfa in FunctionCall2Coll (flinfo=flinfo@entry=0x2bf9400, collation=<optimized out>, arg1=arg1@entry=43095272, arg2=arg2@entry=140737194144480) at fmgr.c:1164
> #5  0x00000000004b2d03 in gistUserPicksplit (len=2, giststate=0x2bf75d8, itup=0x29193c8, v=0x7fffee76b2e0, attno=0, entryvec=0x29194e8, r=0x7f42aecd3668) at gistsplit.c:433

Commit 911e702077 ("Implement operator class parameters") seems like
the most likely culprit, based on a quick "git blame". Alexander?

-- 
Peter Geoghegan



Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Alexander Korotkov
Date:
On Mon, Oct 18, 2021 at 9:27 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Mon, Oct 18, 2021 at 11:20 AM Efrain J. Berdecia
> <ejberdecia@yahoo.com> wrote:
> > #0  __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1852
> > #1  0x00007f42aee2488b in memcpy (__len=12, __src=0x7023c19c7, __dest=0x291967d) at /usr/include/bits/string3.h:51
> > #2  gtrgm_alloc (isalltrue=<optimized out>, siglen=siglen@entry=12, sign=0x7023c19c7 <Address 0x7023c19c7 out of bounds>) at trgm_gist.c:82
> > #3  0x00007f42aee25d80 in gtrgm_picksplit (fcinfo=<optimized out>) at trgm_gist.c:852
> > #4  0x00000000008afcfa in FunctionCall2Coll (flinfo=flinfo@entry=0x2bf9400, collation=<optimized out>, arg1=arg1@entry=43095272, arg2=arg2@entry=140737194144480) at fmgr.c:1164
> > #5  0x00000000004b2d03 in gistUserPicksplit (len=2, giststate=0x2bf75d8, itup=0x29193c8, v=0x7fffee76b2e0, attno=0, entryvec=0x29194e8, r=0x7f42aecd3668) at gistsplit.c:433
>
> Commit 911e702077 ("Implement operator class parameters") seems like
> the most likely culprit, based on a quick "git blame". Alexander?


As I can see from the bug report, the affected version is 13.1.  I
think this bug is already fixed by 48ab1fa304 in 13.2.

------
Regards,
Alexander Korotkov
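
The reply above reduces to a version comparison: the report is against 13.1, and the fix is believed to be in 13.2. A minimal Python sketch of that check, assuming release version strings of the form `major.minor` (as `SHOW server_version` typically reports, optionally followed by packaging text); the function names are hypothetical, and the 13.2 threshold is taken from the message above rather than verified independently:

```python
def parse_release(version):
    """Split a string such as '13.1' or '13.1 (some packaging text)' into (major, minor)."""
    number = version.split()[0]          # drop any trailing packaging text
    major, minor = number.split(".")[:2]
    return int(major), int(minor)


def predates_fix(server_version, fixed_in="13.2"):
    """True if server_version is on the same major branch as fixed_in but older."""
    major, minor = parse_release(server_version)
    fixed_major, fixed_minor = parse_release(fixed_in)
    return major == fixed_major and minor < fixed_minor
```

The check deliberately says nothing about other major branches; it only answers whether a 13.x server is older than the release named in the reply.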



Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
"Efrain J. Berdecia"
Date:
We'll download the latest dot release and test again.


Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
"Efrain J. Berdecia"
Date:
We've been running 13.4 since last Thursday on our test cluster and haven't seen the dump.

I would say the latest patch addressed the issue.

On a separate note, applying 13.4 on a running cluster caused some issues with:

"could not load library "/usr/pgsql-13/lib/plpgsql.so": /usr/pgsql-13/lib/plpgsql.so: undefined symbol: EnsurePortalSnapshotExists"

...just FYI....

I would consider this issue with the gist index resolved.

Thanks,
Efrain J. Berdecia


Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Alexander Korotkov
Date:
Hi!

On Mon, Oct 25, 2021 at 4:49 PM Efrain J. Berdecia <ejberdecia@yahoo.com> wrote:
> We've been running 13.4 since last Thursday on our test cluster and haven't seen the dump.
>
> I would say the latest patch addressed the issue.

Cool, thank you for reporting it!

> On a separate note, applying 13.4 on a running cluster caused some issues with:
>
> "could not load library "/usr/pgsql-13/lib/plpgsql.so": /usr/pgsql-13/lib/plpgsql.so: undefined symbol: EnsurePortalSnapshotExists"

Not sure what is our policy on introducing incompatibilities between
extensions and postgres binary in minor releases...

------
Regards,
Alexander Korotkov



Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> On Mon, Oct 25, 2021 at 4:49 PM Efrain J. Berdecia <ejberdecia@yahoo.com> wrote:
>> On a separate note, applying 13.4 on a running cluster caused some issues with:
>> "could not load library "/usr/pgsql-13/lib/plpgsql.so": /usr/pgsql-13/lib/plpgsql.so: undefined symbol: EnsurePortalSnapshotExists"

That sounds a lot like the OP hadn't actually restarted the server,
or at least not right away.  A 13.4 plpgsql.so loading into a 13.3
server would do that.

> Not sure what is our policy on introducing incompatibilities between
> extensions and postgres binary in minor releases...

We've done that before, we'll do it again.  You can't realistically
fix many bugs without adding new functions.

            regards, tom lane
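
The diagnosis above can be checked by comparing the version the running server reports (for example via `SHOW server_version`) with the version of the binaries now on disk (for example the output of `postgres --version`). A minimal Python sketch of that comparison, assuming both strings contain a `major.minor` release number; the function names are hypothetical:

```python
import re


def extract_release(text):
    """Return (major, minor) from the first 'N.M' number found in text."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no major.minor version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def restart_pending(running_server_version, installed_binary_banner):
    """True if the running server is a different release than the installed binaries."""
    return extract_release(running_server_version) != extract_release(
        installed_binary_banner
    )
```

If `restart_pending` returns True, the server process in memory predates the files on disk, which is the situation in which a newly installed `plpgsql.so` can reference a symbol the old server binary does not export.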



Re: BUG #17229: Segmentation Fault after upgrading to version 13

From
Alexander Korotkov
Date:
On Tue, Oct 26, 2021 at 2:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > On Mon, Oct 25, 2021 at 4:49 PM Efrain J. Berdecia <ejberdecia@yahoo.com> wrote:
> >> On a separate note, applying 13.4 on a running cluster caused some issues with:
> >> "could not load library "/usr/pgsql-13/lib/plpgsql.so": /usr/pgsql-13/lib/plpgsql.so: undefined symbol: EnsurePortalSnapshotExists"
>
> That sounds a lot like the OP hadn't actually restarted the server,
> or at least not right away.  A 13.4 plpgsql.so loading into a 13.3
> server would do that.
>
> > Not sure what is our policy on introducing incompatibilities between
> > extensions and postgres binary in minor releases...
>
> We've done that before, we'll do it again.  You can't realistically
> fix many bugs without adding new functions.

OK, thank you for the clarification!

------
Regards,
Alexander Korotkov