Discussion: Server crash on RHEL 9/s390x platform against PG16

Server crash on RHEL 9/s390x platform against PG16

From
Suraj Kharage
Date:
Hi,

I found a server crash on the RHEL 9/s390x platform with the test case below:

Machine details:
[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)
[edb@9428da9d2137 postgres]$ lscpu
Architecture:           s390x
  CPU op-mode(s):       32-bit, 64-bit
  Address sizes:        39 bits physical, 48 bits virtual
  Byte Order:           Big Endian

Configure command:
./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm --with-perl --with-python --with-tcl --with-openssl --enable-nls --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu --enable-debug --enable-cassert --with-pgport=5414


Test case:
CREATE TABLE rm32044_t1
(
    pkey   integer,
    val  text
);
CREATE TABLE rm32044_t2
(
    pkey   integer,
    label  text,
    hidden boolean
);
CREATE TABLE rm32044_t3
(
        pkey integer,
        val integer
);
CREATE TABLE rm32044_t4
(
        pkey integer
);
insert into rm32044_t1 values ( 1 , 'row1');
insert into rm32044_t1 values ( 2 , 'row2');
insert into rm32044_t2 values ( 1 , 'hidden', true);
insert into rm32044_t2 values ( 2 , 'visible', false);
insert into rm32044_t3 values (1 , 1);
insert into rm32044_t3 values (2 , 1);

postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

backtrace:
[edb@9428da9d2137 postgres]$ gdb bin/postgres data/qemu_postgres_20230911-140628_65620.core
Core was generated by `postgres: edb postgres [local] SELECT  '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000010a8366 in heap_compute_data_size (tupleDesc=tupleDesc@entry=0x1ba3d10, values=values@entry=0x1ba4168, isnull=isnull@entry=0x1ba41a8) at heaptuple.c:227
227 VARATT_CAN_MAKE_SHORT(DatumGetPointer(val)))
[Current thread is 1 (LWP 65597)]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-60.el9.s390x libcap-2.48-8.el9.s390x libedit-3.1-37.20210216cvs.el9.s390x libffi-3.4.2-7.el9.s390x libgcc-11.3.1-4.3.el9.alma.s390x libgcrypt-1.10.0-10.el9_2.s390x libgpg-error-1.42-5.el9.s390x libstdc++-11.3.1-4.3.el9.alma.s390x libxml2-2.9.13-3.el9_2.1.s390x libzstd-1.5.1-2.el9.s390x llvm-libs-15.0.7-1.el9.s390x lz4-libs-1.9.3-5.el9.s390x ncurses-libs-6.2-8.20210508.el9.s390x openssl-libs-3.0.7-17.el9_2.s390x systemd-libs-252-14.el9_2.3.s390x xz-libs-5.2.5-8.el9_0.s390x
(gdb) bt
#0  0x00000000010a8366 in heap_compute_data_size (tupleDesc=tupleDesc@entry=0x1ba3d10, values=values@entry=0x1ba4168, isnull=isnull@entry=0x1ba41a8) at heaptuple.c:227
#1  0x00000000010a9bb0 in heap_form_minimal_tuple (tupleDescriptor=0x1ba3d10, values=0x1ba4168, isnull=0x1ba41a8) at heaptuple.c:1484
#2  0x00000000016553fa in ExecCopySlotMinimalTuple (slot=<optimized out>) at ../../../../src/include/executor/tuptable.h:472
#3  tuplesort_puttupleslot (state=state@entry=0x1be4d18, slot=slot@entry=0x1ba4120) at tuplesortvariants.c:610
#4  0x00000000012dc0e0 in ExecIncrementalSort (pstate=0x1acb4d8) at nodeIncrementalSort.c:716
#5  0x00000000012b32c6 in ExecProcNode (node=0x1acb4d8) at ../../../src/include/executor/executor.h:273
#6  ExecutePlan (execute_once=<optimized out>, dest=0x1ade698, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x1acb4d8, estate=0x1acb258) at execMain.c:1670
#7  standard_ExecutorRun (queryDesc=0x19ad338, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:365
#8  0x00000000014a6ae2 in PortalRunSelect (portal=portal@entry=0x1a63558, forward=forward@entry=true, count=0, count@entry=9223372036854775807, dest=dest@entry=0x1ade698) at pquery.c:924
#9  0x00000000014a84e0 in PortalRun (portal=portal@entry=0x1a63558, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x1ade698, altdest=0x1ade698, qc=0x40007ff7b0) at pquery.c:768
#10 0x00000000014a3c1c in exec_simple_query (
    query_string=0x19ea0e8 "SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;") at postgres.c:1274
#11 0x00000000014a57aa in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637
#12 0x00000000013fdaf6 in BackendRun (port=0x1a132c0, port=0x1a132c0) at postmaster.c:4464
#13 BackendStartup (port=0x1a132c0) at postmaster.c:4192
#14 ServerLoop () at postmaster.c:1782
#15 0x00000000013fec34 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x19a59a0) at postmaster.c:1466
#16 0x0000000001096faa in main (argc=<optimized out>, argv=0x19a59a0) at main.c:198

(gdb) p val
$1 = 0
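
For readers following the backtrace: frame #0 stops at heaptuple.c:227 with `val = 0`, which suggests an attribute that is marked non-null but carries a zero datum, so the pointer dereference faults. The sketch below is a simplified, self-contained model of that situation in C, not PostgreSQL's actual code. `ToyDatum`, `ToyText`, and `toy_compute_data_size` are hypothetical names introduced only for illustration, and every attribute is assumed to be a text-like, by-reference value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a datum: a pointer-sized integer. */
typedef uintptr_t ToyDatum;

/* Hypothetical text-like value: a length header followed by payload bytes. */
typedef struct ToyText
{
    uint32_t len;
    char     data[16];
} ToyText;

/*
 * Sum header + payload sizes of all non-null attributes.
 * Unlike an unguarded loop, this returns false (and reports the attribute
 * index in *bad_att) when an attribute is marked non-null but its datum is
 * zero, instead of dereferencing a null pointer.
 */
bool
toy_compute_data_size(const ToyDatum *values, const bool *isnull,
                      int natts, size_t *total, int *bad_att)
{
    size_t sum = 0;

    assert(values != NULL && isnull != NULL);
    assert(total != NULL && bad_att != NULL);

    for (int i = 0; i < natts; i++)
    {
        const ToyText *t;

        if (isnull[i])
            continue;           /* null attributes contribute no data */

        if (values[i] == 0)
        {
            *bad_att = i;       /* non-null flag, but no pointer to follow */
            return false;
        }

        t = (const ToyText *) values[i];
        sum += sizeof(t->len) + t->len;
    }

    *total = sum;
    *bad_att = -1;
    return true;
}
```

In this model, a zero datum at index 1 with `isnull[1] = false` makes the function return false and report index 1. An unguarded dereference at that point would fault, which is consistent with the SIGSEGV in the backtrace above.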

Does anybody have any idea about this?

--
Thanks & Regards,
Suraj Kharage

Re: Server crash on RHEL 9/s390x platform against PG16

From
Suraj Kharage
Date:
A few more details on this:

(gdb) p val
$1 = 0
(gdb) p i
$2 = 3
(gdb) f 3
#3  0x0000000001a1ef70 in ExecCopySlotMinimalTuple (slot=0x202e4f8) at ../../../../src/include/executor/tuptable.h:472
472 return slot->tts_ops->copy_minimal_tuple(slot);
(gdb) p *slot
$3 = {type = T_TupleTableSlot, tts_flags = 16, tts_nvalid = 8, tts_ops = 0x1b6dcc8 <TTSOpsVirtual>, tts_tupleDescriptor = 0x202e0e8, tts_values = 0x202e540, tts_isnull = 0x202e580, tts_mcxt = 0x1f54550, tts_tid = {ip_blkid = {bi_hi = 65535,
      bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}
(gdb) p *slot->tts_tupleDescriptor
$2 = {natts = 8, tdtypeid = 2249, tdtypmod = -1, tdrefcount = -1, constr = 0x0, attrs = 0x202cd28}

(gdb) p slot.tts_values[3]
$4 = 0
(gdb) p slot.tts_values[2]
$5 = 1
(gdb) p slot.tts_values[1]
$6 = 34027556


According to the result slot, the attribute at index 3 (zero-based; column label) has the value 0.
I'm testing this in a Docker container and am facing some issues with gdb, so I could not debug it further.

Here is the EXPLAIN plan:

postgres=# explain (verbose, costs off) SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Incremental Sort
   Output: rm32044_t1.pkey, rm32044_t1.val, rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden, rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey
   Sort Key: rm32044_t1.pkey, rm32044_t2.label, rm32044_t2.hidden
   Presorted Key: rm32044_t1.pkey
   ->  Merge Left Join
         Output: rm32044_t1.pkey, rm32044_t1.val, rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden, rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey
         Merge Cond: (rm32044_t1.pkey = rm32044_t2.pkey)
         ->  Sort
               Output: rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey, rm32044_t1.pkey, rm32044_t1.val
               Sort Key: rm32044_t1.pkey
               ->  Nested Loop
                     Output: rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey, rm32044_t1.pkey, rm32044_t1.val
                     ->  Merge Left Join
                           Output: rm32044_t3.pkey, rm32044_t3.val, rm32044_t4.pkey
                           Merge Cond: (rm32044_t3.pkey = rm32044_t4.pkey)
                           ->  Sort
                                 Output: rm32044_t3.pkey, rm32044_t3.val
                                 Sort Key: rm32044_t3.pkey
                                 ->  Seq Scan on public.rm32044_t3
                                       Output: rm32044_t3.pkey, rm32044_t3.val
                           ->  Sort
                                 Output: rm32044_t4.pkey
                                 Sort Key: rm32044_t4.pkey
                                 ->  Seq Scan on public.rm32044_t4
                                       Output: rm32044_t4.pkey
                     ->  Materialize
                           Output: rm32044_t1.pkey, rm32044_t1.val
                           ->  Seq Scan on public.rm32044_t1
                                 Output: rm32044_t1.pkey, rm32044_t1.val
         ->  Sort
               Output: rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden
               Sort Key: rm32044_t2.pkey
               ->  Seq Scan on public.rm32044_t2
                     Output: rm32044_t2.pkey, rm32044_t2.label, rm32044_t2.hidden
(34 rows)


It seems that, while building the inner slot for the merge join, the value for attnum 1 is not fetched correctly.

On Tue, Sep 12, 2023 at 3:27 PM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:

--
Thanks & Regards,
Suraj Kharage

Re: Server crash on RHEL 9/s390x platform against PG16

From
Suraj Kharage
Date:
It looks like an issue with JIT. If I disable JIT, the above query runs successfully.

postgres=# set jit to off;
SET
postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
 pkey | val  | pkey |  label  | hidden | pkey | val | pkey 
------+------+------+---------+--------+------+-----+------
    1 | row1 |    1 | hidden  | t      |    1 |   1 |     
    1 | row1 |    1 | hidden  | t      |    2 |   1 |     
    2 | row2 |    2 | visible | f      |    1 |   1 |     
    2 | row2 |    2 | visible | f      |    2 |   1 |     
(4 rows)
Any ideas on this?


On Mon, Sep 18, 2023 at 11:20 AM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:

--
Thanks & Regards,
Suraj Kharage

Re: Server crash on RHEL 9/s390x platform against PG16

From
Suraj Kharage
Date:
Here is the clang version:

[edb@9428da9d2137]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin


Let me know if any further information is needed.


On Mon, Oct 9, 2023 at 8:21 AM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:
#13 BackendStartup (port=0x1a132c0) at postmaster.c:4192
#14 ServerLoop () at postmaster.c:1782
#15 0x00000000013fec34 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x19a59a0) at postmaster.c:1466
#16 0x0000000001096faa in main (argc=<optimized out>, argv=0x19a59a0) at main.c:198

(gdb) p val
$1 = 0
```

Does anybody have any idea about this?

--
--

Thanks & Regards, 
Suraj kharage, 




Re: Server crash on RHEL 9/s390x platform against PG16

From
Robert Haas
Date:
On Sun, Oct 8, 2023 at 10:55 PM Suraj Kharage <suraj.kharage@enterprisedb.com> wrote:
> It looks like an issue with JIT. If I disable the JIT then the above query runs successfully.
>
> postgres=# set jit to off;
> SET
> postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey = rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
>  pkey | val  | pkey |  label  | hidden | pkey | val | pkey
> ------+------+------+---------+--------+------+-----+------
>     1 | row1 |    1 | hidden  | t      |    1 |   1 |
>     1 | row1 |    1 | hidden  | t      |    2 |   1 |
>     2 | row2 |    2 | visible | f      |    1 |   1 |
>     2 | row2 |    2 | visible | f      |    2 |   1 |
> (4 rows)
>
> Any idea on this?


No, but I found a few previous threads complaining about JIT not working on s390x.


The most interesting email I found in those threads was this one:

 
The backtrace there is different from the one you posted here in significant ways, but it seems like both that case and this one involve a null pointer showing up for a non-null pass-by-reference datum. That doesn't seem like a whole lot to go on, but maybe somebody who understands the JIT stuff better than I do will have an idea.
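To make that failure mode concrete, here is a minimal sketch in C. It is not the PostgreSQL source: `ToyAttr` and `toy_compute_data_size` are hypothetical stand-ins, loosely modelled on the loop the backtrace points at in `heap_compute_data_size`. The assumption is that a column not flagged in `isnull` is trusted to carry a valid pointer when its type is variable-length, so a zero datum (as in `p val` printing 0) leads to a null dereference.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's Datum: an integer wide enough
 * to hold either a by-value column or a pointer to by-reference data. */
typedef uintptr_t Datum;

/* Hypothetical, heavily reduced attribute descriptor. */
typedef struct
{
    int attlen;     /* -1 marks a variable-length, pass-by-reference type */
} ToyAttr;

/*
 * Toy size computation. By-value columns contribute attlen bytes;
 * variable-length columns contribute the length of the string the datum
 * points to. Unlike the code in the backtrace, this version returns
 * false when it meets the inconsistent state (isnull[i] is false but
 * the pointer is NULL) instead of dereferencing it.
 */
static bool
toy_compute_data_size(const ToyAttr *attrs, const Datum *values,
                      const bool *isnull, int natts, size_t *size)
{
    size_t total = 0;

    for (int i = 0; i < natts; i++)
    {
        if (isnull[i])
            continue;           /* NULL columns take no data space */

        if (attrs[i].attlen == -1)
        {
            const char *ptr = (const char *) values[i];

            if (ptr == NULL)
                return false;   /* a blind dereference here would SIGSEGV */
            total += strlen(ptr);
        }
        else
            total += (size_t) attrs[i].attlen;
    }

    *size = total;
    return true;
}
```

With an integer column and a text column, a consistent row is sized normally, while a row whose text datum is 0 and whose `isnull` flag is false is rejected. If the flag had been set correctly, the column would simply be skipped. That is why the question is which component produced the mismatched value/null pair, not why the tuple-forming code trusted it.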

--

Re: Server crash on RHEL 9/s390x platform against PG16

From
Andres Freund
Date:
Hi,

On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:
> *[edb@9428da9d2137 postgres]$ cat /etc/redhat-release AlmaLinux release 9.2
> (Turquoise Kodkod)[edb@9428da9d2137 postgres]$ lscpuArchitecture:
> s390x  CPU op-mode(s):       32-bit, 64-bit  Address sizes:        39 bits

Can you provide the rest of the lscpu output?  There have been issues with Z14
vs Z15:
https://github.com/llvm/llvm-project/issues/53009

You're apparently not hitting that, but given that fact, you either are on a
slightly older CPU, or you have applied a patch to work around it. Because
otherwise your build instructions below would hit that problem, I think.


> physical, 48 bits virtual  Byte Order:           Big Endian*
> *Configure command:*
> ./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
> --with-perl --with-python --with-tcl --with-openssl --enable-nls
> --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
> --enable-debug --enable-cassert --with-pgport=5414

Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have you
verified the issue reproduces on upstream postgres?

> 
> *Test case:*
> CREATE TABLE rm32044_t1
> (
>     pkey   integer,
>     val  text
> );
> CREATE TABLE rm32044_t2
> (
>     pkey   integer,
>     label  text,
>     hidden boolean
> );
> CREATE TABLE rm32044_t3
> (
>         pkey integer,
>         val integer
> );
> CREATE TABLE rm32044_t4
> (
>         pkey integer
> );
> insert into rm32044_t1 values ( 1 , 'row1');
> insert into rm32044_t1 values ( 2 , 'row2');
> insert into rm32044_t2 values ( 1 , 'hidden', true);
> insert into rm32044_t2 values ( 2 , 'visible', false);
> insert into rm32044_t3 values (1 , 1);
> insert into rm32044_t3 values (2 , 1);
> 
> postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
> = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
> rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;

> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.

I tried this on both master and 16, without hitting this issue.

If you can reproduce the issue on upstream postgres, can you share more about
your configuration?

Greetings,

Andres Freund



Re: Server crash on RHEL 9/s390x platform against PG16

From
Suraj Kharage
Date:


On Sat, Oct 21, 2023 at 5:17 AM Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:
> > *[edb@9428da9d2137 postgres]$ cat /etc/redhat-release AlmaLinux release 9.2
> > (Turquoise Kodkod)[edb@9428da9d2137 postgres]$ lscpuArchitecture:
> > s390x  CPU op-mode(s):       32-bit, 64-bit  Address sizes:        39 bits
>
> Can you provide the rest of the lscpu output?  There have been issues with Z14
> vs Z15:
> https://github.com/llvm/llvm-project/issues/53009
>
> You're apparently not hitting that, but given that fact, you either are on a
> slightly older CPU, or you have applied a patch to work around it. Because
> otherwise your build instructions below would hit that problem, I think.
>
> > physical, 48 bits virtual  Byte Order:           Big Endian*
> > *Configure command:*
> > ./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
> > --with-perl --with-python --with-tcl --with-openssl --enable-nls
> > --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
> > --enable-debug --enable-cassert --with-pgport=5414
>
> Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have you
> verified the issue reproduces on upstream postgres?

Yes, I can reproduce this on upstream postgres master and v16 branch.

Here are details:

./configure --prefix=/home/edb/postgres/ --with-zstd --with-llvm --with-perl --with-python --with-tcl --with-openssl --enable-nls --with-libxml --with-libxslt --with-systemd --without-icu --enable-debug --enable-cassert --with-pgport=5414 CFLAGS="-g -O0"



[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)

[edb@9428da9d2137 edbas]$ lscpu
Architecture:           s390x
  CPU op-mode(s):       32-bit, 64-bit
  Address sizes:        39 bits physical, 48 bits virtual
  Byte Order:           Big Endian
CPU(s):                 9
  On-line CPU(s) list:  0-8
Vendor ID:              GenuineIntel
  Model name:           Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
    CPU family:         6
    Model:              158
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s):          9
    Stepping:           10
    BogoMIPS:           5200.00
    Flags:              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 avx2 bmi2 erms xsaveopt arat
Caches (sum of all):
  L1d:                  288 KiB (9 instances)
  L1i:                  288 KiB (9 instances)
  L2:                   2.3 MiB (9 instances)
  L3:                   108 MiB (9 instances)
Vulnerabilities:
  Itlb multihit:        KVM: Mitigation: VMX unsupported
  L1tf:                 Mitigation; PTE Inversion
  Mds:                  Vulnerable; SMT Host state unknown
  Meltdown:             Vulnerable
  Mmio stale data:      Vulnerable
  Spec store bypass:    Vulnerable
  Spectre v1:           Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
  Spectre v2:           Vulnerable, STIBP: disabled
  Srbds:                Unknown: Dependent on hypervisor status
  Tsx async abort:      Not affected

[edb@9428da9d2137 postgres]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

[edb@9428da9d2137 postgres]$ rpm -qa | grep llvm
llvm-libs-15.0.7-1.el9.s390x
llvm-15.0.7-1.el9.s390x
llvm-test-15.0.7-1.el9.s390x
llvm-static-15.0.7-1.el9.s390x
llvm-devel-15.0.7-1.el9.s390x

 
Please let me know if any further information is required.


> >
> > *Test case:*
> > CREATE TABLE rm32044_t1
> > (
> >     pkey   integer,
> >     val  text
> > );
> > CREATE TABLE rm32044_t2
> > (
> >     pkey   integer,
> >     label  text,
> >     hidden boolean
> > );
> > CREATE TABLE rm32044_t3
> > (
> >         pkey integer,
> >         val integer
> > );
> > CREATE TABLE rm32044_t4
> > (
> >         pkey integer
> > );
> > insert into rm32044_t1 values ( 1 , 'row1');
> > insert into rm32044_t1 values ( 2 , 'row2');
> > insert into rm32044_t2 values ( 1 , 'hidden', true);
> > insert into rm32044_t2 values ( 2 , 'visible', false);
> > insert into rm32044_t3 values (1 , 1);
> > insert into rm32044_t3 values (2 , 1);
> >
> > postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
> > = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
> > rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
>
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> > The connection to the server was lost. Attempting reset: Failed.
>
> I tried this on both master and 16, without hitting this issue.
>
> If you can reproduce the issue on upstream postgres, can you share more about
> your configuration?
>
> Greetings,
>
> Andres Freund


--
--

Thanks & Regards, 
Suraj kharage,