Обсуждение: BUG #16976: server crash when deleting via a trigger on a foreign table

Поиск
Список
Период
Сортировка

BUG #16976: server crash when deleting via a trigger on a foreign table

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      16976
Logged by:          Herwig Goemans
Email address:      herwig.goemans@gmail.com
PostgreSQL version: 12.6
Operating system:   ubuntu 18.04
Description:

Hello,

I have experienced a server crash when deleting from a table that ahs a
before trigger on it. The before trigger calls a function that deletes from
a foreign (informix) table.
The table in PG:
CREATE TABLE metadblucht.c_eenheid (
    enh_id serial NOT NULL,
    dat_wijziging timestamp NOT NULL,
    gebr_wijziging varchar(255) NULL,
    CONSTRAINT pk_enh PRIMARY KEY (enh_id)
);
Trigger and function:
create trigger td_c_eenheid before
delete
    on
    metadblucht.c_eenheid for each row execute function pd_c_eenheid();
create trigger ti_c_eenheid before
insert
    on
    metadblucht.c_eenheid for each row execute function pi_c_eenheid();
create trigger tu_c_eenheid before
update
    on
    metadblucht.c_eenheid for each row execute function pu_c_eenheid();
    
   
   CREATE OR REPLACE FUNCTION metadblucht.pd_c_eenheid()
 RETURNS trigger
 LANGUAGE plpgsql
AS $function$
            BEGIN
                delete from informix.t_water_eenheid
                where enh_id = new.enh_id;
            END;
        $function$
;
foreign table: 
CREATE FOREIGN TABLE informix.t_water_eenheid (
    enh_id int4 NOT NULL
)
SERVER o_oltp_app_water
OPTIONS (table 'c_eenheid_metadblucht', database 'water', client_locale
'en_US.utf8', db_locale 'nl_be.utf8');

The statement causing the server crash:
start transaction;
delete from metadblucht.c_eenheid where enh_id = 40;

If I do a delete directly on the foreign table it works fine:

start transaction;
delete from informix.t_water_eenheid  where enh_id = 40;
rollback;

This is the relevant section in the PG log file:
__2021-04-21 11:54:30 CEST LOG:  server process (PID 3933) was terminated by
signal 11: Segmentation fault
__2021-04-21 11:54:30 CEST DETAIL:  Failed process was running: delete from
metadblucht.c_eenheid where enh_id = 40
__2021-04-21 11:54:30 CEST LOG:  terminating any other active server
processes
gebouwenbeheerdba_gebouwenbeheer_2021-04-21 11:54:30 CEST WARNING:
terminating connection because of crash of another server process
gebouwenbeheerdba_gebouwenbeheer_2021-04-21 11:54:30 CEST DETAIL:  The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnormally and
possibly corrupted shared memory.
gebouwenbeheerdba_gebouwenbeheer_2021-04-21 11:54:30 CEST HINT:  In a moment
you should be able to reconnect to the database and repeat your command.
oari_ari_2021-04-21 11:54:30 CEST WARNING:  terminating connection because
of crash of another server process
oari_ari_2021-04-21 11:54:30 CEST DETAIL:  The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
oari_ari_2021-04-21 11:54:30 CEST HINT:  In a moment you should be able to
reconnect to the database and repeat your command.


I hope someone can help.
Thanks.


Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
> I have experienced a server crash when deleting from a table that ahs a
> before trigger on it. The before trigger calls a function that deletes from
> a foreign (informix) table.

Hm, can you get a stack trace from that?

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

It seems fairly likely that the bug is in whatever FDW you're using
rather than in core Postgres.  A stack trace would help to
confirm or refute that.

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Hi,


I was in the process of reproducing the problem and following the 
guidelines to produce the stack trace... but now it works.

I asked my colleagues if they changed anything and will let you know.

Thanks for the quick response, and I will keep you posted if I have any 
news.



Kind regards,


Herwig

On 21/04/2021 15:40, Tom Lane wrote:
>> I have experienced a server crash when deleting from a table that ahs a
>> before trigger on it. The before trigger calls a function that deletes from
>> a foreign (informix) table.
> Hm, can you get a stack trace from that?
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> It seems fairly likely that the bug is in whatever FDW you're using
> rather than in core Postgres.  A stack trace would help to
> confirm or refute that.
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Hi,


I tried it again. With a previous test the trigger did not exists, that 
is why it worked.

Anyway, with the trigger in place we encounter PG server crashes.

I am having trouble collection a trace file because when I attach gdb to 
the process:

0x00007f99b46f7a07 in epoll_wait (epfd=5, events=0x561321b40630, 
maxevents=maxevents@entry=1, timeout=timeout@entry=-1) at 
../sysdeps/unix/sysv/linux/epoll_wait.c:30
30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) quit

or

Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading 
symbols from 
/usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.27.so...done.
done.
0x00007fd77dc3343e in __libc_read (fd=0, buf=0x7ffc49e20010, nbytes=1) 
at ../sysdeps/unix/sysv/linux/read.c:27
27      ../sysdeps/unix/sysv/linux/read.c: No such file or directory.
(gdb) quit
A debugging session is active.

         Inferior 1 [process 35694] will be detached.

And I cannot execute anything in the attached (psql) process,

it complains about a file not found , always in sysdeps. ( i do not find 
this directory on the server)

Anyway when the trigger is fired, it crashes the server.

I can delete on the foreign table directly and I can delete on the 
foreign table with a function that returns void instead of trigger. all 
that works fine.

The function is:

  CREATE OR REPLACE FUNCTION metadblucht.test_herwig(p_enh_id int)
  RETURNS void
  LANGUAGE plpgsql
AS $function$
             BEGIN
                 delete from informix.t_water_eenheid
                 where enh_id = p_enh_id;
             END;
         $function$
;


When I run this function, it works fine.

SELECT metadblucht.test_herwig(38);

and the record is deleted from the foriegn table.

I can delete from the foreign table directly:

delete from informix.t_water_eenheid  where enh_id = 40;

that works fine as well.


Can you continue zith this information or do you still need the trace 
file. I have no idea what I can do to resolve the issue with sysdeps not 
on the server.


Kind regards,


Herwig


On 21/04/2021 15:40, Tom Lane wrote:
>> I have experienced a server crash when deleting from a table that ahs a
>> before trigger on it. The before trigger calls a function that deletes from
>> a foreign (informix) table.
> Hm, can you get a stack trace from that?
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
> It seems fairly likely that the bug is in whatever FDW you're using
> rather than in core Postgres.  A stack trace would help to
> confirm or refute that.
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Michael Paquier
Дата:
On Thu, Apr 22, 2021 at 12:09:44PM +0200, Herwig Goemans wrote:
> Anyway, with the trigger in place we encounter PG server crashes.
>
> I am having trouble collection a trace file because when I attach gdb to the
> process:
>
> 0x00007f99b46f7a07 in epoll_wait (epfd=5, events=0x561321b40630,
> maxevents=maxevents@entry=1, timeout=timeout@entry=-1) at
> ../sysdeps/unix/sysv/linux/epoll_wait.c:30
> 30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
> (gdb) quit

What does a simple "bt" print?
--
Michael

Вложения

Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Hi,


I am not sure what you mean, I am not a developer.


What I did:

started gdb and then:

(gdb) set pagination 0
(gdb) attach 84465
Attaching to process 84465

0x00007f58bd33143e in __libc_read (fd=0, buf=0x7ffe09e53c80, nbytes=1) 
at ../sysdeps/unix/sysv/linux/read.c:27
27      ../sysdeps/unix/sysv/linux/read.c: No such file or directory.
(gdb) continue
Continuing.
[Inferior 1 (process 84465) exited normally]
(gdb) backtrace full
No stack.
(gdb) quit

that's it.

I think I have to get a stacktrace of the postgres server process , not 
of psql but I do not how to do that.


Herwig


On 22/04/2021 12:52, Michael Paquier wrote:
> On Thu, Apr 22, 2021 at 12:09:44PM +0200, Herwig Goemans wrote:
>> Anyway, with the trigger in place we encounter PG server crashes.
>>
>> I am having trouble collection a trace file because when I attach gdb to the
>> process:
>>
>> 0x00007f99b46f7a07 in epoll_wait (epfd=5, events=0x561321b40630,
>> maxevents=maxevents@entry=1, timeout=timeout@entry=-1) at
>> ../sysdeps/unix/sysv/linux/epoll_wait.c:30
>> 30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
>> (gdb) quit
> What does a simple "bt" print?
> --
> Michael



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Thu, Apr 22, 2021 at 12:09:44PM +0200, Herwig Goemans wrote:
>> I am having trouble collection a trace file because when I attach gdb to the
>> process:
>> 0x00007f99b46f7a07 in epoll_wait (epfd=5, events=0x561321b40630,
>> maxevents=maxevents@entry=1, timeout=timeout@entry=-1) at
>> ../sysdeps/unix/sysv/linux/epoll_wait.c:30
>> 30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.

> What does a simple "bt" print?

Yeah, bt should work anyway, though you won't get details about functions
for which the corresponding "debug symbols" package isn't installed.
The above is showing that you lack debug symbols for glibc, but it's
unlikely that that is going to be of interest here.  What *is* of interest
is Postgres proper and the relevant FDW, so make sure that those symbols
are installed.  (At least on Red Hat-based systems, gdb will tell you
which debug symbol packages you're missing when you first attach to
a process or core file.)

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
Herwig Goemans <herwig.goemans@gmail.com> writes:
> I think I have to get a stacktrace of the postgres server process , not
> of psql but I do not how to do that.

Right.  The wiki page I pointed you to upthread has recipes for that.

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Ok,


  I have a core dump file, It is not generated by a test of me, but that 
does not matter, the sql statement was launched by a colleage of mine 
and it generated a core dump.

The file is 50MB (compressed) how do I get it to you ?


Herwig

On 22/04/2021 15:50, Tom Lane wrote:
> Herwig Goemans <herwig.goemans@gmail.com> writes:
>> I think I have to get a stacktrace of the postgres server process , not
>> of psql but I do not how to do that.
> Right.  The wiki page I pointed you to upthread has recipes for that.
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
Herwig Goemans <herwig.goemans@gmail.com> writes:
>   I have a core dump file, It is not generated by a test of me, but that 
> does not matter, the sql statement was launched by a colleage of mine 
> and it generated a core dump.
> The file is 50MB (compressed) how do I get it to you ?

We don't want the core file; it'd be useless without an exact copy of your
postgres executables and system libraries.  What we want you to do is
run gdb's "bt full" command on it and send us the output.

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Via wetrasnfer:

https://we.tl/t-w3sPuA29Pu


core dump file


On 22/04/2021 15:50, Tom Lane wrote:
> Herwig Goemans <herwig.goemans@gmail.com> writes:
>> I think I have to get a stacktrace of the postgres server process , not
>> of psql but I do not how to do that.
> Right.  The wiki page I pointed you to upthread has recipes for that.
>
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Right,


Sorry I sent the file to you via wetransfer but you can ignore it.

Anyway


root@o-oltpmilieu03-UB-201906:~# gdb /usr/lib/postgresql/12/bin/postgres 
/home/local/VMM/h.goemans/_usr_lib_postgresql_12_bin_postgres.201.crash
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/12/bin/postgres...Reading 
symbols from 
/usr/lib/debug/.build-id/ec/40c5b4738ba1aef9fb39dfbbf928f1bbe57743.debug...done.
done.
"/home/local/VMM/h.goemans/_usr_lib_postgresql_12_bin_postgres.201.crash" 
is not a core dump: File format not recognized

It says the file is an ascii text file with very long lines...


I am lost (again)

Herwig

On 22/04/2021 16:14, Tom Lane wrote:
> Herwig Goemans <herwig.goemans@gmail.com> writes:
>>    I have a core dump file, It is not generated by a test of me, but that
>> does not matter, the sql statement was launched by a colleage of mine
>> and it generated a core dump.
>> The file is 50MB (compressed) how do I get it to you ?
> We don't want the core file; it'd be useless without an exact copy of your
> postgres executables and system libraries.  What we want you to do is
> run gdb's "bt full" command on it and send us the output.
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
Herwig Goemans <herwig.goemans@gmail.com> writes:
> "/home/local/VMM/h.goemans/_usr_lib_postgresql_12_bin_postgres.201.crash" 
> is not a core dump: File format not recognized

Hmm, that's a new one on me, but also that file name doesn't look
very much like a core file name --- usually they're just "core.NNNN"
or such.  Maybe this is a version you'd base64-encoded to put onto
your web server, or something like that?

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Clinton Adams
Дата:
On Thu, Apr 22, 2021 at 10:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Herwig Goemans <herwig.goemans@gmail.com> writes:
> > "/home/local/VMM/h.goemans/_usr_lib_postgresql_12_bin_postgres.201.crash"
> > is not a core dump: File format not recognized
>
> Hmm, that's a new one on me, but also that file name doesn't look
> very much like a core file name --- usually they're just "core.NNNN"
> or such.  Maybe this is a version you'd base64-encoded to put onto

Hi Herwig,

On ubuntu you have to jump through a hoop to get the core file. I'd
recommend just unpacking it, with apport-unpack.

apport-unpack (crash file) (empty or nonexistent target directory for unpacking)
cd into the target directory, and you should see a CoreDump file,
which you can use with gdb.

Thanks,
Clinton



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Thomas Munro
Дата:
On Thu, Apr 22, 2021 at 10:09 PM Herwig Goemans
<herwig.goemans@gmail.com> wrote:
> And I cannot execute anything in the attached (psql) process,

A couple of notes about the attach-then-crash approach:  It should be
"postgres", not "psql", that you attach to, and then you need to type
"cont" to tell the debugger to allow execution to continue, or it'll
just block as you described. Sometimes it will stop execution when
signals arrive, and you can disable that with eg "handle SIGUSR1
nostop" and then go back to executing with "cont" again.  Once you
make it crash, you'll be able to use "bt" to see exactly where, but as
mentioned it'll just be a bunch of big hex numbers unless the symbols
are installed; at the least it'll show which modules (postgres,
informix_fdw.so or such).



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Hi Support,


The problem is  in ifx_fdw.c but I do not have dbg packages installed 
for this nor do I know where to find these for ubuntu.

But is there no way you can simulate this issue ? There is a PG server 
crash due to bad user code in a trigger function.

In the mean time we found out that the user code in the trigger function 
was wrong: it has to work with old instead of new values:

      delete from informix.t_water_eenheid

                 where enh_id = old.enh_id;

                 return old;

instead of:


                 delete from informix.t_water_eenheid

                 where enh_id = new.enh_id;

                 return new;

using old it works fine.

But regardless, the PG server should not crash due to bad user code. I 
suppose you agree.

I only work 2 days a week for this client and this is taking up much of 
my time so I was hoping you could simulate this.



Kind regards,



Herwig



On 23/04/2021 00:45, Thomas Munro wrote:
> On Thu, Apr 22, 2021 at 10:09 PM Herwig Goemans
> <herwig.goemans@gmail.com> wrote:
>> And I cannot execute anything in the attached (psql) process,
> A couple of notes about the attach-then-crash approach:  It should be
> "postgres", not "psql", that you attach to, and then you need to type
> "cont" to tell the debugger to allow execution to continue, or it'll
> just block as you described. Sometimes it will stop execution when
> signals arrive, and you can disable that with eg "handle SIGUSR1
> nostop" and then go back to executing with "cont" again.  Once you
> make it crash, you'll be able to use "bt" to see exactly where, but as
> mentioned it'll just be a bunch of big hex numbers unless the symbols
> are installed; at the least it'll show which modules (postgres,
> informix_fdw.so or such).



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
Herwig Goemans <herwig.goemans@gmail.com> writes:
> The problem is in ifx_fdw.c but I do not have dbg packages installed 
> for this nor do I know where to find these for ubuntu.

Well, at this point the right thing to do is contact the developer of
the informix FDW.  The bug is most likely in that code rather than
the core server, and even if it isn't, we'd need their cooperation to
narrow it down.  This list is primarily for handling bugs in the core
server, so it's likely the FDW's author doesn't read it (or at least
hasn't noticed this thread).

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
What mailing list do I need to use to get in contact with informix fdw ?


regards,


Herwig

On 28/04/2021 16:10, Tom Lane wrote:
> Herwig Goemans <herwig.goemans@gmail.com> writes:
>> The problem is in ifx_fdw.c but I do not have dbg packages installed
>> for this nor do I know where to find these for ubuntu.
> Well, at this point the right thing to do is contact the developer of
> the informix FDW.  The bug is most likely in that code rather than
> the core server, and even if it isn't, we'd need their cooperation to
> narrow it down.  This list is primarily for handling bugs in the core
> server, so it's likely the FDW's author doesn't read it (or at least
> hasn't noticed this thread).
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Tom Lane
Дата:
Herwig Goemans <herwig.goemans@gmail.com> writes:
> What mailing list do I need to use to get in contact with informix fdw ?

Perhaps filing an issue in their github project would do the trick.

https://github.com/credativ/informix_fdw

            regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Herwig Goemans
Дата:
Hi Tom,


Thank you for your advice, I did as you asked and made a request (31) 8 
days ago but nobody seems to take this issue under his/her care. Is 
there anything that I can do to make someone look at it ?


Herwig

On 29/04/2021 18:05, Tom Lane wrote:
> Herwig Goemans <herwig.goemans@gmail.com> writes:
>> What mailing list do I need to use to get in contact with informix fdw ?
> Perhaps filing an issue in their github project would do the trick.
>
> https://github.com/credativ/informix_fdw
>
>             regards, tom lane



Re: BUG #16976: server crash when deleting via a trigger on a foreign table

От
Thomas Munro
Дата:
On Thu, May 20, 2021 at 10:57 PM Herwig Goemans
<herwig.goemans@gmail.com> wrote:
> Thank you for your advice, I did as you asked and made a request (31) 8
> days ago but nobody seems to take this issue under his/her care. Is
> there anything that I can do to make someone look at it ?

You could contact the developers, the well known PostgreSQL support
company that owns the repo.

Unfortunately it might be hard to get anyone else to help with it,
because it requires a bunch of non-free software to test, and even if
there are easy-to-install Docker images for that these days, it still
takes time to study how to configure it, and how to install the
proprietary client libraries etc, so you'd probably need someone who
already has such a system running and ready to connect to.  For
example, you :-)

If you can compile informix_fdw yourself (perhaps adding PG_CFLAGS=-g
to its Makefile for better debug information), and post a backtrace as
described earlier, we might be able to guess what's wrong and suggest
some changes you could test, or at least you might be able to add
enough information to the bug report to help it move faster.  Since I
personally have no idea how to drive Ubuntu's core stealing system, I
recommend these steps:

First, connect with psql and find the backend's pid:

postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
        3209646
(1 row)

In another shell windows, connect to that backend with gdb:

$ sudo gdb -p 3209646
...
(gdb)

Tell it to ignore SIGUSR1 (makes things simpler), and then continue:

(gdb) handle SIGUSR1 nostop noprint
Signal        Stop    Print    Pass to program    Description
SIGUSR1       No    No    Yes        User defined signal 1
(gdb) cont
Continuing.

Now back in psql, trigger the problem with your reproducer steps:
postgres=# DELETE ...;

At this point the debugger should show something like this:

Program received signal SIGSEGV, Segmentation fault.
0x00007f5cf5ce60d6 in epoll_wait (epfd=11, events=0x55b7f0fbc088,
maxevents=maxevents@entry=1, timeout=timeout@entry=60000)
    at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30    in ../sysdeps/unix/sysv/linux/epoll_wait.c
(gdb)

(I don't expect it to show epoll_wait, that's just because I manually
killed a sleeping backend with signal 11 for illustration, but for you
it hopefully shows some code that is trying to access a NULL pointer
or something like that).  That's where you enter "bt" and it prints
out a stack of function calls and line numbers that might help us to
guess what's going wrong.