Thread: Strange failure on mamba


Strange failure on mamba

From:
Thomas Munro
Date:
Hi,

I wonder why the walreceiver didn't start in
008_min_recovery_point_node_3.log here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-11-16%2023%3A13%3A38

There was the case of commit 8acd8f86, but that involved a deadlocked
postmaster whereas this one still handled a shutdown request.



Re: Strange failure on mamba

From:
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I wonder why the walreceiver didn't start in
> 008_min_recovery_point_node_3.log here:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-11-16%2023%3A13%3A38

mamba has been showing intermittent failures in various replication
tests since day one.  My guess is that it's slow enough to be
particularly subject to the signal-handler race conditions that we
know exist in walreceivers and elsewhere.  (Now, it wasn't any faster
in its previous incarnation as a macOS critter.  But maybe modern
NetBSD has different scheduler behavior than ancient macOS and that
contributes somehow.  Or maybe there's some other NetBSD weirdness
in here.)

I've tried to reproduce manually, without much success :-(

Like many of its other failures, there's a suggestive postmaster
log entry at the very end:

2022-11-16 19:45:53.851 EST [2036:4] LOG:  received immediate shutdown request
2022-11-16 19:45:58.873 EST [2036:5] LOG:  issuing SIGKILL to recalcitrant children
2022-11-16 19:45:58.881 EST [2036:6] LOG:  database system is shut down

So some postmaster child is stuck somewhere where it's not responding
to SIGQUIT.  While it's not unreasonable to guess that that's a
walreceiver, there's no hard evidence of it here.  I've been wondering
if it'd be worth patching the postmaster so that it's a bit more verbose
about which children it had to SIGKILL.  I've also wondered about
changing the SIGKILL to SIGABRT in hopes of reaping a core file that
could be investigated.

            regards, tom lane



Re: Strange failure on mamba

From:
Thomas Munro
Date:
On Fri, Nov 18, 2022 at 11:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > I wonder why the walreceiver didn't start in
> > 008_min_recovery_point_node_3.log here:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-11-16%2023%3A13%3A38
>
> mamba has been showing intermittent failures in various replication
> tests since day one.  My guess is that it's slow enough to be
> particularly subject to the signal-handler race conditions that we
> know exist in walreceivers and elsewhere.  (Now, it wasn't any faster
> in its previous incarnation as a macOS critter.  But maybe modern
> NetBSD has different scheduler behavior than ancient macOS and that
> contributes somehow.  Or maybe there's some other NetBSD weirdness
> in here.)
>
> I've tried to reproduce manually, without much success :-(
>
> Like many of its other failures, there's a suggestive postmaster
> log entry at the very end:
>
> 2022-11-16 19:45:53.851 EST [2036:4] LOG:  received immediate shutdown request
> 2022-11-16 19:45:58.873 EST [2036:5] LOG:  issuing SIGKILL to recalcitrant children
> 2022-11-16 19:45:58.881 EST [2036:6] LOG:  database system is shut down
>
> So some postmaster child is stuck somewhere where it's not responding
> to SIGQUIT.  While it's not unreasonable to guess that that's a
> walreceiver, there's no hard evidence of it here.  I've been wondering
> if it'd be worth patching the postmaster so that it's a bit more verbose
> about which children it had to SIGKILL.  I've also wondered about
> changing the SIGKILL to SIGABRT in hopes of reaping a core file that
> could be investigated.

I wonder if it's a runtime variant of the other problem.  We do
load_file("libpqwalreceiver", false) before unblocking signals but
maybe don't resolve the symbols until calling them, or something like
that...



Re: Strange failure on mamba

From:
Thomas Munro
Date:
On Fri, Nov 18, 2022 at 11:35 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> I wonder if it's a runtime variant of the other problem.  We do
> load_file("libpqwalreceiver", false) before unblocking signals but
> maybe don't resolve the symbols until calling them, or something like
> that...

Hmm, no, I take that back.  A key ingredient was that a symbol was
being resolved inside the signal handler, which is a postmaster-only
thing.



Re: Strange failure on mamba

From:
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Fri, Nov 18, 2022 at 11:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> mamba has been showing intermittent failures in various replication
>> tests since day one.

> I wonder if it's a runtime variant of the other problem.  We do
> load_file("libpqwalreceiver", false) before unblocking signals but
> maybe don't resolve the symbols until calling them, or something like
> that...

Yeah, that or some other NetBSD bug could be the explanation, too.
Without a stack trace it's hard to have any confidence about it,
but I've been unable to reproduce the problem outside the buildfarm.
(Which is a familiar refrain.  I wonder what it is about the buildfarm
environment that makes it act different from the exact same code running
on the exact same machine.)

So I'd like to have some way to make the postmaster send SIGABRT instead
of SIGKILL in the buildfarm environment.  The lowest-tech way would be
to drive that off some #define or other.  We could scale it up to a GUC
perhaps.  Adjacent to that, I also wonder whether SIGABRT wouldn't be
more useful than SIGSTOP for the existing SendStop half-a-feature ---
the idea that people should collect cores manually seems mighty
last-century.

            regards, tom lane



Re: Strange failure on mamba

From:
Andres Freund
Date:
Hi,

On 2022-11-17 17:47:50 -0500, Tom Lane wrote:
> Yeah, that or some other NetBSD bug could be the explanation, too.
> Without a stack trace it's hard to have any confidence about it,
> but I've been unable to reproduce the problem outside the buildfarm.
> (Which is a familiar refrain.  I wonder what it is about the buildfarm
> environment that makes it act different from the exact same code running
> on the exact same machine.)
> 
> So I'd like to have some way to make the postmaster send SIGABRT instead
> of SIGKILL in the buildfarm environment.  The lowest-tech way would be
> to drive that off some #define or other.  We could scale it up to a GUC
> perhaps.  Adjacent to that, I also wonder whether SIGABRT wouldn't be
> more useful than SIGSTOP for the existing SendStop half-a-feature ---
> the idea that people should collect cores manually seems mighty
> last-century.

I suspect that having a GUC would be a good idea. I needed something similar
recently, debugging an occasional hang in the AIO patchset. I first tried
something like your #define approach and it did cause a problematic flood of
core files.

I ended up using libbacktrace to generate useful backtraces (vs what
backtrace_symbols() generates) when receiving SIGQUIT. I didn't do the legwork
to make it properly signal safe, but it'd be doable afaiu.

Greetings,

Andres Freund



Sending SIGABRT to child processes (was Re: Strange failure on mamba)

From:
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-17 17:47:50 -0500, Tom Lane wrote:
>> So I'd like to have some way to make the postmaster send SIGABRT instead
>> of SIGKILL in the buildfarm environment.  The lowest-tech way would be
>> to drive that off some #define or other.  We could scale it up to a GUC
>> perhaps.  Adjacent to that, I also wonder whether SIGABRT wouldn't be
>> more useful than SIGSTOP for the existing SendStop half-a-feature ---
>> the idea that people should collect cores manually seems mighty
>> last-century.

> I suspect that having a GUC would be a good idea. I needed something similar
> recently, debugging an occasional hang in the AIO patchset. I first tried
> something like your #define approach and it did cause a problematic flood of
> core files.

Yeah, the main downside of such a thing is the risk of lots of core files
accumulating over repeated crashes.  Nonetheless, I think it'll be a
useful debugging aid.  Here's a proposed patch.  (I took the opportunity
to kill off the long-since-unimplemented Reinit switch, too.)

One thing I'm not too clear on is if we want to send SIGABRT to the child
groups (ie, SIGABRT grandchild processes too).  I made signal_child do
so here, but perhaps it's overkill.

            regards, tom lane

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bd50ea8e48..24b1624bad 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11500,6 +11500,62 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
       </listitem>
      </varlistentry>

+     <varlistentry id="guc-send-abort-for-crash" xreflabel="send_abort_for_crash">
+      <term><varname>send_abort_for_crash</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>send_abort_for_crash</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        By default, after a backend crash the postmaster will stop remaining
+        child processes by sending them <systemitem>SIGQUIT</systemitem>
+        signals, which permits them to exit more-or-less gracefully.  When
+        this option is set to <literal>on</literal>,
+        <systemitem>SIGABRT</systemitem> is sent instead.  That normally
+        results in production of a core dump file for each such child
+        process.
+        This can be handy for investigating the states of other processes
+        after a crash.  It can also consume lots of disk space in the event
+        of repeated crashes, so do not enable this on systems you are not
+        monitoring carefully.
+        Beware that no support exists for cleaning up the core file(s)
+        automatically.
+        This parameter can only be set in
+        the <filename>postgresql.conf</filename> file or on the server
+        command line.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry id="guc-send-abort-for-kill" xreflabel="send_abort_for_kill">
+      <term><varname>send_abort_for_kill</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>send_abort_for_kill</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        By default, after attempting to stop a child process with
+        <systemitem>SIGQUIT</systemitem>, the postmaster will wait five
+        seconds and then send <systemitem>SIGKILL</systemitem> to force
+        immediate termination.  When this option is set
+        to <literal>on</literal>, <systemitem>SIGABRT</systemitem> is sent
+        instead of <systemitem>SIGKILL</systemitem>.  That normally results
+        in production of a core dump file for each such child process.
+        This can be handy for investigating the states
+        of <quote>stuck</quote> child processes.  It can also consume lots
+        of disk space in the event of repeated crashes, so do not enable
+        this on systems you are not monitoring carefully.
+        Beware that no support exists for cleaning up the core file(s)
+        automatically.
+        This parameter can only be set in
+        the <filename>postgresql.conf</filename> file or on the server
+        command line.
+       </para>
+      </listitem>
+     </varlistentry>
+
     </variablelist>
   </sect1>
   <sect1 id="runtime-config-short">
diff --git a/doc/src/sgml/ref/postgres-ref.sgml b/doc/src/sgml/ref/postgres-ref.sgml
index 55a3f6c69d..b13a16a117 100644
--- a/doc/src/sgml/ref/postgres-ref.sgml
+++ b/doc/src/sgml/ref/postgres-ref.sgml
@@ -409,24 +409,6 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>

-     <varlistentry>
-      <term><option>-n</option></term>
-      <listitem>
-       <para>
-        This option is for debugging problems that cause a server
-        process to die abnormally.  The ordinary strategy in this
-        situation is to notify all other server processes that they
-        must terminate and then reinitialize the shared memory and
-        semaphores.  This is because an errant server process could
-        have corrupted some shared state before terminating.  This
-        option specifies that <command>postgres</command> will
-        not reinitialize shared data structures.  A knowledgeable
-        system programmer can then use a debugger to examine shared
-        memory and semaphore state.
-       </para>
-     </listitem>
-    </varlistentry>
-
      <varlistentry>
       <term><option>-O</option></term>
       <listitem>
@@ -466,14 +448,9 @@ PostgreSQL documentation
         This option is for debugging problems that cause a server
         process to die abnormally.  The ordinary strategy in this
         situation is to notify all other server processes that they
-        must terminate and then reinitialize the shared memory and
-        semaphores.  This is because an errant server process could
-        have corrupted some shared state before terminating.  This
-        option specifies that <command>postgres</command> will
-        stop all other server processes by sending the signal
-        <literal>SIGSTOP</literal>, but will not cause them to
-        terminate.  This permits system programmers to collect core
-        dumps from all server processes by hand.
+        must terminate, by sending them <systemitem>SIGQUIT</systemitem>
+        signals.  With this option, <systemitem>SIGABRT</systemitem>
+        will be sent instead, resulting in production of core dump files.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index f5da4260a1..67f556b4dd 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -352,11 +352,10 @@ help(const char *progname)

     printf(_("\nDeveloper options:\n"));
     printf(_("  -f s|i|o|b|t|n|m|h forbid use of some plan types\n"));
-    printf(_("  -n                 do not reinitialize shared memory after abnormal exit\n"));
     printf(_("  -O                 allow system table structure changes\n"));
     printf(_("  -P                 disable system indexes\n"));
     printf(_("  -t pa|pl|ex        show timings after each query\n"));
-    printf(_("  -T                 send SIGSTOP to all backend processes if one dies\n"));
+    printf(_("  -T                 send SIGABRT to all backend processes if one dies\n"));
     printf(_("  -W NUM             wait NUM seconds to allow attach from a debugger\n"));

     printf(_("\nOptions for single-user mode:\n"));
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 1da5752047..c83cc8cc6c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -219,16 +219,6 @@ int            ReservedBackends;
 #define MAXLISTEN    64
 static pgsocket ListenSocket[MAXLISTEN];

-/*
- * These globals control the behavior of the postmaster in case some
- * backend dumps core.  Normally, it kills all peers of the dead backend
- * and reinitializes shared memory.  By specifying -s or -n, we can have
- * the postmaster stop (rather than kill) peers and not reinitialize
- * shared data structures.  (Reinit is currently dead code, though.)
- */
-static bool Reinit = true;
-static int    SendStop = false;
-
 /* still more option variables */
 bool        EnableSSL = false;

@@ -243,6 +233,8 @@ bool        enable_bonjour = false;
 char       *bonjour_name;
 bool        restart_after_crash = true;
 bool        remove_temp_files_after_crash = true;
+bool        send_abort_for_crash = false;
+bool        send_abort_for_kill = false;

 /* PIDs of special child processes; 0 when not running */
 static pid_t StartupPID = 0,
@@ -414,6 +406,7 @@ static void report_fork_failure_to_client(Port *port, int errnum);
 static CAC_state canAcceptConnections(int backend_type);
 static bool RandomCancelKey(int32 *cancel_key);
 static void signal_child(pid_t pid, int signal);
+static void sigquit_child(pid_t pid);
 static bool SignalSomeChildren(int signal, int target);
 static void TerminateChildren(int signal);

@@ -697,7 +690,7 @@ PostmasterMain(int argc, char *argv[])
      * tcop/postgres.c (the option sets should not conflict) and with the
      * common help() function in main/main.c.
      */
-    while ((opt = getopt(argc, argv, "B:bc:C:D:d:EeFf:h:ijk:lN:nOPp:r:S:sTt:W:-:")) != -1)
+    while ((opt = getopt(argc, argv, "B:bc:C:D:d:EeFf:h:ijk:lN:OPp:r:S:sTt:W:-:")) != -1)
     {
         switch (opt)
         {
@@ -767,11 +760,6 @@ PostmasterMain(int argc, char *argv[])
                 SetConfigOption("max_connections", optarg, PGC_POSTMASTER, PGC_S_ARGV);
                 break;

-            case 'n':
-                /* Don't reinit shared mem after abnormal exit */
-                Reinit = false;
-                break;
-
             case 'O':
                 SetConfigOption("allow_system_table_mods", "true", PGC_POSTMASTER, PGC_S_ARGV);
                 break;
@@ -799,11 +787,10 @@ PostmasterMain(int argc, char *argv[])
             case 'T':

                 /*
-                 * In the event that some backend dumps core, send SIGSTOP,
-                 * rather than SIGQUIT, to all its peers.  This lets the wily
-                 * post_hacker collect core dumps from everyone.
+                 * This option used to be defined as sending SIGSTOP after a
+                 * backend crash, but sending SIGABRT seems more useful.
                  */
-                SendStop = true;
+                SetConfigOption("send_abort_for_crash", "true", PGC_POSTMASTER, PGC_S_ARGV);
                 break;

             case 't':
@@ -1896,20 +1883,23 @@ ServerLoop(void)

         /*
          * If we already sent SIGQUIT to children and they are slow to shut
-         * down, it's time to send them SIGKILL.  This doesn't happen
-         * normally, but under certain conditions backends can get stuck while
-         * shutting down.  This is a last measure to get them unwedged.
+         * down, it's time to send them SIGKILL (or SIGABRT if requested).
+         * This doesn't happen normally, but under certain conditions backends
+         * can get stuck while shutting down.  This is a last measure to get
+         * them unwedged.
          *
          * Note we also do this during recovery from a process crash.
          */
-        if ((Shutdown >= ImmediateShutdown || (FatalError && !SendStop)) &&
+        if ((Shutdown >= ImmediateShutdown || FatalError) &&
             AbortStartTime != 0 &&
             (now - AbortStartTime) >= SIGKILL_CHILDREN_AFTER_SECS)
         {
             /* We were gentle with them before. Not anymore */
             ereport(LOG,
-                    (errmsg("issuing SIGKILL to recalcitrant children")));
-            TerminateChildren(SIGKILL);
+            /* translator: %s is SIGKILL or SIGABRT */
+                    (errmsg("issuing %s to recalcitrant children",
+                            send_abort_for_kill ? "SIGABRT" : "SIGKILL")));
+            TerminateChildren(send_abort_for_kill ? SIGABRT : SIGKILL);
             /* reset flag so we don't SIGKILL again */
             AbortStartTime = 0;
         }
@@ -2903,6 +2893,7 @@ pmdie(SIGNAL_ARGS)
 #endif

             /* tell children to shut down ASAP */
+            /* (note we don't apply send_abort_for_crash here) */
             SetQuitSignalReason(PMQUIT_FOR_STOP);
             TerminateChildren(SIGQUIT);
             pmState = PM_WAIT_BACKENDS;
@@ -3470,20 +3461,9 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
             /*
              * This worker is still alive.  Unless we did so already, tell it
              * to commit hara-kiri.
-             *
-             * SIGQUIT is the special signal that says exit without proc_exit
-             * and let the user know what's going on. But if SendStop is set
-             * (-T on command line), then we send SIGSTOP instead, so that we
-             * can get core dumps from all backends by hand.
              */
             if (take_action)
-            {
-                ereport(DEBUG2,
-                        (errmsg_internal("sending %s to process %d",
-                                         (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                         (int) rw->rw_pid)));
-                signal_child(rw->rw_pid, (SendStop ? SIGSTOP : SIGQUIT));
-            }
+                sigquit_child(rw->rw_pid);
         }
     }

@@ -3514,13 +3494,8 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
              * This backend is still alive.  Unless we did so already, tell it
              * to commit hara-kiri.
              *
-             * SIGQUIT is the special signal that says exit without proc_exit
-             * and let the user know what's going on. But if SendStop is set
-             * (-T on command line), then we send SIGSTOP instead, so that we
-             * can get core dumps from all backends by hand.
-             *
-             * We could exclude dead_end children here, but at least in the
-             * SIGSTOP case it seems better to include them.
+             * We could exclude dead_end children here, but at least when
+             * sending SIGABRT it seems better to include them.
              *
              * Background workers were already processed above; ignore them
              * here.
@@ -3529,13 +3504,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
                 continue;

             if (take_action)
-            {
-                ereport(DEBUG2,
-                        (errmsg_internal("sending %s to process %d",
-                                         (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                         (int) bp->pid)));
-                signal_child(bp->pid, (SendStop ? SIGSTOP : SIGQUIT));
-            }
+                sigquit_child(bp->pid);
         }
     }

@@ -3547,11 +3516,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
     }
     else if (StartupPID != 0 && take_action)
     {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) StartupPID)));
-        signal_child(StartupPID, (SendStop ? SIGSTOP : SIGQUIT));
+        sigquit_child(StartupPID);
         StartupStatus = STARTUP_SIGNALED;
     }

@@ -3559,73 +3524,37 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
     if (pid == BgWriterPID)
         BgWriterPID = 0;
     else if (BgWriterPID != 0 && take_action)
-    {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) BgWriterPID)));
-        signal_child(BgWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
-    }
+        sigquit_child(BgWriterPID);

     /* Take care of the checkpointer too */
     if (pid == CheckpointerPID)
         CheckpointerPID = 0;
     else if (CheckpointerPID != 0 && take_action)
-    {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) CheckpointerPID)));
-        signal_child(CheckpointerPID, (SendStop ? SIGSTOP : SIGQUIT));
-    }
+        sigquit_child(CheckpointerPID);

     /* Take care of the walwriter too */
     if (pid == WalWriterPID)
         WalWriterPID = 0;
     else if (WalWriterPID != 0 && take_action)
-    {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) WalWriterPID)));
-        signal_child(WalWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
-    }
+        sigquit_child(WalWriterPID);

     /* Take care of the walreceiver too */
     if (pid == WalReceiverPID)
         WalReceiverPID = 0;
     else if (WalReceiverPID != 0 && take_action)
-    {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) WalReceiverPID)));
-        signal_child(WalReceiverPID, (SendStop ? SIGSTOP : SIGQUIT));
-    }
+        sigquit_child(WalReceiverPID);

     /* Take care of the autovacuum launcher too */
     if (pid == AutoVacPID)
         AutoVacPID = 0;
     else if (AutoVacPID != 0 && take_action)
-    {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) AutoVacPID)));
-        signal_child(AutoVacPID, (SendStop ? SIGSTOP : SIGQUIT));
-    }
+        sigquit_child(AutoVacPID);

     /* Take care of the archiver too */
     if (pid == PgArchPID)
         PgArchPID = 0;
     else if (PgArchPID != 0 && take_action)
-    {
-        ereport(DEBUG2,
-                (errmsg_internal("sending %s to process %d",
-                                 (SendStop ? "SIGSTOP" : "SIGQUIT"),
-                                 (int) PgArchPID)));
-        signal_child(PgArchPID, (SendStop ? SIGSTOP : SIGQUIT));
-    }
+        sigquit_child(PgArchPID);

     /* We do NOT restart the syslogger */

@@ -3834,6 +3763,10 @@ PostmasterStateMachine(void)
                      * Any required cleanup will happen at next restart. We
                      * set FatalError so that an "abnormal shutdown" message
                      * gets logged when we exit.
+                     *
+                     * We don't consult send_abort_for_crash here, as it's
+                     * unlikely that dumping cores would illuminate the reason
+                     * for checkpointer fork failure.
                      */
                     FatalError = true;
                     pmState = PM_WAIT_DEAD_END;
@@ -4003,8 +3936,8 @@ signal_child(pid_t pid, int signal)
         case SIGINT:
         case SIGTERM:
         case SIGQUIT:
-        case SIGSTOP:
         case SIGKILL:
+        case SIGABRT:
             if (kill(-pid, signal) < 0)
                 elog(DEBUG3, "kill(%ld,%d) failed: %m", (long) (-pid), signal);
             break;
@@ -4014,6 +3947,24 @@ signal_child(pid_t pid, int signal)
 #endif
 }

+/*
+ * Convenience function for killing a child process after a crash of some
+ * other child process.  We log the action at a higher level than we would
+ * otherwise do, and we apply send_abort_for_crash to decide which signal
+ * to send.  Normally it's SIGQUIT -- and most other comments in this file
+ * are written on the assumption that it is -- but developers might prefer
+ * to use SIGABRT to collect per-child core dumps.
+ */
+static void
+sigquit_child(pid_t pid)
+{
+    ereport(DEBUG2,
+            (errmsg_internal("sending %s to process %d",
+                             (send_abort_for_crash ? "SIGABRT" : "SIGQUIT"),
+                             (int) pid)));
+    signal_child(pid, (send_abort_for_crash ? SIGABRT : SIGQUIT));
+}
+
 /*
  * Send a signal to the targeted children (but NOT special children;
  * dead_end children are never signaled, either).
@@ -4069,7 +4020,7 @@ TerminateChildren(int signal)
     if (StartupPID != 0)
     {
         signal_child(StartupPID, signal);
-        if (signal == SIGQUIT || signal == SIGKILL)
+        if (signal == SIGQUIT || signal == SIGKILL || signal == SIGABRT)
             StartupStatus = STARTUP_SIGNALED;
     }
     if (BgWriterPID != 0)
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 836b49484a..349dd6a537 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1227,6 +1227,26 @@ struct config_bool ConfigureNamesBool[] =
         true,
         NULL, NULL, NULL
     },
+    {
+        {"send_abort_for_crash", PGC_SIGHUP, DEVELOPER_OPTIONS,
+            gettext_noop("Send SIGABRT not SIGQUIT to child processes after backend crash."),
+            NULL,
+            GUC_NOT_IN_SAMPLE
+        },
+        &send_abort_for_crash,
+        false,
+        NULL, NULL, NULL
+    },
+    {
+        {"send_abort_for_kill", PGC_SIGHUP, DEVELOPER_OPTIONS,
+            gettext_noop("Send SIGABRT not SIGKILL to stuck child processes."),
+            NULL,
+            GUC_NOT_IN_SAMPLE
+        },
+        &send_abort_for_kill,
+        false,
+        NULL, NULL, NULL
+    },

     {
         {"log_duration", PGC_SUSET, LOGGING_WHAT,
diff --git a/src/include/postmaster/postmaster.h b/src/include/postmaster/postmaster.h
index 90e333ccd2..f078321df2 100644
--- a/src/include/postmaster/postmaster.h
+++ b/src/include/postmaster/postmaster.h
@@ -30,6 +30,8 @@ extern PGDLLIMPORT bool enable_bonjour;
 extern PGDLLIMPORT char *bonjour_name;
 extern PGDLLIMPORT bool restart_after_crash;
 extern PGDLLIMPORT bool remove_temp_files_after_crash;
+extern PGDLLIMPORT bool send_abort_for_crash;
+extern PGDLLIMPORT bool send_abort_for_kill;

 #ifdef WIN32
 extern PGDLLIMPORT HANDLE PostmasterHandle;

Re: Sending SIGABRT to child processes (was Re: Strange failure on mamba)

From:
Tom Lane
Date:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> I suspect that having a GUC would be a good idea. I needed something similar
>> recently, debugging an occasional hang in the AIO patchset. I first tried
>> something like your #define approach and it did cause a problematic flood of
>> core files.

> Yeah, the main downside of such a thing is the risk of lots of core files
> accumulating over repeated crashes.  Nonetheless, I think it'll be a
> useful debugging aid.  Here's a proposed patch.  (I took the opportunity
> to kill off the long-since-unimplemented Reinit switch, too.)

Hearing no complaints, I've pushed this and reconfigured mamba to use
send_abort_for_kill.  Once I've got a core file or two to look at,
I'll try to figure out what's going on there.

> One thing I'm not too clear on is if we want to send SIGABRT to the child
> groups (ie, SIGABRT grandchild processes too).  I made signal_child do
> so here, but perhaps it's overkill.

After further thought, we do have to SIGABRT the grandchildren too,
or they won't shut down promptly.  I think there might be a small
risk of some programs trapping SIGABRT and doing something other than
what we want; but since this is only a debug aid that's probably
tolerable.

            regards, tom lane



Re: Strange failure on mamba

From:
Tom Lane
Date:
I wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
>> On Fri, Nov 18, 2022 at 11:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> mamba has been showing intermittent failures in various replication
>>> tests since day one.

>> I wonder if it's a runtime variant of the other problem.  We do
>> load_file("libpqwalreceiver", false) before unblocking signals but
>> maybe don't resolve the symbols until calling them, or something like
>> that...

> Yeah, that or some other NetBSD bug could be the explanation, too.
> Without a stack trace it's hard to have any confidence about it,
> but I've been unable to reproduce the problem outside the buildfarm.

Thanks to commit 51b5834cd I've now been able to capture some info
from mamba's last couple of failures [1][2].  Sure enough, what is
happening is that postmaster children are getting stuck in recursive
rtld symbol resolution.  A couple of the stack traces I collected are

#0  0xfdeede4c in ___lwp_park60 () from /usr/libexec/ld.elf_so
#1  0xfdee3e08 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
#2  0xfdee59e4 in dlopen () from /usr/libexec/ld.elf_so
#3  0x01e54ed0 in internal_load_library (
    libname=libname@entry=0xfd74cc88
"/home/buildfarm/bf-data/HEAD/pgsql.build/tmp_install/home/buildfarm/bf-data/HEAD/inst/lib/postgresql/libpqwalreceiver.so")
    at dfmgr.c:239
#4  0x01e55c78 in load_file (filename=<optimized out>, restricted=<optimized out>) at dfmgr.c:156
#5  0x01c5ba24 in WalReceiverMain () at walreceiver.c:292
#6  0x01c090f8 in AuxiliaryProcessMain (auxtype=auxtype@entry=WalReceiverProcess) at auxprocess.c:161
#7  0x01c10970 in StartChildProcess (type=WalReceiverProcess) at postmaster.c:5310
#8  0x01c123ac in MaybeStartWalReceiver () at postmaster.c:5475
#9  MaybeStartWalReceiver () at postmaster.c:5468
#10 sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5131
#11 <signal handler called>
#12 0xfdee6b44 in _rtld_symlook_obj () from /usr/libexec/ld.elf_so
#13 0xfdee6fc0 in _rtld_symlook_list () from /usr/libexec/ld.elf_so
#14 0xfdee7644 in _rtld_symlook_default () from /usr/libexec/ld.elf_so
#15 0xfdee795c in _rtld_find_symdef () from /usr/libexec/ld.elf_so
#16 0xfdee7ad0 in _rtld_find_plt_symdef () from /usr/libexec/ld.elf_so
#17 0xfdee1918 in _rtld_bind () from /usr/libexec/ld.elf_so
#18 0xfdee1dc0 in _rtld_bind_secureplt_start () from /usr/libexec/ld.elf_so
Backtrace stopped: frame did not save the PC

#0  0xfdeede4c in ___lwp_park60 () from /usr/libexec/ld.elf_so
#1  0xfdee3e08 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
#2  0xfdee4ba4 in _rtld_exit () from /usr/libexec/ld.elf_so
#3  0xfd54ea74 in __cxa_finalize () from /usr/lib/libc.so.12
#4  0xfd54e354 in exit () from /usr/lib/libc.so.12
#5  0x01c963c0 in proc_exit (code=code@entry=0) at ipc.c:152
#6  0x01c056e4 in AutoVacLauncherShutdown () at autovacuum.c:853
#7  0x01c071dc in AutoVacLauncherMain (argv=0x0, argc=0) at autovacuum.c:800
#8  0x01c07694 in StartAutoVacLauncher () at autovacuum.c:416
#9  0x01c11d3c in reaper (postgres_signal_arg=<optimized out>) at postmaster.c:3038
#10 <signal handler called>
#11 0xfdee6f64 in _rtld_symlook_list () from /usr/libexec/ld.elf_so
#12 0xfdee7644 in _rtld_symlook_default () from /usr/libexec/ld.elf_so
#13 0xfdee795c in _rtld_find_symdef () from /usr/libexec/ld.elf_so
#14 0xfdee7ad0 in _rtld_find_plt_symdef () from /usr/libexec/ld.elf_so
#15 0xfdee1918 in _rtld_bind () from /usr/libexec/ld.elf_so
#16 0xfdee1dc0 in _rtld_bind_secureplt_start () from /usr/libexec/ld.elf_so
Backtrace stopped: frame did not save the PC

which is pretty much just the same thing we were seeing before
commit 8acd8f869 :-(

Now, we certainly cannot think that these are occurring early in
postmaster startup.  In the wake of 8acd8f869, we should expect
that there's no further need to call rtld_bind at all in the
postmaster, but seemingly that's not so.  It's very frustrating
that the backtrace stops where it does :-(.  It's also strange
that we're apparently running with signals enabled wherever
it is that rtld_bind is getting called from.  Could it be that
sigaction is failing to install the requested signal mask, so
that one postmaster signal handler is interrupting another?

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-11-24%2021%3A45%3A29
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2022-11-29%2020%3A50%3A36



Re: Strange failure on mamba

From
Thomas Munro
Date:
On Wed, Nov 30, 2022 at 2:44 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now, we certainly cannot think that these are occurring early in
> postmaster startup.  In the wake of 8acd8f869, we should expect
> that there's no further need to call rtld_bind at all in the
> postmaster, but seemingly that's not so.  It's very frustrating
> that the backtrace stops where it does :-(.  It's also strange
> that we're apparently running with signals enabled wherever
> it is that rtld_bind is getting called from.  Could it be that
> sigaction is failing to install the requested signal mask, so
> that one postmaster signal handler is interrupting another?

Add in some code that does sigaction(0, NULL, &mask) to read the
current mask and assert that it's blocked as expected in the handlers?
Start the postmaster in gdb with a break on _rtld_bind to find all the
places that reach it (unexpectedly)?



Re: Strange failure on mamba

From
Thomas Munro
Date:
On Wed, Nov 30, 2022 at 3:43 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> sigaction(0, NULL, &mask)

s/sigaction/sigprocmask/



Re: Strange failure on mamba

From
Andres Freund
Date:
Hi,

On 2022-11-29 20:44:34 -0500, Tom Lane wrote:
> Thanks to commit 51b5834cd I've now been able to capture some info
> from mamba's last couple of failures [1][2].  Sure enough, what is
> happening is that postmaster children are getting stuck in recursive
> rtld symbol resolution.  A couple of the stack traces I collected are
> 
> #0  0xfdeede4c in ___lwp_park60 () from /usr/libexec/ld.elf_so
> #1  0xfdee3e08 in _rtld_exclusive_enter () from /usr/libexec/ld.elf_so
> #2  0xfdee59e4 in dlopen () from /usr/libexec/ld.elf_so
> #3  0x01e54ed0 in internal_load_library (
>     libname=libname@entry=0xfd74cc88
"/home/buildfarm/bf-data/HEAD/pgsql.build/tmp_install/home/buildfarm/bf-data/HEAD/inst/lib/postgresql/libpqwalreceiver.so")
    at dfmgr.c:239
 
> #4  0x01e55c78 in load_file (filename=<optimized out>, restricted=<optimized out>) at dfmgr.c:156
> #5  0x01c5ba24 in WalReceiverMain () at walreceiver.c:292
> #6  0x01c090f8 in AuxiliaryProcessMain (auxtype=auxtype@entry=WalReceiverProcess) at auxprocess.c:161
> #7  0x01c10970 in StartChildProcess (type=WalReceiverProcess) at postmaster.c:5310
> #8  0x01c123ac in MaybeStartWalReceiver () at postmaster.c:5475
> #9  MaybeStartWalReceiver () at postmaster.c:5468
> #10 sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5131
> #11 <signal handler called>
> #12 0xfdee6b44 in _rtld_symlook_obj () from /usr/libexec/ld.elf_so
> #13 0xfdee6fc0 in _rtld_symlook_list () from /usr/libexec/ld.elf_so
> #14 0xfdee7644 in _rtld_symlook_default () from /usr/libexec/ld.elf_so
> #15 0xfdee795c in _rtld_find_symdef () from /usr/libexec/ld.elf_so
> #16 0xfdee7ad0 in _rtld_find_plt_symdef () from /usr/libexec/ld.elf_so
> #17 0xfdee1918 in _rtld_bind () from /usr/libexec/ld.elf_so
> #18 0xfdee1dc0 in _rtld_bind_secureplt_start () from /usr/libexec/ld.elf_so
> Backtrace stopped: frame did not save the PC

Do you have any idea why the stack can't be unwound further here? Is it
possibly indicative of a corrupted stack? I guess we'd need to dig into
the netbsd libc code :(


> which is pretty much just the same thing we were seeing before
> commit 8acd8f869 :-(

What libraries is postgres linked against? I don't know whether -z now only
affects the "top-level" dependencies of postgres, or also the dependencies of
shared libraries that haven't been built with -z now.  The only dependencies
that I could see being relevant are libintl and openssl.

You could try if anything changes if you set LD_BIND_NOW, that should trigger
"recursive" dependencies to be loaded eagerly as well.

Greetings,

Andres Freund



Re: Strange failure on mamba

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-29 20:44:34 -0500, Tom Lane wrote:
>> Backtrace stopped: frame did not save the PC

> Do you have any idea why the stack can't be unwound further here? Is it
> possibly indicative of a corrupted stack? I guess we'd need to dig into
> the netbsd libc code :(

I did do some digging in that area previously when we were seeing this
on HPPA, and determined that the assembly code in that area was not
bothering to establish a standard stack frame, for no very obvious
reason :-(.  I haven't studied their equivalent PPC code, but apparently
it's equally cavalier.  I recall trying to hack the HPPA code to make
it set up the stack frame correctly, without success, but I didn't
try very hard.  Maybe I'll have a go at that on the PPC side.

> What libraries is postgres linked against? I don't know whether -z now only
> affects the "top-level" dependencies of postgres, or also the dependencies of
> shared libraries that haven't been built with -z now.  The only dependencies
> that I could see being relevant are libintl and openssl.

Hmm.  mamba is using both --enable-nls and --with-openssl, but
I can't see a reason why the postmaster would be interacting with
OpenSSL post-startup in test cases that don't use SSL.  Perhaps
libintl is doing something it shouldn't?

> You could try if anything changes if you set LD_BIND_NOW, that should trigger
> "recursive" dependencies to be loaded eagerly as well.

Googling LD_BIND_NOW suggests that that's a Linux thing; do you know that
it should have an effect on NetBSD?

            regards, tom lane



Re: Strange failure on mamba

From
Andres Freund
Date:
Hi,

On 2022-11-29 20:44:34 -0500, Tom Lane wrote:
> It's also strange that we're apparently running with signals enabled
> wherever it is that rtld_bind is getting called from. Could it be that
> sigaction is failing to install the requested signal mask, so that one
> postmaster signal handler is interrupting another?

This made me look at pqsignal_pm() / pqsignal() and realize that we wouldn't
even notice if it failed, because they just return SIG_ERR and callers don't
check. I don't think that's a likely to be related, but theoretically it could
lead to some odd situations.

Greetings,

Andres Freund



Re: Strange failure on mamba

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-29 20:44:34 -0500, Tom Lane wrote:
>> It's also strange that we're apparently running with signals enabled
>> wherever it is that rtld_bind is getting called from. Could it be that
>> sigaction is failing to install the requested signal mask, so that one
>> postmaster signal handler is interrupting another?

> This made me look at pqsignal_pm() / pqsignal() and realize that we wouldn't
> even notice if it failed, because they just return SIG_ERR and callers don't
> check. I don't think that's likely to be related, but theoretically it could
> lead to some odd situations.

Yeah, I noticed that just now too.  But if sigaction() failed,
the signal handler wouldn't get installed at all, which'd lead
to different and more-obvious symptoms.  So I doubt that that's
what happened.

            regards, tom lane



Re: Strange failure on mamba

From
Andres Freund
Date:
Hi,

On 2022-11-30 00:55:42 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > What libraries is postgres linked against? I don't know whether -z now only
> > affects the "top-level" dependencies of postgres, or also the dependencies of
> > shared libraries that haven't been built with -z now.  The only dependencies
> > that I could see being relevant are libintl and openssl.
> 
> Hmm.  mamba is using both --enable-nls and --with-openssl, but
> I can't see a reason why the postmaster would be interacting with
> OpenSSL post-startup in test cases that don't use SSL.  Perhaps
> libintl is doing something it shouldn't?

We do call into openssl in postmaster, via RandomCancelKey(). But we should
have signals masked at that point, so it shouldn't matter.


> > You could try if anything changes if you set LD_BIND_NOW, that should trigger
> > "recursive" dependencies to be loaded eagerly as well.
> 
> Googling LD_BIND_NOW suggests that that's a Linux thing; do you know that
> it should have an effect on NetBSD?

I'm not at all sure it does, but I did see it listed in
https://man.netbsd.org/ld.elf_so.1

     LD_BIND_NOW      If defined immediate binding of Procedure Link Table
                      (PLT) entries is performed instead of the default lazy
                      method.

so I assumed it would do the same as on linux.

Greetings,

Andres Freund



Re: Strange failure on mamba

From
Andres Freund
Date:
Hi,

On 2022-11-29 22:31:50 -0800, Andres Freund wrote:
> On 2022-11-30 00:55:42 -0500, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > What libraries is postgres linked against? I don't know whether -z now only
> > > affects the "top-level" dependencies of postgres, or also the dependencies of
> > > shared libraries that haven't been built with -z now.  The only dependencies
> > > that I could see being relevant are libintl and openssl.
> > 
> > Hmm.  mamba is using both --enable-nls and --with-openssl, but
> > I can't see a reason why the postmaster would be interacting with
> > OpenSSL post-startup in test cases that don't use SSL.  Perhaps
> > libintl is doing something it shouldn't?
> 
> We do call into openssl in postmaster, via RandomCancelKey(). But we should
> have signals masked at that point, so it shouldn't matter.

Openssl does some muckery with signal masks on ppc (and a few others archs,
but not x86), but I don't immediately see it conflicting with our code:

https://github.com/openssl/openssl/blob/master/crypto/ppccap.c#L275

It should also already have been executed by the time we accept connections,
due to the __attribute__ ((constructor)).


I didn't check where netbsd gets libcrypto and whether it does something
different than upstream openssl...

Greetings,

Andres Freund



Re: Strange failure on mamba

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-30 00:55:42 -0500, Tom Lane wrote:
>> Googling LD_BIND_NOW suggests that that's a Linux thing; do you know that
>> it should have an effect on NetBSD?

> I'm not at all sure it does, but I did see it listed in
> https://man.netbsd.org/ld.elf_so.1
>      LD_BIND_NOW      If defined immediate binding of Procedure Link Table
>                       (PLT) entries is performed instead of the default lazy
>                       method.

I checked the source code, and learned that (1) yes, rtld does pay
attention to this, and (2) the documentation lies: it has to be not
only defined, but nonempty, to get any effect.
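In other words, the effective test in rtld amounts to something like this (a sketch of the behavior, not NetBSD's actual code):

```c
#include <stdlib.h>
#include <stdbool.h>

/*
 * Sketch of the behavior described above: LD_BIND_NOW takes effect only
 * when the variable is both set and nonempty, despite what the man page
 * says.
 */
static bool
ld_bind_now_effective(void)
{
	const char *v = getenv("LD_BIND_NOW");

	return v != NULL && v[0] != '\0';
}
```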

Also, I dug into my stuck processes some more, and I have to take
back the claim that this is happening later than postmaster startup.
All the stuck children are ones that either are launched on request
from the startup process, or are launched as soon as we get the
termination report for the startup process.  So it's plausible that
the problem is happening during the postmaster's first select()
wait.  I then got dirty with the assembly code, and found out that
where the stack trace stops is an attempt to resolve this call:

   0xfd6f7a48 <__select50+76>:  bl      0xfd700ed0 <0000803c.got2.plt_pic32._sys___select50>

which is inside libpthread.so and is trying to call something in libc.so.
So we successfully got to the select() function from PostmasterMain, but
that has a non-prelinked call to someplace else, and kaboom.

In short, looks like Andres' theory is right.  It means that 8acd8f869
didn't actually fix anything, though it reduced the probability of the
failure by reducing the number of vulnerable PLT-indirect calls.

I've adjusted mamba to set LD_BIND_NOW=1 in its environment.
I've verified that that causes the call inside __select50
to get resolved before we reach main(), so I'm hopeful that
it will cure the issue.  But it'll probably be a few weeks
before we can be sure.

Don't have a good idea about a non-band-aid fix.  Perhaps we
should revert 8acd8f869 altogether, but then what?  Even if
somebody comes up with a rewrite to avoid doing interesting
stuff in the postmaster's signal handlers, we surely wouldn't
risk back-patching it.

It's possible that doing nothing is okay, at least in the
short term.  It's probably nigh impossible to hit this
issue on modern multi-CPU hardware.  Or perhaps we could revive
the idea of having postmaster.c do one dummy select() call
before it unblocks signals.
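That band-aid could look something like this sketch, executed while signals are still blocked, before the postmaster's main loop:

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Sketch of the proposed band-aid: one throwaway zero-timeout select()
 * made while signals are still blocked, so that the lazy PLT binding
 * inside select() is resolved before any signal handler can interrupt
 * a later call.
 */
static int
prebind_select(void)
{
	struct timeval tv = {0, 0};

	return select(0, NULL, NULL, NULL, &tv);
}
```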

            regards, tom lane



Re: Strange failure on mamba

From
Andres Freund
Date:
Hi,

On 2022-11-30 18:33:06 -0500, Tom Lane wrote:
> Also, I dug into my stuck processes some more, and I have to take
> back the claim that this is happening later than postmaster startup.
> All the stuck children are ones that either are launched on request
> from the startup process, or are launched as soon as we get the
> termination report for the startup process.  So it's plausible that
> the problem is happening during the postmaster's first select()
> wait.  I then got dirty with the assembly code, and found out that
> where the stack trace stops is an attempt to resolve this call:
> 
>    0xfd6f7a48 <__select50+76>:  bl      0xfd700ed0 <0000803c.got2.plt_pic32._sys___select50>
> 
> which is inside libpthread.so and is trying to call something in libc.so.
> So we successfully got to the select() function from PostmasterMain, but
> that has a non-prelinked call to someplace else, and kaboom.

This whole area just seems quite broken in netbsd :(.

We're clearly doing stuff in a signal handler that we really shouldn't, but
not being able to call any functions implemented in libc, even if they're
async signal safe (as e.g. select is) means signals are basically not
usable. Afaict this basically means that signals are *never* safe on netbsd,
as long as there's a single external function call in a signal handler.



> I've adjusted mamba to set LD_BIND_NOW=1 in its environment.
> I've verified that that causes the call inside __select50
> to get resolved before we reach main(), so I'm hopeful that
> it will cure the issue.  But it'll probably be a few weeks
> before we can be sure.
> 
> Don't have a good idea about a non-band-aid fix.

It's also a band-aid, but perhaps a bit more reliable: we could link
statically to libc and libpthread.

Another approach could be to iterate over the loaded shared libraries during
postmaster startup and force symbols to be resolved. IIRC there's functions
that'd allow that. But it seems like a lot of work to work around an OS bug.
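A hypothetical sketch of that idea, assuming a platform with dl_iterate_phdr() and dlopen() flag promotion via RTLD_NOW | RTLD_NOLOAD (glibc documents that promotion; whether NetBSD's rtld honors it is an assumption, not verified):

```c
#define _GNU_SOURCE
#include <link.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical sketch: walk the already-loaded shared objects and
 * re-dlopen() each with RTLD_NOW, which on some platforms promotes
 * their lazy PLT entries to eagerly resolved ones.
 */
static int
force_bind_cb(struct dl_phdr_info *info, size_t size, void *data)
{
	if (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
	{
		/* RTLD_NOLOAD: only act on objects that are already mapped */
		void	   *h = dlopen(info->dlpi_name, RTLD_NOW | RTLD_NOLOAD);

		if (h != NULL)
			dlclose(h);			/* drop the extra reference */
	}
	return 0;					/* continue iterating */
}

static void
force_resolve_all(void)
{
	dl_iterate_phdr(force_bind_cb, NULL);
}
```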


> Perhaps we should revert 8acd8f869 altogether, but then what?

FWIW, I think we should consider using those flags everywhere for the backend
- they make copy-on-write more effective and decrease connection overhead a
bit, because otherwise each backend process does the same symbol resolutions
again and again, dirtying memory post-fork.


> Even if somebody comes up with a rewrite to avoid doing interesting stuff in
> the postmaster's signal handlers, we surely wouldn't risk back-patching it.

Would that actually fix anything, given netbsd's brokenness? If we used a
latch like mechanism, the signal handler would still use functions in libc. So
postmaster could deadlock, at least during the first execution of a signal
handler? So I think 8acd8f869 continues to be important...


Greetings,

Andres Freund



Re: Strange failure on mamba

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-11-30 18:33:06 -0500, Tom Lane wrote:
>> Even if somebody comes up with a rewrite to avoid doing interesting stuff in
>> the postmaster's signal handlers, we surely wouldn't risk back-patching it.

> Would that actually fix anything, given netbsd's brokenness? If we used a
> latch like mechanism, the signal handler would still use functions in libc. So
> postmaster could deadlock, at least during the first execution of a signal
> handler? So I think 8acd8f869 continues to be important...

I agree that "-z now" is a good idea for performance reasons, but
what we're seeing is that it's only a partial fix for netbsd's issue,
since it doesn't apply to shared libraries that the postmaster pulls
in.

I'm not sure about your thesis that things are fundamentally broken.
It does seem like if a signal handler does SetLatch then that could
require PLT resolution, and if it interrupts something else doing
PLT resolution then we have a problem.  But if it were a live
problem then we'd have seen instances outside of the postmaster's
select() wait, and we haven't.

I'm kind of inclined to band-aid that select() call as previously
suggested, and see where we end up.

            regards, tom lane