Thread: [HACKERS] Regression stoping PostgreSQL 9.4.13 if a walsender is running

[HACKERS] Regression stoping PostgreSQL 9.4.13 if a walsender is running

From
Marco Nenciarini
Date:
I have noticed that after the 9.4.13 release PostgreSQL reliably fails
to shut down in smart or fast mode if there is a running walsender.

The postmaster continues waiting forever for the walsender termination.

It works perfectly with all the other major releases.

I bisected the issue to commit 1cdc0ab9c180222a94e1ea11402e728688ddc37d

After some investigation I discovered that the instruction that sets
got_SIGUSR2 was lost during the backpatch in the WalSndLastCycleHandler
function.

The trivial patch is the following:

~~~
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0601b3..b24f9a1 100644
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/walsender.c
*************** WalSndLastCycleHandler(SIGNAL_ARGS)
*** 2658,2663 ****
--- 2658,2664 ----
  {
      int         save_errno = errno;
  
+     got_SIGUSR2 = true;
      if (MyWalSnd)
          SetLatch(&MyWalSnd->latch);

~~~

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it
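
The failure mode described in this message can be illustrated with a minimal, standalone sketch. It is not PostgreSQL code: `got_sigusr2`, `latch_is_set`, the two handlers and the toy loop are hypothetical stand-ins for `got_SIGUSR2`, the walsender latch and `WalSndLastCycleHandler`, and the real walsender loop is considerably more involved. The point is only that a handler which wakes the loop without recording the reason leaves the loop with nothing to act on.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the PostgreSQL definitions. */
static volatile sig_atomic_t got_sigusr2 = 0;   /* "finish your last cycle" flag */
static volatile sig_atomic_t latch_is_set = 0;  /* toy latch: a bare wake-up hint */

/* Mirrors the regressed handler: it wakes the loop but never sets the flag. */
static void last_cycle_handler_broken(int signo)
{
    int save_errno = errno;

    (void) signo;
    latch_is_set = 1;
    errno = save_errno;
}

/* Mirrors the patched handler: set the flag first, then wake the loop. */
static void last_cycle_handler_fixed(int signo)
{
    int save_errno = errno;

    (void) signo;
    got_sigusr2 = 1;
    latch_is_set = 1;
    errno = save_errno;
}

/* One iteration of a toy main loop; returns true if the loop should exit. */
static bool loop_iteration_should_exit(void)
{
    if (!latch_is_set)
        return false;           /* nothing woke us up */
    latch_is_set = 0;           /* consume the wake-up */
    return got_sigusr2 != 0;    /* exit only if the flag says so */
}

/* Install a handler, send SIGUSR2 to ourselves, then run one loop iteration. */
static bool exits_after_sigusr2(void (*handler)(int))
{
    got_sigusr2 = 0;
    latch_is_set = 0;
    signal(SIGUSR2, handler);
    raise(SIGUSR2);             /* the handler runs before raise() returns */
    return loop_iteration_should_exit();
}
```

Calling `exits_after_sigusr2(last_cycle_handler_broken)` returns false, because the loop is woken but sees no reason to stop; with `last_cycle_handler_fixed` it returns true. In this sketch, that difference is what the one-line patch restores.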


Re: [HACKERS] Regression stoping PostgreSQL 9.4.13 if a walsender is running

From
Michael Paquier
Date:
On Wed, Aug 23, 2017 at 2:28 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:
> I have noticed that after the 9.4.13 release PostgreSQL reliably fails
> to shut down in smart or fast mode if there is a running walsender.
>
> The postmaster continues waiting forever for the walsender termination.
>
> It works perfectly with all the other major releases.

Right. A similar issue was reported yesterday:
https://www.postgresql.org/message-id/CAA5_DuD0O1XyM8OnOzhRepyPU-t8nZKLzs1pT2JpzP0NS+vVNA@mail.gmail.com
Thanks for digging into the origin of the problem; I did not have
time yesterday to look at it.

> I bisected the issue to commit 1cdc0ab9c180222a94e1ea11402e728688ddc37d
>
> After some investigation I discovered that the instruction that sets
> got_SIGUSR2 was lost during the backpatch in the WalSndLastCycleHandler
> function.

That looks correct to me, only REL9_4_STABLE is impacted. This bug
breaks many use cases like failovers :(
-- 
Michael



Re: [HACKERS] Regression stoping PostgreSQL 9.4.13 if a walsender is running

From
Andres Freund
Date:
Hi,

On 2017-08-22 19:28:22 +0200, Marco Nenciarini wrote:
> I have noticed that after the 9.4.13 release PostgreSQL reliably fails
> to shut down in smart or fast mode if there is a running walsender.
> 
> The postmaster continues waiting forever for the walsender termination.
> 
> It works perfectly with all the other major releases.
> 
> I bisected the issue to commit 1cdc0ab9c180222a94e1ea11402e728688ddc37d
> 
> After some investigation I discovered that the instruction that sets
> got_SIGUSR2 was lost during the backpatch in the WalSndLastCycleHandler
> function.
> 
> The trivial patch is the following:

Pushed, thanks!  And sorry again.

Greetings,

Andres Freund