Re: Mailing list subscription's mail delivery delays?

From Magnus Hagander
Subject Re: Mailing list subscription's mail delivery delays?
Date
Msg-id CABUevEx=Nswe8OLg7kdr4A3UNSTG_rTO4P8fs_VVpeJamzfp3w@mail.gmail.com
In reply to Re: Mailing list subscription's mail delivery delays?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Mailing list subscription's mail delivery delays?
List pgsql-www
On Mon, Oct 2, 2023 at 4:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Magnus Hagander <magnus@hagander.net> writes:
> > On Fri, Sep 29, 2023 at 1:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I have been seeing the same thing for a few days now, on my
> >> definitely-not-gmail personal server.  Something's flaky in the
> >> PG mail infrastructure.  It's gotten better since yesterday's
> >> outage, though I'm not convinced it's totally fixed.
>
> > There have been some pretty bad issues with gmail recently. Some
> > changes have been deployed that will hopefully help mitigate those and
> > make things better, but it takes time to recover.
>
> > The massive backlogs caused by gmail have been enough to spill over
> > and affect other destinations as well simply due to the load created
> > since we have such a huge number of gmail subscribers. But we're
> > slowly seeing the backlogs shrink now and the load come down so
> > hopefully the changes made will continue to have effect and let us be
> > back to normal soon.
>
> I'm still seeing multi-hour delivery delays on a subset of traffic,
> like maybe half a dozen instances today.
>
> Looking at the Received: timestamps shows pretty conclusively that
> the delays are within PG infra, for example this recent message from
> Heikki got hung up at two separate jumps:
>
> Return-Path: <pgsql-hackers-owner+M15-507066@lists.postgresql.org>
> Received: from malur.postgresql.org (malur.postgresql.org [217.196.149.56])
>         by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTPS id 392HruLZ2135620
>         (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT)
>         for <tgl@sss.pgh.pa.us>; Mon, 2 Oct 2023 13:53:57 -0400
> Received: from localhost ([127.0.0.1] helo=malur.postgresql.org)
>         by malur.postgresql.org with esmtp (Exim 4.94.2)
>         (envelope-from <pgsql-hackers-owner+M15-507066@lists.postgresql.org>)
>         id 1qnN7D-00GbGd-FB
>         for tgl@sss.pgh.pa.us; Mon, 02 Oct 2023 17:53:55 +0000
> Received: from makus.postgresql.org ([2001:4800:3e1:1::229])
>         by malur.postgresql.org with esmtps  (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
>         (Exim 4.94.2)
>         (envelope-from <hlinnaka@iki.fi>)
>         id 1qnGcb-00AqOg-Ti
>         for pgsql-hackers@lists.postgresql.org; Mon, 02 Oct 2023 10:57:53 +0000
> Received: from meesny.iki.fi ([195.140.195.201])
>         by makus.postgresql.org with esmtps  (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
>         (Exim 4.94.2)
>         (envelope-from <hlinnaka@iki.fi>)
>         id 1qnF5S-007kvc-AQ
>         for pgsql-hackers@postgresql.org; Mon, 02 Oct 2023 09:19:35 +0000
> Received: from [192.168.1.115] (dsl-hkibng22-54f8db-125.dhcp.inet.fi [84.248.219.125])
>         (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
>          key-exchange X25519 server-signature RSA-PSS (2048 bits))
>         (No client certificate requested)
>         (Authenticated sender: hlinnaka)
>         by meesny.iki.fi (Postfix) with ESMTPSA id 4Rzb4d51FBzydx;
>         Mon,  2 Oct 2023 12:19:29 +0300 (EEST)
> Message-ID: <fe32d2a0-0998-d866-d6ee-2aed70b9be00@iki.fi>
> Date: Mon, 2 Oct 2023 12:19:29 +0300
> ...
>
>
> Also, my own message <2154347.1696278028@sss.pgh.pa.us> went
> out to -hackers about 25 minutes ago and hasn't come back,
> so based on other recent examples I'm betting I won't see it
> for hours.
>
> Plenty of other traffic *is* coming through in normal-ish time,
> so I'm not sure I buy that there's still a massive logjam.

There is still definitely a problem, but it is slowly recovering. It
is *mostly* hitting gmail at this point, but there can be spillover
to others in some cases (for example, there's a general throttling
when the load on the server gets too high). In this particular case,
it coincides timing-wise with our old friend the oom-killer nuking
postgres on the machine, thereby stopping all incoming email for a
while before it got moving again. That particular problem should have
been taken care of completely by now; the general backlog/queueing
problem is still ongoing, though it has been improving.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/


