Re: Excessive number of replication slots for 12->14 logical replication

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Excessive number of replication slots for 12->14 logical replication
Дата
Msg-id CAA4eK1+RL43Qty=Rb+RJhw0+0sm-f2T=7ST=9u0R+vmnDDVZaA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Excessive number of replication slots for 12->14 logical replication  (hubert depesz lubaczewski <depesz@depesz.com>)
Ответы Re: Excessive number of replication slots for 12->14 logical replication
Список pgsql-bugs
On Mon, Jul 18, 2022 at 3:13 PM hubert depesz lubaczewski
<depesz@depesz.com> wrote:
>
> On Mon, Jul 18, 2022 at 09:07:35AM +0530, Amit Kapila wrote:
>
> First error:
> #v+
> 2022-07-18 09:22:07.046 UTC,,,4145917,,62d5263f.3f42fd,2,,2022-07-18 09:22:07 UTC,28/21641,1219146,ERROR,53400,"could
notfind free replication state slot for replication origin with OID 51",,"Increase max_replication_slots and try
again.",,,,,,,"","logicalreplication worker",,0 
> #v-
>
> Nothing else errored out before, no warning, no fatals.
>
> from the first ERROR I was getting them in the range of 40-70 per minute.
>
> At the same time I was logging data from `select now(), * from pg_replication_slots`, every 2 seconds.
>
...
>
> So, it looks that there are up to 10 focal slots, all active, and then there are sync slots with weirdly high counts
forinactive ones. 
>
> At most, I had 11 active sync slots.
>
> Looks like some kind of timing issue, which would be inline with what
> Kyotaro Horiguchi wrote initially.
>

I think this is a timing issue similar to what Horiguchi-San has
pointed out but due to replication origins. We drop the replication
origin after the sync worker that has used it is finished. This is
done by the apply worker because we don't allow to drop the origin
till the process owning the origin is alive. I am not sure of
repercussions but maybe we can allow dropping the origin by the
process that owns it.

I think this will also be addressed once we start resuing
workers/slots/origin to copy multiple tables in the initial sync phase
as is being discussed in the thread [1].

[1] - https://www.postgresql.org/message-id/CAGPVpCTq%3DrUDd4JUdaRc1XUWf4BrH2gdSNf3rtOMUGj9rPpfzQ%40mail.gmail.com

--
With Regards,
Amit Kapila.



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Francisco Olarte
Дата:
Сообщение: Re: BUG #17554: when i use rule on table which have serial column, the nextval exec twice.
Следующее
От: PG Bug reporting form
Дата:
Сообщение: BUG #17555: Missing rhel-9 repo