Re: How to debug logical replication error "columns are missing" if they are not

Поиск
Список
Период
Сортировка
От Thomas Kellerer
Тема Re: How to debug logical replication error "columns are missing" if they are not
Дата
Msg-id e5d69628-de66-79a6-b11b-53c105747e30@gmx.net
обсуждение исходный текст
Ответ на How to debug logical replication error "columns are missing" if they are not  (Thomas Kellerer <shammat@gmx.net>)
Список pgsql-admin
Thomas Kellerer schrieb am 25.08.2020 um 07:32:
> we have a a logical replication from Postgres 11 to Postgres 12.
>
> We did some structural changes to the published tables. The changes
> to the subscriber where applied about 60 minutes later then those on
> the publisher. Obviously before the subscriber's tables where synced
> the replication failed.
> But even though the tables are now completely identical, the subscriber still claims:
>   logical replication target relation "public.employee" is missing some replicated columns

So - as explained by Samed - the problem boils down to a setup like this:

The initial table looked like this (on both sides)

    create table test_table (id integer primary key, valid_from date, valid_to date);

Replication is running fine, then we run the following on the publisher

  begin transaction;
    ALTER TABLE test_table ADD COLUMN start_end_date daterange;

    UPDATE test_table SET start_end_date = daterange(valid_from, valid_to, '[]');

    ALTER TABLE test_table ALTER COLUMN start_end_date SET NOT NULL;

    ALTER TABLE test_table
      DROP COLUMN valid_from,
      DROP COLUMN valid_to;
  commit;

If the UPDATE part is removed from the change, everything works smoothly.

So, the UPDATE sends the complete modified row including the to be dropped columns to the subscriber.
As I ran the same steps on the subscriber, the columns were dropped on the subscriber before the UPDATE could be
replayedand thus it kept failing. 

I do understand now why this happens, and that it is a limitation of the current implementation.


However, what I don't understand is, why removing the table from the replication doesn't fix this.
It seems, that if the table is re-added later, the old WAL segments are still considered valid and Postgres tries to
replaythem. 

Which seems a rather strange thing to do to me.
Could anyone enlighten me on that?
Is there a way to mark the no longer needed WAL segments as obsolete?


Regards
Thomas




В списке pgsql-admin по дате отправления:

Предыдущее
От: Thomas Kellerer
Дата:
Сообщение: Re: How to debug logical replication error "columns are missing" if they are not
Следующее
От: "Anjul Tyagi"
Дата:
Сообщение: Logical Replication - Rep Manager