RE: [PoC] pg_upgrade: allow to upgrade publisher node

Поиск
Список
Период
Сортировка
От Hayato Kuroda (Fujitsu)
Тема RE: [PoC] pg_upgrade: allow to upgrade publisher node
Дата
Msg-id TYAPR01MB586620409507691E8116EA62F5E2A@TYAPR01MB5866.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
Dear Peter,

Thank you for reviewing! The patch can be available in [1].

> Here are my review comments for patch v25-0003.
> 
> ======
> src/bin/pg_upgrade/check.c
> 
> 1. GENERAL
> 
> +static void check_for_confirmed_flush_lsn(void);
> +static void check_for_lost_slots(void);
> 
> For more clarity, I wonder if it is better to rename some functions:
> 
> check_for_confirmed_flush_lsn() -> check_old_cluster_for_confirmed_flush_lsn()
> check_for_lost_slots() -> check_old_cluster_for_lost_slots()

Replaced.

> 2.
> + /*
> + * Logical replication slots can be migrated since PG17. See comments atop
> + * get_logical_slot_infos().
> + */
> + if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
> + {
> + check_for_lost_slots();
> +
> + /*
> + * Do additional checks if a live check is not required. This requires
> + * that confirmed_flush_lsn of all the slots is the same as the latest
> + * checkpoint location, but it would be satisfied only when the server
> + * has been shut down.
> + */
> + if (!live_check)
> + check_for_confirmed_flush_lsn();
> + }
> +
> 
> 2a.
> If my suggestions from v25-0002 [1] are adopted then this comment
> needs to change to say like "See atop file pg_upgrade.c..."
>
> 2b.
> Hmm. If my suggestions from v25-0002 [1] are adopted then the version
> checking and the slot counting would *already* be in this calling
> function. In that case, why can't this whole fragment be put in the
> same place? E.g. IIUC there is no reason to call these at checks all
> when the old_cluster slot count is already known to be 0. Similarly,
> there is no reason that both these functions need to be independently
> checking count_logical_slots again since we have already done that
> (again, assuming my suggestions from v25-0002 [1] are adopted).

Currently I did not accept the comment, so they were ignored.

> 3. check_for_lost_slots
> 
> +/*
> + * Verify that all logical replication slots are usable.
> + */
> +void
> +check_for_lost_slots(void)
> 
> This was forward-declared to be static, but the static function
> modifier is absent here.

Fixed.

> 4. check_for_lost_slots
> 
> + /* Quick exit if the cluster does not have logical slots. */
> + if (count_logical_slots() == 0)
> + return;
> +
> 
> AFAICT this quick exit can be removed. See my comment #2b.

2b was skipped, so IIUC this is still needed.

> 5. check_for_confirmed_flush_lsn
> 
> +check_for_confirmed_flush_lsn(void)
> +{
> + int i,
> + ntups,
> + i_slotname;
> + PGresult   *res;
> + DbInfo    *active_db = &old_cluster.dbarr.dbs[0];
> + PGconn    *conn;
> +
> + /* Quick exit if the cluster does not have logical slots. */
> + if (count_logical_slots() == 0)
> + return;
> 
> AFAICT this quick exit can be removed. See my comment #2b.

I kept the style.

> .../t/003_logical_replication_slots.pl
> 
> 6.
> +# 2. Temporarily disable the subscription
> +$subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub DISABLE");
>  $old_publisher->stop;
> 
> In my previous 0003 review ([2] #10b) I was not questioning the need
> for the $old_publisher->stop; before the pg_upgrade. I was only asking
> why it was done at this location (after the DISABLE) instead of
> earlier.

I see. The reason was to avoid unnecessary error by apply worker.

As the premise, the position of shutting down (before or after the DISABLE) does
not affect the result. But if it puts earlier than DISABLE, the apply worker will
exit with below error because the walsender exits ealier than worker:

```
ERROR:  could not send end-of-streaming message to primary: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
        no COPY in progress
```

It is not problematic but future readers may be confused if it find.
So I avoided it.

> 7.
> +# Check whether changes on the new publisher get replicated to the subscriber
> +$new_publisher->safe_psql('postgres',
> + "INSERT INTO tbl VALUES (generate_series(11, 20))");
> +$new_publisher->wait_for_catchup('sub');
> +$result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM tbl");
> +is($result, qq(20), 'check changes are shipped to the subscriber');
> 
> /shipped/replicated/

You meant to say s/replicated/shipped/, right? Fixed.

[1]:
https://www.postgresql.org/message-id/TYAPR01MB5866C6DE11EBC96752CEB7DEF5E2A%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Hayato Kuroda (Fujitsu)"
Дата:
Сообщение: RE: [PoC] pg_upgrade: allow to upgrade publisher node
Следующее
От: Erik Rijkers
Дата:
Сообщение: Re: PostgreSQL 16 release announcement draft