Обсуждение: pg_upgrade questions

Поиск
Список
Период
Сортировка

pg_upgrade questions

От
Sad Clouds
Дата:
Hello, I would like to ask a few questions about pg_upgrade when using
--link option to avoid copying data files.

1. After running pg_upgrade but before starting new cluster, is it 100%
safe to abort and go back to using the old cluster? For example:

/usr/PostgreSQL/15/bin/pg_upgrade --link \
  --old-datadir "/data/pgdb" \
  --new-datadir "/data/pgdb.new" \
  --old-bindir "/usr/PostgreSQL/9.6/bin" \
  --new-bindir "/usr/PostgreSQL/15/bin"

Upgrade succeeds but before I start the new cluster, I change my mind
and want to revert to using the old cluster. Can I simply revert to
using PostgreSQL 9.6 with these commands:

rm -rf /data/pgdb.new
mv /data/pgdb/global/pg_control.old /data/pgdb/global/pg_control

Am I correct in thinking that apart from pg_control file, "pg_upgrade
--link" does not modify the old cluster files in any incompatible ways?

2. Older pg_upgrade binary did not support --jobs argument. When
upgrading from such older clusters, running the latest "pg_upgrade
--jobs=N" fails at the very start. Is this expected?

3. I'm a bit confused about the correct procedure when it comes to
upgrading primary and standby clusters. Looking at
https://www.postgresql.org/docs/current/pgupgrade.html it mentions the
following:

"If you are upgrading standby servers using methods outlined in section
Step 11, verify that the old standby servers are caught up by running
pg_controldata against the old primary and standby clusters. Verify
that the “Latest checkpoint location” values match in all clusters.
(There will be a mismatch if old standby servers were shut down before
the old primary or if the old standby servers are still running.) Also,
make sure wal_level is not set to minimal in the postgresql.conf file
on the new primary cluster."

Does this mean pg_upgrade will not work on standby if it is out of sync
with the primary? The page also talks about using rsync to copy data
from primary to standby server. I would like to avoid doing large data
copy. Are these upgrade steps likely to work:

a) Disable replication temporary.
b) Stop standby server and run pg_upgrade.
c) Stop primary server and run pg_upgrade.
d) Enable replication and start primary and standby and allow them to
synchronize their data.

Thanks.



Re: pg_upgrade questions

От
"David G. Johnston"
Дата:
On Wed, Nov 9, 2022 at 7:25 AM Sad Clouds <cryintothebluesky@gmail.com> wrote:
Hello, I would like to ask a few questions about pg_upgrade when using
--link option to avoid copying data files.

1. After running pg_upgrade but before starting new cluster, is it 100%
safe to abort and go back to using the old cluster? For example:

/usr/PostgreSQL/15/bin/pg_upgrade --link \
  --old-datadir "/data/pgdb" \
  --new-datadir "/data/pgdb.new" \
  --old-bindir "/usr/PostgreSQL/9.6/bin" \
  --new-bindir "/usr/PostgreSQL/15/bin"

Upgrade succeeds but before I start the new cluster, I change my mind
and want to revert to using the old cluster. Can I simply revert to
using PostgreSQL 9.6 with these commands:

rm -rf /data/pgdb.new
mv /data/pgdb/global/pg_control.old /data/pgdb/global/pg_control

Am I correct in thinking that apart from pg_control file, "pg_upgrade
--link" does not modify the old cluster files in any incompatible ways?

 

2. Older pg_upgrade binary did not support --jobs argument. When
upgrading from such older clusters, running the latest "pg_upgrade
--jobs=N" fails at the very start. Is this expected?

No, it is not expected.  Whether parallelism is used or not should never affect the correctness of an upgrade or the state of the resultant cluster.
 

Does this mean pg_upgrade will not work on standby if it is out of sync
with the primary?

It will not, because once the new cluster launches there is no WAL that can fix any inconsistencies.
 
The page also talks about using rsync to copy data
from primary to standby server. I would like to avoid doing large data
copy. Are these upgrade steps likely to work:

a) Disable replication temporary.
b) Stop standby server and run pg_upgrade.
c) Stop primary server and run pg_upgrade.
d) Enable replication and start primary and standby and allow them to
synchronize their data.


I cannot tell if you are talking about physical or logical replication...but only the former matters for this purpose.  I don't know why you'd want to disable replication, what you need to do is ensure that there is zero activity on the primary and that the standby has caught up applying all WAL that has already been generated (and no WAL is pending).  Hence the requirement to check pg_controldata.

Step 11 covers your options.  I don't suggest you invent your own.

rsync is the means by which you avoid "large data transfers".

David J.

Re: pg_upgrade questions

От
Sad Clouds
Дата:
On Wed, 9 Nov 2022 09:21:57 -0700
"David G. Johnston" <david.g.johnston@gmail.com> wrote:

> I cannot tell if you are talking about physical or logical
> replication...but only the former matters for this purpose.  I don't know
> why you'd want to disable replication, what you need to do is ensure that
> there is zero activity on the primary and that the standby has caught up
> applying all WAL that has already been generated (and no WAL is pending).
> Hence the requirement to check pg_controldata.
> 
> Step 11 covers your options.  I don't suggest you invent your own.
> 
> rsync is the means by which you avoid "large data transfers".
> 
> David J.

Hi David, thanks for your reply. Yes I was referring to physical
replication. Step 11 in the pg_upgrade guide implies that the primary
server has to be upgraded first, then rsync copies upgraded files and
links to the standby server. For various reasons, it would be
desirable for me to upgrade the standby server first, while the primary
database is fully up and running. I don't have much control over the
applications using the primary data base and cannot make them wait
unspecified amount of time until no WAL is pending on the standby. The
outage during database upgrade needs to be absolutely minimal, hence
the use of --link option.

I'm not an expert on PostgreSQL replication, so was hoping there was a
way to upgrade standby, while primary is modifying data, then later
upgrade primary and tell it to synchronize any new data to standby.
Maybe there is a way to pause replication and leave both servers in a
consistent state? Can any outstanding WAL on primary be automatically
replayed to standby?

But, if I understand you correctly, then this is not going to work at
all. I may need to discard all data on the standby and restart
replication from empty.