Обсуждение: pg_upgrade allows itself to be run twice

Поиск
Список
Период
Сортировка

pg_upgrade allows itself to be run twice

От
Justin Pryzby
Дата:
Now that I've gotten your attention..

I expect pg_upgrade to fail if I run it twice in a row.

It would be reasonable if it were to fail during the "--check" phase,
maybe by failing like this:
| New cluster database "..." is not empty: found relation "..."

But that fails to happen if the cluster has neither tables nor matviews, in
which case, it passes the check phase and then fails like this:

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_clog to new server                           ok
Setting oldest XID for new cluster                          ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       connection to server on socket
"/home/pryzbyj/src/postgres/.s.PGSQL.50432"failed: FATAL:  could not open relation with OID 2610
 
Failure, exiting

I'll concede that a cluster which has no tables sounds a lot like a toy, but I
find it disturbing that nothing prevents running the process twice, up to the
point that it's evidently corrupted the catalog.

While looking at this, I noticed that starting postgres --single immediately
after initdb allows creating objects with OIDs below FirstNormalObjectId
(thereby defeating pg_upgrade's check).  I'm not familiar with the behavioral
differences of single user mode, and couldn't find anything in the
documentation.



Re: pg_upgrade allows itself to be run twice

От
Michael Paquier
Дата:
On Sat, Jun 25, 2022 at 11:04:37AM -0500, Justin Pryzby wrote:
> I expect pg_upgrade to fail if I run it twice in a row.

Yep.

> It would be reasonable if it were to fail during the "--check" phase,
> maybe by failing like this:
> | New cluster database "..." is not empty: found relation "..."

So, we get a complaint that the new cluster is not empty after one
pg_upgrade run with a new command of pg_upgrade, with or without
--check.  This happens in check_new_cluster(), where we'd fatal if a
relation uses a namespace different than pg_catalog.

> But that fails to happen if the cluster has neither tables nor matviews, in
> which case, it passes the check phase and then fails like this:

Indeed, as of get_rel_infos().

> I'll concede that a cluster which has no tables sounds a lot like a toy, but I
> find it disturbing that nothing prevents running the process twice, up to the
> point that it's evidently corrupted the catalog.

I have heard of cases where instances were only used with a set of
foreign tables, for example.  Not sure that this is spread enough to
worry about, but this would fail as much as your case.

> While looking at this, I noticed that starting postgres --single immediately
> after initdb allows creating objects with OIDs below FirstNormalObjectId
> (thereby defeating pg_upgrade's check).  I'm not familiar with the behavioral
> differences of single user mode, and couldn't find anything in the
> documentation.

This one comes from NextOID in the control data file after a fresh
initdb, and GetNewObjectId() would enforce that in a postmaster
environment to be FirstNormalObjectId when assigning the first user
OID.  Would you imply an extra step at the end of initdb to update the
control data file of the new cluster to reflect FirstNormalObjectId?
--
Michael

Вложения

Re: pg_upgrade allows itself to be run twice

От
Justin Pryzby
Дата:
On Wed, Jun 29, 2022 at 01:17:33PM +0900, Michael Paquier wrote:
> On Sat, Jun 25, 2022 at 11:04:37AM -0500, Justin Pryzby wrote:
> 
> > I'll concede that a cluster which has no tables sounds a lot like a toy, but I
> > find it disturbing that nothing prevents running the process twice, up to the
> > point that it's evidently corrupted the catalog.
> 
> I have heard of cases where instances were only used with a set of
> foreign tables, for example.  Not sure that this is spread enough to
> worry about, but this would fail as much as your case.

foreign tables have OIDs too.

ts=# SELECT * FROM pg_class WHERE relkind ='f';
oid                 | 1554544611

> > While looking at this, I noticed that starting postgres --single immediately
> > after initdb allows creating objects with OIDs below FirstNormalObjectId
> > (thereby defeating pg_upgrade's check).  I'm not familiar with the behavioral
> > differences of single user mode, and couldn't find anything in the
> > documentation.
> 
> This one comes from NextOID in the control data file after a fresh
> initdb, and GetNewObjectId() would enforce that in a postmaster
> environment to be FirstNormalObjectId when assigning the first user
> OID.  Would you imply an extra step at the end of initdb to update the
> control data file of the new cluster to reflect FirstNormalObjectId?

I added a call to reset xlog, similar to what's in pg_upgrade.
Unfortunately, I don't see an easy way to silence it.

-- 
Justin

Вложения

Re: pg_upgrade allows itself to be run twice

От
Justin Pryzby
Дата:
rebased and updated

Robert thought that it might be reasonable for someone to initdb, and
then connect and make some modifications, and then pg_upgrade.
https://www.postgresql.org/message-id/CA%2BTgmoYwaXh_wRRa2CqL4XpM4r6YEbq1%2Bec%3D%2B8b7Dr7-T_tT%2BQ%40mail.gmail.com

But the DBs are dropped by pg_upgrade, so that seems to serve no
purpose, except for shared relations (and global objects?).  In the case
of shared relations, it seems unsafe (even though my test didn't cause
an immediate explosion).

So rather than continuing to allow arbitrary changes between initdb and
pg_upgrade, I propose to prohibit all changes.  I'd consider allowing an
"inclusive" list of allowable changes, if someone were to propose such a
thing - but since DBs are dropped, I'm not sure what it could include.

Вложения

Re: pg_upgrade allows itself to be run twice

От
Peter Eisentraut
Дата:
On 07.07.22 08:22, Justin Pryzby wrote:
>> This one comes from NextOID in the control data file after a fresh
>> initdb, and GetNewObjectId() would enforce that in a postmaster
>> environment to be FirstNormalObjectId when assigning the first user
>> OID.  Would you imply an extra step at the end of initdb to update the
>> control data file of the new cluster to reflect FirstNormalObjectId?
> I added a call to reset xlog, similar to what's in pg_upgrade.
> Unfortunately, I don't see an easy way to silence it.

I think it would be better to update the control file directly instead 
of going through pg_resetwal.  (See 
src/include/common/controldata_utils.h for the required functions.)

However, I don't know whether we need to add special provisions that 
guard against people using postgres --single in complicated ways.  Many 
consider the single-user mode deprecated outside of initdb use.




Re: pg_upgrade allows itself to be run twice

От
Justin Pryzby
Дата:
On Tue, Nov 01, 2022 at 01:54:35PM +0100, Peter Eisentraut wrote:
> On 07.07.22 08:22, Justin Pryzby wrote:
> > > This one comes from NextOID in the control data file after a fresh
> > > initdb, and GetNewObjectId() would enforce that in a postmaster
> > > environment to be FirstNormalObjectId when assigning the first user
> > > OID.  Would you imply an extra step at the end of initdb to update the
> > > control data file of the new cluster to reflect FirstNormalObjectId?
> > I added a call to reset xlog, similar to what's in pg_upgrade.
> > Unfortunately, I don't see an easy way to silence it.
> 
> I think it would be better to update the control file directly instead of
> going through pg_resetwal.  (See src/include/common/controldata_utils.h for
> the required functions.)
> 
> However, I don't know whether we need to add special provisions that guard
> against people using postgres --single in complicated ways.  Many consider
> the single-user mode deprecated outside of initdb use.

Thanks for looking.

One other thing I noticed (by accident!) is that pg_upgrade doesn't
prevent itself from trying to upgrade a cluster on top of itself:

| $ /usr/pgsql-15/bin/initdb -D pg15.dat -N
| $ /usr/pgsql-15/bin/pg_upgrade -D pg15.dat -d pg15.dat -b /usr/pgsql-15/bin
| Performing Upgrade
| ------------------
| Analyzing all rows in the new cluster                       ok
| Freezing all rows in the new cluster                        ok
| Deleting files from new pg_xact                             ok
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| Copying old pg_xact to new server
| *failure*
| 
| Consult the last few lines of "pg15.dat/pg_upgrade_output.d/20221101T055916.486/log/pg_upgrade_utility.log" for
>>
| command: cp -Rf "pg15.dat/pg_xact" "pg15.dat/pg_xact" >>
"pg15.dat/pg_upgrade_output.d/20221101T055916.486/log/pg_upgrade_utility.log"2>&1
 
| cp: cannot stat 'pg15.dat/pg_xact': No such file or directory

This may be of little concern since it's upgrading a version to itself, which
only applies to developers.

-- 
Justin



Re: pg_upgrade allows itself to be run twice

От
Peter Eisentraut
Дата:
On 01.11.22 14:07, Justin Pryzby wrote:
> On Tue, Nov 01, 2022 at 01:54:35PM +0100, Peter Eisentraut wrote:
>> On 07.07.22 08:22, Justin Pryzby wrote:
>>>> This one comes from NextOID in the control data file after a fresh
>>>> initdb, and GetNewObjectId() would enforce that in a postmaster
>>>> environment to be FirstNormalObjectId when assigning the first user
>>>> OID.  Would you imply an extra step at the end of initdb to update the
>>>> control data file of the new cluster to reflect FirstNormalObjectId?
>>> I added a call to reset xlog, similar to what's in pg_upgrade.
>>> Unfortunately, I don't see an easy way to silence it.
>>
>> I think it would be better to update the control file directly instead of
>> going through pg_resetwal.  (See src/include/common/controldata_utils.h for
>> the required functions.)
>>
>> However, I don't know whether we need to add special provisions that guard
>> against people using postgres --single in complicated ways.  Many consider
>> the single-user mode deprecated outside of initdb use.
> 
> Thanks for looking.

I think the above is a "returned with feedback" at this point.

> One other thing I noticed (by accident!) is that pg_upgrade doesn't
> prevent itself from trying to upgrade a cluster on top of itself:
> 
> | $ /usr/pgsql-15/bin/initdb -D pg15.dat -N
> | $ /usr/pgsql-15/bin/pg_upgrade -D pg15.dat -d pg15.dat -b /usr/pgsql-15/bin
> | Performing Upgrade
> | ------------------
> | Analyzing all rows in the new cluster                       ok
> | Freezing all rows in the new cluster                        ok
> | Deleting files from new pg_xact                             ok
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> | Copying old pg_xact to new server
> | *failure*
> |
> | Consult the last few lines of "pg15.dat/pg_upgrade_output.d/20221101T055916.486/log/pg_upgrade_utility.log" for
>>>
> | command: cp -Rf "pg15.dat/pg_xact" "pg15.dat/pg_xact" >>
"pg15.dat/pg_upgrade_output.d/20221101T055916.486/log/pg_upgrade_utility.log"2>&1
 
> | cp: cannot stat 'pg15.dat/pg_xact': No such file or directory
> 
> This may be of little concern since it's upgrading a version to itself, which
> only applies to developers.

I think this would be worth addressing nonetheless, for robustness.  For 
comparison, "cp" and "mv" will error if you give source and destination 
that are the same file.




Re: pg_upgrade allows itself to be run twice

От
Justin Pryzby
Дата:
On Thu, Dec 01, 2022 at 10:30:16AM +0100, Peter Eisentraut wrote:
> On 01.11.22 14:07, Justin Pryzby wrote:
> > On Tue, Nov 01, 2022 at 01:54:35PM +0100, Peter Eisentraut wrote:
> > > On 07.07.22 08:22, Justin Pryzby wrote:
> > > > > This one comes from NextOID in the control data file after a fresh
> > > > > initdb, and GetNewObjectId() would enforce that in a postmaster
> > > > > environment to be FirstNormalObjectId when assigning the first user
> > > > > OID.  Would you imply an extra step at the end of initdb to update the
> > > > > control data file of the new cluster to reflect FirstNormalObjectId?
> > > > I added a call to reset xlog, similar to what's in pg_upgrade.
> > > > Unfortunately, I don't see an easy way to silence it.
> > > 
> > > I think it would be better to update the control file directly instead of
> > > going through pg_resetwal.  (See src/include/common/controldata_utils.h for
> > > the required functions.)
> > > 
> > > However, I don't know whether we need to add special provisions that guard
> > > against people using postgres --single in complicated ways.  Many consider
> > > the single-user mode deprecated outside of initdb use.
> > 
> > Thanks for looking.

To be clear, I don't think it's worth doing anything special just to
avoid creating special OIDs if someone runs postgres --single
immediately after initdb.

However, setting FirstNormalOid in initdb itself (rather than in the
next invocation of postgres, if it isn't in single-user-mode) was the
mechanism by which to avoid the original problem that pg_upgrade can be
run twice, if there are no non-system relations.  That still seems
desirable to fix somehow.

> I think the above is a "returned with feedback" at this point.
> 
> > One other thing I noticed (by accident!) is that pg_upgrade doesn't
> > prevent itself from trying to upgrade a cluster on top of itself:

> > | command: cp -Rf "pg15.dat/pg_xact" "pg15.dat/pg_xact" >>
"pg15.dat/pg_upgrade_output.d/20221101T055916.486/log/pg_upgrade_utility.log"2>&1
 
> > | cp: cannot stat 'pg15.dat/pg_xact': No such file or directory
> > 
> > This may be of little concern since it's upgrading a version to itself, which
> > only applies to developers.
> 
> I think this would be worth addressing nonetheless, for robustness.  For
> comparison, "cp" and "mv" will error if you give source and destination that
> are the same file.

I addressed this in 002.

-- 
Justin

Вложения

Re: pg_upgrade allows itself to be run twice

От
Michael Paquier
Дата:
On Fri, Dec 16, 2022 at 07:38:09AM -0600, Justin Pryzby wrote:
> However, setting FirstNormalOid in initdb itself (rather than in the
> next invocation of postgres, if it isn't in single-user-mode) was the
> mechanism by which to avoid the original problem that pg_upgrade can be
> run twice, if there are no non-system relations.  That still seems
> desirable to fix somehow.

+       if (new_cluster.controldata.chkpnt_nxtoid != FirstNormalObjectId)
+               pg_fatal("New cluster is not pristine: OIDs have been assigned since initdb (%u != %u)\n",
+                       new_cluster.controldata.chkpnt_nxtoid, FirstNormalObjectId);

On wraparound FirstNormalObjectId would be the first value we use for
nextOid.  Okay, that's very unlikely going to happen, still, strictly
speaking, that could be incorrect.

>> I think this would be worth addressing nonetheless, for robustness.  For
>> comparison, "cp" and "mv" will error if you give source and destination that
>> are the same file.
>
> I addressed this in 002.

+       if (strcmp(make_absolute_path(old_cluster.pgdata),
+                  make_absolute_path(new_cluster.pgdata)) == 0)
+               pg_fatal("cannot upgrade a cluster on top of itself");

Shouldn't this be done after adjust_data_dir(), which is the point
where we'll know the actual data folders we are working on by querying
postgres -C data_directory?
--
Michael

Вложения