Обсуждение: SSI patch renumbered existing 2PC resource managers??
So I finally started actually reading the SSI changes, and I am a tad distressed by this: diff --git a/src/include/access/twophase_rmgr.h b/src/include/access/twophase_rmgr.h index a541d0f..1c7d8bb 100644 --- a/src/include/access/twophase_rmgr.h +++ b/src/include/access/twophase_rmgr.h @@ -23,8 +23,9 @@ typedef uint8 TwoPhaseRmgrId; */#define TWOPHASE_RM_END_ID 0#define TWOPHASE_RM_LOCK_ID 1 -#define TWOPHASE_RM_PGSTAT_ID 2 -#define TWOPHASE_RM_MULTIXACT_ID 3 +#define TWOPHASE_RM_PREDICATELOCK_ID 2 +#define TWOPHASE_RM_PGSTAT_ID 3 +#define TWOPHASE_RM_MULTIXACT_ID 4#define TWOPHASE_RM_MAX_ID TWOPHASE_RM_MULTIXACT_IDextern const TwoPhaseCallbacktwophase_recover_callbacks[]; What was the rationale for changing the assignments of existing 2PC IDs? So far as I can tell, that breaks pg_upgrade (if there are any open prepared transactions) for no redeeming social benefit. Is there a reason why TWOPHASE_RM_PREDICATELOCK_ID has to be 2 and not at the end? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > What was the rationale for changing the assignments of existing > 2PC IDs? So far as I can tell, that breaks pg_upgrade (if there > are any open prepared transactions) for no redeeming social > benefit. Is there a reason why TWOPHASE_RM_PREDICATELOCK_ID has > to be 2 and not at the end? I'm sure that Dan will jump in if this guess is wrong, but since the predicate lock code is largely derived from the heavyweight locking code, it probably seemed to have a minor cosmetic benefit to put it adjacent to that. It didn't occur to me when the SSI 2PC code went in, but I can see the problem now that you point it out. The new entry should be moved to the end for compatibility. Would you like me to submit a patch to fix this, or should I stay out of your way? -Kevin
On 13.06.2011 21:31, Tom Lane wrote: > So I finally started actually reading the SSI changes, and I am a tad > distressed by this: > > diff --git a/src/include/access/twophase_rmgr.h b/src/include/access/twophase_rmgr.h > index a541d0f..1c7d8bb 100644 > --- a/src/include/access/twophase_rmgr.h > +++ b/src/include/access/twophase_rmgr.h > @@ -23,8 +23,9 @@ typedef uint8 TwoPhaseRmgrId; > */ > #define TWOPHASE_RM_END_ID 0 > #define TWOPHASE_RM_LOCK_ID 1 > -#define TWOPHASE_RM_PGSTAT_ID 2 > -#define TWOPHASE_RM_MULTIXACT_ID 3 > +#define TWOPHASE_RM_PREDICATELOCK_ID 2 > +#define TWOPHASE_RM_PGSTAT_ID 3 > +#define TWOPHASE_RM_MULTIXACT_ID 4 > #define TWOPHASE_RM_MAX_ID TWOPHASE_RM_MULTIXACT_ID > > extern const TwoPhaseCallback twophase_recover_callbacks[]; > > What was the rationale for changing the assignments of existing 2PC IDs? As far as I can tell it was for purely cosmetic reasons, to have lock and predicate lock lines together. > So far as I can tell, that breaks pg_upgrade (if there are any open > prepared transactions) for no redeeming social benefit. Surely pg_upgrade can't work anyway if there's any open prepared transactions in the database. We're not going to guarantee to keep all the data structures we write in two-phase state files unchanged over major releases. If pg_upgrade is not checking for prepared transcations at the moment, such a check should probably should be added. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Mon, Jun 13, 2011 at 10:22:19PM +0300, Heikki Linnakangas wrote: > As far as I can tell it was for purely cosmetic reasons, to have lock > and predicate lock lines together. Yes, that is the only reason. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > On 13.06.2011 21:31, Tom Lane wrote: >> So far as I can tell, that breaks pg_upgrade (if there are any open >> prepared transactions) for no redeeming social benefit. > Surely pg_upgrade can't work anyway if there's any open prepared > transactions in the database. We're not going to guarantee to keep all > the data structures we write in two-phase state files unchanged over > major releases. If pg_upgrade is not checking for prepared transcations > at the moment, such a check should probably should be added. No, pg_upgrade should not be unilaterally refusing that. The correct way to deal with this consideration is to change the TWOPHASE_MAGIC number when we make a change in on-disk 2PC state. Which wasn't done in the SSI patch. We can either change that now, or undo the unnecessary change in existing RM IDs. I vote for the latter. regards, tom lane
On Mon, Jun 13, 2011 at 03:33:24PM -0400, Tom Lane wrote: > We can either change that now, or undo the > unnecessary change in existing RM IDs. I vote for the latter. Sounds good to me. I'd offer a patch, but it'd probably take you longer to apply than to make the change yourself. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > > On 13.06.2011 21:31, Tom Lane wrote: > >> So far as I can tell, that breaks pg_upgrade (if there are any open > >> prepared transactions) for no redeeming social benefit. > > > Surely pg_upgrade can't work anyway if there's any open prepared > > transactions in the database. We're not going to guarantee to keep all > > the data structures we write in two-phase state files unchanged over > > major releases. If pg_upgrade is not checking for prepared transcations > > at the moment, such a check should probably should be added. > > No, pg_upgrade should not be unilaterally refusing that. The correct > way to deal with this consideration is to change the TWOPHASE_MAGIC > number when we make a change in on-disk 2PC state. Which wasn't done > in the SSI patch. We can either change that now, or undo the > unnecessary change in existing RM IDs. I vote for the latter. Uh, isn't there some physical files in pg_twophase/ that stick around to keep prepared transactions --- if so, pg_upgrade does not copy them from the old cluster to the new one. I am also hesistant to do so because there might be data in there that isn't portable. I like the idea of adding a check, I assume by reading pg_prepared_xact(). -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > Tom Lane wrote: >> No, pg_upgrade should not be unilaterally refusing that. > Uh, isn't there some physical files in pg_twophase/ that stick around to > keep prepared transactions --- if so, pg_upgrade does not copy them from > the old cluster to the new one. I am also hesistant to do so because > there might be data in there that isn't portable. This argument seems a tad peculiar, since the *entire* *point* of pg_upgrade is to push physical files from one installation into another even though compatibility isn't guaranteed. It is the program's duty to understand enough to know whether it can transport the cluster's state safely. Not to arbitrarily discard state because it might possibly not be transportable. regards, tom lane
On 13.06.2011 22:33, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> On 13.06.2011 21:31, Tom Lane wrote: >>> So far as I can tell, that breaks pg_upgrade (if there are any open >>> prepared transactions) for no redeeming social benefit. > >> Surely pg_upgrade can't work anyway if there's any open prepared >> transactions in the database. We're not going to guarantee to keep all >> the data structures we write in two-phase state files unchanged over >> major releases. If pg_upgrade is not checking for prepared transcations >> at the moment, such a check should probably should be added. > > No, pg_upgrade should not be unilaterally refusing that. The correct > way to deal with this consideration is to change the TWOPHASE_MAGIC > number when we make a change in on-disk 2PC state. Which wasn't done > in the SSI patch. We can either change that now, or undo the > unnecessary change in existing RM IDs. I vote for the latter. Ok, I've renumbered the existing RMs back the way they were. I nevertheless don't think it's worthwhile to try to migrate 2pc state files in pg_upgrade. More than likely, it's a mistake on part of the admin anyway if there is a prepared transaction open at that point. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Ok, I've renumbered the existing RMs back the way they were. Don't we also need something like the attached? -Kevin
Вложения
On 14.06.2011 15:14, Kevin Grittner wrote: > Heikki Linnakangas wrote: > >> Ok, I've renumbered the existing RMs back the way they were. > > Don't we also need something like the attached? Yes. I just committed a fix for that after noticing that the buildfarm didn't like it. Sorry.. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Tom Lane wrote: > >> No, pg_upgrade should not be unilaterally refusing that. > > > Uh, isn't there some physical files in pg_twophase/ that stick around to > > keep prepared transactions --- if so, pg_upgrade does not copy them from > > the old cluster to the new one. I am also hesistant to do so because > > there might be data in there that isn't portable. > > This argument seems a tad peculiar, since the *entire* *point* of > pg_upgrade is to push physical files from one installation into another > even though compatibility isn't guaranteed. It is the program's duty to > understand enough to know whether it can transport the cluster's state > safely. Not to arbitrarily discard state because it might possibly not > be transportable. Well, pg_upgrade succeeds because it does as little as necessary to do the migration, relying on pg_dump to do much of the migration work at the catalog level. pg_upgrade tries to be involved as little as possible with the Postgres code so it doesn't have to be changed regularly between major versions. The prepared transaction case seems ugly enough that we don't want pg_upgrade to have to check every major release if anything changed about the data stored in prepared transactions. This is the same reason pg_upgrade doesn't transfer WAL files from the old cluster, just pg_clog files (which rarely changes its format). -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian wrote: > > This argument seems a tad peculiar, since the *entire* *point* of > > pg_upgrade is to push physical files from one installation into another > > even though compatibility isn't guaranteed. It is the program's duty to > > understand enough to know whether it can transport the cluster's state > > safely. Not to arbitrarily discard state because it might possibly not > > be transportable. > > Well, pg_upgrade succeeds because it does as little as necessary to do > the migration, relying on pg_dump to do much of the migration work at > the catalog level. pg_upgrade tries to be involved as little as > possible with the Postgres code so it doesn't have to be changed > regularly between major versions. > > The prepared transaction case seems ugly enough that we don't want > pg_upgrade to have to check every major release if anything changed > about the data stored in prepared transactions. This is the same reason > pg_upgrade doesn't transfer WAL files from the old cluster, just pg_clog > files (which rarely changes its format). I have applied the attached pg_upgrade patch to head and 9.1 to fail if prepared transactions are in the old or new cluster. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c new file mode 100644 index fdec6e3..376d25a *** a/contrib/pg_upgrade/check.c --- b/contrib/pg_upgrade/check.c *************** static void check_old_cluster_has_new_cl *** 16,21 **** --- 16,22 ---- static void check_locale_and_encoding(ControlData *oldctrl, ControlData *newctrl); static void check_is_super_user(ClusterInfo *cluster); + static void check_for_prepared_transactions(ClusterInfo *cluster); static void check_for_isn_and_int8_passing_mismatch(ClusterInfo *cluster); static void check_for_reg_data_type_usage(ClusterInfo *cluster); *************** check_old_cluster(bool live_check, *** 65,70 **** --- 66,72 ---- * Check for various failure cases */ check_is_super_user(&old_cluster); + check_for_prepared_transactions(&old_cluster); check_for_reg_data_type_usage(&old_cluster); check_for_isn_and_int8_passing_mismatch(&old_cluster); *************** check_new_cluster(void) *** 117,122 **** --- 119,125 ---- get_db_and_rel_infos(&new_cluster); check_new_cluster_is_empty(); + check_for_prepared_transactions(&new_cluster); check_old_cluster_has_new_cluster_dbs(); check_loadable_libraries(); *************** check_is_super_user(ClusterInfo *cluster *** 507,512 **** --- 510,545 ---- /* + * check_for_prepared_transactions() + * + * Make sure there are no prepared transactions because the storage format + * might have changed. + */ + static void + check_for_prepared_transactions(ClusterInfo *cluster) + { + PGresult *res; + PGconn *conn = connectToServer(cluster, "template1"); + + prep_status("Checking for prepared transactions"); + + res = executeQueryOrDie(conn, + "SELECT * " + "FROM pg_catalog.pg_prepared_xact()"); + + if (PQntuples(res) != 0) + pg_log(PG_FATAL, "The %s cluster contains prepared transactions\n", + CLUSTER_NAME(cluster)); + + PQclear(res); + + PQfinish(conn); + + check_ok(); + } + + + /* * check_for_isn_and_int8_passing_mismatch() * * contrib/isn relies on data type int8, and in 8.4 int8 can now be passed