Обсуждение: Re: pgsql: Move handling of database properties from pg_dumpall into pg_dum
I wrote: > Commit 4bd371f6f's hack to emit "SET default_transaction_read_only = off" > is gone: we now dodge that problem by the expedient of not issuing ALTER > DATABASE SET commands until after reconnecting to the target database. > Therefore, such settings won't apply during the restore session. The buildfarm just pointed out an issue in that: the delayed SET commands might affect the interpretation of data during the restore session. Specifically, I see failures like this on machines where the prevailing locale isn't C or US: pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 4871; 2618 34337 RULE rtest_emp rtest_emp_del pgbf pg_restore: [archiver (db)] could not execute query: ERROR: invalid input syntax for type money: "$0.00" LINE 3: ... ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"... ^ Command was: CREATE RULE "rtest_emp_del" AS ON DELETE TO "rtest_emp" DO INSERT INTO "rtest_emplog" ("ename", "who", "action", "newsal", "oldsal") VALUES ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"money", "old"."salary"); The dump dumped the money literal with '$' because we set up the regression database with ALTER DATABASE regression SET lc_monetary TO 'C'; but at the time we process the CREATE RULE during the restore, we're still running with whatever the environmental default is. Not very sure about the best solution here. Clearly, we'd better absorb these settings to some extent, but how? I thought for a bit about having pg_dump emit immediate SET operations along with the ALTER versions, for any GUCs we deem important for restore purposes. This would probably be about a 99% solution, but we could only do it for ALTER DATABASE SET not for ALTER ROLE IN DATABASE SET, because we couldn't be very sure which of the latter should apply. So it would have some edge-case failure conditions. Or we could reconnect after applying the ALTERs, ensuring that we have the expected environment. But then we're back into the problem that commit 4bd371f6f hoped to solve, namely that "ALTER DATABASE SET default_transaction_read_only = on" breaks pg_dumpall (and hence pg_upgrade). I then thought about splitting the ALTERs into two separate TOC entries, one for "desirable" GUCs (which we'd apply by reconnecting after emitting its contents), and one for "undesirables", which we would not reconnect after. That seemed a bit messy because of the need to maintain a blacklist going forward, and the possibility that we couldn't agree on what to blacklist. Then it occurred to me that none of this works anyway for parallel pg_restore --- including parallel pg_upgrade --- because the workers are going to see all the ALTERs in place. And that means that commit 4bd371f6f's hack has been broken since parallel upgrade went in, in 9.3. So at this point I'm feeling that letting pg_dumpall work around default_transaction_read_only = on is just too much of a hack, and we should forget about that consideration entirely. If we do, fixing the current buildfarm failure is about a one-line patch: just reconnect after processing DATABASE PROPERTIES. If someone were to hold a gun to my head and say "make this work", what I'd probably do is set up the desirable vs undesirable split and then arrange for the "undesirable" GUCs to be postponed to the end of the restore, after the parallel portion of the run is complete. But man, that's a lot of ugliness. I think the only reason that 4bd371f6f got in at all was that it was just a two-line kluge, and we were willing to accept that amount of ugliness to handle a corner case more nicely. The cost of solving it after the pg_dump/dumpall refactoring will be a lot higher, and I for one don't think it's worth it. Comments? regards, tom lane
I wrote: > Specifically, I see failures like this on machines where the prevailing > locale isn't C or US: > pg_restore: [archiver (db)] Error while PROCESSING TOC: > pg_restore: [archiver (db)] Error from TOC entry 4871; 2618 34337 RULE rtest_emp rtest_emp_del pgbf > pg_restore: [archiver (db)] could not execute query: ERROR: invalid input syntax for type money: "$0.00" > LINE 3: ... ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"... > ^ > Command was: CREATE RULE "rtest_emp_del" AS > ON DELETE TO "rtest_emp" DO INSERT INTO "rtest_emplog" ("ename", "who", "action", "newsal", "oldsal") > VALUES ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"money", "old"."salary"); Actually ... maybe what this is pointing out is a pre-existing deficiency in pg_dump, which is that it's failing to lock down lc_monetary during restore. Arguably, we should teach _doSetFixedOutputState to set lc_monetary to whatever prevailed in the dump session, just as we do for client_encoding. I seem now to recall some user complaints about unsafety of dump/restore for "money" values, which essentially is the problem we're seeing here. I think the reason we haven't done this already is fear of putting platform-dependent lc_monetary values into dump scripts. That's certainly an issue, but it seems a minor one: at worst, you'd have to not use --single-transaction when restoring on a different platform, so you could ignore an error from the SET command. While this would fix the specific problem we're seeing in the buildfarm, I'm thinking we'd still need to do what I said in the previous message, and change pg_dump so that the restore session will absorb ALTER DATABASE settings before proceeding. Otherwise, at minimum, we have a hazard of differing behaviors in serial and parallel restores. It might be that lc_monetary is the only GUC that makes a real difference for this purpose, but I haven't got much faith in that proposition at the moment. regards, tom lane
On Tue, Jan 23, 2018 at 8:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Specifically, I see failures like this on machines where the prevailing
> locale isn't C or US:
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 4871; 2618 34337 RULE rtest_emp rtest_emp_del pgbf
> pg_restore: [archiver (db)] could not execute query: ERROR: invalid input syntax for type money: "$0.00"
> LINE 3: ... ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"...
> ^
> Command was: CREATE RULE "rtest_emp_del" AS
> ON DELETE TO "rtest_emp" DO INSERT INTO "rtest_emplog" ("ename", "who", "action", "newsal", "oldsal")
> VALUES ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"money", "old"."salary");
Actually ... maybe what this is pointing out is a pre-existing deficiency
in pg_dump, which is that it's failing to lock down lc_monetary during
restore. Arguably, we should teach _doSetFixedOutputState to set
lc_monetary to whatever prevailed in the dump session, just as we do
for client_encoding. I seem now to recall some user complaints about
unsafety of dump/restore for "money" values, which essentially is the
problem we're seeing here.
I think the reason we haven't done this already is fear of putting
platform-dependent lc_monetary values into dump scripts. That's
certainly an issue, but it seems a minor one: at worst, you'd have
to not use --single-transaction when restoring on a different platform,
so you could ignore an error from the SET command.
While this would fix the specific problem we're seeing in the buildfarm,
I'm thinking we'd still need to do what I said in the previous message,
and change pg_dump so that the restore session will absorb ALTER DATABASE
settings before proceeding. Otherwise, at minimum, we have a hazard of
differing behaviors in serial and parallel restores. It might be that
lc_monetary is the only GUC that makes a real difference for this purpose,
but I haven't got much faith in that proposition at the moment.
Yes, restore session should absorb all the ALTER DATABASE and ALTER ROLE
IN DATABASE settings also to make sure that the target database is in same
state when the dump has started.
currently "default_transaction_read_only" is the only GUC that affects the
absorbing the restore of a database.
As you said in upthread, I feel splitting them into two _TOC entries and dump
as last commands of each database? Does it have any problem with parallel
restore?
Regards,
Hari Babu
Fujitsu Australia
Haribabu Kommi <kommi.haribabu@gmail.com> writes: > On Tue, Jan 23, 2018 at 8:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm thinking we'd still need to do what I said in the previous message, >> and change pg_dump so that the restore session will absorb ALTER DATABASE >> settings before proceeding. Otherwise, at minimum, we have a hazard of >> differing behaviors in serial and parallel restores. It might be that >> lc_monetary is the only GUC that makes a real difference for this purpose, >> but I haven't got much faith in that proposition at the moment. > Yes, restore session should absorb all the ALTER DATABASE and ALTER ROLE > IN DATABASE settings also to make sure that the target database is in same > state when the dump has started. I've pushed a patch to do that. > currently "default_transaction_read_only" is the only GUC that affects the > absorbing the restore of a database. Well, that's exactly the $64 question. Are we sure we have a complete, reliable list of which GUCs do or do not affect data restoration? I wouldn't count on it. If nothing else, extensions might have custom GUCs that we could not predict the behavior of. > As you said in upthread, I feel splitting them into two _TOC entries and > dump as last commands of each database? Does it have any problem with > parallel restore? As I said yesterday, I'm not really eager to do that. It's a lot of complexity and a continuing maintenance headache to solve a use-case that I find pretty debatable. Anyone who's actually put in "default_transaction_read_only = on" in a way that restricts their maintenance roles must have created procedures for undoing it easily; otherwise day-to-day maintenance would be a problem. Further, I don't find the original hack's distinction between pg_dump and pg_dumpall to be really convincing. Maybe that says we should resolve this by just sticking "SET default_transaction_read_only = off" into regular pg_dump output after all --- perhaps with a switch added to enable it. But I'd kind of like to see the argument why we should worry about this at all made afresh. regards, tom lane