Обсуждение: Sigh, we need an initdb
I just noticed that we had not one, but two commits in 9.4 that added fields to pg_control. And neither one changed PG_CONTROL_VERSION. This is inexcusable sloppiness on the part of the committers involved, but the question is what do we do now? Quick experimentation says that you don't really get to the point of noticing this if you try to start a 9.4 postmaster against 9.3 database or vice versa, because the postmaster will look at the PG_VERSION file first. However, pg_resetxlog and pg_controldata don't do that, so you could get misleading results from the wrong pg_controldata version or potentially screw yourself entirely with the wrong pg_resetxlog, if you failed to interpret their warnings about wrong CRC values correctly. I think we could possibly ship 9.4 without fixing this, but it would be imprudent. Anyone think differently? Of course, if we do fix this then the door opens for pushing other initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition that I was looking at when I noticed this, or the pg_lsn catalog additions that were being discussed a couple weeks ago. regards, tom lane
On 06/04/2014 11:52 AM, Tom Lane wrote: > I think we could possibly ship 9.4 without fixing this, but it would be > imprudent. Anyone think differently? > > Of course, if we do fix this then the door opens for pushing other > initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition > that I was looking at when I noticed this, or the pg_lsn catalog > additions that were being discussed a couple weeks ago. It certainly seems that if we are going to initdb anyway, let's do it with approved features that got kicked (assuming) only because they would cause an initdb. JD > > regards, tom lane > > -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc Political Correctness is for cowards.
On 06/04/2014 08:56 PM, Joshua D. Drake wrote: > > On 06/04/2014 11:52 AM, Tom Lane wrote: > >> I think we could possibly ship 9.4 without fixing this, but it would be >> imprudent. Anyone think differently? >> >> Of course, if we do fix this then the door opens for pushing other >> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition >> that I was looking at when I noticed this, or the pg_lsn catalog >> additions that were being discussed a couple weeks ago. > > It certainly seems that if we are going to initdb anyway, let's do it > with approved features that got kicked (assuming) only because they > would cause an initdb. agreed there - I dont think the "no initdb rule during BETA" really buys us that much these days. If people test our betas at all they do on scratch boxes in development/staging, i really doubt that (especially given the .0 history we had in the last years) people really move -BETA installs to production or expect to do so. Stefan
<p dir="ltr"><br /> On Jun 4, 2014 8:52 PM, "Tom Lane" <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br/> ><br /> > I just noticed that we had not one, but two commits in 9.4 that added<br /> > fields to pg_control. And neither one changed PG_CONTROL_VERSION.<br /> > This is inexcusable sloppiness on the part of the committersinvolved,<br /> > but the question is what do we do now?<br /> ><br /> > Quick experimentation says thatyou don't really get to the point of<br /> > noticing this if you try to start a 9.4 postmaster against 9.3 databaseor<br /> > vice versa, because the postmaster will look at the PG_VERSION file first.<br /> > However, pg_resetxlogand pg_controldata don't do that, so you could get<br /> > misleading results from the wrong pg_controldataversion or potentially<br /> > screw yourself entirely with the wrong pg_resetxlog, if you failed to<br/> > interpret their warnings about wrong CRC values correctly.<br /> ><br /> > I think we could possibly ship9.4 without fixing this, but it would be<br /> > imprudent. Anyone think differently?<p dir="ltr">Shipping it withoutfixing that seems like a horrible idea. We should either force an initdb or revert those changes. <p dir="ltr">I'dvote for forcing initdb. If we push it off we'll be paying for it for the coming 5 years. <br /><p dir="ltr">>Of course, if we do fix this then the door opens for pushing other<br /> > initdb-forcing fixes into 9.4beta2,such as the LOBLKSIZE addition<br /> > that I was looking at when I noticed this, or the pg_lsn catalog<br />> additions that were being discussed a couple weeks ago.<br /><p dir="ltr">+1. We should of course evaluate those patchesindividually but the initdb argument against them isn't valid, so if they are otherwise good, we should accept them.<p dir="ltr">/Magnus <br />
On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I just noticed that we had not one, but two commits in 9.4 that added > fields to pg_control. And neither one changed PG_CONTROL_VERSION. > This is inexcusable sloppiness on the part of the committers involved, > but the question is what do we do now? I think it would be an awfully good idea to think about what we could put into the buildfarm, the git repository, or the source tree to get some automatic notification when somebody screws up this way (or the xlog header magic, or catversion). The first of those two screw-ups (by me) was 11 months ago today; it's pretty scary that we're only just now noticing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote > On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane < > tgl@.pa > > wrote: >> I just noticed that we had not one, but two commits in 9.4 that added >> fields to pg_control. And neither one changed PG_CONTROL_VERSION. >> This is inexcusable sloppiness on the part of the committers involved, >> but the question is what do we do now? > > I think it would be an awfully good idea to think about what we could > put into the buildfarm, the git repository, or the source tree to get > some automatic notification when somebody screws up this way (or the > xlog header magic, or catversion). The first of those two screw-ups > (by me) was 11 months ago today; it's pretty scary that we're only > just now noticing. Not withstanding Tom's comments on the topic a regression test could work here. There was just a recent "leakproof" function discovery that resulted from a regression test that compared all known leakproof functions to those in the current catalog. When the test fails there should be additional instruction - like "Please alter this output file AND bump PG_CONTROL_VERSION!" David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Sigh-we-need-an-initdb-tp5806058p5806071.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On 06/04/2014 03:50 PM, Robert Haas wrote: > On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I just noticed that we had not one, but two commits in 9.4 that added >> fields to pg_control. And neither one changed PG_CONTROL_VERSION. >> This is inexcusable sloppiness on the part of the committers involved, >> but the question is what do we do now? > I think it would be an awfully good idea to think about what we could > put into the buildfarm, the git repository, or the source tree to get > some automatic notification when somebody screws up this way (or the > xlog header magic, or catversion). The first of those two screw-ups > (by me) was 11 months ago today; it's pretty scary that we're only > just now noticing. > I agree it's scary but in a few minutes thinking about it I haven't been able to come up with a good way of checking it. Maybe we could build sizeof(ControlData) into the version number, so instead of 937 we'd have 937nnnnn. Then we could check the nnnnn against what we know we is the size. I realize this isn't perfect, but might be better than nothing. cheers andrew
Hi, On 2014-06-04 14:52:35 -0400, Tom Lane wrote: > I think we could possibly ship 9.4 without fixing this, but it would be > imprudent. Anyone think differently? Agreed. Additionally I also agree with Stefan that the price of a initdb during beta isn't that high these days. > Of course, if we do fix this then the door opens for pushing other > initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition > that I was looking at when I noticed this, or the pg_lsn catalog > additions that were being discussed a couple weeks ago. Other things I'd like to change in that case: * rename pg_replication_slots.xmin to something else. After the replication slot patch went in, in another patch's reviewyou/Tom objected that xmin isn't the best name. The only problem being that I still don't have a better idea than myoriginal 'data_xmin' which Robert disliked. * Standardize on either slot_name for the replication slot's name. Currently some functions have a parameter named 'slotname'but all columnnames (from SRFs) are slot_name. That's not really bad since the parameter names don't really meanmuch, but if we can we should fix it imo. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-06-04 14:52:35 -0400, Tom Lane wrote: >> I think we could possibly ship 9.4 without fixing this, but it would be >> imprudent. Anyone think differently? > Agreed. Additionally I also agree with Stefan that the price of a initdb > during beta isn't that high these days. Yeah, if nothing else it gives testers another opportunity to exercise pg_upgrade ;-). The policy about post-beta1 initdb is "avoid if practical", not "avoid at all costs". Actually, that statement makes me realize that if we fix PG_CONTROL_VERSION then it's a good idea to *also* do some regular catalog changes, or at least bump catversion. Otherwise pg_upgrade won't be able to cope with upgrading non-default tablespaces in beta1 installations. For the moment I'll just go bump PG_CONTROL_VERSION, assuming that we have enough other things on the table that at least one of them will result in a catversion bump before beta2. > Other things I'd like to change in that case: I have no objection to these as long as we can get some consensus on the new names (and personally I don't much care what those are, but I agree "xmin" for a user column is a bad idea). regards, tom lane
On Wed, Jun 4, 2014 at 9:52 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Other things I'd like to change in that case: Fwiw I tried to use the pg_lsn data type the other day and pretty much immediately ran into the lack of the btree operator class. Pretty much the first thing I did when I loaded the data was "select ... order by lsn". -- greg
On 2014-06-04 17:03:52 -0400, Tom Lane wrote: > Actually, that statement makes me realize that if we fix > PG_CONTROL_VERSION then it's a good idea to *also* do some regular catalog > changes, or at least bump catversion. Otherwise pg_upgrade won't be able to > cope with upgrading non-default tablespaces in beta1 installations. Heh. That's not a particularly nice property, although I can see where it's coming from. Probably not really problematic because catversion updates are so much more frequent. > For the moment I'll just go bump PG_CONTROL_VERSION, assuming that we have > enough other things on the table that at least one of them will result in > a catversion bump before beta2. The slot_name vs slotname thing seems uncontroversial enough since slot_name is the thing that already appears everywhere in the docs and it's what we'd agreed upon onlist. It's just that not everything got the message. > I have no objection to these as long as we can get some consensus on the > new names (and personally I don't much care what those are, but I agree > "xmin" for a user column is a bad idea). I won't do anything about this one though until we've agreed upon something. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Greg Stark <stark@mit.edu> writes: > Fwiw I tried to use the pg_lsn data type the other day and pretty much > immediately ran into the lack of the btree operator class. Pretty much > the first thing I did when I loaded the data was "select ... order by > lsn". Agreed, now that we're going to force an initdb anyway, that fix is high priority. I'll take a look at the patch shortly. regards, tom lane
On Wed, Jun 04, 2014 at 09:16:36PM +0200, Stefan Kaltenbrunner wrote: > On 06/04/2014 08:56 PM, Joshua D. Drake wrote: > > On 06/04/2014 11:52 AM, Tom Lane wrote: > >> I think we could possibly ship 9.4 without fixing this, but it would be > >> imprudent. Anyone think differently? > >> > >> Of course, if we do fix this then the door opens for pushing other > >> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition > >> that I was looking at when I noticed this, or the pg_lsn catalog > >> additions that were being discussed a couple weeks ago. > > > > It certainly seems that if we are going to initdb anyway, let's do it > > with approved features that got kicked (assuming) only because they > > would cause an initdb. > > agreed there - I dont think the "no initdb rule during BETA" really buys > us that much these days. If people test our betas at all they do on > scratch boxes in development/staging, i really doubt that (especially > given the .0 history we had in the last years) people really move -BETA > installs to production or expect to do so. +1. You need a microscope to see the gain from imposing that rule. Even if people do move beta installs to production, that's just a pg_upgrade away. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
On Wed, Jun 4, 2014 at 4:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > On 06/04/2014 03:50 PM, Robert Haas wrote: >> On Wed, Jun 4, 2014 at 2:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I just noticed that we had not one, but two commits in 9.4 that added >>> fields to pg_control. And neither one changed PG_CONTROL_VERSION. >>> This is inexcusable sloppiness on the part of the committers involved, >>> but the question is what do we do now? >> >> I think it would be an awfully good idea to think about what we could >> put into the buildfarm, the git repository, or the source tree to get >> some automatic notification when somebody screws up this way (or the >> xlog header magic, or catversion). The first of those two screw-ups >> (by me) was 11 months ago today; it's pretty scary that we're only >> just now noticing. >> > > I agree it's scary but in a few minutes thinking about it I haven't been > able to come up with a good way of checking it. Maybe we could build > sizeof(ControlData) into the version number, so instead of 937 we'd have > 937nnnnn. Then we could check the nnnnn against what we know we is the size. > I realize this isn't perfect, but might be better than nothing. I think that's worth considering. Another idea is: suppose we set up a PostgreSQL database somewhere that contained information about what controldata layout corresponded to what control version: CREATE TABLE control_formats (version_number integer, data_types text[]); Every time it runs, it checks out the latest source code. It checks whether the control version is already present in the table; if so, it verifies that the data types match. If they don't, it makes something turn red. If the control version isn't present yet, it inserts whatever types it sees as the definitive record of what the new version number means. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Jun 4, 2014 at 4:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> I agree it's scary but in a few minutes thinking about it I haven't been >> able to come up with a good way of checking it. Maybe we could build >> sizeof(ControlData) into the version number, so instead of 937 we'd have >> 937nnnnn. Then we could check the nnnnn against what we know we is the size. >> I realize this isn't perfect, but might be better than nothing. > I think that's worth considering. Another idea is: suppose we set up > a PostgreSQL database somewhere that contained information about what > controldata layout corresponded to what control version: > CREATE TABLE control_formats (version_number integer, data_types text[]); > Every time it runs, it checks out the latest source code. It checks > whether the control version is already present in the table; if so, it > verifies that the data types match. If they don't, it makes something > turn red. If the control version isn't present yet, it inserts > whatever types it sees as the definitive record of what the new > version number means. That seems possibly workable. Merely checking sizeof(ControlData) isn't real helpful since (1) it might not catch field additions because of alignment padding, and (2) it's not clear that that number is, or should be, entirely architecture-independent. But I think a list of the C data type names of the fields (and maybe the fields' own names, for good measure) would be reasonably robust. Not sure how we'd scale this idea to catversion or WAL version, but I think both of those are less significant hazards. regards, tom lane
I wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> I think that's worth considering. Another idea is: suppose we set up >> a PostgreSQL database somewhere that contained information about what >> controldata layout corresponded to what control version: > That seems possibly workable. BTW, a possibly-easier-to-implement idea is to git diff pg_control.h against the previous release branch. Any non-comment diffs (or, really, anything at all except a copyright date change) should excite a warning unless a change to the PG_CONTROL_VERSION line is included. We'd probably need a way to shut it up after a simple comment change, but otherwise not a lot of infrastructure needed except a shell script. regards, tom lane
Stefan Kaltenbrunner wrote > On 06/04/2014 08:56 PM, Joshua D. Drake wrote: >> >> On 06/04/2014 11:52 AM, Tom Lane wrote: >> >>> I think we could possibly ship 9.4 without fixing this, but it would be >>> imprudent. Anyone think differently? >>> >>> Of course, if we do fix this then the door opens for pushing other >>> initdb-forcing fixes into 9.4beta2, such as the LOBLKSIZE addition >>> that I was looking at when I noticed this, or the pg_lsn catalog >>> additions that were being discussed a couple weeks ago. >> >> It certainly seems that if we are going to initdb anyway, let's do it >> with approved features that got kicked (assuming) only because they >> would cause an initdb. > > agreed there - I dont think the "no initdb rule during BETA" really buys > us that much these days. If people test our betas at all they do on > scratch boxes in development/staging, i really doubt that (especially > given the .0 history we had in the last years) people really move -BETA > installs to production or expect to do so. If we are planning on keeping this rule, which it seems like at least a few people feel is too stringent, maybe we can consider releasing an Alpha version and communicate the expectation that an initdb will be required to go from the alpha to beta1. Then hopefully, but not certainly, no initdb needed once in the beta phase. Basically convert beta1 into an alpha with that single policy/expectation change. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Sigh-we-need-an-initdb-tp5806058p5806137.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
David G Johnston <david.g.johnston@gmail.com> writes: > If we are planning on keeping this rule, which it seems like at least a few > people feel is too stringent, maybe we can consider releasing an Alpha > version and communicate the expectation that an initdb will be required to > go from the alpha to beta1. Then hopefully, but not certainly, no initdb > needed once in the beta phase. Basically convert beta1 into an alpha with > that single policy/expectation change. I think that would just amount to adding a month of dead time in what is already a very long beta cycle. Our past experience with releasing things called "alphas" has been that people don't test them. regards, tom lane