Обсуждение: sloppy back-patching of column-privilege leak
Branch: master [804b6b6db] 2015-01-28 12:31:30 -0500 Branch: REL9_4_STABLE Release: REL9_4_1 [3cc74a3d6] 2015-01-28 12:32:06 -0500 Branch: REL9_3_STABLE Release: REL9_3_6 [4b9874216] 2015-01-28 12:32:39 -0500 Branch: REL9_2_STABLE Release: REL9_2_10 [d49f84b08] 2015-01-28 12:32:56 -0500 Branch: REL9_1_STABLE Release: REL9_1_15 [9406884af] 2015-01-28 12:33:15 -0500 Branch: REL9_0_STABLE Release: REL9_0_19 [3a2063369] 2015-01-28 12:33:29 -0500 In 9.2 and newer branches, this commit makes substantial changes to execMain.c. In 9.1 and 9.0, though, the change to that file is just this: --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -95,6 +95,15 @@ static void intorel_receive(TupleTableSlot *slot, DestReceiver *self);static void intorel_shutdown(DestReceiver *self);static void intorel_destroy(DestReceiver *self); +/* + * Note that this macro also exists in commands/trigger.c. There does not + * appear to be any good header to put it into, given the structures that + * it uses, so we let them be duplicated. Be sure to update both if one needs + * to be changed, however. + */ +#define GetModifiedColumns(relinfo, estate) \ + (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols) +/* end of local decls */ We shouldn't be adding a macro to that file that isn't used anywhere. Either that file needs more substantial changes, or it doesn't need this, either. What gives? Besides the sloppiness of back-patching in that one macro and nothing else, this is a huge fraction of the patch that's just *gone* in the 9.0 and 9.1 branches, and there's not so much as a hint about it in the commit message, which is pretty astonishing to me. What's even more troubling is the new regression tests. In master, this commit added this: +INSERT INTO t1 (c1, c2) VALUES (1, 1); -- fail, but row not shown +ERROR: duplicate key value violates unique constraint "t1_pkey" +UPDATE t1 SET c2 = 1; -- fail, but row not shown +ERROR: duplicate key value violates unique constraint "t1_pkey" +INSERT INTO t1 (c1, c2) VALUES (null, null); -- fail, but see columns being inserted +ERROR: null value in column "c1" violates not-null constraint +DETAIL: Failing row contains (c1, c2) = (null, null). +INSERT INTO t1 (c3) VALUES (null); -- fail, but see columns being inserted or have SELECT +ERROR: null value in column "c1" violates not-null constraint +DETAIL: Failing row contains (c1, c3) = (null, null). +INSERT INTO t1 (c1) VALUES (5); -- fail, but see columns being inserted or have SELECT +ERROR: null value in column "c2" violates not-null constraint +DETAIL: Failing row contains (c1) = (5). +UPDATE t1 SET c3 = 10; -- fail, but see columns with SELECT rights, or being modified +ERROR: new row for relation "t1" violates check constraint "t1_c3_check" +DETAIL: Failing row contains (c1, c3) = (1, 10). Notice how the comments match the actual behavior. But in 9.0, you've got the SAME COMMENTS with DIFFERENT BEHAVIOR: +INSERT INTO t1 (c1, c2) VALUES (1, 1); -- fail, but row not shown +ERROR: duplicate key value violates unique constraint "t1_pkey" +UPDATE t1 SET c2 = 1; -- fail, but row not shown +ERROR: duplicate key value violates unique constraint "t1_pkey" +INSERT INTO t1 (c1, c2) VALUES (null, null); -- fail, but see columns being inserted +ERROR: null value in column "c1" violates not-null constraint +INSERT INTO t1 (c3) VALUES (null); -- fail, but see columns being inserted or have SELECT +ERROR: null value in column "c1" violates not-null constraint +INSERT INTO t1 (c1) VALUES (5); -- fail, but see columns being inserted or have SELECT +ERROR: null value in column "c2" violates not-null constraint +UPDATE t1 SET c3 = 10; -- fail, but see columns with SELECT rights, or being modified +ERROR: new row for relation "t1" violates check constraint "t1_c3_check" The comments say that you'll see the relevant columns, but you don't. Since when is it OK to check things into our regression tests where the comments say one thing and the behavior is something else? I don't even know what to say about this. I cannot understand how you can back-patch something like this and fail to notice that the regression tests give different output on different branches, and that that output is inconsistent with the comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert, * Robert Haas (robertmhaas@gmail.com) wrote: > In 9.2 and newer branches, this commit makes substantial changes to > execMain.c. In 9.1 and 9.0, though, the change to that file is just > this: > > --- a/src/backend/executor/execMain.c > +++ b/src/backend/executor/execMain.c > @@ -95,6 +95,15 @@ static void intorel_receive(TupleTableSlot *slot, > DestReceiver *self); > static void intorel_shutdown(DestReceiver *self); > static void intorel_destroy(DestReceiver *self); > > +/* > + * Note that this macro also exists in commands/trigger.c. There does not > + * appear to be any good header to put it into, given the structures that > + * it uses, so we let them be duplicated. Be sure to update both if one needs > + * to be changed, however. > + */ > +#define GetModifiedColumns(relinfo, estate) \ > + (rt_fetch((relinfo)->ri_RangeTableIndex, > (estate)->es_range_table)->modifiedCols) > + > /* end of local decls */ > > We shouldn't be adding a macro to that file that isn't used anywhere. Ah, yes, agreed, that didn't need to be added in the older branches. > Either that file needs more substantial changes, or it doesn't need > this, either. What gives? It doesn't need it. The older branches don't produce as detailed errors and because of that there weren't any further changes to be made. Including the extraneous #define in those older branches was certainly a mistake and I'll correct it. > Besides the sloppiness of back-patching in > that one macro and nothing else, this is a huge fraction of the patch > that's just *gone* in the 9.0 and 9.1 branches, and there's not so > much as a hint about it in the commit message, which is pretty > astonishing to me. Right, 9.0 and 9.1 don't have as detailed messages and so there wasn't as much to do in those older branches. I was well aware of that and I could have sworn that I included something in the commit messages.. Looks like I did, but not in a way which was as clear as it should have been. Specifically, in those older branches, the commit message only talks about the functions which exist in those branches- BuildIndexValueDescription and ri_ReportViolation. The commit messages for 9.2 and beyond also reference ExecBuildSlotValueDescription, which is what you're getting at. In hindsight, I agree that doing just that wasn't sufficient and thinking on it now I realize that, while having different commit messages for each branch may make sense if you're only looking at that branch, it ends up being confusing for folks following the overall project as they likely look at just the subject of each commit and expect the contents for every branch to be the same. To that point, I'll try to be clearer when there's a difference in commit message for each branch, or simply include everything for every branch in an identical commit message across all of them. > Notice how the comments match the actual behavior. But in 9.0, you've > got the SAME COMMENTS with DIFFERENT BEHAVIOR: Good point. The comments clearly were for the master-based behavior but the older branches don't have the same detail and so they should have been updated. I knew that they were different, and understood and expected those differences, made the specific changes which were appropriate for each branch, but missed updating the comments. > The comments say that you'll see the relevant columns, but you don't. > Since when is it OK to check things into our regression tests where > the comments say one thing and the behavior is something else? I'm a bit confused by this- I certainly didn't intend this mistake to be implying that our policy on comments and the code they're associated with should be changed to allow that they don't match up, nor do I advocate such a position now. > I don't even know what to say about this. I cannot understand how you > can back-patch something like this and fail to notice that the > regression tests give different output on different branches, and that > that output is inconsistent with the comments. I was quite aware that the regression tests were different on different branches, as I knew (which the commit messages show) that only a subset of the patch was relevant for the older branches since they didn't include the detail where the security issue originated from. I certainly reviewed the regression tests but obviously missed that the comments ended up out-dated due to the back-branch capabilities. Thanks for the review and comments. I'll remove the extraneous #define, the comment change which was made to the other #define (as it's not relevant since the #define is only located in one place) and fix the regression test comments to match the behavior in the older branches. Thanks again! Stephen
Stephen Frost wrote: > > Besides the sloppiness of back-patching in > > that one macro and nothing else, this is a huge fraction of the patch > > that's just *gone* in the 9.0 and 9.1 branches, and there's not so > > much as a hint about it in the commit message, which is pretty > > astonishing to me. > > Right, 9.0 and 9.1 don't have as detailed messages and so there wasn't > as much to do in those older branches. I was well aware of that and I > could have sworn that I included something in the commit messages.. > Looks like I did, but not in a way which was as clear as it should have > been. Specifically, in those older branches, the commit message only > talks about the functions which exist in those branches- > BuildIndexValueDescription and ri_ReportViolation. The commit messages > for 9.2 and beyond also reference ExecBuildSlotValueDescription, which > is what you're getting at. > > In hindsight, I agree that doing just that wasn't sufficient and > thinking on it now I realize that, while having different commit > messages for each branch may make sense if you're only looking at that > branch, it ends up being confusing for folks following the overall > project as they likely look at just the subject of each commit and > expect the contents for every branch to be the same. To that point, > I'll try to be clearer when there's a difference in commit message for > each branch, or simply include everything for every branch in an > identical commit message across all of them. FWIW using different commit messages for different branches is a mistake. The implicit policy we have is that there is one message, identical for all branches, that describe how the patch differs in each branch whenever necessary. Note that the git_changelog output that Robert pasted is not verbatim from the tool; what it actually prints is three separate entries, one for each different message you used, which is not what is supposed to occur. I say this policy is implicit because I don't recall it being spelled out anywhere, but since it's embodied in git_changelog's working principle it's important we stick to it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro, * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > FWIW using different commit messages for different branches is a > mistake. The implicit policy we have is that there is one message, > identical for all branches, that describe how the patch differs in each > branch whenever necessary. Note that the git_changelog output that > Robert pasted is not verbatim from the tool; what it actually prints is > three separate entries, one for each different message you used, which > is not what is supposed to occur. Ok, thanks. That's certainly easy enough to do and I'll do so in the future. I could have sworn I'd seen cases where further clarification was done for branch-specific commits but perhaps something else was different there. > I say this policy is implicit because I don't recall it being spelled > out anywhere, but since it's embodied in git_changelog's working > principle it's important we stick to it. I have to admit that I've never really used git_changelog. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: >> FWIW using different commit messages for different branches is a >> mistake. The implicit policy we have is that there is one message, >> identical for all branches, that describe how the patch differs in each >> branch whenever necessary. Note that the git_changelog output that >> Robert pasted is not verbatim from the tool; what it actually prints is >> three separate entries, one for each different message you used, which >> is not what is supposed to occur. > Ok, thanks. That's certainly easy enough to do and I'll do so in the > future. I could have sworn I'd seen cases where further clarification > was done for branch-specific commits but perhaps something else was > different there. Some people do it differently when the per-branch commits are very much different, but what Alvaro suggests is certainly handy when it comes time to make release notes ;-). >> I say this policy is implicit because I don't recall it being spelled >> out anywhere, but since it's embodied in git_changelog's working >> principle it's important we stick to it. > I have to admit that I've never really used git_changelog. It's pretty handy. The output for a couple of recent commits looks like Author: Noah Misch <noah@leadboat.com> Branch: master [a7a4adcf8] 2015-02-06 23:14:27 -0500 Assert(PqCommReadingMsg) in pq_peekbyte(). Interrupting pq_recvbuf() can break protocol sync, so its callers all deserve this assertion. The one pq_peekbyte() caller suffices already. Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> Branch: master [ff16b40f8] 2015-02-06 11:26:50 +0200 Branch: REL9_4_STABLE [3bc4c6942] 2015-02-06 11:27:12 +0200 Branch: REL9_3_STABLE [5f0ba4abb] 2015-02-06 11:32:16 +0200 Branch: REL9_2_STABLE [2af568c6b] 2015-02-06 11:32:37 +0200 Branch: REL9_1_STABLE [0d36d9f2b] 2015-02-06 11:32:42 +0200 Report WAL flush, not insert, position in replication IDENTIFY_SYSTEM When beginning streaming replication, the clientusually issues the IDENTIFY_SYSTEM command, which used to return the current WAL insert position. That's not suitablefor the intended purpose of that field, however. pg_receivexlog uses it to start replication from the reported point, but if it hasn't been flushed to disk yet, it will fail. Change IDENTIFY_SYSTEM to report the flush positioninstead. Backpatch to 9.1 and above. 9.0 doesn't report any WAL position. Heikki's five commits got merged into one entry because they had identical log-message texts and were committed on the same day. Further back in the output you find things like Author: Peter Eisentraut <peter_e@gmx.net> Branch: master [1332bbb30] 2015-02-01 22:36:44 -0500 Branch: REL9_4_STABLE Release: REL9_4_1 [0ca8cc581] 2015-02-01 22:40:13 -0500 Branch: REL9_3_STABLE Release: REL9_3_6 [6b9b705c9] 2015-02-01 22:40:25 -0500 Branch: REL9_2_STABLE Release: REL9_2_10 [040f5ef50] 2015-02-01 22:40:36 -0500 Branch: REL9_1_STABLE Release: REL9_1_15 [2b0d33599] 2015-02-01 22:40:45 -0500 Branch: REL9_0_STABLE Release: REL9_0_19 [b09ca8834] 2015-02-01 22:40:53 -0500 doc: Improve claim about location of pg_service.conf The previous wording claimed that the file was always in /etc,but of course this varies with the installation layout. Write instead that it can be found via `pg_config --sysconfdir`. Even though this is still somewhat incorrect because it doesn't account of moved installations, it atleast conveys that the location depends on the installation. which show the first actual release incorporating each patch. So even if you're not writing release notes, it's mighty handy for identifying when/whether a given patch has shipped. I tend to run this once a week or so and keep the output around in a text file for quick searching. regards, tom lane
* Stephen Frost (sfrost@snowman.net) wrote: > Thanks for the review and comments. I'll remove the extraneous #define, > the comment change which was made to the other #define (as it's not > relevant since the #define is only located in one place) and fix the > regression test comments to match the behavior in the older branches. This has been done, though as noted in the commit, I simply removed the regression tests as they weren't relevant in those branches. Thanks! Stephen