Обсуждение: MAINTAIN privilege -- what do we need to un-revert it?
The MAINTAIN privilege was reverted during the 16 cycle because of the potential for someone to play tricks with search_path. For instance, if user foo does: CREATE FUNCTION mod7(INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS $$ BEGIN RETURN mod($1, 7); END; $$; CREATE TABLE x(i INT); CREATE UNIQUE INDEX x_mod7_idx ON x (mod7(i)); GRANT MAINTAIN ON x TO bar; Then user bar can create their own function named "bar.mod(int, int)", and "SET search_path = bar, pg_catalog", and then issue a "REINDEX x" and cause problems. There are several factors required for that to be a problem: 1. foo hasn't used a "SET search_path" clause on their function 2. bar must have the privileges to create a function somewhere 3. bar must have privileges on table x There's an argument that we should blame factor #1. Robert stated[1] that users should use SET search_path clauses on their functions, even SECURITY INVOKER functions. And I've added a search_path cache which improves the performance enough to make that more reasonable to do generally. There's also an argument that #2 is to blame. Given the realities of our system, best practice is that users shouldn't have the privileges to create objects, even in their own schema, unless required. (Joe made this suggestion in an offline discussion.) There's also an arugment that #3 is not specific to the MAINTAIN privilege. Clearly similar risks exist for other privileges, like TRIGGER. And even the INSERT privilege, in the above example, would allow bar to violate the unique constraint and corrupt the index[2]. If those arguments are still unconvincing, then the next idea is to fix the search_path for all maintenance commands[3]. I tried this during the 16 cycle, but due to timing issues it was also reverted. I can proceed with this approach again, but I'd like a clear endorsement, in case there were other reasons to doubt the approach. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/CA+TgmoYEP40iBW-A9nPfDp8AhGoekPp3aPDFzTgBUrqmfCwZzQ@mail.gmail.com [2] https://www.postgresql.org/message-id/fff566293c9165c69bb4c555da1ac02c63660664.camel@j-davis.com [3] https://www.postgresql.org/message-id/e44327179e5c9015c8dda67351c04da552066017.camel@j-davis.com
On Wed, Feb 14, 2024 at 10:20:28AM -0800, Jeff Davis wrote: > If those arguments are still unconvincing, then the next idea is to fix > the search_path for all maintenance commands[3]. I tried this during > the 16 cycle, but due to timing issues it was also reverted. I can > proceed with this approach again, but I'd like a clear endorsement, in > case there were other reasons to doubt the approach. This seemed like the approach folks were most in favor of at the developer meeting a couple weeks ago [0]. At least, that was my interpretation of the discussion. BTW I have been testing reverting commit 151c22d (i.e., un-reverting MAINTAIN) every month or two, and last I checked, it still applies pretty cleanly. The only changes I've needed to make are to the catversion and to a hard-coded version in a test (16 -> 17). [0] https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2024_Developer_Meeting#The_Path_to_un-reverting_the_MAINTAIN_privilege -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Feb 14, 2024 at 01:02:26PM -0600, Nathan Bossart wrote: > BTW I have been testing reverting commit 151c22d (i.e., un-reverting > MAINTAIN) every month or two, and last I checked, it still applies pretty > cleanly. The only changes I've needed to make are to the catversion and to > a hard-coded version in a test (16 -> 17). Posting to get some cfbot coverage. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Вложения
On Wed, 2024-02-14 at 13:02 -0600, Nathan Bossart wrote: > This seemed like the approach folks were most in favor of at the > developer > meeting a couple weeks ago [0]. At least, that was my interpretation > of > the discussion. Attached rebased version. Note the changes in amcheck. It's creating functions and calling those functions from the comparators, and so the comparators need to set the search_path. I don't think that's terribly common, but does represent a behavior change and could break something. Regards, Jef Davis
Вложения
(Apologies in advance for anything I'm bringing up that we've already covered somewhere else.) On Fri, Feb 16, 2024 at 04:03:55PM -0800, Jeff Davis wrote: > Note the changes in amcheck. It's creating functions and calling those > functions from the comparators, and so the comparators need to set the > search_path. I don't think that's terribly common, but does represent a > behavior change and could break something. Why is this change needed? Is the idea to make amcheck follow the same rules as maintenance commands to encourage folks to set up index functions correctly? Or is amcheck similarly affected by search_path tricks? > void > InitializeSearchPath(void) > { > + /* Make the context we'll keep search path cache hashtable in */ > + SearchPathCacheContext = AllocSetContextCreate(TopMemoryContext, > + "search_path processing cache", > + ALLOCSET_DEFAULT_SIZES); > + > if (IsBootstrapProcessingMode()) > { > /* > @@ -4739,11 +4744,6 @@ InitializeSearchPath(void) > } > else > { > - /* Make the context we'll keep search path cache hashtable in */ > - SearchPathCacheContext = AllocSetContextCreate(TopMemoryContext, > - "search_path processing cache", > - ALLOCSET_DEFAULT_SIZES); > - What is the purpose of this change? > + SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH, PGC_USERSET, > + PGC_S_SESSION); I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new value for these. > +/* > + * Safe search path when executing code as the table owner, such as during > + * maintenance operations. > + */ > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp" Is including pg_temp actually safe? I worry that a user could use their temporary schema to inject objects that would take the place of non-schema-qualified stuff in functions. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
New version attached. Do we need a documentation update here? If so, where would be a good place? On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote: > Why is this change needed? Is the idea to make amcheck follow the > same > rules as maintenance commands to encourage folks to set up index > functions > correctly? amcheck is calling functions it defined, so in order to find those functions it needs to set the right search path. > > What is the purpose of this [bootstrap-related] change? DefineIndex() is called during bootstrap, and it's also a maintenance command. I tried to handle the bootstrapping case, but I think it's best to just guard it with a conditional. Done. I also added Assert(!IsBootstrapProcessingMode()) in assign_search_path(). > > + SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH, > > PGC_USERSET, > > + PGC_S_SESSION); > > I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new > value > for these. > > Did you have a particular concern about PGC_S_SESSION? If it's less than PGC_S_SESSION, it won't work, because the caller's SET command will override it, and the same manipulation is possible. And I don't think we want it higher than PGC_S_SESSION, otherwise the function can't set its own search_path, if needed. > > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp" > > Is including pg_temp actually safe? I worry that a user could use > their > temporary schema to inject objects that would take the place of > non-schema-qualified stuff in functions. pg_temp cannot (currently) be excluded. If it is omitted from the string, it will be placed *first* in the search_path, which is more dangerous. pg_temp does not take part in function or operator resolution, which makes it safer than it first appears. There are potentially some risks around tables, but it's not typical to access a table in a function called as part of an index expression. If we determine that pg_temp is actually unsafe to include, we need to do something like what I proposed here: https://www.postgresql.org/message-id/a6865db287596c9c6ea12bdd9de87216cb5e7902.camel@j-davis.com before this change. Regards, Jeff Davis
Вложения
On Tue, Feb 27, 2024 at 04:22:34PM -0800, Jeff Davis wrote: > Do we need a documentation update here? If so, where would be a good > place? I'm afraid I don't have a better idea than adding a short note in each affected commands's page. > On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote: >> I wonder if it's worth using PGC_S_INTERACTIVE or introducing a new >> value >> for these. > > Did you have a particular concern about PGC_S_SESSION? My only concern is that it could obscure the source of the search_path change, which in turn might cause confusion when things fail. > If it's less than PGC_S_SESSION, it won't work, because the caller's > SET command will override it, and the same manipulation is possible. > > And I don't think we want it higher than PGC_S_SESSION, otherwise the > function can't set its own search_path, if needed. Yeah, we would have to make it equivalent in priority to PGC_S_SESSION, which would likely require a bunch of special logic. I don't know if this is worth it, and this seems like something that could pretty easily be added in the future if it became necessary. >> > +#define GUC_SAFE_SEARCH_PATH "pg_catalog, pg_temp" >> >> Is including pg_temp actually safe? I worry that a user could use >> their >> temporary schema to inject objects that would take the place of >> non-schema-qualified stuff in functions. > > pg_temp cannot (currently) be excluded. If it is omitted from the > string, it will be placed *first* in the search_path, which is more > dangerous. > > pg_temp does not take part in function or operator resolution, which > makes it safer than it first appears. There are potentially some risks > around tables, but it's not typical to access a table in a function > called as part of an index expression. > > If we determine that pg_temp is actually unsafe to include, we need to > do something like what I proposed here: > > https://www.postgresql.org/message-id/a6865db287596c9c6ea12bdd9de87216cb5e7902.camel@j-davis.com I don't doubt anything you've said, but I can't help but think that we might as well handle the pg_temp risk, too. Furthermore, I see that we use "" as a safe search_path for autovacuum and fe_utils/connect.h. Is there any way to unite these? IIUC it might be possible to combine the autovacuum and maintenance command cases (i.e., "!pg_temp"), but we might need to keep pg_temp for the frontend case. I think it's worth trying to add comments about why this setting is safe for some cases but not others, too. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, 2024-02-28 at 10:55 -0600, Nathan Bossart wrote: > I'm afraid I don't have a better idea than adding a short note in > each > affected commands's page. OK, that works for now. Later we should also document that the functions are run as the table owner. > > On Fri, 2024-02-23 at 15:30 -0600, Nathan Bossart wrote: > > > I wonder if it's worth using PGC_S_INTERACTIVE or introducing a > > > new > > > value > > > for these. > > > > Did you have a particular concern about PGC_S_SESSION? > > My only concern is that it could obscure the source of the > search_path > change, which in turn might cause confusion when things fail. That's a good point. AutoVacWorkerMain uses PGC_S_OVERRIDE, but it doesn't have to worry about SET, because there's no real session. The function SET clause uses PGC_S_SESSION. It's arguable whether that's really the same source as a SET command, but it's definitely closer. > > Yeah, we would have to make it equivalent in priority to > PGC_S_SESSION, > which would likely require a bunch of special logic. I'm not clear on what problem that would solve. > I don't doubt anything you've said, but I can't help but think that > we > might as well handle the pg_temp risk, too. That sounds good to me, but I didn't get many replies in that last thread. And although it solves the problem, it is a bit awkward. Can we get some closure on whether that !pg_temp patch is the right approach? That was just my first idea, and it would be good to hear what others think. > Furthermore, I see that we use "" as a safe search_path for > autovacuum and > fe_utils/connect.h. Is there any way to unite these? We could have a single function like RestrictSearchPath(), which I believe I had in some previous iteration. That would use the safest search path (either excluding pg_temp or putting it at the end) and PGC_S_SESSION, and then use it everywhere. Regards, Jeff Davis
On Wed, 2024-02-28 at 09:29 -0800, Jeff Davis wrote: > On Wed, 2024-02-28 at 10:55 -0600, Nathan Bossart wrote: > > I'm afraid I don't have a better idea than adding a short note in > > each > > affected commands's page. > > OK, that works for now. Committed. The only changes are documentation and test updates. This is a behavior change, so it still carries some risk, though we've had a lot of discussion and generally it seems to be worth it. If it turns out worse than expected during beta, of course we can re-revert it. I will restate the risks here, which come basically from two places: (1) Functions called from index expressions which rely on search_path (and don't have a SET clause). Such a function would have already been fairly broken before my commit, because anyone accessing the table without the right search_path would have seen an error or wrong results. And there is no means to set the "right" search_path for autoanalyze or logical replication, so those would not have worked with such a broken function before my commit, no matter what. That being said, surely some users did have such broken functions, and with this commit, they will have to remedy them with a SET clause. Fortunately, the performance impact of doing so has been greatly reduced. (2) Matierialized views which call functions that rely on search_path (and don't have a SET clause). This is arguably a worse kind of breakage because materialized views are often refreshed only by the table owner, and it's easier to control search_path when running REFRESH. Additionally, functions called from materialized views are more likely to be "interesting" than functions called from an index expression. However, the remedy is straightforward: use a SET clause. Regards, Jeff Davis