Обсуждение: Missing docs on AT TIME ZONE precedence?

Поиск
Список
Период
Сортировка

Missing docs on AT TIME ZONE precedence?

От
Shay Rojansky
Дата:
Greeting hackers,

In the operator precedence table[1] table, AT TIME ZONE isn't explicitly listed out; that means it's to be interpreted in the "any other operator category".

However, it seems that the precedence of AT TIME ZONE is actually higher than that of the addition operator:

-- Fails with "function pg_catalog.timezone(unknown, interval) does not exist
SELECT now() + INTERVAL '14 days' AT TIME ZONE 'UTC';

-- Works:
SELECT (now() + INTERVAL '14 days') AT TIME ZONE 'UTC';

Note that missing parentheses for this were discussed in the context of pg_catalog.pg_get_viewdef[2].

Is there a missing line in the operator precedence table in the docs?

Thanks,

Shay

Re: Missing docs on AT TIME ZONE precedence?

От
Bruce Momjian
Дата:
On Sun, Nov 26, 2023 at 11:13:39AM +0100, Shay Rojansky wrote:
> Greeting hackers,
> 
> In the operator precedence table[1] table, AT TIME ZONE isn't explicitly listed
> out; that means it's to be interpreted in the "any other operator category".
> 
> However, it seems that the precedence of AT TIME ZONE is actually higher than
> that of the addition operator:
> 
> -- Fails with "function pg_catalog.timezone(unknown, interval) does not exist
> SELECT now() + INTERVAL '14 days' AT TIME ZONE 'UTC';
> 
> -- Works:
> SELECT (now() + INTERVAL '14 days') AT TIME ZONE 'UTC';
> 
> Note that missing parentheses for this were discussed in the context
> of pg_catalog.pg_get_viewdef[2].
> 
> Is there a missing line in the operator precedence table in the docs?

I think the big question is whether AT TIME ZONE is significant enough
to list there because there are many other clauses we could potentially
add there.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



Re: Missing docs on AT TIME ZONE precedence?

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> On Sun, Nov 26, 2023 at 11:13:39AM +0100, Shay Rojansky wrote:
>> Is there a missing line in the operator precedence table in the docs?

> I think the big question is whether AT TIME ZONE is significant enough
> to list there because there are many other clauses we could potentially
> add there.

Comparing the precedence list in the grammar with the doc table,
the only omissions I feel bad about are AT and COLLATE.  There's
a group of keywords that have "almost the same precedence as IDENT"
which probably don't need documentation; but these are not in that
group.

I am, however, feeling a little bit on the warpath about the
grammar comments for the SQL/JSON keyword precedences:

/* SQL/JSON related keywords */
%nonassoc    UNIQUE JSON
%nonassoc    KEYS OBJECT_P SCALAR VALUE_P
%nonassoc    WITH WITHOUT

Every other case where we're doing this has a para of explanation
in the block comment just below here.  These not only have no
meaningful explanation, they are in the wrong place --- it looks
like they are unrelated to the block comment, whereas actually
(I think) they are another instance of it.  I consider this
well below project standard.

            regards, tom lane



Re: Missing docs on AT TIME ZONE precedence?

От
Shay Rojansky
Дата:
>> Is there a missing line in the operator precedence table in the docs?
>
> I think the big question is whether AT TIME ZONE is significant enough
> to list there because there are many other clauses we could potentially
> add there.

Just to give more context, I'm a maintainer on Entity Framework Core (the .NET ORM), and this caused the provider to generate incorrect SQL etc.

If you decide to not have a comprehensive operator precedence table (though I do hope you do), I'd at least amend the "any other operator" and "all other native and user-defined operators" to clearly indicate that some operators aren't listed and have undocumented precedences, so implementers can at least be aware and test the unlisted ones etc.

Re: Missing docs on AT TIME ZONE precedence?

От
Tom Lane
Дата:
I wrote:
> Comparing the precedence list in the grammar with the doc table,
> the only omissions I feel bad about are AT and COLLATE.

Concretely, as attached.

            regards, tom lane

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 37817d0638..4dfbbd0862 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1065,6 +1065,18 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
        <entry>unary plus, unary minus</entry>
       </row>

+      <row>
+       <entry><token>COLLATE</token></entry>
+       <entry>left</entry>
+       <entry>collation selection</entry>
+      </row>
+
+      <row>
+       <entry><token>AT</token></entry>
+       <entry>left</entry>
+       <entry><literal>AT TIME ZONE</literal>, <literal>AT LOCAL</literal></entry>
+      </row>
+
       <row>
        <entry><token>^</token></entry>
        <entry>left</entry>
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c224df4ecc..8c00b119ec 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -858,7 +858,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %left        '*' '/' '%'
 %left        '^'
 /* Unary Operators */
-%left        AT                /* sets precedence for AT TIME ZONE */
+%left        AT                /* sets precedence for AT TIME ZONE, AT LOCAL */
 %left        COLLATE
 %right        UMINUS
 %left        '[' ']'

Re: Missing docs on AT TIME ZONE precedence?

От
Alvaro Herrera
Дата:
On 2023-Nov-26, Tom Lane wrote:

> I am, however, feeling a little bit on the warpath about the
> grammar comments for the SQL/JSON keyword precedences:
> 
> /* SQL/JSON related keywords */
> %nonassoc    UNIQUE JSON
> %nonassoc    KEYS OBJECT_P SCALAR VALUE_P
> %nonassoc    WITH WITHOUT
> 
> Every other case where we're doing this has a para of explanation
> in the block comment just below here.  These not only have no
> meaningful explanation, they are in the wrong place --- it looks
> like they are unrelated to the block comment, whereas actually
> (I think) they are another instance of it.  I consider this
> well below project standard.

I introduced those in commit 6ee30209a6f1.  That is the minimal set of
keywords for which the precedence had to be declared that was necessary
so that the grammar would compile for the new feature; I extracted that
from a much larger set that was in the original patch submission.  I
spent a long time trying to figure out whether the block comment
applied, and I wasn't sure, so I ended up leaving the comment at what
you see there.

Looking at it again:

UNIQUE and KEYS are there for "WITH UNIQUE KEYS" (& WITHOUT), where KEYS
is optional and the whole clause is optional in some rules.  So as I
understand it, we need to establish the relative precedence of UNIQUE
(first group), KEYS (second group) and WITH/WITHOUT (third group).
We also have a "%prec KEYS" declaration in the
json_key_uniqueness_constraint_opt rule for this.

We also need a relative precedence between JSON and the set below:
VALUE, OBJECT, SCALAR, for the "IS JSON {VALUE/OBJECT/SCALAR}"
construct.

I put KEYS in the same set as the three above just because it was not a
problem to do so; likewise UNIQUE together with JSON.  (I think it would
also work to put WITH and WITHOUT in the second group, but I only ran
bison to verify this, didn't run any tests.)

I am also not sure if the current location of those three groups (or
two, if we merge those) relative to the rest of the groups below the
large block comment is a good one.  As far as compilability of the
grammar goes, it looks like they could even be at the very bottom of the
precedence list, below the join operators.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo"  (Mafalda)



Re: Missing docs on AT TIME ZONE precedence?

От
Alvaro Herrera
Дата:
We could do something like this.  Is this good?

I tried to merge WITH and WITHOUT with the precedence class immediately
above, but that failed: the main grammar compiles fine and no tests
fail, but ECPG does fail to compile the sqljson.pgc test, so there's
some problem there.  Now, the ecpg grammar stuff *is* absolute black
magic to me, so I have no idea what to do about that.

(TBH I don't think the added comments really explain the problems fully.
That's most likely because I don't actually understand what the problems
are.)

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)

Вложения

Re: Missing docs on AT TIME ZONE precedence?

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> (TBH I don't think the added comments really explain the problems fully.
> That's most likely because I don't actually understand what the problems
> are.)

The actual problem is that nobody has applied a cluestick to the SQL
committee about writing an unambiguous grammar :-(.  But I digress.

I don't like the existing coding for more reasons than just
underdocumentation.  Global assignment of precedence is a really,
really dangerous tool for solving ambiguous-grammar problems, because
it can mask problems unrelated to the one you think you are solving:
basically, it eliminates bison's complaints about grammar ambiguities
related to the token you mark.  (Commits 12b716457 and 28a61fc6c are
relevant here.)  Attaching precedence to individual productions is
far safer, because it won't have any effect that extends beyond that
production.  You still need a precedence attached to the lookahead
token; but I think we should try very hard to not assign a precedence
different from IDENT's to any unreserved keywords.

After a bit of fooling around I found a patch that seems to meet
that criterion; attached.

            regards, tom lane

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 8c00b119ec..340823588a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -821,11 +821,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %nonassoc    BETWEEN IN_P LIKE ILIKE SIMILAR NOT_LA
 %nonassoc    ESCAPE            /* ESCAPE must be just above LIKE/ILIKE/SIMILAR */

-/* SQL/JSON related keywords */
-%nonassoc    UNIQUE JSON
-%nonassoc    KEYS OBJECT_P SCALAR VALUE_P
-%nonassoc    WITH WITHOUT
-
 /*
  * To support target_el without AS, it used to be necessary to assign IDENT an
  * explicit precedence just less than Op.  While that's not really necessary
@@ -850,9 +845,15 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * an explicit priority lower than '(', so that a rule with CUBE '(' will shift
  * rather than reducing a conflicting rule that takes CUBE as a function name.
  * Using the same precedence as IDENT seems right for the reasons given above.
+ *
+ * KEYS, OBJECT_P, SCALAR, VALUE_P, WITH, and WITHOUT are similarly assigned
+ * the same precedence as IDENT.  This allows resolving conflicts in the
+ * json_predicate_type_constraint and json_key_uniqueness_constraint_opt
+ * productions (see comments there).
  */
 %nonassoc    UNBOUNDED        /* ideally would have same precedence as IDENT */
 %nonassoc    IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
+            KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT
 %left        Op OPERATOR        /* multi-character ops and user-defined operators */
 %left        '+' '-'
 %left        '*' '/' '%'
@@ -16519,21 +16520,35 @@ json_returning_clause_opt:
             | /* EMPTY */                            { $$ = NULL; }
         ;

+/*
+ * We must assign the only-JSON production a precedence less than IDENT in
+ * order to favor shifting over reduction when JSON is followed by VALUE_P,
+ * OBJECT_P, or SCALAR.  (ARRAY doesn't need that treatment, because it's a
+ * fully reserved word.)  Because json_predicate_type_constraint is always
+ * followed by json_key_uniqueness_constraint_opt, we also need the only-JSON
+ * production to have precedence less than WITH and WITHOUT.  UNBOUNDED isn't
+ * really related to this syntax, but it's a convenient choice because it
+ * already has a precedence less than IDENT for other reasons.
+ */
 json_predicate_type_constraint:
-            JSON                                    { $$ = JS_TYPE_ANY; }
+            JSON                    %prec UNBOUNDED    { $$ = JS_TYPE_ANY; }
             | JSON VALUE_P                            { $$ = JS_TYPE_ANY; }
             | JSON ARRAY                            { $$ = JS_TYPE_ARRAY; }
             | JSON OBJECT_P                            { $$ = JS_TYPE_OBJECT; }
             | JSON SCALAR                            { $$ = JS_TYPE_SCALAR; }
         ;

-/* KEYS is a noise word here */
+/*
+ * KEYS is a noise word here.  To avoid shift/reduce conflicts, assign the
+ * KEYS-less productions a precedence less than IDENT (i.e., less than KEYS).
+ * This prevents reducing them when the next token is KEYS.
+ */
 json_key_uniqueness_constraint_opt:
             WITH UNIQUE KEYS                            { $$ = true; }
-            | WITH UNIQUE                                { $$ = true; }
+            | WITH UNIQUE                %prec UNBOUNDED    { $$ = true; }
             | WITHOUT UNIQUE KEYS                        { $$ = false; }
-            | WITHOUT UNIQUE                            { $$ = false; }
-            | /* EMPTY */                 %prec KEYS        { $$ = false; }
+            | WITHOUT UNIQUE            %prec UNBOUNDED    { $$ = false; }
+            | /* EMPTY */                 %prec UNBOUNDED    { $$ = false; }
         ;

 json_name_and_value_list:

Re: Missing docs on AT TIME ZONE precedence?

От
Andrew Dunstan
Дата:
On 2023-11-27 Mo 15:34, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> (TBH I don't think the added comments really explain the problems fully.
>> That's most likely because I don't actually understand what the problems
>> are.)
> The actual problem is that nobody has applied a cluestick to the SQL
> committee about writing an unambiguous grammar :-(.  But I digress.
>
> I don't like the existing coding for more reasons than just
> underdocumentation.  Global assignment of precedence is a really,
> really dangerous tool for solving ambiguous-grammar problems, because
> it can mask problems unrelated to the one you think you are solving:
> basically, it eliminates bison's complaints about grammar ambiguities
> related to the token you mark.  (Commits 12b716457 and 28a61fc6c are
> relevant here.)  Attaching precedence to individual productions is
> far safer, because it won't have any effect that extends beyond that
> production.  You still need a precedence attached to the lookahead
> token; but I think we should try very hard to not assign a precedence
> different from IDENT's to any unreserved keywords.
>
> After a bit of fooling around I found a patch that seems to meet
> that criterion; attached.
>
>             



Looks good. Perhaps the comments above the UNBOUNDED precedence setting 
(esp. the first paragraph) need strengthening, with a stern injunction 
to avoid different precedence for non-reserved keywords if at all possible.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Missing docs on AT TIME ZONE precedence?

От
Alvaro Herrera
Дата:
On 2023-Nov-27, Tom Lane wrote:

> I don't like the existing coding for more reasons than just
> underdocumentation.  Global assignment of precedence is a really,
> really dangerous tool for solving ambiguous-grammar problems, because
> it can mask problems unrelated to the one you think you are solving:
> basically, it eliminates bison's complaints about grammar ambiguities
> related to the token you mark.  (Commits 12b716457 and 28a61fc6c are
> relevant here.)  Attaching precedence to individual productions is
> far safer, because it won't have any effect that extends beyond that
> production.  You still need a precedence attached to the lookahead
> token; but I think we should try very hard to not assign a precedence
> different from IDENT's to any unreserved keywords.

Ooh, this is very useful, thank you.

> After a bit of fooling around I found a patch that seems to meet
> that criterion; attached.

It looks good and passes tests, including the ecpg ones.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Entristecido, Wutra                     (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"



Re: Missing docs on AT TIME ZONE precedence?

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> Looks good. Perhaps the comments above the UNBOUNDED precedence setting 
> (esp. the first paragraph) need strengthening, with a stern injunction 
> to avoid different precedence for non-reserved keywords if at all possible.

OK.  How about rewriting that first para like this?

 * Sometimes it is necessary to assign precedence to keywords that are not
 * really part of the operator hierarchy, in order to resolve grammar
 * ambiguities.  It's best to avoid doing so whenever possible, because such
 * assignments have global effect and may hide ambiguities besides the one
 * you intended to solve.  (Attaching a precedence to a single rule with
 * %prec is far safer and should be preferred.)  If you must give precedence
 * to a new keyword, try very hard to give it the same precedence as IDENT.
 * If the keyword has IDENT's precedence then it clearly acts the same as
 * non-keywords and other similar keywords, thus reducing the risk of
 * unexpected precedence effects.
 * 
 * We used to need to assign IDENT an explicit precedence just less than Op,
 * to support target_el without AS.  While that's not really necessary since
 * we removed postfix operators, we continue to do so because it provides a
 * reference point for a precedence level that we can assign to other
 * keywords that lack a natural precedence level.

            regards, tom lane



Re: Missing docs on AT TIME ZONE precedence?

От
Andrew Dunstan
Дата:
On 2023-11-28 Tu 10:27, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Looks good. Perhaps the comments above the UNBOUNDED precedence setting
>> (esp. the first paragraph) need strengthening, with a stern injunction
>> to avoid different precedence for non-reserved keywords if at all possible.
> OK.  How about rewriting that first para like this?
>
>   * Sometimes it is necessary to assign precedence to keywords that are not
>   * really part of the operator hierarchy, in order to resolve grammar
>   * ambiguities.  It's best to avoid doing so whenever possible, because such
>   * assignments have global effect and may hide ambiguities besides the one
>   * you intended to solve.  (Attaching a precedence to a single rule with
>   * %prec is far safer and should be preferred.)  If you must give precedence
>   * to a new keyword, try very hard to give it the same precedence as IDENT.
>   * If the keyword has IDENT's precedence then it clearly acts the same as
>   * non-keywords and other similar keywords, thus reducing the risk of
>   * unexpected precedence effects.
>   *
>   * We used to need to assign IDENT an explicit precedence just less than Op,
>   * to support target_el without AS.  While that's not really necessary since
>   * we removed postfix operators, we continue to do so because it provides a
>   * reference point for a precedence level that we can assign to other
>   * keywords that lack a natural precedence level.
>
>             


LGTM. Thanks.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Missing docs on AT TIME ZONE precedence?

От
Tom Lane
Дата:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-11-28 Tu 10:27, Tom Lane wrote:
>> OK.  How about rewriting that first para like this?

> LGTM. Thanks.

Thanks for reviewing.  While checking things over one more time,
I noticed that there was an additional violation of this precept,
dating back to long before we understood the hazards: SET is
given its own priority, when it could perfectly well share that
of IDENT.  I adjusted that and pushed.

            regards, tom lane