Thread: declarations of range-vs-element <@ and @>

declarations of range-vs-element <@ and @>

From: Tom Lane
Date:
Why do these use anynonarray rather than anyelement?  Given that we
support ranges of arrays (there's even a regression test), this seems
a bogus limitation.
        regards, tom lane


Re: declarations of range-vs-element <@ and @>

From: Tom Lane
Date:
I wrote:
> Why do these use anynonarray rather than anyelement?  Given that we
> support ranges of arrays (there's even a regression test), this seems
> a bogus limitation.

After experimenting with changing that, I see why you did it: some of
the regression tests fail, eg,

    SELECT * FROM array_index_op_test WHERE i <@ '{38,34,32,89}' ORDER BY seqno;
    ERROR:  operator is not unique: integer[] <@ unknown

That is, if we have both anyarray <@ anyarray and anyelement <@ anyrange
operators, the parser is unable to decide which one is a better match to
integer[] <@ unknown.  However, restricting <@ to not work for ranges
over arrays is a pretty horrid fix for that, because there is simply not
any access to the lost functionality.  It'd be better IMO to fail here
and require the unknown literal to be cast explicitly than to do this.
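
The ambiguity described above can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: ToyType, ToyPseudo, ToyOperator, toy_accepts and toy_count_matches are hypothetical names invented for illustration, not parser code, and the model ignores the real rule that the polymorphic arguments of one operator must resolve consistently with each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type tags for illustration only; these are NOT PostgreSQL OIDs. */
typedef enum { T_UNKNOWN, T_INT4, T_INT4_ARRAY, T_INT4_RANGE } ToyType;

/* Toy stand-ins for the polymorphic pseudo-types used in the declarations. */
typedef enum { P_ANYELEMENT, P_ANYNONARRAY, P_ANYARRAY, P_ANYRANGE } ToyPseudo;

/* A binary operator described only by its declared argument types. */
typedef struct
{
    const char *name;
    ToyPseudo   left;
    ToyPseudo   right;
} ToyOperator;

/* Would an input of type t be accepted where pseudo-type p is declared? */
bool
toy_accepts(ToyPseudo p, ToyType t)
{
    if (t == T_UNKNOWN)
        return true;            /* assumption: an unknown literal fits anywhere */
    switch (p)
    {
        case P_ANYELEMENT:
            return true;
        case P_ANYNONARRAY:
            return t != T_INT4_ARRAY;
        case P_ANYARRAY:
            return t == T_INT4_ARRAY;
        case P_ANYRANGE:
            return t == T_INT4_RANGE;
    }
    return false;
}

/* Count the operators whose declared types accept the pair (left, right). */
int
toy_count_matches(const ToyOperator *ops, size_t nops,
                  ToyType left, ToyType right)
{
    int         n = 0;

    assert(ops != NULL);
    for (size_t i = 0; i < nops; i++)
    {
        if (toy_accepts(ops[i].left, left) && toy_accepts(ops[i].right, right))
            n++;
    }
    return n;
}
```

In this model, an integer[] <@ unknown call matches both operators when the range operator is declared with anyelement, and only anyarray <@ anyarray when it is declared with anynonarray, which is the trade-off under discussion.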

But what surprises me about this example is that I'd have expected the
heuristic "assume the unknown is of the same type as the other input"
to resolve it.  Looking more closely, I see that we apply that heuristic
in such a way that it works only for exact operator matches, not for
matches requiring coercion (including polymorphic-type matches).  This
seems a bit weird.  I propose adding a step to func_select_candidate
that tries to resolve things that way, ie, if all the known-type inputs
have the same type, then try assuming that the unknown-type ones are of
that type, and see if that leads to a unique match.  There actually is a
comment in there that claims we do that, but the code it's attached to
is really doing something else that involves preferred types within
type categories...
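
As a rough illustration of the proposed rule, here is a minimal standalone sketch. It is not the parser's code: ToyType, ToyPseudo, ToyCandidate, toy_accepts and toy_resolve_by_known_type are hypothetical names, and toy_accepts is a crude stand-in for the real coercion check. It only shows the shape of the heuristic: find the single known input type, substitute it for every unknown input, and accept the result only if exactly one candidate matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ARGS 4

/* Toy type tags for illustration only; these are NOT PostgreSQL OIDs. */
typedef enum { T_UNKNOWN, T_INT4, T_INT4_ARRAY, T_INT4_RANGE } ToyType;
typedef enum { P_ANYELEMENT, P_ANYARRAY, P_ANYRANGE } ToyPseudo;

/* A candidate described only by its declared argument types. */
typedef struct
{
    ToyPseudo   args[TOY_MAX_ARGS];
} ToyCandidate;

/* Crude stand-in for a coercion check (an assumption of this sketch). */
bool
toy_accepts(ToyPseudo p, ToyType t)
{
    if (t == T_UNKNOWN)
        return true;
    switch (p)
    {
        case P_ANYELEMENT:
            return true;
        case P_ANYARRAY:
            return t == T_INT4_ARRAY;
        case P_ANYRANGE:
            return t == T_INT4_RANGE;
    }
    return false;
}

/*
 * Return the index of the unique candidate that matches once every unknown
 * input is assumed to have the common known type, or -1 if the rule does
 * not apply or does not yield a unique match.
 */
int
toy_resolve_by_known_type(const ToyCandidate *cands, size_t ncands,
                          const ToyType *inputs, int nargs)
{
    ToyType     known = T_UNKNOWN;
    int         nunknowns = 0;
    int         found = -1;

    assert(nargs > 0 && nargs <= TOY_MAX_ARGS);

    for (int i = 0; i < nargs; i++)
    {
        if (inputs[i] == T_UNKNOWN)
            nunknowns++;
        else if (known == T_UNKNOWN)
            known = inputs[i];  /* first known argument */
        else if (known != inputs[i])
            return -1;          /* known inputs disagree */
    }
    if (nunknowns == 0 || nunknowns == nargs)
        return -1;              /* need both known and unknown inputs */

    for (size_t c = 0; c < ncands; c++)
    {
        bool        ok = true;

        /* test the candidate as if every input had the known type */
        for (int i = 0; i < nargs && ok; i++)
            ok = toy_accepts(cands[c].args[i], known);
        if (ok)
        {
            if (found >= 0)
                return -1;      /* not unique, give up */
            found = (int) c;
        }
    }
    return found;
}
```

With the two candidates (anyarray, anyarray) and (anyelement, anyrange) and the inputs (integer[], unknown), the sketch treats the call as (integer[], integer[]); only the first candidate accepts that, so it is selected.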

Thoughts?
        regards, tom lane


Re: declarations of range-vs-element <@ and @>

From: Jeff Davis
Date:
On Wed, 2011-11-16 at 16:41 -0500, Tom Lane wrote:
> But what surprises me about this example is that I'd have expected the
> heuristic "assume the unknown is of the same type as the other input"
> to resolve it.  Looking more closely, I see that we apply that heuristic
> in such a way that it works only for exact operator matches, not for
> matches requiring coercion (including polymorphic-type matches).  This
> seems a bit weird.  I propose adding a step to func_select_candidate
> that tries to resolve things that way, ie, if all the known-type inputs
> have the same type, then try assuming that the unknown-type ones are of
> that type, and see if that leads to a unique match.  There actually is a
> comment in there that claims we do that, but the code it's attached to
> is really doing something else that involves preferred types within
> type categories...
> 
> Thoughts?

That sounds reasonable to me.

Regards,
    Jeff Davis



Re: declarations of range-vs-element <@ and @>

From: Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Wed, 2011-11-16 at 16:41 -0500, Tom Lane wrote:
>> I propose adding a step to func_select_candidate
>> that tries to resolve things that way, ie, if all the known-type inputs
>> have the same type, then try assuming that the unknown-type ones are of
>> that type, and see if that leads to a unique match.  There actually is a
>> comment in there that claims we do that, but the code it's attached to
>> is really doing something else that involves preferred types within
>> type categories...
>>
>> Thoughts?

> That sounds reasonable to me.

Here's a draft patch (sans doc changes as yet) that extends the
ambiguous-function resolution rules that way.  It adds the heuristic at
the very end, at the point where we would otherwise fail, and therefore
it cannot change the system's behavior for any case that didn't
previously draw an "ambiguous function/operator" error.  I experimented
with placing the heuristic earlier in func_select_candidate, but found
that that caused some changes in regression test cases, which made me a
bit nervous.  Those changes were not clearly worse results, but this
isn't an area that I think we should toy with lightly.

I haven't yet tried again on changing the <@ and @> declarations, but
will do that next.

            regards, tom lane

diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 75f1e20475d1c2df628f0a866fc081c601340e98..01ed85b563d23e9288430a76b28aa5a7b2550b74 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*************** func_select_candidate(int nargs,
*** 618,631 ****
                        Oid *input_typeids,
                        FuncCandidateList candidates)
  {
!     FuncCandidateList current_candidate;
!     FuncCandidateList last_candidate;
      Oid           *current_typeids;
      Oid            current_type;
      int            i;
      int            ncandidates;
      int            nbestMatch,
!                 nmatch;
      Oid            input_base_typeids[FUNC_MAX_ARGS];
      TYPCATEGORY slot_category[FUNC_MAX_ARGS],
                  current_category;
--- 618,633 ----
                        Oid *input_typeids,
                        FuncCandidateList candidates)
  {
!     FuncCandidateList current_candidate,
!                 first_candidate,
!                 last_candidate;
      Oid           *current_typeids;
      Oid            current_type;
      int            i;
      int            ncandidates;
      int            nbestMatch,
!                 nmatch,
!                 nunknowns;
      Oid            input_base_typeids[FUNC_MAX_ARGS];
      TYPCATEGORY slot_category[FUNC_MAX_ARGS],
                  current_category;
*************** func_select_candidate(int nargs,
*** 651,659 ****
       * take a domain as an input datatype.    Such a function will be selected
       * over the base-type function only if it is an exact match at all
       * argument positions, and so was already chosen by our caller.
       */
      for (i = 0; i < nargs; i++)
!         input_base_typeids[i] = getBaseType(input_typeids[i]);

      /*
       * Run through all candidates and keep those with the most matches on
--- 653,674 ----
       * take a domain as an input datatype.    Such a function will be selected
       * over the base-type function only if it is an exact match at all
       * argument positions, and so was already chosen by our caller.
+      *
+      * While we're at it, count the number of unknown-type arguments for use
+      * later.
       */
+     nunknowns = 0;
      for (i = 0; i < nargs; i++)
!     {
!         if (input_typeids[i] != UNKNOWNOID)
!             input_base_typeids[i] = getBaseType(input_typeids[i]);
!         else
!         {
!             /* no need to call getBaseType on UNKNOWNOID */
!             input_base_typeids[i] = UNKNOWNOID;
!             nunknowns++;
!         }
!     }

      /*
       * Run through all candidates and keep those with the most matches on
*************** func_select_candidate(int nargs,
*** 749,762 ****
          return candidates;

      /*
!      * Still too many candidates? Try assigning types for the unknown columns.
!      *
!      * NOTE: for a binary operator with one unknown and one non-unknown input,
!      * we already tried the heuristic of looking for a candidate with the
!      * known input type on both sides (see binary_oper_exact()). That's
!      * essentially a special case of the general algorithm we try next.
       *
!      * We do this by examining each unknown argument position to see if we can
       * determine a "type category" for it.    If any candidate has an input
       * datatype of STRING category, use STRING category (this bias towards
       * STRING is appropriate since unknown-type literals look like strings).
--- 764,779 ----
          return candidates;

      /*
!      * Still too many candidates?  Try assigning types for the unknown inputs.
       *
!      * If there are no unknown inputs, we have no more heuristics that apply,
!      * and must fail.
!      */
!     if (nunknowns == 0)
!         return NULL;            /* failed to select a best candidate */
!
!     /*
!      * The next step examines each unknown argument position to see if we can
       * determine a "type category" for it.    If any candidate has an input
       * datatype of STRING category, use STRING category (this bias towards
       * STRING is appropriate since unknown-type literals look like strings).
*************** func_select_candidate(int nargs,
*** 770,778 ****
       * Having completed this examination, remove candidates that accept the
       * wrong category at any unknown position.    Also, if at least one
       * candidate accepted a preferred type at a position, remove candidates
!      * that accept non-preferred types.
!      *
!      * If we are down to one candidate at the end, we win.
       */
      resolved_unknowns = false;
      for (i = 0; i < nargs; i++)
--- 787,795 ----
       * Having completed this examination, remove candidates that accept the
       * wrong category at any unknown position.    Also, if at least one
       * candidate accepted a preferred type at a position, remove candidates
!      * that accept non-preferred types.  If just one candidate remains,
!      * return that one.  However, if this rule turns out to reject all
!      * candidates, keep them all instead.
       */
      resolved_unknowns = false;
      for (i = 0; i < nargs; i++)
*************** func_select_candidate(int nargs,
*** 835,840 ****
--- 852,858 ----
      {
          /* Strip non-matching candidates */
          ncandidates = 0;
+         first_candidate = candidates;
          last_candidate = NULL;
          for (current_candidate = candidates;
               current_candidate != NULL;
*************** func_select_candidate(int nargs,
*** 874,888 ****
                  if (last_candidate)
                      last_candidate->next = current_candidate->next;
                  else
!                     candidates = current_candidate->next;
              }
          }
!         if (last_candidate)        /* terminate rebuilt list */
              last_candidate->next = NULL;
      }

!     if (ncandidates == 1)
!         return candidates;

      return NULL;                /* failed to select a best candidate */
  }    /* func_select_candidate() */
--- 892,969 ----
                  if (last_candidate)
                      last_candidate->next = current_candidate->next;
                  else
!                     first_candidate = current_candidate->next;
              }
          }
!
!         /* if we found any matches, restrict our attention to those */
!         if (last_candidate)
!         {
!             candidates = first_candidate;
!             /* terminate rebuilt list */
              last_candidate->next = NULL;
+         }
+
+         if (ncandidates == 1)
+             return candidates;
      }

!     /*
!      * Last gasp: if there are both known- and unknown-type inputs, and all
!      * the known types are the same, assume the unknown inputs are also that
!      * type, and see if that gives us a unique match.  If so, use that match.
!      *
!      * NOTE: for a binary operator with one unknown and one non-unknown input,
!      * we already tried this heuristic in binary_oper_exact().  However, that
!      * code only finds exact matches, whereas here we will handle matches that
!      * involve coercion, polymorphic type resolution, etc.
!      */
!     if (nunknowns < nargs)
!     {
!         Oid            known_type = UNKNOWNOID;
!
!         for (i = 0; i < nargs; i++)
!         {
!             if (input_base_typeids[i] == UNKNOWNOID)
!                 continue;
!             if (known_type == UNKNOWNOID)        /* first known arg? */
!                 known_type = input_base_typeids[i];
!             else if (known_type != input_base_typeids[i])
!             {
!                 /* oops, not all match */
!                 known_type = UNKNOWNOID;
!                 break;
!             }
!         }
!
!         if (known_type != UNKNOWNOID)
!         {
!             /* okay, just one known type, apply the heuristic */
!             for (i = 0; i < nargs; i++)
!                 input_base_typeids[i] = known_type;
!             ncandidates = 0;
!             last_candidate = NULL;
!             for (current_candidate = candidates;
!                  current_candidate != NULL;
!                  current_candidate = current_candidate->next)
!             {
!                 current_typeids = current_candidate->args;
!                 if (can_coerce_type(nargs, input_base_typeids, current_typeids,
!                                     COERCION_IMPLICIT))
!                 {
!                     if (++ncandidates > 1)
!                         break;    /* not unique, give up */
!                     last_candidate = current_candidate;
!                 }
!             }
!             if (ncandidates == 1)
!             {
!                 /* successfully identified a unique match */
!                 last_candidate->next = NULL;
!                 return last_candidate;
!             }
!         }
!     }

      return NULL;                /* failed to select a best candidate */
  }    /* func_select_candidate() */