Обсуждение: Failure assertion in GROUPS mode of window function in current HEAD

Поиск
Список
Период
Сортировка

Failure assertion in GROUPS mode of window function in current HEAD

От
Masahiko Sawada
Дата:
Hi,

I got an assertion failure when I use GROUPS mode and specifies offset
without ORDER BY clause. The reproduction steps and the backtrace I
got are following.

=# create table test as select 1 as c;
=# select sum(c) over (partition by c groups between 1 preceding and
current row) from test;
TRAP: FailedAssertion("!(ptr >= 0 && ptr < state->readptrcount)",
File: "tuplestore.c", Line: 478)

#0  0x0000003b3ce32495 in raise () from /lib64/libc.so.6
#1  0x0000003b3ce33c75 in abort () from /lib64/libc.so.6
#2  0x0000000000a2ef5a in ExceptionalCondition (conditionName=0xc99f38
"!(ptr >= 0 && ptr < state->readptrcount)", errorType=0xc99f22
"FailedAssertion",
    fileName=0xc99ec0 "tuplestore.c", lineNumber=478) at assert.c:54
#3  0x0000000000a7fa7d in tuplestore_select_read_pointer
(state=0x139e4a0, ptr=-1) at tuplestore.c:478
#4  0x00000000007216cd in update_frameheadpos (winstate=0x137fc30) at
nodeWindowAgg.c:1655
#5  0x000000000071fb7f in eval_windowaggregates (winstate=0x137fc30)
at nodeWindowAgg.c:735
#6  0x0000000000722ac8 in ExecWindowAgg (pstate=0x137fc30) at
nodeWindowAgg.c:2196
#7  0x00000000006e5776 in ExecProcNodeFirst (node=0x137fc30) at
execProcnode.c:445
#8  0x00000000006da945 in ExecProcNode (node=0x137fc30) at
../../../src/include/executor/executor.h:237
#9  0x00000000006dd2fc in ExecutePlan (estate=0x137fa18,
planstate=0x137fc30, use_parallel_mode=false, operation=CMD_SELECT,
sendTuples=true,
    numberTuples=0, direction=ForwardScanDirection, dest=0x138b098,
execute_once=true) at execMain.c:1726
#10 0x00000000006daf2c in standard_ExecutorRun (queryDesc=0x13785b8,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:363
#11 0x00000000006dad48 in ExecutorRun (queryDesc=0x13785b8,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:306
#12 0x00000000008c7626 in PortalRunSelect (portal=0x12f3bd8,
forward=true, count=0, dest=0x138b098) at pquery.c:932
#13 0x00000000008c72b4 in PortalRun (portal=0x12f3bd8,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x138b098, altdest=0x138b098,
    completionTag=0x7fffaece38b0 "") at pquery.c:773
#14 0x00000000008c1288 in exec_simple_query (
    query_string=0x128e938 "select sum(c) over (partition by c groups
between 1 preceding and current row) from test;") at postgres.c:1122
#15 0x00000000008c555e in PostgresMain (argc=1, argv=0x12b8258,
dbname=0x12b80d8 "postgres", username=0x12b80b0 "masahiko") at
postgres.c:4153
#16 0x0000000000822c3c in BackendRun (port=0x12b0020) at postmaster.c:4361
#17 0x00000000008223aa in BackendStartup (port=0x12b0020) at postmaster.c:4033
#18 0x000000000081e84b in ServerLoop () at postmaster.c:1706
#19 0x000000000081e17d in PostmasterMain (argc=5, argv=0x1289330) at
postmaster.c:1379
#20 0x00000000007452d0 in main (argc=5, argv=0x1289330) at main.c:228

The cause of this assertion failure is that we attempted to select a
read pointer (framehead_ptr) that is not allocated. We allocate read
pointers for both frame head and tail when begin a new partition in
begin_partition(). The current code doesn't allocate them as follows
if ORDER BY clause is omitted, but this behavior doesn't match to both
update_framheadpos() and update_framtailpos() which always use each
read pointer in GROUPS offset mode.

    if ((winstate->frameOptions & (FRAMEOPTION_RANGE | FRAMEOPTION_GROUPS)) &&
       node->ordNumCols != 0)
    {
        if (!(winstate->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
            winstate->framehead_ptr =
                tuplestore_alloc_read_pointer(winstate->buffer, 0);
        if (!(winstate->frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING))
            winstate->frametail_ptr =
                tuplestore_alloc_read_pointer(winstate->buffer, 0);
    }

Since this issue relates to only GROUPS mode it happen in PostgreSQL
11 or later. Attached patch fixes this issue and add regression tests
for testing GROUPS mode without ORDER BY clause.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Вложения

Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Masahiko Sawada
Дата:
On Wed, Jul 4, 2018 at 11:24 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Hi,
>
> I got an assertion failure when I use GROUPS mode and specifies offset
> without ORDER BY clause. The reproduction steps and the backtrace I
> got are following.
>
> =# create table test as select 1 as c;
> =# select sum(c) over (partition by c groups between 1 preceding and
> current row) from test;
> TRAP: FailedAssertion("!(ptr >= 0 && ptr < state->readptrcount)",
> File: "tuplestore.c", Line: 478)
>
> #0  0x0000003b3ce32495 in raise () from /lib64/libc.so.6
> #1  0x0000003b3ce33c75 in abort () from /lib64/libc.so.6
> #2  0x0000000000a2ef5a in ExceptionalCondition (conditionName=0xc99f38
> "!(ptr >= 0 && ptr < state->readptrcount)", errorType=0xc99f22
> "FailedAssertion",
>     fileName=0xc99ec0 "tuplestore.c", lineNumber=478) at assert.c:54
> #3  0x0000000000a7fa7d in tuplestore_select_read_pointer
> (state=0x139e4a0, ptr=-1) at tuplestore.c:478
> #4  0x00000000007216cd in update_frameheadpos (winstate=0x137fc30) at
> nodeWindowAgg.c:1655
> #5  0x000000000071fb7f in eval_windowaggregates (winstate=0x137fc30)
> at nodeWindowAgg.c:735
> #6  0x0000000000722ac8 in ExecWindowAgg (pstate=0x137fc30) at
> nodeWindowAgg.c:2196
> #7  0x00000000006e5776 in ExecProcNodeFirst (node=0x137fc30) at
> execProcnode.c:445
> #8  0x00000000006da945 in ExecProcNode (node=0x137fc30) at
> ../../../src/include/executor/executor.h:237
> #9  0x00000000006dd2fc in ExecutePlan (estate=0x137fa18,
> planstate=0x137fc30, use_parallel_mode=false, operation=CMD_SELECT,
> sendTuples=true,
>     numberTuples=0, direction=ForwardScanDirection, dest=0x138b098,
> execute_once=true) at execMain.c:1726
> #10 0x00000000006daf2c in standard_ExecutorRun (queryDesc=0x13785b8,
> direction=ForwardScanDirection, count=0, execute_once=true) at
> execMain.c:363
> #11 0x00000000006dad48 in ExecutorRun (queryDesc=0x13785b8,
> direction=ForwardScanDirection, count=0, execute_once=true) at
> execMain.c:306
> #12 0x00000000008c7626 in PortalRunSelect (portal=0x12f3bd8,
> forward=true, count=0, dest=0x138b098) at pquery.c:932
> #13 0x00000000008c72b4 in PortalRun (portal=0x12f3bd8,
> count=9223372036854775807, isTopLevel=true, run_once=true,
> dest=0x138b098, altdest=0x138b098,
>     completionTag=0x7fffaece38b0 "") at pquery.c:773
> #14 0x00000000008c1288 in exec_simple_query (
>     query_string=0x128e938 "select sum(c) over (partition by c groups
> between 1 preceding and current row) from test;") at postgres.c:1122
> #15 0x00000000008c555e in PostgresMain (argc=1, argv=0x12b8258,
> dbname=0x12b80d8 "postgres", username=0x12b80b0 "masahiko") at
> postgres.c:4153
> #16 0x0000000000822c3c in BackendRun (port=0x12b0020) at postmaster.c:4361
> #17 0x00000000008223aa in BackendStartup (port=0x12b0020) at postmaster.c:4033
> #18 0x000000000081e84b in ServerLoop () at postmaster.c:1706
> #19 0x000000000081e17d in PostmasterMain (argc=5, argv=0x1289330) at
> postmaster.c:1379
> #20 0x00000000007452d0 in main (argc=5, argv=0x1289330) at main.c:228
>
> The cause of this assertion failure is that we attempted to select a
> read pointer (framehead_ptr) that is not allocated. We allocate read
> pointers for both frame head and tail when begin a new partition in
> begin_partition(). The current code doesn't allocate them as follows
> if ORDER BY clause is omitted, but this behavior doesn't match to both
> update_framheadpos() and update_framtailpos() which always use each
> read pointer in GROUPS offset mode.
>
>     if ((winstate->frameOptions & (FRAMEOPTION_RANGE | FRAMEOPTION_GROUPS)) &&
>        node->ordNumCols != 0)
>     {
>         if (!(winstate->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
>             winstate->framehead_ptr =
>                 tuplestore_alloc_read_pointer(winstate->buffer, 0);
>         if (!(winstate->frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING))
>             winstate->frametail_ptr =
>                 tuplestore_alloc_read_pointer(winstate->buffer, 0);
>     }
>
> Since this issue relates to only GROUPS mode it happen in PostgreSQL
> 11 or later. Attached patch fixes this issue and add regression tests
> for testing GROUPS mode without ORDER BY clause.

Since this problem still happens with current HEAD I've added this
item to Open Item.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Tom Lane
Дата:
Masahiko Sawada <sawada.mshk@gmail.com> writes:
> I got an assertion failure when I use GROUPS mode and specifies offset
> without ORDER BY clause. The reproduction steps and the backtrace I
> got are following.

> =# create table test as select 1 as c;
> =# select sum(c) over (partition by c groups between 1 preceding and
> current row) from test;
> TRAP: FailedAssertion("!(ptr >= 0 && ptr < state->readptrcount)",
> File: "tuplestore.c", Line: 478)

Hm.  Shouldn't we reject this case?  I don't see how GROUPS mode makes
any sense with no ORDER BY.  Maybe you could define it as working with
"groups" of one row each, but the standard seems to think not:

   c) If GROUPS is specified, then:

      i) Either WDEF shall contain a <window order clause>, or WDEF shall
      specify an <existing window name> that identifies a window structure
      descriptor that includes a window ordering clause.

The fact that you got an assertion failure is not very nice, maybe we
need some more code defenses there; but probably the whole thing
should be rejected at parse time.

            regards, tom lane


Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Masahiko Sawada
Дата:
On Tue, Jul 10, 2018 at 7:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
>> I got an assertion failure when I use GROUPS mode and specifies offset
>> without ORDER BY clause. The reproduction steps and the backtrace I
>> got are following.
>
>> =# create table test as select 1 as c;
>> =# select sum(c) over (partition by c groups between 1 preceding and
>> current row) from test;
>> TRAP: FailedAssertion("!(ptr >= 0 && ptr < state->readptrcount)",
>> File: "tuplestore.c", Line: 478)
>
> Hm.  Shouldn't we reject this case?  I don't see how GROUPS mode makes
> any sense with no ORDER BY.  Maybe you could define it as working with
> "groups" of one row each, but the standard seems to think not:
>
>    c) If GROUPS is specified, then:
>
>       i) Either WDEF shall contain a <window order clause>, or WDEF shall
>       specify an <existing window name> that identifies a window structure
>       descriptor that includes a window ordering clause.
>
> The fact that you got an assertion failure is not very nice, maybe we
> need some more code defenses there; but probably the whole thing
> should be rejected at parse time.
>

Agreed. So should we reject only GROUPS offset mode without ORDER BY?
Although since I don't have the standard I'm not sure the ideal
behavior exactly there is the code that considers GROUPS non-offset
mode without ORDER BY.

BTW, I found an another but related issue. We can got an assertion
failure also by the following query.

=# select sum(c) over (partition by c order by c groups between 1
preceding and current row) from test;

The cause of this assertion failure is similar to the first problem I
reported; in the above case,  since node->ordNumCols will be set to 0
we don't allocate both the frame head and tail pointer. The proposed
patch seems to fix the both problems but If we fix the first problem
by rejecting it as not-supported-feature error we need to fix the
second problem by other way.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Tom Lane
Дата:
Masahiko Sawada <sawada.mshk@gmail.com> writes:
> BTW, I found an another but related issue. We can got an assertion
> failure also by the following query.

> =# select sum(c) over (partition by c order by c groups between 1
> preceding and current row) from test;

Oh, good point, that's certainly legal per spec, so we'd need to do
something about it.

The proximate cause of the problem, I think, is that the planner throws
away the "order by c" as being redundant because it already sorted by c
for the partitioning requirement.  So we end up with ordNumCols = 0
even though the query had ORDER BY.

This breaks not only GROUPS cases, but also RANGE OFFSET cases, because
the executor expects to have an ordering column.  I thought for a bit
about fixing that by forcing the planner to emit the ordering column for
RANGE OFFSET even if it's redundant.  But it's hard to make the existing
logic in get_column_info_for_window do that --- it can't tell which
partitioning column the ordering column was redundant with, and even if it
could, that column might've been of a different data type.  So eventually
I just threw away that redundant-key-elimination logic altogether.
I believe that we never thought it was really useful as an optimization,
but way back when window functions were put in, we didn't have (or at
least didn't think about) a way to identify the partitioning/ordering
columns without reference to the input pathkeys.

With this patch, WindowAggPath.winpathkeys is not used for anything
anymore.  I'm inclined to get rid of it, though I didn't do so here.
(If we keep it, we at least need to adjust the comment in relation.h
that claims createplan.c needs it.)

The other issue here is that nodeWindowAgg's setup logic is not quite
consistent with update_frameheadpos and update_frametailpos about when
to create tuplestore read pointers and slots.  We might've prevented
all the inconsistent cases with the parser and planner fixes, but it
seems best to make them really consistent anyway, so I changed that.

Draft patch attached.

            regards, tom lane

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 968d5d3..0db193b 100644
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
*************** begin_partition(WindowAggState *winstate
*** 1079,1084 ****
--- 1079,1085 ----
  {
      WindowAgg  *node = (WindowAgg *) winstate->ss.ps.plan;
      PlanState  *outerPlan = outerPlanState(winstate);
+     int            frameOptions = winstate->frameOptions;
      int            numfuncs = winstate->numfuncs;
      int            i;

*************** begin_partition(WindowAggState *winstate
*** 1143,1150 ****
           * If the frame head is potentially movable, or we have an EXCLUSION
           * clause, we might need to restart aggregation ...
           */
!         if (!(winstate->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) ||
!             (winstate->frameOptions & FRAMEOPTION_EXCLUSION))
          {
              /* ... so create a mark pointer to track the frame head */
              agg_winobj->markptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
--- 1144,1151 ----
           * If the frame head is potentially movable, or we have an EXCLUSION
           * clause, we might need to restart aggregation ...
           */
!         if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING) ||
!             (frameOptions & FRAMEOPTION_EXCLUSION))
          {
              /* ... so create a mark pointer to track the frame head */
              agg_winobj->markptr = tuplestore_alloc_read_pointer(winstate->buffer, 0);
*************** begin_partition(WindowAggState *winstate
*** 1182,1202 ****

      /*
       * If we are in RANGE or GROUPS mode, then determining frame boundaries
!      * requires physical access to the frame endpoint rows, except in
       * degenerate cases.  We create read pointers to point to those rows, to
       * simplify access and ensure that the tuplestore doesn't discard the
!      * endpoint rows prematurely.  (Must match logic in update_frameheadpos
!      * and update_frametailpos.)
       */
      winstate->framehead_ptr = winstate->frametail_ptr = -1; /* if not used */

!     if ((winstate->frameOptions & (FRAMEOPTION_RANGE | FRAMEOPTION_GROUPS)) &&
!         node->ordNumCols != 0)
      {
!         if (!(winstate->frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
              winstate->framehead_ptr =
                  tuplestore_alloc_read_pointer(winstate->buffer, 0);
!         if (!(winstate->frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING))
              winstate->frametail_ptr =
                  tuplestore_alloc_read_pointer(winstate->buffer, 0);
      }
--- 1183,1206 ----

      /*
       * If we are in RANGE or GROUPS mode, then determining frame boundaries
!      * requires physical access to the frame endpoint rows, except in certain
       * degenerate cases.  We create read pointers to point to those rows, to
       * simplify access and ensure that the tuplestore doesn't discard the
!      * endpoint rows prematurely.  (Must create pointers in exactly the same
!      * cases that update_frameheadpos and update_frametailpos need them.)
       */
      winstate->framehead_ptr = winstate->frametail_ptr = -1; /* if not used */

!     if (frameOptions & (FRAMEOPTION_RANGE | FRAMEOPTION_GROUPS))
      {
!         if (((frameOptions & FRAMEOPTION_START_CURRENT_ROW) &&
!              node->ordNumCols != 0) ||
!             (frameOptions & FRAMEOPTION_START_OFFSET))
              winstate->framehead_ptr =
                  tuplestore_alloc_read_pointer(winstate->buffer, 0);
!         if (((frameOptions & FRAMEOPTION_END_CURRENT_ROW) &&
!              node->ordNumCols != 0) ||
!             (frameOptions & FRAMEOPTION_END_OFFSET))
              winstate->frametail_ptr =
                  tuplestore_alloc_read_pointer(winstate->buffer, 0);
      }
*************** begin_partition(WindowAggState *winstate
*** 1210,1217 ****
       */
      winstate->grouptail_ptr = -1;

!     if ((winstate->frameOptions & (FRAMEOPTION_EXCLUDE_GROUP |
!                                    FRAMEOPTION_EXCLUDE_TIES)) &&
          node->ordNumCols != 0)
      {
          winstate->grouptail_ptr =
--- 1214,1221 ----
       */
      winstate->grouptail_ptr = -1;

!     if ((frameOptions & (FRAMEOPTION_EXCLUDE_GROUP |
!                          FRAMEOPTION_EXCLUDE_TIES)) &&
          node->ordNumCols != 0)
      {
          winstate->grouptail_ptr =
*************** update_frameheadpos(WindowAggState *wins
*** 1563,1568 ****
--- 1567,1575 ----
              bool        sub,
                          less;

+             /* We must have an ordering column */
+             Assert(node->ordNumCols == 1);
+
              /* Precompute flags for in_range checks */
              if (frameOptions & FRAMEOPTION_START_OFFSET_PRECEDING)
                  sub = true;        /* subtract startOffset from current row */
*************** update_frameheadpos(WindowAggState *wins
*** 1643,1648 ****
--- 1650,1660 ----
               * framehead_slot, and advance as necessary.  Note that if we
               * reach end of partition, we will leave frameheadpos = end+1 and
               * framehead_slot empty.
+              *
+              * Note: it's possible to have ordNumCols == 0, in which case
+              * there is only one peer group and we could in principle do this
+              * without the laborious per-row tests.  Doesn't seem worth
+              * expending extra code on such a silly case, though.
               */
              int64        offset = DatumGetInt64(winstate->startOffsetValue);
              int64        minheadgroup;
*************** update_frametailpos(WindowAggState *wins
*** 1814,1819 ****
--- 1826,1834 ----
              bool        sub,
                          less;

+             /* We must have an ordering column */
+             Assert(node->ordNumCols == 1);
+
              /* Precompute flags for in_range checks */
              if (frameOptions & FRAMEOPTION_END_OFFSET_PRECEDING)
                  sub = true;        /* subtract endOffset from current row */
*************** update_frametailpos(WindowAggState *wins
*** 1894,1899 ****
--- 1909,1919 ----
               * of the last-known frame tail row in frametail_slot, and advance
               * as necessary.  Note that if we reach end of partition, we will
               * leave frametailpos = end+1 and frametail_slot empty.
+              *
+              * Note: it's possible to have ordNumCols == 0, in which case
+              * there is only one peer group and we could in principle do this
+              * without the laborious per-row tests.  Doesn't seem worth
+              * expending extra code on such a silly case, though.
               */
              int64        offset = DatumGetInt64(winstate->endOffsetValue);
              int64        maxtailgroup;
*************** ExecInitWindowAgg(WindowAgg *node, EStat
*** 2318,2333 ****
      winstate->temp_slot_2 = ExecInitExtraTupleSlot(estate, scanDesc);

      /*
!      * create frame head and tail slots only if needed (must match logic in
!      * update_frameheadpos and update_frametailpos)
       */
      winstate->framehead_slot = winstate->frametail_slot = NULL;

      if (frameOptions & (FRAMEOPTION_RANGE | FRAMEOPTION_GROUPS))
      {
!         if (!(frameOptions & FRAMEOPTION_START_UNBOUNDED_PRECEDING))
              winstate->framehead_slot = ExecInitExtraTupleSlot(estate, scanDesc);
!         if (!(frameOptions & FRAMEOPTION_END_UNBOUNDED_FOLLOWING))
              winstate->frametail_slot = ExecInitExtraTupleSlot(estate, scanDesc);
      }

--- 2338,2358 ----
      winstate->temp_slot_2 = ExecInitExtraTupleSlot(estate, scanDesc);

      /*
!      * create frame head and tail slots only if needed (must create slots in
!      * exactly the same cases that update_frameheadpos and update_frametailpos
!      * need them)
       */
      winstate->framehead_slot = winstate->frametail_slot = NULL;

      if (frameOptions & (FRAMEOPTION_RANGE | FRAMEOPTION_GROUPS))
      {
!         if (((frameOptions & FRAMEOPTION_START_CURRENT_ROW) &&
!              node->ordNumCols != 0) ||
!             (frameOptions & FRAMEOPTION_START_OFFSET))
              winstate->framehead_slot = ExecInitExtraTupleSlot(estate, scanDesc);
!         if (((frameOptions & FRAMEOPTION_END_CURRENT_ROW) &&
!              node->ordNumCols != 0) ||
!             (frameOptions & FRAMEOPTION_END_OFFSET))
              winstate->frametail_slot = ExecInitExtraTupleSlot(estate, scanDesc);
      }

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index cf82b70..4186e20 100644
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static WindowAgg *create_windowagg_plan(
*** 107,121 ****
  static SetOp *create_setop_plan(PlannerInfo *root, SetOpPath *best_path,
                    int flags);
  static RecursiveUnion *create_recursiveunion_plan(PlannerInfo *root, RecursiveUnionPath *best_path);
- static void get_column_info_for_window(PlannerInfo *root, WindowClause *wc,
-                            List *tlist,
-                            int numSortCols, AttrNumber *sortColIdx,
-                            int *partNumCols,
-                            AttrNumber **partColIdx,
-                            Oid **partOperators,
-                            int *ordNumCols,
-                            AttrNumber **ordColIdx,
-                            Oid **ordOperators);
  static LockRows *create_lockrows_plan(PlannerInfo *root, LockRowsPath *best_path,
                       int flags);
  static ModifyTable *create_modifytable_plan(PlannerInfo *root, ModifyTablePath *best_path);
--- 107,112 ----
*************** create_windowagg_plan(PlannerInfo *root,
*** 2160,2178 ****
  {
      WindowAgg  *plan;
      WindowClause *wc = best_path->winclause;
      Plan       *subplan;
      List       *tlist;
-     int            numsortkeys;
-     AttrNumber *sortColIdx;
-     Oid           *sortOperators;
-     Oid           *collations;
-     bool       *nullsFirst;
      int            partNumCols;
      AttrNumber *partColIdx;
      Oid           *partOperators;
      int            ordNumCols;
      AttrNumber *ordColIdx;
      Oid           *ordOperators;

      /*
       * WindowAgg can project, so no need to be terribly picky about child
--- 2151,2167 ----
  {
      WindowAgg  *plan;
      WindowClause *wc = best_path->winclause;
+     int            numPart = list_length(wc->partitionClause);
+     int            numOrder = list_length(wc->orderClause);
      Plan       *subplan;
      List       *tlist;
      int            partNumCols;
      AttrNumber *partColIdx;
      Oid           *partOperators;
      int            ordNumCols;
      AttrNumber *ordColIdx;
      Oid           *ordOperators;
+     ListCell   *lc;

      /*
       * WindowAgg can project, so no need to be terribly picky about child
*************** create_windowagg_plan(PlannerInfo *root,
*** 2183,2214 ****
      tlist = build_path_tlist(root, &best_path->path);

      /*
!      * We shouldn't need to actually sort, but it's convenient to use
!      * prepare_sort_from_pathkeys to identify the input's sort columns.
       */
!     subplan = prepare_sort_from_pathkeys(subplan,
!                                          best_path->winpathkeys,
!                                          NULL,
!                                          NULL,
!                                          false,
!                                          &numsortkeys,
!                                          &sortColIdx,
!                                          &sortOperators,
!                                          &collations,
!                                          &nullsFirst);

!     /* Now deconstruct that into partition and ordering portions */
!     get_column_info_for_window(root,
!                                wc,
!                                subplan->targetlist,
!                                numsortkeys,
!                                sortColIdx,
!                                &partNumCols,
!                                &partColIdx,
!                                &partOperators,
!                                &ordNumCols,
!                                &ordColIdx,
!                                &ordOperators);

      /* And finally we can make the WindowAgg node */
      plan = make_windowagg(tlist,
--- 2172,2214 ----
      tlist = build_path_tlist(root, &best_path->path);

      /*
!      * Convert SortGroupClause lists into arrays of attr indexes and equality
!      * operators, as wanted by executor.  (Note: in principle, it's possible
!      * to drop some of the sort columns, if they were proved redundant by
!      * pathkey logic.  However, it doesn't seem worth going out of our way to
!      * optimize such cases.  In any case, we must *not* remove the ordering
!      * column for RANGE OFFSET cases, as the executor needs that for in_range
!      * tests even if it's known to be equal to some partitioning column.)
       */
!     partColIdx = (AttrNumber *) palloc(sizeof(AttrNumber) * numPart);
!     partOperators = (Oid *) palloc(sizeof(Oid) * numPart);

!     partNumCols = 0;
!     foreach(lc, wc->partitionClause)
!     {
!         SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
!         TargetEntry *tle = get_sortgroupclause_tle(sgc, subplan->targetlist);
!
!         Assert(OidIsValid(sgc->eqop));
!         partColIdx[partNumCols] = tle->resno;
!         partOperators[partNumCols] = sgc->eqop;
!         partNumCols++;
!     }
!
!     ordColIdx = (AttrNumber *) palloc(sizeof(AttrNumber) * numOrder);
!     ordOperators = (Oid *) palloc(sizeof(Oid) * numOrder);
!
!     ordNumCols = 0;
!     foreach(lc, wc->orderClause)
!     {
!         SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
!         TargetEntry *tle = get_sortgroupclause_tle(sgc, subplan->targetlist);
!
!         Assert(OidIsValid(sgc->eqop));
!         ordColIdx[ordNumCols] = tle->resno;
!         ordOperators[ordNumCols] = sgc->eqop;
!         ordNumCols++;
!     }

      /* And finally we can make the WindowAgg node */
      plan = make_windowagg(tlist,
*************** create_windowagg_plan(PlannerInfo *root,
*** 2235,2346 ****
  }

  /*
-  * get_column_info_for_window
-  *        Get the partitioning/ordering column numbers and equality operators
-  *        for a WindowAgg node.
-  *
-  * This depends on the behavior of planner.c's make_pathkeys_for_window!
-  *
-  * We are given the target WindowClause and an array of the input column
-  * numbers associated with the resulting pathkeys.  In the easy case, there
-  * are the same number of pathkey columns as partitioning + ordering columns
-  * and we just have to copy some data around.  However, it's possible that
-  * some of the original partitioning + ordering columns were eliminated as
-  * redundant during the transformation to pathkeys.  (This can happen even
-  * though the parser gets rid of obvious duplicates.  A typical scenario is a
-  * window specification "PARTITION BY x ORDER BY y" coupled with a clause
-  * "WHERE x = y" that causes the two sort columns to be recognized as
-  * redundant.)    In that unusual case, we have to work a lot harder to
-  * determine which keys are significant.
-  *
-  * The method used here is a bit brute-force: add the sort columns to a list
-  * one at a time and note when the resulting pathkey list gets longer.  But
-  * it's a sufficiently uncommon case that a faster way doesn't seem worth
-  * the amount of code refactoring that'd be needed.
-  */
- static void
- get_column_info_for_window(PlannerInfo *root, WindowClause *wc, List *tlist,
-                            int numSortCols, AttrNumber *sortColIdx,
-                            int *partNumCols,
-                            AttrNumber **partColIdx,
-                            Oid **partOperators,
-                            int *ordNumCols,
-                            AttrNumber **ordColIdx,
-                            Oid **ordOperators)
- {
-     int            numPart = list_length(wc->partitionClause);
-     int            numOrder = list_length(wc->orderClause);
-
-     if (numSortCols == numPart + numOrder)
-     {
-         /* easy case */
-         *partNumCols = numPart;
-         *partColIdx = sortColIdx;
-         *partOperators = extract_grouping_ops(wc->partitionClause);
-         *ordNumCols = numOrder;
-         *ordColIdx = sortColIdx + numPart;
-         *ordOperators = extract_grouping_ops(wc->orderClause);
-     }
-     else
-     {
-         List       *sortclauses;
-         List       *pathkeys;
-         int            scidx;
-         ListCell   *lc;
-
-         /* first, allocate what's certainly enough space for the arrays */
-         *partNumCols = 0;
-         *partColIdx = (AttrNumber *) palloc(numPart * sizeof(AttrNumber));
-         *partOperators = (Oid *) palloc(numPart * sizeof(Oid));
-         *ordNumCols = 0;
-         *ordColIdx = (AttrNumber *) palloc(numOrder * sizeof(AttrNumber));
-         *ordOperators = (Oid *) palloc(numOrder * sizeof(Oid));
-         sortclauses = NIL;
-         pathkeys = NIL;
-         scidx = 0;
-         foreach(lc, wc->partitionClause)
-         {
-             SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
-             List       *new_pathkeys;
-
-             sortclauses = lappend(sortclauses, sgc);
-             new_pathkeys = make_pathkeys_for_sortclauses(root,
-                                                          sortclauses,
-                                                          tlist);
-             if (list_length(new_pathkeys) > list_length(pathkeys))
-             {
-                 /* this sort clause is actually significant */
-                 (*partColIdx)[*partNumCols] = sortColIdx[scidx++];
-                 (*partOperators)[*partNumCols] = sgc->eqop;
-                 (*partNumCols)++;
-                 pathkeys = new_pathkeys;
-             }
-         }
-         foreach(lc, wc->orderClause)
-         {
-             SortGroupClause *sgc = (SortGroupClause *) lfirst(lc);
-             List       *new_pathkeys;
-
-             sortclauses = lappend(sortclauses, sgc);
-             new_pathkeys = make_pathkeys_for_sortclauses(root,
-                                                          sortclauses,
-                                                          tlist);
-             if (list_length(new_pathkeys) > list_length(pathkeys))
-             {
-                 /* this sort clause is actually significant */
-                 (*ordColIdx)[*ordNumCols] = sortColIdx[scidx++];
-                 (*ordOperators)[*ordNumCols] = sgc->eqop;
-                 (*ordNumCols)++;
-                 pathkeys = new_pathkeys;
-             }
-         }
-         /* complain if we didn't eat exactly the right number of sort cols */
-         if (scidx != numSortCols)
-             elog(ERROR, "failed to deconstruct sort operators into partitioning/ordering operators");
-     }
- }
-
- /*
   * create_setop_plan
   *
   *      Create a SetOp plan for 'best_path' and (recursively) plans
--- 2235,2240 ----
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index fd45c97..2c31b57 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** make_window_input_target(PlannerInfo *ro
*** 5466,5473 ****
   * The required ordering is first the PARTITION keys, then the ORDER keys.
   * In the future we might try to implement windowing using hashing, in which
   * case the ordering could be relaxed, but for now we always sort.
-  *
-  * Caution: if you change this, see createplan.c's get_column_info_for_window!
   */
  static List *
  make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc,
--- 5466,5471 ----
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index e147880..cfd4b91 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*************** transformWindowDefinitions(ParseState *p
*** 2795,2800 ****
--- 2795,2810 ----
              wc->inRangeNullsFirst = sortcl->nulls_first;
          }

+         /* Per spec, GROUPS mode requires an ORDER BY clause */
+         if (wc->frameOptions & FRAMEOPTION_GROUPS)
+         {
+             if (wc->orderClause == NIL)
+                 ereport(ERROR,
+                         (errcode(ERRCODE_WINDOWING_ERROR),
+                          errmsg("GROUPS mode requires an ORDER BY clause"),
+                          parser_errposition(pstate, windef->location)));
+         }
+
          /* Process frame offset expressions */
          wc->startOffset = transformFrameOffset(pstate, wc->frameOptions,
                                                 rangeopfamily, rangeopcintype,
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 85d81e7..562006a 100644
*** a/src/test/regress/expected/window.out
--- b/src/test/regress/expected/window.out
*************** SELECT count(*) OVER (PARTITION BY four)
*** 2834,2839 ****
--- 2834,2934 ----
  -------
  (0 rows)

+ -- check some degenerate cases
+ create temp table t1 (f1 int, f2 int8);
+ insert into t1 values (1,1),(1,2),(2,2);
+ select f1, sum(f1) over (partition by f1
+                          range between 1 preceding and 1 following)
+ from t1 where f1 = f2;  -- error, must have order by
+ ERROR:  RANGE with offset PRECEDING/FOLLOWING requires exactly one ORDER BY column
+ LINE 1: select f1, sum(f1) over (partition by f1
+                                 ^
+ explain (costs off)
+ select f1, sum(f1) over (partition by f1 order by f2
+                          range between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+            QUERY PLAN
+ ---------------------------------
+  WindowAgg
+    ->  Sort
+          Sort Key: f1
+          ->  Seq Scan on t1
+                Filter: (f1 = f2)
+ (5 rows)
+
+ select f1, sum(f1) over (partition by f1 order by f2
+                          range between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+  f1 | sum
+ ----+-----
+   1 |   1
+   2 |   2
+ (2 rows)
+
+ select f1, sum(f1) over (partition by f1, f1 order by f2
+                          range between 2 preceding and 1 preceding)
+ from t1 where f1 = f2;
+  f1 | sum
+ ----+-----
+   1 |
+   2 |
+ (2 rows)
+
+ select f1, sum(f1) over (partition by f1, f2 order by f2
+                          range between 1 following and 2 following)
+ from t1 where f1 = f2;
+  f1 | sum
+ ----+-----
+   1 |
+   2 |
+ (2 rows)
+
+ select f1, sum(f1) over (partition by f1
+                          groups between 1 preceding and 1 following)
+ from t1 where f1 = f2;  -- error, must have order by
+ ERROR:  GROUPS mode requires an ORDER BY clause
+ LINE 1: select f1, sum(f1) over (partition by f1
+                                 ^
+ explain (costs off)
+ select f1, sum(f1) over (partition by f1 order by f2
+                          groups between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+            QUERY PLAN
+ ---------------------------------
+  WindowAgg
+    ->  Sort
+          Sort Key: f1
+          ->  Seq Scan on t1
+                Filter: (f1 = f2)
+ (5 rows)
+
+ select f1, sum(f1) over (partition by f1 order by f2
+                          groups between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+  f1 | sum
+ ----+-----
+   1 |   1
+   2 |   2
+ (2 rows)
+
+ select f1, sum(f1) over (partition by f1, f1 order by f2
+                          groups between 2 preceding and 1 preceding)
+ from t1 where f1 = f2;
+  f1 | sum
+ ----+-----
+   1 |
+   2 |
+ (2 rows)
+
+ select f1, sum(f1) over (partition by f1, f2 order by f2
+                          groups between 1 following and 2 following)
+ from t1 where f1 = f2;
+  f1 | sum
+ ----+-----
+   1 |
+   2 |
+ (2 rows)
+
  -- ordering by a non-integer constant is allowed
  SELECT rank() OVER (ORDER BY length('abc'));
   rank
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 051b50b..e2943a3 100644
*** a/src/test/regress/sql/window.sql
--- b/src/test/regress/sql/window.sql
*************** WINDOW w AS (ORDER BY x groups between 1
*** 795,800 ****
--- 795,838 ----
  -- with UNION
  SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;

+ -- check some degenerate cases
+ create temp table t1 (f1 int, f2 int8);
+ insert into t1 values (1,1),(1,2),(2,2);
+
+ select f1, sum(f1) over (partition by f1
+                          range between 1 preceding and 1 following)
+ from t1 where f1 = f2;  -- error, must have order by
+ explain (costs off)
+ select f1, sum(f1) over (partition by f1 order by f2
+                          range between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+ select f1, sum(f1) over (partition by f1 order by f2
+                          range between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+ select f1, sum(f1) over (partition by f1, f1 order by f2
+                          range between 2 preceding and 1 preceding)
+ from t1 where f1 = f2;
+ select f1, sum(f1) over (partition by f1, f2 order by f2
+                          range between 1 following and 2 following)
+ from t1 where f1 = f2;
+
+ select f1, sum(f1) over (partition by f1
+                          groups between 1 preceding and 1 following)
+ from t1 where f1 = f2;  -- error, must have order by
+ explain (costs off)
+ select f1, sum(f1) over (partition by f1 order by f2
+                          groups between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+ select f1, sum(f1) over (partition by f1 order by f2
+                          groups between 1 preceding and 1 following)
+ from t1 where f1 = f2;
+ select f1, sum(f1) over (partition by f1, f1 order by f2
+                          groups between 2 preceding and 1 preceding)
+ from t1 where f1 = f2;
+ select f1, sum(f1) over (partition by f1, f2 order by f2
+                          groups between 1 following and 2 following)
+ from t1 where f1 = f2;
+
  -- ordering by a non-integer constant is allowed
  SELECT rank() OVER (ORDER BY length('abc'));


Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Masahiko Sawada
Дата:
On Wed, Jul 11, 2018 at 4:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
>> BTW, I found an another but related issue. We can got an assertion
>> failure also by the following query.
>
>> =# select sum(c) over (partition by c order by c groups between 1
>> preceding and current row) from test;
>
> Oh, good point, that's certainly legal per spec, so we'd need to do
> something about it.
>
> The proximate cause of the problem, I think, is that the planner throws
> away the "order by c" as being redundant because it already sorted by c
> for the partitioning requirement.  So we end up with ordNumCols = 0
> even though the query had ORDER BY.

Yeah, I think so too.

>
> This breaks not only GROUPS cases, but also RANGE OFFSET cases, because
> the executor expects to have an ordering column.  I thought for a bit
> about fixing that by forcing the planner to emit the ordering column for
> RANGE OFFSET even if it's redundant.  But it's hard to make the existing
> logic in get_column_info_for_window do that --- it can't tell which
> partitioning column the ordering column was redundant with, and even if it
> could, that column might've been of a different data type.  So eventually
> I just threw away that redundant-key-elimination logic altogether.
> I believe that we never thought it was really useful as an optimization,
> but way back when window functions were put in, we didn't have (or at
> least didn't think about) a way to identify the partitioning/ordering
> columns without reference to the input pathkeys.
>

Agreed.

> With this patch, WindowAggPath.winpathkeys is not used for anything
> anymore.  I'm inclined to get rid of it, though I didn't do so here.
> (If we keep it, we at least need to adjust the comment in relation.h
> that claims createplan.c needs it.)

+1 for removing.

>
> The other issue here is that nodeWindowAgg's setup logic is not quite
> consistent with update_frameheadpos and update_frametailpos about when
> to create tuplestore read pointers and slots.  We might've prevented
> all the inconsistent cases with the parser and planner fixes, but it
> seems best to make them really consistent anyway, so I changed that.
>
> Draft patch attached.
>

Thank you for committing!
I think we also can update the doc about that GROUPS offset mode
requires ORDER BY clause. Thoughts? Attached patch updates it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Вложения

Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Tom Lane
Дата:
Masahiko Sawada <sawada.mshk@gmail.com> writes:
> I think we also can update the doc about that GROUPS offset mode
> requires ORDER BY clause. Thoughts? Attached patch updates it.

Ooops, I forgot to check the docs.  This isn't quite the right fix
though --- the problem is that there's a sentence at the end of that
para that now says exactly the wrong thing.  I fixed that.

            regards, tom lane


Re: Failure assertion in GROUPS mode of window function in current HEAD

От
Masahiko Sawada
Дата:
On Fri, Jul 13, 2018 at 12:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
>> I think we also can update the doc about that GROUPS offset mode
>> requires ORDER BY clause. Thoughts? Attached patch updates it.
>
> Ooops, I forgot to check the docs.  This isn't quite the right fix
> though --- the problem is that there's a sentence at the end of that
> para that now says exactly the wrong thing.  I fixed that.
>

Yeah, that's not the right fix. Thank you for fixing!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center