Обсуждение: EXPLAIN of Parallel Append

Поиск

Список

Период

Сортировка

EXPLAIN of Parallel Append

От

Jesper Pedersen

Дата:

14 марта 2018 г., 21:28:29

Hi,

Given

-- test.sql --
CREATE TABLE t1 ( 

     a integer NOT NULL,
     b integer NOT NULL
) PARTITION BY HASH (b);
CREATE TABLE t1_p00 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 0);
CREATE TABLE t1_p01 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 1);
CREATE TABLE t1_p02 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 2);
CREATE TABLE t1_p03 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, 
REMAINDER 3);
INSERT INTO t1 (SELECT i, i FROM generate_series(1, 1000000) AS i);
ANALYZE;
-- test.sql --

Running

EXPLAIN (ANALYZE) SELECT * FROM t1 WHERE a = 5432;

gives

  Gather  (cost=1000.00..12780.36 rows=4 width=8) (actual 
time=61.270..61.309 rows=1 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Parallel Append  (cost=0.00..11779.96 rows=4 width=8) (actual 
time=38.915..57.209 rows=0 loops=3)
          ->  Parallel Seq Scan on t1_p01  (cost=0.00..2949.00 rows=1 
width=8) (actual time=38.904..38.904 rows=0 loops=1)
                Filter: (a = 5432)
                Rows Removed by Filter: 250376
          ->  Parallel Seq Scan on t1_p03  (cost=0.00..2948.07 rows=1 
width=8) (actual time=0.369..47.909 rows=1 loops=1)
                Filter: (a = 5432)
                Rows Removed by Filter: 250248
          ->  Parallel Seq Scan on t1_p02  (cost=0.00..2942.66 rows=1 
width=8) (actual time=11.354..11.354 rows=0 loops=3)
                Filter: (a = 5432)
                Rows Removed by Filter: 83262
          ->  Parallel Seq Scan on t1_p00  (cost=0.00..2940.21 rows=1 
width=8) (actual time=50.745..50.745 rows=0 loops=1)
                Filter: (a = 5432)
                Rows Removed by Filter: 249589
  Planning time: 0.381 ms
  Execution time: 62.810 ms
(18 rows)

Parallel Append's ntuples is 1, but given nloops is 3 you end up with 
the slightly confusing "(actual ... *rows=0* loops=3)".

Using master (3b7ab438).

Thoughts ?

Best regards,
  Jesper

Re: EXPLAIN of Parallel Append

От

Amit Kapila

Дата:

07 июля 2018 г., 11:08:03

On Wed, Mar 14, 2018 at 8:58 PM, Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:
> Hi,
>
> Given
>
> -- test.sql --
> CREATE TABLE t1 (
>     a integer NOT NULL,
>     b integer NOT NULL
> ) PARTITION BY HASH (b);
> CREATE TABLE t1_p00 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, REMAINDER
> 0);
> CREATE TABLE t1_p01 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, REMAINDER
> 1);
> CREATE TABLE t1_p02 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, REMAINDER
> 2);
> CREATE TABLE t1_p03 PARTITION OF t1 FOR VALUES WITH (MODULUS 4, REMAINDER
> 3);
> INSERT INTO t1 (SELECT i, i FROM generate_series(1, 1000000) AS i);
> ANALYZE;
> -- test.sql --
>
> Running
>
> EXPLAIN (ANALYZE) SELECT * FROM t1 WHERE a = 5432;
>
> gives
>
>  Gather  (cost=1000.00..12780.36 rows=4 width=8) (actual time=61.270..61.309
> rows=1 loops=1)
>    Workers Planned: 2
>    Workers Launched: 2
>    ->  Parallel Append  (cost=0.00..11779.96 rows=4 width=8) (actual
> time=38.915..57.209 rows=0 loops=3)
..
..
> (18 rows)
>
> Parallel Append's ntuples is 1, but given nloops is 3 you end up with the
> slightly confusing "(actual ... *rows=0* loops=3)".
>

The number of rows displayed is total_rows / loops due to which you
are seeing these numbers.  This behavior is the same for all parallel
nodes, nothing specific to Parallel Append.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: EXPLAIN of Parallel Append

От

Jesper Pedersen

Дата:

09 июля 2018 г., 19:30:43

Hi Amit,

On 07/07/2018 01:08 AM, Amit Kapila wrote:
> On Wed, Mar 14, 2018 at 8:58 PM, Jesper Pedersen
>> Parallel Append's ntuples is 1, but given nloops is 3 you end up with the
>> slightly confusing "(actual ... *rows=0* loops=3)".
>>
> 
> The number of rows displayed is total_rows / loops due to which you
> are seeing these numbers.  This behavior is the same for all parallel
> nodes, nothing specific to Parallel Append.
> 

Thanks !

Maybe something like the attached patch for the documentation is needed.

Best regards,
  Jesper

Вложения

parallel-docs.patch

Re: EXPLAIN of Parallel Append

От

Amit Kapila

Дата:

10 июля 2018 г., 18:49:52

On Mon, Jul 9, 2018 at 7:00 PM, Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:
> On 07/07/2018 01:08 AM, Amit Kapila wrote:
>>
>> On Wed, Mar 14, 2018 at 8:58 PM, Jesper Pedersen
>>>
>>> Parallel Append's ntuples is 1, but given nloops is 3 you end up with the
>>> slightly confusing "(actual ... *rows=0* loops=3)".
>>>
>>
>> The number of rows displayed is total_rows / loops due to which you
>> are seeing these numbers.  This behavior is the same for all parallel
>> nodes, nothing specific to Parallel Append.
>>
>
> Thanks !
>
> Maybe something like the attached patch for the documentation is needed.
>

-    performance characteristics of the plan.
+    performance characteristics of the plan. Note, that the parallel nodes
+    may report zero rows returned due internal calculations when one or more
+    rows are actually being returned.

typo.
/due/due to

I think it is quite unclear what you mean by internal calculations.
If you can come up with something similar to how we have already
explained similar thing for Nest Loop Joins [1], then it would be
great, you can add something like what you have written at the end of
the paragraph after explaining the actual calculation.  This is quite
a common confusion since the parallel query is developed; if you can
write some nice example and text, it would be really helpful.

[1] -
https://www.postgresql.org/docs/devel/static/using-explain.html#USING-EXPLAIN-ANALYZE

Refer below text on that link:
"In some query plans, it is possible for a subplan node to be executed
more than once. For example, the inner index scan will be executed
once per outer row in the above nested-loop plan. In such cases, the
loops value reports the total number of executions of the node, and
the actual time and rows values shown are averages per-execution. This
is done to make the numbers comparable with the way that the cost
estimates are shown. Multiply by the loops value to get the total time
actually spent in the node."

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: EXPLAIN of Parallel Append

EXPLAIN of Parallel Append

Re: EXPLAIN of Parallel Append

Re: EXPLAIN of Parallel Append

Вложения

Re: EXPLAIN of Parallel Append