Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .
От | Amit Kapila |
---|---|
Тема | Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE . |
Дата | |
Msg-id | CAA4eK1L4fTEebm5dvVw3G04Ksv-ya+QinnuuC7+97Q6xa7ASFQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE . (Masahiko Sawada <sawada.mshk@gmail.com>) |
Ответы |
Re: Actuall row count of Parallel Seq Scan in EXPLAIN
ANALYZE .
(Masahiko Sawada <sawada.mshk@gmail.com>)
Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE . (Satoshi Nagayasu <snaga@uptime.jp>) Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE . (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Список | pgsql-hackers |
On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
> fine for parallel seq scan.
>
> postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> Finalize Aggregate (cost=217018.55..217018.56 rows=1 width=8)
> (actual time=2640.015..2640.015 rows=1 loops=1)
> Output: count(*)
> -> Gather (cost=217018.33..217018.54 rows=2 width=8) (actual
> time=2639.064..2640.002 rows=3 loops=1)
> Output: (PARTIAL count(*))
> Workers Planned: 2
> Workers Launched: 2
> -> Partial Aggregate (cost=216018.33..216018.34 rows=1
> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
> Output: PARTIAL count(*)
> Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
> Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
> -> Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..205601.67 rows=4166667 width=0) (actual
> time=0.042..1685.542 rows=3333333 loops=3)
> Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
> Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
> Planning time: 1.026 ms
> Execution time: 2640.225 ms
> (15 rows)
>
> For example, the above result shows,
> Parallel Seq Scan : actual rows = 3333333
> worker 0 : actual rows = 3457968
> worker 1 : actual rows = 3741069
> Summation of these is 10532370, but actual total rows is 10000000.
> I think that Parallel Seq Scan should show actual rows =
> 10000000(total rows) or actual rows = 2800963(rows collected by
> itself). (10000000 maybe better)
>
You have to read the rows at Parallel Seq Scan nodes as total count of rows, but you have to consider the loops parameter as well.
>
> After spent time to investigate this behaviour, ISTM that the problem
> is nloops of Parallel Seq Scan.
> Parallel Seq Scan is done only once, but nloops is incremented to 3.
> So its "actual rows" is calculated 3333333(10000000 / 3) at explain.c:L1223.
>
> Hi all,
>
> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
> fine for parallel seq scan.
>
> postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> Finalize Aggregate (cost=217018.55..217018.56 rows=1 width=8)
> (actual time=2640.015..2640.015 rows=1 loops=1)
> Output: count(*)
> -> Gather (cost=217018.33..217018.54 rows=2 width=8) (actual
> time=2639.064..2640.002 rows=3 loops=1)
> Output: (PARTIAL count(*))
> Workers Planned: 2
> Workers Launched: 2
> -> Partial Aggregate (cost=216018.33..216018.34 rows=1
> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
> Output: PARTIAL count(*)
> Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
> Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
> -> Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..205601.67 rows=4166667 width=0) (actual
> time=0.042..1685.542 rows=3333333 loops=3)
> Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
> Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
> Planning time: 1.026 ms
> Execution time: 2640.225 ms
> (15 rows)
>
> For example, the above result shows,
> Parallel Seq Scan : actual rows = 3333333
> worker 0 : actual rows = 3457968
> worker 1 : actual rows = 3741069
> Summation of these is 10532370, but actual total rows is 10000000.
> I think that Parallel Seq Scan should show actual rows =
> 10000000(total rows) or actual rows = 2800963(rows collected by
> itself). (10000000 maybe better)
>
You have to read the rows at Parallel Seq Scan nodes as total count of rows, but you have to consider the loops parameter as well.
>
> After spent time to investigate this behaviour, ISTM that the problem
> is nloops of Parallel Seq Scan.
> Parallel Seq Scan is done only once, but nloops is incremented to 3.
>
nloops here indicates, that it is done for 2 workers and a master backend.
> So its "actual rows" is calculated 3333333(10000000 / 3) at explain.c:L1223.
>
Thats how it should be considered. You might want to compare the behaviour with other cases where value of nloops is used.
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Masahiko SawadaДата:
Сообщение: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .
Следующее
От: Masahiko SawadaДата:
Сообщение: Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .