Thread: Why is a hash join preferred when it does not fit in work_mem


Why is a hash join preferred when it does not fit in work_mem

From:
Dimitrios Apostolou
Date:
Hello list,

I have a very simple NATURAL JOIN that does not fit in work_mem.  Why
does the query planner prefer a hash join that needs 361s, while with a
sort operation and a merge join it takes only 13s?

The server is an old Mac Mini with hard disk drive and only 4GB RAM.
Postgres version info:

  PostgreSQL 15.0 on x86_64-apple-darwin20.6.0, compiled by Apple clang
version 12.0.0 (clang-1200.0.32.29), 64-bit

The low work_mem and the disabled memoization are set on purpose, in order
to simplify a complex query, while reproducing the same problem that I
experienced there. This result is the simplest query I could get, where
the optimizer does not go for a faster merge join.

From my point of view a merge join is clearly faster, because the hash
table does not fit in memory and I expect a hash join to do a lot of
random I/O. But the query planner does not see that, and increasing
random_page_cost does not help either. In fact the opposite happens: the
cost gap between the merge join and the hash join grows as I increase
the random page cost!
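
For reference, these are the non-default session settings, matching the
"Settings:" lines in the plans below (the merge-join plan was additionally
taken with enable_hashjoin = off, as its Settings line shows):

SET work_mem = '1MB';
SET hash_mem_multiplier = 1;
SET enable_memoize = off;                    -- disabled on purpose, see above
SET max_parallel_workers_per_gather = 1;
SET effective_cache_size = '500MB';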



# EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS) SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;

                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=121222.68..257633.01 rows=3702994 width=241) (actual time=184498.464..360606.257 rows=3702994 loops=1)
    Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n, workitem_ids.workitem_id
    Inner Unique: true
    Hash Cond: (tasks_mm_workitems.workitem_n = workitem_ids.workitem_n)
    Buffers: shared hit=15068 read=47434, temp read=56309 written=56309
    ->  Seq Scan on public.tasks_mm_workitems  (cost=0.00..53488.94 rows=3702994 width=8) (actual time=0.040..1376.084 rows=3702994 loops=1)
          Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
          Buffers: shared read=16459
    ->  Hash  (cost=59780.19..59780.19 rows=1373719 width=237) (actual time=184361.874..184361.875 rows=1373737 loops=1)
          Output: workitem_ids.workitem_id, workitem_ids.workitem_n
          Buckets: 4096  Batches: 512  Memory Usage: 759kB
          Buffers: shared hit=15068 read=30975, temp written=43092
          ->  Seq Scan on public.workitem_ids  (cost=0.00..59780.19 rows=1373719 width=237) (actual time=0.026..1912.312 rows=1373737 loops=1)
                Output: workitem_ids.workitem_id, workitem_ids.workitem_n
                Buffers: shared hit=15068 read=30975
  Settings: effective_cache_size = '500MB', enable_memoize = 'off', hash_mem_multiplier = '1', max_parallel_workers_per_gather = '1', work_mem = '1MB'
  Planning:
    Buffers: shared hit=2 read=6
  Planning Time: 0.568 ms
  Execution Time: 361106.876 ms
(20 rows)


# EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS) SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;

                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Merge Join  (cost=609453.49..759407.78 rows=3702994 width=241) (actual time=5062.513..10866.313 rows=3702994 loops=1)
    Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n, workitem_ids.workitem_id
    Merge Cond: (workitem_ids.workitem_n = tasks_mm_workitems.workitem_n)
    Buffers: shared hit=5343 read=66053, temp read=32621 written=32894
    ->  Index Scan using workitem_ids_pkey on public.workitem_ids  (cost=0.43..81815.86 rows=1373719 width=237) (actual time=0.111..1218.363 rows=1373737 loops=1)
          Output: workitem_ids.workitem_n, workitem_ids.workitem_id
          Buffers: shared hit=5310 read=49627
    ->  Materialize  (cost=609372.91..627887.88 rows=3702994 width=8) (actual time=5062.389..7392.640 rows=3702994 loops=1)
          Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
          Buffers: shared hit=33 read=16426, temp read=32621 written=32894
          ->  Sort  (cost=609372.91..618630.40 rows=3702994 width=8) (actual time=5062.378..6068.703 rows=3702994 loops=1)
                Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
                Sort Key: tasks_mm_workitems.workitem_n
                Sort Method: external merge  Disk: 65256kB
                Buffers: shared hit=33 read=16426, temp read=32621 written=32894
                ->  Seq Scan on public.tasks_mm_workitems  (cost=0.00..53488.94 rows=3702994 width=8) (actual time=0.045..1177.202 rows=3702994 loops=1)
                      Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
                      Buffers: shared hit=33 read=16426
  Settings: effective_cache_size = '500MB', enable_hashjoin = 'off', enable_memoize = 'off', hash_mem_multiplier = '1', max_parallel_workers_per_gather = '1', work_mem = '1MB'
  Planning:
    Buffers: shared hit=8
  Planning Time: 0.677 ms
  Execution Time: 13364.545 ms
(23 rows)


Thank you in advance,
Dimitris




Re: Why is a hash join preferred when it does not fit in work_mem

From:
David Rowley
Date:
On Fri, 13 Jan 2023 at 07:33, Dimitrios Apostolou <jimis@gmx.net> wrote:
>
> I have a very simple NATURAL JOIN that does not fit in work_mem.  Why
> does the query planner prefer a hash join that needs 361s, while with a
> sort operation and a merge join it takes only 13s?

It's simply that the Hash Join plan appears cheaper based
on the costs that the planner has calculated.

A better question to ask would be: where are the costs inaccurate, and why?

One thing I noticed in your EXPLAIN ANALYZE output is that the Index
Scan on workitem_ids was costed higher than the Seq Scan, yet
was faster.

> ->  Seq Scan on public.workitem_ids  (cost=0.00..59780.19 rows=1373719 width=237) (actual time=0.026..1912.312 rows=1373737 loops=1)

> ->  Index Scan using workitem_ids_pkey on public.workitem_ids (cost=0.43..81815.86 rows=1373719 width=237) (actual time=0.111..1218.363 rows=1373737 loops=1)

Perhaps the Seq scan is doing more actual I/O than the index scan is.

> The low work_mem and the disabled memoization are set on purpose, in order
> to simplify a complex query, while reproducing the same problem that I
> experienced there. This result is the simplest query I could get, where
> the optimizer does not go for a faster merge join.
>
> From my point of view a merge join is clearly faster, because the hash
> table does not fit in memory and I expect a hash join to do a lot of
> random I/O. But the query planner does not see that, and increasing
> random_page_cost does not help either. In fact the opposite happens: the
> cost gap between the merge join and the hash join grows as I increase
> the random page cost!

I'd expect reducing random_page_cost to make the Merge Join cheaper as
that's where the Index Scan is. I'm not quite sure where you think the
random I/O is coming from in a batched hash join.

It would be interesting to see the same plans with SET track_io_timing
= on.  It's possible that there's less *actual* I/O going on with
the Merge Join plan vs the Hash Join plan.  Since we do buffered I/O,
without track_io_timing, we don't know if the read buffers resulted in
an actual disk read or a read from the kernel buffers.
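
For example (just a sketch of what I mean, reusing your query from above):

SET track_io_timing = on;   -- adds "I/O Timings" lines to EXPLAIN (ANALYZE, BUFFERS) output
EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS) SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;
SET enable_hashjoin TO off;  -- and once more for the merge join variant
EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS) SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;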

David



Re: Why is a hash join preferred when it does not fit in work_mem

From:
Dimitrios Apostolou
Date:
On Fri, 13 Jan 2023, David Rowley wrote:
>
> I'd expect reducing random_page_cost to make the Merge Join cheaper as
> that's where the Index Scan is. I'm not quite sure where you think the
> random I/O is coming from in a batched hash join.

Thanks for the feedback, indeed you are right! Decreasing random_page_cost
to values way below the default makes the planner prefer the merge join!
This seems strange to me.

Please correct me if I'm wrong, as I'm a newcomer to PostgreSQL, but here
is how I understand things according to posts I've read, and classical
algorithms:

+ The Hash Join is fastest when one side fits in work_mem. Then on one
   hand you have a hash table lookup (amortized O(1)) and on the other
   hand, if the table has M rows that do not fit in memory, you have
   sequential reads from the disk (given low fragmentation of the table or
   index files): For every row you read from the disk, you look up the key
   in the hash table.

   If the hash table does not fit in RAM then the cost becomes prohibitive.
   Every lookup is a random access possibly hitting the disk. The total
   cost should be random_page_cost * M.

+ The Merge Join involves mostly sequential accesses if the disk files are
   not fragmented. It reads sequentially and in parallel from both tables,
   merging the results where the key matches.

   It requires on-disk sorting (because tables don't fit in work_mem), but
   even this operation requires little disk seeking. A merge-sort algorithm
   might have a random access cost of logN * random_page_cost.

So I would expect an increased random_page_cost to benefit the Merge Join
algorithm. And since my setup involves spinning disks, it does make sense
to increase it.
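
To double-check that, I compared only the planner's estimates while varying
random_page_cost, roughly like this (a sketch; no ANALYZE, since only the
costs matter here):

SET random_page_cost = 4.0;   -- the default, intended for spinning disks
EXPLAIN SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;
-- only a value well below the default makes the planner pick the merge join
SET random_page_cost = 1.1;
EXPLAIN SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;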


> It would be interesting to see the same plans with SET track_io_timing
> = on; set.  It's possible that there's less *actual* I/O going on with
> the Merge Join plan vs the Hash Join plan.  Since we do buffered I/O,
> without track_io_timing, we don't know if the read buffers resulted in
> an actual disk read or a read from the kernel buffers.


The database has been VACUUM ANALYZEd first and is otherwise idle.
Every query has been run twice, and I paste here only the 2nd run.


Slow Hash Join:

# EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS) SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;

  Hash Join  (cost=121222.68..257633.01 rows=3702994 width=241) (actual time=145641.295..349682.387 rows=3702994 loops=1)
    Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n, workitem_ids.workitem_id
    Inner Unique: true
    Hash Cond: (tasks_mm_workitems.workitem_n = workitem_ids.workitem_n)
    Buffers: shared hit=12121 read=50381, temp read=56309 written=56309
    I/O Timings: shared/local read=745.925, temp read=162199.307 write=172758.699
    ->  Seq Scan on public.tasks_mm_workitems  (cost=0.00..53488.94 rows=3702994 width=8) (actual time=0.114..1401.896 rows=3702994 loops=1)
          Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
          Buffers: shared hit=65 read=16394
          I/O Timings: shared/local read=183.959
    ->  Hash  (cost=59780.19..59780.19 rows=1373719 width=237) (actual time=145344.555..145344.557 rows=1373737 loops=1)
          Output: workitem_ids.workitem_id, workitem_ids.workitem_n
          Buckets: 4096  Batches: 512  Memory Usage: 759kB
          Buffers: shared hit=12056 read=33987, temp written=43092
          I/O Timings: shared/local read=561.966, temp write=142221.740
          ->  Seq Scan on public.workitem_ids  (cost=0.00..59780.19 rows=1373719 width=237) (actual time=0.033..1493.652 rows=1373737 loops=1)
                Output: workitem_ids.workitem_id, workitem_ids.workitem_n
                Buffers: shared hit=12056 read=33987
                I/O Timings: shared/local read=561.966
  Settings: effective_cache_size = '500MB', enable_memoize = 'off', hash_mem_multiplier = '1', max_parallel_workers_per_gather = '1', work_mem = '1MB'
  Planning:
    Buffers: shared hit=8
  Planning Time: 0.693 ms
  Execution Time: 350290.496 ms
(24 rows)


Fast Merge Join:

# SET enable_hashjoin TO off;
SET

# EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS) SELECT * FROM tasks_mm_workitems NATURAL JOIN workitem_ids;

  Merge Join  (cost=609453.49..759407.78 rows=3702994 width=241) (actual time=4602.623..9700.435 rows=3702994 loops=1)
    Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n, workitem_ids.workitem_id
    Merge Cond: (workitem_ids.workitem_n = tasks_mm_workitems.workitem_n)
    Buffers: shared hit=5310 read=66086, temp read=32621 written=32894
    I/O Timings: shared/local read=566.121, temp read=228.063 write=526.739
    ->  Index Scan using workitem_ids_pkey on public.workitem_ids  (cost=0.43..81815.86 rows=1373719 width=237) (actual time=0.034..1080.800 rows=1373737 loops=1)
          Output: workitem_ids.workitem_n, workitem_ids.workitem_id
          Buffers: shared hit=5310 read=49627
          I/O Timings: shared/local read=448.952
    ->  Materialize  (cost=609372.91..627887.88 rows=3702994 width=8) (actual time=4602.576..6621.072 rows=3702994 loops=1)
          Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
          Buffers: shared read=16459, temp read=32621 written=32894
          I/O Timings: shared/local read=117.168, temp read=228.063 write=526.739
          ->  Sort  (cost=609372.91..618630.40 rows=3702994 width=8) (actual time=4602.569..5414.072 rows=3702994 loops=1)
                Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
                Sort Key: tasks_mm_workitems.workitem_n
                Sort Method: external merge  Disk: 65256kB
                Buffers: shared read=16459, temp read=32621 written=32894
                I/O Timings: shared/local read=117.168, temp read=228.063 write=526.739
                ->  Seq Scan on public.tasks_mm_workitems  (cost=0.00..53488.94 rows=3702994 width=8) (actual time=0.034..1113.868 rows=3702994 loops=1)
                      Output: tasks_mm_workitems.workitem_n, tasks_mm_workitems.task_n
                      Buffers: shared read=16459
                      I/O Timings: shared/local read=117.168
  Settings: effective_cache_size = '500MB', enable_hashjoin = 'off', enable_memoize = 'off', hash_mem_multiplier = '1', max_parallel_workers_per_gather = '1', work_mem = '1MB'
  Planning:
    Buffers: shared hit=2 read=6
    I/O Timings: shared/local read=0.152
  Planning Time: 0.570 ms
  Execution Time: 12165.894 ms
(29 rows)


Regards,
Dimitris




Re: Why is a hash join preferred when it does not fit in work_mem

From:
Tom Lane
Date:
Dimitrios Apostolou <jimis@gmx.net> writes:
> Please correct me if I'm wrong, as I'm a newcomer to PostgreSQL, but here
> is how I understand things according to posts I've read, and classical
> algorithms:

> + The Hash Join is fastest when one side fits in work_mem. Then on one
>    hand you have a hash table lookup (amortized O(1)) and on the other
>    hand, if the table has M rows that do not fit in memory, you have
>    sequential reads from the disk (given low fragmentation of the table or
>    index files): For every row you read from the disk, you look up the key
>    in the hash table.

>    If the hash table does not fit in RAM then the cost becomes prohibitive.
>    Every lookup is a random access possibly hitting the disk. The total
>    cost should be random_page_cost * M.

That would be true of a simple hash join, but Postgres uses batched
hash joins: we split up the hash key space into subsets, where hopefully
each subset includes few enough inner-side rows to fit into work_mem.
While this can go wrong given pathological distribution of the inner-side
keys, it does mean that the join can perform well even when the inner
side is much larger than work_mem.  So it's not the case that the planner
will simply disregard hash joins beyond work_mem.  It will apply a cost
penalty for the predicted batching overhead; but that can still come out
cheaper than merge join, because the sorting needed for merge is generally
also far from cheap.

> So I would expect an increased random_page_cost to benefit the Merge Join
> algorithm. And since my setup involves spinning disks, it does make sense
> to increase it.

What is probably really happening is that random_page_cost affects the
estimated cost of performing the sort using an index scan instead of
a bespoke sort step.  AFAIR, cost_sort doesn't consider random_page_cost
at all, and neither does cost_hashjoin.

            regards, tom lane



Re: Why is a hash join preferred when it does not fit in work_mem

From:
Dimitrios Apostolou
Date:
On Sat, 14 Jan 2023, Tom Lane wrote:

> Dimitrios Apostolou <jimis@gmx.net> writes:
>> Please correct me if I'm wrong, as I'm a newcomer to PostgreSQL, but here
>> is how I understand things according to posts I've read, and classical
>> algorithms:
>
>> + The Hash Join is fastest when one side fits in work_mem. Then on one
>>    hand you have a hash table lookup (amortized O(1)) and on the other
>>    hand, if the table has M rows that do not fit in memory, you have
>>    sequential reads from the disk (given low fragmentation of the table or
>>    index files): For every row you read from the disk, you look up the key
>>    in the hash table.
>
>>    If the hash table does not fit in RAM then the cost becomes prohibitive.
>>    Every lookup is a random access possibly hitting the disk. The total
>>    cost should be random_page_cost * M.
>
> That would be true of a simple hash join, but Postgres uses batched
> hash joins: we split up the hash key space into subsets, where hopefully
> each subset includes few enough inner-side rows to fit into work_mem.
> While this can go wrong given pathological distribution of the inner-side
> keys, it does mean that the join can perform well even when the inner
> side is much larger than work_mem.  So it's not the case that the planner
> will simply disregard hash joins beyond work_mem.  It will apply a cost
> penalty for the predicted batching overhead;

Thanks for this, I found a page [1] that describes the hash join and
now I understand a bit more.

[1] https://www.interdb.jp/pg/pgsql03.html

I'm not sure whether the key distribution is pathological in my case.
The join condition is:

   Hash Cond: (tasks_mm_workitems.workitem_n = workitem_ids.workitem_n)

and workitem_ids.workitem_n is an integer GENERATED AS IDENTITY and PRIMARY
KEY. The table workitem_ids has about 1.4M rows, and the other table has 3.7M
rows. Neither of them fits in work_mem.

In my (simplified and pathological) case of work_mem == 1MB, the hash join
does 512 batches (Buckets: 4,096 Batches: 512 Memory Usage: 759kB). I'm
not sure which hash-merge strategy is followed, but based on that
document, it should be the "hybrid hash join with skew". I don't quite
follow the I/O requirements of this algorithm, yet. :-)
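
For what it's worth, a rough back-of-the-envelope check (only a sketch; the
real planner also accounts for per-tuple overhead and skew buckets) is
consistent with that batch count: the hashed side is ~1.37M rows of width
237, and that amount of data split into work_mem-sized chunks, rounded up to
a power of two, gives 512:

SELECT pg_size_pretty(1373737::bigint * 237)         AS raw_inner_size,  -- ~310 MB
       ceil(1373737::numeric * 237 / (1024 * 1024))  AS one_mb_chunks;   -- ~311, next power of two is 512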

> but that can still come out
> cheaper than merge join, because the sorting needed for merge is generally
> also far from cheap.

I was under the impression that an on-disk merge sort is a relatively cheap
(logN) operation in terms of random I/O.

>
>> So I would expect an increased random_page_cost to benefit the Merge Join
>> algorithm. And since my setup involves spinning disks, it does make sense
>> to increase it.
>
> What is probably really happening is that random_page_cost affects the
> estimated cost of performing the sort using an index scan instead of
> a bespoke sort step.  AFAIR, cost_sort doesn't consider random_page_cost
> at all, and neither does cost_hashjoin.

On the last EXPLAIN I posted for the forced merge-join, I see that it uses
an index-scan on the "small" table. It makes sense since the join happens
on the primary key of the table. On the large table it does not use an
index scan, because an index doesn't exist for that column. It sorts the
3.7M rows of the table (and FWIW that table only has two integer columns).
That is, if I understood correctly what you meant by "performing the sort
using an index scan".


The problem I see is that the estimated cost of the sort operation is
609,372.91..618,630.40. It's already way above the whole hash-join cost
(121,222.68..257,633.01). However, the actual timings are very different:
the Sort takes 4,602.569..5,414.072 ms while the whole hash
join takes 145,641.295..349,682.387 ms.

Am I missing some configuration knob that would put some sense into the planner?


Thanks,
Dimitris





Re: Why is a hash join preferred when it does not fit in work_mem

From:
Dimitrios Apostolou
Date:
Hello again, I am back with new experiments.

First of all, I have a concrete set of steps that replicates the
slowness of the hash join that I described. If you have a system with
spinning disks lying around, I would appreciate it if you could verify the
scenario. Can you also replicate it on different kinds of systems?


CREATE TABLE descriptions (description_id serial PRIMARY KEY, description text);
INSERT INTO  descriptions (description_id, description)
    SELECT s, repeat(encode(sha512(s::text::bytea), 'hex'), 4)
        FROM generate_series(0,1200300) AS s;
CREATE TABLE descriptions_in_books (description_id integer REFERENCES descriptions(description_id), book_id integer);
INSERT INTO  descriptions_in_books (description_id, book_id)
    SELECT s % 1200300, s
        FROM generate_series(0,5200300) AS s;

SET work_mem TO '1MB';
SET hash_mem_multiplier = 1.0;
SET track_io_timing TO on;

EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS)  SELECT * FROM descriptions NATURAL JOIN descriptions_in_books;
SET enable_hashjoin TO off;
EXPLAIN (ANALYZE,VERBOSE,BUFFERS,SETTINGS)  SELECT * FROM descriptions NATURAL JOIN descriptions_in_books;



The first JOIN query uses a hash join with Batches: 1024 and takes 622s!
For most of that time I can see the disk writing at around 1-2 MB/s,
so I have to assume it's not writing sequentially.

The second, identical JOIN uses a merge join that completes in 14s. The I/O
happens at a much higher rate (maybe 10x), so I'm assuming it's mostly
sequential.

Another observation is that the hash join deteriorates as the length of
the TEXT column grows. In fact, if I fill it with strings only 32
characters long, the hash join is split into only 128 batches and it
completes almost as fast as the merge join. Could it be that the cost
estimation underestimates the I/O caused by splitting into batches?
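
For reference, the 32-character variant I mention above was roughly the
following (a sketch; descriptions_short is just my name for the narrow copy):

CREATE TABLE descriptions_short (description_id serial PRIMARY KEY, description text);
INSERT INTO  descriptions_short (description_id, description)
    SELECT s, left(encode(sha512(s::text::bytea), 'hex'), 32)   -- 32 characters instead of 512
        FROM generate_series(0,1200300) AS s;

The measurements below are from the original, long-description version.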


Here are the measurements:


  Hash Join  (cost=192450.84..401456.02 rows=5200486 width=524) (actual time=344516.004..621725.562 rows=5200301 loops=1)
    Output: descriptions.description_id, descriptions.description, descriptions_in_books.book_id
    Inner Unique: true
    Hash Cond: (descriptions_in_books.description_id = descriptions.description_id)
    Buffers: shared hit=15586 read=93161, temp read=97829 written=97829
    I/O Timings: shared/local read=1402.597, temp read=229252.170 write=371508.313
    ->  Seq Scan on public.descriptions_in_books  (cost=0.00..75015.86 rows=5200486 width=8) (actual time=0.068..1819.629 rows=5200301 loops=1)
          Output: descriptions_in_books.book_id, descriptions_in_books.description_id
          Buffers: shared hit=32 read=22979
          I/O Timings: shared/local read=249.910
    ->  Hash  (cost=97739.04..97739.04 rows=1200304 width=520) (actual time=343268.470..343268.471 rows=1200301 loops=1)
          Output: descriptions.description_id, descriptions.description
          Buckets: 2048  Batches: 1024  Memory Usage: 686kB
          Buffers: shared hit=15554 read=70182, temp written=78538
          I/O Timings: shared/local read=1152.687, temp write=338883.205
          ->  Seq Scan on public.descriptions  (cost=0.00..97739.04 rows=1200304 width=520) (actual time=0.028..2278.791 rows=1200301 loops=1)
                Output: descriptions.description_id, descriptions.description
                Buffers: shared hit=15554 read=70182
                I/O Timings: shared/local read=1152.687
  Settings: hash_mem_multiplier = '1', work_mem = '1MB'
  Planning Time: 0.303 ms
  Execution Time: 622495.279 ms
(22 rows)


SET enable_hashjoin TO off;


  Merge Join  (cost=868411.87..1079330.96 rows=5200301 width=524) (actual time=6091.932..13304.924 rows=5200301 loops=1)
    Output: descriptions.description_id, descriptions.description, descriptions_in_books.book_id
    Merge Cond: (descriptions.description_id = descriptions_in_books.description_id)
    Buffers: shared hit=67 read=111962 written=1, temp read=45806 written=46189
    I/O Timings: shared/local read=1007.043 write=28.575, temp read=344.937 write=794.483
    ->  Index Scan using descriptions_pkey on public.descriptions  (cost=0.43..116919.99 rows=1200304 width=520) (actual time=0.028..1596.387 rows=1200301 loops=1)
          Output: descriptions.description_id, descriptions.description
          Buffers: shared hit=3 read=89015 written=1
          I/O Timings: shared/local read=834.732 write=28.575
    ->  Materialize  (cost=868408.84..894410.35 rows=5200301 width=8) (actual time=6091.892..9171.796 rows=5200301 loops=1)
          Output: descriptions_in_books.book_id, descriptions_in_books.description_id
          Buffers: shared hit=64 read=22947, temp read=45806 written=46189
          I/O Timings: shared/local read=172.311, temp read=344.937 write=794.483
          ->  Sort  (cost=868408.84..881409.60 rows=5200301 width=8) (actual time=6091.885..7392.828 rows=5200301 loops=1)
                Output: descriptions_in_books.book_id, descriptions_in_books.description_id
                Sort Key: descriptions_in_books.description_id
                Sort Method: external merge  Disk: 91632kB
                Buffers: shared hit=64 read=22947, temp read=45806 written=46189
                I/O Timings: shared/local read=172.311, temp read=344.937 write=794.483
                ->  Seq Scan on public.descriptions_in_books  (cost=0.00..75014.01 rows=5200301 width=8) (actual time=0.031..1617.520 rows=5200301 loops=1)
                      Output: descriptions_in_books.book_id, descriptions_in_books.description_id
                      Buffers: shared hit=64 read=22947
                      I/O Timings: shared/local read=172.311
  Settings: enable_hashjoin = 'off', hash_mem_multiplier = '1', work_mem = '1MB'
  Planning:
    Buffers: shared hit=10 read=9
    I/O Timings: shared/local read=90.112
  Planning Time: 90.774 ms
  Execution Time: 13955.462 ms
(29 rows)




Regards,
Dimitris