Обсуждение: Re: Query never completes with low work_mem (at least notwithin one hour)
Re: Query never completes with low work_mem (at least notwithin one hour)
От
Daniel Westermann
Дата:
>> I have set work_mem to a very low value intentionally for demonstration
>> purposes:
>>
>> postgres=# show work_mem;
>> work_mem
>> ----------
>> 16MB
>> (1 row)
>>
>> postgres=# show shared_buffers ;
>> shared_buffers
>> ----------------
>> 128MB
>> (1 row)
>>
>>
>> When I run the following query ( I know that "not in" is not a good choice
>> here ):
>>
>> postgres=# select count(user_id) from users where user_id not in ( select id
>> from ids);
>"NOT IN" where the predate is a table column can lead to very poor
>query plans especially where the haystack is not provably known (at
>plan time) to contain only not null values. By reducing work_mem, the
>server has decided has to repeatedly search the table to search for
>the presence of null values. Try converting the query to NOT EXISTS.
>> purposes:
>>
>> postgres=# show work_mem;
>> work_mem
>> ----------
>> 16MB
>> (1 row)
>>
>> postgres=# show shared_buffers ;
>> shared_buffers
>> ----------------
>> 128MB
>> (1 row)
>>
>>
>> When I run the following query ( I know that "not in" is not a good choice
>> here ):
>>
>> postgres=# select count(user_id) from users where user_id not in ( select id
>> from ids);
>"NOT IN" where the predate is a table column can lead to very poor
>query plans especially where the haystack is not provably known (at
>plan time) to contain only not null values. By reducing work_mem, the
>server has decided has to repeatedly search the table to search for
>the presence of null values. Try converting the query to NOT EXISTS.
Thank you, Merlin. As said I know that "not in" is not a good choice in this case but I still do not get what is going here. Why does the server repeatedly search for NULL values when I decrease work_mem and why not when increasing work_mem?
Regards
Daniel
2017-04-05 8:57 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
>> I have set work_mem to a very low value intentionally for demonstration
>> purposes:
>>
>> postgres=# show work_mem;
>> work_mem
>> ----------
>> 16MB
>> (1 row)
>>
>> postgres=# show shared_buffers ;
>> shared_buffers
>> ----------------
>> 128MB
>> (1 row)
>>
>>
>> When I run the following query ( I know that "not in" is not a good choice
>> here ):
>>
>> postgres=# select count(user_id) from users where user_id not in ( select id
>> from ids);
>"NOT IN" where the predate is a table column can lead to very poor
>query plans especially where the haystack is not provably known (at
>plan time) to contain only not null values. By reducing work_mem, the
>server has decided has to repeatedly search the table to search for
>the presence of null values. Try converting the query to NOT EXISTS.Thank you, Merlin. As said I know that "not in" is not a good choice in this case but I still do not get what is going here. Why does the server repeatedly search for NULL values when I decrease work_mem and why not when increasing work_mem?
what is result of EXPLAIN statement for slow and fast cases?
regards
Pavel
RegardsDaniel
Re: Query never completes with low work_mem (at least notwithin one hour)
От
Daniel Westermann
Дата:
>>what is result of EXPLAIN statement for slow and fast cases?
>>
>>regards
>>>>Pavel
For work_mem=32MB
explain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 ms
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 ms
For work_mem='16MB' it does not complete with analyze in on hour. For explain only:
explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)
2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
>>what is result of EXPLAIN statement for slow and fast cases?>>>>regards>>>>PavelFor work_mem=32MBexplain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
------------------------------------------------------------ ------------------------------ ------------------------------ --------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 msFor work_mem='16MB' it does not complete with analyze in on hour. For explain only:explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
------------------------------------------------------------ ----------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)
There is a materialize op more
do you have a index on ids.id?
Pavel
Re: Query never completes with low work_mem (at least notwithin one hour)
От
Daniel Westermann
Дата:
2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
>>what is result of EXPLAIN statement for slow and fast cases?>>>>regards>>>>PavelFor work_mem=32MBexplain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 msFor work_mem='16MB' it does not complete with analyze in on hour. For explain only:explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)
>There is a materialize op more
>
>do you have a index on ids.id?
Yes:
\d ids
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)
2017-04-05 10:13 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>: >>what is result of EXPLAIN statement for slow and fast cases?>>>>regards>>>>PavelFor work_mem=32MBexplain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
------------------------------------------------------------ ------------------------------ ------------------------------ --------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 msFor work_mem='16MB' it does not complete with analyze in on hour. For explain only:explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
------------------------------------------------------------ ----------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)Yes:\d ids
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)
hmm .. NOT IN is just bad :(
The second is slow becase table ids is stored in temp file. and it is repeatedly read from file. In first case, ids table is stored in memory.
SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids where id = user_id)
Regards
Pavel
Re: Query never completes with low work_mem (at least notwithin one hour)
От
Daniel Westermann
Дата:
2017-04-05 10:13 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:>>what is result of EXPLAIN statement for slow and fast cases?>>>>regards>>>>PavelFor work_mem=32MBexplain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 msFor work_mem='16MB' it does not complete with analyze in on hour. For explain only:explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)Yes:\d ids
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)
>>hmm .. NOT IN is just bad :(
>>
>>The second is slow becase table ids is stored in temp file. and it is repeatedly read from file. In first case, ids table is stored in memory.
>>
>>SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids where id = user_id)
>>
Yes, really bad :) ... and I still do not get it. Even when reading from the tempfile all the time it should at least complete within one hour, shouldn't it? The tables are not so big:
select * from pg_size_pretty ( pg_relation_size ('ids' ));
pg_size_pretty
----------------
35 MB
(1 row)
pg_size_pretty
----------------
35 MB
(1 row)
select * from pg_size_pretty ( pg_relation_size ('users' ));
pg_size_pretty
----------------
195 MB
(1 row)
pg_size_pretty
----------------
195 MB
(1 row)
2017-04-05 10:33 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
1500000 * few ms ~ big time2017-04-05 10:13 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>: 2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>: >>what is result of EXPLAIN statement for slow and fast cases?>>>>regards>>>>PavelFor work_mem=32MBexplain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
------------------------------------------------------------ ------------------------------ ------------------------------ --------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 msFor work_mem='16MB' it does not complete with analyze in on hour. For explain only:explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
------------------------------------------------------------ ----------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)Yes:\d ids
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)>>hmm .. NOT IN is just bad :(>>>>The second is slow becase table ids is stored in temp file. and it is repeatedly read from file. In first case, ids table is stored in memory.>>>>SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids where id = user_id)>>Yes, really bad :) ... and I still do not get it. Even when reading from the tempfile all the time it should at least complete within one hour, shouldn't it? The tables are not so big:select * from pg_size_pretty ( pg_relation_size ('ids' ));
pg_size_pretty
----------------
35 MB
(1 row)select * from pg_size_pretty ( pg_relation_size ('users' ));
pg_size_pretty
----------------
195 MB
(1 row)
Re: Query never completes with low work_mem (at least notwithin one hour)
От
Daniel Westermann
Дата:
2017-04-05 10:33 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:
>> 1500000 * few ms ~ big time2017-04-05 10:13 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:2017-04-05 9:28 GMT+02:00 Daniel Westermann <daniel.westermann@dbi-services.com>:>>what is result of EXPLAIN statement for slow and fast cases?>>>>regards>>>>PavelFor work_mem=32MBexplain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)
Time: 8244.493 msFor work_mem='16MB' it does not complete with analyze in on hour. For explain only:explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)Yes:\d ids
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)>>hmm .. NOT IN is just bad :(>>>>The second is slow becase table ids is stored in temp file. and it is repeatedly read from file. In first case, ids table is stored in memory.>>>>SELECT count(user_id) FROM users WHERE user_id NOT EXISTS(SELECT id from ids where id = user_id)>>Yes, really bad :) ... and I still do not get it. Even when reading from the tempfile all the time it should at least complete within one hour, shouldn't it? The tables are not so big:select * from pg_size_pretty ( pg_relation_size ('ids' ));
pg_size_pretty
----------------
35 MB
(1 row)select * from pg_size_pretty ( pg_relation_size ('users' ));
pg_size_pretty
----------------
195 MB
(1 row)
Ok got it
Thanks
Pavel
Daniel Westermann <daniel.westermann@dbi-services.com> writes: > Thank you, Merlin. As said I know that "not in" is not a good choice in this case but I still do not get what is goinghere. Why does the server repeatedly search for NULL values when I decrease work_mem and why not when increasing work_mem? The core point is that one plan is using a hashed subplan and the other is not, because the planner estimated that the hashtable wouldn't fit into work_mem. With a hashtable you'll have one probe into the hashtable per outer row, and each probe is O(1) unless you are unlucky about data distributions, so the runtime is more or less linear. Without a hashtable, the inner table is rescanned for each outer row, so the runtime is O(N^2) which gets pretty bad pretty fast. "Materializing" the inner table doesn't really help: it gets rid of per-inner-row visibility checks and some buffer locking overhead, so it cuts the constant factor some, but the big-O situation is still disastrous. regards, tom lane
Re: Query never completes with low work_mem (at least notwithin one hour)
От
Daniel Westermann
Дата:
Daniel Westermann <daniel.westermann@dbi-services.com> writes:
>> Thank you, Merlin. As said I know that "not in" is not a good choice in this case but I still do not get what is going here. Why does the ><server repeatedly search for NULL values when I decrease work_mem and why not when increasing work_mem?
>The core point is that one plan is using a hashed subplan and the other is
>not, because the planner estimated that the hashtable wouldn't fit into
>work_mem. With a hashtable you'll have one probe into the hashtable per
>outer row, and each probe is O(1) unless you are unlucky about data
>distributions, so the runtime is more or less linear. Without a
>hashtable, the inner table is rescanned for each outer row, so the
>runtime is O(N^2) which gets pretty bad pretty fast. "Materializing"
>the inner table doesn't really help: it gets rid of per-inner-row
>visibility checks and some buffer locking overhead, so it cuts the
>constant factor some, but the big-O situation is still disastrous.
Thanks, Tom
>> Thank you, Merlin. As said I know that "not in" is not a good choice in this case but I still do not get what is going here. Why does the ><server repeatedly search for NULL values when I decrease work_mem and why not when increasing work_mem?
>The core point is that one plan is using a hashed subplan and the other is
>not, because the planner estimated that the hashtable wouldn't fit into
>work_mem. With a hashtable you'll have one probe into the hashtable per
>outer row, and each probe is O(1) unless you are unlucky about data
>distributions, so the runtime is more or less linear. Without a
>hashtable, the inner table is rescanned for each outer row, so the
>runtime is O(N^2) which gets pretty bad pretty fast. "Materializing"
>the inner table doesn't really help: it gets rid of per-inner-row
>visibility checks and some buffer locking overhead, so it cuts the
>constant factor some, but the big-O situation is still disastrous.
Thanks, Tom