Обсуждение: BUG #18172: High memory usage in tSRF function context

Поиск
Список
Период
Сортировка

BUG #18172: High memory usage in tSRF function context

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18172
Logged by:          Sergei Kornilov
Email address:      sk@zsrv.org
PostgreSQL version: Unsupported/Unknown
Operating system:   various Linux
Description:

Hello

While researching several cases of OOM, I discovered an atypically high
memory consumption in this case:

/*
 * test data looks like {"pay": [{"id": 1, "test": 1}], "name": "foo1"}
 * about 1,3GiB total
 */
create table test (id serial, data jsonb);
insert into test (data) select jsonb_build_object('name', 'foo'||i, 'pay',
jsonb_build_array(jsonb_build_object('id', i, 'test', i % 10))) from
generate_series(1,1e7,1) as i;

Using gdb I called MemoryContextStats(TopPortalContext) on break point
standard_ExecutorEnd

statement: select distinct (jsonb_array_elements(data->'pay2')->>'test')
from test;
TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
  PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408 used:
<unnamed>
    ExecutorState: 16384 total in 2 blocks; 1760 free (5 chunks); 14624
used
      HashAgg meta context: 24576 total in 2 blocks; 13880 free (2 chunks);
10696 used
        ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240
used
      tSRF function arguments: 1442840576 total in 182 blocks; 2833096 free
(356 chunks); 1440007480 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
Grand total: 1442956288 bytes in 196 blocks; 2920648 free (363 chunks);
1440035640 used

statement: select distinct (jsonb_array_elements(data->'pay2')->>'test')
from test where data ? 'pay2';
TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
  PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408 used:
<unnamed>
    ExecutorState: 32768 total in 3 blocks; 16680 free (6 chunks); 16088
used
      HashAgg meta context: 57408 total in 2 blocks; 5744 free (0 chunks);
51664 used
        ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240
used
      tSRF function arguments: 8192 total in 1 blocks; 7952 free (0 chunks);
240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
Grand total: 173120 bytes in 16 blocks; 102288 free (6 chunks); 70832 used

statement: select distinct (jsonb_array_elements(data->'pay')->>'test') from
test;
TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
  PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408 used:
<unnamed>
    ExecutorState: 32768 total in 3 blocks; 17912 free (10 chunks); 14856
used
      HashAgg meta context: 1581120 total in 2 blocks; 5744 free (0 chunks);
1575376 used
        ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240
used
      tSRF function arguments: 8192 total in 1 blocks; 7952 free (0 chunks);
240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7472 free (0 chunks); 720 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
      ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
Grand total: 1696832 bytes in 16 blocks; 103040 free (10 chunks); 1593792
used

The first query consumes much more memory; note that it refers to a
non-existent json key. Querying for a key that exists in JSON or checking
for the existence of a key in where clause corrects memory consumption. But,
unfortunately, data analysts write the query exactly like the first
statement, not like the second statement.

Queries, of course, are corrected later, but maybe it is possible to improve
memory consumption for this case?

MemoryContextStats above are from my local 17devel
(83510534d5f3f116efa035639b9b62b8c6c4df34 commit), occurs for all supported
versions.

regards, Sergei


Re: BUG #18172: High memory usage in tSRF function context

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> While researching several cases of OOM, I discovered an atypically high
> memory consumption in this case:
> ...
> The first query consumes much more memory; note that it refers to a
> non-existent json key. Querying for a key that exists in JSON or checking
> for the existence of a key in where clause corrects memory consumption.

Thanks for the report!  It seems that ExecProjectSet neglected to
think hard about what happens when the SRF returns zero rows, so
it leaks whatever memory the SRF expression might have leaked.
The attached fixes it for me.

            regards, tom lane

diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index b4bbdc89b1..307ac591be 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -72,20 +72,22 @@ ExecProjectSet(PlanState *pstate)
             return resultSlot;
     }

-    /*
-     * Reset argument context to free any expression evaluation storage
-     * allocated in the previous tuple cycle.  Note this can't happen until
-     * we're done projecting out tuples from a scan tuple, as ValuePerCall
-     * functions are allowed to reference the arguments for each returned
-     * tuple.
-     */
-    MemoryContextReset(node->argcontext);
-
     /*
      * Get another input tuple and project SRFs from it.
      */
     for (;;)
     {
+        /*
+         * Reset argument context to free any expression evaluation storage
+         * allocated in the previous tuple cycle.  Note this can't happen
+         * until we're done projecting out tuples from a scan tuple, as
+         * ValuePerCall functions are allowed to reference the arguments for
+         * each returned tuple.  However, if we loop around after finding that
+         * no rows are produced from a scan tuple, we should reset, as the SRF
+         * might well leak memory in that case.
+         */
+        MemoryContextReset(node->argcontext);
+
         /*
          * Retrieve tuples from the outer plan until there are no more.
          */

Re:BUG #18172: High memory usage in tSRF function context

От
Sergei Kornilov
Дата:
Hi

Thank you! The patch works for me too.

regards, Sergei



Re: BUG #18172: High memory usage in tSRF function context

От
Tom Lane
Дата:
Sergei Kornilov <sk@zsrv.org> writes:
> Thank you! The patch works for me too.

Thanks for testing!  I looked at ExecProjectSet a second time
and realized that I probably still hadn't thought hard enough,
because it's also responsible for doing ResetExprContext(econtext)
and that wasn't happening either when looping around after the
SRF returns no rows.  So if the SRF itself leaks some memory
in its CurrentMemoryContext and then returns nothing, we can
see bloat in that context.  I failed to produce such a behavior
with jsonb_array_elements, but after trying some other things
I found a test case that still leaked even with the initial
patch:

create table zed as
  select repeat('xyzzy', 10000) as f1, repeat('g', 10000) as flags
  from generate_series(1,100000);

select regexp_matches(f1, 'q', flags) from zed;

The leak here stems from regexp_matches detoasting its flags
argument in the CurrentMemoryContext.  A big flags argument
is surely not real-world usage, but perhaps there are other
cases that are more likely to happen.

The fix of course is just to make sure we also do ResetExprContext
when looping around.  Applied to all branches.

            regards, tom lane



Re:BUG #18172: High memory usage in tSRF function context

От
Sergei Kornilov
Дата:
Great! Thanks!

regards, Sergei