Re: remaining sql/json patches

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: remaining sql/json patches
Дата
Msg-id CA+HiwqGYkYjXe2g0=c==BGSpVQfMJZBhPE-vesX3xCz4enBYPg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: remaining sql/json patches  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: remaining sql/json patches  (jian he <jian.universality@gmail.com>)
Re: remaining sql/json patches  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
Hi Tomas,

On Wed, Mar 6, 2024 at 6:30 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Hi,
>
> I know very little about sql/json and all the json internals, but I
> decided to do some black box testing. I built a large JSONB table
> (single column, ~7GB of data after loading). And then I did a query
> transforming the data into tabular form using JSON_TABLE.
>
> The JSON_TABLE query looks like this:
>
> SELECT jt.* FROM
>   title_jsonb t,
>   json_table(t.info, '$'
>     COLUMNS (
>       "id" text path '$."id"',
>       "type" text path '$."type"',
>       "title" text path '$."title"',
>       "original_title" text path '$."original_title"',
>       "is_adult" text path '$."is_adult"',
>       "start_year" text path '$."start_year"',
>       "end_year" text path '$."end_year"',
>       "minutes" text path '$."minutes"',
>       "genres" text path '$."genres"',
>       "aliases" text path '$."aliases"',
>       "directors" text path '$."directors"',
>       "writers" text path '$."writers"',
>       "ratings" text path '$."ratings"',
>       NESTED PATH '$."aliases"[*]'
>         COLUMNS (
>           "alias_title" text path '$."title"',
>           "alias_region" text path '$."region"'
>         ),
>       NESTED PATH '$."directors"[*]'
>         COLUMNS (
>           "director_name" text path '$."name"',
>           "director_birth_year" text path '$."birth_year"',
>           "director_death_year" text path '$."death_year"'
>         ),
>       NESTED PATH '$."writers"[*]'
>         COLUMNS (
>           "writer_name" text path '$."name"',
>           "writer_birth_year" text path '$."birth_year"',
>           "writer_death_year" text path '$."death_year"'
>         ),
>       NESTED PATH '$."ratings"[*]'
>         COLUMNS (
>           "rating_average" text path '$."average"',
>           "rating_votes" text path '$."votes"'
>         )
>     )
>   ) as jt;
>
> again, not particularly complex. But if I run this, it consumes multiple
> gigabytes of memory, before it gets killed by OOM killer. This happens
> even when ran using
>
>   COPY (...) TO '/dev/null'
>
> so there's nothing sent to the client. I did catch memory context info,
> where it looks like this (complete stats attached):
>
> ------
> TopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks);
>                   84640 used
>   ...
>   TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); ...
>     PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); ...
>       ExecutorState: 2541764672 total in 314 blocks; 6528176 free
>                      (1208 chunks); 2535236496 used
>         printtup: 8192 total in 1 blocks; 7952 free (0 chunks); ...
>         ...
> ...
> Grand total: 2544132336 bytes in 528 blocks; 7484504 free
>              (1340 chunks); 2536647832 used
> ------
>
> I'd say 2.5GB in ExecutorState seems a bit excessive ... Seems there's
> some memory management issue? My guess is we're not releasing memory
> allocated while parsing the JSON or building JSON output.
>
> I'm not attaching the data, but I can provide that if needed - it's
> about 600MB compressed. The structure is not particularly complex, it's
> movie info from [1] combined into a JSON document (one per movie).

Thanks for the report.

Yeah, I'd like to see the data to try to drill down into what's piling
up in ExecutorState.  I want to be sure of if the 1st, query functions
patch, is not implicated in this, because I'd like to get that one out
of the way sooner than later.

--
Thanks, Amit Langote



В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: [PoC] Improve dead tuple storage for lazy vacuum
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Switching XLog source from archive to streaming when primary available