Thread: BUG #16363: Memory increases for each table accessed until connection is closed

BUG #16363: Memory increases for each table accessed until connection is closed

From:
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16363
Logged by:          Eduardo Barreto Alexandre
Email address:      sezdocs@gmail.com
PostgreSQL version: 12.2
Operating system:   Ubuntu Linux 19.04
Description:

I have a specific use case where I need a lot of similar tables in my
database; in one case, for example, I have more than 700 tables.

Whenever I insert data into one of these tables within the same connection,
the memory used by that connection process increases and is never freed
(unless I close the connection). Because of that, I'm getting Out Of Memory
errors in my backend, since the connection eventually consumes the entire
memory available in my system.

Looking around, I found this question on Stack Overflow, which I think is
exactly the issue I'm having:
https://stackoverflow.com/questions/5587830/postgresql-backend-process-high-memory-usage-issue

I'm not sure if this is a memory leak or simply a caching mechanism in
Postgres; if the latter, can I control it somehow to force it to free
memory?

Just for completeness, this is a table of the memory usage of a single
Postgres connection process running a script that simply inserts new
data randomly into these 700 tables, one at a time.

# Elapsed time   Real (MB)   Virtual (MB)
       0.000        26.453      328.066
      20.017       421.367      766.164
      40.037       889.867     1302.520
      60.057      1305.645     1772.336
      80.077      1681.555     2243.105
     100.093      2001.133     2679.195
     120.109      2333.535     3131.918
     140.129      2622.051     3518.801
     160.141      2887.207     3886.355
     180.157      3209.109     4323.109
     200.177      3414.656     4606.566
     220.197      3414.656     4606.566
     240.218      3414.656     4606.566

After the last line, I got an OOM error.


Re: BUG #16363: Memory increases for each table accessed until connection is closed

From:
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> I have a specific use case when I need a lot of similar tables in my
> database, in one case, for example, I have more than 700 tables.
> Whenever I insert data in one of these tables within the same connection,
> the memory used by that connection process will increase and never be freed
> (unless I close the connection), because of that I'm getting Out Of Memory
> errors in my backend since the connection eventually consumes the entire
> memory available in my system.

Yes, Postgres caches some information about every table your session
has accessed, and no you don't have a lot of control over that, and
no it isn't a bug.

Having said that, the amount of memory consumed this way is in the
vicinity of a few tens of KB per table, according to some simple
experiments I just did.  I couldn't get a process touching 700 simple
tables to eat more than ~70MB of space.  So either you are talking about
tables with enormously complex schemas, or there's something else going
on.  (Please note that the SO post you are referencing describes an
attempt to use circa 100000 tables/views in one session, not 700.)
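
(For illustration only, an experiment of that sort can be approximated
with a plpgsql loop like the one below; this is a sketch, not the script
behind the numbers above, and the table names are placeholders.)

DO $$
BEGIN
    -- Create and then touch 700 trivial tables from a single session.
    -- The relcache/catcache entries built up this way amount to a few
    -- tens of KB per table, i.e. a few tens of MB for the whole run.
    FOR i IN 1..700 LOOP
        EXECUTE format('CREATE TABLE IF NOT EXISTS relcache_test_%s (id int)', i);
    END LOOP;
    FOR i IN 1..700 LOOP
        EXECUTE format('INSERT INTO relcache_test_%s VALUES (1)', i);
    END LOOP;
END
$$;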

The memory consumption curve you showed looks suspiciously like you have
circa 4GB shared buffers allocated and the reported number is mostly about
how many of those buffers the process has touched so far.  This is a
common misunderstanding; "top" and similar tools tend to be really bad
about accounting for shared memory usage in any useful way.  In any case,
the process you've shown us stopped growing its space consumption
some time ago, so I wonder where the OOM complaint actually came from.
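
(Two quick checks along those lines, purely illustrative: if
shared_buffers is close to the ~3.4GB plateau in the table above, most of
the reported growth is this backend progressively touching shared buffers,
which "top" then counts against the process.)

-- How much shared memory is configured for the buffer cache.
SHOW shared_buffers;

-- The backend's PID, to match the session against "top" or
-- /proc/<pid>/smaps_rollup, where shared and private pages are
-- reported separately.
SELECT pg_backend_pid();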

The short answer is that you're probably barking up the wrong tree,
but there's not enough information here to provide any useful guesses
about which is the right tree.  Please see

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

for some hints about submitting a trouble report that's complete
enough to garner useful responses.

            regards, tom lane



Re: BUG #16363: Memory increases for each table accessed until connection is closed

From:
Eduardo Barreto Alexandre
Date:
Sending the reply again since it seems that it didn't go through the mailing list

Hello Tom, first of all, thanks for the response!

Sorry for not including a lot of the needed information; I will try to rectify that in this message.

So, first, what I'm trying to do: in my backend, I store financial data (candlesticks and trades) in Postgres + TimescaleDB. The way I do it is to separate each exchange into one database and each market and timeframe into one table. For example, given that I have `ExchangeA` with 1 market (EUR/USD) and 4 timeframes, I would get these 5 tables:
candles_eur_usd_one_minute
candles_eur_usd_five_minutes
candles_eur_usd_third_minutes
candles_eur_usd_one_hour
trades_eur_usd

Normally in my backend, I have, per database (exchange), ~750 markets and ~16 timeframes, so 750 * 16 = 12000 tables.

I noticed that Postgres would continuously increase memory usage per connection until I got an OOM error. In my backend's terms, that workload means inserting candles and trades at ~2000 rows per minute. For a system with 16GB, it would take about 1 to 3 hours and then I would get the OOM; for my other system, which has 64GB, it would take about 2 days, but eventually Postgres would use up everything and crash too.

As for versions, I was using Postgres 11 with the latest TimescaleDB. Since I was not sure whether this was a TimescaleDB issue, I first created a minimal example that triggers the issue faster (the one I mentioned in the first e-mail, which uses 700 tables) and then tested it without TimescaleDB on Postgres 11 and 12. I got the same result in both, which takes TimescaleDB out of the equation.

As for the operating system, I'm using Ubuntu, but I did test it on my Gentoo machine with the same results, and I also tested it using Postgres in Docker.

For postgresql.conf, I first used one tuned by the TimescaleDB tuner software, but then changed to a default one to see if it would make a difference.

My backend is written in Elixir, so I'm using the Ecto library to connect to Postgres. Ecto creates a pool of connections that stay alive as long as my backend is alive. To make it easier to test the issue, I limited the number of connections to 1, so I can keep an eye on that connection process's memory usage, etc.
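
(As a sanity check that the pool really is a single session, the server side can confirm it; a query like the one below, run from psql, should show only that one backend for the database plus the psql session itself.)

SELECT pid, state, query
  FROM pg_stat_activity
 WHERE datname = current_database();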

> Yes, Postgres caches some information about every table your session
> has accessed, and no you don't have a lot of control over that, and
> no it isn't a bug.

> Having said that, the amount of memory consumed this way is in the
> vicinity of a few tens of KB per table, according to some simple
> experiments I just did.  I couldn't get a process touching 700 simple
> tables to eat more than ~70MB of space.  So either you are talking about
> tables with enormously complex schemas, or there's something else going
> on.  (Please note that the SO post you are referencing describes an
> attempt to use circa 100000 tables/views in one session, not 700.)

Here are the details of one of the tables I'm using for my test (all of them are identical, only the name differs):
test_dev=# \d+ trades_eur_usd
                                         Table "public.trades_eur_usd"
  Column   |            Type             | Collation | Nullable | Default | Storage | Stats target | Description
-----------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 id        | integer                     |           | not null |         | plain   |              |
 timestamp | timestamp without time zone |           | not null |         | plain   |              |
 amount    | numeric                     |           | not null |         | main    |              |
 price     | numeric                     |           | not null |         | main    |              |
Indexes:
    "trades_eur_usd_id_desc_index" btree (id DESC)
    "trades_eur_usd_timestamp_desc_index" btree ("timestamp" DESC)
Access method: heap
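
(For anyone wanting to reproduce this, the definition above corresponds to roughly the following DDL; it is reconstructed from the \d+ output, not taken from the original creation script.)

CREATE TABLE trades_eur_usd (
    id          integer                     NOT NULL,
    "timestamp" timestamp without time zone NOT NULL,
    amount      numeric                     NOT NULL,
    price       numeric                     NOT NULL
);
CREATE INDEX trades_eur_usd_id_desc_index
    ON trades_eur_usd (id DESC);
CREATE INDEX trades_eur_usd_timestamp_desc_index
    ON trades_eur_usd ("timestamp" DESC);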

One thing that I forgot to explain better is what I'm doing to touch those tables. Basically, I created a little Elixir backend with Ecto which keeps 1 connection alive in the pool and starts bulk-inserting rows into one of the 700 tables, so, basically, the steps are:
1) Choose one of the tables randomly;
2) Insert 13000 rows into that table in one query;
3) Go back to step 1 and repeat.

This will trigger the issue instantly and will make the memory usage of that connection grow very fast as shown in the last e-mail.

One very important thing to note is that if I change the above algorithm to do the same thing but always choose the same table in step 1 (so the bulk insertion always goes into 1 table instead of one of the 700), I do not trigger the issue; memory usage does not go up at all. It only triggers if I'm inserting into a different table each time.
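
(The same access pattern can also be approximated entirely in SQL, independent of Elixir/Ecto. The sketch below assumes 700 test tables with hypothetical names trades_test_1 .. trades_test_700 sharing the schema shown above; each run of the statement performs 100 passes of steps 1-3.)

DO $$
DECLARE
    tbl text;
BEGIN
    FOR pass IN 1..100 LOOP
        -- Step 1: choose one of the 700 tables randomly.
        tbl := format('trades_test_%s', 1 + floor(random() * 700)::int);
        -- Step 2: insert 13000 rows into that table in one query.
        EXECUTE format(
            'INSERT INTO %I (id, "timestamp", amount, price)
             SELECT g, now(), random()::numeric, random()::numeric
             FROM generate_series(1, 13000) AS g', tbl);
        -- Step 3: loop back and repeat.
    END LOOP;
END
$$;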

> The memory consumption curve you showed looks suspiciously like you have
> circa 4GB shared buffers allocated and the reported number is mostly about
> how many of those buffers the process has touched so far.  This is a
> common misunderstanding; "top" and similar tools tend to be really bad
> about accounting for shared memory usage in any useful way.  In any case,
> the process you've shown us stopped growing its space consumption
> some time ago, so I wonder where the OOM complaint actually came from.

> The short answer is that you're probably barking up the wrong tree,
> but there's not enough information here to provide any useful guesses
> about which is the right tree.

I guess you are right; maybe the issue is with the shared buffers cache or something. I tried lowering every option I could find in postgresql.conf to its minimum value, but I still got the same OOM result.
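
(For the shared-buffers theory specifically, the one setting that matters most can be lowered directly; a sketch, assuming superuser access, and note that shared_buffers only changes after a server restart.)

-- Shrink the buffer cache to a deliberately small value
-- ('128MB' is just an illustrative choice).
ALTER SYSTEM SET shared_buffers = '128MB';
-- ...restart the server, then confirm:
SHOW shared_buffers;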

Thanks a lot for the help, and feel free to ask for more information!

Re: BUG #16363: Memory increases for each table accessed until connection is closed

From:
luis.roberto@siscobra.com.br
Date:
Can you try to run it without a connection pool? Just a simple connection to the backend.

Re: BUG #16363: Memory increases for each table accessed until connection is closed

From:
Eduardo Barreto Alexandre
Date:
Unfortunately, same result :(

On Thu, Apr 16, 2020 at 11:43 AM <luis.roberto@siscobra.com.br> wrote:
> Can you try to run it without a connection pool? Just a simple connection
> to the backend.


Re: BUG #16363: Memory increases for each table accessed until connection is closed

From:
Tomas Vondra
Date:
I think you need to fix your mail system, because it's clearly broken in
some strange way. If you look at [1] you'll see the messages did indeed
get to the mailing list, so either you're not receiving messages from the
list or they're getting stashed in a spam folder or something.

[1] https://www.postgresql.org/message-id/16363-a66916ab04716e58%40postgresql.org

On Thu, Apr 16, 2020 at 06:23:33PM -0300, Eduardo Barreto Alexandre wrote:
>Unfortunately, same result :(
>
>On Thu, Apr 16, 2020 at 11:43 AM <luis.roberto@siscobra.com.br> wrote:
>
>> Sending the reply again since it seems that it didn't go through the
>> mailing list

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: BUG #16363: Memory increases for each table accessed until connection is closed

From:
Eduardo Barreto Alexandre
Date:
Actually, I take it back; I had set up the single-connection test wrong. Connecting directly did not trigger the problem.

Thanks, Luis, your suggestion let me narrow down the issue! It was actually a cache enabled by default by Ecto (Elixir's database library). I created an issue on Ecto's GitHub, https://github.com/elixir-ecto/ecto/issues/3284, and got a response confirming it, along with the fix to disable it.

Thank you all very much for the help!