Обсуждение: Join query including two generate_series calls causes big memory growth and crash

Поиск
Список
Период
Сортировка

Join query including two generate_series calls causes big memory growth and crash

От
Jorge Arévalo
Дата:
Hello,

I'm executing this query:

SELECT x, y, another_field FROM generate_series(1, 10) x,
generate_series(1, 10) y, my_table

The field 'another_field' belongs to 'my_table'. And that table has
360000 entries. In a 64 bits machine, with 4GB RAM, Ubuntu 10.10 and
postgres 8.4.7, the query works fine. But in a 32 bits machine, with
1GB RAM, Ubuntu 9.10 and postgres 8.4.7, the query process is killed
after taking about 80% of available memory. In the 64 bits machine the
query takes about 60-70% of the available memory too, but it ends.
And this happens even if I simply get x and y:

SELECT x, y FROM generate_series(1, 10) x, generate_series(1, 10) y, my_table

Is it normal? I mean, postgres has to deal with millions of rows, ok,
but shouldn't it start swapping memory instead of crashing? Is a
question of postgres configuration?

Thanks in advance,

--
Jorge Arévalo
Internet & Mobilty Division, DEIMOS
jorge.arevalo@deimos-space.com
http://es.linkedin.com/in/jorgearevalo80
http://mobility.grupodeimos.com/
http://gis4free.wordpress.com
http://geohash.org/ezjqgrgzz0g

Re: Join query including two generate_series calls causes big memory growth and crash

От
Tom Lane
Дата:
=?ISO-8859-1?Q?Jorge_Ar=E9valo?= <jorge.arevalo@deimos-space.com> writes:
> I'm executing this query:

> SELECT x, y, another_field FROM generate_series(1, 10) x,
> generate_series(1, 10) y, my_table

> The field 'another_field' belongs to 'my_table'. And that table has
> 360000 entries. In a 64 bits machine, with 4GB RAM, Ubuntu 10.10 and
> postgres 8.4.7, the query works fine. But in a 32 bits machine, with
> 1GB RAM, Ubuntu 9.10 and postgres 8.4.7, the query process is killed
> after taking about 80% of available memory. In the 64 bits machine the
> query takes about 60-70% of the available memory too, but it ends.

You mean the backend, or psql?  I don't see any particular backend bloat
when I do that, but psql eats memory because it's trying to absorb and
display the whole query result.

> Is it normal? I mean, postgres has to deal with millions of rows, ok,
> but shouldn't it start swapping memory instead of crashing? Is a
> question of postgres configuration?

Try "\set FETCH_COUNT 1000" or so.

            regards, tom lane

Re: Join query including two generate_series calls causes big memory growth and crash

От
tv@fuzzy.cz
Дата:
> Hello,
>
> I'm executing this query:
>
> SELECT x, y, another_field FROM generate_series(1, 10) x,
> generate_series(1, 10) y, my_table

Well, do you realize this is a cartesian product that gives

10 x 10 x 360000 = 36.000.000

rows in the end. Not sure how wide is the third table (how many columns
etc.) but this may occupy a lot of memory.

> The field 'another_field' belongs to 'my_table'. And that table has
> 360000 entries. In a 64 bits machine, with 4GB RAM, Ubuntu 10.10 and
> postgres 8.4.7, the query works fine. But in a 32 bits machine, with
> 1GB RAM, Ubuntu 9.10 and postgres 8.4.7, the query process is killed
> after taking about 80% of available memory. In the 64 bits machine the
> query takes about 60-70% of the available memory too, but it ends.
> And this happens even if I simply get x and y:
>
> SELECT x, y FROM generate_series(1, 10) x, generate_series(1, 10) y,
> my_table

The result is still 36 million rows, so there's not a big difference I guess.

> Is it normal? I mean, postgres has to deal with millions of rows, ok,
> but shouldn't it start swapping memory instead of crashing? Is a
> question of postgres configuration?

I guess that's the OOM killer, killing one of the processes. See this

http://en.wikipedia.org/wiki/Out_of_memory

so it's a matter of the system, not PostgreSQL - the kernel decides
there's not enough memory, chooses one of the processes and kills it.
PostgreSQL is a victim in this case.

Tomas


Re: Join query including two generate_series calls causes big memory growth and crash

От
Jorge Arévalo
Дата:
On Wed, Apr 20, 2011 at 5:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> =?ISO-8859-1?Q?Jorge_Ar=E9valo?= <jorge.arevalo@deimos-space.com> writes:
>> I'm executing this query:
>
>> SELECT x, y, another_field FROM generate_series(1, 10) x,
>> generate_series(1, 10) y, my_table
>
>> The field 'another_field' belongs to 'my_table'. And that table has
>> 360000 entries. In a 64 bits machine, with 4GB RAM, Ubuntu 10.10 and
>> postgres 8.4.7, the query works fine. But in a 32 bits machine, with
>> 1GB RAM, Ubuntu 9.10 and postgres 8.4.7, the query process is killed
>> after taking about 80% of available memory. In the 64 bits machine the
>> query takes about 60-70% of the available memory too, but it ends.
>
> You mean the backend, or psql?  I don't see any particular backend bloat
> when I do that, but psql eats memory because it's trying to absorb and
> display the whole query result.
>

Yes, the memory eater is psql, not backend.

>> Is it normal? I mean, postgres has to deal with millions of rows, ok,
>> but shouldn't it start swapping memory instead of crashing? Is a
>> question of postgres configuration?
>
> Try "\set FETCH_COUNT 1000" or so.
>
>                        regards, tom lane
>

Thanks for the tip.


Best regards,

--
Jorge Arévalo
Internet & Mobilty Division, DEIMOS
jorge.arevalo@deimos-space.com
http://es.linkedin.com/in/jorgearevalo80
http://mobility.grupodeimos.com/
http://gis4free.wordpress.com
http://geohash.org/ezjqgrgzz0g

Re: Join query including two generate_series calls causes big memory growth and crash

От
Jorge Arévalo
Дата:
2011/4/20  <tv@fuzzy.cz>:
>> Hello,
>>
>> I'm executing this query:
>>
>> SELECT x, y, another_field FROM generate_series(1, 10) x,
>> generate_series(1, 10) y, my_table
>
> Well, do you realize this is a cartesian product that gives
>
> 10 x 10 x 360000 = 36.000.000
>
> rows in the end. Not sure how wide is the third table (how many columns
> etc.) but this may occupy a lot of memory.
>

Yes, I know it. But I expect memory swapping in this situation, not crashing.


>> The field 'another_field' belongs to 'my_table'. And that table has
>> 360000 entries. In a 64 bits machine, with 4GB RAM, Ubuntu 10.10 and
>> postgres 8.4.7, the query works fine. But in a 32 bits machine, with
>> 1GB RAM, Ubuntu 9.10 and postgres 8.4.7, the query process is killed
>> after taking about 80% of available memory. In the 64 bits machine the
>> query takes about 60-70% of the available memory too, but it ends.
>> And this happens even if I simply get x and y:
>>
>> SELECT x, y FROM generate_series(1, 10) x, generate_series(1, 10) y,
>> my_table
>
> The result is still 36 million rows, so there's not a big difference I guess.
>

Yes, silly example. I only wanted to delete my table's field from equation.


>> Is it normal? I mean, postgres has to deal with millions of rows, ok,
>> but shouldn't it start swapping memory instead of crashing? Is a
>> question of postgres configuration?
>
> I guess that's the OOM killer, killing one of the processes. See this
>
> http://en.wikipedia.org/wiki/Out_of_memory
>
> so it's a matter of the system, not PostgreSQL - the kernel decides
> there's not enough memory, chooses one of the processes and kills it.
> PostgreSQL is a victim in this case.
>
> Tomas
>
>

Ok, I got it. Anyway, my question again: could I expect some help from
postgres backend to avoid this situation? Something like "I don't want
to be killed by the OOM killer because one of my threads. I'll try
this..."

Maybe is my responsibility, changing some configuration parameters,
like the "\set FETCH_COUNT 1000" Tomas Lane has suggested...

Thanks again,

--
Jorge Arévalo
Internet & Mobilty Division, DEIMOS
jorge.arevalo@deimos-space.com
http://es.linkedin.com/in/jorgearevalo80
http://mobility.grupodeimos.com/
http://gis4free.wordpress.com
http://geohash.org/ezjqgrgzz0g