Обсуждение: Update statement results in Out of memory
Hello,
I am trying to update a column using a PostGIS ST_Buffer function into a table of 4.257.769 rows, but after 6 hours, an Out of memory error appears and the kernel starts killing processes until a Kernel Panic shows up.
I have simplified the buffer target geometry and also added a gist index to that column.
The statement is the following:
psql -h host -U user -W -d database -c "UPDATE table SET buffer = ST_Buffer(simplified_geometry, 0.005);"
After reading and tunning the configuration, I still have the same result.
Here's the initial memory stats:
total used free shared buff/cache available
Mem: 15G 1.5G 12G 503M 1.4G 13G
Swap: 7.8G 0B 7.8G
I'm running out of ideas, as I think the postgresql.conf memory parameters are quite low for the machine specs. I understand I can split the process and paginate the rows, but I can't see why I can't deal with this full statement right now.
Do you think this issue is related with the postgres memory parameters configuration? Why is not respecting the shared_buffers or effective_cache_size parameters and keeps growing?
Here's some info:
Machine specs
- Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
- 16 GB of memory
- Fedora release 23 (Twenty Three)
- Kernel - 4.5.7-202.fc23.x86_64
postgresql.conf
- effective_cache_size = 5GB
- shared_buffers = 3GB
- work_mem = 10MB
- maintenance_work_mem = 800MB
- wal_buffers = 16MB
Kernel parameters
- vm.overcommit_memory=2
- kernel.shmmax = 8340893696
- kernel.shmall = 2036351
Versions:
- PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), 64-bit
- POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3" LIBJSON="0.12" RASTER
Many thanks,
Ivan
On 07/06/2016 02:13 AM, Ivan Bianchi wrote: > Hello, > > I am trying to update a column using a PostGIS ST_Buffer > <http://postgis.net/docs/ST_Buffer.html> function into a table > of 4.257.769 rows, but after 6 hours, an /Out of memory/ error appears > and the kernel starts killing processes until a /Kernel Panic/ shows up. > > I have simplified the buffer target geometry and also added a gist index > to that column. > > The statement is the following: > > psql -h host -U user -W -d database -c "UPDATE table SET buffer = > ST_Buffer(simplified_geometry, 0.005);" I would say the issue is the above, you are running in a single transaction. Given that an UPDATE in Postgres is a DELETE/INSERT and that both the new and old rows have to be kept around until the transaction completes I see only problems with doing it this way. > > > After reading and tunning the configuration, I still have the same result. > > Here's the initial memory stats: > > total used freeshared buff/cache > available > Mem: 15G 1.5G 12G 503M 1.4G > 13G > Swap: 7.8G 0B 7.8G > > > > I'm running out of ideas, as I think the postgresql.conf memory > parameters are quite low for the machine specs. I understand I can split > the process and paginate the rows, but I can't see why I can't deal with > this full statement right now. See above. > > Do you think this issue is related with the postgres memory parameters > configuration? Why is not respecting the shared_buffers or > effective_cache_size parameters and keeps growing? > > > Here's some info: > * > * > _Machine specs_ > > * Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores) > * 16 GB of memory > * Fedora release 23 (Twenty Three) > * Kernel - 4.5.7-202.fc23.x86_64 > > _postgresql.conf_ > > * effective_cache_size = 5GB > * shared_buffers = 3GB > * work_mem = 10MB > > * maintenance_work_mem = 800MB > * wal_buffers = 16MB > > _Kernel parameters_ > > * vm.overcommit_memory=2 > > * kernel.shmmax = 8340893696 > * kernel.shmall = 2036351 > > _Versions:_ > > * PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1 > 20160406 (Red Hat 5.3.1-6), 64-bit > * POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. > 4.9.1, 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" > LIBXML="2.9.3" LIBJSON="0.12" RASTER > > > Many thanks, > > -- > Ivan -- Adrian Klaver adrian.klaver@aklaver.com
Running a multi-million row update will take a long time. It's possible you've exposed a memory leak in ST_Buffer (the older your version of GEOS, the more likely that is) but it's also possible you're just running a really long update. I find for batch processing purposes that creating fresh tables is far preferable: CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable; If you still see memory issues with the above then you probably do have a leak, *or* you're just running buffer on a sufficiently large input geometry or with a large enough radius to blow up the memory naturally. P On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi <ivan@wikiloc.com> wrote: > Hello, > > I am trying to update a column using a PostGIS ST_Buffer function into a > table of 4.257.769 rows, but after 6 hours, an Out of memory error appears > and the kernel starts killing processes until a Kernel Panic shows up. > > I have simplified the buffer target geometry and also added a gist index to > that column. > > The statement is the following: >> >> psql -h host -U user -W -d database -c "UPDATE table SET buffer = >> ST_Buffer(simplified_geometry, 0.005);" > > > After reading and tunning the configuration, I still have the same result. > > Here's the initial memory stats: > >> total used free shared buff/cache available >> Mem: 15G 1.5G 12G 503M 1.4G >> 13G >> Swap: 7.8G 0B 7.8G > > > > I'm running out of ideas, as I think the postgresql.conf memory parameters > are quite low for the machine specs. I understand I can split the process > and paginate the rows, but I can't see why I can't deal with this full > statement right now. > > Do you think this issue is related with the postgres memory parameters > configuration? Why is not respecting the shared_buffers or > effective_cache_size parameters and keeps growing? > > > Here's some info: > > Machine specs > > Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores) > 16 GB of memory > Fedora release 23 (Twenty Three) > Kernel - 4.5.7-202.fc23.x86_64 > > postgresql.conf > > effective_cache_size = 5GB > shared_buffers = 3GB > work_mem = 10MB > > maintenance_work_mem = 800MB > wal_buffers = 16MB > > Kernel parameters > > vm.overcommit_memory=2 > > kernel.shmmax = 8340893696 > kernel.shmall = 2036351 > > Versions: > > PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1 > 20160406 (Red Hat 5.3.1-6), 64-bit > POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04 > March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3" > LIBJSON="0.12" RASTER > > > Many thanks, > > -- > Ivan
You could check the max number of points in your geometries :
SELECT max(ST_NumPoints(geom))
FROM ...
Of course you could still have invalid / abberant geometry,
which you could also check (ST_IsValid, St_IsSimple).
You could solve both those hypotheses if you could perform your buffer by batch.
Cheers,
Rémi-C
2016-07-06 15:36 GMT+02:00 Paul Ramsey <pramsey@cleverelephant.ca>:
Running a multi-million row update will take a long time.
It's possible you've exposed a memory leak in ST_Buffer (the older
your version of GEOS, the more likely that is) but it's also possible
you're just running a really long update.
I find for batch processing purposes that creating fresh tables is far
preferable:
CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable;
If you still see memory issues with the above then you probably do
have a leak, *or* you're just running buffer on a sufficiently large
input geometry or with a large enough radius to blow up the memory
naturally.
P
On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi <ivan@wikiloc.com> wrote:
> Hello,
>
> I am trying to update a column using a PostGIS ST_Buffer function into a
> table of 4.257.769 rows, but after 6 hours, an Out of memory error appears
> and the kernel starts killing processes until a Kernel Panic shows up.
>
> I have simplified the buffer target geometry and also added a gist index to
> that column.
>
> The statement is the following:
>>
>> psql -h host -U user -W -d database -c "UPDATE table SET buffer =
>> ST_Buffer(simplified_geometry, 0.005);"
>
>
> After reading and tunning the configuration, I still have the same result.
>
> Here's the initial memory stats:
>
>> total used free shared buff/cache available
>> Mem: 15G 1.5G 12G 503M 1.4G
>> 13G
>> Swap: 7.8G 0B 7.8G
>
>
>
> I'm running out of ideas, as I think the postgresql.conf memory parameters
> are quite low for the machine specs. I understand I can split the process
> and paginate the rows, but I can't see why I can't deal with this full
> statement right now.
>
> Do you think this issue is related with the postgres memory parameters
> configuration? Why is not respecting the shared_buffers or
> effective_cache_size parameters and keeps growing?
>
>
> Here's some info:
>
> Machine specs
>
> Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
> 16 GB of memory
> Fedora release 23 (Twenty Three)
> Kernel - 4.5.7-202.fc23.x86_64
>
> postgresql.conf
>
> effective_cache_size = 5GB
> shared_buffers = 3GB
> work_mem = 10MB
>
> maintenance_work_mem = 800MB
> wal_buffers = 16MB
>
> Kernel parameters
>
> vm.overcommit_memory=2
>
> kernel.shmmax = 8340893696
> kernel.shmall = 2036351
>
> Versions:
>
> PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
> 20160406 (Red Hat 5.3.1-6), 64-bit
> POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04
> March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3"
> LIBJSON="0.12" RASTER
>
>
> Many thanks,
>
> --
> Ivan--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Thanks for all for the good feedback,
I realize that there's no problem with the configuration, and I assume that the full update statement is a bad decision by itself, so I made a batch process with single updates statements which commits every 50 single operation.
What I can discover now is that a memory leak appeared with a specific geometry of 6624 points. The memory starts falling dramatically untill the system collapses. So now, I can assume that the hanging was caused by the ST_Buffer function in that geometry (which I have been able to trace thanks to the split). I've already checked that the geometry is valid with ST_isValid(), so I should ask to the PostGIS people for further information.
So my learning here is that a very large single transaction is a bad practice by itself, not only for the (more than probably) memory errors, but for the loss of tracking in case of error.
Best regards,
Ivan
2016-07-06 15:42 GMT+02:00 Rémi Cura <remi.cura@gmail.com>:
You could check the max number of points in your geometries :SELECT max(ST_NumPoints(geom))FROM ...Of course you could still have invalid / abberant geometry,which you could also check (ST_IsValid, St_IsSimple).You could solve both those hypotheses if you could perform your buffer by batch.Cheers,Rémi-C2016-07-06 15:36 GMT+02:00 Paul Ramsey <pramsey@cleverelephant.ca>:Running a multi-million row update will take a long time.
It's possible you've exposed a memory leak in ST_Buffer (the older
your version of GEOS, the more likely that is) but it's also possible
you're just running a really long update.
I find for batch processing purposes that creating fresh tables is far
preferable:
CREATE TABLE newtable AS SELECT ST_Buffer(geom) ... FROM oldtable;
If you still see memory issues with the above then you probably do
have a leak, *or* you're just running buffer on a sufficiently large
input geometry or with a large enough radius to blow up the memory
naturally.
P
On Wed, Jul 6, 2016 at 2:13 AM, Ivan Bianchi <ivan@wikiloc.com> wrote:
> Hello,
>
> I am trying to update a column using a PostGIS ST_Buffer function into a
> table of 4.257.769 rows, but after 6 hours, an Out of memory error appears
> and the kernel starts killing processes until a Kernel Panic shows up.
>
> I have simplified the buffer target geometry and also added a gist index to
> that column.
>
> The statement is the following:
>>
>> psql -h host -U user -W -d database -c "UPDATE table SET buffer =
>> ST_Buffer(simplified_geometry, 0.005);"
>
>
> After reading and tunning the configuration, I still have the same result.
>
> Here's the initial memory stats:
>
>> total used free shared buff/cache available
>> Mem: 15G 1.5G 12G 503M 1.4G
>> 13G
>> Swap: 7.8G 0B 7.8G
>
>
>
> I'm running out of ideas, as I think the postgresql.conf memory parameters
> are quite low for the machine specs. I understand I can split the process
> and paginate the rows, but I can't see why I can't deal with this full
> statement right now.
>
> Do you think this issue is related with the postgres memory parameters
> configuration? Why is not respecting the shared_buffers or
> effective_cache_size parameters and keeps growing?
>
>
> Here's some info:
>
> Machine specs
>
> Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (8 cores)
> 16 GB of memory
> Fedora release 23 (Twenty Three)
> Kernel - 4.5.7-202.fc23.x86_64
>
> postgresql.conf
>
> effective_cache_size = 5GB
> shared_buffers = 3GB
> work_mem = 10MB
>
> maintenance_work_mem = 800MB
> wal_buffers = 16MB
>
> Kernel parameters
>
> vm.overcommit_memory=2
>
> kernel.shmmax = 8340893696
> kernel.shmall = 2036351
>
> Versions:
>
> PostgreSQL 9.5.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 5.3.1
> 20160406 (Red Hat 5.3.1-6), 64-bit
> POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.1, 04
> March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.9.3"
> LIBJSON="0.12" RASTER
>
>
> Many thanks,
>
> --
> Ivan--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Ivan