Discussion: Built-in support for a memory consumption ulimit?
After giving somebody advice, for the Nth time, to install a
memory-consumption ulimit instead of leaving his database to the tender
mercies of the Linux OOM killer, it occurred to me to wonder why we don't
provide a built-in feature for that, comparable to the "ulimit -c max"
option that already exists in pg_ctl.  A reasonably low-overhead way
to do that would be to define it as something a backend process sets
once at startup, if told to by a GUC.  The GUC could possibly be
PGC_BACKEND level though I'm not sure if we want unprivileged users
messing with it.

Thoughts?

			regards, tom lane
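For concreteness, a minimal sketch of the behavior being proposed, assuming
a hypothetical GUC named backend_memory_limit (in kilobytes); none of these
names exist in PostgreSQL.  The backend would apply the limit once via
setrlimit() during startup:

    #include "postgres.h"
    #include <sys/resource.h>

    /* Hypothetical GUC, in kB; 0 means "no limit". */
    static int  backend_memory_limit = 0;

    /* Sketch: call once during backend startup. */
    static void
    apply_backend_memory_limit(void)
    {
        struct rlimit rl;

        if (backend_memory_limit <= 0)
            return;                 /* feature disabled */

        /* Cap the data segment; see downthread for RLIMIT_AS caveats. */
        rl.rlim_cur = (rlim_t) backend_memory_limit * 1024;
        rl.rlim_max = rl.rlim_cur;
        if (setrlimit(RLIMIT_DATA, &rl) != 0)
            elog(WARNING, "could not set memory limit: %m");
    }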
On Sat, Jun 14, 2014 at 8:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> After giving somebody advice, for the Nth time, to install a
> memory-consumption ulimit instead of leaving his database to the tender
> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
> provide a built-in feature for that, comparable to the "ulimit -c max"
> option that already exists in pg_ctl.
Considering that we have quite a lot of backend-local state (prepared
statement cache, compiled PL function bodies, etc.) due to which memory
usage can increase and keep increasing depending on the operations
performed by the user, or due to some bug, I think having such a feature
would be useful.  In fact, I have heard such complaints from users.
> A reasonably low-overhead way
> to do that would be to define it as something a backend process sets
> once at startup, if told to by a GUC. The GUC could possibly be
> PGC_BACKEND level though I'm not sure if we want unprivileged users
> messing with it.
Providing such a feature via a GUC is a good idea, but I think changing
the limit on usage of system resources should be allowed only for
privileged users.
On 06/16/2014 11:56 AM, Amit Kapila wrote:
> On Sat, Jun 14, 2014 at 8:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> After giving somebody advice, for the Nth time, to install a
>> memory-consumption ulimit instead of leaving his database to the tender
>> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
>> provide a built-in feature for that, comparable to the "ulimit -c max"
>> option that already exists in pg_ctl.
>
> Considering that we have quite some stuff which is backend local (prepared
> statement cache, pl compiled body cache, etc..) due to which memory
> usage can increase and keep on increasing depending on operations
> performed by user

AFTER trigger queues, anybody?  Though they're bad enough that they really
need to spill to disk, adding a limit for them would be at best a temporary
workaround.

> Providing such a feature via GUC is a good idea, but I think changing
> limit for usage of system resources should be allowed to privileged
> users.

I don't think we have the facility to do what I'd really like to: let users
lower it, but not raise it above the system-provided max, just like ulimit
itself.  So SUSET seems OK to me.

I don't think it should be PGC_BACKEND, not least because I can see the
utility of a superuser-owned SECURITY DEFINER procedure applying
system-specific policy to who can set what limit.

-- 
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
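The "lower but not raise" semantics Craig describes don't exist in the GUC
machinery today.  A rough sketch of what a check hook could look like, with
all names hypothetical and glossing over whether superuser() is safe to
call at check-hook time:

    #include "postgres.h"
    #include "miscadmin.h"
    #include "utils/guc.h"

    /* Hypothetical: administrator-set ceiling, in kB; 0 means "no ceiling". */
    static int  system_max_memory_limit = 0;

    static bool
    check_backend_memory_limit(int *newval, void **extra, GucSource source)
    {
        /* Like ulimit: anyone may lower the limit, only a superuser may raise it. */
        if (!superuser() && system_max_memory_limit > 0 &&
            (*newval == 0 || *newval > system_max_memory_limit))
        {
            GUC_check_errmsg("memory limit may not be raised above %d kB",
                             system_max_memory_limit);
            return false;
        }
        return true;
    }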
On Sat, Jun 14, 2014 at 10:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After giving somebody advice, for the Nth time, to install a
> memory-consumption ulimit instead of leaving his database to the tender
> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
> provide a built-in feature for that, comparable to the "ulimit -c max"
> option that already exists in pg_ctl.  A reasonably low-overhead way
> to do that would be to define it as something a backend process sets
> once at startup, if told to by a GUC.  The GUC could possibly be
> PGC_BACKEND level though I'm not sure if we want unprivileged users
> messing with it.
>
> Thoughts?

What happens if the limit is exceeded?  ERROR?  FATAL?  PANIC?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jun 16, 2014 at 9:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> What happens if the limit is exceeded?  ERROR?  FATAL?  PANIC?

Well, presumably it just makes malloc return NULL, which causes an ERROR.

One advantage to setting it via a GUC is that it might be possible to, for
example, raise it automatically in critical sections or during error
unwinding.

-- 
greg
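A sketch of how the "raise it automatically" idea could work at the
setrlimit level (hypothetical helper names): a process may raise its soft
limit up to the hard limit without privilege, so the GUC value could be
installed as rlim_cur while rlim_max is left high, letting error recovery
relax the limit temporarily:

    #include <sys/resource.h>

    static struct rlimit saved_limit;

    /* Temporarily relax the soft limit, e.g. during error unwinding. */
    static void
    relax_memory_limit(void)
    {
        struct rlimit rl;

        getrlimit(RLIMIT_DATA, &saved_limit);
        rl = saved_limit;
        rl.rlim_cur = rl.rlim_max;  /* raising rlim_cur needs no privilege */
        (void) setrlimit(RLIMIT_DATA, &rl);
    }

    /* Reinstate the configured limit once back at the top level. */
    static void
    restore_memory_limit(void)
    {
        (void) setrlimit(RLIMIT_DATA, &saved_limit);
    }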
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jun 14, 2014 at 10:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> After giving somebody advice, for the Nth time, to install a
>> memory-consumption ulimit instead of leaving his database to the tender
>> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
>> provide a built-in feature for that, comparable to the "ulimit -c max"
>> option that already exists in pg_ctl.  A reasonably low-overhead way
>> to do that would be to define it as something a backend process sets
>> once at startup, if told to by a GUC.  The GUC could possibly be
>> PGC_BACKEND level though I'm not sure if we want unprivileged users
>> messing with it.

> What happens if the limit is exceeded?  ERROR?  FATAL?  PANIC?

We'd presumably get NULL returns from malloc, which'd be reported as
elog(ERROR, "out of memory").  It's not anything new for the code to
handle.

One issue if we allow this to be set on something other than PGC_BACKEND
timing is that somebody might try to reduce the setting to less than his
session is currently using.  I'm not sure what setrlimit would do in such
cases --- possibly fail, or maybe just set a limit that would cause all
subsequent malloc's to fail.  I doubt it'd actually take away existing
memory though.

			regards, tom lane
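The handling Tom refers to already exists in the allocator: aset.c-style
code checks malloc's result and turns NULL into an ordinary error, so a
setrlimit-induced failure would take the same path.  A paraphrased sketch
of that pattern, not the verbatim source:

    #include "postgres.h"

    /* Paraphrased from the allocator's existing out-of-memory handling. */
    static void *
    alloc_block_or_error(size_t blksize)
    {
        void       *block = malloc(blksize);

        if (block == NULL)
            ereport(ERROR,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory"),
                     errdetail("Failed on request of size %zu.", blksize)));
        return block;
    }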
I like this feature, but am wondering how to use it.  If it uses one value
across all backends, we may have to set it conservatively to avoid the OOM
killer, but that does not promote resource sharing.  If we set it per
backend, what's the suggested value?  One way is recommending that the user
sort his queries into a small-query group and a big-query group, where a
big-query connection sets a higher ulimit with respect to work_mem; but
this also has limited mileage.

An ideal way is PGC_SIGHUP, which implies that all server processes added
up shall respect this setting, and that it is adjustable.  I'm not sure how
to implement it, as setrlimit() does not seem to support process groups
(and what about Windows?).  Even if it does, a small issue is that this
might increase the chance that we hit OOM at some inconvenient places.  For
example, here:
/* Special case for startup: use good ol' malloc */
node = (MemoryContext) malloc(needed);
Assert(node != NULL);

I wonder how far we want to go along this line.  Consider this case: with
some concurrent big and medium queries, the system may comfortably allow
one big query to run with two or three medium queries soaking up the
left-over memory.  With query throttling, we hopefully make all queries run
successfully without mid-query failure surprises, and the ulimit guards the
bottom line if anything goes wrong.  This may lead to a discussion of more
complete workload-management support.

Regards,
Qingqing
On Sat, Jun 14, 2014 at 10:37:36AM -0400, Tom Lane wrote:
> After giving somebody advice, for the Nth time, to install a
> memory-consumption ulimit instead of leaving his database to the tender
> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
> provide a built-in feature for that, comparable to the "ulimit -c max"
> option that already exists in pg_ctl.

In principle, I would like to have such a feature.  The only limit I've
found reliable on the occasions I wanted this was RLIMIT_AS; RLIMIT_DATA
has been ineffective on Linux+GNU libc.  RLIMIT_AS has its own problems, of
course: address space usage is only tenuously connected to the definition
of "memory consumption" folks actually want.  Worse, one can often
construct a query to crash an RLIMIT_AS-affected backend.  Make the query
use heap space until just shy of the limit, then burn stack space until
RLIMIT_AS halts stack growth.

I would welcome a feature for configuring each RLIMIT_* or some selection
thereof.  Then it's up to the administrator to know the (possibly
platform-specific) benefits and risks of each limit.  I don't think a
high-level "limit memory consumption" feature is within reach, though.

> A reasonably low-overhead way
> to do that would be to define it as something a backend process sets
> once at startup, if told to by a GUC.  The GUC could possibly be
> PGC_BACKEND level though I'm not sure if we want unprivileged users
> messing with it.

Letting unprivileged users raise the limit is somewhat like allowing them
to raise work_mem.  On the other hand, while one can craft queries to
consume arbitrary multiples of work_mem, the combination of RLIMIT_AS and
CONNECTION LIMIT should be effective to cap a user's memory consumption.
Overall, PGC_SUSET sounds good to me, for the reasons Craig gave.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
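A sketch of the "configure each RLIMIT_*" feature Noah describes, with
hypothetical GUC names: a small table maps each setting to its resource
identifier, with -1 meaning "leave the inherited limit alone":

    #include "postgres.h"
    #include <sys/resource.h>

    typedef struct
    {
        const char *guc_name;       /* hypothetical GUC name */
        int         resource;       /* RLIMIT_* identifier */
        long        limit_bytes;    /* configured value, or -1 for "unset" */
    } rlimit_setting;

    static rlimit_setting rlimit_settings[] = {
        {"rlimit_as", RLIMIT_AS, -1},
        {"rlimit_data", RLIMIT_DATA, -1},
        {"rlimit_stack", RLIMIT_STACK, -1},
    };

    /* Sketch: apply whichever limits the administrator has configured. */
    static void
    apply_rlimit_settings(void)
    {
        int         i;

        for (i = 0; i < (int) lengthof(rlimit_settings); i++)
        {
            struct rlimit rl;

            if (rlimit_settings[i].limit_bytes < 0)
                continue;
            rl.rlim_cur = rl.rlim_max = (rlim_t) rlimit_settings[i].limit_bytes;
            if (setrlimit(rlimit_settings[i].resource, &rl) != 0)
                elog(WARNING, "setrlimit(%s) failed: %m",
                     rlimit_settings[i].guc_name);
        }
    }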
On Mon, Jun 16, 2014 at 10:16 PM, Noah Misch <noah@leadboat.com> wrote:
> In principle, I would like to have such a feature.  The only limit I've
> found reliable on the occasions I wanted this was RLIMIT_AS; RLIMIT_DATA
> has been ineffective on Linux+GNU libc.  RLIMIT_AS has its own problems,
> of course: address space usage is only tenuously connected to the
> definition of "memory consumption" folks actually want.  Worse, one can
> often construct a query to crash an RLIMIT_AS-affected backend.  Make the
> query use heap space until just shy of the limit, then burn stack space
> until RLIMIT_AS halts stack growth.

Ouch.  Having a feature that causes excessive memory utilization to produce
an ERROR that halts query execution and returns us to the top level, as Tom
and Greg were proposing, sounds nice, though even there I wonder what
happens if the memory exhaustion is due to things like relcache bloat which
will not be ameliorated by error recovery.  But having a feature that
crashes the backend in similar circumstances doesn't sound nearly as nice.

We could do better by accounting for memory usage ourselves, inside the
memory-context system, but that'd probably impose some overhead we don't
have today.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> We could do better by accounting for memory usage ourselves, inside
> the memory-context system, but that'd probably impose some overhead we
> don't have today.

Hm.  We could minimize the overhead if we just accounted for entire malloc
chunks and not individual palloc allocations.  That would make the overhead
not zero, but yet probably small enough to ignore.

On the other hand, this approach would entirely fail to account for
non-palloc'd allocations, which could be a significant issue in some
contexts.

I wonder how practical it would be to forestall Noah's scenario by
preallocating all the stack space we want before enabling the rlimit.

Another idea would be to do the enforcement ourselves by means of measuring
the change in "sbrk(0)"'s reported value since startup, which we could
check whenever, say, we're about to request a large malloc chunk in aset.c.
I'm not sure about the cost of that function though, nor about whether this
measurement method is meaningful in modern systems.

			regards, tom lane
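A sketch of the sbrk(0) idea, assuming a hypothetical memory_limit_bytes
setting: remember the program break at startup, then compare the current
break against that baseline before each large malloc in aset.c.  (As Robert
notes in the next message, allocators that satisfy large requests via
mmap() would escape this measurement.)

    #include "postgres.h"
    #include <unistd.h>

    static char *startup_brk = NULL;
    static size_t memory_limit_bytes = 0;   /* hypothetical; 0 = disabled */

    /* Call once at backend start. */
    static void
    remember_startup_brk(void)
    {
        startup_brk = (char *) sbrk(0);
    }

    /* Check before requesting a large malloc chunk. */
    static bool
    memory_limit_would_be_exceeded(size_t request)
    {
        char       *cur_brk = (char *) sbrk(0);

        return memory_limit_bytes > 0 &&
            (size_t) (cur_brk - startup_brk) + request > memory_limit_bytes;
    }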
On Tue, Jun 17, 2014 at 4:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> We could do better by accounting for memory usage ourselves, inside
>> the memory-context system, but that'd probably impose some overhead we
>> don't have today.
>
> Hm.  We could minimize the overhead if we just accounted for entire
> malloc chunks and not individual palloc allocations.  That would make
> the overhead not zero, but yet probably small enough to ignore.

Yeah, although it might expose more details of aset.c's allocation behavior
than we want users to have to know about.

> Another idea would be to do the enforcement ourselves by means of
> measuring the change in "sbrk(0)"'s reported value since startup, which we
> could check whenever, say, we're about to request a large malloc chunk in
> aset.c.  I'm not sure about the cost of that function though, nor about
> whether this measurement method is meaningful in modern systems.

I wouldn't like to count on that method.  I believe some malloc()
implementations implement large allocations via mmap().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jun 18, 2014 at 2:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Robert Haas <robertmhaas@gmail.com> writes:
> > We could do better by accounting for memory usage ourselves, inside
> > the memory-context system, but that'd probably impose some overhead we
> > don't have today.
>
> Hm. We could minimize the overhead if we just accounted for entire
> malloc chunks and not individual palloc allocations. That would make
> the overhead not zero, but yet probably small enough to ignore.
>
> On the other hand, this approach would entirely fail to account for
> non-palloc'd allocations, which could be a significant issue in some
> contexts.
Won't it be possible if we convert malloc calls in backend code to
go through wrapper, we already have some precedents of same like
guc_malloc, pg_malloc?
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Wed, Jun 18, 2014 at 2:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> On the other hand, this approach would entirely fail to account for
>> non-palloc'd allocations, which could be a significant issue in some
>> contexts.

> Won't it be possible if we convert malloc calls in backend code to
> go through wrapper, we already have some precedents of same like
> guc_malloc, pg_malloc?

We do not have control over mallocs done by third-party code (think pl/perl
for example).  mallocs done by our own code are fairly insignificant, I
would hope.

			regards, tom lane
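For reference, a sketch of the wrapper approach under discussion, in the
spirit of the guc_malloc/pg_malloc precedents; tracked_malloc and the
counters are hypothetical names.  As Tom says, allocations made directly by
third-party libraries such as pl/perl would bypass it, and a matching
tracked_free would also need each allocation's size, which is omitted here:

    #include <stdlib.h>

    static size_t total_tracked_bytes = 0;
    static size_t tracked_limit_bytes = 0;  /* hypothetical; 0 = disabled */

    /* Malloc wrapper that maintains a running total against a limit. */
    static void *
    tracked_malloc(size_t size)
    {
        void       *ptr;

        if (tracked_limit_bytes > 0 &&
            total_tracked_bytes + size > tracked_limit_bytes)
            return NULL;            /* caller reports "out of memory" */

        ptr = malloc(size);
        if (ptr != NULL)
            total_tracked_bytes += size;
        return ptr;
    }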
On Wed, Jun 18, 2014 at 10:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > Won't it be possible if we convert malloc calls in backend code to
> > go through wrapper, we already have some precedents of same like
> > guc_malloc, pg_malloc?
>
> We do not have control over mallocs done by third-party code
> (think pl/perl for example).
Yeah, mallocs done by third-party code would be difficult to track.  One
possibility could be to expose a built-in memory allocator function, though
that would require changes in any third-party code that wants to use this
new feature.  However, if that's not viable, then we need to think about
some OS-specific calls like the one you have suggested above (sbrk(0)), but
I think that solution would also need a portable API for Windows.
On Tue, Jun 17, 2014 at 04:39:51PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> We could do better by accounting for memory usage ourselves, inside
>> the memory-context system, but that'd probably impose some overhead we
>> don't have today.

> I wonder how practical it would be to forestall Noah's scenario by
> preallocating all the stack space we want before enabling the rlimit.

I think that's worth a closer look.  Compared to doing our own memory usage
tracking, it has the major advantage of isolating the added CPU overhead at
backend start.

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
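A sketch of the preallocation idea: before installing the rlimit at backend
start, recurse deep enough to touch every page down to the configured stack
depth, so later stack growth can't collide with RLIMIT_AS.  Hypothetical
helpers; a real version would have to be careful about compiler
optimizations and guard pages:

    #include <sys/resource.h>

    /* Touch stack pages down to the requested depth. */
    static void
    prewarm_stack(size_t bytes_remaining)
    {
        volatile char buf[4096];

        buf[0] = 1;                 /* touch a page of stack */
        if (bytes_remaining > sizeof(buf))
            prewarm_stack(bytes_remaining - sizeof(buf));
        buf[sizeof(buf) - 1] = 1;   /* live after the call, defeating tail-call optimization */
    }

    /* Prewarm the stack, then install the address-space limit. */
    static void
    enable_memory_limit(size_t limit_bytes, size_t max_stack_bytes)
    {
        struct rlimit rl;

        prewarm_stack(max_stack_bytes);     /* e.g. from max_stack_depth */

        rl.rlim_cur = rl.rlim_max = (rlim_t) limit_bytes;
        (void) setrlimit(RLIMIT_AS, &rl);
    }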