Re: Using ProcSignal to get memory context stats from a running backend

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: Using ProcSignal to get memory context stats from a running backend
Дата
Msg-id CAMsr+YGh+sso5N6Q+FmYHLWC=BPCzA+5GbhYZSGruj2d0c7Vvg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Using ProcSignal to get memory context stats from a runningbackend  (Andres Freund <andres@anarazel.de>)
Ответы Re: Using ProcSignal to get memory context stats from a running backend  (Craig Ringer <craig@2ndquadrant.com>)
Список pgsql-hackers
On 21 December 2017 at 15:24, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2017-12-21 15:13:13 +0800, Craig Ringer wrote:
> There tons of callers to  enlargeStringInfo, so a 'noerror' parameter would
> be viable.

Not sure what you mean with that sentence?

Mangled in editing and sent prematurely, disregard.

There are NOT tons of callers to  enlargeStringInfo, so adding a parameter that allowed it to return a failure result rather than ERROR on OOM seems to be a reasonable option. But it relies on repalloc(), which will ERROR on OOM. AFAICS we don't have "no error" variants of the memory allocation routines and I'm not eager to add them. Especially since we can't trust that we're not on misconfigured Linux where the OOM killer will go wild instead of giving us a nice NULL result.

So I guess that means we should probably just do individual elog(...)s and live with the ugliness of scraping the resulting mess out of the logs. After all, a log destination that could possibly copy and modify the string being logged a couple of times it's not a good idea to try to drop the whole thing into the logs in one blob. And we can't trust things like syslog with large messages.

I'll follow up with a revised patch that uses individual elog()s soon.

BTW, I also pgindented it in my tree, so it'll have formatting fixed up. pgindent's handling of 

static void printwrapper_stringinfo(void *extra, const char *fmt,...) pg_attribute_printf(2, 3);

upsets me a little, since if I break the line it mangles it into

static void
printwrapper_stringinfo(void *extra, const char *fmt,...)
pg_attribute_printf(2, 3);

so it'll break line-length rules a little, but otherwise conform.

> But I'm not convinced it's worth it personally. If we OOM in response to a
> ProcSignal request for memory context output, we're having pretty bad luck.
> The output is 8k in my test. But even if it were a couple of hundred kb,
> happening to hit OOM just then isn't great luck on modern systems with many
> gigabytes of RAM.

I've seen plenty memory dumps in the dozens to hundreds of
megabytes. And imo such cases are more likely to invite use of this
facility.

OK, that's more concerning then. Also impressively huge.

As written it's using MemoryContextStatsDetail(TopMemoryContext, 100) so it shouldn't really be *getting* multimegabyte dumps, but I'm not thrilled by hardcoding that. It doesn't merit a GUC though, so I'm not sure what to do about it and figured I'd make it a "later" problem.
 
> If that *does* happen, repalloc(...) will call
> MemoryContextStats(TopMemoryContext) before returning NULL. So we'll get
> our memory context dump anyway, albeit to stderr.

That would still abort the query that might otherwise continue to work,
so that seems no excuse.

Eh. It'd probably die soon enough anyway. But yeah, I agree it's not safe to hope we can sling around up to a couple of hundred MB under presumed memory pressure.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Using ProcSignal to get memory context stats from a runningbackend
Следующее
От: Mikael Kjellström
Дата:
Сообщение: Re: != should give error?