Re: memory leak in trigger handling (since PG12)
От | Tomas Vondra |
---|---|
Тема | Re: memory leak in trigger handling (since PG12) |
Дата | |
Msg-id | ee89cce4-6b08-5c06-fd88-4fef5a204875@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: memory leak in trigger handling (since PG12) (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: memory leak in trigger handling (since PG12)
(Andres Freund <andres@anarazel.de>)
|
Список | pgsql-hackers |
On 5/24/23 20:14, Andres Freund wrote: > Hi, > > On 2023-05-23 23:26:42 +0200, Tomas Vondra wrote: >> On 5/23/23 19:14, Andres Freund wrote: >>> Hi, >>> >>> On 2023-05-23 18:23:00 +0200, Tomas Vondra wrote: >>>> This means that for an UPDATE with triggers, we may end up calling this >>>> for each row, possibly multiple bitmaps. And those bitmaps are allocated >>>> in ExecutorState, so won't be freed until the end of the query :-( >>> >>> Ugh. >>> >>> >>> I've wondered about some form of instrumentation to detect such issues >>> before. It's obviously a problem that we can have fairly large leaks, like the >>> one you just discovered, without detecting it for a couple years. It's kinda >>> made worse by the memory context infrastructure, because it hides such issues. >>> >>> Could it help to have a mode where the executor shutdown hook checks how much >>> memory is allocated in ExecutorState and warns if its too much? There's IIRC a >>> few places that allocate large things directly in it, but most of those >>> probably should be dedicated contexts anyway. Something similar could be >>> useful for some other long-lived contexts. >>> >> >> Not sure such simple instrumentation would help much, unfortunately :-( >> >> We only discovered this because the user reported OOM crashes, which >> means the executor didn't get to the shutdown hook at all. Yeah, maybe >> if we had such instrumentation it'd get triggered for milder cases, but >> I'd bet the amount of noise is going to be significant. >> >> For example, there's a nearby thread about hashjoin allocating buffiles >> etc. in ExecutorState - we ended up moving that to a separate context, >> but surely there are more such cases (not just for ExecutorState). > > Yes, that's why I said that we would have to more of those into dedicated > contexts - which is a good idea independent of this issue. > Yeah, I think that's a good idea in principle. > >> The really hard thing was determining what causes the memory leak - the >> simple instrumentation doesn't help with that at all. It tells you there >> might be a leak, but you don't know where did the allocations came from. >> >> What I ended up doing is a simple gdb script that sets breakpoints on >> all palloc/pfree variants, and prints info (including the backtrace) for >> each call on ExecutorState. And then a script that aggregate those to >> identify which backtraces allocated most chunks that were not freed. > > FWIW, for things like this I found "heaptrack" to be extremely helpful. > > E.g. for a reproducer of the problem here, it gave me the attach "flame graph" > of the peak memory usage - after attaching to a running backend and running an > UPDATE triggering the leak.. > > Because the view I screenshotted shows the stacks contributing to peak memory > usage, it works nicely to find "temporary leaks", even if memory is actually > freed after all etc. > That's a nice visualization, but isn't that useful only once you determine there's a memory leak? Which I think is the hard problem. > > >>> Hm. Somehow this doesn't seem quite right. Shouldn't we try to use a shorter >>> lived memory context instead? Otherwise we'll just end up with the same >>> problem in a few years. >>> >> >> I agree using a shorter lived memory context would be more elegant, and >> more in line with how we do things. But it's not clear to me why we'd >> end up with the same problem in a few years with what the patch does. > > Because it sets up the pattern of manual memory management and continues to > run the relevant code within a query-lifetime context. > Oh, you mean someone might add new allocations to this code (or into one of the functions executed from it), and that'd leak again? Yeah, true. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: