Обсуждение: Logging in LockBufferForCleanup()
Hi, I just noticed somewhat new code in LockBufferForCleanup(), added in commit 39b03690b529935a3c33024ee68f08e2d347cf4f Author: Fujii Masao <fujii@postgresql.org> Date: 2021-01-13 22:59:17 +0900 Log long wait time on recovery conflict when it's resolved. commit 64638ccba3a659d8b8a3a4bc5b47307685a64a8a Author: Fujii Masao <fujii@postgresql.org> Date: 2020-03-30 17:35:03 +0900 Report waiting via PS while recovery is waiting for buffer pin in hot standby. After those commit LockBufferForCleanup() contains code doing memory allocations, elogs. That doesn't strike me as a good idea: Previously the code looked somewhat safe to use in critical section like blocks (although whether it'd be good idea to use in one is a different question), but not after. Even if not used in a critical section, adding new failure conditions to low-level code that's holding LWLocks etc. doesn't seem like a good idea. Secondly, the way it's done seems like a significant laying violation. Before the HS related code in LockBufferForCleanup() was encapsulated in a few calls to routines dedicated to dealing with that. Now it's all over LockBufferForCleanup(). It also just increases the overhead of LockBuffer(). Adding palloc(), copying of process title, GetCurrentTimestamp() to a low level routine like this isn't free - even if it's mostly in the contended paths. Greetings, Andres Freund
On Wed, Feb 09, 2022 at 06:22:05PM -0800, Andres Freund wrote: > Previously the code looked somewhat safe to use in critical section like > blocks (although whether it'd be good idea to use in one is a different > question), but not after. Even if not used in a critical section, adding new > failure conditions to low-level code that's holding LWLocks etc. doesn't seem > like a good idea. This is an interesting point. Would the addition of one or more critical sections in this area impact its performance in any way? > It also just increases the overhead of LockBuffer(). Adding palloc(), copying > of process title, GetCurrentTimestamp() to a low level routine like this isn't > free - even if it's mostly in the contended paths. Good point. -- Michael
Вложения
On Thu, Feb 10, 2022 at 11:22 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > I just noticed somewhat new code in LockBufferForCleanup(), added in > > commit 39b03690b529935a3c33024ee68f08e2d347cf4f > Author: Fujii Masao <fujii@postgresql.org> > Date: 2021-01-13 22:59:17 +0900 > > Log long wait time on recovery conflict when it's resolved. > > commit 64638ccba3a659d8b8a3a4bc5b47307685a64a8a > Author: Fujii Masao <fujii@postgresql.org> > Date: 2020-03-30 17:35:03 +0900 > > Report waiting via PS while recovery is waiting for buffer pin in hot standby. > > > After those commit LockBufferForCleanup() contains code doing memory > allocations, elogs. That doesn't strike me as a good idea: > > Previously the code looked somewhat safe to use in critical section like > blocks (although whether it'd be good idea to use in one is a different > question), but not after. Even if not used in a critical section, adding new > failure conditions to low-level code that's holding LWLocks etc. doesn't seem > like a good idea. > > Secondly, the way it's done seems like a significant laying violation. Before > the HS related code in LockBufferForCleanup() was encapsulated in a few calls > to routines dedicated to dealing with that. Now it's all over > LockBufferForCleanup(). > > It also just increases the overhead of LockBuffer(). Adding palloc(), copying > of process title, GetCurrentTimestamp() to a low level routine like this isn't > free - even if it's mostly in the contended paths. While I understand that these are theoretically valid concerns, it’s unclear to me how much the current code leads to bad effects in practice. There are other low-level codes/paths that call palloc() while holding LWLocks, and I’m not sure that the overheads result in visible negative performance impact particularly because the calling to LockBufferForCleanup() is likely to accompany wait in the first place. BTW I think calling to LockBufferForCleanup() in a critical section is a bad idea for sure since it becomes uninterruptible. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
On 2022/02/10 20:28, Masahiko Sawada wrote: > While I understand that these are theoretically valid concerns, it’s > unclear to me how much the current code leads to bad effects in > practice. There are other low-level codes/paths that call palloc() > while holding LWLocks, and I’m not sure that the overheads result in > visible negative performance impact particularly because the calling > to LockBufferForCleanup() is likely to accompany wait in the first > place. I'm also not sure that. palloc() for PS display, copy of process title, GetCurrentTimestamp() and ereport() of recovery conflicthappen when we fail to acquire the lock. In this case we also call ResolveRecoveryConflictWithBufferPin() and waitto be signaled there. So compared to the wait time or overhead caused in ResolveRecoveryConflictWithBufferPin(), ISTMthat the overhead caused by them would be ralatively small. Also ereport() of recovery conflict can happen only whenlog_recovery_conflict_waits parameter is enabled and the deadlock timeout happens during the lock wait. > BTW I think calling to LockBufferForCleanup() in a critical > section is a bad idea for sure since it becomes uninterruptible. Yes. And as far as I tested, currently LockBufferForCleanup() is not called in critical section. Right? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION