Discussion: checkpointer code behaving strangely on postmaster -T
I noticed while doing some tests that the checkpointer process does not
recover very nicely after a backend crashes under postmaster -T (after
all processes have been kill -CONTd, of course, and the postmaster told
to shut down via Ctrl-C on its console).  For some reason it seems to
get stuck in a loop doing sleep(0.5s).

In another case I caught it trying to do a checkpoint, but it was
progressing a single page each time and then sleeping.  In that
condition, the checkpoint took a very long time to finish.  Pressing
Ctrl-C in the postmaster console at that point does not have any effect.

-- 
Álvaro Herrera <alvherre@alvh.no-ip.org>
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> I noticed while doing some tests that the checkpointer process does not
> recover very nicely after a backend crashes under postmaster -T (after
> all processes have been kill -CONTd, of course, and postmaster told to
> shutdown via Ctrl-C on its console).  For some reason it seems to get
> stuck on a loop doing sleep(0.5s).  In another case I caught it trying
> to do a checkpoint, but it was progressing a single page each time and
> then sleeping.  In that condition, the checkpoint took a very long time
> to finish.

Is this still a problem as of HEAD?  I think I've fixed some issues in
the checkpointer's outer loop logic, but not sure if what you saw is
still there.

			regards, tom lane
Excerpts from Tom Lane's message of Thu May 10 02:27:32 -0400 2012:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> I noticed while doing some tests that the checkpointer process does not
>> recover very nicely after a backend crashes under postmaster -T (after
>> all processes have been kill -CONTd, of course, and postmaster told to
>> shutdown via Ctrl-C on its console).  For some reason it seems to get
>> stuck on a loop doing sleep(0.5s).  In another case I caught it trying
>> to do a checkpoint, but it was progressing a single page each time and
>> then sleeping.  In that condition, the checkpoint took a very long time
>> to finish.
>
> Is this still a problem as of HEAD?  I think I've fixed some issues in
> the checkpointer's outer loop logic, but not sure if what you saw is
> still there.

Yep, it's still there as far as I can tell.  A backtrace from the
checkpointer shows it's waiting on the latch.

It seems to me that the bug is in the postmaster state machine rather
than checkpointer itself.  After a few false starts, this seems to fix
it:

--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2136,6 +2136,8 @@ pmdie(SIGNAL_ARGS)
 				signal_child(WalWriterPID, SIGTERM);
 			if (BgWriterPID != 0)
 				signal_child(BgWriterPID, SIGTERM);
+			if (FatalError && CheckpointerPID != 0)
+				signal_child(CheckpointerPID, SIGUSR2);
 
 			/*
 			 * If we're in recovery, we can't kill the startup process
@@ -2178,6 +2180,8 @@ pmdie(SIGNAL_ARGS)
 				signal_child(WalReceiverPID, SIGTERM);
 			if (BgWriterPID != 0)
 				signal_child(BgWriterPID, SIGTERM);
+			if (FatalError && CheckpointerPID != 0)
+				signal_child(CheckpointerPID, SIGUSR2);
 
 			if (pmState == PM_RECOVERY)
 			{
 				/* only checkpointer is active in this state */

Note that since checkpointer can only be running after we enter
FatalError when the -T (send SIGSTOP instead of SIGQUIT) switch is used,
this bug doesn't seem to affect normal usage.
(I'm not sure SIGUSR2 is the most appropriate signal to send at this
time -- since we're in FatalError, probably SIGQUIT is better suited.)

One good thing is that when I patched postmaster in a different way
(which I later realized to be bogus), I caused it to die with an
assertion while checkpointer was still running; the debug output let me
know that checkpointer went away immediately.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of Thu May 10 02:27:32 -0400 2012:
>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> I noticed while doing some tests that the checkpointer process does not
>>> recover very nicely after a backend crashes under postmaster -T

> It seems to me that the bug is in the postmaster state machine rather
> than checkpointer itself.  After a few false starts, this seems to fix
> it:

> --- a/src/backend/postmaster/postmaster.c
> +++ b/src/backend/postmaster/postmaster.c
> @@ -2136,6 +2136,8 @@ pmdie(SIGNAL_ARGS)
>  				signal_child(WalWriterPID, SIGTERM);
>  			if (BgWriterPID != 0)
>  				signal_child(BgWriterPID, SIGTERM);
> +			if (FatalError && CheckpointerPID != 0)
> +				signal_child(CheckpointerPID, SIGUSR2);

Surely we do not want the checkpointer doing a shutdown checkpoint here.
If we need it to die immediately, SIGQUIT is the way.  If we want a
shutdown checkpoint, that has to wait till after everything else is
known dead.  So while I agree this may be a state machine bug, that
doesn't look like a good fix.

			regards, tom lane
On 10 May 2012 16:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Excerpts from Tom Lane's message of Thu May 10 02:27:32 -0400 2012:
>>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> I noticed while doing some tests that the checkpointer process does not
>>> recover very nicely after a backend crashes under postmaster -T
>
>> It seems to me that the bug is in the postmaster state machine rather
>> than checkpointer itself.  After a few false starts, this seems to fix
>> it:
>
>> --- a/src/backend/postmaster/postmaster.c
>> +++ b/src/backend/postmaster/postmaster.c
>> @@ -2136,6 +2136,8 @@ pmdie(SIGNAL_ARGS)
>>  				signal_child(WalWriterPID, SIGTERM);
>>  			if (BgWriterPID != 0)
>>  				signal_child(BgWriterPID, SIGTERM);
>> +			if (FatalError && CheckpointerPID != 0)
>> +				signal_child(CheckpointerPID, SIGUSR2);
>
> Surely we do not want the checkpointer doing a shutdown checkpoint here.
> If we need it to die immediately, SIGQUIT is the way.  If we want a
> shutdown checkpoint, that has to wait till after everything else is
> known dead.  So while I agree this may be a state machine bug, that
> doesn't look like a good fix.

Is this now fixed?  You've made a few changes so I'm confused.  Thanks.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of Thu May 10 02:27:32 -0400 2012:
>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> I noticed while doing some tests that the checkpointer process does not
>>> recover very nicely after a backend crashes under postmaster -T (after
>>> all processes have been kill -CONTd, of course, and postmaster told to
>>> shutdown via Ctrl-C on its console).  For some reason it seems to get
>>> stuck on a loop doing sleep(0.5s).  In another case I caught it trying
>>> to do a checkpoint, but it was progressing a single page each time and
>>> then sleeping.  In that condition, the checkpoint took a very long
>>> time to finish.

>> Is this still a problem as of HEAD?  I think I've fixed some issues in
>> the checkpointer's outer loop logic, but not sure if what you saw is
>> still there.

> Yep, it's still there as far as I can tell.  A backtrace from the
> checkpointer shows it's waiting on the latch.

I'm confused about what you did here and whether this isn't just pilot
error.  If you run with -T then the postmaster will just SIGSTOP the
remaining child processes, but then it will sit and wait for them to
die, since the state machine expects them to react as though they'd been
sent SIGQUIT.  If you SIGCONT any of them then that process will resume,
totally ignorant that it's supposed to die.  So "kill -CONTd, of course"
makes no sense to me.

I tried killing one child with -KILL, then sending SIGINT to the
postmaster, then killing the remaining already-stopped children, and the
postmaster did exit as expected after the last child died.

So I don't see any bug here.  And, after closer inspection, your
previous proposed patch is quite bogus because checkpointer is not
supposed to stop yet when the other processes are being terminated
normally.
Possibly it'd be useful to teach the postmaster more thoroughly about
SIGSTOP and have a way for it to really kill the remaining children
after you've finished investigating their state.  But frankly this is
the first time I've heard of anybody using that feature at all; I always
thought it was a vestigial hangover from days when the kernel was too
stupid to write separate core dump files for each backend.  I'd rather
remove SendStop than add more complexity there.

			regards, tom lane
Excerpts from Tom Lane's message of Fri May 11 16:50:01 -0400 2012:
>> Yep, it's still there as far as I can tell.  A backtrace from the
>> checkpointer shows it's waiting on the latch.
>
> I'm confused about what you did here and whether this isn't just pilot
> error.  If you run with -T then the postmaster will just SIGSTOP the
> remaining child processes, but then it will sit and wait for them to
> die, since the state machine expects them to react as though they'd
> been sent SIGQUIT.

The sequence of events is:

    postmaster -T
    crash a backend
    SIGINT postmaster
    SIGCONT all child processes

My expectation is that postmaster should exit normally after this.  What
happens instead is that all processes exit, except checkpointer.  And in
fact, postmaster is now in PM_WAIT_BACKENDS state, so sending SIGINT a
second time will not shut down checkpointer either.

Maybe we can consider this to be just pilot error, but then why do all
other processes exit normally?  To me it just seems an oversight in
checkpointer shutdown handling in conjunction with -T.

> If you SIGCONT any of them then that process will resume,
> totally ignorant that it's supposed to die.  So "kill -CONTd, of course"
> makes no sense to me.  I tried killing one child with -KILL, then
> sending SIGINT to the postmaster, then killing the remaining
> already-stopped children, and the postmaster did exit as expected after
> the last child died.

Uhm, after you SIGINTed postmaster, didn't it shut down all children?
That would be odd.

> So I don't see any bug here.  And, after closer inspection, your
> previous proposed patch is quite bogus because checkpointer is not
> supposed to stop yet when the other processes are being terminated
> normally.

Well, it only sends the signal when FatalError is set, so it should only
affect -T behavior.
> Possibly it'd be useful to teach the postmaster more thoroughly about
> SIGSTOP and have a way for it to really kill the remaining children
> after you've finished investigating their state.  But frankly this
> is the first time I've heard of anybody using that feature at all;
> I always thought it was a vestigial hangover from days when the kernel
> was too stupid to write separate core dump files for each backend.
> I'd rather remove SendStop than add more complexity there.

Hah.  I've used it a few times, but I can see that multiple core files
are okay.  Maybe you're right and we should just remove it.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of Fri May 11 16:50:01 -0400 2012:
>> I'm confused about what you did here and whether this isn't just pilot
>> error.

> The sequence of events is:
>     postmaster -T
>     crash a backend
>     SIGINT postmaster
>     SIGCONT all child processes
> My expectation is that postmaster should exit normally after this.

Well, my expectation is that the postmaster should wait for the children
to finish dying, and then exit rather than respawn anything.  It is not
on the postmaster's head to make them die anymore, because it already
(thinks it) sent them SIGQUIT.  Using SIGCONT here is pilot error.

> Maybe we can consider this to be just pilot error, but then why do all
> other processes exit normally?

The reason for that is that the postmaster's SIGINT interrupt handler
(lines 2163ff) sent them SIGTERM, without bothering to notice that we'd
already sent them SIGQUIT/SIGSTOP; so once you CONT them they get the
SIGTERM and drop out normally.  That handler knows it should not signal
the checkpointer yet, so the checkpointer doesn't get the memo.

But the lack of a FatalError check here is just a simplicity of
implementation thing; it should not be necessary to send any more
signals once we are in FatalError state.  Besides, this behavior is all
wrong for a crash recovery scenario: there is no guarantee that shared
memory is in good enough condition for SIGTERM shutdown to work.  And we
*definitely* don't want the checkpointer trying to write a shutdown
checkpoint.

>> So I don't see any bug here.  And, after closer inspection, your
>> previous proposed patch is quite bogus because checkpointer is not
>> supposed to stop yet when the other processes are being terminated
>> normally.

> Well, it does send the signal only when FatalError is set.  So it should
> only affect -T behavior.

If FatalError is set, it should not be necessary to send any more
signals, period, because we already tried to kill every child.
If we need to defend against somebody using SIGSTOP/SIGCONT
inappropriately, it would take a lot more thought (and code) than this,
and it would still be extremely fragile because a SIGCONT'd backend is
going to be executing against possibly-corrupt shared memory.

			regards, tom lane