Discussion: Reinitialize stack base after fork (for the benefit of rr)?

Reinitialize stack base after fork (for the benefit of rr)?

From
Andres Freund
Date:
Hi,

I've found rr [1] very useful to debug issues in postgres. The ability
to hit a bug, and then e.g. identify a pointer with problematic
contents, set a watchpoint on its contents, and reverse-continue is
extremely powerful.

Unfortunately, running postgres under rr currently triggers occasional
spurious stack-too-deep errors. That turns out to be because rr has to
use an alternative stack in some corner cases (IIUC when a signal
arrives while a signal handler is already running). That corner case
can unfortunately be hit from within postmaster, and can at least lead
to sigusr1_handler() being called with an alternative stack set.

Unfortunately that means that processes that postmaster fork()s while
using that alternative stack will continue to run on that alternative
stack. Which then means that our stack depth checks always trigger in
those children.
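
For context, the depth check measures the distance between a local
variable's address and a reference address captured at process start
(stack_base_ptr, set via set_stack_base()). A simplified sketch of that
check - not the actual code in src/backend/tcop/postgres.c, which
handles a few more cases - looks roughly like this:

#include <stdbool.h>
#include <stddef.h>

static char *stack_base_ptr = NULL;                   /* captured at process start */
static long max_stack_depth_bytes = 2 * 1024 * 1024;  /* roughly the max_stack_depth GUC */

bool
stack_is_too_deep(void)
{
    char    stack_top_loc;
    long    depth = (long) (stack_base_ptr - &stack_top_loc);

    /* stacks grow down on most platforms, up on a few */
    if (depth < 0)
        depth = -depth;

    /*
     * If stack_base_ptr was captured while running on a sigaltstack()
     * stack, but we're now on the regular stack (or vice versa), this
     * distance is essentially arbitrary and will usually exceed the
     * limit, yielding a spurious "stack depth limit exceeded" error.
     */
    return stack_base_ptr != NULL && depth > max_stack_depth_bytes;
}

So with a stack base inherited from the postmaster's alternative signal
stack, the comparison is meaningless in the child.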

I've not seen this trigger for normal backends (which makes sense,
they're not started from within a signal handler), but I have for
bgworkers. In particular, parallel workers are prone to hit the issue.


I've locally fixed the issue by computing the stack base address anew
for postmaster children. Currently in InitPostmasterChild().
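
Concretely, the change boils down to something like the following
sketch (not the actual patch; set_stack_base() and InitPostmasterChild()
are the existing functions, and the rest of the initialization is
elided):

/*
 * Sketch: recompute the reference stack address in every process forked
 * off postmaster, so that a fork() taken while postmaster was running
 * on an alternative signal stack doesn't leave the child comparing
 * against an address on that stack.
 */
void
InitPostmasterChild(void)
{
    IsUnderPostmaster = true;   /* we are a postmaster subprocess now */

    /*
     * Capture this child's own stack base.  Cheap, and postmaster is
     * unlikely to have used a meaningful amount of stack before forking.
     */
    (void) set_stack_base();

    /* ... rest of the existing child initialization ... */
}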

I'd like to get that change upstream. The rr hackers have fixed a number
of other issues that could be hit with postgres, but they couldn't see a
good way to address the potential for a different signal stack in this
edge case. And it doesn't seem crazy to me to compute the stack base
again in postmaster children: It's cheap enough and it's extremely
unlikely that postmaster uses up a crazy amount of stack.

I also don't find it too crazy to guard against forks in signal handlers
leading to a different stack base address. It's a pretty odd thing to
do.
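
The underlying behaviour is easy to demonstrate outside of postgres, by
the way. A minimal standalone demo (plain POSIX, not postgres code; the
printf() in the handler isn't async-signal-safe, it's just for
illustration):

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static char altstack[64 * 1024];        /* alternative signal stack */

static void
handler(int signo)
{
    (void) signo;

    if (fork() == 0)
    {
        /* The child keeps running on whatever stack the handler used. */
        int     child_local;
        int     on_alt = ((char *) &child_local >= altstack &&
                          (char *) &child_local < altstack + sizeof(altstack));

        printf("child local variable is %s the alternative stack\n",
               on_alt ? "on" : "NOT on");
        _exit(0);
    }
}

int
main(void)
{
    stack_t     ss;
    struct sigaction sa;

    ss.ss_sp = altstack;
    ss.ss_size = sizeof(altstack);
    ss.ss_flags = 0;
    sigaltstack(&ss, NULL);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    raise(SIGUSR1);             /* handler runs on altstack and forks there */
    wait(NULL);
    return 0;
}

This should report that the child is still on the alternative stack,
which is exactly the situation a forked postmaster child can end up in
under rr.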


Tom, while imo not a fix of the right magnitude here: Are you planning /
hoping to work again on your postmaster latch patch? I think it'd be
really good if we could restructure the postmaster code to do far far
less in signal handlers. And the postmaster latch patch seems like a big
step in that direction.  I think we mostly dropped it due to the release
schedule last time round?

Greetings,

Andres Freund

[1] https://github.com/mozilla/rr/



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I've locally fixed the issue by computing the stack base address anew
> for postmaster children. Currently in InitPostmasterChild().

> I'd like to get that change upstream. The rr hackers have fixed a number
> of other issues that could be hit with postgres, but they couldn't see a
> good way to address the potential for a different signal stack in this
> edge case. And it doesn't seem crazy to me to compute the stack base
> again in postmaster children: It's cheap enough and it's extremely
> unlikely that postmaster uses up a crazy amount of stack.

Seems reasonable.  I think we'd probably also need this in the
EXEC_BACKEND case, in case ASLR puts the child process's stack
somewhere else.  Can you merge your concern with that one?

On the other hand, it might be better to not launch children from the
signal handler, because I don't think we should assume the alternate
stack can grow as large as the main one.  Does POSIX talk about this?

> Tom, while imo not a fix of the right magnitude here: Are you planning /
> hoping to work again on your postmaster latch patch?

Um ... -ESWAPPEDOUT.  What are you thinking of?

            regards, tom lane



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Andres Freund
Date:
Hi,

On 2020-03-27 14:34:56 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I've locally fixed the issue by computing the stack base address anew
> > for postmaster children. Currently in InitPostmasterChild().
>
> > I'd like to get that change upstream. The rr hackers have fixed a number
> > of other issues that could be hit with postgres, but they couldn't see a
> > good way to address the potential for a different signal stack in this
> > edge case. And it doesn't seem crazy to me to compute the stack base
> > again in postmaster children: It's cheap enough and it's extremely
> > unlikely that postmaster uses up a crazy amount of stack.
>
> Seems reasonable.  I think we'd probably also need this in the
> EXEC_BACKEND case, in case ASLR puts the child process's stack
> somewhere else.  Can you merge your concern with that one?

We currently already do that there, in SubPostmasterMain(). If we add a
set_stack_base() to InitPostmasterChild() we can remove it from there,
though.


> On the other hand, it might be better to not launch children from the
> signal handler, because I don't think we should assume the alternate
> stack can grow as large as the main one.  Does POSIX talk about this?

I strongly agree that it'd be better - independent of what we conclude
re a localized fix for rr. I think I looked for specs around this a
while ago and couldn't find much. fork() is listed as async-signal-safe
(though there was discussion about removing it from that list - which
went nowhere, I think).


> > Tom, while imo not a fix of the right magnitude here: Are you planning /
> > hoping to work again on your postmaster latch patch?
>
> Um ... -ESWAPPEDOUT.  What are you thinking of?

https://postgr.es/m/18193.1492793404%40sss.pgh.pa.us

That doesn't convert all that much of postmaster to latches, but once
the basic infrastructure is in place, it doesn't seem too hard to
convert more. In particular sigusr1_handler(), which is the relevant one
here, looks fairly easy. SIGHUP_handler() and reaper() shouldn't be hard
either. Whether it makes sense to convert pmdie() for SIGQUIT is less
clear to me, but it also seems less clearly necessary: we don't fork
there, and we're shutting down anyway.

Greetings,

Andres Freund



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-03-27 14:34:56 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> Tom, while imo not a fix of the right magnitude here: Are you planning /
>>> hoping to work again on your postmaster latch patch?

>> Um ... -ESWAPPEDOUT.  What are you thinking of?

> https://postgr.es/m/18193.1492793404%40sss.pgh.pa.us

Oh, I thought we'd dropped that line of thinking in favor of trying
to not do work in the postmaster signal handlers (i.e. I thought *you*
were pushing this forward, not me).

            regards, tom lane



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Andres Freund
Date:
Hi,

On 2020-03-27 14:59:56 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-03-27 14:34:56 -0400, Tom Lane wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> Tom, while imo not a fix of the right magnitude here: Are you planning /
> >>> hoping to work again on your postmaster latch patch?
> 
> >> Um ... -ESWAPPEDOUT.  What are you thinking of?
> 
> > https://postgr.es/m/18193.1492793404%40sss.pgh.pa.us
> 
> Oh, I thought we'd dropped that line of thinking in favor of trying
> to not do work in the postmaster signal handlers (i.e. I thought *you*
> were pushing this forward, not me).

Hm - the way I imagine that working is that we'd do a SetLatch() in the
various signal handlers, and the main loop would then react to
got_sigchld-type variables. But for that we'd need latch support in
postmaster - which I think is pretty much exactly what your patch in the
above message does?

Of course there'd need to be several subsequent patches to move work out
of signal handlers into the main loop.
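
To make the shape concrete, here's a rough sketch (not a patch; it
assumes the latch infrastructure from your patch above is in place, the
function and flag names are placeholders, and the SIGCHLD branch is just
a comment standing in for what reaper() does today):

#include "postgres.h"

#include <errno.h>
#include <signal.h>

#include "miscadmin.h"
#include "storage/latch.h"
#include "utils/guc.h"

static volatile sig_atomic_t got_SIGHUP = false;
static volatile sig_atomic_t got_SIGCHLD = false;

/* Handlers shrink to "note the event and poke the latch". */
static void
handle_sighup(SIGNAL_ARGS)
{
    int         save_errno = errno;

    got_SIGHUP = true;
    SetLatch(MyLatch);
    errno = save_errno;
}

static void
handle_sigchld(SIGNAL_ARGS)
{
    int         save_errno = errno;

    got_SIGCHLD = true;
    SetLatch(MyLatch);
    errno = save_errno;
}

/* The actual work happens here, outside of any signal handler. */
static void
PostmasterMainLoop(void)
{
    for (;;)
    {
        (void) WaitLatch(MyLatch, WL_LATCH_SET, -1L, 0 /* wait event */ );
        ResetLatch(MyLatch);

        if (got_SIGHUP)
        {
            got_SIGHUP = false;
            ProcessConfigFile(PGC_SIGHUP);  /* what SIGHUP_handler() does today */
        }

        if (got_SIGCHLD)
        {
            got_SIGCHLD = false;
            /* reap dead children and react, i.e. what reaper() does today */
        }
    }
}

Installing the handlers with pqsignal() and converting the remaining
signals would follow the same pattern.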

Were you thinking of somehow doing that without using a latch?

Greetings,

Andres Freund



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Peter Geoghegan
Date:
On Fri, Mar 27, 2020 at 11:22 AM Andres Freund <andres@anarazel.de> wrote:
> I've found rr [1] very useful to debug issues in postgres. The ability
> to hit a bug, and then e.g. identify a pointer with problematic
> contents, set a watchpoint on its contents, and reverse-continue is
> extremely powerful.

I agree that rr is very useful. It would be great if we had a totally
smooth workflow for debugging using rr.

-- 
Peter Geoghegan



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Andres Freund
Date:
On 2020-04-04 21:02:56 -0700, Peter Geoghegan wrote:
> On Fri, Mar 27, 2020 at 11:22 AM Andres Freund <andres@anarazel.de> wrote:
> > I've found rr [1] very useful to debug issues in postgres. The ability
> > to hit a bug, and then e.g. identify a pointer with problematic
> > contents, set a watchpoint on its contents, and reverse-continue is
> > extremely powerful.
> 
> I agree that rr is very useful. It would be great if we had a totally
> smooth workflow for debugging using rr.

I just pushed that.



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Peter Geoghegan
Date:
On Sun, Apr 5, 2020 at 6:54 PM Andres Freund <andres@anarazel.de> wrote:
> I just pushed that.

Great!

I have found that it's useful to use rr to debug Postgres by following
certain recipes. I'll share some of the details now, in case anybody
else wants to start using rr and isn't sure where to start.

I have a script that records a postgres session using rr with these options:

rr record -M /code/postgresql/$BRANCH/install/bin/postgres \
  -D /code/postgresql/$BRANCH/data \
  --log_line_prefix="%m %p " \
  --autovacuum=off \
  --effective_cache_size=1GB \
  --random_page_cost=4.0 \
  --work_mem=4MB \
  --maintenance_work_mem=64MB \
  --fsync=off \
  --log_statement=all \
  --log_min_messages=DEBUG5 \
  --max_connections=50 \
  --shared_buffers=32MB

Most of these settings were taken from a similar script that I use to
run Valgrind, so the particulars may not matter much -- though it's
useful to make the server logs as verbose as possible (you'll see why
in a minute).

I find it quite practical to run "make installcheck" against the
server, recording the entire execution. It's not that much slower than
just running the tests against a regular debug build of Postgres, and
still much faster than Valgrind, for example. (Replaying the recording
seems to be where having a high-end machine helps a lot.)

Once the tests are done, I stop Postgres in the usual way (Ctrl + C).
The recording is saved to the $HOME/.local/share/rr/ directory on my
Linux distro -- rr creates a directory for each distinct recording in
this parent directory. rr also maintains a symlink (latest-trace) that
points to the latest recording directory, which I rely on most of the
time when replaying a recording (it's the default). I am careful not
to leave too many recordings around, since they're large enough that
it could become a concern.

The record/Postgres terminal has output that looks like this:

[rr 1786705 1241867]2020-04-04 21:55:05.018 PDT 1786705 DEBUG:
CommitTransaction(1) name: unnamed; blockState: STARTED; state:
INPROGRESS, xid/subid/cid: 63992/1/2
[rr 1786705 1241898]2020-04-04 21:55:05.019 PDT 1786705 DEBUG:
StartTransaction(1) name: unnamed; blockState: DEFAULT; state:
INPROGRESS, xid/subid/cid: 0/1/0
[rr 1786705 1241902]2020-04-04 21:55:05.019 PDT 1786705 LOG:
statement: CREATE TYPE test_type_empty AS ();
[rr 1786705 1241906]2020-04-04 21:55:05.020 PDT 1786705 DEBUG:
CommitTransaction(1) name: unnamed; blockState: STARTED; state:
INPROGRESS, xid/subid/cid: 63993/1/1
[rr 1786705 1241936]2020-04-04 21:55:05.020 PDT 1786705 DEBUG:
StartTransaction(1) name: unnamed; blockState: DEFAULT; state:
INPROGRESS, xid/subid/cid: 0/1/0
[rr 1786705 1241940]2020-04-04 21:55:05.020 PDT 1786705 LOG:
statement: DROP TYPE test_type_empty;
[rr 1786705 1241944]2020-04-04 21:55:05.021 PDT 1786705 DEBUG:  drop
auto-cascades to composite type test_type_empty
[rr 1786705 1241948]2020-04-04 21:55:05.021 PDT 1786705 DEBUG:  drop
auto-cascades to type test_type_empty[]
[rr 1786705 1241952]2020-04-04 21:55:05.021 PDT 1786705 DEBUG:
MultiXact: setting OldestMember[2] = 9
[rr 1786705 1241956]2020-04-04 21:55:05.021 PDT 1786705 DEBUG:
CommitTransaction(1) name: unnamed; blockState: STARTED; state:
INPROGRESS, xid/subid/cid: 63994/1/3

The part of each log line in square brackets comes from rr (since we
used -M when recording) -- the first number is a PID, the second an
event number. I usually don't care about the PIDs, though, since the
event number alone unambiguously identifies a particular "event" in a
particular backend (rr recordings are single threaded, even when there
are multiple threads or processes). Suppose I want to get to the
"CREATE TYPE test_type_empty AS ();" query -- I can get to the end of
the query by replaying the recording with this option:

$ rr replay -M -g 1241902

Replaying the recording like this takes me to the point where the
Postgres backend prints the log message at the end of executing the
query I mentioned -- I get a familiar gdb debug server (rr implements
a gdb backend). This isn't precisely the point of execution that
interests me, but it's close enough. I can easily set a breakpoint to
the precise function I'm interested in, and then "reverse-continue" to
get there by going backwards.

I can also find the point where a particular backend starts by using
the fork option instead. So for the PID 1786705, that would look like:

$ rr replay -M -f 1786705

(Don't try to use the similar -p option, since that starts a debug
server when the pid has been exec'd.)

rr really shines when debugging things like tap tests, where there is
complex scaffolding that may run multiple Postgres servers. You can
run an entire "rr record make check", without having to worry about
how that scaffolding works. Once you have useful event numbers to work
off of, it doesn't take too long to get an interactive debugging
session in the backend of interest by applying the same techniques.

Note that saving the output of a recording using standard tools like
"tee" seems to have some issues [1]. I've found it helpful to get log
output (complete with these event numbers) by doing an "autopilot"
replay, like this:

$  rr replay -M -a &> rr.log

This may actually be required when running "make installcheck" or
something similar, since there might be megabytes of log output. I
usually don't need to bother generating logs in this way, though. An
autopilot replay might take a few minutes, since rr replays everything
that was recorded at sub-realtime speed.

One last tip: rr pack can be used to save a recording in a fairly
stable format -- it copies the needed files into the trace:

$ rr pack

I haven't used this one yet. It seems like it would be useful if I
wanted to save a recording for more than a day or two. Because every
single detail of the recording (e.g. pointers, PIDs) is stable, it
seems possible to treat a recording as a totally self-contained thing.

Other resources:

https://github.com/mozilla/rr/wiki/Usage
https://github.com/mozilla/rr/wiki/Debugging-protips

[1] https://github.com/mozilla/rr/issues/91
-- 
Peter Geoghegan



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Andres Freund
Date:
Hi,

On 2020-04-05 20:35:50 -0700, Peter Geoghegan wrote:
> I have found that it's useful to use rr to debug Postgres by following
> certain recipes. I'll share some of the details now, in case anybody
> else wants to start using rr and isn't sure where to start.

Perhaps put it on a wiki page?

> I have a script that records a postgres session using rr with these options:
> 
> rr record -M /code/postgresql/$BRANCH/install/bin/postgres \
>   -D /code/postgresql/$BRANCH/data \
>   --log_line_prefix="%m %p " \
>   --autovacuum=off \

Were you doing this because of occasional failures in autovacuum
workers?  If so, that shouldn't be necessary after the stack base change
(previously workers IIRC also could start with the wrong stack base -
but didn't end up checking stack depth except for expression indexes).

Greetings,

Andres Freund



Re: Reinitialize stack base after fork (for the benefit of rr)?

From
Peter Geoghegan
Date:
On Sun, Apr 5, 2020 at 8:56 PM Andres Freund <andres@anarazel.de> wrote:
> Perhaps put it on a wiki page?

I added a new major section to the "getting a stack trace" wiki page:


https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Recording_Postgres_using_rr_Record_and_Replay_Framework

Feel free to add to and edit this section yourself.

> Were you doing this because of occasional failures in autovacuum
> workers?  If so, that shouldn't be necessary after the stack base change
> (previously workers IIRC also could start with the wrong stack base -
> but didn't end up checking stack depth except for expression indexes).

No, just a personal preference for things like this.

-- 
Peter Geoghegan