Discussion: How to cripple a postgres server


How to cripple a postgres server

From
Stephen Robert Norris
Date:
I've observed the following, and I wonder if anyone has seen it or has a
workaround, before I report it as a bug.

Doing the following will result in the load on the server reaching ~600,
and the server becoming very unresponsive. This seems to be a result of
the SIGUSR2 which is delivered for async notify to free space in the
event queue (I'm somewhat in the dark here).

1) Open many (I used 800) database connections and leave them idle.
2) run: while true; do vacuum {database} ; done

Wait. Observe db has stopped doing useful work.

This causes stalls of up to several minutes, during which the db fails to
respond in any way.
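
For reference, a rough sketch of the recipe in libpq C (this is not the
exact test driver used here; the connection count and database name are
just placeholders, and any database seems to do):

/* cripple.c -- rough sketch of the recipe above.
 * Build with: gcc cripple.c -o cripple -lpq
 * NIDLE and the dbname are placeholders. */
#include <stdio.h>
#include "libpq-fe.h"

#define NIDLE 800                       /* idle connections to hold open */

int
main(void)
{
    static PGconn *idle[NIDLE];
    PGconn     *vac;
    int         i;

    /* step 1: open many connections and leave them completely idle */
    for (i = 0; i < NIDLE; i++)
    {
        idle[i] = PQconnectdb("dbname=template1");
        if (PQstatus(idle[i]) != CONNECTION_OK)
        {
            fprintf(stderr, "connection %d failed: %s", i, PQerrorMessage(idle[i]));
            return 1;
        }
    }

    /* step 2: vacuum in a tight loop on one extra connection */
    vac = PQconnectdb("dbname=template1");
    if (PQstatus(vac) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(vac));
        return 1;
    }
    for (;;)
        PQclear(PQexec(vac, "VACUUM"));     /* interrupt with Ctrl-C */
}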

We noticed this on a production system, so it's fairly dire. I know one
workaround is to reduce the number of idle connections, but that will
make our system less responsive.

It looks like a bug to me - anyone got any ideas?

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> 1) Open many (I used 800) database connections and leave them idle.
> 2) run: while true; do vacuum {database} ; done

> Wait. Observe db has stopped doing useful work.

Is your machine capable of supporting 800 active backends in the absence
of any vacuum processing?  I'm not sure that "we're driving our server
into swapping hell" qualifies as a Postgres bug ...

If you're seeing load peaks in excess of what would be observed with
800 active queries, then I would agree there's something to investigate
here.

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 11:57, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > 1) Open many (I used 800) database connections and leave them idle.
> > 2) run: while true; do vacuum {database} ; done
>
> > Wait. Observe db has stopped doing useful work.
>
> Is your machine capable of supporting 800 active backends in the absence
> of any vacuum processing?  I'm not sure that "we're driving our server
> into swapping hell" qualifies as a Postgres bug ...
>
> If you're seeing load peaks in excess of what would be observed with
> 800 active queries, then I would agree there's something to investigate
> here.
>
>             regards, tom lane

Yep, indeed with 800 backends doing a query every second, nothing
happens.

The machine itself has 1GB of RAM, and uses no swap in the above
situation. Instead, system time goes to 99% of CPU. The machine is a
dual-CPU athlon 1900 (1.5GHz).

It _only_ happens with idle connections!

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
>> If you're seeing load peaks in excess of what would be observed with
>> 800 active queries, then I would agree there's something to investigate
>> here.

> Yep, indeed with 800 backends doing a query every second, nothing
> happens.

> The machine itself has 1GB of RAM, and uses no swap in the above
> situation. Instead, system time goes to 99% of CPU. The machine is a
> dual-CPU athlon 1900 (1.5GHz).

> It _only_ happens with idle connections!

Hmm, you mean if the 800 other connections are *not* idle, you can
do VACUUMs with impunity?  If so, I'd agree we got a bug ...

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 12:44, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> >> If you're seeing load peaks in excess of what would be observed with
> >> 800 active queries, then I would agree there's something to investigate
> >> here.
>
> > Yep, indeed with 800 backends doing a query every second, nothing
> > happens.
>
> > The machine itself has 1GB of RAM, and uses no swap in the above
> > situation. Instead, system time goes to 99% of CPU. The machine is a
> > dual-CPU athlon 1900 (1.5GHz).
>
> > It _only_ happens with idle connections!
>
> Hmm, you mean if the 800 other connections are *not* idle, you can
> do VACUUMs with impunity?  If so, I'd agree we got a bug ...
>
>             regards, tom lane

Yes, that's what I'm saying.

If you put a sleep 5 into my while loop above, the problem still happens
(albeit after a longer time).

If you put a delay in, and also make sure that each of the 800
connections does a query (I'm using "select 1;"), then the problem never
happens.

Without the delay, you get a bit of load, but only up to about 10 or so,
rather than 600.

What seems to happen is that all the idle backends get woken up every
now and then to do _something_ (it seemed to coincide with getting the
SIGUSR2 to trigger an async_notify), and that shoots up the load. Making
sure all the backends have done something every now and then seems to
avoid the problem.

I suspect the comment near the async_notify code explains the problem -
where it talks about idle backends having to be woken up to clear out
pending events...

We only notice it on our production system when it's got lots of idle db
connections.

Other people seem to have spotted it (I found references to similar
effects with Google) but nobody seems to have worked out what it is -
the other people typically had the problem with web servers which keep a
pool of db connections.

Just to reiterate; this machine never swaps, it can handle > 1000
queries per second and it doesn't seem to run out of resources...

    Stephen



Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> On Tue, 2002-05-28 at 12:44, Tom Lane wrote:
>> Hmm, you mean if the 800 other connections are *not* idle, you can
>> do VACUUMs with impunity?  If so, I'd agree we got a bug ...

> Yes, that's what I'm saying.

Fascinating.

> I suspect the comment near the async_notify code explains the problem -
> where it talks about idle backends having to be woken up to clear out
> pending events...

Well, we need to do that, but your report seems to suggest that that
code path isn't getting the job done completely --- my first guess is
that a new normal query has to arrive before a SIGUSR2'd backend is
completely happy again.  Interesting.

You didn't specify: what PG version are you running, and on what
platform?

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 13:21, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > On Tue, 2002-05-28 at 12:44, Tom Lane wrote:
> >> Hmm, you mean if the 800 other connections are *not* idle, you can
> >> do VACUUMs with impunity?  If so, I'd agree we got a bug ...
>
> > Yes, that's what I'm saying.
>
> Fascinating.
>
> > I suspect the comment near the async_notify code explains the problem -
> > where it talks about idle backends having to be woken up to clear out
> > pending events...
>
> Well, we need to do that, but your report seems to suggest that that
> code path isn't getting the job done completely --- my first guess is
> that a new normal query has to arrive before a SIGUSR2'd backend is
> completely happy again.  Interesting.
>
> You didn't specify: what PG version are you running, and on what
> platform?
>
>             regards, tom lane

Sorry, forgot that.

PG 7.1 or 7.2 (both have the problem) on Linux 2.2.x (x >= 17) and 2.4.x
(4 <= x <= 18).

I've tried a variety of intel machines (and AMD machines) and managed to
reproduce the same problem.

To clarify my comments, I suspect the problem is that all 800 of the
backends get the SIGUSR2 at the same time, and all wake up, causing the
kernel scheduler to go mad trying to decide which one to schedule... If
the connections aren't idle, the queue never fills up enough to require
the signals...

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> To clarify my comments, I suspect the problem is that all 800 of the
> backends get the SIGUSR2 at the same time, and all wake up,

That would indeed happen ...

> causing the
> kernel scheduler to go mad trying to decide which one to schedule... If
> the connections aren't idle, the queue never fills up enough to require
> the signals...

... but it doesn't follow that this is causing your problem.  All 800
are going to get woken up at the same time anyway (or as close to the
same time as the postmaster can issue the signals, that is).  Why does
it make a difference whether they have or soon will process a normal
query?  If you are able to prevent the problem by issuing "select 1"
once per second on each connection, then for sure it's not a case of
whether they are busy at receipt of the signal or not.  Most of 'em
will not be busy at any given instant, with a load as trivial as that.

Do you use LISTEN/NOTIFY at all?  When was your pg_listener table
last vacuumed?

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 13:42, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > To clarify my comments, I suspect the problem is that all 800 of the
> > backends get the SIGUSR2 at the same time, and all wake up,
>
> That would indeed happen ...
>
> > causing the
> > kernel scheduler to go mad trying to decide which one to schedule... If
> > the connections aren't idle, the queue never fills up enough to require
> > the signals...
>
> ... but it doesn't follow that this is causing your problem.  All 800
> are going to get woken up at the same time anyway (or as close to the
> same time as the postmaster can issue the signals, that is).  Why does
> it make a difference whether they have or soon will process a normal
> query?  If you are able to prevent the problem by issuing "select 1"
> once per second on each connection, then for sure it's not a case of
> whether they are busy at receipt of the signal or not.  Most of 'em
> will not be busy at any given instant, with a load as trivial as that.
>
> Do you use LISTEN/NOTIFY at all?  When was your pg_listener table
> last vacuumed?
>
>             regards, tom lane

My reading of the code was that the signals don't get delivered unless
the queue gets too full, and that entries on the queue are created by the
vacuum (and other stuff) and processed whenever a backend does something,
so under normal load the queue never gets too full.

My test program just cycles through each connection doing the select 1
on each in series.
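
Something along these lines (a sketch rather than the literal program;
the connection count, database name and the sleep between sweeps are all
knobs being played with):

/* keepalive sketch: sweep "select 1" over every open connection so none
 * of them stays idle for long. */
#include <stdio.h>
#include <unistd.h>
#include "libpq-fe.h"

#define NCONN    800
#define INTERVAL 5                      /* seconds between sweeps */

int
main(void)
{
    static PGconn *conns[NCONN];
    int         i;

    for (i = 0; i < NCONN; i++)
    {
        conns[i] = PQconnectdb("dbname=template1");
        if (PQstatus(conns[i]) != CONNECTION_OK)
        {
            fprintf(stderr, "connection %d: %s", i, PQerrorMessage(conns[i]));
            return 1;
        }
    }

    /* cycle through each connection doing a trivial query, in series */
    for (;;)
    {
        for (i = 0; i < NCONN; i++)
            PQclear(PQexec(conns[i], "select 1"));
        sleep(INTERVAL);
    }
}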

No LISTEN/NOTIFY at all. I've been creating entirely new db setups to
test it (with initdb).

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> My reading of the code was that the signals didn't get delivered unless
> the queue got too full, and that entries on the queue are created by the
> vacuum (and other stuff) and processed when a backend does something,
> thus the queue never gets too full.

Good point.  And certainly the async-notify code (which scans through
pg_listener) is a lot more expensive than is needed just to respond to
an SInval-queue-full condition; that looks to be a quick hack that was
inserted without thought to performance.  But I don't think we quite
understand this issue yet.  If your system can support 800 simultaneous
useful queries then it shouldn't have a problem with 800 simultaneous
scans of an empty pg_listener table.

I'll ask again: is your system sized to support 800 *simultaneous*
user queries?  (One query per second on 800 connections is hardly
the same thing.)

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 13:57, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > My reading of the code was that the signals didn't get delivered unless
> > the queue got too full, and that entries on the queue are created by the
> > vacuum (and other stuff) and processed when a backend does something,
> > thus the queue never gets too full.
>
> Good point.  And certainly the async-notify code (which scans through
> pg_listener) is a lot more expensive than is needed just to respond to
> an SInval-queue-full condition; that looks to be a quick hack that was
> inserted without thought to performance.  But I don't think we quite
> understand this issue yet.  If your system can support 800 simultaneous
> useful queries then it shouldn't have a problem with 800 simultaneous
> scans of an empty pg_listener table.
>
> I'll ask again: is your system sized to support 800 *simultaneous*
> user queries?  (One query per second on 800 connections is hardly
> the same thing.)
>
>             regards, tom lane

I'm not sure; it can certainly do >1k queries/second across 800
simultaneous connections (about one query per connection per second), but
it's hard to find enough machines to load it up that much...

One big difference, though, is that with the vacuum problem, the CPU
used is almost all (99%) system time; loading up the db with lots of
queries increases user time mostly, with little system time...

In any event, it seems a bug that merely having connections open causes
this problem! They aren't even in transactions...

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> One big difference, though, is that with the vacuum problem, the CPU
> used is almost all (99%) system time; loading up the db with lots of
> queries increases user time mostly, with little system time...

Hmm, that's a curious point; leaves one wondering about possible kernel
bugs.

> In any event, it seems a bug that merely having connections open causes
> this problem! They aren't even in transactions...

If the problem is that you've launched far more backends than the system
can really support, I'd have no hesitation in writing it off as user
error.  "Idle" processes are not without cost.  But at this point
I can't tell whether that's the case, or whether you're looking at a
genuine performance bug in either Postgres or the kernel.

Can you run strace (or truss or kernel-call-tracer-of-your-choice) on
the postmaster, and also on one of the putatively idle backends, so
we can see some more data about what's happening?

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 14:24, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > One big difference, though, is that with the vacuum problem, the CPU
> > used is almost all (99%) system time; loading up the db with lots of
> > queries increases user time mostly, with little system time...
>
> Hmm, that's a curious point; leaves one wondering about possible kernel
> bugs.
>
> > In any event, it seems a bug that merely having connections open causes
> > this problem! They aren't even in transactions...
>
> If the problem is that you've launched far more backends than the system
> can really support, I'd have no hesitation in writing it off as user
> error.  "Idle" processes are not without cost.  But at this point
> I can't tell whether that's the case, or whether you're looking at a
> genuine performance bug in either Postgres or the kernel.
>
> Can you run strace (or truss or kernel-call-tracer-of-your-choice) on
> the postmaster, and also on one of the putatively idle backends, so
> we can see some more data about what's happening?
>
>             regards, tom lane

I've already strace'ed the idle backend, and I can see the SIGUSR2 being
delivered just before everything goes bad.

What resource would you think idle backends might be exhausting?

On the production system, the problem doesn't happen under load (of
about 60-80 non-trivial queries/second) but does happen when the system
is largely idle. The number of connections is exactly the same in both
cases...

    Stephen


Re: How to cripple a postgres server

From
Curt Sampson
Date:
On Tue, 28 May 2002, Tom Lane wrote:

> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > One big difference, though, is that with the vacuum problem, the CPU
> > used is almost all (99%) system time; loading up the db with lots of
> > queries increases user time mostly, with little system time...
>
> Hmm, that's a curious point; leaves one wondering about possible kernel
> bugs.

If it turns out to be so, this would not be the first problem I've
heard about in the Linux scheduler. (It was notoriously bad for years.)

I'd suggest a good test would be to try this on a BSD machine and
see if the problem exists there, too. That will at least tell you
if it's Postgres or Linux.

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC


Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 15:43, Curt Sampson wrote:
> On Tue, 28 May 2002, Tom Lane wrote:
>
> > Stephen Robert Norris <srn@commsecure.com.au> writes:
> > > One big difference, though, is that with the vacuum problem, the CPU
> > > used is almost all (99%) system time; loading up the db with lots of
> > > queries increases user time mostly, with little system time...
> >
> > Hmm, that's a curious point; leaves one wondering about possible kernel
> > bugs.
>
> If it turns out to be so, this would not be the first problem I've
> heard about in the Linux scheduler. (It was notoriously bad for years.)
>
> I'd suggest a good test would be to try this on a BSD machine and
> see if the problem exists there, too. That will at least tell you
> if it's Postgres or Linux.
>
> cjs
> --
> Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
>     Don't you know, in this new Dark Age, we're all light.  --XTC

Can someone with access to a BSD machine try this? It's pretty easy to
set up, a simple loop to open a few hundred connections and a 1-line
shell script. It doesn't seem to matter what's in the database...

I'm going to try the O(1) scheduler patch for the linux kernel and see
if that helps...

    Stephen


Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 14:24, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > One big difference, though, is that with the vacuum problem, the CPU
> > used is almost all (99%) system time; loading up the db with lots of
> > queries increases user time mostly, with little system time...
>
> Hmm, that's a curious point; leaves one wondering about possible kernel
> bugs.
>
> > In any event, it seems a bug that merely having connections open causes
> > this problem! They aren't even in transactions...
>
> If the problem is that you've launched far more backends than the system
> can really support, I'd have no hesitation in writing it off as user
> error.  "Idle" processes are not without cost.  But at this point
> I can't tell whether that's the case, or whether you're looking at a
> genuine performance bug in either Postgres or the kernel.
>
> Can you run strace (or truss or kernel-call-tracer-of-your-choice) on
> the postmaster, and also on one of the putatively idle backends, so
> we can see some more data about what's happening?
>
>             regards, tom lane

I wonder if it's a problem with a second SIGUSR2 arriving before the
first is finished? It seems much easier to trigger the effect with more
rapid vacuums than with a delay (even accounting for the reduced number
of vacuums occurring).

    Stephen


Re: How to cripple a postgres server

From
Mitch Vincent
Date:
> Can someone with access to a BSD machine try this? It's pretty easy to
> set up, a simple loop to open a few hundred connections and a 1-line
> shell script. It doesn't seem to matter what's in the database...
>
> I'm going to try the O(1) scheduler patch for the linux kernel and see
> if that helps...
>
> Stephen
>

I'm willing - send over what you were using, plus the schema and a dump of any
database you were testing against (if it matters).

I don't think I have a machine that I can get to tomorrow that can do 800
connections at the same time -- does a smaller number of connections produce
the same result on your machine(s)?

-Mitch


Re: How to cripple a postgres server

From
Justin Clift
Date:
Hi Stephen,

Are you able to give some detailed technical specs of the hardware
you're running?

I consider yours to be a higher-end PostgreSQL server, and I'd like to
have a good practical understanding of what is required (hardware-wise)
to put together a PostgreSQL server that handles 1k transactions/second.

:-)

Regards and best wishes,

Justin Clift


<snip>

> I'm not sure; it can certainly do >1k queries/second through 800
> simultaneous connections (about 1/connection second), but it's hard to
> find enough machines to load it up that much...
>
> One big difference, though, is that with the vacuum problem, the CPU
> used is almost all (99%) system time; loading up the db with lots of
> queries increases user time mostly, with little system time...
>
> In any event, it seems a bug that merely having connections open causes
> this problem! They aren't even in transactions...
>
>         Stephen
>

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi

Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> I've already strace'ed the idle backend, and I can see the SIGUSR2 being
> delivered just before everything goes bad.

Yes, but what happens after that?

If you don't see anything obvious by examining a single process, maybe
strace'ing the postmaster + all descendant processes would offer a
better viewpoint.

> What resource would you think idle backends might be exhausting?

Difficult to say.  I suspect your normal load doesn't have *all* the
backends trying to run queries at once.  But in any case the SIGUSR2
event should only produce a momentary spike in load, AFAICS.

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 22:26, Justin Clift wrote:
> Hi Stephen,
>
> Are you able to give some detailed technical specs of the hardware
> you're running?
>
> I consider yours to be a higher-end PostgreSQL server, and I'd like to
> have a good practical understanding of what is required (hardware wise)
> to put together a 1k/second transaction PostgreSQL server.
>
> :-)
>
> Regards and best wishes,
>
> Justin Clift

It's nothing special - a dual Athlon (1600MHz or whatever), 1GB RAM, 4
UDMA IDE HDD.

The key trick is that we only have about 200MB of data to serve, and
thus never actually hit the disks...

On similar hardware, but with 3GB RAM and 160GB of data, we go back to a
more usual 30-50 random queries/second.

For background, the first system is part of a stock exchange information
system; the second is part of a bill presentation system.

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
I spent some time this morning trying to reproduce your problem, with
not much luck.  I used the attached test program, in case anyone else
wants to try --- it fires up the specified number of connections
(issuing a trivial query on each one, just so that the backend is not
completely virgin) and goes to sleep.  I ran that in one window and did
manual "vacuum full"s in psql in another window.  I was doing the
vacuums in the regression database which has about 150 tables, so there
was an SI overrun event (resulting in SIGUSR2) every third or so vacuum.

Using stock Red Hat Linux 7.2 (kernel 2.4.7-10) on a machine with 256M
of RAM, I was able to run up to about 400 backends without seeing much
of any performance problem.  (I had the postmaster running with
postmaster -i -F -N 1000 -B 2000 and defaults in postgresql.conf.)
Each SI overrun fired up all the idle backends, but they went back to
sleep after a couple of kernel calls and not much computation.

Above 500 backends the thing went into swap hell --- it took minutes of
realtime to finish out the SI overrun cycle, even though the CPU was
idle (waiting for swap-in) most of the time.

I could not see any mode where system CPU percentage was significant.
I did notice the user CPU spiking up at times (this seems to be due to
the deadlock detection algorithm, which gets invoked once a process has
waited "too long" for a lock).

I was doing this with current development sources, but I have no reason
to think that PG 7.2 would act differently.

It's fairly clear from looking at this example that we're handling SI
overrun notification in a very inefficient manner; probably it should
have its own signal so that receiving backends can just read the SI
queue and not bother with scanning pg_listener.  But there's something
else going on in your system that I don't understand.  You're not
swapping and you are seeing 99% system CPU --- what's causing that?
And why don't I see it?

When there are not enough backends to cause swapping, strace'ing a
randomly-chosen backend produces results like this for each SIGUSR2
cycle:

11:40:24.824914 recv(8, 0x83ff600, 8192, 0) = ? ERESTARTSYS (To be restarted)
11:40:32.549941 --- SIGUSR2 (User defined signal 2) ---
11:40:32.550243 gettimeofday({1022686832, 550363}, NULL) = 0
11:40:32.550592 semop(5898249, 0xbfffeba4, 1) = 0
11:40:32.639593 lseek(4, 0, SEEK_END)   = 0
11:40:32.639729 sigreturn()             = ? (mask now [])
11:40:32.639845 recv(8,

On the other hand, with just a few too many backends I get:

11:42:46.913127 recv(8, 0x83ff600, 8192, 0) = ? ERESTARTSYS (To be restarted)
11:43:56.115481 --- SIGUSR2 (User defined signal 2) ---
11:43:56.577389 semop(5767173, 0xbfffebe4, 1) = 0
11:44:00.441212 semop(7438392, 0xbfffebe4, 1) = 0
11:44:00.441306 semop(5767173, 0xbfffeb34, 1) = 0
11:44:03.929909 semop(6455322, 0xbfffeb34, 1) = 0
11:44:03.930026 gettimeofday({1022687043, 930059}, NULL) = 0
11:44:03.931215 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={1, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
11:44:03.931321 semop(5767173, 0xbfffeac4, 1) = -1 EINTR (Interrupted system call)
11:44:04.923771 --- SIGALRM (Alarm clock) ---
11:44:04.923969 semop(5767173, 0xbfffe734, 1) = 0
11:44:05.296190 semop(7602237, 0xbfffe734, 1) = 0
11:44:05.296368 sigreturn()             = ? (mask now [USR2])
11:44:05.296500 semop(5767173, 0xbfffeac4, 1) = 0
11:44:09.216795 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
11:44:09.217014 lseek(4, 0, SEEK_END)   = 0
11:44:09.286696 semop(6127632, 0xbfffeb54, 1) = 0
11:44:09.555040 sigreturn()             = ? (mask now [])
11:44:09.555251 recv(8,

(The number of semops can vary drastically depending on timing --- each
one indicates blocking on an internal lock.  BTW, does anyone know how
to get Linux strace to show the contents of the second arg to semop?)

What does your strace look like?

            regards, tom lane

/*
 * startlots.c
 *
 * Build with libpq (eg, gcc startlots.c -o startlots -lpq).
 *
 * Usage: startlots N to start N backends.  Control-C to exit.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include "libpq-fe.h"

static void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

static void
start1conn()
{
    char       *pghost,
               *pgport,
               *pgoptions,
               *pgtty;
    char       *dbName;
    PGconn       *conn;
    PGresult   *res;

    /*
     * begin, by setting the parameters for a backend connection if the
     * parameters are null, then the system will try to use reasonable
     * defaults by looking up environment variables or, failing that,
     * using hardwired constants
     */
    pghost = NULL;                /* host name of the backend server */
    pgport = NULL;                /* port of the backend server */
    pgoptions = NULL;            /* special options to start up the backend
                                 * server */
    pgtty = NULL;                /* debugging tty for the backend server */
    dbName = "template1";

    /* make a connection to the database */
    conn = PQsetdb(pghost, pgport, pgoptions, pgtty, dbName);

    /* check to see that the backend connection was successfully made */
    if (PQstatus(conn) == CONNECTION_BAD)
    {
        fprintf(stderr, "Connection to database '%s' failed.\n", dbName);
        fprintf(stderr, "%s", PQerrorMessage(conn));
        exit_nicely(conn);
    }

    /* run a transaction */
    res = PQexec(conn, "select version()");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "select version() command didn't return tuples properly\n");
        PQclear(res);
        exit_nicely(conn);
    }

    PQclear(res);

    /* leave the connection open */
}

int
main(int argc, char**argv)
{
    int        nconn = atoi(argv[1]);

    while (nconn-- > 0)
        start1conn();

    sleep(100000);

    return 0;
}

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Thu, 2002-05-30 at 01:52, Tom Lane wrote:
> I spent some time this morning trying to reproduce your problem, with
> not much luck.  I used the attached test program, in case anyone else
> wants to try --- it fires up the specified number of connections
> (issuing a trivial query on each one, just so that the backend is not
> completely virgin) and goes to sleep.  I ran that in one window and did
> manual "vacuum full"s in psql in another window.  I was doing the
> vacuums in the regression database which has about 150 tables, so there
> was an SI overrun event (resulting in SIGUSR2) every third or so vacuum.
>
> Using stock Red Hat Linux 7.2 (kernel 2.4.7-10) on a machine with 256M
> of RAM, I was able to run up to about 400 backends without seeing much
> of any performance problem.  (I had the postmaster running with
> postmaster -i -F -N 1000 -B 2000 and defaults in postgresql.conf.)
> Each SI overrun fired up all the idle backends, but they went back to
> sleep after a couple of kernel calls and not much computation.

Similar setup here, but 1GB RAM. If this problem is some sort of O(n^2)
thing, it could well be the case that it only happens on (for example) >
600 backends, and is fine at 400...

I also wonder if SMP has any impact - if there's lots of semops going
on, and the memory is being thrashed between CPU caches, that won't be
nice...

> Above 500 backends the thing went into swap hell --- it took minutes of
> realtime to finish out the SI overrun cycle, even though the CPU was
> idle (waiting for swap-in) most of the time.

I never swap.

Some more data from this end - with the lines you asked me to remove
yesterday taken out, I have only managed to reproduce the problem once in
about two hours. With the lines still in, the problem happens after a
minute or two pretty much every time.

I still see the high numbers of processes in the run queue, and the load
rises, but neither postgres nor the machine stalls.


> What does your strace look like?
>
>             regards, tom lane

In "normal" SI overruns, about the same:

--- SIGUSR2 (User defined signal 2) ---
gettimeofday({1022719053, 355014}, NULL) = 0
close(7)                                = 0
close(6)                                = 0
close(4)                                = 0
close(3)                                = 0
close(9)                                = 0
semop(6258745, 0xbfffeb04, 1)           = 0
semop(6193207, 0xbfffeb04, 1)           = 0
open("/var/lib/pgsql/data/base/504592641/1259", O_RDWR) = 3
open("/var/lib/pgsql/data/base/504592641/16429", O_RDWR) = 4
semop(6258745, 0xbfffe8e4, 1)           = 0
semop(6225976, 0xbfffe8e4, 1)           = 0
open("/var/lib/pgsql/data/base/504592641/1249", O_RDWR) = 6
open("/var/lib/pgsql/data/base/504592641/16427", O_RDWR) = 7
open("/var/lib/pgsql/data/base/504592641/16414", O_RDWR) = 9
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={1, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
semop(6258745, 0xbfffea24, 1)           = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, {it_interval={0, 0}, it_value={0, 870000}}) = 0
lseek(9, 0, SEEK_END)                   = 0
semop(4456450, 0xbfffeac4, 1)           = 0
sigreturn()                             = ? (mask now [])
recv(8, 0x839a0a0, 8192, 0)             = ? ERESTARTSYS (To be restarted)

Although, I see anything up to 9 or even 15 semop() calls and file
close/open pairs.

When it went mad, this happened:

--- SIGUSR2 (User defined signal 2) ---
gettimeofday({1022720979, 494838}, NULL) = 0
semop(10551353, 0xbfffeb04, 1)          = 0
close(7)                                = 0
close(6)                                = 0
close(4)                                = 0
close(3)                                = 0
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
close(9)                                = 0
open("/var/lib/pgsql/data/base/504592641/1259", O_RDWR) = 3
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
open("/var/lib/pgsql/data/base/504592641/16429", O_RDWR) = 4
open("/var/lib/pgsql/data/base/504592641/1249", O_RDWR) = 6
open("/var/lib/pgsql/data/base/504592641/16427", O_RDWR) = 7
open("/var/lib/pgsql/data/base/504592641/16414", O_RDWR) = 9
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={1, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
semop(10551353, 0xbfffea24, 1)          = -1 EINTR (Interrupted system call)
--- SIGALRM (Alarm clock) ---
semop(10551353, 0xbfffe694, 1)          = 0
semop(8716289, 0xbfffe694, 1)           = 0
sigreturn()                             = ? (mask now [USR2])

However, the strace stopped just before the ) on the first semop, which
I think means it hadn't completed. The whole thing (postgres, vmstat and
all) stopped for about 10 seconds, then it went on.

This was only a short version of the problem (it can lock up for 20-30
seconds), but I think it's the same thing.

    Stephen


Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Tue, 2002-05-28 at 23:29, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > I've already strace'ed the idle backend, and I can see the SIGUSR2 being
> > delivered just before everything goes bad.
>
> Yes, but what happens after that?

The strace stops until I manually kill the connecting process - the
machine stops in general until then (vmstat 1 stops producing output,
shells stop responding ...). So who knows what happens :(

>
> If you don't see anything obvious by examining a single process, maybe
> strace'ing the postmaster + all descendant processes would offer a
> better viewpoint.
>
> > What resource would you think idle backends might be exhausting?
>
> Difficult to say.  I suspect your normal load doesn't have *all* the
> backends trying to run queries at once.  But in any case the SIGUSR2
> event should only produce a momentary spike in load, AFAICS.
>
>             regards, tom lane

I agree (about not having 800 simultaneous queries). Sometimes, the
SIGUSR2 does just create a very brief load spike (vmstat shows >500
processes on the run queue, but the next second everything is back to
normal and no unusual amount of CPU is consumed).

This sort of rules out (to me) a kernel problem, unless it's something
triggered at a specific number of processes (like 700 is bad, 699 is
OK).

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> I've already strace'ed the idle backend, and I can see the SIGUSR2 being
> delivered just before everything goes bad.

>> Yes, but what happens after that?

> The strace stops until I manually kill the connecting process - the
> machine stops in general until then (vmstat 1 stops producing output,
> shells stop responding ...). So who knows what happens :(

Hmm, I hadn't quite understood that you were complaining of a
system-wide lockup and not just Postgres getting wedged.  I think the
chances are very good that this *is* a kernel bug.  In any case, no
self-respecting kernel hacker would be happy with the notion that
a completely unprivileged user program can lock up the whole machine.
So even if Postgres has got a problem, the kernel is clearly failing
to defend itself adequately.

Are you able to reproduce the problem with fewer than 800 backends?
How about if you try it on a smaller machine?

Another thing that would be entertaining to try is other ways of
releasing 800 queries at once.  For example, on connection 1 do
    BEGIN; LOCK TABLE foo;
then issue a "SELECT COUNT(*) FROM foo" on each other connection,
and finally COMMIT on connection 1.  If that creates similar misbehavior
then I think the SI-overrun mechanism is probably not to be blamed.
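
A sketch of how such a test could be driven from libpq follows (the table
name, connection count and database name are placeholders, and the table
has to exist already; PQsendQuery is used so the driver doesn't block
along with the waiting backends):

/* herd.c -- sketch of the "release many queries at once" test above. */
#include <stdio.h>
#include <unistd.h>
#include "libpq-fe.h"

#define NCONN 50                        /* the thread uses 800 */

int
main(void)
{
    PGconn     *locker;
    static PGconn *conns[NCONN];
    PGresult   *res;
    int         i;

    /* connection 1 takes a lock the others will queue up behind */
    locker = PQconnectdb("dbname=template1");
    if (PQstatus(locker) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(locker));
        return 1;
    }
    PQclear(PQexec(locker, "BEGIN"));
    PQclear(PQexec(locker, "LOCK TABLE foo"));

    /* every other connection queues a query that will block on that lock */
    for (i = 0; i < NCONN; i++)
    {
        conns[i] = PQconnectdb("dbname=template1");
        if (PQstatus(conns[i]) != CONNECTION_OK)
        {
            fprintf(stderr, "connection %d: %s", i, PQerrorMessage(conns[i]));
            return 1;
        }
        PQsendQuery(conns[i], "SELECT count(*) FROM foo");
    }

    sleep(2);                            /* let everyone pile up on the lock */
    PQclear(PQexec(locker, "COMMIT"));   /* release all the waiters at once */

    /* drain the results so the backends finish cleanly */
    for (i = 0; i < NCONN; i++)
    {
        while ((res = PQgetResult(conns[i])) != NULL)
            PQclear(res);
        PQfinish(conns[i]);
    }
    PQfinish(locker);
    return 0;
}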

> ... Sometimes, the
> SIGUSR2 does just create a very brief load spike (vmstat shows >500
> processes on the run queue, but the next second everything is back to
> normal and no unusual amount of CPU is consumed).

That's the behavior I'd expect.  We need to figure out what's different
between that case and the cases where it locks up.

            regards, tom lane

Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Wed, 2002-05-29 at 09:08, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > I've already strace'ed the idle backend, and I can see the SIGUSR2 being
> > delivered just before everything goes bad.
>
> >> Yes, but what happens after that?
>
> > The strace stops until I manually kill the connecting process - the
> > machine stops in general until then (vmstat 1 stops producing output,
> > shells stop responding ...). So who knows what happens :(
>
> Hmm, I hadn't quite understood that you were complaining of a
> system-wide lockup and not just Postgres getting wedged.  I think the
> chances are very good that this *is* a kernel bug.  In any case, no
> self-respecting kernel hacker would be happy with the notion that
> a completely unprivileged user program can lock up the whole machine.
> So even if Postgres has got a problem, the kernel is clearly failing
> to defend itself adequately.
>
> Are you able to reproduce the problem with fewer than 800 backends?
> How about if you try it on a smaller machine?

Yep, on a PIII-800 with 256MB I can do it with fewer backends (I forget
how many) and only a few vacuums. It's much easier, basically, but
there's much less CPU on that machine. It also locks the machine up for
several minutes...

> Another thing that would be entertaining to try is other ways of
> releasing 800 queries at once.  For example, on connection 1 do
>     BEGIN; LOCK TABLE foo;
> then issue a "SELECT COUNT(*) FROM foo" on each other connection,
> and finally COMMIT on connection 1.  If that creates similar misbehavior
> then I think the SI-overrun mechanism is probably not to be blamed.
>
> > ... Sometimes, the
> > SIGUSR2 does just create a very brief load spike (vmstat shows >500
> > processes on the run queue, but the next second everything is back to
> > normal and no unusual amount of CPU is consumed).
>
> That's the behavior I'd expect.  We need to figure out what's different
> between that case and the cases where it locks up.
>
>             regards, tom lane

Yeah. I'll try your suggestion above and report back.

    Stephen


Re: How to cripple a postgres server

From
Stephen Robert Norris
Date:
On Wed, 2002-05-29 at 13:55, Tom Lane wrote:
> Stephen Robert Norris <srn@commsecure.com.au> writes:
> > Not at all like the vacuum problem. Do you have any other tests I can
> > run? Other diagnostics that might help?
>
> This is a long shot, but ... does the behavior change if you remove the
> two lines
>
>     if (signo != SIGALRM)
>         act.sa_flags |= SA_RESTART;
>
> near the bottom of src/backend/libpq/pqsignal.c ?  I am wondering if
> Linux gets unhappy if we try to do much in a signal handler.  This
> looks to be the simplest alteration that might address such a problem.
>
>             regards, tom lane

Maybe it does help. I've only seen the problem once now (just after
startup) rather than every few minutes... I'm still seeing lots of
processes in the ready queue every now and then, but they are doing more
sensible things.

I'll investigate some more and be more certain.

    Stephen


Re: How to cripple a postgres server

From
Tom Lane
Date:
Stephen Robert Norris <srn@commsecure.com.au> writes:
> Not at all like the vacuum problem. Do you have any other tests I can
> run? Other diagnostics that might help?

This is a long shot, but ... does the behavior change if you remove the
two lines

    if (signo != SIGALRM)
        act.sa_flags |= SA_RESTART;

near the bottom of src/backend/libpq/pqsignal.c ?  I am wondering if
Linux gets unhappy if we try to do much in a signal handler.  This
looks to be the simplest alteration that might address such a problem.
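
For reference, the wrapper in question looks roughly like this
(paraphrased, so details may differ between versions); the experiment is
simply to drop the SA_RESTART branch so that interrupted kernel calls
return EINTR instead of being transparently restarted:

/* paraphrase of the pqsignal() wrapper; exact code varies by version */
#include <signal.h>

typedef void (*pqsigfunc) (int);

pqsigfunc
pqsignal(int signo, pqsigfunc func)
{
    struct sigaction act,
                oact;

    act.sa_handler = func;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (signo != SIGALRM)               /* <-- the two lines to remove */
        act.sa_flags |= SA_RESTART;
    if (sigaction(signo, &act, &oact) < 0)
        return SIG_ERR;
    return oact.sa_handler;
}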

            regards, tom lane
