Discussion: Lock pileup stuck processes

Lock pileup stuck processes

From: Josh Berkus
Date:
Folks,

This is a "hard to reproduce" bug, so I'm submitting it to this list to
accumulate evidence for eventual debugging once there are enough reports
to figure something out.  Since I've now seen this in two different user
applications, I think it stems from some kind of persistent issue either
in Postgres or in the OS.

Summary: in some cases, "lock pileups" fail to resolve completely, and
one or more orphan backends are left in permanent lock-waiting state.

Versions observed: 9.2.14, 9.2.15, 9.3.5

Platforms: RHEL6, Fedora

Observations:

1. A long-running transaction grabs one or more row locks.

2. Various queries, especially SELECT FOR UPDATE queries, pile up behind
this lock (a minimal SQL sketch of the pattern follows this list).

3. At peak, 30 or more backends are waiting for locks in a dependency
chain.  System load is high.

4. Original transaction ends.

5. Over the next 10 minutes, most of the waiting backends complete their
work and release their locks.

6. One to three backends never come out of the active/waiting state and
remain that way indefinitely.
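
A minimal synthetic version of this pattern (a sketch only, against a
hypothetical table queue_jobs with a row id = 1; the real applications'
workloads are of course more involved) looks roughly like:

    -- session 1: long-running transaction grabs a row lock
    BEGIN;
    SELECT * FROM queue_jobs WHERE id = 1 FOR UPDATE;
    -- transaction stays open for a long time

    -- sessions 2..N: each piles up behind the row lock
    BEGIN;
    SELECT * FROM queue_jobs WHERE id = 1 FOR UPDATE;  -- blocks here
    -- normally these all proceed once session 1 commits or rolls back;
    -- in the cases above, one to three of them never do
    COMMIT;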

My attempts to reproduce this issue under synthetic circumstances have
not been successful.  strace of the stuck backends shows no activity.
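
For reference, a query along these lines (a rough sketch against
pg_locks and pg_stat_activity as they exist in 9.2/9.3) shows which pid
each ungranted lock request is queued behind while the pileup is in
progress:

    SELECT w.pid            AS waiting_pid,
           wa.query         AS waiting_query,
           b.pid            AS blocking_pid,
           ba.query         AS blocking_query
      FROM pg_locks w
      JOIN pg_stat_activity wa ON wa.pid = w.pid
      JOIN pg_locks b
        ON b.granted
       AND b.pid <> w.pid
       AND b.locktype = w.locktype
       AND b.database      IS NOT DISTINCT FROM w.database
       AND b.relation      IS NOT DISTINCT FROM w.relation
       AND b.page          IS NOT DISTINCT FROM w.page
       AND b.tuple         IS NOT DISTINCT FROM w.tuple
       AND b.transactionid IS NOT DISTINCT FROM w.transactionid
      JOIN pg_stat_activity ba ON ba.pid = b.pid
     WHERE NOT w.granted;

Whether the stuck backends still show an ungranted entry in pg_locks at
all after the pileup clears would be interesting to know.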

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)

Re: Lock pileup stuck processes

From: Tom Lane
Date:
Josh berkus <josh@agliodbs.com> writes:
> Summary: in some cases, "lock pileups" fail to resolve completely, and
> one or more orphan backends are left in permanent lock-waiting state.
> ...
> My attempts to reproduce this issue under synthetic circumstances have
> not been successful.  strace of the stuck backends shows no activity.

Please see if you can get a stack trace from a stuck backend.
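
Finding the pid to attach to should just be a matter of something like
this (9.2/9.3 era, where pg_stat_activity still has the boolean
"waiting" column):

    SELECT pid, state, waiting, now() - query_start AS stuck_for, query
      FROM pg_stat_activity
     WHERE waiting;

and then "gdb -p <pid>" followed by "bt" at the gdb prompt should give
us the stack.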

            regards, tom lane