Обсуждение: deadlock in single-row select-for-update + update scenario? How could it happen?

Поиск
Список
Период
Сортировка

deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
I have developer with pg 9.3.5, which is reporing something really strange.

He runs importer, which does, in single transaction:

begin;
select * from table where pkey = <value> limit 1 for update;
update table set ... where pkey = <the same_value>;
commit;

and two backends running the same transaction deadlock.

I checked for duplicated rows with the same pkey value - none are there. And frankly - I'm out of ideas.

What could be wrong in such case?

Detailed logs, with just some obfuscation: https://depesz.privatepaste.com/0594a93459

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Adrian Klaver
Дата:
On 08/22/2014 09:29 AM, hubert depesz lubaczewski wrote:
> I have developer with pg 9.3.5, which is reporing something really strange.
>
> He runs importer, which does, in single transaction:
>
> begin;
> select * from table where pkey = <value> limit 1 for update;
> update table set ... where pkey = <the same_value>;
> commit;
>
> and two backends running the same transaction deadlock.
>
> I checked for duplicated rows with the same pkey value - none are there.
> And frankly - I'm out of ideas.
>
> What could be wrong in such case?

So process 66017 and 66014 are blocking each because they are running
the exact same queries. The interesting part is the process with the
lower pid is starting later then the none with the higher pid.

So what exactly is 'importer' and what does it do?

Also what is this (59303)?

>
> Detailed logs, with just some obfuscation:
> https://depesz.privatepaste.com/0594a93459
>
> depesz


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Jeff Janes
Дата:
On Fri, Aug 22, 2014 at 9:29 AM, hubert depesz lubaczewski <depesz@gmail.com> wrote:
I have developer with pg 9.3.5, which is reporing something really strange.

He runs importer, which does, in single transaction:

begin;
select * from table where pkey = <value> limit 1 for update;
update table set ... where pkey = <the same_value>;
commit;

and two backends running the same transaction deadlock.

I checked for duplicated rows with the same pkey value - none are there. And frankly - I'm out of ideas.

What transaction isolation level is being used?

Cheers,

Jeff

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 6:45 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
So process 66017 and 66014 are blocking each because they are running the exact same queries. The interesting part is the process with the lower pid is starting later then the none with the higher pid.

Locking is obvious. But why deadlock? There is just single row, and it shouldn't be able to deadlock on it?!
 
So what exactly is 'importer' and what does it do?

Some software written by some guy. Runs lots of queries, but the only problem we have is with these transactions.
 
Also what is this (59303)?

log_line_prefix is  '%m %r %p %u %d ' so it's port number.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Adrian Klaver
Дата:
On 08/22/2014 10:15 AM, hubert depesz lubaczewski wrote:
> On Fri, Aug 22, 2014 at 6:45 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     So process 66017 and 66014 are blocking each because they are
>     running the exact same queries. The interesting part is the process
>     with the lower pid is starting later then the none with the higher pid.
>
>
> Locking is obvious. But why deadlock? There is just single row, and it
> shouldn't be able to deadlock on it?!

Well both queries are doing SELECT .. FOR UPDATE as well as UPDATE. From
what I see there are four queries contending for the same row.

>
>     So what exactly is 'importer' and what does it do?
>
>
> Some software written by some guy. Runs lots of queries, but the only
> problem we have is with these transactions.
>
>     Also what is this (59303)?
>
>
> log_line_prefix is  '%m %r %p %u %d ' so it's port number.

So why are different processes running the exact same queries coming in
on different ports?


>
> depesz


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
John R Pierce
Дата:
On 8/22/2014 9:29 AM, hubert depesz lubaczewski wrote:
> select * from table where pkey = <value> limit 1 for update;

why is there a limit 1 in there?    pkey=somevalue should only return a
single row.   if it DID return multiple rows, you don't have an ORDER
BY, so the limit 1 would be indeterminate.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 7:29 PM, John R Pierce <pierce@hogranch.com> wrote:
On 8/22/2014 9:29 AM, hubert depesz lubaczewski wrote:
select * from table where pkey = <value> limit 1 for update;
why is there a limit 1 in there?    pkey=somevalue should only return a single row.   if it DID return multiple rows, you don't have an ORDER BY, so the limit 1 would be indeterminate.

leftover from some other thing.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 7:20 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
So why are different processes running the exact same queries coming in on different ports?

the importer is parallelized, and sometimes two processes handle batches of data that happen to update the same "top level row".

but the deadlocking problem is happening only on one machine, though very repeatably.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Adrian Klaver
Дата:
On 08/22/2014 10:36 AM, hubert depesz lubaczewski wrote:
> On Fri, Aug 22, 2014 at 7:20 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     So why are different processes running the exact same queries coming
>     in on different ports?
>
>
> the importer is parallelized, and sometimes two processes handle batches
> of data that happen to update the same "top level row".
>
> but the deadlocking problem is happening only on one machine, though
> very repeatably.

Which begs the question, what is different about that machine?

>
> depesz


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@gmail.com> writes:
> On Fri, Aug 22, 2014 at 6:45 PM, Adrian Klaver <adrian.klaver@aklaver.com>
> wrote:
>> So process 66017 and 66014 are blocking each because they are running the
>> exact same queries. The interesting part is the process with the lower pid
>> is starting later then the none with the higher pid.

> Locking is obvious. But why deadlock? There is just single row, and it
> shouldn't be able to deadlock on it?!

You have not shown us the full sequence of events leading up to the
deadlock failure, but I hypothesize that there were yet other transactions
that updated that same row in the very recent past.  That might allow
there to be more than one tuple lock involved (ie, locks on different
versions of the row), which would create some scope for a deadlock
failure.

            regards, tom lane


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 7:43 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
Which begs the question, what is different about that machine?

No idea. I can pass all the question you might have, but I'm ~ 6000 miles away from any machine running this code.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 7:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
You have not shown us the full sequence of events leading up to the
deadlock failure, but I hypothesize that there were yet other transactions
that updated that same row in the very recent past.  That might allow
there to be more than one tuple lock involved (ie, locks on different
versions of the row), which would create some scope for a deadlock
failure.

Well, showing all events is difficult due to parallelization of importer, but shouldn't "select for update" solve the problem of other locks?

The transactions are exactly as shown - select for update and then update.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Adrian Klaver
Дата:
On 08/22/2014 10:50 AM, hubert depesz lubaczewski wrote:
> On Fri, Aug 22, 2014 at 7:43 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     Which begs the question, what is different about that machine?
>
>
> No idea. I can pass all the question you might have, but I'm ~ 6000
> miles away from any machine running this code.

Which in itself might be a clue.

Is all the code/data running on/coming from that machine or is some
coming in remotely?

Where network latency might be an issue?

>
> depesz


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 7:54 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
Which in itself might be a clue.

Is all the code/data running on/coming from that machine or is some coming in remotely?

Where network latency might be an issue?

All locally, but hey - how could network latency be a problem? Transaction gets the lock on row, and then it updates. the same row. in the same transaction. with nothing else in the transaction. where is here place for deadlock for another, identical transaction?

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Adrian Klaver
Дата:
On 08/22/2014 11:14 AM, hubert depesz lubaczewski wrote:
> On Fri, Aug 22, 2014 at 7:54 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     Which in itself might be a clue.
>
>     Is all the code/data running on/coming from that machine or is some
>     coming in remotely?
>
>     Where network latency might be an issue?
>
>
> All locally, but hey - how could network latency be a problem?
> Transaction gets the lock on row, and then it updates. the same row. in
> the same transaction. with nothing else in the transaction. where is
> here place for deadlock for another, identical transaction?

Not sure, just the combination of parallel operations and remote
connections seemed to be an avenue to explore. Given that everything is
local, turns out it was dead end.

Looking at the pastebin log again, am I reading it right that the first
process actually COMMITs properly?

Also is there a trigger in the mix that might be fouling things up?

>
> depesz


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Alvaro Herrera
Дата:
hubert depesz lubaczewski wrote:
> I have developer with pg 9.3.5, which is reporing something really strange.
>
> He runs importer, which does, in single transaction:
>
> begin;
> select * from table where pkey = <value> limit 1 for update;
> update table set ... where pkey = <the same_value>;
> commit;
>
> and two backends running the same transaction deadlock.

FWIW this problem was reported also by Andrew Sackville-West at
http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230

I strongly suspect now that the problem is related to the locking of
updated versions as heap_lock_tuple_updated, and perhaps the internal
locking done by EvalPlanQual.  Haven't traced through it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 8:33 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
Not sure, just the combination of parallel operations and remote connections seemed to be an avenue to explore. Given that everything is local, turns out it was dead end.
Looking at the pastebin log again, am I reading it right that the first process actually COMMITs properly?
Also is there a trigger in the mix that might be fouling things up?

Please note that the pastebin log is split by backend pid, and only in backend-pid groups sorted by timestamp.

66014 started transaction later, and committed, while 66017, which started transaction earlier, and actually obtained lock earlier - got killed by deadlock resolution.

There are no triggers aside from some (~10) fkeys.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 9:21 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
FWIW this problem was reported also by Andrew Sackville-West at
http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230
I strongly suspect now that the problem is related to the locking of
updated versions as heap_lock_tuple_updated, and perhaps the internal
locking done by EvalPlanQual.  Haven't traced through it.

Is there anything I could tell the developer to do (he's on Mac) so he could provide more information?

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
hubert depesz lubaczewski
Дата:
On Fri, Aug 22, 2014 at 6:55 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
What transaction isolation level is being used?

Sorry for late reply - the user was away for parts of friday, I was away on weekend, and just now got answer - it's read committed.

depesz

Re: deadlock in single-row select-for-update + update scenario? How could it happen?

От
Adrian Klaver
Дата:
On 08/25/2014 04:18 AM, hubert depesz lubaczewski wrote:
> On Fri, Aug 22, 2014 at 9:21 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com <mailto:alvherre@2ndquadrant.com>> wrote:
>
>     FWIW this problem was reported also by Andrew Sackville-West at
>     http://www.postgresql.org/message-id/20140731233051.GN17765@andrew-ThinkPad-X230
>     I strongly suspect now that the problem is related to the locking of
>     updated versions as heap_lock_tuple_updated, and perhaps the internal
>     locking done by EvalPlanQual.  Haven't traced through it.
>
>
> Is there anything I could tell the developer to do (he's on Mac) so he
> could provide more information?

I would say to confirm the bug report Alvaro mentioned it would be good
to try the importer script against non-9.3 instances of Postgres to see
if the same thing happens. Also interesting to note that in the bug
report thread mention is made of a large number of FKs on a table.

>
> depesz


--
Adrian Klaver
adrian.klaver@aklaver.com