Thread: ERROR: WaitOnLock: error on wakeup - Aborting this transaction

ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Tatsuo Ishii
Date:
I get the above message from the backend while trying to update the same row
from different transactions (I guess). Is this normal?

FYI, if I change the transaction isolation level to serializable, no
error occurs.
---
Tatsuo Ishii


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Vadim Mikheev
Date:
Tatsuo Ishii wrote:
> 
> I get the above message from the backend while trying to update the same row
> from different transactions (I guess). Is this normal?

1=>begin;
2=>begin;
1=>update t set a = 1 where c = 1;
2=>update t set a = 1 where c = 2;
1=>update t set a = 2 where c = 2; -- blocked by 2
2=>update t set a = 2 where c = 1; --> deadlock
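
A minimal sketch of why this interleaving ends in a deadlock, assuming each uncommitted update holds its row until the transaction ends and a blocked session waits for exactly one other session. The names `find_cycle`, `update`, `row_owner` and `waits_for` are hypothetical and are not backend functions; this models only the wait-for relationships, not the real lock manager.

```python
from typing import Dict, List, Optional

def find_cycle(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the sessions forming a wait cycle, or None if there is none.

    waits_for maps a blocked session to the session it is waiting on.
    """
    for start in waits_for:
        path = [start]
        current = start
        while current in waits_for:
            current = waits_for[current]
            if current == start:
                return path
            if current in path:
                break  # a cycle exists, but it does not include `start`
            path.append(current)
    return None

# Hypothetical bookkeeping: which session holds the uncommitted update
# for each value of c, and who is waiting for whom.
row_owner: Dict[int, str] = {}
waits_for: Dict[str, str] = {}

def update(session: str, c: int) -> str:
    """Model 'update t set ... where c = <c>' issued by `session`."""
    owner = row_owner.get(c)
    if owner is None or owner == session:
        row_owner[c] = session
        return "ok"
    waits_for[session] = owner
    return "deadlock" if find_cycle(waits_for) else "blocked"

# The four updates from the example above, in order (session, c).
steps = [("1", 1), ("2", 2), ("1", 2), ("2", 1)]
results = [update(session, c) for session, c in steps]
print(results)
```

The third update only blocks, because session 2 is not waiting for anything yet; the fourth closes the cycle 1 -> 2 -> 1, which is the point where a lock manager has to abort one of the transactions.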


Or didn't you use BEGIN/END?

Vadim


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Tatsuo Ishii
Date:
> > I get the above message from the backend while trying to update the same row
> > from different transactions (I guess). Is this normal?
> 
> 1=>begin;
> 2=>begin;
> 1=>update t set a = 1 where c = 1;
> 2=>update t set a = 1 where c = 2;
> 1=>update t set a = 2 where c = 2; -- blocked by 2
> 2=>update t set a = 2 where c = 1; --> deadlock

My sessions look like:

begin;
update t set a = 1 where c = 1;
select * from t where c = 1;
end;

So I think there is no possibility of a deadlock. Note that the error
happens with a relatively large number of concurrent transactions
running. I don't see the error with 1~32 transactions, while I get
errors at 63 (I didn't try 33~62). In each session, the row to be
updated is chosen by a random generator, so increasing the number of
transactions might also increase the chance of conflicts, or 63 might
hit some threshold of certain resources, I don't know. The interesting
thing is that the error never happens if I set the transaction isolation
level to "serializable."
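
A rough way to gauge the "more transactions, more conflicts" hypothesis, assuming every session picks its row uniformly at random and all sessions are active at the same moment. The table size is not given in this thread, so the 1000 rows below are purely hypothetical, as is the function name `collision_probability`.

```python
def collision_probability(sessions: int, rows: int) -> float:
    """Probability that at least two of `sessions` concurrent sessions
    pick the same row out of `rows`, with uniform random choice."""
    if sessions > rows:
        return 1.0  # pigeonhole: some row must be picked twice
    p_all_distinct = 1.0
    for i in range(sessions):
        p_all_distinct *= (rows - i) / rows
    return 1.0 - p_all_distinct

# Hypothetical table size; the concurrency levels are the ones tested above.
for n in (1, 32, 63):
    print(n, round(collision_probability(n, rows=1000), 3))
```

This only estimates how often two sessions touch the same row. With the single-update sessions shown above, such a conflict by itself should just make one session wait, so it does not explain the error on its own.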

If I have time, I will run more test cases.
---
Tatsuo Ishii


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Vadim Mikheev
Date:
Tatsuo Ishii wrote:
> 
> > > I get the above message from the backend while trying to update the same row
> > > from different transactions (I guess). Is this normal?
> 
> My sessions look like:
> 
> begin;
> update t set a = 1 where c = 1;
> select * from t where c = 1;
> end;

Oops. Do you have indices over table t?
Btrees still use page-level locking and don't release
locks when leaving an index page to fetch a row from the relation.
It seems that this causes deadlocks more often than I thought -:(
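
A toy model of how holding an index page while fetching the row could close a cycle. The interleaving is an assumption, not something stated in this thread: session A has updated a row and then needs the index page, while session B holds that page and waits for the row. Lock modes are simplified to exclusive-only, and all names (`acquire`, `in_cycle`, the resource strings) are hypothetical.

```python
from typing import Dict

holders: Dict[str, str] = {}  # resource -> session currently holding it
waiting: Dict[str, str] = {}  # session -> resource it is waiting for

def in_cycle(start: str) -> bool:
    """Follow 'waits for the holder of' edges; True if they lead back to start."""
    seen = set()
    current = start
    while current in waiting:
        current = holders[waiting[current]]
        if current == start:
            return True
        if current in seen:
            return False
        seen.add(current)
    return False

def acquire(session: str, resource: str) -> str:
    """Try to take `resource`; nothing is released in this sketch."""
    holder = holders.get(resource)
    if holder is None or holder == session:
        holders[resource] = session
        return "granted"
    waiting[session] = resource
    return "deadlock" if in_cycle(session) else "waits"

trace = [
    acquire("A", "row c=1"),       # A updates the row, not yet committed
    acquire("B", "index page P"),  # B's index scan locks the page for c=1
    acquire("B", "row c=1"),       # B keeps the page lock while fetching the row
    acquire("A", "index page P"),  # A now needs the same index page
]
print(trace)
```

If B released the page lock before going to the relation, the last step would be granted and no cycle would form.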

Marc? I can fix this today and I'll be very careful...
Ok?

Vadim


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Tatsuo Ishii
Date:
>> My sessions look like:
>> 
>> begin;
>> update t set a = 1 where c = 1;
>> select * from t where c = 1;
>> end;
>
>Oops. Do you have indices over table t?

Yes. It has a primary key, so it has a btree index.

>Btrees still use page-level locking and don't release
>locks when leaving an index page to fetch a row from the relation.
>It seems that this causes deadlocks more often than I thought -:(
>
>Marc? I can fix this today and I'll be very careful...
>Ok?

Please let me know if you fix it. I will run the test again.
---
Tatsuo Ishii




Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Bruce Momjian
Date:
> Tatsuo Ishii wrote:
> > 
> > > > I get the above message from the backend while trying to update the same row
> > > > from different transactions (I guess). Is this normal?
> > 
> > My sessions look like:
> > 
> > begin;
> > update t set a = 1 where c = 1;
> > select * from t where c = 1;
> > end;
> 
> Oops. Do you have indices over table t?
> Btrees still use page-level locking and don't release
> locks when leaving an index page to fetch a row from the relation.
> It seems that this causes deadlocks more often than I thought -:(
> 
> Marc? I can fix this today and I'll be very careful...
> Ok?


If you don't, seems like our MVCC isn't going to be much good.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Vadim Mikheev
Date:
Tatsuo Ishii wrote:
> 
> >Btrees still use page-level locking and don't release
> >locks when leaving an index page to fetch a row from the relation.
> >It seems that this causes deadlocks more often than I thought -:(
> >
> >Marc? I can fix this today and I'll be very careful...
> >Ok?
> 
> Please let me know if you fix it. I will run the test again.

Fixed.

Vadim


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Tatsuo Ishii
Date:
>> >Btrees still use page-level locking and don't release
>> >locks when leaving an index page to fetch a row from the relation.
>> >It seems that this causes deadlocks more often than I thought -:(
>> >
>> >Marc? I can fix this today and I'll be very careful...
>> >Ok?
>> 
>> Please let me know if you fix it. I will run the test again.
>
>Fixed.

Thanks, but I now have another problem: the backend aborts. The stack
trace shows the following (sorry, I'm writing this by hand rather than
cut & paste, so there may be errors):

s_lock_stuck
s_lock
SpinAcquire
LockAcquire
LockRelation
heap_beginscan
index_info
find_secondary_index
find_relation_indices
set_base_rel_pathlist
make_one_rel
subPlanner
query_planner
union_planner
planner
pg_parse_and_plan
pg_exec_query_dest
pg_exec_quer
:
:

Note that this happened at both the read committed and serializable levels.
--
Tatsuo Ishii


Re: [HACKERS] ERROR: WaitOnLock: error on wakeup - Aborting this transaction

From
Vadim Mikheev
Date:
Tatsuo Ishii wrote:
> 
> Thanks, but I now have another problem: the backend aborts. The stack
> trace shows the following (sorry, I'm writing this by hand rather than
> cut & paste, so there may be errors):
> 
> s_lock_stuck
> s_lock
> SpinAcquire
> LockAcquire
> LockRelation
> heap_beginscan

Try to re-compile with -DLOCK_MGR_DEBUG -DDEADLOCK_DEBUG
and run postmaster with -o -K 3 to see what's going on in
lmgr.

Vadim