Discussion: possible concurrency bug or mistake in understanding read-committed behavior
Hello All,

I have recently been using sysbench with PostgreSQL 9.0 and 8.4.5, doing some tests on 8-core systems with SSDs. I seem to be hitting some problems with the read-write tests, and I am hoping to find out whether this is a possible concurrency bug or expected behavior.

Running sysbench with 1M rows and 80+ threads, using the options

--oltp-dist-type=special --oltp-table-size=1000000 --oltp-read-only=off
--oltp-test-mode=complex --max-requests=0 --max-time=300 --num-threads=80
--test=oltp

causes:

sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 80
Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
FATAL: query execution failed: 8386864
FATAL: database error, exiting...
Done.

with this error reported in pg_log:

ERROR: duplicate key value violates unique constraint "sbtest_pkey"

The way sysbench works is that within one transaction it deletes a row and then inserts the same row back, so by the transaction's own logic it should never hit this error: it just deleted the row. What is actually happening is a timing interaction with a transaction in some other process working on the same id. The error is pretty easy to reproduce. Here are some condensed logs for the two processes involved:

445373 tm:2010-11-16 16:06:55.526 UTC db:sbtest pid:14244 LOG: duration: 0.004 ms execute sbstmt13377649542683423: BEGIN
441773 tm:2010-11-16 16:06:55.527 UTC db:sbtest pid:14307 LOG: duration: 0.003 ms execute sbstmt899357721350501620: BEGIN
...other statements removed...
445440 tm:2010-11-16 16:06:55.829 UTC db:sbtest pid:14244 LOG: duration: 16.683 ms execute sbstmt610923974845906481: UPDATE sbtest set k=k+1 where id=$1
445441 tm:2010-11-16 16:06:55.829 UTC db:sbtest pid:14244 DETAIL: parameters: $1 = '500865'
445444 tm:2010-11-16 16:06:55.830 UTC db:sbtest pid:14244 LOG: duration: 0.058 ms execute sbstmt81255943842609644: DELETE from sbtest where id=$1
445445 tm:2010-11-16 16:06:55.830 UTC db:sbtest pid:14244 DETAIL: parameters: $1 = '500865'
441840 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14307 LOG: duration: 17.812 ms execute sbstmt375950798157093453: UPDATE sbtest set k=k+1 where id=$1
441841 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14307 DETAIL: parameters: $1 = '500865'
445448 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14244 LOG: duration: 0.084 ms execute sbstmt388529267384136315: INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')
445449 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14244 DETAIL: parameters: $1 = '500865'
445451 tm:2010-11-16 16:06:55.880 UTC db:sbtest pid:14244 LOG: duration: 48.647 ms execute sbstmt271576009267161387: COMMIT
441844 tm:2010-11-16 16:06:55.880 UTC db:sbtest pid:14307 LOG: duration: 48.889 ms execute sbstmt718364989767970105: DELETE from sbtest where id=$1
441845 tm:2010-11-16 16:06:55.880 UTC db:sbtest pid:14307 DETAIL: parameters: $1 = '500865'
445452 tm:2010-11-16 16:06:55.881 UTC db:sbtest pid:14244 LOG: duration: 0.021 ms bind sbstmt13377649542683423: BEGIN
445453 tm:2010-11-16 16:06:55.881 UTC db:sbtest pid:14244 LOG: duration: 0.005 ms execute sbstmt13377649542683423: BEGIN
441846 tm:2010-11-16 16:06:55.881 UTC db:sbtest pid:14307 LOG: duration: 0.036 ms bind sbstmt16151529281767205: INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')
441847 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 DETAIL: parameters: $1 = '500865'
441848 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 ERROR: duplicate key value violates unique constraint "sbtest_pkey"
441849 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 DETAIL: Key (id)=(500865) already exists.
441850 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 STATEMENT: INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')

The logs are condensed; please re-run sysbench to get the full logs for all 80+ processes if required.

Reducing the above, at READ COMMITTED isolation I get:

Session 1 (pid 14244)        Session 2 (pid 14307)
BEGIN;                       BEGIN;
DELETE row;
INSERT row;
COMMIT;
                             DELETE row;
                             INSERT row;   <- fails
                             COMMIT;

If I read this right, then since the COMMIT happens before the second DELETE, that DELETE should have deleted the row again and the INSERT should have succeeded. So I tried this manually with two psql sessions.

First, with the second session's DELETE issued after the first session's COMMIT:

Session 1: BEGIN;
Session 2: BEGIN;

Session 1: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
  xmin   | xmax |   id
---------+------+--------
 2566416 |    0 | 500815
(1 row)

Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
  xmin   | xmax |   id
---------+------+--------
 2576468 |    0 | 500815
(1 row)

Session 1: DELETE FROM sbtest WHERE id=500815;
DELETE 1

Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
  xmin   |  xmax   |   id
---------+---------+--------
 2576468 | 2576518 | 500815
(1 row)

Session 1: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
INSERT 0 1

Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
  xmin   |  xmax   |   id
---------+---------+--------
 2576468 | 2576518 | 500815
(1 row)

Session 1: END;
COMMIT

Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
  xmin   | xmax |   id
---------+------+--------
 2576518 |    0 | 500815
(1 row)

Session 2: DELETE FROM sbtest WHERE id=500815;
DELETE 1

Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
 xmin | xmax | id
------+------+----
(0 rows)

Session 2: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
INSERT 0 1

Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
  xmin   | xmax |   id
---------+------+--------
 2576519 |    0 | 500815
(1 row)

Session 2: END;
COMMIT

So this does not create a problem, since the second session's delete and insert happen after the first session committed.

Now let's do the same with the second session's DELETE issued before the first session's COMMIT, i.e. in between its DELETE/INSERT and COMMIT:

Session 1: BEGIN;
Session 2: BEGIN;

Session 1: DELETE FROM sbtest WHERE id=500815;
Session 1: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');

Session 2: DELETE FROM sbtest WHERE id=500815;   <- hangs, waiting on Session 1

Session 1: END;
COMMIT

Session 2: DELETE 0   <- completes without deleting anything and does not roll back the transaction

Session 2: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
ERROR: duplicate key value violates unique constraint "sbtest_pkey"

Session 2: END;
ROLLBACK

So the log report is probably hiding the point that the second DELETE started before the first transaction committed: the timestamps are the same, even though the statements appear in a different order in the logs.

The question is: when the row no longer exists, should that DELETE fail, or succeed with DELETE 0?

I also tried doing the DELETE before the INSERT in the first transaction, with the same result. So irrespective of the ordering within the first transaction, if the second DELETE starts before the first transaction commits, the answer is "ERROR: duplicate key". What is happening is that the DELETE targets the old row, and the INSERT then collides with the new row. I am not sure which behavior is right in READ COMMITTED.

The end result is that sysbench keeps failing at 64+ threads, and I am trying to figure out what to fix (sysbench or PostgreSQL) and why.

Thanks.
Regards,
Jignesh
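For anyone who wants to reproduce this without running sysbench, the race fits in a dozen statements across two psql sessions. The sbtest layout below is only an approximation of sysbench 0.4's table (the exact columns are an assumption; all that matters for the error is the primary key on id):

-- Approximate sysbench-style table; only the PRIMARY KEY on id is
-- essential to reproducing the duplicate-key error.
CREATE TABLE sbtest (
    id  integer PRIMARY KEY,
    k   integer NOT NULL DEFAULT 0,
    c   char(120) NOT NULL DEFAULT '',
    pad char(60) NOT NULL DEFAULT ''
);
INSERT INTO sbtest VALUES (500815, 0, ' ', 'seed row');

-- Session 1:
BEGIN;
DELETE FROM sbtest WHERE id = 500815;
INSERT INTO sbtest VALUES (500815, 0, ' ', 'seed row');

-- Session 2, before session 1 commits (blocks on session 1's pending delete):
BEGIN;
DELETE FROM sbtest WHERE id = 500815;

-- Session 1:
COMMIT;

-- Session 2 unblocks and reports DELETE 0: the replacement row committed
-- by session 1 is invisible to the DELETE's snapshot, but the unique
-- index still enforces against it, so the insert fails.
INSERT INTO sbtest VALUES (500815, 0, ' ', 'seed row');
-- ERROR: duplicate key value violates unique constraint "sbtest_pkey"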
Re: possible concurrency bug or mistake in understanding read-committed behavior
From: Jignesh Shah
Actually, cutting my mail down to something more readable. Let's consider two transactions:

Session 1: BEGIN;
Session 2: BEGIN;

Session 1: DELETE FROM sbtest WHERE id=500815;
Session 1: INSERT INTO sbtest VALUES (500815,0,'','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');

Session 2: DELETE FROM sbtest WHERE id=500815;   <- hangs/waits

Session 1: END;
COMMIT

Session 2: DELETE 0   <- returns success but deletes no rows, and does not roll back the transaction

Session 2: INSERT INTO sbtest VALUES (500815,0,'','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
ERROR: duplicate key value violates unique constraint "sbtest_pkey"

Session 2: END;
ROLLBACK

The question is: should the DELETE fail (and cause a rollback) when the row no longer exists, or succeed with DELETE 0?

I also tried doing the DELETE before the INSERT in the first transaction, with the same result. So irrespective of how it orders things, if the second DELETE starts before the first transaction commits, the answer is "ERROR: duplicate key". What is happening is that the DELETE targets the old row, and the INSERT then collides with the new row. I am not sure which behavior is right in READ COMMITTED.

What are the official READ COMMITTED semantics for this?

Thanks.
Regards,
Jignesh
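For what it's worth, the failure can be absorbed client-side. Below is a sketch of one workaround idea (not something sysbench actually does) that runs the delete/insert pair in a PL/pgSQL exception block: each BEGIN/EXCEPTION block executes in an implicit subtransaction, so the unique_violation does not abort the outer transaction, and at READ COMMITTED the retried statements take fresh snapshots that can see the row the concurrent transaction committed. The DO block requires 9.0 (on 8.4 the same body would have to live in a function):

-- Hypothetical retry wrapper for the sysbench delete/insert pair.
DO $$
DECLARE
    attempts int := 0;
BEGIN
    LOOP
        attempts := attempts + 1;
        BEGIN
            DELETE FROM sbtest WHERE id = 500815;
            INSERT INTO sbtest
                VALUES (500815, 0, '', 'aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
            EXIT;  -- both statements succeeded; leave the loop
        EXCEPTION WHEN unique_violation THEN
            -- A concurrent transaction re-inserted the row first; the
            -- implicit subtransaction is rolled back and we try again.
            IF attempts >= 10 THEN
                RAISE;  -- give up after too many lost races
            END IF;
        END;
    END LOOP;
END $$;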
Re: possible concurrency bug or mistake in understanding read-committed behavior
From: "Kevin Grittner"
Jignesh Shah <jkshah@gmail.com> wrote:
> The question is should the delete fail if it doesn't exist and
> cause a rollback or succeed with DELETE 0 ?

I think the existing behavior is consistent with both the standard and the other behaviors of PostgreSQL at the READ COMMITTED isolation level. I might have found it surprising at first glance, except that there was a recent discussion about why an INSERT in one transaction doesn't stand in the way of a concurrent UPDATE in another transaction at that level. (I don't want to exhibit surprise again so soon -- it might scare Bruce. ;-) )

This example does cause a serialization failure at the REPEATABLE READ level, as well as at the SERIALIZABLE level with the SSI patch, which all seems sane to me, too.

-Kevin
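The serialization failure mentioned above is easy to see in two psql sessions. A sketch against the same sbtest table as in the earlier messages (note that on 9.0, before the SSI patch, REPEATABLE READ and SERIALIZABLE both provide snapshot isolation, so either level shows it):

-- Session 2, this time at REPEATABLE READ:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT k FROM sbtest WHERE id = 500815;  -- pins session 2's snapshot

-- Session 1, meanwhile, at the default READ COMMITTED:
BEGIN;
DELETE FROM sbtest WHERE id = 500815;
INSERT INTO sbtest VALUES (500815, 0, ' ', 'seed row');
COMMIT;

-- Session 2: the row in its snapshot was deleted by a concurrent,
-- now-committed transaction, so instead of a silent DELETE 0 it gets
-- an error it can handle by retrying the whole transaction:
DELETE FROM sbtest WHERE id = 500815;
-- ERROR: could not serialize access due to concurrent update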
Re: possible concurrency bug or mistake in understanding read-committed behavior
From: "Kevin Grittner"
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > Jignesh Shah <jkshah@gmail.com> wrote: > >> The question is should the delete fail if it doesn't exist and >> cause a rollback or succeed with DELETE 0 ? > > I think existing behavior is consistent with both the standard and > the other behaviors of PostgreSQL at the READ COMMITTED isolation > level. I actually woke up in the middle of the night after sending this with the realization that I was wrong in that statement. Well, half wrong -- it *is* entirely consistent with the other behaviors of PostgreSQL at the READ COMMITTED isolation level, but it is *not* consistent with the standard. Since the row existed both before and after the first transaction; for the other transaction to see zero rows it has to violate atomicity, and see *part* of the work of the transaction but not *all* of it. This issue has been discussed on the list before, regarding other curious behaviors, and what came of it was that to prevent such behavior PostgreSQL would need to wrap each statement within a READ COMMITTED transaction with an implicit subtransaction and upon detecting that a transaction on which it was blocked due to a write had committed, it would need to roll back the subtransaction and acquire a new snapshot. Nobody seemed excited about either doing the work to make that happen or what the performance implications might be. The upshot was more or less that if you care about such anomalies you should be running at a more strict isolation level -- what you describe is not a problem at REPEATABLE READ or SERIALIZABLE isolation levels. -Kevin