Обсуждение: FOR SHARE LOCK clause ?
Ok, in multi-version systems readers never lock selected rows and so never block writers. Nice but cause problems in some cases: if you want to implement referential integrity at the application level then you'll have to use LOCK TABLE IN SHARE MODE or SELECT FOR UPDATE to prevent updation of primary keys etc. Not so good... So, what about FOR SHARE LOCK clause in SELECTs? Like FOR UPDATE clause, FOR SHARE LOCK clause will lock selected rows, but in _share_ mode. This is some kind of read-locking on demand. Disadvantages: 1. FOR SHARE LOCK will use lock manager to lock rows, but lmgr can't handle "too many" locks. (note that UPDATE, DELETE& SELECT FOR UPDATE use only one entry in lmgr table for ALL updated/marked_for_update rows). 2. UPDATE/DELETE will go slower than now: server will have to exclusively lock (using lmgr) each row being updated, update/delete it and release lmgr' lock after that (note that currently lmgr locking is not used for rows). On the otherhand, SELECT FOR SHARE LOCK could mark locked rows in buffer (t_infomask |= HEAP_MARKED_FOR_SHARE) and so UPDATE/DELETE could acquire lmgr lock only if row is marked... Comments ? Vadim
> So, what about FOR SHARE LOCK clause in SELECTs? > Like FOR UPDATE clause, FOR SHARE LOCK clause > will lock selected rows, but in _share_ mode. > This is some kind of read-locking on demand. > 1. FOR SHARE LOCK will use lock manager to > lock rows, but lmgr can't handle "too many" locks. > (note that UPDATE, DELETE & SELECT FOR UPDATE use only > one entry in lmgr table for ALL updated/marked_for_update > rows). > 2. UPDATE/DELETE will go slower than now: server will have to > exclusively lock (using lmgr) each row being updated, update/delete > it and release lmgr' lock after that (note that currently > lmgr locking is not used for rows). On the other hand, > SELECT FOR SHARE LOCK could mark locked rows in buffer > (t_infomask |= HEAP_MARKED_FOR_SHARE) and so UPDATE/DELETE > could acquire lmgr lock only if row is marked... If one did not specify FOR SHARE LOCK then the performance would be similar to the current performance? Then if you want to use the feature, you pay the performance penalty. Or do you think that the checking should/will happen irrespective of any query settings? Also, my commercial Ingres system would escalate row-level locks to page- and table-level locks when the number of row-level locks exceeded a (settable) threshold. That seems like a nice compromise between features and performance, but it can lead to deadlock during the lock escalation... - Tom
"Thomas G. Lockhart" wrote: > > > So, what about FOR SHARE LOCK clause in SELECTs? > > Like FOR UPDATE clause, FOR SHARE LOCK clause > > will lock selected rows, but in _share_ mode. > > This is some kind of read-locking on demand. > > 1. FOR SHARE LOCK will use lock manager to > > lock rows, but lmgr can't handle "too many" locks. > > (note that UPDATE, DELETE & SELECT FOR UPDATE use only > > one entry in lmgr table for ALL updated/marked_for_update > > rows). > > 2. UPDATE/DELETE will go slower than now: server will have to > > exclusively lock (using lmgr) each row being updated, update/delete > > it and release lmgr' lock after that (note that currently > > lmgr locking is not used for rows). On the other hand, > > SELECT FOR SHARE LOCK could mark locked rows in buffer > > (t_infomask |= HEAP_MARKED_FOR_SHARE) and so UPDATE/DELETE > > could acquire lmgr lock only if row is marked... > > If one did not specify FOR SHARE LOCK then the performance would be > similar to the current performance? Then if you want to use the feature, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Yes, if FOR SHARE LOCK will mark locked row in buffer (and so - force buffer write)... > you pay the performance penalty. Or do you think that the checking > should/will happen irrespective of any query settings? > > Also, my commercial Ingres system would escalate row-level locks to > page- and table-level locks when the number of row-level locks exceeded > a (settable) threshold. That seems like a nice compromise between > features and performance, but it can lead to deadlock during the lock > escalation... As you can see in old lock.c postgres supposed to do row --> table lock escalation if #rows locked in table equal 10 - very simple and bad -:) I wouldn't like to care about # of SHARE LOCKed rows, at least in 6.5. Actually, this number shouldn't be too big. In the case of referential integrity, FOR SHARE LOCK should be used only on primary table and only if user inserts/updates foreign table, but primary key is unique... Vadim
> Ok, in multi-version systems readers never lock > selected rows and so never block writers. Nice but > cause problems in some cases: if you want > to implement referential integrity at the > application level then you'll have to use > LOCK TABLE IN SHARE MODE or SELECT FOR UPDATE to > prevent updation of primary keys etc. Not so good... I was wondering if there is another perspective to look upon this problem. Assertion: The primary difference between "primary key" and "unique key" is that primary keys are _never_ updated (during transaction processing) where unique keys may be. Question: It seems that the techinical solution here may not be better locking, but rather a mechinism to prevent updates on primary keys unless the entire table is locked, or some other dramatic gesture for non-transaction processing. This leaves the issue of delete looming, so mabye the suggestion won't help. Clark
Let me see if I understand what MVCC(Multi-level concurrency control is). I looked in "Transaction Processing: Concepts and Techniques", and saw some mention on pages 435-437, but not much more. In MVCC, SELECT's do not share lock the table, allowing UPDATE's during the SELECT. This is done by having the SELECT sequential scan look at rows that are committed with transaction ids less than their own, or superseded rows that have a superseded id greater than their own. The only lock a SELECT does it to prevent a vacuum during its table scan. My assumption is table writes still require an exclusive lock, but because SELECT does not need a lock, both can occur at the same time. (Sounds like my deadlock and lock queue code may need tweaking.) Your stated problem is that someone in a transaction doing a SELECT is not getting a shared lock on the rows he is selecting, so they could change while inside the transaction. This is a valid concern. Usually, doing the SELECT FOR UPDATE, even though you are not going to update the table is used. You are suggesting SELECT FOR SHARE LOCK, but because SELECT's don't need a lock anymore, isn't that the same as a FOR UPDATE in an MVCC system? Is the problem that SHARE LOCK does not modify the tuple, so it is harder to lock the rows? > As you can see in old lock.c postgres supposed to do > row --> table lock escalation if #rows locked in > table equal 10 - very simple and bad -:) > > I wouldn't like to care about # of SHARE LOCKed rows, > at least in 6.5. Actually, this number shouldn't be > too big. In the case of referential integrity, FOR SHARE LOCK > should be used only on primary table and only if user > inserts/updates foreign table, but primary key is unique... I think lock escalation is nice. Locking every row makes for lock resource problems. I would recommend locking a single row, and if a second row needs to be locked, just escalate to lock the whole table... if that can be done. This would seem to be the most reasonable and easiest to do. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On Tue, 5 Jan 1999, Bruce Momjian wrote: > Let me see if I understand what MVCC(Multi-level concurrency control > is). I looked in "Transaction Processing: Concepts and Techniques", > and saw some mention on pages 435-437, but not much more. > > In MVCC, SELECT's do not share lock the table, allowing UPDATE's during > the SELECT. This is done by having the SELECT sequential scan look at > rows that are committed with transaction ids less than their own, or > superseded rows that have a superseded id greater than their own. The > only lock a SELECT does it to prevent a vacuum during its table scan. > > My assumption is table writes still require an exclusive lock, but > because SELECT does not need a lock, both can occur at the same time. > (Sounds like my deadlock and lock queue code may need tweaking.) > > Your stated problem is that someone in a transaction doing a SELECT is > not getting a shared lock on the rows he is selecting, so they could > change while inside the transaction. This is a valid concern. > > Usually, doing the SELECT FOR UPDATE, even though you are not going to > update the table is used. You are suggesting SELECT FOR SHARE LOCK, but > because SELECT's don't need a lock anymore, isn't that the same as a FOR > UPDATE in an MVCC system? Is the problem that SHARE LOCK does not > modify the tuple, so it is harder to lock the rows? > > > > As you can see in old lock.c postgres supposed to do > > row --> table lock escalation if #rows locked in > > table equal 10 - very simple and bad -:) > > > > I wouldn't like to care about # of SHARE LOCKed rows, > > at least in 6.5. Actually, this number shouldn't be > > too big. In the case of referential integrity, FOR SHARE LOCK > > should be used only on primary table and only if user > > inserts/updates foreign table, but primary key is unique... > > I think lock escalation is nice. Locking every row makes for lock > resource problems. I would recommend locking a single row, and if a > second row needs to be locked, just escalate to lock the whole table... > if that can be done. This would seem to be the most reasonable and > easiest to do. Making two assumption here...first is that your explanation of MVCC is correct, second one being that my understanding of your explannation is correct... If you are going to set the 'table lock' at 2...why not just do the table lock period? From what youexplain above, a table lock won't affect a read, only other writes...? Marc G. Fournier Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
> > I think lock escalation is nice. Locking every row makes for lock > > resource problems. I would recommend locking a single row, and if a > > second row needs to be locked, just escalate to lock the whole table... > > if that can be done. This would seem to be the most reasonable and > > easiest to do. > > Making two assumption here...first is that your explanation of MVCC is > correct, second one being that my understanding of your explannation is > correct... > > If you are going to set the 'table lock' at 2...why not just do the table > lock period? From what youexplain above, a table lock won't affect a > read, only other writes...? Good point. I am assuming he is doing some kind of row-level locking for shared and write locks. I can only guess this from his statement that shared locking of every row would be a problem. Sounds like my explaination may be wrong, because it is saying he has some kind of row-locking going, perhaps for writes. Maybe he is using the fact that if a writer is going to update a row that has a superceeded transaction id that is marked 'in progress' the writer has to wait for the transaction to finish. If you do this, muliple writers can update at the same time, making MVCC better than row-level locking systems. Readers don't block writers, and multiple writers can write as long as they are not touching the same rows. In this scenario, shared locks are tricky, because the above system does not work. You have to do some explicit locking, because reading does not set anything on the row. Vadim is sleeping now, so I assume we will hear something from him this evening. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> Let me see if I understand what MVCC(Multi-level concurrency control > is). I looked in "Transaction Processing: Concepts and Techniques", > and saw some mention on pages 435-437, but not much more. > > In MVCC, SELECT's do not share lock the table, allowing UPDATE's during > the SELECT. This is done by having the SELECT sequential scan look at > rows that are committed with transaction ids less than their own, or > superseded rows that have a superseded id greater than their own. The > only lock a SELECT does it to prevent a vacuum during its table scan. > > My assumption is table writes still require an exclusive lock, but > because SELECT does not need a lock, both can occur at the same time. > (Sounds like my deadlock and lock queue code may need tweaking.) > > Your stated problem is that someone in a transaction doing a SELECT is > not getting a shared lock on the rows he is selecting, so they could > change while inside the transaction. This is a valid concern. > > Usually, doing the SELECT FOR UPDATE, even though you are not going to > update the table is used. You are suggesting SELECT FOR SHARE LOCK, but > because SELECT's don't need a lock anymore, isn't that the same as a FOR > UPDATE in an MVCC system? Is the problem that SHARE LOCK does not > modify the tuple, so it is harder to lock the rows? I hate to reply to my own posting, but I must. I have just read the Date book, Introduction to Database Systems about MVCC, and it confirms my above posting. Date states the MVCC advantages: * Reads are never delayed(in particular, they are not delayed by any concurrent long transaction) * Reads never delay updates(in particular, they do not delay any concurrent long transaction) * It is never necessary to roll back a read-only transaction * Deadlock is possible only between update transactions This is an amazing leap forward. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
OK, I have retrieved Vadim's original proposal for LLL from July. (No wonder I didn't remember it.) He basically describes the MVCC system at the bottom, and at the top, I am discussing a way to avoid a mapping of SCN's to XID's. There are other postings that finally resolved the issue. At least Vadim didn't say I was way off, so somehow he resolved it. This is an great benefit. We don't have to apologize about table-level locking anymore. I didn't think it could be done, and Vadim's changes kind of came as a surprise to me, probably because I didn't understand the ramifications when discussing it in July. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 Re: [HACKERS] proposals for LLL, part 1 ------------------------------------------------------------------------ * From: Bruce Momjian <maillist@candle.pha.pa.us> * To: vadim@krs.ru (Vadim Mikheev) * Subject: Re: [HACKERS] proposals for LLL, part 1 * Date: Thu, 16 Jul 1998 12:14:51 -0400 (EDT) ------------------------------------------------------------------------ I am retaining your entire message here for reference. I have a good solution for this. It will require only 4k of shared memory, and will have no restrictions on the age or number of transactions. First, I think we only want to implement "read committed isolation level", not serialized. Not sure why someone would want serialized. OK, when a backend is looking at a row that has been committed, it must decide if the row was committed before or after my transaction started. If the transaction commit id(xmin) is greater than our current xid, we know we should not look at it because it is for a transaction that started after our own transaction. The problem is for transactions started before our own (have xmin's less than our own), and may have committed before or after our transaction. Here is my idea. We add a field to the shared memory Proc structure that can contain up to 32 transaction ids. When a transaction starts, we spin though all other open Proc structures, and record all currently-running transaction ids in our own Proc field used to store up to 32 transaction ids. While we do this, we remember the lowest of these open transaction ids. This is our snapshot of current transactions at the time our transaction starts. While analyzing a row, if it is greater than our transaction id, then the transaction was not even started before our transaction. If the xmin is lower than the min transaction id that we remembered from the Proc structures, it was committed before our transaction started. If it is greater than or equal to the min remembered transaction id, we must spin through our stored transaction ids. If it is in the stored list, we don't look at the row, because that transaction was not committed when we started our transaction. If it is not in the list, it must have been committed before our transaction started. We know this because if any backend starting a transaction after ours would get a transaction id higher than ours. Comments? > Ok, I'm not sure that LLL will appear in 6.4 but it's good time to > discuss about it. > > First, PostgreSQL is multi-version system due to its > non-overwriting storage manager. And so, first proposal is use > this feature (multi-versioning) in LLL implementation. > > In multi-version systems access methods don't use locks to read > consistent data and so readers don't block writers, writers don't > block readers and only the same-row writers block writers. In such > systems access methods returns snapshot of data as they were in > _some_ point in time. For read committed isolation level this > moment is the time when statement began. For serialized isolation > level this is the time when current transaction began. > > Oracle uses rollback segments to reconstract blocks that were > changed after statement/transaction began and so statement sees > only data that were committed by then. > > In our case we have to analyze tuple xmin/xmax to determine _when_ > corresponding transaction was committed in regard to the last > transaction (LCX) that was committed when statement/transaction > began. > > If xmin/xmax was committed before LCX then tuple > insertion/deletion is visible to statement, else - not visible. > > To achieve this, the second proposal is to use special SCN - > System Change Number (C) Oracle :) - that will be incremented by 1 > by each transaction commit. Each commited transaction will have > corresponding SCN (4 bytes -> equal to sizeof XID). > > We have to keep XID --> SCN mapping as long as there is running > transaction that is "interested" in XID: when transaction begins > it will determine the first (the oldest) running transaction XID > and this will be the minimum XID whose SCN transaction would like > to know. > > Access methods will have to determine SCN for xmin/xmax only if > FRX <= xmin/xmax <= LSX, where FRX is XID of first (oldest) > running transactions and LSX is last started transaction - in the > moment when statement (for read committed) or transaction (for > serialized) began. For such xmin/xmax their SCNs will be compared > with SCN determined in the moment of statement/transaction > begin... > > Changes made by xmin/xmax < FRX are visible to > statement/transaction, and changes made by xmin/xmax > LSX are not > visible. Without xmin/xmax SCN lookup. > > For XID --> SCN mapping I propose to use the simplest schema: > ordered queue of SCNs (or something like this) - i.e. keep SCNs > for all transactions from the first one whose SCN could be > required by some running transaction to the last started. > > This queue must be shared! > > The size of this queue and average number of commits/aborts per > second will define how long transactions will be able to run. 30 > xacts/sec and 400K of queue will enable 30 - 60 minuts running > transactions... > > Keeping queue in shared memmory may be unacceptable in some > cases... mmap or shared buffer pool could be used to access queue. > We'll see... > > Also note that Oracle has special READ ONLY transactions mode. > READ ONLY transactions are disallowed to change anything in the > database. This is good mode for pg_dump (etc) long running > applications. Because of no one will be "interested" in SCN of > READ ONLY transactions - such transactions can make private copy > of the queue part and after this queue could be truncated... > > Having 4 bytes per SCN enable to use special values to mark > corresponding transaction as running or aborted and avoid pg_log > lookup when we need in both SCN and state of transaction. > > ...Well, it's time to sleep :) > > To be continued... > > Comments ? > > Vadim > > -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h) ------------------------------------------------------------------------ * Prev by Date: How about re-moderating pgsql-announce? * Next by Date: Re: [HACKERS] How about re-moderating pgsql-announce? * Prev by thread: proposals for LLL, part 1 * Next by thread: Re: [HACKERS] proposals for LLL, part 1 * Index(es): o Date o Thread Home | Main Index | Thread Index
Bruce Momjian wrote: > > Let me see if I understand what MVCC(Multi-level concurrency control > is). I looked in "Transaction Processing: Concepts and Techniques", > and saw some mention on pages 435-437, but not much more. Yes, I use Oracle documentation to learn about MVCC. Nevertheless, thank you for this book - there are so many ideas there! > In MVCC, SELECT's do not share lock the table, allowing UPDATE's during > the SELECT. This is done by having the SELECT sequential scan look at > rows that are committed with transaction ids less than their own, or > superseded rows that have a superseded id greater than their own. The > only lock a SELECT does it to prevent a vacuum during its table scan. > > My assumption is table writes still require an exclusive lock, but > because SELECT does not need a lock, both can occur at the same time. > (Sounds like my deadlock and lock queue code may need tweaking.) > > Your stated problem is that someone in a transaction doing a SELECT is > not getting a shared lock on the rows he is selecting, so they could > change while inside the transaction. This is a valid concern. > > Usually, doing the SELECT FOR UPDATE, even though you are not going to > update the table is used. You are suggesting SELECT FOR SHARE LOCK, but > because SELECT's don't need a lock anymore, isn't that the same as a FOR > UPDATE in an MVCC system? Is the problem that SHARE LOCK does not Only one running transaction can mark row for update, SHARE LOCK could be acquired by many transactions and so, using referential integrity for example, many transactions could work with the same foreign key simultaneously. > modify the tuple, so it is harder to lock the rows? > > > As you can see in old lock.c postgres supposed to do > > row --> table lock escalation if #rows locked in > > table equal 10 - very simple and bad -:) > > > > I wouldn't like to care about # of SHARE LOCKed rows, > > at least in 6.5. Actually, this number shouldn't be > > too big. In the case of referential integrity, FOR SHARE LOCK > > should be used only on primary table and only if user > > inserts/updates foreign table, but primary key is unique... > > I think lock escalation is nice. Locking every row makes for lock > resource problems. I would recommend locking a single row, and if a > second row needs to be locked, just escalate to lock the whole table... > if that can be done. This would seem to be the most reasonable and > easiest to do. Easiest to do is don't worry about # of locks -:) Let's be on this way for 6.5 Vadim
> > I think lock escalation is nice. Locking every row makes for lock > > resource problems. I would recommend locking a single row, and if a > > second row needs to be locked, just escalate to lock the whole table... > > if that can be done. This would seem to be the most reasonable and > > easiest to do. > > Easiest to do is don't worry about # of locks -:) > Let's be on this way for 6.5 You mean just share-lock the whole table. I agree. It is a pretty rare situation. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > > > I think lock escalation is nice. Locking every row makes for lock > > > resource problems. I would recommend locking a single row, and if a > > > second row needs to be locked, just escalate to lock the whole table... > > > if that can be done. This would seem to be the most reasonable and > > > easiest to do. > > > > Easiest to do is don't worry about # of locks -:) > > Let's be on this way for 6.5 > > You mean just share-lock the whole table. I agree. It is a pretty rare > situation. No. User may use LOCK TABLE IN SHARE MODE for this. I propose SELECT FOR SHARE LOCK as alternative to LOCK TABLE IN SHARE MODE and SELECT FOR UPDATE and would like to share lock each row selected with FOR SHARE LOCK clause in use. I don't know what's real limitations of # locks, but I think that a tens of locks is Ok. Vadim
> Bruce Momjian wrote: > > > > > > I think lock escalation is nice. Locking every row makes for lock > > > > resource problems. I would recommend locking a single row, and if a > > > > second row needs to be locked, just escalate to lock the whole table... > > > > if that can be done. This would seem to be the most reasonable and > > > > easiest to do. > > > > > > Easiest to do is don't worry about # of locks -:) > > > Let's be on this way for 6.5 > > > > You mean just share-lock the whole table. I agree. It is a pretty rare > > situation. > > No. User may use LOCK TABLE IN SHARE MODE for this. > I propose SELECT FOR SHARE LOCK as alternative to > LOCK TABLE IN SHARE MODE and SELECT FOR UPDATE and > would like to share lock each row selected with > FOR SHARE LOCK clause in use. I don't know what's > real limitations of # locks, but I think that > a tens of locks is Ok. So you are going to shared lock every row. And if a user does a sequential scan of the entire table using SELECT FOR SHARE LOCK, he shared locks every row. Isn't he going to run out of locks? -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > > > > If you are going to set the 'table lock' at 2...why not just do the table > > lock period? From what youexplain above, a table lock won't affect a > > read, only other writes...? > > Good point. I am assuming he is doing some kind of row-level locking > for shared and write locks. I can only guess this from his statement > that shared locking of every row would be a problem. > > Sounds like my explaination may be wrong, because it is saying he has > some kind of row-locking going, perhaps for writes. Maybe he is using > the fact that if a writer is going to update a row that has a > superceeded transaction id that is marked 'in progress' the writer has > to wait for the transaction to finish. If you do this, muliple writers > can update at the same time, making MVCC better than row-level locking > systems. Yes. When transaction begins it places exclusive lock for transaction ID in pseudo-table (no wait because of ID is unique). If other writer sees that t_xmax is valid (and not committed/aborted) transaction ID then it tries to place share lock for t_xmax in this pseudo-table. If t_xmax is running then concurrent writer will wait for t_xmax commit/abort when exclusive lock for t_xmax is released. With this technique writers use single lmgr entry for all updated rows - just to let other the-same-row writers know when transaction commits/aborts. > > Readers don't block writers, and multiple writers can write as long as > they are not touching the same rows. Yes. > In this scenario, shared locks are tricky, because the above system does > not work. You have to do some explicit locking, because reading does > not set anything on the row. Yes, but this will be user' decision to use FOR SHARE LOCK (to lock rows explicitly) or not. Vadim
Bruce Momjian wrote: > > > > > > > > > Easiest to do is don't worry about # of locks -:) > > > > Let's be on this way for 6.5 > > > > > > You mean just share-lock the whole table. I agree. It is a pretty rare > > > situation. > > > > No. User may use LOCK TABLE IN SHARE MODE for this. > > I propose SELECT FOR SHARE LOCK as alternative to > > LOCK TABLE IN SHARE MODE and SELECT FOR UPDATE and > > would like to share lock each row selected with > > FOR SHARE LOCK clause in use. I don't know what's > > real limitations of # locks, but I think that > > a tens of locks is Ok. > > So you are going to shared lock every row. And if a user does a > sequential scan of the entire table using SELECT FOR SHARE LOCK, he > shared locks every row. Isn't he going to run out of locks? I would like to work with this issue after 6.5 and writes some notes about FOR SHARE LOCK limitations/problems. Vadim
Clark Evans wrote: > > > Ok, in multi-version systems readers never lock > > selected rows and so never block writers. Nice but > > cause problems in some cases: if you want > > to implement referential integrity at the > > application level then you'll have to use > > LOCK TABLE IN SHARE MODE or SELECT FOR UPDATE to > > prevent updation of primary keys etc. Not so good... > > I was wondering if there is another perspective > to look upon this problem. > > Assertion: > > The primary difference between "primary key" and > "unique key" is that primary keys are _never_ ^^^^^^^^^^^^^^^^^^^^^^^^ > updated (during transaction processing) where ^^^^^^^ > unique keys may be. 1. Is this standard requirement? 2. Note that foreign keys may reference unique key, not just primary one... 3. I told about implementing referential integrity _at_the_application_level_, not by the DB system itself - it's up tothe user decide what's allowed and what's not, in this case. > Question: > > It seems that the techinical solution here may > not be better locking, but rather a mechinism > to prevent updates on primary keys unless the > entire table is locked, or some other dramatic > gesture for non-transaction processing. > > This leaves the issue of delete looming, so > mabye the suggestion won't help. Yes. Vadim
> > So you are going to shared lock every row. And if a user does a > > sequential scan of the entire table using SELECT FOR SHARE LOCK, he > > shared locks every row. Isn't he going to run out of locks? > > I would like to work with this issue after 6.5 and writes > some notes about FOR SHARE LOCK limitations/problems. OK, just checking. You can't use the xid, so I see the problem with shared-locking. No way to know which rows were seen by SELECT, so no way to shared-lock them without an explicit lock. With Informix, UPDATE automatically locks every row, and you quickly run out of locks. You have to explicitly lock the table in EXCLUSIVE MODE in a transaction to prevent your UPDATE from running out of locks on a large transaction. And you can still run out of log space or generate a 'large transaction' errror because the log got full before the transaction finished. With MVCC, these are not problems, and FOR SHARED LOCK is just a special thing people can enable for special cases. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Vadim Mikheev wrote: > Clark Evans wrote: > > > Ok, in multi-version systems readers never lock > > > selected rows and so never block writers. Nice but > > > cause problems in some cases: if you want > > > to implement referential integrity at the > > > application level then you'll have to use > > > LOCK TABLE IN SHARE MODE or SELECT FOR UPDATE to > > > prevent updation of primary keys etc. Not so good... > > > > I was wondering if there is another perspective > > to look upon this problem. > > > > Assertion: > > > > The primary difference between "primary key" and > > "unique key" is that primary keys are _never_ > ^^^^^^^^^^^^^^^^^^^^^^^^ > > updated (during transaction processing) where > ^^^^^^^ > > unique keys may be. > > 1. Is this standard requirement? A DBA at Ford that I had many afternoon chats with wished he could make the Oracle database prevent updates of primary keys, short of a full table lock -- which is a privilige. I'm sure that there are others that might disagree. The other thing that the DBA said he'd love was a "garbage collector". Where a delete on row would only "hide" the row after all constraint checks succeeded. In this way, data referenced by systems that are not attached to the database have a chance to be checked for data integrety concerns. Although you may be able to find DBA's that would argue about these points, the idea that primary keys are updated only on rare exception is a solid modeling practice (it's one of Oracle's Ten Commandments). > 2. Note that foreign keys may reference unique key, > not just primary one... You may be able to do it... but I'm not sure that it makes sence. Also, unique keys can have NULLS, primary keys cannot have any NULL allowable columns The idea of a primary key is to identify column(s) as a constant pointer to the object in question, so that other objects can use that pointer. The idea of a unique index is to enforce constraints that a combination of columns must be unique. The mechinism does not necessarily mean that the columns are "constant", and thus make good pointers. For instance, a full name may very well be unique in a small company - if this assumption is made in the reporting code, then a unique key is warranted, with clear documentation explaining that the reporting code assumes this fact. However, it would be a bad idea to use the full name as a pointer, as a marriage could cause havoc. > 3. I told about implementing referential > integrity _at_the_application_level_, not by the > DB system itself - it's up to the user decide > what's allowed and what's not, in this case. I think you are right here, although allowing the user to make an application level decision and then configure the database to automagically enforce this decision is golden. More e-mail to follow...
Vadim Mikheev wrote: > Clark Evans wrote: > > > Ok, in multi-version systems readers never lock > > > selected rows and so never block writers. Nice but > > > cause problems in some cases: if you want > > > to implement referential integrity at the > > > application level then you'll have to use > > > LOCK TABLE IN SHARE MODE or SELECT FOR UPDATE to > > > prevent updation of primary keys etc. Not so good... I'm trying to understand. Please excuse my ignorance of database implementation issues. From a application developers's perspective: Suppose that I have three tables: ORDER, PRODUCT, and ORDER_LINE. ORDER_NO is the primary key of ORDER, PRODUCT_NO is the primary key for PRODUCT, and ORDER_NO and PRODUCT_NO are foreign keys in ORDER_LINE. Now picture a data entry person with an order entry screen, with order information above, a grid with a row for each order line, and a drop down list box for each possible product. The problem you identified above would happen if while Lucy was inserting order lines for ORDER_NO=100 PRODUCT_NO=230, someone else changes the order_no from 100 to 101 in the order table _or_ if someone else deletes from product where product_no = 230. There are a number of solutions to this pattern. a) Pessimistic - In this pattern, when a child is about to be edited or added, the parent record is locked. When the updates to the child table are done, a commit frees the parent. This is coupled with code to check to see if the row is locked when it is read, if so, then the parent and the child are brought back "read-only" with a status of "being updated..." This solution is great when the foreign key points to a "natural" parent, such as an order pointing to an order_line... where deleting the order would cascade to the order lines. The solution works wonderfully when the parent and child are on the same screen. b) Optimistic - In this pattern, when a child is inserted/updated and a parent key is not found, an exception is thrown by the database - "foreign key violation". In this case, the client application verifies which foreign key it was, and informs the user about the failure: "Oval Vase (203) is no longer a valid product. Order line addition canceled. Updating product list..." This solution works well when the object being referenced is maintained by a different group or is configured by a different part of the application. In this case, you wouldn't want the product database to be constantly locked up as people are ordering products... fixing a product name would be a pain! In general, if deleting the parent would not cascade to the child, this pattern is the best one to use. This solution works great if the parent/child are on different screens. In any case, I'm not really sure what this locking in share mode would get you... could you explain in the context of the above examples. Thanks! Clark
> In any case, I'm not really sure what this locking > in share mode would get you... could you explain > in the context of the above examples. I will take a stab at it. Locking in shared mode would cause any row read by the SELECT to be remain unmodified until its transaction completes. Other SELECTs can read it, but writer have to wait and can not modify it. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026