Обсуждение: Proposal: String key space for advisory locks
Greetings, I'd like to propose a potential patch, and wanted to get preliminary feedback on it before I started looking into the design. Summary: Add a string key space to the advisory lock functionality. Rationale: Right now, the key spaces (the range of unique values that can be used as identity) for advisory locks are either a bigint or two ints. This is, of course, technically more than one could imaginably need in any application. The difficulty arises when the number of potential advisory locks is related to rows in one or more tables. For example, suppose one wanted to use advisory locks to signal that a queue entry is being processed, and entries in that queue have a primary key that's also a bigint. There's no problem; the advisory lock id is the primary key for the row. And, then, one wants to use an advisory lock to signal that a particular record in another table is being processed in a long-term process. One has a series of unappealing alternatives at that point, mostly involving encoding a table ID and the primary key of a record into the 64 bit number, or just hoping that the primary key doesn't overflow an int, and using the 2 x int form. API Changes: Overloading the various advisory lock functions to take a suitable string type (varchar(64)?) in addition to the bigint / 2 x int variations. As with the bigint / 2 x int forms, this string namespace would be disjoint from the other key spaces. Thanks in advance for any comments. -- -- Christophe Pettus xof@thebuild.com
Christophe Pettus <xof@thebuild.com> wrote: > Summary: Add a string key space to the advisory lock functionality. Why aren't you satisfied with hashtext('foo') ? The restriction comes from LOCKTAG struct, in which we can use only 3 * uint32 and 1 * uint16 for lock descriptor. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
Christophe Pettus wrote: > API Changes: > > Overloading the various advisory lock functions to take a suitable > string type (varchar(64)?) in addition to the bigint / 2 x int > variations. As with the bigint / 2 x int forms, this string > namespace would be disjoint from the other key spaces. I don't think this can be made to work. The locktag hash element has a fixed size. Perhaps you could make it work if you hashed the string and used that as a locktag, but it would lock too much as soon as two strings had matching hashes. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Oct 26, 2009 at 1:54 AM, Christophe Pettus <xof@thebuild.com> wrote: > Greetings, > > I'd like to propose a potential patch, and wanted to get preliminary > feedback on it before I started looking into the design. > > Summary: Add a string key space to the advisory lock functionality. > > Rationale: > > Right now, the key spaces (the range of unique values that can be used as > identity) for advisory locks are either a bigint or two ints. This is, of > course, technically more than one could imaginably need in any application. > The difficulty arises when the number of potential advisory locks is > related to rows in one or more tables. > > For example, suppose one wanted to use advisory locks to signal that a queue > entry is being processed, and entries in that queue have a primary key > that's also a bigint. There's no problem; the advisory lock id is the > primary key for the row. > > And, then, one wants to use an advisory lock to signal that a particular > record in another table is being processed in a long-term process. One has > a series of unappealing alternatives at that point, mostly involving > encoding a table ID and the primary key of a record into the 64 bit number, > or just hoping that the primary key doesn't overflow an int, and using the 2 > x int form. If you want to lock records from multiple tables, probably the best approach is to use a single sequence and pull IDs from it for each table you want to use advisory locks with. It doesn't even have to be the primary key (although it can be)...you can even use a domain: create sequence lock_seq; create domain lock_val not null default nextval('lock_seq'); create table a_table(lock_val lock_val, ...); create table b_table(lock_val lock_val, ...); Regarding your proposal...the lock system is highly optimized and any suggestion that incurs performance issues is probably not going to make it... merlin
Christophe Pettus <xof@thebuild.com> writes: > I'd like to propose a potential patch, and wanted to get preliminary > feedback on it before I started looking into the design. > Summary: Add a string key space to the advisory lock functionality. Your chances of making the LOCKTAG struct bigger for this are nonexistent. I concur with Itagaki-san's suggestion that a hash of your strings ought to be a fine solution ... and you could use it today. regards, tom lane
Alvaro Herrera wrote: > Christophe Pettus wrote: > >> API Changes: >> >> Overloading the various advisory lock functions to take a suitable >> string type (varchar(64)?) in addition to the bigint / 2 x int >> variations. As with the bigint / 2 x int forms, this string >> namespace would be disjoint from the other key spaces. > > I don't think this can be made to work. The locktag hash element has a > fixed size. Perhaps you could make it work if you hashed the string and > used that as a locktag, but it would lock too much as soon as two > strings had matching hashes. You could add another level of indirection, e.g by adding a new table that maps the string to a bigint. I doubt it's worth the effort and performance impact, though. Cleaning up old unused rows from the table etc. would require a fair amount of work. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
> Why aren't you satisfied with hashtext('foo') ? Collisions, mostly. > The restriction comes from LOCKTAG struct, in which > we can use only 3 * uint32 and 1 * uint16 for lock descriptor. Yeah, that's a pretty hard limit. NM, we'll have to figure out some way around it. --Josh Berkus
Josh Berkus <josh@agliodbs.com> wrote: > > Why aren't you satisfied with hashtext('foo') ? > > Collisions, mostly. Hmmm, hashtext() returns int32. , Can you reduce the collision issue if we had hashtext64()? Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote: > Hmmm, hashtext() returns int32. , > Can you reduce the collision issue if we had hashtext64()? That would certainly reduce the chance of a collison considerably, assuming the right algorithm. -- -- Christophe Pettus xof@thebuild.com
On Mon, Oct 26, 2009 at 06:35:13PM -0700, Christophe Pettus wrote: > > On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote: > >> Hmmm, hashtext() returns int32. , >> Can you reduce the collision issue if we had hashtext64()? > > That would certainly reduce the chance of a collison considerably, assuming > the right algorithm. > > -- > -- Christophe Pettus > xof@thebuild.com > The current hash function can already support generating a 64-bit hash value by using both the b and c values. The function is called hashlittle2 and has this comment in the original Bob Jenkins 2006 code: /** hashlittle2: return 2 32-bit hash values** This is identical to hashlittle(), except it returns two 32-bit hash* valuesinstead of just one. This is good enough for hash table* lookup with 2^^64 buckets, or if you want a second hash ifyou're not* happy with the first, or if you want a probably-unique 64-bit ID for* the key. *pc is better mixed than *pb,so use *pc first. If you want* a 64-bit value do something like "*pc + (((uint64_t)*pb)<<32)".*/ This should be a simple change. It would be worth running it by the developer community to figure out how to add this functionality in a way that will make 64-bit hashes available easily to other code in the DB, perhaps even using them in very large hash indexes. Regards, Ken
On Mon, Oct 26, 2009 at 4:30 PM, Josh Berkus <josh@agliodbs.com> wrote: > >> Why aren't you satisfied with hashtext('foo') ? > > Collisions, mostly. Why even bother with a hash function when you can just have multiple table pull from a shared sequence? AFAICT, this solves the OP's problem with no downsides (I used the approach with excellent results in a ported cobol app which had pessimistic locking requirement). merlin
Merlin, > Why even bother with a hash function when you can just have multiple > table pull from a shared sequence? AFAICT, this solves the OP's > problem with no downsides (I used the approach with excellent results > in a ported cobol app which had pessimistic locking requirement). Well, if you have enough tables, the sequence itself becomes a bottleneck (yes, I've had this happen in an app where all tables shared one sequence). There's also the fact that such a solution is extremely hard to retrofit onto an existing application. It also offends my sense of good database design, but that's another issue entirely. More importantly, I think the issues raised here cause developers not to use advisory locks and instead use solutions more subject to race conditions, like a locking table. Advisory locks could be a really cool feature for developers if it was just a bit more usable. But, as others have pointed out, increasing the size of the lock namespace would cause huge issues elsewhere. --Josh Berkus
On Tue, Oct 27, 2009 at 12:43 PM, Josh Berkus <josh@agliodbs.com> wrote: > Merlin, > >> Why even bother with a hash function when you can just have multiple >> table pull from a shared sequence? AFAICT, this solves the OP's >> problem with no downsides (I used the approach with excellent results >> in a ported cobol app which had pessimistic locking requirement). > > Well, if you have enough tables, the sequence itself becomes a > bottleneck I wonder if that's a legacy problem...I tested on our development server w/pgbench -f and measured that nextval('s') scaled almost linearly (I tested up to 900 clients) at about 70% of 'select 0'. (28k tps on 4 core dell server vs 40k peak). pgbench does have it's own scaling problems though. Since I happen to be working on a project that relies heavily on high traffic sequences, do you have any specific insights on known scaling problems with sequences? > It also offends my sense of good database design, but that's another > issue entirely. I basically agree. > More importantly, I think the issues raised here cause developers not to > use advisory locks and instead use solutions more subject to race > conditions, like a locking table. Advisory locks could be a really cool > feature for developers if it was just a bit more usable. 'as is', advisory locks is a fantastic feature that can be used for signaling, mutexing, etc that are relatively difficult things to do in the transactional world of sql. My main gripe is that the 'shared id' method for doing record pessimistic locks is basically a nuclear missile pointed at your shared buffers if you don't have lot of discipline in the queries that lock IDs. Maybe this argues for more of a 'sql exposed' pessimistic lock feature that operates on similar level as 'for update'...I'm not sure...curious what thoughts you have about improving them. merlin
On Oct 27, 2009, at 4:37 PM, Merlin Moncure wrote: > 'as is', advisory locks is a fantastic feature that can be used for > signaling, mutexing, etc that are relatively difficult things to do in > the transactional world of sql. My main gripe is that the 'shared id' > method for doing record pessimistic locks is basically a nuclear > missile pointed at your shared buffers if you don't have lot of > discipline in the queries that lock IDs. Maybe this argues for more > of a 'sql exposed' pessimistic lock feature that operates on similar > level as 'for update'...I'm not sure...curious what thoughts you have > about improving them. Advisory locks have, as you say, a raft of very useful characteristics: 1. Enforced as much or as little as you wish, depending on your application design. 2. Race-condition-free. 3. Cleaned up automatically on session end. Of course, 2^64 potential entries are enough for anyone. The usability issue comes when you have multiple domains that you want to apply advisory locks to in a single database. For example, if you have multiple tables (one of which, just for example, has a character pk), and perhaps some inter-client mutex signaling for things that are outside of a transactional model ("this client is importing a file from an outside source, so don't you do it"), it's unappealing to have to come up with ways of representing those in a 64-bit namespace. Hashing isn't a terrible solution, assuming collisions don't become an issue; a well-designed hashtext64() would help a lot. -- -- Christophe Pettus xof@thebuild.com