Обсуждение: Proposal: String key space for advisory locks

Поиск

Список

Период

Сортировка

Proposal: String key space for advisory locks

От

Christophe Pettus

Дата:

26 октября 2009 г., 05:54:40

Greetings,

I'd like to propose a potential patch, and wanted to get preliminary  
feedback on it before I started looking into the design.

Summary:    Add a string key space to the advisory lock functionality.

Rationale:

Right now, the key spaces (the range of unique values that can be used  
as identity) for advisory locks are either a bigint or two ints.  This  
is, of course, technically more than one could imaginably need in any  
application.  The difficulty arises when the number of potential  
advisory locks is related to rows in one or more tables.

For example, suppose one wanted to use advisory locks to signal that a  
queue entry is being processed, and entries in that queue have a  
primary key that's also a bigint.  There's no problem; the advisory  
lock id is the primary key for the row.

And, then, one wants to use an advisory lock to signal that a  
particular record in another table is being processed in a long-term  
process.  One has a series of unappealing alternatives at that point,  
mostly involving encoding a table ID and the primary key of a record  
into the 64 bit number, or just hoping that the primary key doesn't  
overflow an int, and using the 2 x int form.

API Changes:

Overloading the various advisory lock functions to take a suitable  
string type (varchar(64)?) in addition to the bigint / 2 x int  
variations.  As with the bigint / 2 x int forms, this string namespace  
would be disjoint from the other key spaces.

Thanks in advance for any comments.
--
-- Christophe Pettus   xof@thebuild.com

Re: Proposal: String key space for advisory locks

От

Itagaki Takahiro

Дата:

26 октября 2009 г., 07:11:42

Christophe Pettus <xof@thebuild.com> wrote:

> Summary:    Add a string key space to the advisory lock functionality.

Why aren't you satisfied with hashtext('foo') ?

The restriction comes from LOCKTAG struct, in which
we can use only 3 * uint32 and 1 * uint16 for lock descriptor.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: Proposal: String key space for advisory locks

От

Alvaro Herrera

Дата:

26 октября 2009 г., 13:41:53

Christophe Pettus wrote:

> API Changes:
> 
> Overloading the various advisory lock functions to take a suitable
> string type (varchar(64)?) in addition to the bigint / 2 x int
> variations.  As with the bigint / 2 x int forms, this string
> namespace would be disjoint from the other key spaces.

I don't think this can be made to work.  The locktag hash element has a
fixed size.  Perhaps you could make it work if you hashed the string and
used that as a locktag, but it would lock too much as soon as two
strings had matching hashes.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Proposal: String key space for advisory locks

От

Merlin Moncure

Дата:

26 октября 2009 г., 17:50:00

On Mon, Oct 26, 2009 at 1:54 AM, Christophe Pettus <xof@thebuild.com> wrote:
> Greetings,
>
> I'd like to propose a potential patch, and wanted to get preliminary
> feedback on it before I started looking into the design.
>
> Summary:    Add a string key space to the advisory lock functionality.
>
> Rationale:
>
> Right now, the key spaces (the range of unique values that can be used as
> identity) for advisory locks are either a bigint or two ints.  This is, of
> course, technically more than one could imaginably need in any application.
>  The difficulty arises when the number of potential advisory locks is
> related to rows in one or more tables.
>
> For example, suppose one wanted to use advisory locks to signal that a queue
> entry is being processed, and entries in that queue have a primary key
> that's also a bigint.  There's no problem; the advisory lock id is the
> primary key for the row.
>
> And, then, one wants to use an advisory lock to signal that a particular
> record in another table is being processed in a long-term process.  One has
> a series of unappealing alternatives at that point, mostly involving
> encoding a table ID and the primary key of a record into the 64 bit number,
> or just hoping that the primary key doesn't overflow an int, and using the 2
> x int form.

If you want to lock records from multiple tables, probably the best
approach is to use a single sequence and pull IDs from it for each
table you want to use advisory locks with.  It doesn't even have to be
the primary key (although it can be)...you can even use a domain:

create sequence lock_seq;
create domain lock_val not null default nextval('lock_seq');
create table a_table(lock_val lock_val, ...);
create table b_table(lock_val lock_val, ...);

Regarding your proposal...the lock system is highly optimized and any
suggestion that incurs performance issues is probably not going to
make it...

merlin

Re: Proposal: String key space for advisory locks

От

Tom Lane

Дата:

26 октября 2009 г., 18:23:29

Christophe Pettus <xof@thebuild.com> writes:
> I'd like to propose a potential patch, and wanted to get preliminary  
> feedback on it before I started looking into the design.

> Summary:    Add a string key space to the advisory lock functionality.

Your chances of making the LOCKTAG struct bigger for this are nonexistent.
I concur with Itagaki-san's suggestion that a hash of your strings ought
to be a fine solution ... and you could use it today.
        regards, tom lane

Re: Proposal: String key space for advisory locks

От

Heikki Linnakangas

Дата:

26 октября 2009 г., 18:37:26

Alvaro Herrera wrote:
> Christophe Pettus wrote:
> 
>> API Changes:
>>
>> Overloading the various advisory lock functions to take a suitable
>> string type (varchar(64)?) in addition to the bigint / 2 x int
>> variations.  As with the bigint / 2 x int forms, this string
>> namespace would be disjoint from the other key spaces.
> 
> I don't think this can be made to work.  The locktag hash element has a
> fixed size.  Perhaps you could make it work if you hashed the string and
> used that as a locktag, but it would lock too much as soon as two
> strings had matching hashes.

You could add another level of indirection, e.g by adding a new table
that maps the string to a bigint. I doubt it's worth the effort and
performance impact, though. Cleaning up old unused rows from the table
etc. would require a fair amount of work.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

Re: Proposal: String key space for advisory locks

От

Josh Berkus

Дата:

26 октября 2009 г., 20:30:15

> Why aren't you satisfied with hashtext('foo') ?

Collisions, mostly.

> The restriction comes from LOCKTAG struct, in which
> we can use only 3 * uint32 and 1 * uint16 for lock descriptor.

Yeah, that's a pretty hard limit.  NM, we'll have to figure out some way
around it.

--Josh Berkus

Re: Proposal: String key space for advisory locks

От

Itagaki Takahiro

Дата:

27 октября 2009 г., 00:23:50

Josh Berkus <josh@agliodbs.com> wrote:

> > Why aren't you satisfied with hashtext('foo') ?
> 
> Collisions, mostly.

Hmmm, hashtext() returns int32. ,
Can you reduce the collision issue if we had hashtext64()?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Re: Proposal: String key space for advisory locks

От

Christophe Pettus

Дата:

27 октября 2009 г., 01:35:28

On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote:

> Hmmm, hashtext() returns int32. ,
> Can you reduce the collision issue if we had hashtext64()?

That would certainly reduce the chance of a collison considerably,  
assuming the right algorithm.

--
-- Christophe Pettus   xof@thebuild.com

Re: Proposal: String key space for advisory locks

От

Kenneth Marshall

Дата:

27 октября 2009 г., 12:53:19

On Mon, Oct 26, 2009 at 06:35:13PM -0700, Christophe Pettus wrote:
>
> On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote:
>
>> Hmmm, hashtext() returns int32. ,
>> Can you reduce the collision issue if we had hashtext64()?
>
> That would certainly reduce the chance of a collison considerably, assuming 
> the right algorithm.
>
> --
> -- Christophe Pettus
>    xof@thebuild.com
>
The current hash function can already support generating a 64-bit
hash value by using both the b and c values. The function is called
hashlittle2 and has this comment in the original Bob Jenkins 2006
code:

/** hashlittle2: return 2 32-bit hash values** This is identical to hashlittle(), except it returns two 32-bit hash*
valuesinstead of just one.  This is good enough for hash table* lookup with 2^^64 buckets, or if you want a second hash
ifyou're not* happy with the first, or if you want a probably-unique 64-bit ID for* the key.  *pc is better mixed than
*pb,so use *pc first.  If you want* a 64-bit value do something like "*pc + (((uint64_t)*pb)<<32)".*/

This should be a simple change. It would be worth running it by
the developer community to figure out how to add this functionality
in a way that will make 64-bit hashes available easily to other
code in the DB, perhaps even using them in very large hash indexes.

Regards,
Ken

Re: Proposal: String key space for advisory locks

От

Merlin Moncure

Дата:

27 октября 2009 г., 13:49:44

On Mon, Oct 26, 2009 at 4:30 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Why aren't you satisfied with hashtext('foo') ?
>
> Collisions, mostly.

Why even bother with a hash function when you can just have multiple
table pull from a shared sequence?  AFAICT, this solves the OP's
problem with no downsides (I used the approach with excellent results
in a ported cobol app which had pessimistic locking requirement).

merlin

Re: Proposal: String key space for advisory locks

От

Josh Berkus

Дата:

27 октября 2009 г., 16:43:38

Merlin,

> Why even bother with a hash function when you can just have multiple
> table pull from a shared sequence?  AFAICT, this solves the OP's
> problem with no downsides (I used the approach with excellent results
> in a ported cobol app which had pessimistic locking requirement).

Well, if you have enough tables, the sequence itself becomes a
bottleneck (yes, I've had this happen in an app where all tables shared
one sequence).  There's also the fact that such a solution is extremely
hard to retrofit onto an existing application.

It also offends my sense of good database design, but that's another
issue entirely.

More importantly, I think the issues raised here cause developers not to
use advisory locks and instead use solutions more subject to race
conditions, like a locking table.  Advisory locks could be a really cool
feature for developers if it was just a bit more usable.

But, as others have pointed out, increasing the size of the lock
namespace would cause huge issues elsewhere.

--Josh Berkus

Re: Proposal: String key space for advisory locks

От

Merlin Moncure

Дата:

27 октября 2009 г., 23:37:14

On Tue, Oct 27, 2009 at 12:43 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Merlin,
>
>> Why even bother with a hash function when you can just have multiple
>> table pull from a shared sequence?  AFAICT, this solves the OP's
>> problem with no downsides (I used the approach with excellent results
>> in a ported cobol app which had pessimistic locking requirement).
>
> Well, if you have enough tables, the sequence itself becomes a
> bottleneck

I wonder if that's a legacy problem...I tested on our development
server w/pgbench -f and measured that nextval('s') scaled almost
linearly (I tested up to 900 clients) at about 70% of 'select 0'. (28k
tps on 4 core dell server vs 40k peak).  pgbench does have it's own
scaling problems though.  Since I happen to be working on a project
that relies heavily on high traffic sequences, do you have any
specific insights on known scaling problems with sequences?

> It also offends my sense of good database design, but that's another
> issue entirely.

I basically agree.

> More importantly, I think the issues raised here cause developers not to
> use advisory locks and instead use solutions more subject to race
> conditions, like a locking table.  Advisory locks could be a really cool
> feature for developers if it was just a bit more usable.

'as is', advisory locks is a fantastic feature that can be used for
signaling, mutexing, etc that are relatively difficult things to do in
the transactional world of sql.  My main gripe is that the 'shared id'
method for doing record pessimistic locks is basically a nuclear
missile pointed at your shared buffers if you don't have lot of
discipline in the queries that lock IDs.  Maybe this argues for more
of a 'sql exposed' pessimistic lock feature that operates on similar
level as 'for update'...I'm not sure...curious what thoughts you have
about improving them.

merlin

Re: Proposal: String key space for advisory locks

От

Christophe Pettus

Дата:

27 октября 2009 г., 23:50:34

On Oct 27, 2009, at 4:37 PM, Merlin Moncure wrote:
> 'as is', advisory locks is a fantastic feature that can be used for
> signaling, mutexing, etc that are relatively difficult things to do in
> the transactional world of sql.  My main gripe is that the 'shared id'
> method for doing record pessimistic locks is basically a nuclear
> missile pointed at your shared buffers if you don't have lot of
> discipline in the queries that lock IDs.  Maybe this argues for more
> of a 'sql exposed' pessimistic lock feature that operates on similar
> level as 'for update'...I'm not sure...curious what thoughts you have
> about improving them.

Advisory locks have, as you say, a raft of very useful characteristics:

1. Enforced as much or as little as you wish, depending on your  
application design.
2. Race-condition-free.
3. Cleaned up automatically on session end.

Of course, 2^64 potential entries are enough for anyone.  The  
usability issue comes when you have multiple domains that you want to  
apply advisory locks to in a single database.  For example, if you  
have multiple tables (one of which, just for example, has a character  
pk), and perhaps some inter-client mutex signaling for things that are  
outside of a transactional model ("this client is importing a file  
from an outside source, so don't you do it"), it's unappealing to have  
to come up with ways of representing those in a 64-bit namespace.

Hashing isn't a terrible solution, assuming collisions don't become an  
issue; a well-designed hashtext64() would help a lot.
--
-- Christophe Pettus   xof@thebuild.com

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Proposal: String key space for advisory locks