Обсуждение: UUID's as primary keys

Поиск
Список
Период
Сортировка

UUID's as primary keys

От
Thomas Hallgren
Дата:
I'm building an app where I have several entities that are identified
using a UUID (i.e. a 128 bit quantity). My current implementation uses a
composite primary key consisting of two int8 values. It's a bit
cumbersome and I would much rather have a distinct type. An earlier
implementation using Oracle mapped the UUID to a RAW(16) but PostgreSQL
doesn't have that and the BYTEA adds extra overhead.

What would be the best (as in most efficient) mapping for a 128 bit
primary key?

Regards,
Thomas Hallgren


Re: UUID's as primary keys

От
Martijn van Oosterhout
Дата:
On Wed, Jun 28, 2006 at 09:01:49AM +0200, Thomas Hallgren wrote:
> I'm building an app where I have several entities that are identified
> using a UUID (i.e. a 128 bit quantity). My current implementation uses a
> composite primary key consisting of two int8 values. It's a bit
> cumbersome and I would much rather have a distinct type. An earlier
> implementation using Oracle mapped the UUID to a RAW(16) but PostgreSQL
> doesn't have that and the BYTEA adds extra overhead.
>
> What would be the best (as in most efficient) mapping for a 128 bit
> primary key?

Sounds like something for a custom type. There's one here[1] though I
have no idea how good it is.

[1] http://gborg.postgresql.org/project/pguuid/projdisplay.php

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: UUID's as primary keys

От
Thomas Hallgren
Дата:
Martijn van Oosterhout wrote:
> On Wed, Jun 28, 2006 at 09:01:49AM +0200, Thomas Hallgren wrote:
>
>> I'm building an app where I have several entities that are identified
>> using a UUID (i.e. a 128 bit quantity). My current implementation uses a
>> composite primary key consisting of two int8 values. It's a bit
>> cumbersome and I would much rather have a distinct type. An earlier
>> implementation using Oracle mapped the UUID to a RAW(16) but PostgreSQL
>> doesn't have that and the BYTEA adds extra overhead.
>>
>> What would be the best (as in most efficient) mapping for a 128 bit
>> primary key?
>>
>
> Sounds like something for a custom type. There's one here[1] though I
> have no idea how good it is.
>
> [1] http://gborg.postgresql.org/project/pguuid/projdisplay.php
>
> Have a nice day,
>
Thanks. That would of course work but at the same time it increases the
complexity of my app. Yet another component to install and keep track
of. It's also a bit of an overkill since the only thing I need is an
opaque bit storage. Why is it that PostgreSQL lack a fixed length binary
type similar to the RAW type in Oracle? ISTM that could be very useful
and not very hard to implement.

Regards,
Thomas Hallgren


Re: UUID's as primary keys

От
Martijn van Oosterhout
Дата:
On Wed, Jun 28, 2006 at 10:22:45AM +0200, Thomas Hallgren wrote:
> Thanks. That would of course work but at the same time it increases the
> complexity of my app. Yet another component to install and keep track
> of. It's also a bit of an overkill since the only thing I need is an
> opaque bit storage. Why is it that PostgreSQL lack a fixed length binary
> type similar to the RAW type in Oracle? ISTM that could be very useful
> and not very hard to implement.

AIUI, it can't be done because of a basic rule of the type system: the
typmod can't be necessary to interpret the binary representation of a
value. For something like RAW(16) the type would be the oid for "raw"
and the typmod would be 16. However, when reading the value from a disk
page, you're not given the typmod, so you have no way of determining
the length.

That's why there is a verlena header for types where you don't know the
length, which adds overhead. The alternative is to create a type of the
exact length you want, but in your case that's not acceptable either.

I'm not sure if there are any other alternatives.

Have a ncie day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: UUID's as primary keys

От
Thomas Hallgren
Дата:
Martijn van Oosterhout wrote:
> On Wed, Jun 28, 2006 at 10:22:45AM +0200, Thomas Hallgren wrote:
>> Thanks. That would of course work but at the same time it increases the
>> complexity of my app. Yet another component to install and keep track
>> of. It's also a bit of an overkill since the only thing I need is an
>> opaque bit storage. Why is it that PostgreSQL lack a fixed length binary
>> type similar to the RAW type in Oracle? ISTM that could be very useful
>> and not very hard to implement.
>
> AIUI, it can't be done because of a basic rule of the type system: the
> typmod can't be necessary to interpret the binary representation of a
> value. For something like RAW(16) the type would be the oid for "raw"
> and the typmod would be 16. However, when reading the value from a disk
> page, you're not given the typmod, so you have no way of determining
> the length.
>
OK. I thought you always had a type descriptor handy when reading the binary representation.
I've noticed that the typmod is expected in some receive functions (bpcharrecv and
numeric_recv for instance). Are you saying that there are times when you don't use that?

Regards,
Thomas Hallgren


Re: UUID's as primary keys

От
Martijn van Oosterhout
Дата:
On Wed, Jun 28, 2006 at 12:03:40PM +0200, Thomas Hallgren wrote:
> OK. I thought you always had a type descriptor handy when reading the
> binary representation. I've noticed that the typmod is expected in some
> receive functions (bpcharrecv and numeric_recv for instance). Are you
> saying that there are times when you don't use that?

The input functions get it, the output functions (bpcharout,
bpcharsend, etc) don't. Which makes it kind of hard to print a raw
value if you don't know how long it's going to be. They used to, but
that was removed some time back. It's a security issue IIRC, since any
user could call raw_out(field, 2048) and get whatever was in the 2K of
data after that field.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: UUID's as primary keys

От
Thomas Hallgren
Дата:
Martijn van Oosterhout wrote:
> On Wed, Jun 28, 2006 at 12:03:40PM +0200, Thomas Hallgren wrote:
>
>> OK. I thought you always had a type descriptor handy when reading the
>> binary representation. I've noticed that the typmod is expected in some
>> receive functions (bpcharrecv and numeric_recv for instance). Are you
>> saying that there are times when you don't use that?
>>
>
> The input functions get it, the output functions (bpcharout,
> bpcharsend, etc) don't. Which makes it kind of hard to print a raw
> value if you don't know how long it's going to be. They used to, but
> that was removed some time back. It's a security issue IIRC, since any
> user could call raw_out(field, 2048) and get whatever was in the 2K of
> data after that field.
>
>
A user that is trusted with installing a C-function in the backend is
free to scan the process memory anyway so in what way did that increase
the security? IMHO, the only relevant security in that context is to
have trusted people install trusted modules. I'm surprised that
something like that made you remove significant functionality.

Regards,
Thomas Hallgren


Re: UUID's as primary keys

От
Martijn van Oosterhout
Дата:
On Wed, Jun 28, 2006 at 01:56:47PM +0200, Thomas Hallgren wrote:
> A user that is trusted with installing a C-function in the backend is
> free to scan the process memory anyway so in what way did that increase
> the security? IMHO, the only relevant security in that context is to
> have trusted people install trusted modules. I'm surprised that
> something like that made you remove significant functionality.

You're missing the point. The type output function is not generally a
priveledged function. Think bpcharout, text_out, numeric_out, etc...
These can be called by users directly and the input to those functions
cannot be trusted.

If the type output function needs an additional parameter to correctly
and safely decode the actual Datum, you're screwed because then users
can pass invalid parameters to affect the decoding. If you have a way
of telling what the right value is, then you didn't need to pass it in
the first place.

Hence, you have to be able to decode a datum knowing only its type,
irrespective of typmod. So say you had a field of type RAW(16) you
would have to be able to decode it knowing only that it is of type
"RAW". So you need a header to tell you how long it is i.e. a varlena
structure.

Hope this clarifies it a bit,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: UUID's as primary keys

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> The input functions get it, the output functions (bpcharout,
> bpcharsend, etc) don't. Which makes it kind of hard to print a raw
> value if you don't know how long it's going to be. They used to, but
> that was removed some time back.

Even back then you couldn't rely on the typmod value to be supplied;
it was quite likely to be passed as -1.  The issue is not actually
with on-disk storage, it is with function/operator arguments and
results.  Those have never been identified any more closely than by
giving a type OID.  So for any value that came from a function,
you won't have a typmod, and you'd better be able to find out all
you need to know just by inspecting the value itself.  Hence, length
words.

This is all pretty off-topic for pgsql-general, isn't it?

            regards, tom lane