Обсуждение: storing binary data

Поиск
Список
Период
Сортировка

storing binary data

От
Neil Conway
Дата:
I want to store some binary data in Postgres. The data is an
MD5 checksum of the user's password, in binary. It will be
exactly 16 bytes (since it is a one-way hash).

Can I store this safely in a CHAR column? Can the data be
treated normally (i.e. compare a binary checksum of the
password the user entered with the binary value stored
in the DB, etc).

Are there any issues I'll need to watch out for when
storing/manipulating binary data?

(I was going to RTFM and search the mailing list archives,
but they don't seem to be working.)

TIA,

Neil

--
Neil Conway <neilconway@home.com>
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed

Secrecy is the beginning of tyranny.
        -- Heinlein

Вложения

Re: storing binary data

От
Tom Lane
Дата:
Neil Conway <nconway@klamath.dyndns.org> writes:
> I want to store some binary data in Postgres. The data is an
> MD5 checksum of the user's password, in binary. It will be
> exactly 16 bytes (since it is a one-way hash).

> Can I store this safely in a CHAR column?

No.  CHAR and friends assume there are no null (zero) bytes.
In MULTIBYTE setups there are probably additional constraints.

You could use bytea, but I would recommend converting the checksum
to a hex digit string and then storing that in a char-type field.
Hex is the usual textual representation for MD5 values, no?

            regards, tom lane

Re: storing binary data

От
Neil Conway
Дата:
On Mon, Oct 16, 2000 at 11:22:40PM -0400, Tom Lane wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > I want to store some binary data in Postgres. The data is an
> > MD5 checksum of the user's password, in binary. It will be
> > exactly 16 bytes (since it is a one-way hash).
>
> > Can I store this safely in a CHAR column?
>
> No.  CHAR and friends assume there are no null (zero) bytes.
> In MULTIBYTE setups there are probably additional constraints.
>
> You could use bytea, but I would recommend converting the checksum
> to a hex digit string and then storing that in a char-type field.
> Hex is the usual textual representation for MD5 values, no?

It is, but (IMHO) it's a big waste of space. The actual MD5 digest is
128 bits. If stored in binary form, it's 16 bytes. If stored in hex
form (as ASCII), it's 32 characters @ 1 byte per character = 32 bytes.
In Unicode, that's 64 bytes (correct me if I'm wrong).

It's not a huge deal, but it would be nice to store this efficiently.
Is this possible?

TIA,

Neil

--
Neil Conway <neilconway@home.com>
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed

Being able to break into computers doesn't make you a hacker any more
than being able to hotwire cars makes you an automotive engineer.

Вложения

Re: storing binary data

От
Alfred Perlstein
Дата:
* Neil Conway <nconway@klamath.dyndns.org> [001016 20:41] wrote:
> On Mon, Oct 16, 2000 at 11:22:40PM -0400, Tom Lane wrote:
> > Neil Conway <nconway@klamath.dyndns.org> writes:
> > > I want to store some binary data in Postgres. The data is an
> > > MD5 checksum of the user's password, in binary. It will be
> > > exactly 16 bytes (since it is a one-way hash).
> >
> > > Can I store this safely in a CHAR column?
> >
> > No.  CHAR and friends assume there are no null (zero) bytes.
> > In MULTIBYTE setups there are probably additional constraints.
> >
> > You could use bytea, but I would recommend converting the checksum
> > to a hex digit string and then storing that in a char-type field.
> > Hex is the usual textual representation for MD5 values, no?
>
> It is, but (IMHO) it's a big waste of space. The actual MD5 digest is
> 128 bits. If stored in binary form, it's 16 bytes. If stored in hex
> form (as ASCII), it's 32 characters @ 1 byte per character = 32 bytes.
> In Unicode, that's 64 bytes (correct me if I'm wrong).
>
> It's not a huge deal, but it would be nice to store this efficiently.
> Is this possible?

Why not use base64?  It's pretty gross but might work for you.

-Alfred

Re: storing binary data

От
Tom Lane
Дата:
Neil Conway <nconway@klamath.dyndns.org> writes:
>> You could use bytea, but I would recommend converting the checksum
>> to a hex digit string and then storing that in a char-type field.
>> Hex is the usual textual representation for MD5 values, no?

> It is, but (IMHO) it's a big waste of space. The actual MD5 digest is
> 128 bits. If stored in binary form, it's 16 bytes. If stored in hex
> form (as ASCII), it's 32 characters @ 1 byte per character =3D 32 bytes.

You're worried about 16 bytes per pg_shadow entry?  Get real.  I'd
have recommended bytea if the amount of storage involved were actually
significant, but for this application readability seems more important.

            regards, tom lane

Re: storing binary data

От
Neil Conway
Дата:
On Mon, Oct 16, 2000 at 11:57:36PM -0400, Tom Lane wrote:
> Neil Conway <nconway@klamath.dyndns.org> writes:
> > It is, but (IMHO) it's a big waste of space. The actual MD5 digest is
> > 128 bits. If stored in binary form, it's 16 bytes. If stored in hex
> > form (as ASCII), it's 32 characters @ 1 byte per character =3D 32 bytes.
>
> You're worried about 16 bytes per pg_shadow entry?  Get real.  I'd
> have recommended bytea if the amount of storage involved were actually
> significant, but for this application readability seems more important.

To clarify, these are 'application users', not Postgres users. So
the info is stored in one of my own tables, not pg_shadow. Although
I agree, this isn't a big deal either way.

Alfred: thanks for the tip. Looks like base64 will solve my problems!

Thanks to everyone who responded,

Neil

--
Neil Conway <neilconway@home.com>
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed

Violence is to dictatorship as propaganda is to democracy.
        -- Noam Chomsky

Вложения