Обсуждение: Blobs in Postgresql

Поиск
Список
Период
Сортировка

Blobs in Postgresql

От
"Ron Olson"
Дата:
Hi all-

I am evaluating databases for use in a large project that will hold image data as blobs. I know, everybody says to just store pointers to files on the disk...can't do it here...the images are of a confidential nature and access to the database (and resulting client app) will be highly restricted. The underlying platform will likely be Linux though Solaris x86-64 has been suggested as well.

I did some tests with MySQL and found the results very sub-par...the standard blob field only holds 64k (they have three types of blobs for whatever reason) and the real problem is that my uploads and downloads have failed because of packet size issues...this can be solved somewhat with server settings, but I get the impression that blobs are barely supported.

So turning to Postgresql, can I get any recommendations, suggestions and tips on blob handling in the database? The image sizes will be pretty variable, from a few kilobytes to several hundred megabytes, so I need something that will handle the various file sizes, hopefully transparently.

Thanks for any info,

Ron

Re: Blobs in Postgresql

От
"Pavel Stehule"
Дата:
> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.
>

PostgreSQL BLOB implementation is well. We used it without any
problems with images from 20K-30M.

Regards
Pavel Stehule

Re: Blobs in Postgresql

От
Gregory Stark
Дата:
"Ron Olson" <tachoknight@gmail.com> writes:

> Hi all-
>
> I am evaluating databases for use in a large project that will hold image
> data as blobs. I know, everybody says to just store pointers to files on the
> disk...

Well not everyone. I usually do, but if you're not handling these blobs under
heavy load independent of the database (like web servers) then either approach
works.

> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.

There are basically two options. If you are not handling data that are too
large to copy around in memory, and you don't need to upload and download the
data in chunks (usually these are the same issue) then you can just store your
images in a bytea. Postgres transparently treats *all* large variable-sized
data whether text, bytea, arrays, like a blob. It stores it in a separate
table outside the main table.

If your data can sometimes be so large that you cannot manipulate the whole
thing in memory all at once (Keep in mind that Postgres expects to be able to
handle a few copies of the data at the same time. Conservatively expect 5
simultaneous copies to have to fit in memory.) then you'll have to look into
the large object interface which is a set of functions starting with lo_*

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Blobs in Postgresql

От
Trent Shipley
Дата:
On Wednesday 2007-08-15 05:52, Gregory Stark wrote:
> "Ron Olson" <tachoknight@gmail.com> writes:
> > Hi all-
> >
> > I am evaluating databases for use in a large project that will hold image
> > data as blobs. I know, everybody says to just store pointers to files on
> > the disk...
>
> Well not everyone. I usually do, but if you're not handling these blobs
> under heavy load independent of the database (like web servers) then either
> approach works.

I've always wondered how you keep transactions working when you only store
pointers to large data.  Do you need an external transaction manager to
insure that the file doesn't get deleted when you "delete" the data via the
pointer?  Do you need an external application that handles all deletes,
inserts, and updates?

Re: Blobs in Postgresql

От
"Merlin Moncure"
Дата:
On 8/15/07, Ron Olson <tachoknight@gmail.com> wrote:
> Hi all-
>
> I am evaluating databases for use in a large project that will hold image
> data as blobs. I know, everybody says to just store pointers to files on the
> disk...can't do it here...the images are of a confidential nature and access
> to the database (and resulting client app) will be highly restricted. The
> underlying platform will likely be Linux though Solaris x86-64 has been
> suggested as well.
>
> I did some tests with MySQL and found the results very sub-par...the
> standard blob field only holds 64k (they have three types of blobs for
> whatever reason) and the real problem is that my uploads and downloads have
> failed because of packet size issues...this can be solved somewhat with
> server settings, but I get the impression that blobs are barely supported.
>
> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.

for fast performance, you should make sure to use the parameterized
interface and send in the results as binary (or use a language that
accesses the database that way).  I would be nervous about storing
blobs if they were very large.

regarding the security issue, have you looked at encryption?

merlin

Re: Blobs in Postgresql

От
"Ron Olson"
Дата:
The language is Java. I've made some tests and they work very well for 25meg files....works exactly the way it should, first time. MySQL had all kinds of nasty surprises for me when I first started working with blobs, but I can say that I took my code, changed the driver, and it all works like a champ (mind you, this was a quick test app).

I haven't looked at encryption at the database level....is such a thing available? I know Oracle has some form of data encryption at the database level so the nefarious DBA with the wide mustache and black brimmed hat always going "ah ha ha ha ha" can't make off with the data, but does Postgres have something similar?

BTW, to put into context, the database will be designed to hold evidence (well, photos and videos of). Thus the compelling need for some security, as well as the variation in file sizes.

On 8/17/07, Merlin Moncure <mmoncure@gmail.com> wrote:
On 8/15/07, Ron Olson <tachoknight@gmail.com> wrote:
> Hi all-
>
> I am evaluating databases for use in a large project that will hold image
> data as blobs. I know, everybody says to just store pointers to files on the
> disk...can't do it here...the images are of a confidential nature and access
> to the database (and resulting client app) will be highly restricted. The
> underlying platform will likely be Linux though Solaris x86-64 has been
> suggested as well.
>
> I did some tests with MySQL and found the results very sub-par...the
> standard blob field only holds 64k (they have three types of blobs for
> whatever reason) and the real problem is that my uploads and downloads have
> failed because of packet size issues...this can be solved somewhat with
> server settings, but I get the impression that blobs are barely supported.
>
> So turning to Postgresql, can I get any recommendations, suggestions and
> tips on blob handling in the database? The image sizes will be pretty
> variable, from a few kilobytes to several hundred megabytes, so I need
> something that will handle the various file sizes, hopefully transparently.

for fast performance, you should make sure to use the parameterized
interface and send in the results as binary (or use a language that
accesses the database that way).  I would be nervous about storing
blobs if they were very large.

regarding the security issue, have you looked at encryption?

merlin

Re: Blobs in Postgresql

От
"Merlin Moncure"
Дата:
On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote:
> The language is Java. I've made some tests and they work very well for 25meg
> files....works exactly the way it should, first time. MySQL had all kinds of
> nasty surprises for me when I first started working with blobs, but I can
> say that I took my code, changed the driver, and it all works like a champ
> (mind you, this was a quick test app).
>
> I haven't looked at encryption at the database level....is such a thing
> available? I know Oracle has some form of data encryption at the database
> level so the nefarious DBA with the wide mustache and black brimmed hat
> always going "ah ha ha ha ha" can't make off with the data, but does
> Postgres have something similar?
>
> BTW, to put into context, the database will be designed to hold evidence
> (well, photos and videos of). Thus the compelling need for some security, as
> well as the variation in file sizes.

Well, my assumption was that you would encrypt the data on the client
side and store it that way.

PostgreSQL has open architecture.  If you wanted to do the encryption
on the server, one possible approach that jumps out at me is to write
a small C function which receives the data, encrypts the image using a
key sent by the client all (but not stored), and either stores the
encrypted image back in the database via SPI or writes it out to a
file.

There are many strategies to encrypting data...first thing to think
about is where the encryption happens, where the keys are stored, etc.

merlin

Re: Blobs in Postgresql

От
Ron Johnson
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/17/07 23:16, Merlin Moncure wrote:
> On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote:
>> The language is Java. I've made some tests and they work very well for 25meg
>> files....works exactly the way it should, first time. MySQL had all kinds of
>> nasty surprises for me when I first started working with blobs, but I can
>> say that I took my code, changed the driver, and it all works like a champ
>> (mind you, this was a quick test app).
>>
>> I haven't looked at encryption at the database level....is such a thing
>> available? I know Oracle has some form of data encryption at the database
>> level so the nefarious DBA with the wide mustache and black brimmed hat
>> always going "ah ha ha ha ha" can't make off with the data, but does
>> Postgres have something similar?
>>
>> BTW, to put into context, the database will be designed to hold evidence
>> (well, photos and videos of). Thus the compelling need for some security, as
>> well as the variation in file sizes.
>
> Well, my assumption was that you would encrypt the data on the client
> side and store it that way.
>
> PostgreSQL has open architecture.  If you wanted to do the encryption
> on the server, one possible approach that jumps out at me is to write
> a small C function which receives the data, encrypts the image using a
> key sent by the client all (but not stored), and either stores the
> encrypted image back in the database via SPI or writes it out to a
> file.
>
> There are many strategies to encrypting data...first thing to think
> about is where the encryption happens, where the keys are stored, etc.

Client-side encryption is important, because with server-side
encryption, you are sending the Valuable Data across the wire (or,
even worse!) wireless in cleartext form.

It's more likely that there's a packet sniffer on the network than
an Evil DBA snooping around.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxwf7S9HxQb37XmcRAimGAJ98Kykormb63BedYknIij2xZvDgEACgw23C
eWn7JJKSs1KL9dSfVx3p/BY=
=OLl1
-----END PGP SIGNATURE-----

Re: Blobs in Postgresql

От
Shane Ambler
Дата:
Ron Johnson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 08/17/07 23:16, Merlin Moncure wrote:
>> On 8/18/07, Ron Olson <tachoknight@gmail.com> wrote:
>>> The language is Java. I've made some tests and they work very well for 25meg
>>> files....works exactly the way it should, first time. MySQL had all kinds of
>>> nasty surprises for me when I first started working with blobs, but I can
>>> say that I took my code, changed the driver, and it all works like a champ
>>> (mind you, this was a quick test app).
>>>
>>> I haven't looked at encryption at the database level....is such a thing
>>> available? I know Oracle has some form of data encryption at the database
>>> level so the nefarious DBA with the wide mustache and black brimmed hat
>>> always going "ah ha ha ha ha" can't make off with the data, but does
>>> Postgres have something similar?
>>>
>>> BTW, to put into context, the database will be designed to hold evidence
>>> (well, photos and videos of). Thus the compelling need for some security, as
>>> well as the variation in file sizes.
>> Well, my assumption was that you would encrypt the data on the client
>> side and store it that way.
>>
>> PostgreSQL has open architecture.  If you wanted to do the encryption
>> on the server, one possible approach that jumps out at me is to write
>> a small C function which receives the data, encrypts the image using a
>> key sent by the client all (but not stored), and either stores the
>> encrypted image back in the database via SPI or writes it out to a
>> file.
>>
>> There are many strategies to encrypting data...first thing to think
>> about is where the encryption happens, where the keys are stored, etc.
>
> Client-side encryption is important, because with server-side
> encryption, you are sending the Valuable Data across the wire (or,
> even worse!) wireless in cleartext form.
>
> It's more likely that there's a packet sniffer on the network than
> an Evil DBA snooping around.
>

The two options I see are -

1. the client encrypts the data and sends it to the DB

2. the client uses an SSL connection to the server to prevent snooping
and lets the DB encrypt for storage.

I would suggest looking at pgcrypto in contrib for server side encryption.

The main benefit I would see from the first is it doesn't matter if
another DB admin changes the server security settings or not. The new
guy may setup a new server and not enforce SSL connections. Of course if
the client refused non-SSL connections you can prevent that.

Either way the app provides the key to decrypt the data for viewing, so
the developers current and future must maintain the security level you
choose.

What sort of security measures are taken for viewing the data? Will each
user have a security certificate on their own USB flash drive to allow
them to view the data? which could also prevent developers from
accessing the data. Or is their password enough to allow the program to
decrypt it for them?

It would really come down to which encryption method you find easiest to
implement that provides enough security for your needs.




--

Shane Ambler
pgSQL@Sheeky.Biz

Get Sheeky @ http://Sheeky.Biz