Обсуждение: Storing images in database for web applications

Поиск
Список
Период
Сортировка

Storing images in database for web applications

От
Thom Brown
Дата:
Hi all,

I'm wondering if anyone has experience of storing and getting images to and from a database?  We currently have the problem of images being uploaded to a single gateway used by many companies, most of which run several websites.  As it stands, once they upload the image, it then has to be fsync'd to the appropriate servers (3-way in some cases) or accessed through our image proxy.

But now we wish to redesign our various image upload systems into one system which will also make images available across various sites/companies and therefore servers.  So one solution is to store images in a database.  What I'm wondering is if this is an appropriate solution?  What are the downsides?  Could we expect severe performance hits?  To help mitigate the issue of many sites continuously querying a database for images, we'd implement a caching system, so images would only be queried from the database probably around once an hour.

The benefits I can see is that images are managed in one place and accessibly easily by all servers.  The problem is putting everything in one place (all eggs in one basket), so if the server goes down, all sites lose their images once their cache has expired... unless we implemented a system that falls back to cache if connection fails, even if cache has expired.

Any suggestion?  Alternatives?

Thanks

Thom

Re: Storing images in database for web applications

От
Craig Ringer
Дата:
On 27/11/2009 7:04 PM, Thom Brown wrote:

> But now we wish to redesign our various image upload systems into one
> system which will also make images available across various
> sites/companies and therefore servers.  So one solution is to store
> images in a database.  What I'm wondering is if this is an appropriate
> solution?  What are the downsides?  Could we expect severe performance
> hits?

It depends a bit on your needs. If you need facilities provided by the
database, such as transactional updates of images and associated
metadata, then it's probably a good choice. Otherwise, you'll be paying
the costs associated with storing blobs in a database without gaining much.

There *are* costs to using a database over a network file system -
mostly increased management. It's harder to do incremental backups and
harder to work with those incremental backups (see PITR and continuous
archiving) than it is with a regular file system. Storage requirements
may be slightly increased. Deletion of images requires additional
cleanup work (VACUUM) to free the space. Etc.

With a database, you can't use the sendfile(...) system call or the like
to transmit the data, so there's more work for the system when
transmitting query results. It's also not generally transmitted as raw
binary, but hex- or octal- encoded, which can be ugly to work with. You
can request it as pure binary instead if you're using libpq directly,
though.

Why don't you use a HTTP-based image server? It's pretty low overhead,
is easily cached by commonly available tools, etc. Failing that, there
are plenty of network file systems out there (NFSv4 and CIFS come to
mind) that would do the job.

> To help mitigate the issue of many sites continuously querying a
> database for images, we'd implement a caching system, so images would
> only be queried from the database probably around once an hour.

Memcached ?

--
Craig Ringer

Re: Storing images in database for web applications

От
"Massa, Harald Armin"
Дата:
Thom,

> I'm wondering if anyone has experience of storing and getting images to and
> from a database?

Yes. For a customer I have one application running for ~8 years which
stores images (and various other BLOBS) within a PostgreSQL database.
Started with 7.something, now running on 8.3; allways used BYTEA.

It is a intranet and extranet application, so it does not get
slashdotted or reddited.

Another application is a public website, also storing HTML(fragments)
and images within a PostgreSQL database.

> The benefits I can see is that images are managed in one place and
> accessibly easily by all servers.  The problem is putting everything in one
> place (all eggs in one basket), so if the server goes down, all sites lose
> their images once their cache has expired... unless we implemented a system
> that falls back to cache if connection fails, even if cache has expired.

Your analyzes is correct. Other benefits are:

- no filesystem fuss (rights, sharing, locking, names (windows / unix)
- options for security: in my application, editing happens within a
strongly firewalled VPN. The webserver in the big, bad internet only
accesses the database; so the attack vectors are limited.
- transactional save: no problems with "partially saved images",
"images locked by whatever"
- mime information etc. can be stored alongside the bytes
- reporting options

drawbacks are:

- scaling is more challenging: static content from filesystems can be
replicated ( as you do), replicating a database is more difficult
- different performance characteristics then filesystem (search
usually faster (index), but access to image contents has more layers
to the disc)
- backups of the database grow huge, and usually the dump of a file in
the database is bigger then the file itself. (usually, because the
effects of compression are hard to judge before)
- dump / restore times grow

Best wishes,

Harald


--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality

Re: Storing images in database for web applications

От
Vick Khera
Дата:
On Fri, Nov 27, 2009 at 6:04 AM, Thom Brown <thombrown@gmail.com> wrote:
> I'm wondering if anyone has experience of storing and getting images to and
> from a database?  We currently have the problem of images being uploaded to
> a single gateway used by many companies, most of which run several
> websites.  As it stands, once they upload the image, it then has to be
> fsync'd to the appropriate servers (3-way in some cases) or accessed through
> our image proxy.

For our customer image hosting service, we store the master copy of
everything in the database, and have an external process that pushes
the (scaled) images to the front-end delivery service.  Currently this
is amazon's cloudfront service.  The URLs that our customers can
use/give out are all pointing to the cloudfront, which is wicked fast
and scalable.

We do not directly serve the images from the DB.  That would just not scale.

What I'd do is put a shim layer in front of the http service that
gives out the image and looks for it in a cache.  If not found,
fetches it from the DB then every subsequent request gets it from the
local cache on the http server.  You could probably implement this
shim as a 'file not found' error handler in apache, so there would be
no overhead on a cache hit.

Re: Storing images in database for web applications

От
Jasen Betts
Дата:
On 2009-11-27, Thom Brown <thombrown@gmail.com> wrote:
> --0016e659f44c2bea2504795842a7
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi all,
>
> I'm wondering if anyone has experience of storing and getting images to and
> from a database?  We currently have the problem of images being uploaded to
> a single gateway used by many companies, most of which run several
> websites.  As it stands, once they upload the image, it then has to be
> fsync'd to the appropriate servers (3-way in some cases) or accessed through
> our image proxy.
>
> But now we wish to redesign our various image upload systems into one system
> which will also make images available across various sites/companies and
> therefore servers.  So one solution is to store images in a database.  What
> I'm wondering is if this is an appropriate solution?  What are the
> downsides?  Could we expect severe performance hits?  To help mitigate the
> issue of many sites continuously querying a database for images, we'd
> implement a caching system, so images would only be queried from the
> database probably around once an hour.
>
> The benefits I can see is that images are managed in one place and
> accessibly easily by all servers.  The problem is putting everything in one
> place (all eggs in one basket), so if the server goes down, all sites lose
> their images once their cache has expired... unless we implemented a system
> that falls back to cache if connection fails, even if cache has expired.
>
> Any suggestion?  Alternatives?

some sort of broadcasting to share the images as when they are
uploaded, perhaps something based on NNTP or email.