Thread: Very large database

Very large database

From:
Michael Welter
Date:
I need some help here.  We need to implement a 180+GB database with
120+MB of updates every evening.  Rather than purchasing the big iron,
we would like to use postgres running over Linux 2.4.x as the data
server.  Is this even possible?  Who has the largest database out there
and what does it run on?

How should we implement the disk array?  Should we purchase a hardware
RAID card or should we use the software RAID capabilities in Linux 2.4?
Should we consider an SMP system?  Should we use an outboard RAID box
(like RaidZone)?

If anyone out there has implemented a database of this size then I would
like to correspond with you.

Thanks for your help,
Mike




Re: Very large database

From:
Chris Albertson
Date:
Don't expect PostgreSQL to outperform Oracle.  If Oracle needs
"big iron", so will PostgreSQL.  I've done some testing and
found that it's the number of transactions that matters more
than the amount of data being loaded.  Our application was
astronomy; I would batch load a few nights of observational data
every few days.  If all you are doing is loading data, you can
use COPY and it will move fast: just a few minutes on even a
low-end machine.  But if that 120MB is one million INSERTs,
each with lots of processing, constraint checks, index updates
and so on, then you will need some high-end hardware to finish
in only 24 hours.  I wrote my application twice.  The first
version took __days__ to complete a run.  My second version was
100x faster.  I did much of the processing outside of the DBMS
in standard C and then just COPYed the data in.
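
The "process outside the DBMS, then COPY" approach can be sketched
roughly as follows.  This is only an illustration in Python, not the
original C program: the row layout (star id, magnitude, note) and the
validity check are hypothetical.  The escaping follows the default text
format that COPY reads (tab-separated fields, \N for NULL).

```python
import io

def copy_escape(value):
    """Render one field in COPY's default text format."""
    if value is None:
        return r"\N"  # default NULL marker
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    text = text.replace("\\", "\\\\")
    text = text.replace("\t", "\\t").replace("\n", "\\n").replace("\r", "\\r")
    return text

def write_copy_file(rows, out):
    """Check rows in the client and write the good ones as COPY lines.

    Returns a (written, rejected) tuple.
    """
    written = rejected = 0
    for star_id, magnitude, note in rows:
        # Hypothetical sanity check, done here instead of once per INSERT.
        if star_id is None or magnitude is None or not (-30.0 <= magnitude <= 40.0):
            rejected += 1
            continue
        fields = (star_id, magnitude, note)
        out.write("\t".join(copy_escape(f) for f in fields) + "\n")
        written += 1
    return written, rejected

# Made-up sample rows, for illustration only.
rows = [
    (1, 12.5, "clear sky"),
    (2, 99.0, "bad reading"),   # fails the range check
    (3, 7.25, None),            # NULL note
    (4, 3.0, "tab\there"),      # embedded tab must be escaped
]
buf = io.StringIO()
counts = write_copy_file(rows, buf)
print(counts)
print(buf.getvalue(), end="")
```

The resulting file could then be loaded in one statement with COPY (or
with \copy from psql) into a table with matching columns, rather than
with one INSERT per row.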

So, the answer depends on what you need to do.  Simply inputting
that much data is easy.

Also, how will it be used once it is in the database?  Do you
have many active users looking at it?  What kind of searches are
they doing?

In any case, SCSI drives are the way to go.  Get a stack of them
with a couple of online spares.  That and LOTS of RAM: 1GB as a
minimum.

Solaris has very good RAID support built in.  I think better
than Linux's.  Both OSes are free although Solaris 8 will be the
last PC version.  Prototype your application with faked data,
then try a test where you pull out the power connection on a drive
while the DBMS is updating data.  Pulling the power should have
NO effect if the RAID is set up right.  Solaris found my spare drive
and swapped it in automatically.  Do this a few times before you
depend on it.  Likely any of Solaris, Linux or BSD would work and
pass this test.
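
For the Linux software RAID case, one way to confirm what the array did
after a drive-pull test is to read /proc/mdstat, where each md array
reports member state as a string such as [UUU_] (an underscore marks a
missing member).  The sketch below assumes that layout; the sample text
is made up for illustration and is not output from a real machine.

```python
import re

def degraded_arrays(mdstat_text):
    """Return {array_name: member_flags} for arrays with a missing member."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends with flags such as [UU] or [UUU_].
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if current and status:
            flags = status.group(1)
            if "_" in flags:
                result[current] = flags
            current = None
    return result

# Illustrative sample only (hypothetical devices and sizes).
SAMPLE = """\
Personalities : [raid1] [raid5]
md0 : active raid5 sdd1[3](F) sdc1[2] sdb1[1] sda1[0]
      215061888 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
md1 : active raid1 sdf1[1] sde1[0]
      71681920 blocks [2/2] [UU]
unused devices: <none>
"""

found = degraded_arrays(SAMPLE)
print(found)
```

On a real system the text would come from open("/proc/mdstat").read();
an empty result after the rebuild finishes suggests the spare was
swapped in.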

The big question is the transaction rate; table size is the second
question.
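
To put a number on the transaction rate: one million rows in 24 hours
works out to about 11.6 rows per second, sustained.  A small sketch of
that arithmetic follows.  The 8-hour window and the 5 rows/s measured
rate are hypothetical values chosen only to show how the figures move.

```python
def required_rate(rows, window_hours):
    """Average rows per second needed to finish `rows` inside the window."""
    return rows / (window_hours * 3600.0)

def run_time_hours(rows, rows_per_second):
    """Hours a run takes at a given sustained rate."""
    return rows / rows_per_second / 3600.0

ROWS = 1_000_000  # "one million INSERTs" from the message above

rate_24h = required_rate(ROWS, 24)
rate_8h = required_rate(ROWS, 8)      # hypothetical overnight window
slow_run = run_time_hours(ROWS, 5)    # hypothetical measured rate of 5 rows/s

print(f"24 h window: {rate_24h:.1f} rows/s")
print(f" 8 h window: {rate_8h:.1f} rows/s")
print(f"at 5 rows/s: {slow_run:.1f} hours")
```

Measuring the real per-row rate on a prototype and plugging it into
run_time_hours shows quickly whether a nightly run fits its window.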



=====
Chris Albertson
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org


Re: Very large database

From:
Steve Crawford
Date:
Not enough info. How many tables? Is the nightly run a bulk insert or update
of data or more complicated than that? What sort of queries (quantity and
complexity) is the database supposed to handle and what is the acceptable
performance (how many simultaneous users, how many queries per second and
what is the acceptable response time to a query)?

Other things being equal, hardware RAID beats software RAID as you will keep
the processor free for your programs. As a general rule, more spindles is
better, more memory is better but the specifics of your project will point to
the area of maximum benefit.

If lots of queries will hit the same data, cache memory on the RAID card or
external RAID subsystem will help. If you have lots of scattered writes, a
RAID with a battery-backed cache that can safely optimize disk writes (i.e.,
writes don't have to be sent to disk right away to protect them - they can be
made when the disks are available) will help.

In other words, depending on what you are trying to do, you may need anything
from a couple of 100GB IDE drives in your Linux box to an external Winchester
Systems Flash Disk.

I can't speak with authority on the SMP issue but have run across items in
the newsgroups that indicate that SMP performance in Postgresql needs work
and you may be better off with a screaming single CPU machine. Don't overlook
the effects of the on-chip cache size, bus and memory speeds.

Given that you can get four 70+GB IDE drives for not a huge investment, I'd
start there and make a machine with software RAID. Do some testing and
development in that environment and use the tools available to see if your
bottlenecks seem more influenced by disk IO, memory, CPU or just what.
Develop, test, experiment and you will be in a much better position to spec a
production system.
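
As a rough sizing check for the four-drive test box, here is a small
sketch of usable capacity under the common RAID levels.  It uses the
standard capacity formulas and the lower-bound figures from this thread
(70GB drives, 180GB of data), and it ignores filesystem overhead,
indexes, logs and temporary space, so treat the results as optimistic.

```python
def usable_gb(level, drives, size_gb):
    """Usable capacity in GB for equal-sized drives at a given RAID level."""
    if level == "raid0":              # striping, no redundancy
        return drives * size_gb
    if level == "raid1":              # every drive holds the same data
        return size_gb
    if level == "raid5":              # one drive's worth of parity
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb
    if level == "raid10":             # striped mirror pairs
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return drives // 2 * size_gb
    raise ValueError("unknown level: " + level)

DRIVES, SIZE_GB, DATA_GB = 4, 70, 180  # lower bounds taken from the thread

report = {}
for level in ("raid0", "raid5", "raid10"):
    capacity = usable_gb(level, DRIVES, SIZE_GB)
    report[level] = (capacity, capacity >= DATA_GB)
    print(f"{level:7s} {capacity:4d} GB  fits 180 GB: {capacity >= DATA_GB}")
```

With these figures RAID 5 leaves only about 30GB of headroom and RAID 10
does not hold the data at all, so the four-drive box is better suited to
testing with a subset than to the full production load.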

-Steve

On Tuesday 08 January 2002 18:34, Michael Welter wrote:
> I need some help here.  We need to implement a 180+GB database with
> 120+MB of updates every evening.  Rather than purchasing the big iron,
> we would like to use postgres running over Linux 2.4.x as the data
> server.  Is this even possible?  Who has the largest database out there
> and what does it run on?
>
> How should we implement the disk array?  Should we purchase a hardware
> RAID card or should we use the software RAID capabilities in Linux 2.4?
>   Should we consider a SMP system?  Should we use an outboard RAID box
> (like RaidZone)?
>
> If anyone out there has implemented a database of this size then I would
> like to correspond with you.
>
> Thanks for your help,
> Mike

Re: Very large database

From:
Justin Clift
Date:
Hi Chris,

Chris Albertson wrote:
>
<snip>
> Solaris has very good RAID support built in.  I think better
> than Linux's.  Both OSes are free although Solaris 8 will be the
> last PC version.

Where did you hear that Solaris 8 will be the last PC version?  By "PC",
do you mean "Intel Platform" version?

???

I half downloaded the "Intel Platform" cd's for the Solaris 9 Early
Access program about a month ago, and now when I go to the Sun site the
rest are not available.

This makes me greatly concerned.  :(

Regards and best wishes,

Justin Clift



--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi

Re: Very large database

From:
Doug McNaught
Date:
Justin Clift <justin@postgresql.org> writes:

> Hi Chris,
>
> Chris Albertson wrote:
> >
> <snip>
> > Solaris has very good RAID support built in.  I think better
> > than Linux's.  Both OSes are free although Solaris 8 will be the
> > last PC version.
>
> Where did you hear that Solaris 8 will be the last PC version?  By "PC",
> do you mean "Intel Platform" version?

Yes.  Sun dropped all support for x86.  Slashdot covered it a few days
ago.

> I half downloaded the "Intel Platform" cd's for the Solaris 9 Early
> Access program about a month ago, and now when I go to the Sun site the
> rest are not available.
>
> This makes me greatly concerned.  :(

Good reason to use Free operating systems.  ;)

-Doug
--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863

Re: Very large database - Now OT

From:
Justin Clift
Date:
Personally, I find the concept of Sun:

a - Spending large investments of time and energy to make
    their products work both on Intel and Sparc
b - Allowing and encouraging users to download Solaris
    Intel and Sparc for free, and quite a few products and
    extensions for them
c - Recognising and then publishing that the vast majority
    of people who downloaded Solaris for free were
    downloading the Intel version

...quite braindamaged, when they then decide to cease the Intel
version product line.  All those people who downloaded the Intel
version (the vast majority of 1.5 million people apparently)
and thought it was good will remember this.

Seems quite a lot of good effort to have expanded the market,
then to just abandon it like that will leave a bad impression.
Not just scaled down.  Abandoned.  Gone.  Morte.  Kaput.

Personally, being a Solaris specialist (and using Sun OS's
since '93) I feel quite let down.  I've got a few PC's around
running Solaris Intel for various things, so I'm unhappy.

As they can pull the plug on an entire OS architecture this easily,
even if they do bring it back at some point I'm not going to
feel they're very trustworthy to not do it again.

My recommending Solaris as a platform just stopped.  :(

Yep Doug.  It's a good reason to use Free Operating Systems.

+ Justin

