Re: Postgres for a "data warehouse", 5-10 TB

From: Igor Chudov
Subject: Re: Postgres for a "data warehouse", 5-10 TB
Date:
Msg-id: CAMhtkAah2c4XfSec=OtgL1V51wpD=jygbZnBVBRYCV02MscebQ@mail.gmail.com
In response to: Re: Postgres for a "data warehouse", 5-10 TB  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses: Re: Postgres for a "data warehouse", 5-10 TB  (Claudio Freire <klaussfreire@gmail.com>)
           Re: Postgres for a "data warehouse", 5-10 TB  (Andy Colson <andy@squeakycode.net>)
List: pgsql-performance


On Sun, Sep 11, 2011 at 7:52 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Sun, Sep 11, 2011 at 6:35 AM, Igor Chudov <ichudov@gmail.com> wrote:
> I have a server with about 18 TB of storage and 48 GB of RAM, and 12
> CPU cores.

1 or 2 fast cores is plenty for what you're doing.

I need those cores for other tasks, such as image manipulation with ImageMagick, XML generation and parsing, etc.
 
 But the drive
array and how it's configured etc are very important.  There's a huge
difference between 10 2TB 7200RPM SATA drives in a software RAID-5 and
36 500G 15kRPM SAS drives in a RAID-10 (SW or HW would both be ok for
data warehouse.)

Well, right now, my server has twelve 7,200 RPM 2TB hard drives in a RAID-6 configuration.

They are managed by a 3ware 9750 RAID card.
 
I would say that I am not very concerned about the linear relationship between read speed and disk speed. If that part is somewhat slow, it is OK with me.

What I want to avoid is severe degradation of performance as the data grows (per-operation time complexity worse than O(1) with respect to table size), disastrous REPAIR TABLE operations, etc.


> I do not know much about Postgres, but I am very eager to learn and
> see if I can use it for my purposes more effectively than MySQL.
> I cannot shell out $47,000 per CPU for Oracle for this project.
> To be more specific, the batch queries that I would do, I hope,

Hopefully, if need be, you can spend some small percentage of that on a fast IO subsystem.



I am actually open for suggestions here.
 
> would either use small JOINS of a small dataset to a large dataset, or
> just SELECTS from one big table.
> So... Can Postgres support a 5-10 TB database with the use pattern
> stated above?

I use it on a ~3TB DB and it works well enough.  Fast IO is the key
here.  Lots of drives in RAID-10 or HW RAID-6 if you don't do a lot of
random writing.

I do not plan to do a lot of random writing. My current design is that my Perl scripts write to a temporary table every week, and then I do INSERT ... ON DUPLICATE KEY UPDATE.

By the way, does that INSERT ... UPDATE functionality, or something like it, exist in Postgres?
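[For reference: PostgreSQL did eventually gain a native equivalent, INSERT ... ON CONFLICT ... DO UPDATE, in version 9.5. A minimal sketch of the weekly staging-table upsert described above, using hypothetical table and column names for illustration:]

```sql
-- Hypothetical target and staging tables for illustration:
CREATE TABLE items (
    id      bigint PRIMARY KEY,
    payload text
);

-- Rough equivalent of MySQL's INSERT ... ON DUPLICATE KEY UPDATE
-- (PostgreSQL 9.5+); staging_items stands in for the weekly temp table:
INSERT INTO items (id, payload)
SELECT id, payload
FROM staging_items
ON CONFLICT (id)
DO UPDATE SET payload = EXCLUDED.payload;
```

The ON CONFLICT target must match a unique index or primary key, and EXCLUDED refers to the row that was proposed for insertion.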

i
