Re: PG performance in high volume environment (many INSERTs and lots of aggregation reporting)

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: PG performance in high volume environment (many INSERTs and lots of aggregation reporting)
Дата
Msg-id 603c8f070901280618u3b20da3ey1edbd91e83d7bad2@mail.gmail.com
обсуждение исходный текст
Ответ на PG performance in high volume environment (many INSERTs and lots of aggregation reporting)  (Phoenix Kiula <phoenix.kiula@gmail.com>)
Список pgsql-performance
> My question: with that kind of volume and the underlying aggregation
> functions (by product id, dates, possibly IP addresses or at least
> countries of origin..) will PG ever be a good choice? Or should I be
> looking at some other kind of tools? I wonder if OLAP tools would be
> overkill for something that needs to look like a barebones version of
> google analytics limited to our site..

Some other databases might have an optimization that makes this much
faster that it would ordinarily be.

select count(*) from table;

But I don't think anyone has an optimization that makes this fast:

select column, count(*) from table group by 1;

How do you expect the database to get this information other than be
reading the whole table and counting up the number of occurrences of
each value?  I guess an OLAP cube might precompute all the answers for
you, but I don't think MySQL is going to do that.

One option is to write a script that runs in the background and
updates all your statistics every 10 minutes or so, dumping the
results into separate (and smaller) tables that you can query quickly.

Another option (which is probably what I would do for really high
volume logging of web traffic) is to write your log records to a flat
file and then postprocess them with perl or something and load the
summary statistics into your database later.  PostgreSQL is really
fast, but nothing is as fast as writing to a flatfile.

...Robert

В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: LIKE Query performance
Следующее
От: Richard Huxton
Дата:
Сообщение: Re: PG performance in high volume environment (many INSERTs and lots of aggregation reporting)