Re: Guidance Requested - Bulk Inserting + Queries

Поиск
Список
Период
Сортировка
От Leonardo Francalanci
Тема Re: Guidance Requested - Bulk Inserting + Queries
Дата
Msg-id 1322669853.10239.YahooMailNeo@web29003.mail.ird.yahoo.com
обсуждение исходный текст
Ответ на Guidance Requested - Bulk Inserting + Queries  (Benjamin Johnson <benjamin.johnson@getcarbonblack.com>)
Ответы Re: Guidance Requested - Bulk Inserting + Queries  (Benjamin Johnson <benjamin.johnson@getcarbonblack.com>)
Список pgsql-performance
> We now found (thanks Andres and Snow-Man in #postgresql) that in our
> tests, after the indexes get too large performance drops signficantly
> and our system limps forward due to  disk reads (presumably for the
> indexes).  If we remove the indexes, performance for our entire sample
> test is great and everything is written to postgresql very quickly. 
It's usually the fact that the data you index is "random" as opposed to,
say, an always incremented value (could be a timestamp, or a sequence)
that leads to insert problems with btrees. 
> My question is, what possible routes can I take where we can have both
> fast inserts (with indexes removed until the end of the day), but still
> allow a user to query against today's data? Is this even possible?  One
> idea would be possibly have hourly tables for today and as soon as we
> can try to re-add indexes. 
Yep, that's the only way I've found: use smaller partitions. That leads
to slower reads (due to the fact that you have to visit more indexes to
read the same amount of data). But you'll get faster writes. 

> Another possible solution might be to stream
> the data to another "reader" postgres instance that has indexes,
> although I'm not very versed in replication. 
I don't think you can do that. 
Another option that you have is to use ssd instead of HD for the indexes
only (that is, having the indexes in a separate tablespace of ssds). The
problem is that your disks usually can't keep up with the number of
random writes it takes to update N "random values" btrees; ssd might help. 
Can you post some numbers, such as # of indexes, # of rows you're trying
to insert per hour etc etc? 



В списке pgsql-performance по дате отправления:

Предыдущее
От: Benjamin Johnson
Дата:
Сообщение: Guidance Requested - Bulk Inserting + Queries
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Query planner suggestion, for indexes with similar but not exact ordering.