Re: 600 million rows of data. Bad hardware or need partitioning?

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: 600 million rows of data. Bad hardware or need partitioning?
Дата
Msg-id 20200506013741.GH28974@telsasoft.com
обсуждение исходный текст
Ответ на Re: 600 million rows of data. Bad hardware or need partitioning?  (Arya F <arya6000@gmail.com>)
Ответы Re: 600 million rows of data. Bad hardware or need partitioning?  (Arya F <arya6000@gmail.com>)
Список pgsql-performance
On Tue, May 05, 2020 at 08:31:29PM -0400, Arya F wrote:
> On Mon, May 4, 2020 at 5:21 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> 
> > I mentioned in February and March that you should plan to set shared_buffers
> > to fit the indexes currently being updated.
> 
> The following command gives me
> 
> select pg_size_pretty (pg_indexes_size('test_table'));
>  pg_size_pretty >  5216 MB
> 
> So right now, the indexes on that table are taking about 5.2 GB, if a
> machine has 512 GB of RAM and SSDs, is it safe to assume I can achieve
> the same update that takes 1.5 minutes in less than 5 seconds while
> having 600 million rows of data without partitioning?

I am not prepared to guarantee server performance..

But, to my knowledge, you haven't configured shared_buffers at all.  Which I
think might be the single most important thing to configure for loading speed
(with indexes).

Couple months ago, you said your server had 4GB RAM, which isn't much, but if
shared_buffers is ~100MB, I think that deserves attention.

If you get good performance with a million rows and 32MB buffers, then you
could reasonably hope to get good performance (at least initially) with
100million rows and 320MB buffers.  Scale that up to whatever you expect your
index size to be.  Be conservative since you may need to add indexes later, and
you can expect they'll become bloated, so you may want to run a reindex job.

shared_buffers is frequently set to ~25% of RAM, and if you need to efficiently
use indexes larger than what that supports, then you should add RAM, or
implement partitioning.

-- 
Justin



В списке pgsql-performance по дате отправления:

Предыдущее
От: Arya F
Дата:
Сообщение: Re: 600 million rows of data. Bad hardware or need partitioning?
Следующее
От: Steve Pritchard
Дата:
Сообщение: Inaccurate Rows estimate for "Bitmap And" causes Planner to choosewrong join