Re: When to use PARTITION BY HASH?

Поиск
Список
Период
Сортировка
От Oleksandr Shulgin
Тема Re: When to use PARTITION BY HASH?
Дата
Msg-id CACACo5RxgQBO_cB97yFkWwLo_-Vs3WtUei2KszGg2LkJ95uiXA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: When to use PARTITION BY HASH?  (Michel Pelletier <pelletier.michel@gmail.com>)
Ответы Re: When to use PARTITION BY HASH?  (Jeff Janes <jeff.janes@gmail.com>)
Re: When to use PARTITION BY HASH?  (Michel Pelletier <pelletier.michel@gmail.com>)
Список pgsql-general
(sticking to pgsql-general)

On Tue, Jun 2, 2020 at 7:45 PM Michel Pelletier <pelletier.michel@gmail.com> wrote:

On Tue, Jun 2, 2020 at 10:17 AM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:

I was reading up on declarative partitioning[1] and I'm not sure what could be a possible application of Hash partitioning.

Is anyone actually using it?  What are typical use cases?  What benefits does such a partitioning scheme provide?

On its face, it seems that it can only give you a number of tables which are smaller than the un-partitioned one, but I fail to see how it would provide any of the potential advantages listed in the documentation.

 
From my point of view, hash partitioning is very useful for spreading out high insert/update load.

Do you also assign the partitions to different tablespaces as you've hinted below or do you see performance improvement from partitioning alone?  How does that work?  Does it give better  results than using a RAID to spread the disk IO, for example?

Yes its' true you end up with more smaller tables than one big large one, but remember the indexes are (often) tree data structures.  Smaller trees are faster than bigger trees.  By making the indexes smaller they are faster.  Since the planner can knows to only examine the specific index it needs, this ends up being a lot faster.

That sounds logical, but can it be demonstrated?  If the index(es) fit in memory fully, it doesn't make a measurable difference, I guess?

With hash partitioning you are not expected, in general, to end up with a small number of partitions being accessed more heavily than the rest.  So your indexes will also not fit into memory.

I have the feeling that using a hash function to distribute rows simply contradicts the basic assumption of when you would think of partitioning your table at all: that is to make sure the most active part of the table and indexes is small enough to be cached in memory.

Regards,
--
Alex

В списке pgsql-general по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: Replication conflicts despite hot_standby_feedback = on?
Следующее
От: Laurenz Albe
Дата:
Сообщение: Re: Replication conflicts despite hot_standby_feedback = on?