Re: Buffer Manager and Contention

Поиск

Список

Период

Сортировка

От	Kyotaro Horiguchi
Тема	Re: Buffer Manager and Contention
Дата	25 февраля 2022 г. 04:20:25
Msg-id	20220225.102025.173579582113640633.horikyota.ntt@gmail.com обсуждение исходный текст
Ответ на	Buffer Manager and Contention (Simon Riggs <simon.riggs@enterprisedb.com>)
Ответы	Re: Buffer Manager and Contention (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) Re: Buffer Manager and Contention (Simon Riggs <simon.riggs@enterprisedb.com>)
Список	pgsql-hackers

Дерево обсуждения

(I added Yura, as the author of a related patch)

At Thu, 24 Feb 2022 12:58:23 +0000, Simon Riggs <simon.riggs@enterprisedb.com> wrote in 
> Thinking about poor performance in the case where the data fits in
> RAM, but the working set is too big for shared_buffers, I notice a
> couple of things that seem bad in BufMgr, but don't understand why
> they are like that.
> 
> 1. If we need to allocate a buffer to a new block we do this in one
> step, while holding both partition locks for the old and the new tag.
> Surely it would cause less contention to make the old block/tag
> invalid (after flushing), drop the old partition lock and then switch
> to the new one? i.e. just hold one mapping partition lock at a time.
> Is there a specific reason we do it this way?

I'm not sure but I guess the developer wanted to make the operation
atomic.

Yura Sokolov is proposing a patch to separte the two partition locks.

https://www.postgresql.org/message-id/1edbb61981fe1d99c3f20e3d56d6c88999f4227c.camel%40postgrespro.ru

And it seems to me viable for me and a benchmarking in the thread
showed a good result.  I'd appreciate your input on that thread.

> 2. Possibly connected to the above, we issue BufTableInsert() BEFORE
> we issue BufTableDelete(). That means we need extra entries in the
> buffer mapping hash table to allow us to hold both the old and the new
> at the same time, for a short period. The way dynahash.c works, we try

Yes.

> to allocate an entry from the freelist and if that doesn't work, we
> begin searching ALL the freelists for free entries to steal. So if we
> get enough people trying to do virtual I/O at the same time, then we
> will hit a "freelist storm" where everybody is chasing the last few
> entries. It would make more sense if we could do BufTableDelete()

To reduce that overhead, Yura proposed a surgically modification on
dynahash, but I didn't like that and the latest patch doesn't contain
that part.

> first, then hold onto the buffer mapping entry rather than add it to
> the freelist, so we can use it again when we do BufTableInsert() very
> shortly afterwards. That way we wouldn't need to search the freelist
> at all. What is the benefit or reason of doing the Delete after the
> Insert?

Hmm. something like hash_swap_key() or hash_reinsert_entry()?  That
sounds reasonable. (Yura's proposal was taking out an entry from hash
then attach it with a new key again.)

> Put that another way, it looks like BufTable functions are used in a
> way that mismatches against the way dynahash is designed.
> 
> Thoughts?

On the first part, I think Yura's patch works.  On the second point,
Yura already showed it gives a certain amount of gain if we do that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Justin Pryzby
Дата: 25 февраля 2022 г., 04:17:04
Сообщение: Re: set TESTDIR from perl rather than Makefile

Следующее

От: Tom Lane
Дата: 25 февраля 2022 г., 04:44:57
Сообщение: Re: ltree_gist indexes broken after pg_upgrade from 12 to 13

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Buffer Manager and Contention

Предыдущее

Следующее