Re: Let's make PostgreSQL multi-threaded

From	Greg Stark
Subject	Re: Let's make PostgreSQL multi-threaded
Date	6 June 2023, 23:14:41
Msg-id	CAM-w4HPne2ab_ppKO6xSY+gyrczMu7CnFzggP+4mXqD1ctjh-A@mail.gmail.com
In reply to	Let's make PostgreSQL multi-threaded (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: Let's make PostgreSQL multi-threaded
List	pgsql-hackers


On Mon, 5 Jun 2023 at 10:52, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
> so that the whole server runs in a single process, with multiple
> threads. It has been discussed many times in the past, last thread on
> pgsql-hackers was back in 2017 when Konstantin made some experiments [0].
>
> I feel that there is now pretty strong consensus that it would be a good
> thing, more so than before. Lots of work to get there, and lots of
> details to be hashed out, but no objections to the idea at a high level.
>
> The purpose of this email is to make that silent consensus explicit. If
> you have objections to switching from the current multi-process
> architecture to a single-process, multi-threaded architecture, please
> speak up.

I suppose I should reiterate the comments I made at the time. I'm
not sure they qualify as "objections", but they are a general
concern.

I think of processes and threads as fundamentally the same thing with
a slightly different API: in one, memory is unshared by default and
must be explicitly shared; in the other, it is shared by default and
must be explicitly unshared. There are obvious practical API
differences too, such as how signals are handled, but those are just
implementation details.
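
To make the "default" distinction concrete, here is a minimal sketch in
generic POSIX C. It is not PostgreSQL code, and the function names are
made up for illustration. A forked child writes to ordinary memory (the
parent does not see it), then to an explicitly shared anonymous mapping
(the parent does), and finally a thread writes to ordinary memory (the
creator sees it without any setup).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes 42 to *slot, then return
 * the value the parent observes afterwards (-1 on error). */
static int write_from_child(int *slot)
{
    assert(slot != NULL);
    *slot = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *slot = 42;             /* child writes through its own view */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return *slot;
}

/* Processes, default case: ordinary memory is private to each process
 * after fork(), so the child's write is invisible to the parent. */
int process_private_demo(void)
{
    int local = 0;
    return write_from_child(&local);
}

/* Processes, explicit sharing: the mapping must be requested as shared
 * before the fork for both sides to see the same memory. */
int process_shared_demo(void)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    int seen = write_from_child(shared);
    munmap(shared, sizeof *shared);
    return seen;
}

static void *thread_writer(void *arg)
{
    *(int *) arg = 42;          /* same address space as the creator */
    return NULL;
}

/* Threads, default case: even the creator's stack variable is reachable
 * from the new thread, with no sharing step at all. */
int thread_demo(void)
{
    int local = 0;
    pthread_t t;

    if (pthread_create(&t, NULL, thread_writer, &local) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return local;
}
```

Under these assumptions, process_private_demo() yields 0 while
process_shared_demo() and thread_demo() both yield 42. Only the middle
case required the code to say in advance which memory is shared.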

So the question is whether defaulting to shared memory or defaulting
to unshared memory is better -- and whether the implementation details
are significant enough to override that.

My general concern was that, in my experience, default-shared memory
leads to hugely complex and chaotic shared data structures, often with
very loose rules about who owns shared data and who is responsible
for making updates, handling errors, or releasing resources.

So, all else being equal, I feel that having good infrastructure for
explicitly allocating and managing shared memory segments is
superior.

However, all else is not equal. The discussion in the hallway turned to
whether we could just use pthread primitives like mutexes and
condition variables instead of our own locks, and the point was
raised that those libraries assume these objects will be used by
threads of one process, not shared across completely different
processes.
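
As a rough illustration of that default, here is a generic POSIX C
sketch, again not PostgreSQL code, with hypothetical struct and function
names. A pthread mutex initialised with default attributes is
process-private (PTHREAD_PROCESS_PRIVATE). Using one across forked
processes means placing it in shared memory and explicitly opting in
with PTHREAD_PROCESS_SHARED. The sketch only shows the mechanics of that
opt-in. It says nothing about whether the option is portable or robust
enough for a database server, for example when a process dies while
holding the lock.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state: a lock and the counter it protects. */
struct shared_state {
    pthread_mutex_t lock;
    long counter;
};

/* Fork nchildren processes that each increment a shared counter iters
 * times under a process-shared mutex. Returns the final count, or -1 on
 * error. */
long shared_counter_demo(int nchildren, int iters)
{
    assert(nchildren >= 0 && iters >= 0);

    /* The mutex itself must live in memory that every process maps. */
    struct shared_state *st = mmap(NULL, sizeof *st,
                                   PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(st, sizeof *st);
        return -1;
    }
    /* Explicit opt-in; without this the mutex is process-private. */
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutex_init(&st->lock, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(st, sizeof *st);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);
    st->counter = 0;

    int started = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (int j = 0; j < iters; j++) {
                pthread_mutex_lock(&st->lock);
                st->counter++;
                pthread_mutex_unlock(&st->lock);
            }
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was actually started. */
    for (int i = 0; i < started; i++)
        wait(NULL);

    long result = (started == nchildren) ? st->counter : -1;
    pthread_mutex_destroy(&st->lock);
    munmap(st, sizeof *st);
    return result;
}
```

With four children and 10000 iterations each, the call
shared_counter_demo(4, 10000) should return 40000 if the lock is doing
its job across processes.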

And that's probably not the only library we're stuck reimplementing
because of this. So the question is: are these things worth the risk
of having data structures shared implicitly, with unclear ownership
rules?

I was going to say that supporting both modes relieves that fear,
since it would force that extra discipline and allow testing under the
more restrictive rule. However, I don't think that will actually work.
As long as we support both modes, we lose all the advantages of
threads. We still wouldn't be able to use pthreads and would still need
to provide and maintain our homegrown replacement infrastructure.

-- 
greg
