Re: Let's make PostgreSQL multi-threaded
From | Greg Stark |
---|---|
Subject | Re: Let's make PostgreSQL multi-threaded |
Date | |
Msg-id | CAM-w4HPne2ab_ppKO6xSY+gyrczMu7CnFzggP+4mXqD1ctjh-A@mail.gmail.com |
In reply to | Let's make PostgreSQL multi-threaded (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: Let's make PostgreSQL multi-threaded |
List | pgsql-hackers |
On Mon, 5 Jun 2023 at 10:52, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
> so that the whole server runs in a single process, with multiple
> threads. It has been discussed many times in the past, last thread on
> pgsql-hackers was back in 2017 when Konstantin made some experiments [0].
>
> I feel that there is now pretty strong consensus that it would be a good
> thing, more so than before. Lots of work to get there, and lots of
> details to be hashed out, but no objections to the idea at a high level.
>
> The purpose of this email is to make that silent consensus explicit. If
> you have objections to switching from the current multi-process
> architecture to a single-process, multi-threaded architecture, please
> speak up.

I suppose I should reiterate my comments that I gave at the time. I'm not sure they qualify as "objections" but they're some kind of general concern.

I think of processes and threads as fundamentally the same things, just a slightly different API -- namely that in one memory is by default unshared and needs to be explicitly shared and in the other it's default shared and needs to be explicitly unshared. There are obvious practical API differences too like how signals are handled but those are just implementation details.

So the question is whether defaulting to shared memory or defaulting to unshared memory is better -- and whether the implementation details are significant enough to override that.

And my general concern was that in my experience default shared memory leads to hugely complex and chaotic shared data structures with often very loose rules for ownership of shared data and who is responsible for making updates, handling errors, or releasing resources.

So all else equal I feel like having a good infrastructure for explicitly allocating shared memory segments and managing them is superior.

However all else is not equal.
The discussion in the hallway turned to whether we could just use pthread primitives like mutexes and condition variables instead of our own locks -- and the point was raised that those libraries assume these objects will be in threads of one process not shared across completely different processes. And that's probably not the only library we're stuck reimplementing because of this.

So the question is are these things worth taking the risk of having data structures shared implicitly and having unclear ownership rules?

I was going to say supporting both modes relieves that fear since it would force that extra discipline and allow testing under the more restrictive rule. However I don't think that will actually work. As long as we support both modes we lose all the advantages of threads. We still wouldn't be able to use pthreads and would still need to provide and maintain our homegrown replacement infrastructure.

--
greg
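
The distinction drawn in the message above (memory unshared by default for processes, shared by default for threads) can be illustrated with a small sketch. This is not PostgreSQL code: it is a POSIX-only Python illustration, and the names `counter`, `bump_global`, `run_in_child_process`, and `demo` are hypothetical helpers invented for the example. It assumes `os.fork` is available and uses an anonymous `mmap` region as a stand-in for an explicitly allocated shared memory segment.

```python
import mmap
import os
import struct
import threading

# Ordinary global: each process gets its own copy after fork(),
# while all threads inside one process see the same variable.
counter = 0


def bump_global():
    global counter
    counter += 1


def run_in_child_process(fn):
    """Fork, run fn in the child, and wait for the child to exit (POSIX only)."""
    pid = os.fork()
    if pid == 0:
        try:
            fn()
        finally:
            os._exit(0)  # exit the child immediately, skipping the parent's cleanup
    os.waitpid(pid, 0)


def demo():
    global counter
    counter = 0

    # Explicitly shared region: an anonymous mapping (MAP_SHARED is the
    # default on Unix), so forked children write to the same memory.
    shared = mmap.mmap(-1, 8)
    shared[:8] = struct.pack("q", 0)

    def bump_shared():
        (value,) = struct.unpack("q", shared[:8])
        shared[:8] = struct.pack("q", value + 1)

    # Process model: the child's update to the plain global is not visible
    # in the parent, because the memory was unshared by default.
    run_in_child_process(bump_global)
    global_after_process = counter

    # Process model: the update is visible only where sharing was explicit.
    run_in_child_process(bump_shared)
    (shared_after_process,) = struct.unpack("q", shared[:8])

    # Thread model: the same plain global is shared without asking.
    worker = threading.Thread(target=bump_global)
    worker.start()
    worker.join()
    global_after_thread = counter

    shared.close()
    return global_after_process, shared_after_process, global_after_thread


if __name__ == "__main__":
    print(demo())
```

On a POSIX system `demo()` should return `(0, 1, 1)`: the forked child's write to the plain global is lost to the parent, the write through the explicitly shared mapping is visible, and the thread's write to the plain global is visible without any setup. The sketch deliberately leaves out locking; the single writer at each step is what keeps it race-free here.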