Обсуждение: Postgres Architecture
Hi all!
I'm a DevOps Manager/Engineer by trade (though the place I work is not, unfortunately, using Postgres). I've been thinking quite a bit about what our ideal architecture at work will look like and what scaling looks like, both for work and for home projects (where I *am* looking at using Postgres :) ).
Tonight, (unrelated to work) I took the time to draw up a diagram of an architecture that I think would help Postgres move one step towards both more scalability, and better ease of use.
Since I'm not so hot at drawing ASCII art diagrams, I thought maybe the way to go would be to drop it into a Google Presentation and make that public.
It's a couple of diagrams (existing and proposed architecture), and a list of what I think the advantages and disadvantages are.
https://docs.google.com/presentation/d/1ew31STf8qC2keded5GfQiSUwb3fukmO0PFnZw12yAs8/edit?usp=sharing
To keep it short, the proposal is that the stages from Parse through Plan be done in a separate process (and potentially on a separate server) from the Execute stage. The idea is:
- The Parse/Plan servers don't care whether they're read or write
- The Parse/Plan know which Execute server is the writer (and which the readers), and forward to the correct server for execution
I even wonder if this might not mean that the Parse/Plan servers can be deployed as K8s containers, with the Execute server being the external non-k8s server.
Note that in this e-mail, I've referred to:
- The Parse/Plan server (which my diagram calls the Postgres SQL server)
- The Execute server (which my diagram calls the Storage server)
I'm not sure what naming makes sense, but I intentionally used a couple of different names in hopes that one of them would get the idea across -- please disregard whichever names don't make sense, and feel free to suggest new ones.
I'm expecting that people will pick the idea apart, and wanted to know what people think of it.
Thanks!
On Mon, Oct 16, 2023 at 6:42 AM Timothy Nelson <wayland@wayland.id.au> wrote:
I'm expecting that people will pick the idea apart, and wanted to know what people think of it.
Thanks for the proposal. This is actually a model that's been around for a very long time. And, in fact, variations of it (e.g. parsing done in one place and generated plan fragments shipped to remote execution nodes where the data resides) are already used by things like Postgres-XL. There have also been a number of academic implementations where parsing is done locally and raw parse trees are sent to the server as well. While these things do reduce CPU, there are a number of negative aspects to deal with that make such an architecture more difficult to manage.
Jonah H. Harris
Great! I'm not surprised it's been around a long time -- I didn't think I could be the only one to think of it.
Thanks for the heads-up on Postgres-XL -- I'd missed that one somehow.
I'm going to include the words "architecture" and "replication" so that people searching the archives in the future have more chance of finding this conversation.
Thanks!
On Tue, 17 Oct 2023 at 02:07, Jonah H. Harris <jonah.harris@gmail.com> wrote:
On Mon, Oct 16, 2023 at 6:42 AM Timothy Nelson <wayland@wayland.id.au> wrote:--I'm expecting that people will pick the idea apart, and wanted to know what people think of it.Thanks for the proposal. This is actually a model that's been around for a very long time. And, in fact, variations of it (e.g. parsing done in one place and generated plan fragments shipped to remote execution nodes where the data resides) are already used by things like Postgres-XL. There have also been a number of academic implementations where parsing is done locally and raw parse trees are sent to the server as well. While these things do reduce CPU, there are a number of negative aspects to deal with that make such an architecture more difficult to manage.Jonah H. Harris
Timothy Nelson <wayland@wayland.id.au> writes: > Great! I'm not surprised it's been around a long time -- I didn't think I > could be the only one to think of it. > Thanks for the heads-up on Postgres-XL -- I'd missed that one somehow. FWIW, we also have some in-core history with passing plans around, for parallel-query workers. The things I'd take away from that are: 1. It's expensive. In the parallel-query case it's hard to tease apart the cost of passing across a plan from the cost of starting a worker, but it's certainly high. You would need a way of only invoking this mechanism for expensive-anyway queries, which puts a hole in the idea you seemed to have of having a hard separation between parse/plan processes and execute processes. 2. Constant-folding at plan time is another reason you can't have a hard separation: the planner might run arbitrary user-defined code. 3. Locking is a pain. In the Postgres architecture, table locks acquired during parse/plan have to be held through to execution, or concurrent DDL might invalidate your plan out from under you. We finesse that in the parallel-query case by expecting the leader process to keep hold of all the needed locks, and then having some kluges that allow child workers to acquire the same locks without blocking. (The workers perhaps don't really need those locks, but acquiring them avoids the need to poke holes in various you-must-have-a-lock-to-do-this sanity checks.) I fear this area might be a great deal harder if you're trying to pass plans from a parse/plan process to an arms-length execute process. 4. Sharing execute workers between sessions (which I think was an implicit part of your idea) is hard; hard enough that we haven't even tried. There's too much context-sensitive state in a backend and too little way of isolating which things depend on the current user, current database etc. Probably this could be cleaned up with enough work, but it'd not be a small project. regards, tom lane
On Tue, Oct 17, 2023 at 08:39:49AM +1100, Timothy Nelson wrote: > Great! I'm not surprised it's been around a long time -- I didn't think I > could be the only one to think of it. > > Thanks for the heads-up on Postgres-XL -- I'd missed that one somehow. > > I'm going to include the words "architecture" and "replication" so that people > searching the archives in the future have more chance of finding this > conversation. You can get some of this using foreign data wrappers to other Postgres servers. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.