Обсуждение: Postgres Architecture

Поиск
Список
Период
Сортировка

Postgres Architecture

От
Timothy Nelson
Дата:
Hi all! 

I'm a DevOps Manager/Engineer by trade (though the place I work is not, unfortunately, using Postgres).  I've been thinking quite a bit about what our ideal architecture at work will look like and what scaling looks like, both for work and for home projects (where I *am* looking at using Postgres :) ).  

Tonight, (unrelated to work) I took the time to draw up a diagram of an architecture that I think would help Postgres move one step towards both more scalability, and better ease of use. 

Since I'm not so hot at drawing ASCII art diagrams, I thought maybe the way to go would be to drop it into a Google Presentation and make that public. 

It's a couple of diagrams (existing and proposed architecture), and a list of what I think the advantages and disadvantages are. 


To keep it short, the proposal is that the stages from Parse through Plan be done in a separate  process (and potentially on a separate server) from the Execute stage.  The idea is:
- The Parse/Plan servers don't care whether they're read or write
- The Parse/Plan know which Execute server is the writer (and which the readers), and forward to the correct server for execution

I even wonder if this might not mean that the Parse/Plan servers can be deployed as K8s containers, with the Execute server being the external non-k8s server. 

Note that in this e-mail, I've referred to:
- The Parse/Plan server (which my diagram calls the Postgres SQL server)
- The Execute server (which my diagram calls the Storage server)

I'm not sure what naming makes sense, but I intentionally used a couple of different names in hopes that one of them would get the idea across -- please disregard whichever names don't make sense, and feel free to suggest new ones. 

I'm expecting that people will pick the idea apart, and wanted to know what people think of it. 

Thanks! 

Re: Postgres Architecture

От
"Jonah H. Harris"
Дата:
On Mon, Oct 16, 2023 at 6:42 AM Timothy Nelson <wayland@wayland.id.au> wrote:
I'm expecting that people will pick the idea apart, and wanted to know what people think of it. 

Thanks for the proposal. This is actually a model that's been around for a very long time. And, in fact, variations of it (e.g. parsing done in one place and generated plan fragments shipped to remote execution nodes where the data resides) are already used by things like Postgres-XL. There have also been a number of academic implementations where parsing is done locally and raw parse trees are sent to the server as well. While these things do reduce CPU, there are a number of negative aspects to deal with that make such an architecture more difficult to manage.

--
Jonah H. Harris

Re: Postgres Architecture

От
Timothy Nelson
Дата:
Great!  I'm not surprised it's been around a long time -- I didn't think I could be the only one to think of it.  

Thanks for the heads-up on Postgres-XL -- I'd missed that one somehow. 

I'm going to include the words "architecture" and "replication" so that people searching the archives in the future have more chance of finding this conversation. 

Thanks! 

On Tue, 17 Oct 2023 at 02:07, Jonah H. Harris <jonah.harris@gmail.com> wrote:
On Mon, Oct 16, 2023 at 6:42 AM Timothy Nelson <wayland@wayland.id.au> wrote:
I'm expecting that people will pick the idea apart, and wanted to know what people think of it. 

Thanks for the proposal. This is actually a model that's been around for a very long time. And, in fact, variations of it (e.g. parsing done in one place and generated plan fragments shipped to remote execution nodes where the data resides) are already used by things like Postgres-XL. There have also been a number of academic implementations where parsing is done locally and raw parse trees are sent to the server as well. While these things do reduce CPU, there are a number of negative aspects to deal with that make such an architecture more difficult to manage.

--
Jonah H. Harris

Re: Postgres Architecture

От
Tom Lane
Дата:
Timothy Nelson <wayland@wayland.id.au> writes:
> Great!  I'm not surprised it's been around a long time -- I didn't think I
> could be the only one to think of it.
> Thanks for the heads-up on Postgres-XL -- I'd missed that one somehow.

FWIW, we also have some in-core history with passing plans around,
for parallel-query workers.  The things I'd take away from that
are:

1. It's expensive.  In the parallel-query case it's hard to tease apart
the cost of passing across a plan from the cost of starting a worker,
but it's certainly high.  You would need a way of only invoking this
mechanism for expensive-anyway queries, which puts a hole in the idea
you seemed to have of having a hard separation between parse/plan
processes and execute processes.

2. Constant-folding at plan time is another reason you can't have
a hard separation: the planner might run arbitrary user-defined
code.

3. Locking is a pain.  In the Postgres architecture, table locks
acquired during parse/plan have to be held through to execution,
or concurrent DDL might invalidate your plan out from under you.
We finesse that in the parallel-query case by expecting the leader
process to keep hold of all the needed locks, and then having some
kluges that allow child workers to acquire the same locks without
blocking.  (The workers perhaps don't really need those locks,
but acquiring them avoids the need to poke holes in various
you-must-have-a-lock-to-do-this sanity checks.)  I fear this area
might be a great deal harder if you're trying to pass plans from a
parse/plan process to an arms-length execute process.

4. Sharing execute workers between sessions (which I think was an
implicit part of your idea) is hard; hard enough that we haven't
even tried.  There's too much context-sensitive state in a backend
and too little way of isolating which things depend on the current
user, current database etc.  Probably this could be cleaned up
with enough work, but it'd not be a small project.

            regards, tom lane



Re: Postgres Architecture

От
Bruce Momjian
Дата:
On Tue, Oct 17, 2023 at 08:39:49AM +1100, Timothy Nelson wrote:
> Great!  I'm not surprised it's been around a long time -- I didn't think I
> could be the only one to think of it.  
> 
> Thanks for the heads-up on Postgres-XL -- I'd missed that one somehow. 
> 
> I'm going to include the words "architecture" and "replication" so that people
> searching the archives in the future have more chance of finding this
> conversation. 

You can get some of this using foreign data wrappers to other Postgres
servers.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.