Обсуждение: Allow specifying a dbname in pg_basebackup connection string

Поиск
Список
Период
Сортировка

Allow specifying a dbname in pg_basebackup connection string

От
Jelte Fennema
Дата:
Normally it doesn't really matter which dbname is used in the connection
string that pg_basebackup and other physical replication CLI tools use.
The reason being, that physical replication does not work at the
database level, but instead at the server level. So you will always get
the data for all databases.

However, when there's a proxy, such as PgBouncer, in between the client
and the server, then it might very well matter. Because this proxy might
want to route the connection to a different server depending on the
dbname parameter in the startup packet.

This patch changes the creation of the connection string key value
pairs, so that the following command will actually include
dbname=postgres in the startup packet to the server:

pg_basebackup --dbname 'dbname=postgres port=6432' -D dump

This also applies to other physical replication CLI tools like
pg_receivewal.

To address the security issue described in CVE-2016-5424 it
now only passes expand_dbname=true when the tool did not
receive a connection_string argument.

I tested that the change worked on this PgBouncer PR of mine:
Вложения

Re: Allow specifying a dbname in pg_basebackup connection string

От
Thom Brown
Дата:
On Mon, 3 Jul 2023 at 13:23, Jelte Fennema <postgres@jeltef.nl> wrote:
>
> Normally it doesn't really matter which dbname is used in the connection
> string that pg_basebackup and other physical replication CLI tools use.
> The reason being, that physical replication does not work at the
> database level, but instead at the server level. So you will always get
> the data for all databases.
>
> However, when there's a proxy, such as PgBouncer, in between the client
> and the server, then it might very well matter. Because this proxy might
> want to route the connection to a different server depending on the
> dbname parameter in the startup packet.
>
> This patch changes the creation of the connection string key value
> pairs, so that the following command will actually include
> dbname=postgres in the startup packet to the server:
>
> pg_basebackup --dbname 'dbname=postgres port=6432' -D dump

I guess my immediate question is, should backups be taken through
PgBouncer?  It seems beyond PgBouncer's remit.

Thom



Re: Allow specifying a dbname in pg_basebackup connection string

От
"Euler Taveira"
Дата:
On Wed, Jul 5, 2023, at 9:43 AM, Thom Brown wrote:
I guess my immediate question is, should backups be taken through
PgBouncer?  It seems beyond PgBouncer's remit.

One of the PgBouncer's missions is to be a transparent proxy.

Sometimes you cannot reach out the database directly due to a security policy.
I've heard this backup question a few times. IMO if dbname doesn't matter for
reaching the server directly, I don't see a problem relaxing this restriction
to support this use case. We just need to document that dbname will be ignored
if specified. Other connection poolers might also benefit from it.


--
Euler Taveira

Re: Allow specifying a dbname in pg_basebackup connection string

От
Jelte Fennema
Дата:
On Wed, 5 Jul 2023 at 16:01, Euler Taveira <euler@eulerto.com> wrote:
> One of the PgBouncer's missions is to be a transparent proxy.
>
> Sometimes you cannot reach out the database directly due to a security policy.

Indeed the transparent proxy use case is where replication through
pgbouncer makes sense. There's quite some reasons to set up PgBouncer
like such a proxy apart from security policies. Some others that come
to mind are:
- load balancer layer of pgbouncers
- transparent failovers
- transparent database moves

And in all of those cases its nice for a user to use a single
connection string/hostname. Instead of having to think: Oh yeah, for
backups, I need to use this other one.



Re: Allow specifying a dbname in pg_basebackup connection string

От
Thom Brown
Дата:
On Wed, 5 Jul 2023 at 16:50, Jelte Fennema <postgres@jeltef.nl> wrote:
>
> On Wed, 5 Jul 2023 at 16:01, Euler Taveira <euler@eulerto.com> wrote:
> > One of the PgBouncer's missions is to be a transparent proxy.
> >
> > Sometimes you cannot reach out the database directly due to a security policy.
>
> Indeed the transparent proxy use case is where replication through
> pgbouncer makes sense. There's quite some reasons to set up PgBouncer
> like such a proxy apart from security policies. Some others that come
> to mind are:
> - load balancer layer of pgbouncers
> - transparent failovers
> - transparent database moves
>
> And in all of those cases its nice for a user to use a single
> connection string/hostname. Instead of having to think: Oh yeah, for
> backups, I need to use this other one.

Okay, understood.  In that case, please remember to write changes to
the pg_basebackup docs page explaining how the dbname value is ignored
under normal usage.

Thom



Re: Allow specifying a dbname in pg_basebackup connection string

От
Jelte Fennema
Дата:
On Wed, 5 Jul 2023 at 20:09, Thom Brown <thom@linux.com> wrote:
> Okay, understood.  In that case, please remember to write changes to
> the pg_basebackup docs page explaining how the dbname value is ignored


I updated the wording in the docs for pg_basebackup and pg_receivewal.
They now clarify that Postgres itself doesn't care if there's a
database name in the connection string, but that a proxy might.

Вложения

Re: Allow specifying a dbname in pg_basebackup connection string

От
Tristen Raab
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

Hello,

I've reviewed your patch and it applies and builds without error. When testing this patch I was slightly confused as to
whatits purpose was, after testing it I now understand. Initially, I thought this was a change to add database-level
replication.I would suggest some clarifications to the documentation such as changing:
 

"supplying a specific database name in the connection string won't cause PostgreSQL to behave any differently."

to 

"supplying a specific database name in the connection string won't cause pg_basebackup to behave any differently."

I believe this better illustrates that we are referring to the actual pg_basebackup utility and how this parameter is
onlyused for proxies and bears no impact on what pg_basebackup is actually doing. It also would remove any confusion
aboutdatabase replication I had prior.
 

There is also a small typo in the same documentation:

"However, if you are connecting to PostgreSQL through a proxy, then it's possible that this proxy does use the supplied
databasenameto make certain decisions, such as to which cluster to route the connection."
 

"databasename" is just missing a space.

Other than that, everything looks good.

Regards,

Tristen

Re: Allow specifying a dbname in pg_basebackup connection string

От
Jelte Fennema
Дата:
On Mon, 28 Aug 2023 at 23:50, Tristen Raab <tristen.raab@highgo.ca> wrote:
> I've reviewed your patch and it applies and builds without error. When testing this patch I was slightly confused as
towhat its purpose was, after testing it I now understand. Initially, I thought this was a change to add database-level
replication.I would suggest some clarifications to the documentation such as changing: 

Thanks for the review. I've updated the documentation to make it
clearer (using some of your suggestions but also some others)

Вложения

Re: Allow specifying a dbname in pg_basebackup connection string

От
Jim Jones
Дата:

Hi Jelte

On 29.08.23 15:55, Jelte Fennema wrote:
Thanks for the review. I've updated the documentation to make it
clearer (using some of your suggestions but also some others)

This patch applies and builds cleanly, and the documentation is very clear.

I tested it using the 'replication-support' branch from your github fork:

pg_basebackup --dbname "port=6432 user=postgres dbname=foo" -D /tmp/dump1

pgbouncer log:

2023-08-30 00:50:52.866 CEST [811770] LOG C-0x555fbd65bf40: (nodb)/postgres@unix(811776):6432 login attempt: db=foo user=postgres tls=no replication=yes

However, pgbouncer closes with a segmentation fault, so I couldn't test the result of pg_basebackup itself - but I guess it isn't the issue here.

Other than that, everything looks good to me.

Jim

Re: Allow specifying a dbname in pg_basebackup connection string

От
Jelte Fennema
Дата:
On Wed, 30 Aug 2023 at 01:01, Jim Jones <jim.jones@uni-muenster.de> wrote:
> However, pgbouncer closes with a segmentation fault, so I couldn't test the result of pg_basebackup itself - but I
guessit isn't the issue here.
 

Oops it indeed seemed like I made an unintended change when handling
database names that did not exist in pgbouncer.conf when you used
auth_type=hba. I pushed a fix for that now to the replication-support
branch. Feel free to try again. But as you said it's unrelated to the
postgres patch.



Re: Allow specifying a dbname in pg_basebackup connection string

От
Jim Jones
Дата:
On 30.08.23 14:11, Jelte Fennema wrote:
> Oops it indeed seemed like I made an unintended change when handling
> database names that did not exist in pgbouncer.conf when you used
> auth_type=hba. I pushed a fix for that now to the replication-support
> branch. Feel free to try again. But as you said it's unrelated to the
> postgres patch.

Nice! Now it seems to work as expected :)

$ /usr/local/postgres-dev/bin/pg_basebackup --dbname "host=127.0.0.1 
port=6432 user=jim  dbname=foo" -X fetch -v -l testlabel -D /tmp/dump
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/12000028 on timeline 1
pg_basebackup: write-ahead log end point: 0/12000100
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

pgbouncer log:

2023-08-30 21:04:30.225 CEST [785989] LOG C-0x55fbee0f50b0: 
foo/jim@127.0.0.1:49764 login attempt: db=foo user=jim tls=no 
replication=yes
2023-08-30 21:04:30.225 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 new connection to server (from 127.0.0.1:34400)
2023-08-30 21:04:30.226 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 closing because: replication client was closed 
(age=0s)
2023-08-30 21:04:30.226 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 new connection to server (from 127.0.0.1:34408)
2023-08-30 21:04:30.227 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 closing because: replication client was closed 
(age=0s)
2023-08-30 21:04:30.227 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 new connection to server (from 127.0.0.1:34410)
2023-08-30 21:04:30.228 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 closing because: replication client was closed 
(age=0s)
2023-08-30 21:04:30.228 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 new connection to server (from 127.0.0.1:34418)
2023-08-30 21:04:30.309 CEST [785989] LOG S-0x55fbee0fc560: 
foo/jim@127.0.0.1:5432 closing because: replication client was closed 
(age=0s)
2023-08-30 21:04:30.309 CEST [785989] LOG C-0x55fbee0f50b0: 
foo/jim@127.0.0.1:49764 closing because: client close request (age=0s)

Jim




Re: Allow specifying a dbname in pg_basebackup connection string

От
Jelte Fennema
Дата:
Attached is a new version with some slightly updated wording in the docs

Вложения

Re: Allow specifying a dbname in pg_basebackup connection string

От
Daniel Gustafsson
Дата:
> On 31 Aug 2023, at 11:01, Jelte Fennema <me@jeltef.nl> wrote:

> Attached is a new version with some slightly updated wording in the docs

I had a look at this Ready for Committer entry in the CF and it seems to strike
a balance between useful in certain cases and non-intrusive in others.

Unless something sticks out in a second pass over it I will go ahead and apply
it.

--
Daniel Gustafsson




Re: Allow specifying a dbname in pg_basebackup connection string

От
Daniel Gustafsson
Дата:
> On 18 Sep 2023, at 14:11, Daniel Gustafsson <daniel@yesql.se> wrote:

> Unless something sticks out in a second pass over it I will go ahead and apply
> it.

And applied, closing the CF entry.

--
Daniel Gustafsson