Обсуждение: Add new protocol message to change GUCs for usage with future protocol-only GUCs

Поиск

Список

Период

Сортировка

Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

29 декабря 2023 г., 20:29:06

Currently the only way to set GUCs from a client is by using SET
commands or set them in the StartupMessage. I think it would be very
useful to be able to change settings using a message at the protocol
level. For the following reasons:

1. Protocol messages are much easier to inspect for connection poolers
than queries
2. It paves the way for GUCs that can only be set using a protocol
message (and not using SET).
3. Being able to change GUCs while in an aborted transaction.
4. Have an easy way to use the value contained in a ParameterStatus
message as the value for a GUC

I attached a patch that adds a new protocol message to set a GUC
value. There's definitely still some more work that needs to be done
(docs for new libpq APIs, protocol version bump, working protocol
version negotiation). But the core functionality is working. At this
point I'd mainly like some feedback on the general idea.

The sections below go into more detail on each of the reasons I mentioned above:

(1) PgBouncer does not inspect query strings, to avoid having to
write/maintain a SQL parser (even a partial one). But that means that
clients cannot configure any behaviour of PgBouncer for their session.
A few examples of useful things to configure would be:
a. allow changing between session and transaction pooling on the same
connection.
b. intercepting changes to search_path, and routing different schemas
to different machines (useful for Citus its schema based sharding).
c. intercepting changing of pgbouncer.sharding_key, and route to
different machines based on this value.

(2) There are currently multiple threads ongoing that propose very
similar protocol changes for very different purposes. Essentially all
of them boil down to sending a protocol message to the server to
change some other protocol behaviour. And the reason why they cannot
use GUCs, is because the driver and/or connection pooler need to know
what the setting is and be able to choose it without a user running
some SQL suddenly changing the value. The threads I'm talking about
are: Choosing specific types that use binary format for encoding [1].
Changing what GUCs are reported to the client using ParameterStatus
(a.k.a configurable GUC_REPORT) [2]. Changing the compression method
that is used to compress messages[3].

Another benefit could be to allow a connection pooler to configure
certain settings to not be changeable with SQL. For instance if a
pooler could ensure that a client couldn't later change
session_authorization, it could use session_authorization to set the
user and then multiplex client connections from different users over
the same connection to the database.

(3) For psql it's useful to be able to control what messages it gets a
ParameterStatus for, even when the transaction is in aborted state.
Because that way it could decide what parameters status updates to
request based on the prompt it needs to display. And the prompt can be
changed even during an aborted transaction.

(4) PgBouncer uses the value contained in the ParameterStatus message
to correctly set some GUCs back to their expected value. But to do
this you currently need to use a SET query, which in turn requires
quoting the value using SQL quoting . This wouldn't be so bad, except
that GUC_LIST_QUOTE exists. Parameters with GUC_LIST_QUOTE have each
item in the list returned **double** quoted, and then those double
quoted items separated by commas. But to be able to correctly set
them, they need to be given each separately **single** quoted and
separated by commas. Doing that would require a lot of parsing logic
to replace double quotes with single quotes correctly. For now
pgbouncer only handles the empty string case correctly, for the
situations where the difference between double and single quotes
matters[4].

[1]:
https://www.postgresql.org/message-id/flat/CA%2BTgmoZyAh%2BhdN8zvHeN40n9vTstw8K1KjuWdgDuAMMbFAZqHg%40mail.gmail.com#e3a603bbc091e796148a2d660a4a1c1f
[2]: https://www.postgresql.org/message-id/flat/CAFj8pRBFU-WzzQhNrwRHn67N0Ug8a9-0-9BOo69PPtcHiBDQMA@mail.gmail.com
[3]:
https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8
[4]: https://github.com/pgbouncer/pgbouncer/blob/fb468025d61e1ffdc6dbc819558f45414e0a176e/src/varcache.c#L172-L183

P.S. I included authors and some reviewers of the threads I mentioned
for 2 in the CC. Since this patch is meant to be a generic protocol
change that could be used by all of them.

Вложения

v1-0001-Add-support-to-change-GUCs-at-the-protocol-level.patch

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Pavel Stehule

Дата:

29 декабря 2023 г., 21:09:16

pá 29. 12. 2023 v 18:29 odesílatel Jelte Fennema-Nio <me@jeltef.nl> napsal:

Currently the only way to set GUCs from a client is by using SET
commands or set them in the StartupMessage. I think it would be very
useful to be able to change settings using a message at the protocol
level. For the following reasons:

1. Protocol messages are much easier to inspect for connection poolers
than queries
2. It paves the way for GUCs that can only be set using a protocol
message (and not using SET).
3. Being able to change GUCs while in an aborted transaction.
4. Have an easy way to use the value contained in a ParameterStatus
message as the value for a GUC

I attached a patch that adds a new protocol message to set a GUC
value. There's definitely still some more work that needs to be done
(docs for new libpq APIs, protocol version bump, working protocol
version negotiation). But the core functionality is working. At this
point I'd mainly like some feedback on the general idea.

The sections below go into more detail on each of the reasons I mentioned above:

(1) PgBouncer does not inspect query strings, to avoid having to
write/maintain a SQL parser (even a partial one). But that means that
clients cannot configure any behaviour of PgBouncer for their session.
A few examples of useful things to configure would be:
a. allow changing between session and transaction pooling on the same
connection.
b. intercepting changes to search_path, and routing different schemas
to different machines (useful for Citus its schema based sharding).
c. intercepting changing of pgbouncer.sharding_key, and route to
different machines based on this value.

(2) There are currently multiple threads ongoing that propose very
similar protocol changes for very different purposes. Essentially all
of them boil down to sending a protocol message to the server to
change some other protocol behaviour. And the reason why they cannot
use GUCs, is because the driver and/or connection pooler need to know
what the setting is and be able to choose it without a user running
some SQL suddenly changing the value. The threads I'm talking about
are: Choosing specific types that use binary format for encoding [1].
Changing what GUCs are reported to the client using ParameterStatus
(a.k.a configurable GUC_REPORT) [2]. Changing the compression method
that is used to compress messages[3].

Another benefit could be to allow a connection pooler to configure
certain settings to not be changeable with SQL. For instance if a
pooler could ensure that a client couldn't later change
session_authorization, it could use session_authorization to set the
user and then multiplex client connections from different users over
the same connection to the database.

(3) For psql it's useful to be able to control what messages it gets a
ParameterStatus for, even when the transaction is in aborted state.
Because that way it could decide what parameters status updates to
request based on the prompt it needs to display. And the prompt can be
changed even during an aborted transaction.

(4) PgBouncer uses the value contained in the ParameterStatus message
to correctly set some GUCs back to their expected value. But to do
this you currently need to use a SET query, which in turn requires
quoting the value using SQL quoting . This wouldn't be so bad, except
that GUC_LIST_QUOTE exists. Parameters with GUC_LIST_QUOTE have each
item in the list returned **double** quoted, and then those double
quoted items separated by commas. But to be able to correctly set
them, they need to be given each separately **single** quoted and
separated by commas. Doing that would require a lot of parsing logic
to replace double quotes with single quotes correctly. For now
pgbouncer only handles the empty string case correctly, for the
situations where the difference between double and single quotes
matters[4].

[1]: https://www.postgresql.org/message-id/flat/CA%2BTgmoZyAh%2BhdN8zvHeN40n9vTstw8K1KjuWdgDuAMMbFAZqHg%40mail.gmail.com#e3a603bbc091e796148a2d660a4a1c1f
[2]: https://www.postgresql.org/message-id/flat/CAFj8pRBFU-WzzQhNrwRHn67N0Ug8a9-0-9BOo69PPtcHiBDQMA@mail.gmail.com
[3]: https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8
[4]: https://github.com/pgbouncer/pgbouncer/blob/fb468025d61e1ffdc6dbc819558f45414e0a176e/src/varcache.c#L172-L183

P.S. I included authors and some reviewers of the threads I mentioned
for 2 in the CC. Since this patch is meant to be a generic protocol
change that could be used by all of them.

Pavel

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jeff Davis

Дата:

29 декабря 2023 г., 21:32:15

On Fri, 2023-12-29 at 18:29 +0100, Jelte Fennema-Nio wrote:
> 2. It paves the way for GUCs that can only be set using a protocol
> message (and not using SET).

That sounds useful for GUCs that can interfere with the client, such as
client_encoding or the proposed GUC in you referred to at [1].

> There's definitely still some more work that needs to be done
> (docs for new libpq APIs, protocol version bump, working protocol
> version negotiation).

That is my biggest concern right now: what will new clients connecting
to old servers do?

If the version is bumped, should we look around for other unrelated
protocol changes to make at the same time? Do we want a more generic
form of negotiation?

Regards,
    Jeff Davis

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Tom Lane

Дата:

29 декабря 2023 г., 21:38:07

Jelte Fennema-Nio <me@jeltef.nl> writes:
> Currently the only way to set GUCs from a client is by using SET
> commands or set them in the StartupMessage. I think it would be very
> useful to be able to change settings using a message at the protocol
> level. For the following reasons:

> 1. Protocol messages are much easier to inspect for connection poolers
> than queries

Unless you somehow forbid clients from setting GUCs in the old way,
exactly how will that help a pooler?

> 2. It paves the way for GUCs that can only be set using a protocol
> message (and not using SET).

This is assuming facts not in evidence.  Why would we want such a thing?

> 3. Being able to change GUCs while in an aborted transaction.

I'm really dubious that that's sane.  How will it interact with, for
example, changes to the same GUC done in the now-failed transaction?
Or for that matter, changes that happen later in the current
transaction?  It seems like you can't even think about this unless
you deem GUC changes made this way to be non-transactional, which
seems very wart-y and full of consistency problems.

> 4. Have an easy way to use the value contained in a ParameterStatus
> message as the value for a GUC

I agree that GUC_LIST_QUOTE is a mess, but "I'm too lazy to deal
with that" seems like a rather poor argument for instead expending
the amount of labor involved in a protocol change.

On the whole, this feels like you are trying to force some things
into the GUC model that should not be there.  I do perceive that
there are things that could be protocol-level variables, but
trying to say they are a kind of GUC seems like it will create
more problems than it solves.

            regards, tom lane

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jeff Davis

Дата:

29 декабря 2023 г., 22:14:46

On Fri, 2023-12-29 at 13:38 -0500, Tom Lane wrote:
> > 2. It paves the way for GUCs that can only be set using a protocol
> > message (and not using SET).
>
> This is assuming facts not in evidence.  Why would we want such a
> thing?

The problem came up during the binary_formats GUC discussion: it
doesn't really make sense to change that with a SQL query, and doing so
can cause strange things to happen.

We already have the issue with client_encoding and binary format COPY,
so arguably it's not worth trying to solve it. But protocol-only GUCs
was one idea that came up.

Regards,
    Jeff Davis

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Tom Lane

Дата:

29 декабря 2023 г., 22:36:25

Jeff Davis <pgsql@j-davis.com> writes:
> On Fri, 2023-12-29 at 13:38 -0500, Tom Lane wrote:
>> This is assuming facts not in evidence.  Why would we want such a
>> thing?

> The problem came up during the binary_formats GUC discussion: it
> doesn't really make sense to change that with a SQL query, and doing so
> can cause strange things to happen.
> We already have the issue with client_encoding and binary format COPY,
> so arguably it's not worth trying to solve it. But protocol-only GUCs
> was one idea that came up.

Yeah, there's definitely an issue about what level of the client-side
software ought to be able to set such parameters.  I'm not sure that
"only the lowest level" is the correct answer though.  As an example,
libpq doesn't especially care what encoding it's dealing with, nor
(AFAIR) whether COPY data is text or binary.  The calling application
probably cares, but then we end up needing a bunch of new plumbing to
pass the settings through.  That's not really providing a lot of
value-add IMO.

            regards, tom lane

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

30 декабря 2023 г., 02:07:15

On Fri, 29 Dec 2023 at 19:32, Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2023-12-29 at 18:29 +0100, Jelte Fennema-Nio wrote:
> > There's definitely still some more work that needs to be done
> > (docs for new libpq APIs, protocol version bump, working protocol
> > version negotiation).
>
> That is my biggest concern right now: what will new clients connecting
> to old servers do?
>
> If the version is bumped, should we look around for other unrelated
> protocol changes to make at the same time? Do we want a more generic
> form of negotiation?

This is not that big of a deal. Since it's only an addition of a new
message type, it's completely backwards compatible with the current
protocol version. i.e. as long as a client just doesn't send it when
the server reports an older protocol version everything works fine.
The protocol already has version negotiation built in and the server
implements it in a reasonable way. The only problem is that no-one
bothered to implement the client side of it in libpq, because it
wasn't necessary yet since 3.x only had a single minor version.

Patch v20230911-0003[5] from thread [3] implements client side
handling in (imho) a sane way. I think it probably still needs some
small tweaks and discussion on if this is the exact user facing API
that we want. But there's no big hurdles implementation or protocol
wise to make the next libpq client backwards compatible with old
servers. I think it's worth merging something like that patch anyway,
because that's necessary for pretty much any protocol changes we would
like to do. After that there's pretty much no downside to bumping
minor versions of the protocol anymore, so we could even do it every
release if needed. Thus I don't think it's necessary to bulk any
protocol changes together.

[3]:
https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8
[5]:
https://www.postgresql.org/message-id/attachment/150192/v20230911-0003-allow-to-connect-to-server-with-major-protocol-versi.patch

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

30 декабря 2023 г., 02:31:36

On Fri, 29 Dec 2023 at 19:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Jelte Fennema-Nio <me@jeltef.nl> writes:
> > 1. Protocol messages are much easier to inspect for connection poolers
> > than queries
>
> Unless you somehow forbid clients from setting GUCs in the old way,
> exactly how will that help a pooler?

A co-operating client could choose to only use this new message type
to edit GUCs such as search_path or pgbouncer.sharding_key using this
method. That's already a big improvement over the status quo, where
PgBouncer (and most other poolers) only understands GUC changes by
observing the ParameterStatus responses from Postgres. At which point
it is obviously too late to make any routing decisions, because to get
the ParamaterStatus back the pooler already needs to have forwarded
the SET command to an actual Postgres server. The same holds for any
setting changes that are specific to the pooler and postgres doesn't
even know about, such as pgbouncer.pool_mode

> > 3. Being able to change GUCs while in an aborted transaction.
>
> I'm really dubious that that's sane.  How will it interact with, for
> example, changes to the same GUC done in the now-failed transaction?
> Or for that matter, changes that happen later in the current
> transaction?  It seems like you can't even think about this unless
> you deem GUC changes made this way to be non-transactional, which
> seems very wart-y and full of consistency problems.

I think that's a fair criticism of the current patch. I do think it's
quite useful for drivers/poolers not to have to worry about their
changes to protocol settings being rolled back unexpectedly because
they made the change while the client was doing a transaction.
Particularly because it's easy for poolers to detect when a Sync is
sent without parsing queries, but not when a BEGIN is sent (PgBouncer
uses the ReadyForQuery response from the server to detect if a
transaction is open or not). But I agree that this behaviour is
fraught with problems for any non-protocol level settings.

I feel like a reasonable solution would be to make this ParameterSet
message transactional after all, but explicitly define the relevant
protocol-only GUCs to be non-transactional.

> I agree that GUC_LIST_QUOTE is a mess, but "I'm too lazy to deal
> with that" seems like a rather poor argument for instead expending
> the amount of labor involved in a protocol change.

Fair enough, honestly this is more of a bonus benefit to me.

On Fri, 29 Dec 2023 at 20:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, there's definitely an issue about what level of the client-side
> software ought to be able to set such parameters.  I'm not sure that
> "only the lowest level" is the correct answer though.  As an example,
> libpq doesn't especially care what encoding it's dealing with, nor
> (AFAIR) whether COPY data is text or binary.  The calling application
> probably cares, but then we end up needing a bunch of new plumbing to
> pass the settings through.  That's not really providing a lot of
> value-add IMO.

The value-add that I think it provides is forcing the user to use a
well defined interface when requesting behavioral changes to the
protocol. A client/pooler probably still wants to allow a user to
change these protocol level settings, but it wants to exert some
control when that happens. Clients could do so by exposing a basic
wrapper around PQparameterSet, and registering that they should parse
responses from postgres differently now. And poolers could do this by
understanding the message, and taking a relevant action (such as
enabling/disabling compression on outgoing messages). By having a well
defined interface, clients and poolers know when these protocol
related settings are being changed and can possibly even slightly
change the value before sending it to the server (e.g. by adding a few
extra GUCs to the list of GUCs that should be GUC_REPORTed because
PgBouncer wants to track them).

Achieving the same without this new ParameterSet message and
PQparameterSet might seem possible too, but then all clients and
poolers would need to implement a partial (and probably buggy) SQL
parser to check if the query that is being sent is "SET
binary_protocol = ...". And even this would not really be enough,
since a function would be able to change the relevant GUC without the
client ever sending SET ...

So, to summarize: Every piece of software in between the user writing
queries and postgres sending responses has some expectation and
requirements on what stuff looks like at the protocol level. Requiring
those intermediaries to look at the application layer (i.e. the actual
queries) to understand what to expect at the protocol layer is a
layering violation. Thus having a generic way to change protocol level
configuration options at the actual protocol level is needed to ensure
layer separation.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

30 декабря 2023 г., 02:49:40

On Fri, 29 Dec 2023 at 19:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > 2. It paves the way for GUCs that can only be set using a protocol
> > message (and not using SET).
>
> This is assuming facts not in evidence.

How about instead of talking about protocol-only GUCs (which are
indeed currently a theoretical concept), we start considering this
patch as a way to modify parameters for protocol extensions. Protocol
extension parameters (starting with _pq_.) can currently only be
configured using the StartupMessage and afterwards there is no way to
modify them.

I believe [1], [2] and [3] from my initial email could all use such
protocol extension parameters, if those parameters could be changed
after the initial startup.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jacob Burroughs

Дата:

30 декабря 2023 г., 18:05:27

> How about instead of talking about protocol-only GUCs (which are
> indeed currently a theoretical concept), we start considering this
> patch as a way to modify parameters for protocol extensions. Protocol
> extension parameters (starting with _pq_.) can currently only be
> configured using the StartupMessage and afterwards there is no way to
> modify them.
>
> I believe [1], [2] and [3] from my initial email could all use such
> protocol extension parameters, if those parameters could be changed
> after the initial startup.

What if we allowed the client to send `ParameterStatus` messages to
the server to reconfigure protocol extensions that were requested at
startup?  Such processing would be independent of the transaction
lifecycle since protocol-level options aren't related to transactions.
Any errors in the set value would be handled with an `ErrorResponse`
(including if an extension was not reconfigurable after connection
startup), and success with a new `ReadyForQuery` message. The actual
effect of an extension change must be delayed until after the
ReadyForQuery has been transmitted, though I don't know if that is
feasible or if we would need to instead implicitly assume changes were
successful and just close the connection on error.  We wouldn't need
to bump the protocol version since a client wouldn't (shouldn't) send
these messages unless it successfully negotiated the relevant protocol
extension(s) at startup.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

30 декабря 2023 г., 20:57:03

On Sat, 30 Dec 2023 at 16:05, Jacob Burroughs
<jburroughs@instructure.com> wrote:
> What if we allowed the client to send `ParameterStatus` messages to
> the server to reconfigure protocol extensions that were requested at
> startup?

Sending ParameterStatus from the frontend is not possible, because
Sync already claimed the 'S' message type on the frontend side. (and
even if that weren't the case I think the name ParameterStatus is not
descriptive of what the message would do when you'd send it from the
frontend.

> Such processing would be independent of the transaction
> lifecycle since protocol-level options aren't related to transactions.
> Any errors in the set value would be handled with an `ErrorResponse`
> (including if an extension was not reconfigurable after connection
> startup), and success with a new `ReadyForQuery` message.

If we only consider modifying protocol extension parameters with
ParameterSet, then I think I like the idea of reusing ReadyForQuery
instead of introducing ParamaterSetComplete. I like that it implicitly
makes clear that you should not send ParameterStatus while you already
have an open pipeline for protocol extension parameters. If we go this
approach, we should probably explicitly disallow sending ParameterSet
while a pipeline.

However, if we also allow using ParameterSet to change regular runtime
parameters, then I think this approach makes less sense. Because then
you might want to batch regular runtime parameter together with other
actions in a pipeline, and use the expected "ignore everything after
the first error"

> The actual
> effect of an extension change must be delayed until after the
> ReadyForQuery has been transmitted, though I don't know if that is
> feasible or if we would need to instead implicitly assume changes were
> successful and just close the connection on error.

I'm trying to think about how a client would want to handle a failure
when changing a protocol extension. Is there really an action it can
reasonably take at that point? I guess it might depend on the exact
extension, but I do think that in many cases closing the connection is
the only reasonable response. Maybe the server should even close the
connection with a FATAL error when it receives a ParameterSet for a
protocol extension but it fails to apply it.

> We wouldn't need
> to bump the protocol version since a client wouldn't (shouldn't) send
> these messages unless it successfully negotiated the relevant protocol
> extension(s) at startup.

I think I'd still prefer to bump the minor version, even if it's just
so that we've done it for the first time and we fixed all the libpq
issues with it. But I also think it makes sense from a versioning
perspective, if there's new message types that can be sent by the
client, which do not correspond to a protocol extension, then I think
the only reasonable thing is to update the version number. Otherwise
you'd basically need to define the ParameterSet message to be a part
of every new protocol extension that you would add. That seems more
confusing than saying that version 3.1 supports this new ParameterSet
message type.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

02 января, 17:57:37

On Sat, 30 Dec 2023 at 00:07, Jelte Fennema-Nio <me@jeltef.nl> wrote:
>
> On Fri, 29 Dec 2023 at 19:32, Jeff Davis <pgsql@j-davis.com> wrote:
> > That is my biggest concern right now: what will new clients connecting
> > to old servers do?
>
> This is not that big of a deal. Since it's only an addition of a new
> message type, it's completely backwards compatible with the current
> protocol version. i.e. as long as a client just doesn't send it when
> the server reports an older protocol version everything works fine.
> The protocol already has version negotiation built in and the server
> implements it in a reasonable way. The only problem is that no-one
> bothered to implement the client side of it in libpq, because it
> wasn't necessary yet since 3.x only had a single minor version.
>
> Patch v20230911-0003[5] from thread [3] implements client side
> handling in (imho) a sane way.

Okay I updated this patchset to start with better handling of the
NegotiateProtocolVersion packet. The implementation is quite different
from [5] after all, but the core idea is the same. It also allows the
connection to continue to work in case of a missing protocol
extension, which is necessary for the libpq compression patchset[6].
In case the protocol extension is a requirement, the client can still
choose to disconnect by checking the return value of the newly added
PQunsupportedProtocolExtensions function.

I also fixed the tests of my final patch, but haven't changed the
behaviour of it in any of the suggested ways.

[3]:
https://www.postgresql.org/message-id/flat/AB607155-8FED-4C8C-B702-205B33884CBB%40yandex-team.ru#961c695d190cdccb3975a157b22ce9d8

On Wed, 3 Jan 2024 at 00:43, Jacob Burroughs <jburroughs@instructure.com> wrote:
> What if we...

Great suggestions! Attached is a v3 version of the patchset that
implements all of this, including documentation.

Вложения

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

05 января, 18:13:48

New patchset attached, where I split up the patches in smaller logical units.
Note that the first 4 patches in this series are not making any
protocol changes. All they do is set up infrastructure in the code
that allows us to make protocol changes in the future.

I hope that those 4 should all be fairly uncontroversial, especially
patch 1 seems a no-brainer to me. Note that this infrastructure would
be needed for any patchset that introduces protocol changes.

The 5th only bumps the version

The 6th introduces the _pq_.protocol_managed_parms protocol parameter

The 7th adds the new ParameterSet protocol message

Okay, attempt number 5 attached. The primary changes with the previous
version are:

1. Split up commits a bit differently. I think each commit now stands
on its own and is an incremental improvement that could be committed
without any of the later ones being necessary. Descriptions of why
each commit is useful can be found in the commit message. Especially
the first 3 (and even first 4) seem rather uncontroversial to me.

2. ParameterSet now works for normal backend parameters too. For
normal parameters it works just like SET does (so in a transactional
manner). For protocol extension parameters it still works too, but
there it errors when trying to change protocol extension parameters
within a transaction. Thus (imho) elegantly avoiding confusing
situations around rolling back protocol extension parameters.

3. As Tom suggested, it now uses a PGC_PROTOCOL context for the
protocol extension GUCs, instead of using a GUC_PROTOCOL_EXTENSION
flag. This definitely seems like the cleanest way of adding "protocol
only" parameters to the current GUC system.

4. _pq_.protocol_managed_params its list of GUCs can now only contain
GUCs that have the PGC_USERSET or PGC_SUSET context.

On Fri, 5 Jan 2024 at 17:28, Robert Haas <robertmhaas@gmail.com> wrote:
> First, I don't see a reason to bump the protocol version.

It's still bumping the protocol version. I think this is a necessity
with the current changeset, since ParameterSet now applies to plain
GUCs too and. As I clarified in a previous email, this does **not**
break old clients, since the server silently downgrades when an older
protocol version is requested.

> Second, I don't really like the idea of selectively turning GUCs into
> protocol-managed parameters. I think there are a relatively small
> number of things that we'd ever want to manage on a protocol level,
> but this design would force us to make it work for any GUC whatsoever.

It now limits the possible GUCs that can be changed to PGC_USERSET and
PGC_SUSET. If desired, we could limit it even further by using an
allowlist of reasonable GUCs or set a flag on GUCs that can be
"upgraded" . Things that seem at least reasonable to me are "role",
"session_authorization", "client_encoding".

> If somebody makes a random GUC into a protocol-managed parameter and then
> somebody updates postgresql.conf and then the config file is reloaded,
> I guess we need to refuse to adopt the new value of that parameter?

This was actually really easy to do after changing to PGC_PROTOCOL.
This exact behaviour is needed for PGC_BACKEND parameters, so I simply
used that same small if statement for PGC_PROTOCOL.

If you still think we shouldn't do this, then the only other option I
can think of is to only allow GUCs with the GUC_DISALLOW_IN_FILE flag.
This would rule out adding client_encoding in the list though, but
using role and session_authorization would still be possible.

Вложения

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

10 января, 20:26:36

v6 attached with the following changes:

1. Fixed rebase conflicts with master

2. removed PGC_S_PROTOCOL (but kept PGC_PROTOCOL and PGC_SU_PROTOCOL).
This extra source level was not needed. And after some more testing I
realized this extra source level even caused problems, since protocol
messages could not override values set by SET commands anymore.

3. Added a new patch (0010) with a protocol parameter to configure
which GUCs are GUC_REPORT. This is partially to show that the GUC
interface makes sense for protocol parameters, but also because this
would be an extremely useful feature for connection poolers. And [2]
would be able to use this too.

4. Don't error, but only warn, if a GUC provided to
_pq_.protocol_managed_params is unknown. It seemed useful to be able
to specify GUCs in this list that not all Postgres versions support in
the StartupMessage, without having to guess what postgres version
you're going to connect to.

[2]: https://www.postgresql.org/message-id/flat/CAFj8pRBFU-WzzQhNrwRHn67N0Ug8a9-0-9BOo69PPtcHiBDQMA@mail.gmail.com

On Thu, 11 Jan 2024 at 17:20, Jelte Fennema-Nio <me@jeltef.nl> wrote:
> So to be clear, if you consider 0008 the most controversial/risky part
> of this patchset (which it sounds like that's the case). I'd be fine
> with removing that for now.

I haven't removed 0008 yet, since I'd like some feedback first if that
makes sense. But I did add two new patches in the middle of the
patchset (which shift the later patch numbers by 2):

0007: Adds a new \parameterset meta-command to psql to be able to more
easily test the new ParameterSet message

0008: Shows warning in psql if the server is not supporting the newest
protocol version.

On Mon, 22 Jan 2024 at 04:31, Peter Smith <smithpb2250@gmail.com> wrote:
> Hi, This patch has a CF status of "Needs Review", but it seems like
> there were some CFbot test failures last time it was run [1]. Please
> have a look and post an updated version if necessary.

Ah yes, I had noticed that a while back and fixed the problem in v7 of
my patchset but it seems a rebase conflict has caused the cfbot not to
run tests on that version at all. Attached is a rebased version
together with following small changes:

- Rename PQunsupportedProtocolExtensions to
PQunsupportedProtocolExtensionParameters

- Add PQunsupportedProtocolExtensionParameters in its own patch

Вложения

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

20 февраля, 17:04:07

Fixed a rebase conflict.

Вложения

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

08 марта, 14:47:15

Fixed some conflicts again.

To summarize the current state of this patchset, it consists of 5
different parts that could be merged separately:
1. 0001-0005 are needed for any protocol change, and hopefully
shouldn't require much discussion
2. 0006-0009 introduce a new protocol message that can be used to
update GUCs. I think we might want some discussion on the design of
the protocol message, but I think the behaviour has all feedback
incorporated.
3. 0010 Adds GUC contexts for protocol extensions, in the way that Tom
Lane suggested.
4. 0011 Adds a way to mark some GUCs as only changeable using protocol
messages, there's still some discussion needed on this (I can remove
this from the patchset if that makes it easier).
5. 0012 Adds some additional tests for some of the previous features
6. 0013 Add a protocol parameter to control which GUCs have
GUC_REPORT. Quite straightforward imho.

Вложения

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

13 марта, 21:18:07

Fixed some conflicts again, as well as adding a connection option to
choose the requested protocol version (as discussed in[1]). This new
connection option is not useful when connecting to any of the
supported postgres versions. But it can be useful when connecting to
PG versions before 9.3. Or when connecting to connection poolers or
other databases that implement the postgres protocol but do not
support the NegotiateProtocolVersion message.

[1]:
https://www.postgresql.org/message-id/flat/CAGECzQRrHn52yEX%2BFc6A9uvVbwRCxjU82KNuBirwFU5HRrNxqA%40mail.gmail.com#835914cbd55c56b36e8e7691cb743a18

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

13 марта, 21:27:06

On Wed, 13 Mar 2024 at 19:18, Jelte Fennema-Nio <me@jeltef.nl> wrote:
>
> Fixed some conflicts again, as well as adding a connection option to
> choose the requested protocol version

now with attachments...

Attached is a rebased patchset

Вложения

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Dave Cramer

Дата:

05 апреля, 15:20:44

On Thu, 4 Apr 2024 at 12:45, Jelte Fennema-Nio <me@jeltef.nl> wrote:

On Thu, 4 Apr 2024 at 14:50, Peter Eisentraut <peter@eisentraut.org> wrote:
> It appears there are several different perspectives about this. My
> intuition was that a protocol version change indicates something that we
> eventually want all client libraries to support. Whereas protocol
> extensions are truly opt-in.
>
> For example, if we didn't have the ParameterStatus message and someone
> wanted to add it, then this ought to be a protocol version change, so
> that we eventually get everyone to adopt it.
>
> But, for example, the column encryption feature is not something I'd
> expect all client libraries to implement, so it can be opt-in.

I think that is a good way of deciding between version bump vs
protocol extension parameter. But I'd like to make one
clarification/addition to this logic, if the protocol feature already
requires opt-in from the client because of how the feature works, then
there's no need for a protocol extension parameter. e.g. if you're
column-encryption feature would require the client to send a
ColumnDecrypt message before the server would exhibit any behaviour
relating to the column-encryption feature, then the client can simply
"support" the feature by never sending the ColumnDecrypt message.
Thus, in such cases a protocol extension parameter would not be
necessary, even if the feature is considered opt-in.

> (I believe we have added a number of new protocol messages since the
> beginning of the 3.0 protocol, without any version change, so "new
> protocol message" might not always be a good example to begin with.)

Personally, I feel like this is something we should change. IMHO, we
should get to a state where protocol minor version bumps are so
low-risk that we can do them whenever we add message types. Right now
there are basically multiple versions of the 3.0 protocol, which is
very confusing to anyone implementing it. Different servers
implementing the 3.0 protocol without the NegotiateVersion message is
a good example of that.

Totally agree.

> I fear that if we prefer protocol extensions over version increases,
> then we'd get a very fragmented landscape of different client libraries
> supporting different combinations of things.

Dave

+1

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

05 апреля, 15:55:41

On Thu, Apr 4, 2024 at 8:50 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> It appears there are several different perspectives about this.  My
> intuition was that a protocol version change indicates something that we
> eventually want all client libraries to support.  Whereas protocol
> extensions are truly opt-in.

Hmm. This doesn't seem like a bad way to make a distinction to me,
except that I fear it would be mushy in practice. For example:

> For example, if we didn't have the ParameterStatus message and someone
> wanted to add it, then this ought to be a protocol version change, so
> that we eventually get everyone to adopt it.
>
> But, for example, the column encryption feature is not something I'd
> expect all client libraries to implement, so it can be opt-in.

I agree that column encryption might not necessarily be supported by
all client libraries, but equally, the ParameterStatus message is just
for the benefit of the client. A client that doesn't care about the
contents of such a message is free to ignore it, and would be better
off if it weren't sent in the first place; it's just extra bytes on
the wire that aren't needed for anything. So why would we want to
force everyone to adopt that, if it didn't exist already?

I also wonder how the protocol negotiation for column encryption is
actually going to work. What are the actual wire protocol changes that
are needed? What does the server need to know from the client, or the
client from the server, about what is supported?

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

05 апреля, 17:02:08

On Thu, Apr 4, 2024 at 12:45 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> Yeah, we definitely think differently here then. To me bumping the
> minor protocol version shouldn't be a thing that we would need to
> carefully consider. It should be easy to do, and probably done often.

Often?

I kind of hope that the protocol starts to evolve a bit more than it
has, but I don't want a continuous stream of changes. That will be
very hard to test and verify correctness, and a hassle for drivers to
keep up with, and a mess for compatibility.

> So, what approach of checking feature support are you envisioning
> instead? A function for every feature?
> Something like SupportsSetProtocolParameter, that returns an error
> message if it's not supported and NULL otherwise. And then add such a
> support function for every feature?

Yeah, something like that.

> I think I might agree with you that it would be nice for features that
> depend on a protocol extension parameter, indeed because in the future
> we might make them required and they don't have to update their logic
> then. But for features only depending on the protocol version I
> honestly don't see the point. A protocol version check is always going
> to continue working.

Sure, if we introduce it based on the protocol version then we don't
need anything else.

> I'm also not sure why you're saying a user is not entitled to this
> information. We already provide users of libpq a way to see the full
> Postgres version, and the major protocol version. I think allowing a
> user to access this information is only a good thing. But I agree that
> providing easy to use feature support functions is a better user
> experience in some cases.

I guess that's a fair point. But I'm worried that if we expose too
much of the internals, we won't be able to change things later.

> > But this is another example of a problem that can *easily* be fixed
> > without using up the entirety of the _pq_ namespace. You can introduce
> > _pq_.dont_error_on_unknown_gucs=1 or
> > _pq_.dont_error_on_these_gucs='foo, bar, baz'. The distinction between
> > the startup packet containing whizzbang=frob and instead containing
> > _pq_.whizzbang=frob shouldn't be just whether an error is thrown if we
> > don't know anything about whizzbang.
>
> Nice idea, but that sadly won't work well in practice. Because old PG
> versions don't know about _pq_.dont_error_on_unknown_gucs. So that
> would mean we'd have to wait another 5 years (and probably more) until
> we could set non-_pq_-prefixed GUCs safely in the startup message,
> using this approach.

Hmm. I guess that is a problem.

> Two side-notes:
> 1. I realized there is a second benefit to using _pq_ for all GUCs
> that change the protocol level that I didn't mention in my previous
> email: It allows clients to assume that _pq_ prefixed GUCs will change
> the wire-protocol and that they should not allow a user of the client
> to set them willy-nilly. I'll go into this benefit more in the rest of
> this email.
> 2. To clarify: I'm suggesting that a startup packet containing
> _pq_.whizzbang would actually set the _pq_.whizzbang GUC, and not the
> whizzbang GUC. i.e. the _pq_ prefix is not stripped-off when parsing
> the startup packet.

I really intended the _pq_ prefix as a way of taking something out of
the GUC namespace, not as a part of the GUC namespace that users would
see. And I'm reluctant to go back on that. If we want to make
pg_protocol.${NAME} mean a wire protocol parameter, well maybe there's
something to that idea [insert caveats here]. But doesn't _pq_ look
like something that was intended to be internal? That's certainly how
I intended it.

> > I want to be able to know the state of my protocol
> > parameters by calling libpq functions that answer my questions
> > definitively based on libpq's own internal state. libpq itself *must*
> > know what compression I'm using on my connection; the server's answer
> > may be different.
>
> I think that totally makes sense that libpq should be able to answer
> those questions without contacting the server, and indeed some
> introspection should be added for that. But being able to introspect
> what the server thinks the setting is seems quite useful too. That
> still doesn't solve the problem of poolers though. How about
> introducing a new ParameterGet message type too (matching the proposed
> ParameterSet), so that poolers can easily parse and intercept that
> message type.

Wouldn't libpq already know what value it last set? Or is this needed
because it doesn't know what the other side's default is?

> 2. By using PGC_PROTOCOL to indicate that those GUCs can only be
> changed using ParameterSet

Hmm, OK. I guess if the PGC_PROTOCOL flag makes it so that the GUC can
only be set using ParameterSet, and it also makes them
non-transactional, then it's fine. So to be clear, I can't set these
in postgresql.conf, or postgresql.auto.conf, or via ALTER $ANYTHING,
or via SET, or in any other way than by sending ParameterStatus
messages. And when I send a ParameterStatus message, it doesn't matter
if I'm in a good transaction, an aborted transaction, or no
transaction at all, and the setting change takes effect regardless of
that and regardless of any subsequent rollbacks. Is that right?

I feel like maybe it's not, because you seem to be thinking that you'd
also set these in the startup packet, at least...

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

05 апреля, 17:04:19

On Thu, Apr 4, 2024 at 1:10 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> Attached is a rebased patchset

We should keep talking about this, but I think we're far too close to
the wire at this point to think about committing anything for v17 at
this point. These are big changes, they haven't been thoroughly
reviewed by anyone AFAICT, and we don't have consensus on what we
ought to be doing. I know that's probably not what you want to hear,
but realistically, I think that's the only reasonable decision at this
point.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

05 апреля, 18:28:03

On Fri, 5 Apr 2024 at 16:04, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Apr 4, 2024 at 1:10 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> > Attached is a rebased patchset
>
> We should keep talking about this, but I think we're far too close to
> the wire at this point to think about committing anything for v17 at
> this point. These are big changes, they haven't been thoroughly
> reviewed by anyone AFAICT, and we don't have consensus on what we
> ought to be doing. I know that's probably not what you want to hear,
> but realistically, I think that's the only reasonable decision at this
> point.

Agreed on not considering this for PG17 nor this commitfest anymore. I
changed the commit fest entry accordingly.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

05 апреля, 19:09:29

On Fri, 5 Apr 2024 at 16:02, Robert Haas <robertmhaas@gmail.com> wrote:
> Often?
>
> I kind of hope that the protocol starts to evolve a bit more than it
> has, but I don't want a continuous stream of changes. That will be
> very hard to test and verify correctness, and a hassle for drivers to
> keep up with, and a mess for compatibility.

I definitely think protocol changes require a lot more scrutiny than
many other things, given their hard/impossible to change nature.

But I do think that we shouldn't be at all averse to the act of
bumping the protocol version itself. If we have a single small
protocol change in one release, then imho it's no problem to bump the
protocol version. Bumping the version should be painless. So we
shouldn't be inclined to push an already agreed upon protocol change
to the next release, because there are some more protocol changes in
the pipeline that won't make it in the current one.

I don't think this would be any harder for drivers to keep up with,
then if we'd bulk all changes together. If driver developers only want
to support changes in bulk changes, they can simply choose not to
support 3.1 at all if they want and wait until 3.2 to support
everything in bulk then.

> > So, what approach of checking feature support are you envisioning
> > instead? A function for every feature?
> > Something like SupportsSetProtocolParameter, that returns an error
> > message if it's not supported and NULL otherwise. And then add such a
> > support function for every feature?
>
> Yeah, something like that.
> ...
>
> > I'm also not sure why you're saying a user is not entitled to this
> > information. We already provide users of libpq a way to see the full
> > Postgres version, and the major protocol version. I think allowing a
> > user to access this information is only a good thing. But I agree that
> > providing easy to use feature support functions is a better user
> > experience in some cases.
>
> I guess that's a fair point. But I'm worried that if we expose too
> much of the internals, we won't be able to change things later.

I'll take a look at redesigning the protocol parameter stuff. To work
with dedicated functions instead.

> I really intended the _pq_ prefix as a way of taking something out of
> the GUC namespace, not as a part of the GUC namespace that users would
> see. And I'm reluctant to go back on that. If we want to make
> pg_protocol.${NAME} mean a wire protocol parameter, well maybe there's
> something to that idea [insert caveats here]. But doesn't _pq_ look
> like something that was intended to be internal? That's certainly how
> I intended it.

I agree that _pq_ does look internal and doesn't clearly indicate that
it's a protocol related change. But sadly the _pq_ prefix is the only
one that doesn't error in startup packets, waiting another 5 years
until pg_protocol is allowed in the startup packet doesn't seem like a
reasonable solution either.

How about naming the GUCs pg_protocol.${NAME}, but still requiring the
_pq_ prefix in the StartupPacket. That way only client libraries would
have to see this internal prefix and they could remap it someway. I
see two options for that:
1. At the server replace the _pq_ prefix with pg_protocol. So
_pq_.${NAME} would map to pg_protocol.${name}
2. At the server replace the _pq_.pg_protocol prefix with pg_protocol.
So _pq_.pg_protocol.${NAME} would map to pg_protocol.${name}.

I guess you prefer option 2, because that would still leave lots of
space to do something with the rest of the _pq_ space, i.e.
_pq_.magic_pixie_dust can still be used for something different than a
GUC.

Bikeshedding: I think I prefer protocol.${NAME} over
pg_protocol.${NAME}, it's shorter and it seems obvious that protocol
is the postgres protocol in this context.

This should be a fairly simple change to make.

> Wouldn't libpq already know what value it last set? Or is this needed
> because it doesn't know what the other side's default is?

libpq could/should indeed know this, but for debugging/testing
purposes it is quite useful to have a facility to read the server side
value. I think defaults should always be whatever was happening if the
parameter wasn't specified before, so knowing the server default is
not something the client needs to worry about (i.e. the default is
defined as part of the protocol spec).

> Hmm, OK. I guess if the PGC_PROTOCOL flag makes it so that the GUC can
> only be set using ParameterSet, and it also makes them
> non-transactional, then it's fine. So to be clear, I can't set these
> in postgresql.conf, or postgresql.auto.conf, or via ALTER $ANYTHING,
> or via SET, or in any other way than by sending ParameterStatus
> messages. And when I send a ParameterStatus message, it doesn't matter
> if I'm in a good transaction, an aborted transaction, or no
> transaction at all, and the setting change takes effect regardless of
> that and regardless of any subsequent rollbacks. Is that right?
>
> I feel like maybe it's not, because you seem to be thinking that you'd
> also set these in the startup packet, at least...

Setting PGC_PROTOCOL gucs would be allowed in the startup packet,
which is fine afaict because that's also something that's part of the
protocol level and is thus fully controlled by client libraries and
poolers) But other than that: Yes, conf files, ALTER, and SET cannot
change these GUCs.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Dave Cramer

Дата:

05 апреля, 19:30:01

On Fri, 5 Apr 2024 at 12:09, Jelte Fennema-Nio <me@jeltef.nl> wrote:

On Fri, 5 Apr 2024 at 16:02, Robert Haas <robertmhaas@gmail.com> wrote:
> Often?
>
> I kind of hope that the protocol starts to evolve a bit more than it
> has, but I don't want a continuous stream of changes. That will be
> very hard to test and verify correctness, and a hassle for drivers to
> keep up with, and a mess for compatibility.

I definitely think protocol changes require a lot more scrutiny than
many other things, given their hard/impossible to change nature.

But I do think that we shouldn't be at all averse to the act of
bumping the protocol version itself. If we have a single small
protocol change in one release, then imho it's no problem to bump the
protocol version. Bumping the version should be painless. So we
shouldn't be inclined to push an already agreed upon protocol change
to the next release, because there are some more protocol changes in
the pipeline that won't make it in the current one.

I don't think this would be any harder for drivers to keep up with,
then if we'd bulk all changes together. If driver developers only want
to support changes in bulk changes, they can simply choose not to
support 3.1 at all if they want and wait until 3.2 to support
everything in bulk then.

> > So, what approach of checking feature support are you envisioning
> > instead? A function for every feature?
> > Something like SupportsSetProtocolParameter, that returns an error
> > message if it's not supported and NULL otherwise. And then add such a
> > support function for every feature?
>
> Yeah, something like that.
> ...
>
> > I'm also not sure why you're saying a user is not entitled to this
> > information. We already provide users of libpq a way to see the full
> > Postgres version, and the major protocol version. I think allowing a
> > user to access this information is only a good thing. But I agree that
> > providing easy to use feature support functions is a better user
> > experience in some cases.
>
> I guess that's a fair point. But I'm worried that if we expose too
> much of the internals, we won't be able to change things later.

I'll take a look at redesigning the protocol parameter stuff. To work
with dedicated functions instead.

> I really intended the _pq_ prefix as a way of taking something out of
> the GUC namespace, not as a part of the GUC namespace that users would
> see. And I'm reluctant to go back on that. If we want to make
> pg_protocol.${NAME} mean a wire protocol parameter, well maybe there's
> something to that idea [insert caveats here]. But doesn't _pq_ look
> like something that was intended to be internal? That's certainly how
> I intended it.

Is this actually used in practice? If so, how ?

I agree that _pq_ does look internal and doesn't clearly indicate that
it's a protocol related change. But sadly the _pq_ prefix is the only
one that doesn't error in startup packets, waiting another 5 years
until pg_protocol is allowed in the startup packet doesn't seem like a
reasonable solution either.

How about naming the GUCs pg_protocol.${NAME}, but still requiring the
_pq_ prefix in the StartupPacket. That way only client libraries would
have to see this internal prefix and they could remap it someway. I
see two options for that:
1. At the server replace the _pq_ prefix with pg_protocol. So
_pq_.${NAME} would map to pg_protocol.${name}
2. At the server replace the _pq_.pg_protocol prefix with pg_protocol.
So _pq_.pg_protocol.${NAME} would map to pg_protocol.${name}.

I guess you prefer option 2, because that would still leave lots of
space to do something with the rest of the _pq_ space, i.e.
_pq_.magic_pixie_dust can still be used for something different than a
GUC.

Bikeshedding: I think I prefer protocol.${NAME} over
pg_protocol.${NAME}, it's shorter and it seems obvious that protocol
is the postgres protocol in this context.

This should be a fairly simple change to make.

> Wouldn't libpq already know what value it last set? Or is this needed
> because it doesn't know what the other side's default is?

libpq could/should indeed know this, but for debugging/testing
purposes it is quite useful to have a facility to read the server side
value. I think defaults should always be whatever was happening if the
parameter wasn't specified before, so knowing the server default is
not something the client needs to worry about (i.e. the default is
defined as part of the protocol spec).

> Hmm, OK. I guess if the PGC_PROTOCOL flag makes it so that the GUC can
> only be set using ParameterSet, and it also makes them
> non-transactional, then it's fine. So to be clear, I can't set these
> in postgresql.conf, or postgresql.auto.conf, or via ALTER $ANYTHING,
> or via SET, or in any other way than by sending ParameterStatus
> messages. And when I send a ParameterStatus message, it doesn't matter
> if I'm in a good transaction, an aborted transaction, or no
> transaction at all, and the setting change takes effect regardless of
> that and regardless of any subsequent rollbacks. Is that right?
>
> I feel like maybe it's not, because you seem to be thinking that you'd
> also set these in the startup packet, at least...

Setting PGC_PROTOCOL gucs would be allowed in the startup packet,
which is fine afaict because that's also something that's part of the
protocol level and is thus fully controlled by client libraries and
poolers) But other than that: Yes, conf files, ALTER, and SET cannot
change these GUCs.

Dave

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Tom Lane

Дата:

05 апреля, 19:43:36

Dave Cramer <davecramer@gmail.com> writes:
> On Fri, 5 Apr 2024 at 12:09, Jelte Fennema-Nio <me@jeltef.nl> wrote:
>> Setting PGC_PROTOCOL gucs would be allowed in the startup packet,
>> which is fine afaict because that's also something that's part of the
>> protocol level and is thus fully controlled by client libraries and
>> poolers) But other than that: Yes, conf files, ALTER, and SET cannot
>> change these GUCs.

> +1

I don't buy that argument, actually.  libpq, and pretty much every
other client AFAIK, has provisions to let higher code levels insert
random options into the startup packet.  So to make this work libpq
would have to filter or at least inspect such options, which is
logic that doesn't exist and doesn't seem nice to need.

The other problem with adding these things in the startup packet
is that when you send that packet, you don't know what the server
version is and hence don't know if it will take these options.

What's so bad about insisting that these options must be sent in a
separate message?

            regards, tom lane

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

05 апреля, 19:48:14

On Fri, Apr 5, 2024 at 12:09 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> But I do think that we shouldn't be at all averse to the act of
> bumping the protocol version itself. If we have a single small
> protocol change in one release, then imho it's no problem to bump the
> protocol version. Bumping the version should be painless. So we
> shouldn't be inclined to push an already agreed upon protocol change
> to the next release, because there are some more protocol changes in
> the pipeline that won't make it in the current one.

I think I half-agree with this. Let's say we all agree that the world
will end unless we make wire protocol changes A and B, and for
whatever reason we also agree that these changes should be handled via
a protocol version bump rather than any other method, but only the
patch for A is sufficiently stable by the end of the release cycle.
Then we commit A and bump the protocol version and the next release we
do the same for B, hopefully before the world ends. In this sense this
is just like CATALOG_VERSION_NO or XLOG_PAGE_MAGIC. We don't postpone
commits because they'd require bumping those values; we just bump the
values when it's necessary.

But on the other hand, I find it a bit hard to believe that the
statement "Bumping the version should be painless" will ever
correspond to reality. Since we've not done a hard wire protocol break
in a very long time, I assume that we would not want to start now. But
that also means that when the PG version with protocol version 3.5
comes out, that server is going to have to be compatible with 3.4,
3.3, 3.2, 3.1, and 3.0. How will we test that it really is? We could
test against libpqs from older server versions, but that quickly
becomes awkward, because it means that there won't be tests that run
as part of a regular build, but it'll have to all be done in the
buildfarm or CI with something like the cross-version upgrade tests we
already have. Maybe we'd be better off adding a libpq connection
option that forces the use of a specific minor protocol version, but
then we'll need backward-compatibility code in libpq basically
forever. But maybe we need that anyway to avoid older and newer
servers being unable to communicate.

Plus, you've got all of the consequences for non-core drivers, which
have to both add support for the new wire protocol - if they don't
want to seem outdated and eventually obsolete - and also test that
they're still compatible with all supported server versions.
Connection poolers have the same set of problems. The whole thing is
almost a hole with no bottom. Keeping up with core changes in this
area could become a massive undertaking for lots and lots of people,
some of whom may be the sole maintainer of some important driver that
now needs a whole bunch of work.

I'm not sure how much it improves things if we imagine adding feature
flags to the existing protocol versions, rather than whole new
protocol versions, but at least it cuts down on the assumption that
adopting new features is mandatory, and that such features are
cumulative. If a driver wants to support TDE but not protocol
parameters or protocol parameters but not TDE, who are we to say no?
If in supporting those things we bump the protocol version to 3.2, and
then 3.3 fixes a huge performance problem, are drivers going to be
required to add support for features they don't care about to get the
performance fixes? I see some benefit in bumping the protocol version
for major changes, or for changes that we have an important reason to
make mandatory, or to make previously-optional features for which
support has become in practical terms universal part of the base
feature set. But I'm very skeptical of the idea that we should just
handle as many things as possible via a protocol version bump. We've
been avoiding protocol version bumps like the plague since forever,
and swinging all the way to the other extreme doesn't sound like the
right idea to me.

> How about naming the GUCs pg_protocol.${NAME}, but still requiring the
> _pq_ prefix in the StartupPacket. That way only client libraries would
> have to see this internal prefix and they could remap it someway. I
> see two options for that:
> 1. At the server replace the _pq_ prefix with pg_protocol. So
> _pq_.${NAME} would map to pg_protocol.${name}
> 2. At the server replace the _pq_.pg_protocol prefix with pg_protocol.
> So _pq_.pg_protocol.${NAME} would map to pg_protocol.${name}.
>
> I guess you prefer option 2, because that would still leave lots of
> space to do something with the rest of the _pq_ space, i.e.
> _pq_.magic_pixie_dust can still be used for something different than a
> GUC.

I'm not sure what I think about this. Do we need these new GUCs to be
both PGC_PROTOCOL *and also* live in a separate namespace? I see the
need for the former pretty clearly: if these kinds of things are to be
part of the GUC system (which wasn't my initial bias, but whatever)
then they need to have some important behavioral differences from
other GUCs and so we need a flag to signal that. But what problem are
we solving by also giving them special-looking names, and are we sure
we wouldn't rather solve that problem some other way?

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

05 апреля, 19:48:38

On Fri, 5 Apr 2024 at 18:30, Dave Cramer <davecramer@gmail.com> wrote:
>> > I really intended the _pq_ prefix as a way of taking something out of
>> > the GUC namespace, not as a part of the GUC namespace that users would
>> > see. And I'm reluctant to go back on that. If we want to make
>> > pg_protocol.${NAME} mean a wire protocol parameter, well maybe there's
>> > something to that idea [insert caveats here]. But doesn't _pq_ look
>> > like something that was intended to be internal? That's certainly how
>> > I intended it.
>
>
> Is this actually used in practice? If so, how ?

No, it's not used for anything at the moment. This whole thread is
basically about trying to agree on how we want to make protocol
changes in the future in a somewhat standardized way. But using the
tools available that we have to not break connecting to old postgres
servers: ProtocolVersionNegotation messages, minor version numbers,
and _pq_ parameters in the startup message. All of those have so far
been completely theoretical and have not appeared in any client-server
communication.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

05 апреля, 19:56:02

On Fri, 5 Apr 2024 at 18:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't buy that argument, actually.  libpq, and pretty much every
> other client AFAIK, has provisions to let higher code levels insert
> random options into the startup packet.  So to make this work libpq
> would have to filter or at least inspect such options, which is
> logic that doesn't exist and doesn't seem nice to need.

libpq actually doesn't support doing this (only by putting them in the
"options" parameter, but _pq_ parameters would not be allowed there),
but indeed many other clients do allow this and indeed likely don't
have logic to filter/disallow _pq_ prefixed parameters.

This seems very easy to address though: Only parse _pq_ options when
protocol version 3.1 is requested by the client, and otherwise always
report them as "not supported". Then clients upgrading to 3.1, they
should filter/disallow _pq_ parameters to be arbitrarily set. I don't
think that's hard/not nice to add, it's literally a prefix check for
the "_pq_." string.

> The other problem with adding these things in the startup packet
> is that when you send that packet, you don't know what the server
> version is and hence don't know if it will take these options.

(imho) the whole point of the _pq_ options is that they don't trigger
an error when they are requested by the client, but not supported by
the server. So I don't understand your problem here.

> What's so bad about insisting that these options must be sent in a
> separate message?

To not require an additional roundtrip waiting for the server to respond.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Dave Cramer

Дата:

05 апреля, 21:06:50

Plus, you've got all of the consequences for non-core drivers, which
have to both add support for the new wire protocol - if they don't
want to seem outdated and eventually obsolete - and also test that
they're still compatible with all supported server versions.
Connection poolers have the same set of problems. The whole thing is
almost a hole with no bottom. Keeping up with core changes in this
area could become a massive undertaking for lots and lots of people,
some of whom may be the sole maintainer of some important driver that
now needs a whole bunch of work.

We already have this in many places. Headers or functions change and extensions have to fix their code.

catalog changes force drivers to change their code.

This argument blocks any improvement to the protocol. I don't think it's unreasonable to expect maintainers to keep up.

We could make it easier by having a specific list for maintainers, but that doesn't change the work.

I'm not sure how much it improves things if we imagine adding feature
flags to the existing protocol versions, rather than whole new
protocol versions, but at least it cuts down on the assumption that
adopting new features is mandatory, and that such features are
cumulative. If a driver wants to support TDE but not protocol
parameters or protocol parameters but not TDE, who are we to say no?
If in supporting those things we bump the protocol version to 3.2, and
then 3.3 fixes a huge performance problem, are drivers going to be
required to add support for features they don't care about to get the
performance fixes? I see some benefit in bumping the protocol version
for major changes, or for changes that we have an important reason to
make mandatory, or to make previously-optional features for which
support has become in practical terms universal part of the base
feature set. But I'm very skeptical of the idea that we should just
handle as many things as possible via a protocol version bump. We've
been avoiding protocol version bumps like the plague since forever,
and swinging all the way to the other extreme doesn't sound like the
right idea to me.

+1 for not swinging too far here. But I don't think it should be a non starter.

Dave

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

07 апреля, 01:14:03

On Fri, 5 Apr 2024 at 18:48, Robert Haas <robertmhaas@gmail.com> wrote:
> Maybe we'd be better off adding a libpq connection
> option that forces the use of a specific minor protocol version, but
> then we'll need backward-compatibility code in libpq basically
> forever. But maybe we need that anyway to avoid older and newer
> servers being unable to communicate.

I think this would be good because it makes testing easy and, like you
said, I think we'll need this backward-compatibility code in libpq
anyway to be able to connect to old servers. To have even better and
more realistic test coverage though, I think we might also want to
actually test new libpq against old postgres servers and vice-versa in
a build farm animal though.

> Plus, you've got all of the consequences for non-core drivers, which
> have to both add support for the new wire protocol - if they don't
> want to seem outdated and eventually obsolete - and also test that
> they're still compatible with all supported server versions.

I think for clients/drivers, the work would generally be pretty
minimal. For almost all proposed changes, clients can "support" the
protocol version update by simply not using the new features, e.g. a
client can "support" the ParameterSet feature, by simply never sending
the ParameterSet message. So binding it to a protocol version bump
doesn't make it any harder for that client to support that protocol
version. I'm not saying that is the case for all protocol changes, but
based on what's being proposed so far that's definitely a very common
theme. Overall, I think this is something to discuss for each protocol
change in isolation: i.e. how to make supporting the new feature as
painless as possible for clients/drivers.

> Connection poolers have the same set of problems.

For connection poolers this is indeed a bigger hassle, because they at
least need to be able to handle all the new message types that a
client can send and maybe do something special for them. But I think
if we're careful to keep connection poolers in mind when designing the
features themselves then I think this isn't necessarily a problem. And
probably indeed for the features that we think are hard for connection
poolers to implement, we should be using protocol extension parameter
feature flags. But I think a lot of protocol would be fairly trivial
for a connection pooler to support.

> The whole thing is
> almost a hole with no bottom. Keeping up with core changes in this
> area could become a massive undertaking for lots and lots of people,
> some of whom may be the sole maintainer of some important driver that
> now needs a whole bunch of work.

I agree with Dave here, if you want to benefit from new features
there's some expectation to keep up with the changes. But to be clear,
we'd still support old protocol versions too. So we wouldn't break
connecting using those clients, they simply wouldn't benefit from some
of the new features. I think that's acceptable.

> I'm not sure how much it improves things if we imagine adding feature
> flags to the existing protocol versions, rather than whole new
> protocol versions, but at least it cuts down on the assumption that
> adopting new features is mandatory, and that such features are
> cumulative. If a driver wants to support TDE but not protocol
> parameters or protocol parameters but not TDE, who are we to say no?
> If in supporting those things we bump the protocol version to 3.2, and
> then 3.3 fixes a huge performance problem, are drivers going to be
> required to add support for features they don't care about to get the
> performance fixes?

I think there's an important trade-off here. On one side we don't want
to make maintainers of clients/poolers do lots of work to support
features they don't care about. And on the other side it seems quite
useful to limit the amount of feature combinations that are used it
the wild (both for users and for us) e.g. the combinations of
backwards compatibility testing you were talking about would explode
if every protocol change was a feature flag. I think this trade-off is
something we should be deciding on based on the specific protocol
change. But if work needed to "support" the feature is "minimal"
(to-be-defined exactly what we consider minimal), I think making it
part of a protocol version bump is reasonable.

> I see some benefit in bumping the protocol version
> for major changes, or for changes that we have an important reason to
> make mandatory, or to make previously-optional features for which
> support has become in practical terms universal part of the base
> feature set. But I'm very skeptical of the idea that we should just
> handle as many things as possible via a protocol version bump. We've
> been avoiding protocol version bumps like the plague since forever,
> and swinging all the way to the other extreme doesn't sound like the
> right idea to me.

I think there's two parts to a protocol version bump:
1. The changes that cause us to consider a protocol bump
2. The actual act of bumping the protocol version

I think 1 is a thing we should be careful about every time (especially
regarding impact on clients/poolers). But 2 shouldn't be something
that we should consider dangerous/scary. I think that every change
that we make to the protocol (no matter how minor or backwards
compatible it is), should be accompanied with a protocol version bump.
This isn't what has happened in the past, and it makes it quite hard
to understand what "supporting" a specific protocol version actually
means. e.g. PgBouncer currently supports protocol 3.0, but doesn't
support the NegotiateProtocolVersion message (I'm working on fixing
that).

To take it to the extreme: I think we should get to a state, where if
we bump the protocol version at the client and server side without
actually making any protocol changes, everything should continue to
work fine. If we'd do that right now, then libpq wouldn't be able to
connect to old postgres versions anymore.

> I'm not sure what I think about this. Do we need these new GUCs to be
> both PGC_PROTOCOL *and also* live in a separate namespace? I see the
> need for the former pretty clearly: if these kinds of things are to be
> part of the GUC system (which wasn't my initial bias, but whatever)
> then they need to have some important behavioral differences from
> other GUCs and so we need a flag to signal that. But what problem are
> we solving by also giving them special-looking names, and are we sure
> we wouldn't rather solve that problem some other way?

Clients might want to allow the user of the client to change regular
parameters using ParameterSet (e.g. so that a connection pooler can
intercept those ParameterSet messages and change its own behaviour if
the parameter name is pgbouncer.pool_mode). But they wouldn't want a
user to set any parameters that change the wire-protocol this way. And
because an old client might connect to a new server a simple
hard-coded list of parameters at the client side is not sufficient.

I can see two ways around this:
1. Using a well-known prefix or namespace for parameters that change
the wire protocol. (exact prefix to be bikeshedded on)
2. Using a hard-coded list at the client AND disallow changing
PGC_PROTOCOL parameters at the server if the negotiated protocol
version is lower than the version this parameter was introduced in AND
bump the protocol version whenever we add a new PGC_PROTOCOL
parameter.

I think 1 is easier to implement at the client side, as it only
requires a prefix comparison instead of keeping track of a list.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Peter Eisentraut

Дата:

10 апреля, 13:35:52

On 05.04.24 14:55, Robert Haas wrote:
> I also wonder how the protocol negotiation for column encryption is
> actually going to work. What are the actual wire protocol changes that
> are needed? What does the server need to know from the client, or the
> client from the server, about what is supported?

I have just posted an updated patch for that: [0]

The protocol changes can be inspected in the diffs for

doc/src/sgml/protocol.sgml
src/backend/access/common/printtup.c
src/interfaces/libpq/fe-protocol3.c

There are various changes, including new messages, additional fields in 
existing messages, and some more flag bits in existing fields.

It all works, so I don't have any requests or anything in this thread, 
but it would be good to get some feedback if I'm using this wrong. 
AFAICT, that patch was the first public one that ever tried to make use 
of the protocol extension facility, so I was mainly guessing about the 
intended way to use this.

[0]: 
https://www.postgresql.org/message-id/f63fe170-cef2-4914-be00-ef9222456505%40eisentraut.org

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

15 апреля, 20:43:28

On Sat, Apr 6, 2024 at 6:14 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> I think for clients/drivers, the work would generally be pretty
> minimal. For almost all proposed changes, clients can "support" the
> protocol version update by simply not using the new features, ...

I mean, I agree if a particular protocol version bump does nothing
other than signal the presence of some optional, ignorable feature,
then it doesn't cause a problem if we force clients to support it. But
that seems a bit like saying that eating wild mushrooms is fine
because some of them aren't poisonous. The point is that if we roll
out two protocol changes, A and B, each of which requires the client
to make some change in order to work with the newer protocol version,
then using version numbers as the gating mechanism requires that the
client can't support the newer of those two changes without also
supporting the older one. Using feature flags doesn't impose that
constraint, which I think is a plus.

> I think there's an important trade-off here. On one side we don't want
> to make maintainers of clients/poolers do lots of work to support
> features they don't care about. And on the other side it seems quite
> useful to limit the amount of feature combinations that are used it
> the wild (both for users and for us) e.g. the combinations of
> backwards compatibility testing you were talking about would explode
> if every protocol change was a feature flag.

This is a good point, and I agree. It's a little hard to discuss this
in the abstract. I wouldn't want to have feature flags that say:

1. I support rows containing an extra-large number of bytes.
2. I support rows containing an extra-large number of columns.
3. I support queries returning an extra-large number of rows.

It is quite likely that there might be bugs that only manifest with
particular combinations of those flags; knowing that someone who
supports (3) must also support (1) and (2) seems like it would make
everyone's life easier. But on the other hand, I wouldn't mind having
feature flags that say:

1. I support transparent column encryption.
2. I support letting a connection pooler lock down the session role in
a way that can't be reversed from SQL.
3. I support compression.

It's still not theoretically impossible that those features could
interact with each other in some way, but it seems a lot less likely.
Here, I think a driver ought to be able to choose any subset of these
things and support only the ones they care about, without having to
worry about the server deciding to do something for which the driver
is unprepared. Also, in a case like this, if there are bugs that only
occur in certain combinations, we have to test for and fix those
anyway, because even if a driver supports all of those features,
they're not all going to be used for every connection.

> I think there's two parts to a protocol version bump:
> 1. TI ahe changes that cause us to consider a protocol bump
> 2. The actual act of bumping the protocol version
>
> I think 1 is a thing we should be careful about every time (especially
> regarding impact on clients/poolers). But 2 shouldn't be something
> that we should consider dangerous/scary. I think that every change
> that we make to the protocol (no matter how minor or backwards
> compatible it is), should be accompanied with a protocol version bump.
> This isn't what has happened in the past, and it makes it quite hard
> to understand what "supporting" a specific protocol version actually
> means. e.g. PgBouncer currently supports protocol 3.0, but doesn't
> support the NegotiateProtocolVersion message (I'm working on fixing
> that).

I'm generally not a fan of giving things version numbers and then not
changing the version numbers when you change the thing, but I find
myself reluctant to apply that principle to this case. I think it's
bad that we keep adding functions to libpq and sometimes changing the
behavior of existing functions and never, ever bumping the libpq .so
version. I've seen that cause real, practical problems. It means for
example that you can't make an RPM depend on libpq.so.5 and expect
that to do anything meaningful -- every version going back forever is
version 5, even if it doesn't contain the functions (or other behavior
changes) that are needed for some program compiled against a newer
version of PostgreSQL to work.

But the wire protocol changes very slowly, and I think that is a
difference that actually matters quite a bit here. Broadly speaking, I
can use a psq

> To take it to the extreme: I think we should get to a state, where if
> we bump the protocol version at the client and server side without
> actually making any protocol changes, everything should continue to
> work fine. If we'd do that right now, then libpq wouldn't be able to
> connect to old postgres versions anymore.

I think I agree with this, but it seems like a bootstrapping problem
and nothing more.

>
> > I'm not sure what I think about this. Do we need these new GUCs to be
> > both PGC_PROTOCOL *and also* live in a separate namespace? I see the
> > need for the former pretty clearly: if these kinds of things are to be
> > part of the GUC system (which wasn't my initial bias, but whatever)
> > then they need to have some important behavioral differences from
> > other GUCs and so we need a flag to signal that. But what problem are
> > we solving by also giving them special-looking names, and are we sure
> > we wouldn't rather solve that problem some other way?
>
> Clients might want to allow the user of the client to change regular
> parameters using ParameterSet (e.g. so that a connection pooler can
> intercept those ParameterSet messages and change its own behaviour if
> the parameter name is pgbouncer.pool_mode). But they wouldn't want a
> user to set any parameters that change the wire-protocol this way. And
> because an old client might connect to a new server a simple
> hard-coded list of parameters at the client side is not sufficient.
>
> I can see two ways around this:
> 1. Using a well-known prefix or namespace for parameters that change
> the wire protocol. (exact prefix to be bikeshedded on)
> 2. Using a hard-coded list at the client AND disallow changing
> PGC_PROTOCOL parameters at the server if the negotiated protocol
> version is lower than the version this parameter was introduced in AND
> bump the protocol version whenever we add a new PGC_PROTOCOL
> parameter.
>
> I think 1 is easier to implement at the client side, as it only
> requires a prefix comparison instead of keeping track of a list.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

15 апреля, 20:52:10

[ Hit send too early, sorry. ]

On Mon, Apr 15, 2024 at 1:43 PM Robert Haas <robertmhaas@gmail.com> wrote:
> But the wire protocol changes very slowly, and I think that is a
> difference that actually matters quite a bit here. Broadly speaking, I
> can use a psq

...a psql that I just built today to talk to a server from many years
ago, and everything is fine. Sure, there are some marginal wire
protocol changes around the edges, but not in places that are going to
really affect the ability of psql to communicate. But I wouldn't try
to run a psql built against 16 with a libpq even one major release
old, and the other direction (older psql, newer libpq) also carries
some (albeit fewer) risks. So the two situations aren't really
entirely comparable, I feel. I don't quite know what to make of that
as a practical matter: surely it can't be right to use protocol
version 3.0 to refer to a bunch of different things. But at the same
time, surely we don't want clients to start panicking and bailing out
when everything would have been fine.

> > To take it to the extreme: I think we should get to a state, where if
> > we bump the protocol version at the client and server side without
> > actually making any protocol changes, everything should continue to
> > work fine. If we'd do that right now, then libpq wouldn't be able to
> > connect to old postgres versions anymore.
>
> I think I agree with this, but it seems like a bootstrapping problem
> and nothing more.

That is, once we figure out how we want backward compatibility to work
in general, I think we'll probably get pretty close to the state you
want here pretty quickly.

> > Clients might want to allow the user of the client to change regular
> > parameters using ParameterSet (e.g. so that a connection pooler can
> > intercept those ParameterSet messages and change its own behaviour if
> > the parameter name is pgbouncer.pool_mode). But they wouldn't want a
> > user to set any parameters that change the wire-protocol this way. And
> > because an old client might connect to a new server a simple
> > hard-coded list of parameters at the client side is not sufficient.
> >
> > I can see two ways around this:
> > 1. Using a well-known prefix or namespace for parameters that change
> > the wire protocol. (exact prefix to be bikeshedded on)
> > 2. Using a hard-coded list at the client AND disallow changing
> > PGC_PROTOCOL parameters at the server if the negotiated protocol
> > version is lower than the version this parameter was introduced in AND
> > bump the protocol version whenever we add a new PGC_PROTOCOL
> > parameter.
> >
> > I think 1 is easier to implement at the client side, as it only
> > requires a prefix comparison instead of keeping track of a list.

I'm unconvinced that we should let ParameterSet change
non-PGC_PROTOCOL GUCs. The pooler can agree on a list of protocol GUCs
with the end client that differs from what the server agreed with the
pooler - e.g., it can add pgbouncer.pool_mode to the list. But for
truly non-protocol GUCs, we have a lot of ways to set those already.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

15 апреля, 22:37:47

On Mon, 15 Apr 2024 at 19:43, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Apr 6, 2024 at 6:14 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> > I think for clients/drivers, the work would generally be pretty
> > minimal. For almost all proposed changes, clients can "support" the
> > protocol version update by simply not using the new features, ...
>
> I mean, I agree if a particular protocol version bump does nothing
> other than signal the presence of some optional, ignorable feature,
> then it doesn't cause a problem if we force clients to support it. But
> that seems a bit like saying that eating wild mushrooms is fine
> because some of them aren't poisonous. The point is that if we roll
> out two protocol changes, A and B, each of which requires the client
> to make some change in order to work with the newer protocol version,
> then using version numbers as the gating mechanism requires that the
> client can't support the newer of those two changes without also
> supporting the older one. Using feature flags doesn't impose that
> constraint, which I think is a plus.

I think we're in agreement here, i.e. it depends on the situation if a
feature flag or version bump is more appropriate. I think the
guidelines could be as follows:
1. For protocol changes that require "extremely minimal" work from
clients & poolers: bump the protocol version.
2. For "niche" features that require some work from clients and/or
poolers: use a protocol parameter feature flag.
3. For anything in between, let's discuss on the thread for that
specific protocol change on the tradeoffs.

On Mon, 15 Apr 2024 at 19:52, Robert Haas <robertmhaas@gmail.com> wrote:
> surely it can't be right to use protocol
> version 3.0 to refer to a bunch of different things. But at the same
> time, surely we don't want clients to start panicking and bailing out
> when everything would have been fine.

I feel like the ProtocolVersionNegotiation should make sure people
don't panic and bail out. And if not, then feature flags won't help
with this either. Because clients would then still bail out if some
feature is not supported.

> I'm unconvinced that we should let ParameterSet change
> non-PGC_PROTOCOL GUCs. The pooler can agree on a list of protocol GUCs
> with the end client that differs from what the server agreed with the
> pooler - e.g., it can add pgbouncer.pool_mode to the list. But for
> truly non-protocol GUCs, we have a lot of ways to set those already.

I feel like you're glossing over something fairly important here. How
exactly would the client know about pgbouncer.pool_mode? Are you
envisioning a list of GUCs which can be changed using ParameterSet,
which the server then sends to the client during connection startup
(using presumably some new protocol message)? If so, then I feel this
same problem still exists. How would the client know which of those
GUCs change wire-protocol behaviour and which don't? It still would
need a hardcoded list (now including pgbouncer.pool_mode and maybe
more) of things that a user is allowed to change using ParameterSet.
So I think a well-known prefix would still be applicable.

To be clear, imho the well-known prefix discussion is separate from
the discussion about whether Postgres should throw an ERROR when
ParameterSet is used to change any non-PGC_PROTOCOL GUC. I'd be fine
with disallowing that if that seems better/safer/clearer to you
(although I'd love to hear your exact concerns about this). But I'd
still want a well-known prefix for protocol parameters. Because that
prefix is not for the benefit of the server, it's for the benefit of
the client and pooler. So the client/pooler can error if any dangerous
GUC is being changed, because the server would accept it and change
the wire-protocol accordingly.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Dave Cramer

Дата:

15 апреля, 22:47:40

On Mon, 15 Apr 2024 at 15:38, Jelte Fennema-Nio <me@jeltef.nl> wrote:

On Mon, 15 Apr 2024 at 19:43, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Apr 6, 2024 at 6:14 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> > I think for clients/drivers, the work would generally be pretty
> > minimal. For almost all proposed changes, clients can "support" the
> > protocol version update by simply not using the new features, ...
>
> I mean, I agree if a particular protocol version bump does nothing
> other than signal the presence of some optional, ignorable feature,
> then it doesn't cause a problem if we force clients to support it. But
> that seems a bit like saying that eating wild mushrooms is fine
> because some of them aren't poisonous. The point is that if we roll
> out two protocol changes, A and B, each of which requires the client
> to make some change in order to work with the newer protocol version,
> then using version numbers as the gating mechanism requires that the
> client can't support the newer of those two changes without also
> supporting the older one. Using feature flags doesn't impose that
> constraint, which I think is a plus.

I think we're in agreement here, i.e. it depends on the situation if a
feature flag or version bump is more appropriate. I think the
guidelines could be as follows:
1. For protocol changes that require "extremely minimal" work from
clients & poolers: bump the protocol version.
2. For "niche" features that require some work from clients and/or
poolers: use a protocol parameter feature flag.
3. For anything in between, let's discuss on the thread for that
specific protocol change on the tradeoffs.

My first thought here is that all of the above is subjective and we will end up discussing all of the above.

No real argument just an observation.

On Mon, 15 Apr 2024 at 19:52, Robert Haas <robertmhaas@gmail.com> wrote:
> surely it can't be right to use protocol
> version 3.0 to refer to a bunch of different things. But at the same
> time, surely we don't want clients to start panicking and bailing out
> when everything would have been fine.

I feel like the ProtocolVersionNegotiation should make sure people
don't panic and bail out. And if not, then feature flags won't help
with this either. Because clients would then still bail out if some
feature is not supported.

I don't think a client should ever bail out. They may not support something but IMO bailing out is not an option.

Dave

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

18 апреля, 22:34:07

On Mon, 15 Apr 2024 at 21:47, Dave Cramer <davecramer@gmail.com> wrote:
>> On Mon, 15 Apr 2024 at 19:52, Robert Haas <robertmhaas@gmail.com> wrote:
>> > surely it can't be right to use protocol
>> > version 3.0 to refer to a bunch of different things. But at the same
>> > time, surely we don't want clients to start panicking and bailing out
>> > when everything would have been fine.
>>
>> I feel like the ProtocolVersionNegotiation should make sure people
>> don't panic and bail out. And if not, then feature flags won't help
>> with this either. Because clients would then still bail out if some
>> feature is not supported.
>
> I don't think a client should ever bail out. They may not support something but IMO bailing out is not an option.

On Thu, 18 Apr 2024 at 21:01, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 18, 2024 at 1:49 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> > IMHO that means that we should also bump the protocol version for this
> > change, because it's changing the wire protocol by adding a new
> > parameter format code. And it does so in a way that does not depend on
> > the new protocol extension.
>
> I think we're more or less covering the same ground we did on the
> other thread here -- in theory I don't love the fact that we never
> bump the protocol version when we change stuff, but in practice if we
> start bumping it every time we do anything I think it's going to just
> break a bunch of stuff without any real benefit.

(the second quoted message comes from Peter his column encryption
thread, but responding here to keep this discussion in one place)

I really don't understand what exactly you're worried about. What do
you expect will break when bumping the protocol version? As Dave said,
clients should never bail out due to protocol version differences.

When the server supports a lower version than the client, the client
should disable certain features because it gets the
ProtocolVersionNegotiation message. This is also true if we don't bump
the version. Negotiating a lower version actually makes it clearer for
the client what features to disable. Using the reported postgres
version for this might not, because a connection pooler in the middle
might not support the features that the client wants and thus throw an
error (e.g. due to the client sending unknown messages) even if the
backing Postgres server would support these features. Not to mention
non-postgresql servers that implement the PostgreSQL protocol (of
which there are more and more).

When the server supports a higher version, the client never even
notices this because the server will silently accept and only enable
the features of the lower version. So this could never cause breakage,
as from the client's perspective the server didn't bump their protocol
version.

So, I don't understand why you seem to view bumping the protocol
version with so much negativity. We're also bumping PG versions every
year. Afaik people only like that, partially because it's immediately
clear that certain features (e.g. MERGE) are not supported when
connecting to older servers. To me the same is true for bumping the
protocol version. There are no downsides to bumping it, only upsides.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

18 апреля, 23:17:07

On Thu, Apr 18, 2024 at 3:34 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> I really don't understand what exactly you're worried about. What do
> you expect will break when bumping the protocol version? As Dave said,
> clients should never bail out due to protocol version differences.

Sure, and I should never forget to take out the trash or mow the lawn.

> So, I don't understand why you seem to view bumping the protocol
> version with so much negativity. We're also bumping PG versions every
> year. Afaik people only like that, partially because it's immediately
> clear that certain features (e.g. MERGE) are not supported when
> connecting to older servers. To me the same is true for bumping the
> protocol version. There are no downsides to bumping it, only upsides.

I see it the exact opposite way around.

If we go with Peter's approach, every driver that supports his feature
will work perfectly, and every driver that doesn't will work exactly
as it does today. The risk of breaking anything is as near to zero as
human developers can reasonably hope to achieve. Nobody who doesn't
care about the feature will have to lift a single finger, today,
tomorrow, or ever. That's absolutely brilliant.

If we instead go with your approach, then anyone who wants to support
3.2 when it materializes will have to also support 3.1, which means
they have to support this feature. That's not a terrible burden, but
it's not a necessary one either. Also, even just 3.1 is going to break
something for somebody. There's just no way that we've left the
protocol version unchanged for this long and the first change we make
doesn't cause some collateral damage.

Sure, those are minor downsides in the grand scheme of things. But
AFAICS the only downside of Peter's approach that you've alleged is
that doesn't involve bumping the version number. Of course, if bumping
the version number is an intrinsic good, then no further justification
is required, but I don't buy that. I do not believe that users or
maintainers will throw us a pizza party when they find out that we've
changed the version number. There's no reason for anyone to be happy
about that for its own sake.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

19 апреля, 00:36:19

On Thu, 18 Apr 2024 at 22:17, Robert Haas <robertmhaas@gmail.com> wrote:
> If we go with Peter's approach, every driver that supports his feature
> will work perfectly, and every driver that doesn't will work exactly
> as it does today. The risk of breaking anything is as near to zero as
> human developers can reasonably hope to achieve. Nobody who doesn't
> care about the feature will have to lift a single finger, today,
> tomorrow, or ever. That's absolutely brilliant.
>
> If we instead go with your approach, then anyone who wants to support
> 3.2 when it materializes will have to also support 3.1, which means
> they have to support this feature.

To clarify: My proposed approach is to use a protocol extension
parameter for to enable the new messages that the server can send
(like Peter is doing now). And **in addition to that** gate the new
Bind format type behind a feature switch. There is literally nothing
clients will have to do to "support" that feature (except for
requesting a higher version protocol). Your suggestion of not bumping
the version but still allowing the new format type on version 3.0
doesn't have any advantage afaict, except secretly hiding from any
pooler in the middle that such a format type might be sent.

> Also, even just 3.1 is going to break
> something for somebody. There's just no way that we've left the
> protocol version unchanged for this long and the first change we make
> doesn't cause some collateral damage.

Sure, but the exact same argument holds for protocol extension
parameters. We've never set them, so they are bound to break something
the first time. My whole point is that once we bite that bullet, the
next protocol parameters and protocol version bumps won't cause such
breakage.

> Sure, those are minor downsides in the grand scheme of things. But
> AFAICS the only downside of Peter's approach that you've alleged is
> that doesn't involve bumping the version number. Of course, if bumping
> the version number is an intrinsic good, then no further justification
> is required, but I don't buy that. I do not believe that users or
> maintainers will throw us a pizza party when they find out that we've
> changed the version number. There's no reason for anyone to be happy
> about that for its own sake.

As a connection pooler maintainer I would definitely love it if every
protocol change required either a protocol version parameter or a
protocol version bump. That way I can easily check every release if
the protocol changed by looking at two things, instead of diffing the
protocol docs for some tiny "supposedly irrelevant" change was made.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

19 апреля, 00:49:27

On Thu, 18 Apr 2024 at 23:36, Jelte Fennema-Nio <me@jeltef.nl> wrote:
> To clarify: My proposed approach is to use a protocol extension
> parameter for to enable the new messages that the server can send
> (like Peter is doing now). And **in addition to that** gate the new
> Bind format type behind a feature switch.

ugh, correction: gate the new Bind format type behind a **protocol bump**

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

22 апреля, 17:26:03

On Thu, Apr 18, 2024 at 5:36 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> To clarify: My proposed approach is to use a protocol extension
> parameter for to enable the new messages that the server can send
> (like Peter is doing now). And **in addition to that** gate the new
> Bind format type behind a feature switch. There is literally nothing
> clients will have to do to "support" that feature (except for
> requesting a higher version protocol). Your suggestion of not bumping
> the version but still allowing the new format type on version 3.0
> doesn't have any advantage afaict, except secretly hiding from any
> pooler in the middle that such a format type might be sent.

That's a fair point, but I'm still not seeing much practical
advantage. It's unlikely that a client is going to set a random bit in
a format parameter for no reason.

> As a connection pooler maintainer I would definitely love it if every
> protocol change required either a protocol version parameter or a
> protocol version bump. That way I can easily check every release if
> the protocol changed by looking at two things, instead of diffing the
> protocol docs for some tiny "supposedly irrelevant" change was made.

Perhaps this is the root of our disagreement, or at least part of it.
I completely agree that it is important for human beings to be able to
understand whether, and how, the wire protocol has changed from one
release to another. I think it would be useful to document that, and
maybe some agreement to start actually bumping the version number
would come out of that, either immediately or eventually. But I don't
think bumping the protocol version first is going to help anything. If
you know that something has changed at least one time in the release,
you still have to figure out what it was, and whether there were any
more of them that, presumably, would not bump the protocol version
because there would be no good reason to do that more than once per
major release. Not only that, but it's entirely possible that someone
could fail to realize that they were supposed to bump the protocol
version, or have some reason not to do it in a particular instance, so
even if there are no bumps at all in a particular release cycle, that
doesn't prove that there are no changes that you would have liked to
know about.

Said differently, I think bumping the protocol version should be,
first and foremost, a way of telling the computer on the end of the
connection something that it needs to know. There is a separate
problem of making sure that human maintainers know what they need to
know, and I think we're doing that quite poorly right now, but I think
you might be conflating those two problems a bit.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

23 апреля, 00:19:44

On Mon, 22 Apr 2024 at 16:26, Robert Haas <robertmhaas@gmail.com> wrote:
> That's a fair point, but I'm still not seeing much practical
> advantage. It's unlikely that a client is going to set a random bit in
> a format parameter for no reason.

I think you're missing an important point of mine here. The client
wouldn't be "setting a random bit in a format parameter for no
reason". The client would decide it is allowed to set this bit,
because the PG version it connected to supports column encryption
(e.g. PG18). But this completely breaks protocol and application layer
separation.

It doesn't seem completely outside of the realm of possibility for a
pooler to gather some statistics on the amount of Bind messages that
use text vs binary query parameters. That's very easily doable now,
while looking only at the protocol layer. If a client then sets the
new format parameter bit, this pooler could then get confused and
close the connection.

> Perhaps this is the root of our disagreement, or at least part of it.
> I completely agree that it is important for human beings to be able to
> understand whether, and how, the wire protocol has changed from one
> release to another.

I think this is partially the reason for our disagreement, but I think
there are at least two other large reasons:

1. I strongly believe minor protocol version bumps after the initial
3.1 one can be made painless for clients/poolers (so the ones to
3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
having to worry about breaking TLS 1.2 communication. Once clients and
poolers implement version negotiation support for 3.1, there's no
reason for version negation support to work for 3.0 and 3.1 to then
suddenly break on the 3.2 bump. To be clear, I'm talking about the act
of bumping the version here, not the actual protocol changes. So
assuming zero/near-zero client implementation effort for the new
features (like never setting the newly supported bit in a format
parameter), then bumping the protocol version for these new features
can never have negative consequences.

2. I very much want to keep a clear split between the protocol layer
and the application layer of our communication. And these layers merge
whenever (like you say) "the wire protocol has changed from one
release to another", but no protocol version bump or protocol
extension is used to indicate that. When that happens the only way for
a client to know what valid wire protocol messages are according to
the server, is by checking the server version. This completely breaks
the separation between layers. So, while checking the server version
indeed works for direct client to postgres communication, it starts to
break down whenever you put a pooler inbetween (as explained in the
example earlier in this email). And it breaks down even more when
connecting to servers that implement the Postgres wire protocol, but
are not postgres at all, like CockroachDB. Right now libpq and other
postgres drivers can be used to talk to these other servers and
poolers, but if we start mixing protocol and application layer stuff
then eventually that will stop being the case.

Afaict from your responses, you disagree with 1. However, it's not at
all clear to me what exact problems you're worried about. It sounds
like you don't know either, and it's more that you're worried about
things breaking for not yet known reasons. I hoped to take away/reduce
those worries using some arguments in a previous email (quoted below),
but you didn't respond to those arguments, so I'm not sure if they
were able to change your mind.

On Thu, 18 Apr 2024 at 21:34, Jelte Fennema-Nio <me@jeltef.nl> wrote:
> When the server supports a lower version than the client, the client
> should disable certain features because it gets the
> ProtocolVersionNegotiation message. This is also true if we don't bump
> the version. Negotiating a lower version actually makes it clearer for
> the client what features to disable. Using the reported postgres
> version for this might not, because a connection pooler in the middle
> might not support the features that the client wants and thus throw an
> error (e.g. due to the client sending unknown messages) even if the
> backing Postgres server would support these features. Not to mention
> non-postgresql servers that implement the PostgreSQL protocol (of
> which there are more and more).
>
> When the server supports a higher version, the client never even
> notices this because the server will silently accept and only enable
> the features of the lower version. So this could never cause breakage,
> as from the client's perspective the server didn't bump their protocol
> version.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

23 апреля, 18:03:31

On Mon, Apr 22, 2024 at 5:19 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> On Mon, 22 Apr 2024 at 16:26, Robert Haas <robertmhaas@gmail.com> wrote:
> > That's a fair point, but I'm still not seeing much practical
> > advantage. It's unlikely that a client is going to set a random bit in
> > a format parameter for no reason.
>
> I think you're missing an important point of mine here. The client
> wouldn't be "setting a random bit in a format parameter for no
> reason". The client would decide it is allowed to set this bit,
> because the PG version it connected to supports column encryption
> (e.g. PG18). But this completely breaks protocol and application layer
> separation.

I can't see what the problem is here. If the client is connected to a
database that contains encrypted columns, and its response to seeing
an encrypted column is to set this bit, that's fine and nothing should
break. If a client doesn't know about encrypted columns and sets that
bit at random, that will break things, and formally I think that's a
risk, because I don't believe we document anywhere that you shouldn't
set unused bits in the format mask. But practically, it's not likely.
(And also, maybe we should document that you shouldn't do that.)

> It doesn't seem completely outside of the realm of possibility for a
> pooler to gather some statistics on the amount of Bind messages that
> use text vs binary query parameters. That's very easily doable now,
> while looking only at the protocol layer. If a client then sets the
> new format parameter bit, this pooler could then get confused and
> close the connection.

Right, this is the kind of risk I was worried about. I think it's
similar to my example of a client setting an unused bit for no reason
and breaking everything. Here, you've hypothesized a pooler that tries
to interpret the bit and just errors out when it sees something it
doesn't understand. I agree that *formally* this is enough to justify
bumping the protocol version, but I think *practically* it isn't,
because the incompatibility is so minor as to inconvenience almost
nobody, whereas changing the protocol version affects everybody.

Let's consider a hypothetical country much like Canada except that
there are three official languages rather than two: English, French,
and Robertish. Robertish is just like English except that the meanings
of the words cabbage and rutabaga are reversed. Shall we mandate that
all signs in the country be printed in three languages rather than
two? Formally, we ought, because the substantial minority of our
hypothetical country that proudly speaks Robertish as their mother
tongue will not want to feel that they are second class citizens. But
practically, there are very few situations where the differences
between the two languages are going to inconvenience anyone. Indeed,
the French speakers might be a bit put out if English is effectively
represented twice on every sign while their mother tongue is there
only once. Of course, people are entitled to organize their countries
politically in any way that works for the people who live in them, but
as a practical matter, English and Robertish are mutually
intelligible.

And so here. If someone codes a connection pooler in the way you
suppose, then it will break. But, first of all, they probably won't do
that, both because it's not particularly likely that someone wants to
gather that particular set of statistics and also because erroring out
seems like an overreaction. And secondly, let's imagine that we do
bump the protocol version and think about whether and how that solves
the problem. A client will request from the pooler a version 3.1
connection and the pooler will say, sorry, no can do, I only
understand 3.0. So the client will now say, oh ok, no problem, I'm
going to refrain from setting that parameter format bit. Cool, right?

Well, no, not really. First, now the client application is probably
broken. If the client is varying its behavior based on the server's
protocol version, that must mean that it cares about accessing
encrypted columns, and that means that the bit in question is not an
optional feature. So actually, the fact that the pooler can force the
client to downgrade hasn't fixed anything at all.

Second, if the connection pooler were written to do something other
than close the connection, like say mask out the one bit that it knows
how to deal with or have an "unknown" bucket to count values that it
doesn't recognize, then it wouldn't have needed to care about the
protocol version in the first place. It would have been better off not
even knowing, because then it wouldn't have forced a downgrade onto
the client application for no real reason. Throwing an error wasn't a
wrong decision on the part of the person writing the pooler, but there
are other things they could have done that would have been less
brittle.

Third, applications, drivers, and connection poolers now all need to
worry about handling downgrades smoothly. If a connection pooler
requests a v3.1 connection to the server and gets v3.0, it had better
make sure that it only advertises 3.0 to the client. If the client
requests v3.0, the pooler had better make sure to either request v3.0
from the server. Or alternatively, the pooler can be prepared to
translate between 3.0 and 3.1 wherever that's needed, in either
direction. But it's not at all clear what that would look like for
something like TCE. Will the pooler arrange to encrypt parameters
destined for encrypted tables if the client doesn't do so? Will it
arrange to decrypt values coming from encrypted tables if the client
doesn't understand encryption? It's possible someone will code that
sort of thing, but I bet a lot of people won't bother. In general, I
think we'll quickly end up with a bunch of different protocol versions
-- say, 3.0 through 3.4 -- but people will thoroughly test with only
one or two of them and support for the others will either be buggy
because it wasn't tested or work anyway because the differences didn't
really matter in the first place.

> 1. I strongly believe minor protocol version bumps after the initial
> 3.1 one can be made painless for clients/poolers (so the ones to
> 3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
> having to worry about breaking TLS 1.2 communication. Once clients and
> poolers implement version negotiation support for 3.1, there's no
> reason for version negation support to work for 3.0 and 3.1 to then
> suddenly break on the 3.2 bump. To be clear, I'm talking about the act
> of bumping the version here, not the actual protocol changes. So
> assuming zero/near-zero client implementation effort for the new
> features (like never setting the newly supported bit in a format
> parameter), then bumping the protocol version for these new features
> can never have negative consequences.

I do like the idea of being able to introduce new versions without
breaking things, but I think that if the TLS folks bumped the protocol
version for something as minor as what we're talking about here, there
would quickly be so many TLS versions that the result would be
unmanageable. I suspect that they either never make small changes and
batch everything up for the next rev, or they slip small changes into
existing protocol versions as I propose that we do here. I have zero
objection to bumping the protocol version when there is a real
question of mutual intelligibility, and zero objection to trying to
reduce friction around version bumps. But my current view, which I
reserve the right to revise at a later time, is that a change that
99.99+% of people can safely ignore is not a sufficient reason for a
version bump.

> 2. I very much want to keep a clear split between the protocol layer
> and the application layer of our communication. And these layers merge
> whenever (like you say) "the wire protocol has changed from one
> release to another", but no protocol version bump or protocol
> extension is used to indicate that. When that happens the only way for
> a client to know what valid wire protocol messages are according to
> the server, is by checking the server version. This completely breaks
> the separation between layers. So, while checking the server version
> indeed works for direct client to postgres communication, it starts to
> break down whenever you put a pooler inbetween (as explained in the
> example earlier in this email). And it breaks down even more when
> connecting to servers that implement the Postgres wire protocol, but
> are not postgres at all, like CockroachDB. Right now libpq and other
> postgres drivers can be used to talk to these other servers and
> poolers, but if we start mixing protocol and application layer stuff
> then eventually that will stop being the case.

In practice, it's already the case. If such databases don't share code
with PostgreSQL, it seems impossible that the replication subprotocol
works in any meaningful way. It seems very likely that there are other
dark corners of the protocol where things don't work either. And TCE
will be another one, but bumping the protocol version doesn't fix
that.

I kind of feel bad arguing so much about this - I don't think the urge
to bump the protocol version when we change the protocol is a bad one
in concept. And it sounds like you've done more work with software
that cares about the protocol outside of PostgreSQL itself than I
have. So maybe you're right and I'm all wet. But I can't understand
why you don't see practical problems with frequent version bumps. It's
not just about the one-time effort of getting everything that doesn't
currently understand how to negotiate a version to do so. It's about
how everyone acts on that information, or doesn't, and whether the end
result of all of those individual decisions is better or worse for the
community as a whole.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jacob Champion

Дата:

23 апреля, 20:39:15

On Mon, Apr 22, 2024 at 2:20 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> 1. I strongly believe minor protocol version bumps after the initial
> 3.1 one can be made painless for clients/poolers (so the ones to
> 3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
> having to worry about breaking TLS 1.2 communication.

Apologies for focusing on a single portion of your argument, but this
claim in particular stuck out to me. To my understanding, IETF worried
a _lot_ about breaking TLS 1.2 implementations with the TLS 1.3
change, to the point that TLS 1.3 clients and servers advertise
themselves as TLS 1.2 and sneak the actual version used into a TLS
extension (roughly analogous to the _pq_ stuff). I vaguely recall that
the engineering work done for that update was pretty far from
painless.

--Jacob

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

06 мая, 19:19:50

On Tue, 23 Apr 2024 at 17:03, Robert Haas <robertmhaas@gmail.com> wrote:
> If a client doesn't know about encrypted columns and sets that
> bit at random, that will break things, and formally I think that's a
> risk, because I don't believe we document anywhere that you shouldn't
> set unused bits in the format mask. But practically, it's not likely.
> (And also, maybe we should document that you shouldn't do that.)

Currently Postgres errors out when you set anything other than 0 or 1
in the format field. And it's already documented that these are the
only allowed values: "Each must presently be zero (text) or one
(binary)."

> Let's consider a hypothetical country much like Canada except that
> there are three official languages rather than two: English, French,
> and Robertish.

I don't really understand the point you are trying to make with this
analogy. Sure an almost equivalent unused language maybe shouldn't be
put on the signs, but having two different names (aka protocol
versions) for English and Robertish seems quite useful to avoid
confusion.

> And so here. If someone codes a connection pooler in the way you
> suppose, then it will break. But, first of all, they probably won't do
> that, both because it's not particularly likely that someone wants to
> gather that particular set of statistics and also because erroring out
> seems like an overreaction.

Is it really that much of an overreaction? Postgres happily errors on
any weirdness in the protocol including unknown format codes. Why
wouldn't a pooler/proxy in the middle be allowed to do the same? The
pooler has no clue what the new format means, maybe it requires some
session state to be passed on to the server to be interpreted
correctly. The pooler would be responsible for syncing that session
state but might not have synced it because it doesn't know that the
format code requires that. An example of this would be a compressed
value of which the compression algorithm is configured using a GUC.
Thus the pooler being strict in what it accepts would be a good way of
notifying the client that the pooler needs to be updated (or the
feature not used).

But I do agree that it seems quite unlikely that a pooler would
implement it like that though. However, a pooler logging a warning
seems not that crazy. And in that case the connection might not be
closed, but the logs would be flooded.

> And secondly, let's imagine that we do
> bump the protocol version and think about whether and how that solves
> the problem. A client will request from the pooler a version 3.1
> connection and the pooler will say, sorry, no can do, I only
> understand 3.0. So the client will now say, oh ok, no problem, I'm
> going to refrain from setting that parameter format bit. Cool, right?
>
> Well, no, not really. First, now the client application is probably
> broken. If the client is varying its behavior based on the server's
> protocol version, that must mean that it cares about accessing
> encrypted columns, and that means that the bit in question is not an
> optional feature. So actually, the fact that the pooler can force the
> client to downgrade hasn't fixed anything at all.

It seems quite a lot nicer if the client can safely fallback to not
writing data using the new format code, instead of getting an error
from the pooler/flooding the pooler logs/having the pooler close the
connection.

> Third, applications, drivers, and connection poolers now all need to
> worry about handling downgrades smoothly. If a connection pooler
> requests a v3.1 connection to the server and gets v3.0, it had better
> make sure that it only advertises 3.0 to the client.

This seems quite straightforward to solve from a pooler perspective:
1. Connect to the server first to find out what the maximum version is
that it support
2. If a client asks for a higher version, advertise the server version

> If the client
> requests v3.0, the pooler had better make sure to either request v3.0
> from the server. Or alternatively, the pooler can be prepared to
> translate between 3.0 and 3.1 wherever that's needed, in either
> direction. But it's not at all clear what that would look like for
> something like TCE. Will the pooler arrange to encrypt parameters
> destined for encrypted tables if the client doesn't do so? Will it
> arrange to decrypt values coming from encrypted tables if the client
> doesn't understand encryption?

This is a harder problem, and indeed one I hadn't considered much.
From a pooler perspective I really would like to be able to have just
one pool of connections to the server all initially connected with the
3.1 protocol and use those 3.1 server connections for clients that use
the 3.1 protocol but also for clients that use the 3.0 protocol.
Creating separate pools of server connections for different protocol
versions is super annoying to manage for operators (e.g. how should
these pools be sized). One of the reasons I'm proposing the
ParameterSet message is to avoid this problem for protocol parameters
e.g. PgBouncer could then set TCE=off on server connection with
TCE=on, just before it hands it to a client with TCE=off.

I can think of a few ways of solving this issue for protocol versions:
1. Introduce a ProtocolVersionSet message, which could be send on
handoff to downgrade/upgrade the protocol version at the postgres side
2. Feature-flag all protocol changes behind protocol parameters, so
ParameterSet can be used to enable/disable everything
3. Feature-flag most protocol changes using protocol parameters, but
allow protocol changes to be made using a version bump when the server
responds the exact same way to all messages a client can send using
the previous protocol version. So don't allow a protocol version bump
to add extra fields to existing messages, nor introduce new message
types that are not sent as a 1-to-1 response to a new message type
sent by the client, nor send existing message types more often than
was expected in the previous protocol version.

I would prefer the 3rd option. 1 seems strange when considering
protocol changes that only impact the handshake, such as lengthening
the cancel key[1], also it requires adding yet another message type.
And 2 seems an overly strict version of 3.

[1]: https://www.postgresql.org/message-id/flat/508d0505-8b7a-4864-a681-e7e5edfe32aa%40iki.fi

> It's possible someone will code that
> sort of thing, but I bet a lot of people won't bother. In general, I
> think we'll quickly end up with a bunch of different protocol versions
> -- say, 3.0 through 3.4 -- but people will thoroughly test with only
> one or two of them and support for the others will either be buggy
> because it wasn't tested or work anyway because the differences didn't
> really matter in the first place.

I think this is overly pessimistic. I'm pretty sure clients and
poolers will want to support multiple protocol versions, to be able to
talk to old clients/servers. I do think we should make supporting
multiple versions as easy as possible though.

> In practice, it's already the case. If such databases don't share code
> with PostgreSQL, it seems impossible that the replication subprotocol
> works in any meaningful way.

I think the fact that the replication subprotocol is gated behind the
"replication=true" StartupMessage parameter makes it very easy to
check for support.

> It seems very likely that there are other
> dark corners of the protocol where things don't work either. And TCE
> will be another one, but bumping the protocol version doesn't fix
> that.

To be clear, gating the new TCE format code behind a protocol version
bump is **not only useful detect non-support for TCE** in such other
servers. But it can also be used to detect that this other server
actually **does support TCE**. If the postgresql server version is
used to indicate such support, then these non-Postgres servers now
need to pretend that they actually are Postgres servers by sending the
same version number in the server_version GUC ParameterStatus message.

> But I can't understand
> why you don't see practical problems with frequent version bumps. It's
> not just about the one-time effort of getting everything that doesn't
> currently understand how to negotiate a version to do so. It's about
> how everyone acts on that information, or doesn't, and whether the end
> result of all of those individual decisions is better or worse for the
> community as a whole.

I do see practical problems. But I see the exact same practical
problems when encoding new protocol feature support in the postgres
server version number instead of the protocol version number. But
encoding protocol feature support in the server version introduces
other issues, such as not being able to detect that some non-Postgres
server supports TCE.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

06 мая, 19:22:25

On Tue, 23 Apr 2024 at 19:39, Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
>
> On Mon, Apr 22, 2024 at 2:20 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> > 1. I strongly believe minor protocol version bumps after the initial
> > 3.1 one can be made painless for clients/poolers (so the ones to
> > 3.2/3.3/etc). Similar to how TLS 1.3 can be safely introduced, and not
> > having to worry about breaking TLS 1.2 communication.
>
> Apologies for focusing on a single portion of your argument, but this
> claim in particular stuck out to me. To my understanding, IETF worried
> a _lot_ about breaking TLS 1.2 implementations with the TLS 1.3
> change, to the point that TLS 1.3 clients and servers advertise
> themselves as TLS 1.2 and sneak the actual version used into a TLS
> extension (roughly analogous to the _pq_ stuff). I vaguely recall that
> the engineering work done for that update was pretty far from
> painless.

My bad... I guess TLS 1.3 was a bad example, due to it changing the
handshake itself so significantly.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

16 мая, 12:21:56

On Fri, 5 Apr 2024 at 18:30, Dave Cramer <davecramer@gmail.com> wrote:
> On Fri, 5 Apr 2024 at 12:09, Jelte Fennema-Nio <me@jeltef.nl> wrote:
>> I'll take a look at redesigning the protocol parameter stuff. To work
>> with dedicated functions instead.
>
> +1

It's been a while, but I now actually took the time to look into this.
And I ran into a problem that I'd like to get some feedback on before
continuing the implementation:

If we're not setting the protocol parameter in the StartupMessage,
there's currently no way for us to know if the protocol parameter is
supported by the server. If protocol parameters were unchangable then
that would be fine, but the whole point of introducing ParameterSet is
to make it possible to change protocol parameters on an existing
connection. Having the function SupportsProtocolCompression return
false, even though you can enable compression just fine, only because
we didn't ask for compression when connecting seems quite silly and
confusing.

I see five ways around this problem and would love some feedback on
which you think is best (or if you can think of any other/better
ones):
1. Have protocol parameters always be GUC_REPORT, so that the presence
of a ParameterStatus message during connection startup can be used as
a way of detecting support for the protocol parameter.
2. Make libpq always send each known protocol parameter in the
StartupMessage to check for their support, even if the connection
string does not contain the related parameters (set them to their
default value then). Then the non-presence of the parameter in the
NegotiateProtocolVersion message can be used reliably to determine
support for the feature. We could even disallow changing a protocol
parameter at the server side using ParameterSet if it was not
requested in the StartupMessage.
3. Very similar to 1, but require explicit user input in the
connection string to request the feature on connection startup by
having the user explicitly provide its default value. If it's not
requested on connection startup assume its unsupported and disallow
usage of the feature (even if the server might actually support it).
4. Make SupportsProtocolCompression return a tri-state, SUPPORTED,
UNSUPPORTED, UNKNOWN. If it's UNKNOWN people could send a ParameterSet
message themselves to check for feature support after connection
startup. We could even recognize this and change the state that
SupportProtocolCompression function to return SUPPORTED/UNSUPPORTED on
future calls according to the server response.
5. Basically the same as 4 but automatically send a ParameterSet
message internally when calling SupportsProtocolCompression and the
state is UNKNOWN, so we'd only ever return SUPPORTED or UNSUPPORTED.

The above options are listed in my order of preference, below some
reasoning why:

1 and 2 would increase the bandwidth used during connection handshake
slightly for each protocol parameter that we would add, but they have
the best user experience IMHO.

I slightly prefer 1 over 2 because there is another argument to be
made for always having protocol parameters be GUC_REPORT: these
parameters change what message types a client can send or receive. So
it makes sense to me to have the server tell the client what the
current value of such a parameter is. This might not be a strong
argument though, because this value would only ever change due to user
interaction. But still, one might imagine scenarios where the value
that the client sent is not exactly what the server would set the
parameter to on receiving that value from the client. e.g. for
protocol compression, maybe the client sends a list of prefered
compression methods and the server would send a ParameterStatus
containing only the specific compression method that it will use when
sending messages to the client.

3 seems also an acceptable option to me. While having slightly worse
user experience than 2, it allows the user of the client to make the
decision if the extra bandwidth during connection startup is worth it
to be able to enable the feature later.

4 assumes that we want people to be able to trigger sending
ParameterSet messages for every protocol parameter. I'm not sure we'd
want to give that ability in all cases.

5 would require SupportsProtocolCompression to also have a
non-blocking version, which bloats our API more than I'd like. Also as
a user you wouldn't be able to know if SupportsProtocolCompression
will do a network request or not.

PS. This is only a problem for feature detection for features relying
on protocol parameters, feature-support relying on protocol version
bumps are easy to detect based on the NegotiateProtocolVersion
message.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

16 мая, 18:28:47

On Thu, May 16, 2024 at 5:22 AM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> If we're not setting the protocol parameter in the StartupMessage,
> there's currently no way for us to know if the protocol parameter is
> supported by the server. If protocol parameters were unchangable then
> that would be fine, but the whole point of introducing ParameterSet is
> to make it possible to change protocol parameters on an existing
> connection. Having the function SupportsProtocolCompression return
> false, even though you can enable compression just fine, only because
> we didn't ask for compression when connecting seems quite silly and
> confusing.

You're probably not going to like this answer, but I feel like this is
another sign that you're trying to use the protocol extensibility
facilities in the wrong way. In my first reply to the thread, I
proposed having the client send _pq_.protocol_set=1 in that startup
message. If the server accepts that message, then you can send
whatever set of message types are associated with that option, which
could include messages to list known settings, as well as messages to
set them. Alternatively, if we used a wire protocol bump for this, you
could request version 3.1 and everything that I just said still
applies. In other words, I feel that if you had adopted the design
that I proposed back in March, or some variant of it, the problem
you're having now wouldn't exist.

IMHO, we need to negotiate the language that we're going to use to
communicate before we start communicating. We should find out which
protocol version we're using, and what protocol options are accepted,
based on sending a StartupMessage and receiving a reply. Then, after
that, having established a common language, we can do whatever. I
think you're trying to meld those two steps into one, which is
understandable from the point of view of saving a round trip, but I
just don't see it working out well. What we can do in the startup
message is extremely limited, because any startup messages we generate
can't break existing servers, and also because of the concerns I
raised earlier about leaving room for more extension in the future.
Once we get past the startup message negotiation, the sky's the limit!

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

16 мая, 19:09:18

On Thu, 16 May 2024 at 17:29, Robert Haas <robertmhaas@gmail.com> wrote:
> You're probably not going to like this answer, but I feel like this is
> another sign that you're trying to use the protocol extensibility
> facilities in the wrong way. In my first reply to the thread, I
> proposed having the client send _pq_.protocol_set=1 in that startup
> message. If the server accepts that message, then you can send
> whatever set of message types are associated with that option, which
> could include messages to list known settings, as well as messages to
> set them. Alternatively, if we used a wire protocol bump for this, you
> could request version 3.1 and everything that I just said still
> applies. In other words, I feel that if you had adopted the design
> that I proposed back in March, or some variant of it, the problem
> you're having now wouldn't exist.

I don't really understand the benefit of your proposal over option 2
that I proposed. Afaict you're proposing that for e.g. compression we
first set _pq_.supports_compression=1 in the StartupMessage and use
that  to do feature detection, and then after we get the response we
send ParameterSet("compression", "gzip"). To me this is pretty much
identical to option 2, except that it introduces an extra round trip
for no benefit (as far as I can see). Why not go for option 2 and send
_pq_.compression=gzip in the StartupMessage directly.

> IMHO, we need to negotiate the language that we're going to use to
> communicate before we start communicating. We should find out which
> protocol version we're using, and what protocol options are accepted,
> based on sending a StartupMessage and receiving a reply. Then, after
> that, having established a common language, we can do whatever. I
> think you're trying to meld those two steps into one, which is
> understandable from the point of view of saving a round trip, but I
> just don't see it working out well.

I think not increasing the number of needed round trips in the startup
of a connection is extremely important. I think it's so important that
I honestly don't think we should merge a protocol change that
introduces an extra round trip without a VERY good reason, and this
round trip should only be needed when actually using the feature.

> What we can do in the startup
> message is extremely limited, because any startup messages we generate
> can't break existing servers, and also because of the concerns I
> raised earlier about leaving room for more extension in the future.
> Once we get past the startup message negotiation, the sky's the limit!

Sure, what we can do in the StartupMessage is extremely limited, but
what it does allow is passing arbitrary key value pairs to the server.
But by only using _pq_.feature_name=1, we're effectively only using
the key part of the key value pair. Limiting ourselves even more, by
throwing half of our communication channel away, seems like a bad idea
to me. But maybe I'm just not understanding the problem you're seeing
with using the value too.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

16 мая, 19:57:40

On Thu, May 16, 2024 at 12:09 PM Jelte Fennema-Nio <me@jeltef.nl> wrote:
> I don't really understand the benefit of your proposal over option 2
> that I proposed. Afaict you're proposing that for e.g. compression we
> first set _pq_.supports_compression=1 in the StartupMessage and use
> that  to do feature detection, and then after we get the response we
> send ParameterSet("compression", "gzip"). To me this is pretty much
> identical to option 2, except that it introduces an extra round trip
> for no benefit (as far as I can see). Why not go for option 2 and send
> _pq_.compression=gzip in the StartupMessage directly.

Ugh, it's so hard to communicate clearly about this stuff. I didn't
really have any thought that we'd ever try to handle something as
complicated as compression using ParameterSet. I tend to agree that
for compression I'd like to see the startup packet contain more than
_pq_.compression=1, but I'm not sure what would happen after that
exactly. If the client asks for _pq_.compression=lz4 and the server
tells the client that it doesn't understand _pq_.compression at all,
then everybody's on the same page: no compression. But, if the server
understands the option but isn't OK with the proposed value, what
happens then? Does it send a NegotiateCompressionType message after
the NegotiateProtocolVersion, for example? That seems like it could
lead to the client having to be prepared for a lot of NegotiateX
messages somewhere down the road.

I think at some point in the past we had discussed having the client
list all the algorithms it supported in the argument to
_pq_.compression, and then the server would respond with the algorithm
it wanted use, or maybe a list of algorithms that it could allow, and
then we'd go from there. But I'm not entirely sure if that's the right
idea, either.

Changing compression algorithms in mid-stream is tricky, too. If I
tell the server "hey, turn on server-to-client compression!" then I
need to be able to identify where exactly that happens. Any messages
already sent by the server and not yet processed by me, or any
messages sent after that but before the server handles my request, are
going to be uncompressed. Then, at some point, I'll start getting
compressed data. If the compressed data is framed inside some message
type created for that purpose, like I get a CompressedMessage message
and then I decompress to get the actual message, this is simpler to
manage. But even then, it's tricky if the protocol shifts. If I tell
the server, you know what, gzip was a bad choice, I want lz4, I'll
need to know where the switch happens to be able to decompress
properly.

I don't know if we want to support changing compression algorithms in
mid-stream. I don't think there's any reason we can't, but it might be
a bunch of work for something that nobody really cares about. Not
sure.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jacob Burroughs

Дата:

16 мая, 20:39:04

On Thu, May 16, 2024 at 6:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> Ugh, it's so hard to communicate clearly about this stuff. I didn't
> really have any thought that we'd ever try to handle something as
> complicated as compression using ParameterSet. I tend to agree that
> for compression I'd like to see the startup packet contain more than
> _pq_.compression=1, but I'm not sure what would happen after that
> exactly. If the client asks for _pq_.compression=lz4 and the server
> tells the client that it doesn't understand _pq_.compression at all,
> then everybody's on the same page: no compression. But, if the server
> understands the option but isn't OK with the proposed value, what
> happens then? Does it send a NegotiateCompressionType message after
> the NegotiateProtocolVersion, for example? That seems like it could
> lead to the client having to be prepared for a lot of NegotiateX
> messages somewhere down the road.
>
> I think at some point in the past we had discussed having the client
> list all the algorithms it supported in the argument to
> _pq_.compression, and then the server would respond with the algorithm
> it wanted use, or maybe a list of algorithms that it could allow, and
> then we'd go from there. But I'm not entirely sure if that's the right
> idea, either.

As currently implemented [1], the client sends the server the list of
all compression algorithms it is willing to accept, and the server can
use one of them.  If the server knows what `_pq_.compression` means
but doesn't actually support any compression, it will both send the
client its empty list of supported algorithms and just never send any
compressed messages, and everyone involved will be (relatively) happy.
There is a libpq function that a client can use to check what
compression is in use if a client *really* doesn't want to continue
with the conversation without compression, but 99% of the time I can't
see why a client wouldn't prefer to continue using a connection with
whatever compression the server supports (or even none) without more
explicit negotiation.  (Unlike TLS, where automagically picking
between using and not using TLS has strange security implications and
effects, compression is a convenience feature for everyone involved.)

> Changing compression algorithms in mid-stream is tricky, too. If I
> tell the server "hey, turn on server-to-client compression!" then I
> need to be able to identify where exactly that happens. Any messages
> already sent by the server and not yet processed by me, or any
> messages sent after that but before the server handles my request, are
> going to be uncompressed. Then, at some point, I'll start getting
> compressed data. If the compressed data is framed inside some message
> type created for that purpose, like I get a CompressedMessage message
> and then I decompress to get the actual message, this is simpler to
> manage. But even then, it's tricky if the protocol shifts. If I tell
> the server, you know what, gzip was a bad choice, I want lz4, I'll
> need to know where the switch happens to be able to decompress
> properly.
>
> I don't know if we want to support changing compression algorithms in
> mid-stream. I don't think there's any reason we can't, but it might be
> a bunch of work for something that nobody really cares about. Not
> sure.

As the protocol layer is currently designed [1], it explicitly makes
it very easy to change/restart compression streams, specifically for
this use case (and in particular for the general connection pooler use
case).  Compressed data is already framed in a `CompressedData`
message, and that message has a header byte that corresponds to an
enum value for which algorithm is currently in use.  Any time the
compression stream was restarted by the sender, the first
`CompressedData` message will set that byte, and then the client will
restart its decompression stream with the chosen algorithm from that
point.  For `CompressedData` messages that continue using the
already-established stream, the byte is simply set to 0.  (This is
also how the "each side sends a list" form of negotiation is able to
work without additional round trips, as the `CompressedData` framing
itself communicates which compression algorithm has been selected.)

[1] https://www.postgresql.org/message-id/CACzsqT5-7xfbz%2BSi35TBYHzerNX3XJVzAUH9AewQ%2BPp13fYBoQ%40mail.gmail.com

--
Jacob Burroughs | Staff Software Engineer
E: jburroughs@instructure.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

17 мая, 00:01:56

On Thu, 16 May 2024 at 18:57, Robert Haas <robertmhaas@gmail.com> wrote:
> Ugh, it's so hard to communicate clearly about this stuff. I didn't
> really have any thought that we'd ever try to handle something as
> complicated as compression using ParameterSet.

Okay, then it's definitely very hard to communicate clearly about
this. Because being able to re-configure compression using
ParameterSet is exactly the type of thing I want to be able to do in
PgBouncer. Otherwise a connection from PgBouncer to Postgres cannot be
handed off to any client, because its compression state cannot be
changed on the fly to the state that a client expects, if the first
one wants lz4 compression, and a second one wants zstd compression.
There's no way the same connection can be reused for both unless you
decompress and recompress in pgbouncer, which is expensive to do.
Being able to reconfigure the stream to compress the messages in the
expected form is much cheaper.

> Does it send a NegotiateCompressionType message after
> the NegotiateProtocolVersion, for example? That seems like it could
> lead to the client having to be prepared for a lot of NegotiateX
> messages somewhere down the road.

Like Jacob explains, you'd want to allow the client to provide a list
of options in order of preference. And then have the server respond
with a ParameterStatus message saying what it ended up being. So no
new NegotiateXXX messages are needed, as long as we make sure any
_pq_.xxx falls back to reporting its default value on failure. This is
exactly why I said I prefer option 1 of the options I listed, because
we need _pq_.xxx messages to report their current value to the client.

To be clear, this is not special to compression. This applies to ALL
proposed protocol parameters. The server should fall back to some
"least common denominator" if it doesn't understand (part of) the
value for the protocol parameter that's provided by the client,
possibly falling back to disabling the protocol extension completely.

> Changing compression algorithms in mid-stream is tricky, too. If I
> tell the server "hey, turn on server-to-client compression!"

Yes it is tricky, but it's something that it would need to support
imho. And Jacob actually implemented it this way, so I feel like we're
discussing a non-problem here.

> I don't know if we want to support changing compression algorithms in
> mid-stream. I don't think there's any reason we can't, but it might be
> a bunch of work for something that nobody really cares about.

Again, I guess I wasn't clear at all in my previous emails and/or
commit messages. Connection poolers care **very much** about this.
Poolers need to be able to re-configure any protocol parameter to be
able to pool the same server connection across clients with
differently configured protocol parameters. Again: This is the primary
reason for me wanting to introduce the ParameterSet message.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

17 мая, 16:15:20

On Thu, May 16, 2024 at 1:39 PM Jacob Burroughs
<jburroughs@instructure.com> wrote:
> As currently implemented [1], the client sends the server the list of
> all compression algorithms it is willing to accept, and the server can
> use one of them.  If the server knows what `_pq_.compression` means
> but doesn't actually support any compression, it will both send the
> client its empty list of supported algorithms and just never send any
> compressed messages, and everyone involved will be (relatively) happy.
> There is a libpq function that a client can use to check what
> compression is in use if a client *really* doesn't want to continue
> with the conversation without compression, but 99% of the time I can't
> see why a client wouldn't prefer to continue using a connection with
> whatever compression the server supports (or even none) without more
> explicit negotiation.  (Unlike TLS, where automagically picking
> between using and not using TLS has strange security implications and
> effects, compression is a convenience feature for everyone involved.)

This all seems sensible to me.

> As the protocol layer is currently designed [1], it explicitly makes
> it very easy to change/restart compression streams, specifically for
> this use case (and in particular for the general connection pooler use
> case).  Compressed data is already framed in a `CompressedData`
> message, and that message has a header byte that corresponds to an
> enum value for which algorithm is currently in use.  Any time the
> compression stream was restarted by the sender, the first
> `CompressedData` message will set that byte, and then the client will
> restart its decompression stream with the chosen algorithm from that
> point.  For `CompressedData` messages that continue using the
> already-established stream, the byte is simply set to 0.  (This is
> also how the "each side sends a list" form of negotiation is able to
> work without additional round trips, as the `CompressedData` framing
> itself communicates which compression algorithm has been selected.)

OK, so you made it so that compressed data is fully self-identifying.
Hence, there's no need to worry if something gets changed: the
receiver, if properly implemented, can't help but notice. The only
downside that I can see to this design is that you only have one byte
to identify the compression algorithm, but that doesn't actually seem
like a real problem at all, because I expect the number of supported
compression algorithms to grow very slowly. I think it would take
centuries, possibly millenia, before we started to get short of
identifiers. So, cool.

But, in your system, how does the client ask the server to switch to a
different compression algorithm, or to restart the compression stream?

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jacob Burroughs

Дата:

17 мая, 20:26:45

On Fri, May 17, 2024 at 3:15 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> OK, so you made it so that compressed data is fully self-identifying.
> Hence, there's no need to worry if something gets changed: the
> receiver, if properly implemented, can't help but notice. The only
> downside that I can see to this design is that you only have one byte
> to identify the compression algorithm, but that doesn't actually seem
> like a real problem at all, because I expect the number of supported
> compression algorithms to grow very slowly. I think it would take
> centuries, possibly millenia, before we started to get short of
> identifiers. So, cool.
>
> But, in your system, how does the client ask the server to switch to a
> different compression algorithm, or to restart the compression stream?

I was leaving the details around triggering that for this conversation
and in that patch just designing the messages in a way that would
allow adding the reconfiguration/restarting to be easily added in a
backwards-compatible way in a future patch.  I would imagine that an
explicit `ParameterSet` call that sets `_pq_.connection_compression`
(or whatever the implementation details turn out to be) would also
trigger a compressor restart, and when restarted it would pick an
algorithm/configuration based on the new value of the parameter rather
than the one used at connection establishment.

--
Jacob Burroughs | Staff Software Engineer
E: jburroughs@instructure.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

17 мая, 22:24:26

On Fri, May 17, 2024 at 1:26 PM Jacob Burroughs
<jburroughs@instructure.com> wrote:
> I was leaving the details around triggering that for this conversation
> and in that patch just designing the messages in a way that would
> allow adding the reconfiguration/restarting to be easily added in a
> backwards-compatible way in a future patch.  I would imagine that an
> explicit `ParameterSet` call that sets `_pq_.connection_compression`
> (or whatever the implementation details turn out to be) would also
> trigger a compressor restart, and when restarted it would pick an
> algorithm/configuration based on the new value of the parameter rather
> than the one used at connection establishment.

Hmm, OK, interesting. I suppose that I thought we were going to handle
problems like this by just adding bespoke messages for each case (e.g.
CompressionSet). I wasn't thinking it would be practical to make a
message whose remit was as general as what it seems like Jelte wants
to do with ParameterSet, because I'm not sure everything can be
handled that simply. It's not an exact analogy, but when you want to
stop the server, it's not enough to say that you want to change from
the Running state to the Stopped state. You have to specify which type
of shutdown should be used to make the transition. You could even have
more complicated cases, where one side says "prepare to do X" and the
other side eventually says "OK, I'm prepared" and the first side says
"great, now activate X" and the other side eventually says "OK, it's
activate, please confirm that you've also activated it from your side"
and the first side eventually says "OK, I confirm that". I think the
fear that we're going to run into cases where such complex handshaking
is necessary is a major reason why I'm afraid of Jelte's ideas about
ParameterSet: it seems much more opinionated to me than he seems to
think it is.

To the extent that I can wrap my head around what Jelte is proposing,
and all signs point to that extent being quite limited, I suppose I
was thinking of something like his option (2). That is, I assumed that
a client would request all the optional features that said client
might wish to use at connection startup time. But I can see how that
assumption might be somewhat problematic in a connection-pooling
environment. Still, at least to me, it seems better than trying to
rely on GUC_REPORT. My opinion is (1) GUC_REPORT isn't a particularly
well-designed mechanism so I dislike trying to double down on it and
(2) trying to mix these protocol-level parameters and the
transactional GUCs we have together in a single mechanism seems
potentially problematic and (3) I'm still not particularly happy about
the idea of making protocol parameters into GUCs in the first place.
Perhaps these are all minority positions, but I can't tell you what
everyone thinks, only what I think.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

18 мая, 02:33:29

On Fri, 17 May 2024 at 21:24, Robert Haas <robertmhaas@gmail.com> wrote:
> I think the
> fear that we're going to run into cases where such complex handshaking
> is necessary is a major reason why I'm afraid of Jelte's ideas about
> ParameterSet: it seems much more opinionated to me than he seems to
> think it is.

I think that fear is valid, and I agree that we might want to add a
bespoke message for cases where ParameterSet is not enough. But as far
as I can tell ParameterSet would at least cover all the protocol
changes that have been suggested so far. Using an opinionated but
limited message type for 90% of the cases and a bespoke message for
the last 10% seems much better to me than having a bespoke one for
each (especially if currently none of the protocol proposals fall into
the last 10%).

> To the extent that I can wrap my head around what Jelte is proposing,
> and all signs point to that extent being quite limited, I suppose I
> was thinking of something like his option (2). That is, I assumed that
> a client would request all the optional features that said client
> might wish to use at connection startup time. But I can see how that
> assumption might be somewhat problematic in a connection-pooling
> environment.

To be clear, I'd also be totally fine with my option (2). I'm
personally slightly leaning towards my option (1), due to the reasons
listed before. But connection poolers could request all the protocol
extensions at the start just fine (using the default "disabled" value)
to check for support. So I think option (2) would probably be the most
conservative, i.e. we could always decide that option (1) is fine in
some future release.

> Still, at least to me, it seems better than trying to
> rely on GUC_REPORT. My opinion is (1) GUC_REPORT isn't a particularly
> well-designed mechanism so I dislike trying to double down on it

I agree that GUC_REPORT is not particularly well designed,
currently... But even in its current form it's already a very
effective mechanism for connection poolers to find out to which value
a specific GUC is set to, and if something similar to patch 0014 would
be merged my main gripe with GUC_REPORT would be gone. Tracking GUC
settings by using ParameterSet would actually be harder, because if
ParameterSet errors at Postgres then the connection pooler would have
to roll back its cache of that setting. While with the GUC_REPORT
response from Postgres it can simply rely on Postgres telling the
pooler that the GUC has changed, even rollbacks are handled correctly
this way.

> and
> (2) trying to mix these protocol-level parameters and the
> transactional GUCs we have together in a single mechanism seems
> potentially problematic

I don't understand what potential problems you're worried about here.
Could you clarify?

> and (3) I'm still not particularly happy about
> the idea of making protocol parameters into GUCs in the first place.

Similar to the above: Could you clarify why you're not happy about that?

> Perhaps these are all minority positions, but I can't tell you what
> everyone thinks, only what I think.

I'd love to hear some opinions from others on these design choices. So
far it seems like we're the only two that have opinions on this (which
seems hard to believe) and our opinions are clearly conflicting. And
above all I'd like to move forward with this, be it my way or yours
(although I'd prefer my way of course ;) )

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Tom Lane

Дата:

23 мая, 21:12:37

Jelte Fennema-Nio <me@jeltef.nl> writes:
> On Fri, 17 May 2024 at 21:24, Robert Haas <robertmhaas@gmail.com> wrote:
>> Perhaps these are all minority positions, but I can't tell you what
>> everyone thinks, only what I think.

> I'd love to hear some opinions from others on these design choices. So
> far it seems like we're the only two that have opinions on this (which
> seems hard to believe) and our opinions are clearly conflicting. And
> above all I'd like to move forward with this, be it my way or yours
> (although I'd prefer my way of course ;) )

I got around to looking through this thread in preparation for next
week's patch review session.  I have a couple of opinions to offer:

1. Protocol versions suck.  Bumping them is seldom a good answer,
and should never be done if you have any finer-grained negotiation
mechanism available.  My aversion to this is over thirty years old:
I learned that lesson from watching the GIF87-to-GIF89 transition mess.
Authors of GIF-writing tools tended to take the easy way out and write
"GIF89" in the header whether they were actually using any of the new
version's features or not.  This led to an awful lot of pictures that
couldn't be read by available GIF-displaying tools, for no good reason
whatsoever.  The PNG committee, a couple years later, reacted to that
mess by designing PNG to have no version number whatsoever, and yet
be extensible in a fine-grained way.  (Basically, a PNG file is made
up of labeled chunks.  If a reader doesn't recognize a particular
chunk code, it can still tell whether the chunk is "critical" or not,
and thereby decide if it must give up or can proceed while ignoring
that chunk.)

So overall, I have a strong preference for using the _pq_.xxx
mechanism instead of a protocol version bump.  I do not believe
the latter has any advantage.

2. I share Robert's suspicion of equating protocol parameters
with GUCs.  The GUC mechanism is quite opinionated and already
serves multiple masters.  In particular, the fact that GUC
settings are normally transactional does not play nice with
the way protocol parameters need to behave.  Yeah, no doubt you
could add another dollop of complexity to guc.c to make parameters
work differently from other GUCs, but I think it's the wrong design
direction.  We should handle protocol parameters with a separate
mechanism.  It's not, for instance, clear to me that protocol
parameters should be exposed at the SQL level at all; but if we
don't feel they need to be available via SHOW and pg_settings,
what benefit is guc.c really bringing to the table?

            regards, tom lane

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jacob Champion

Дата:

23 мая, 21:25:55

On Thu, May 23, 2024 at 11:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If a reader doesn't recognize a particular
> chunk code, it can still tell whether the chunk is "critical" or not,
> and thereby decide if it must give up or can proceed while ignoring
> that chunk.)

Would it be good to expand on that idea of criticality? IIRC one of
Jelte's complaints earlier was that middleware has to know all the
extension types anyway, to be able to figure out whether it has to do
something about them or not. HTTP has the concept of hop-by-hop vs
end-to-end headers for related reasons.

--Jacob

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Tom Lane

Дата:

23 мая, 21:40:02

Jacob Champion <jacob.champion@enterprisedb.com> writes:
> Would it be good to expand on that idea of criticality? IIRC one of
> Jelte's complaints earlier was that middleware has to know all the
> extension types anyway, to be able to figure out whether it has to do
> something about them or not. HTTP has the concept of hop-by-hop vs
> end-to-end headers for related reasons.

Yeah, perhaps.  We'd need to figure out just which classes we need
to divide protocol parameters into, and then think about a way for
code to understand which class a parameter falls into even when
it doesn't specifically know that parameter.  That seems possible
though.  PNG did it with spelling rules for the chunk labels.
Here, since we don't yet have any existing _pq_.xxx parameter names,
we could maybe say that the names shall follow a pattern like
"_pq_.class.param".  (That works only if the classes are
non-overlapping, an assumption not yet justified by evidence;
but we could do something more complicated if we have to.)

            regards, tom lane

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

23 мая, 22:45:26

On Thu, May 23, 2024 at 2:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I got around to looking through this thread in preparation for next
> week's patch review session.  I have a couple of opinions to offer:

I agree with these opinions. Independently of that, I'm glad you shared them.

I think part of the reason we ended up with the protocol parameters =
GUCs thing is because you seemed to be concurring with that approach
upthread. I think it was Jelte's idea originally, but I interpreted
some of your earlier remarks to be supporting it. I'm not sure whether
you've revised your opinion, or just refined it, or whether we
misinterpreted your earlier remarks.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Tom Lane

Дата:

23 мая, 23:00:46

Robert Haas <robertmhaas@gmail.com> writes:
> I think part of the reason we ended up with the protocol parameters =
> GUCs thing is because you seemed to be concurring with that approach
> upthread. I think it was Jelte's idea originally, but I interpreted
> some of your earlier remarks to be supporting it. I'm not sure whether
> you've revised your opinion, or just refined it, or whether we
> misinterpreted your earlier remarks.

I don't recall exactly what I thought earlier, but now I think we'd
be better off with separate infrastructure.  guc.c is unduly complex
already.  Perhaps there are bits of it that could be factored out
and shared, but I bet not a lot.

            regards, tom lane

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Robert Haas

Дата:

24 мая, 16:28:32

On Thu, May 23, 2024 at 4:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't recall exactly what I thought earlier, but now I think we'd
> be better off with separate infrastructure.  guc.c is unduly complex
> already.  Perhaps there are bits of it that could be factored out
> and shared, but I bet not a lot.

OK. That seems fine to me, but I bet Jelte is going to disagree.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

25 мая, 01:08:57

On Fri, 24 May 2024 at 15:28, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, May 23, 2024 at 4:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I don't recall exactly what I thought earlier, but now I think we'd
> > be better off with separate infrastructure.  guc.c is unduly complex
> > already.  Perhaps there are bits of it that could be factored out
> > and shared, but I bet not a lot.
>
> OK. That seems fine to me, but I bet Jelte is going to disagree.

I indeed disagree. I think the effort needed to make guc.c handle
protocol parameters is extremely little. The 0011 patch are all the
changes that are needed to achieve that, and that patch only adds 65
additional lines. And only 15 of those 65 lines actually have to do
anything somewhat weird, to be able to handle the transactionality
discrepancy between protocol parameters and other GUCs. The other 50
lines are (imho) really clean and fit perfectly with the way guc.c is
currently structured (i.e. they add PGC_PROTOCOL and PGC_SU_PROTOCOL
in a really obvious way)

Separating it from the GUC infrastructure will mean we need to
duplicate a lot of the infrastructure. Assuming we don't care about
SHOW or pg_settings (which I agree are not super important), the
things that we would want for protocol parameters to have that guc.c
gives us for free are:
1. Reporting the value of the parameter to the client (done using
ParameterStatus)
2. Parsing and validating of the input, bool, int, enum, etc, but also
check_hook and assign_hook.
3. Logic in all connection poolers to change GUC values to the
client's expected values whenever a server connection is handed off to
a client
4. Permission checking, if we want some protocol extensions to only be
configurable by a highly privileged user

All of those things would have to be duplicated/re-implemented if we
make protocol parameters their own dedicated thing. Doing that work
seems like a waste of time to me, and would imho add much more
complexity than the proposed 65 lines of code in 0011.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

25 мая, 13:23:11

On Thu, 23 May 2024 at 20:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Jacob Champion <jacob.champion@enterprisedb.com> writes:
> > Would it be good to expand on that idea of criticality? IIRC one of
> > Jelte's complaints earlier was that middleware has to know all the
> > extension types anyway, to be able to figure out whether it has to do
> > something about them or not. HTTP has the concept of hop-by-hop vs
> > end-to-end headers for related reasons.
>
> Yeah, perhaps.  We'd need to figure out just which classes we need
> to divide protocol parameters into, and then think about a way for
> code to understand which class a parameter falls into even when
> it doesn't specifically know that parameter.

I think this class is so rare, that it's not worth complicating the
discussion on new protocol features even more. AFAICT there is only
one proposed protocol change that does not need any pooler support
(apart from syncing the feature value when re-assigning the
connectin): Automatic binary encoding for a list of types

All others need some support from poolers, at the very least they need
new message types to not error out. But in many cases more complex
stuff is needed.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

25 мая, 13:39:58

On Thu, 23 May 2024 at 20:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Jelte Fennema-Nio <me@jeltef.nl> writes:
> > On Fri, 17 May 2024 at 21:24, Robert Haas <robertmhaas@gmail.com> wrote:
> >> Perhaps these are all minority positions, but I can't tell you what
> >> everyone thinks, only what I think.
>
> > I'd love to hear some opinions from others on these design choices. So
> > far it seems like we're the only two that have opinions on this (which
> > seems hard to believe) and our opinions are clearly conflicting. And
> > above all I'd like to move forward with this, be it my way or yours
> > (although I'd prefer my way of course ;) )
>
> I got around to looking through this thread in preparation for next
> week's patch review session.  I have a couple of opinions to offer:
>
> 1. Protocol versions suck.  Bumping them is seldom a good answer,
> and should never be done if you have any finer-grained negotiation
> mechanism available.  My aversion to this is over thirty years old:
> I learned that lesson from watching the GIF87-to-GIF89 transition mess.
> Authors of GIF-writing tools tended to take the easy way out and write
> "GIF89" in the header whether they were actually using any of the new
> version's features or not.  This led to an awful lot of pictures that
> couldn't be read by available GIF-displaying tools, for no good reason
> whatsoever.  The PNG committee, a couple years later, reacted to that
> mess by designing PNG to have no version number whatsoever, and yet
> be extensible in a fine-grained way.  (Basically, a PNG file is made
> up of labeled chunks.  If a reader doesn't recognize a particular
> chunk code, it can still tell whether the chunk is "critical" or not,
> and thereby decide if it must give up or can proceed while ignoring
> that chunk.)
>
> So overall, I have a strong preference for using the _pq_.xxx
> mechanism instead of a protocol version bump.  I do not believe
> the latter has any advantage.

I'm not necessarily super opposed to only using the _pq_.xxx
mechanism. I mainly think it's silly to have a protocol version number
and then never use it. And I feel some of the proposed changes don't
really benefit from being able to be turned on-and-off by themselves.
My rule of thumb would be:
1. Things that a modern client/pooler would always request: version bump
2. Everything else: _pq_.xxx

Of the proposed changes so far on the mailing list the only 2 that
would fall under 1 imho are:
1. The ParameterSet message
2. Longer than 32bit secret in BackendKeyData

I also don't think the GIF situation you describe translates fully to
this discussion. We have active protocol version negotiation, so if a
server doesn't support protocol 3.1 a client is expected to fall back
to the 3.0 protocol when communicating. Of course you can argue that a
badly behaved client will fail to connect when it gets a downgrade
request from the server, but that same argument can be made about a
server not reporting support for a _pq_.xxx parameter that every
modern client/pooler requests. So I don't think there's a practical
difference in the problem you're describing.

But again if I'm alone in this, then I don't

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Jelte Fennema-Nio

Дата:

25 мая, 13:43:46

(pressed send to early)

On Sat, 25 May 2024 at 12:39, Jelte Fennema-Nio <me@jeltef.nl> wrote:
> But again if I'm alone in this, then I don't

... mind budging on this to move this decision along. Using _pq_.xxx
parameters for all protocol changes would totally be acceptable to me.

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

От

Dave Cramer

Дата:

25 мая, 13:53:32

On Sat, 25 May 2024 at 06:40, Jelte Fennema-Nio <me@jeltef.nl> wrote:

On Thu, 23 May 2024 at 20:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Jelte Fennema-Nio <me@jeltef.nl> writes:
> > On Fri, 17 May 2024 at 21:24, Robert Haas <robertmhaas@gmail.com> wrote:
> >> Perhaps these are all minority positions, but I can't tell you what
> >> everyone thinks, only what I think.
>
> > I'd love to hear some opinions from others on these design choices. So
> > far it seems like we're the only two that have opinions on this (which
> > seems hard to believe) and our opinions are clearly conflicting. And
> > above all I'd like to move forward with this, be it my way or yours
> > (although I'd prefer my way of course ;) )
>
> I got around to looking through this thread in preparation for next
> week's patch review session. I have a couple of opinions to offer:
>
> 1. Protocol versions suck. Bumping them is seldom a good answer,
> and should never be done if you have any finer-grained negotiation
> mechanism available. My aversion to this is over thirty years old:
> I learned that lesson from watching the GIF87-to-GIF89 transition mess.
> Authors of GIF-writing tools tended to take the easy way out and write
> "GIF89" in the header whether they were actually using any of the new
> version's features or not. This led to an awful lot of pictures that
> couldn't be read by available GIF-displaying tools, for no good reason
> whatsoever. The PNG committee, a couple years later, reacted to that
> mess by designing PNG to have no version number whatsoever, and yet
> be extensible in a fine-grained way. (Basically, a PNG file is made
> up of labeled chunks. If a reader doesn't recognize a particular
> chunk code, it can still tell whether the chunk is "critical" or not,
> and thereby decide if it must give up or can proceed while ignoring
> that chunk.)
>
> So overall, I have a strong preference for using the _pq_.xxx
> mechanism instead of a protocol version bump. I do not believe
> the latter has any advantage.

I'm not necessarily super opposed to only using the _pq_.xxx
mechanism.

I find it interesting that up to now nobody has ever used this mechanism.

I mainly think it's silly to have a protocol version number
and then never use it. And I feel some of the proposed changes don't
really benefit from being able to be turned on-and-off by themselves.
My rule of thumb would be:
1. Things that a modern client/pooler would always request: version bump
2. Everything else: _pq_.xxx

Have to agree, why have a protocol version and then just not use it ?

Of the proposed changes so far on the mailing list the only 2 that
would fall under 1 imho are:
1. The ParameterSet message
2. Longer than 32bit secret in BackendKeyData

I also don't think the GIF situation you describe translates fully to
this discussion. We have active protocol version negotiation, so if a
server doesn't support protocol 3.1 a client is expected to fall back
to the 3.0 protocol when communicating.

Also agree. Isn't the point of having a version number to figure out what features the client wants and subsequently the server can provide?

Of course you can argue that a
badly behaved client will fail to connect when it gets a downgrade
request from the server, but that same argument can be made about a
server not reporting support for a _pq_.xxx parameter that every
modern client/pooler requests. So I don't think there's a practical
difference in the problem you're describing.

But again if I'm alone in this, then I don't

I would prefer to see a well defined protocol handshaking mechanism rather than some strange _pq.xxx dance.

Dave

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Add new protocol message to change GUCs for usage with future protocol-only GUCs

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения