Обсуждение: [HACKERS] Making server name part of the startup message
As a cloud service, Azure Database for PostgreSQL uses a gateway proxy to route connections to a node hosting the actual server. To do that, the proxy needs to know the name of the server it tries to locate. As a work-around we currently overload the username parameter to pass in the server name using username@servername convention. It is purely a convention that our customers need to follow and understand. We would like to extend the PgSQL connection protocol to add an optional parameter for the server name to help with this scenario.
Proposed changes:
Change the Postgres wire protocol to include server name in the startup message. This field can be an optional field driven by the connection parameters for psql (-N, --servername).
We need this extra parameter for backward compatibility.
Make PostgreSQL server aware of the new field, and accept the startup message containing this field. Though server doesn’t need this field, this change helps making the server name by default included in the startup message in future.
P.S: I would like to get some initial feedback on this idea and will provide more design details if required. Any feedback in this regard is really appreciated.
Thanks,
Satya
Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com> writes: > As a cloud service, Azure Database for PostgreSQL uses a gateway proxy to route connections to a node hosting the actualserver. To do that, the proxy needs to know the name of the server it tries to locate. As a work-around we currentlyoverload the username parameter to pass in the server name using username@servername convention. It is purely aconvention that our customers need to follow and understand. We would like to extend the PgSQL connection protocol to addan optional parameter for the server name to help with this scenario. We don't actually have any concept of a server name at the moment, and it isn't very clear what introducing that concept would buy. Please explain. > Proposed changes: > Change the Postgres wire protocol to include server name in the startup message. This field can be an optional field drivenby the connection parameters for psql (-N, --servername). > We need this extra parameter for backward compatibility. > Make PostgreSQL server aware of the new field, and accept the startup message containing this field. Though server doesn'tneed this field, this change helps making the server name by default included in the startup message in future. This makes no sense at all. The client is telling the server what the server's name is? You're going to need a very substantially more well-reasoned proposal to have any chance of getting us to make a protocol-level change. regards, tom lane
On 6/15/17 03:20, Satyanarayana Narlapuram wrote: > As a cloud service, Azure Database for PostgreSQL uses a gateway proxy > to route connections to a node hosting the actual server. To do that, > the proxy needs to know the name of the server it tries to locate. As a > work-around we currently overload the username parameter to pass in the > server name using username@servername convention. It is purely a > convention that our customers need to follow and understand. We would > like to extend the PgSQL connection protocol to add an optional > parameter for the server name to help with this scenario. I think this could be useful if it's something like what HTTP uses. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tom Lane wrote: > Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com> writes: > > Change the Postgres wire protocol to include server name in the startup message. This field can be an optional fielddriven by the connection parameters for psql (-N, --servername). > > We need this extra parameter for backward compatibility. > > Make PostgreSQL server aware of the new field, and accept the startup message containing this field. Though server doesn'tneed this field, this change helps making the server name by default included in the startup message in future. > > This makes no sense at all. The client is telling the server what the > server's name is? I think for instance you could have one pgbouncer instance (or whatever pooler) pointing to several different servers. So the client connects to the pooler and indicates which of the servers to connect to. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> This makes no sense at all. The client is telling the server what the >> server's name is? > I think for instance you could have one pgbouncer instance (or whatever > pooler) pointing to several different servers. So the client connects > to the pooler and indicates which of the servers to connect to. I should think that in such cases, the end client is exactly not what you want to be choosing which server it gets redirected to. You'd be wanting to base that on policies defined at the pooler. There are already plenty of client-supplied attributes you could use as inputs for such policies (user name and application name, for instance). Why do we need to incur a protocol break to add another one? regards, tom lane
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> This makes no sense at all. The client is telling the server what the
>> server's name is?
> I think for instance you could have one pgbouncer instance (or whatever
> pooler) pointing to several different servers. So the client connects
> to the pooler and indicates which of the servers to connect to.
I should think that in such cases, the end client is exactly not what
you want to be choosing which server it gets redirected to. You'd
be wanting to base that on policies defined at the pooler. There are
already plenty of client-supplied attributes you could use as inputs
for such policies (user name and application name, for instance).
Why do we need to incur a protocol break to add another one?
On 2017-06-15 09:43:13 -0400, Tom Lane wrote: > Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com> writes: > > As a cloud service, Azure Database for PostgreSQL uses a gateway proxy to route connections to a node hosting the actualserver. To do that, the proxy needs to know the name of the server it tries to locate. As a work-around we currentlyoverload the username parameter to pass in the server name using username@servername convention. It is purely aconvention that our customers need to follow and understand. We would like to extend the PgSQL connection protocol to addan optional parameter for the server name to help with this scenario. > > We don't actually have any concept of a server name at the moment, > and it isn't very clear what introducing that concept would buy. > Please explain. cluster_name could be what's meant? - Andres
-----Original Message----- From: Andres Freund [mailto:andres@anarazel.de] Sent: Friday, June 16, 2017 10:48 AM To: Tom Lane <tgl@sss.pgh.pa.us> Cc: Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com>; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Making server name part of the startup message On 2017-06-15 09:43:13 -0400, Tom Lane wrote: > Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com> writes: > > As a cloud service, Azure Database for PostgreSQL uses a gateway proxy to route connections to a node hosting the actualserver. To do that, the proxy needs to know the name of the server it tries to locate. As a work-around we currentlyoverload the username parameter to pass in the server name using username@servername convention. It is purely aconvention that our customers need to follow and understand. We would like to extend the PgSQL connection protocol to addan optional parameter for the server name to help with this scenario. > > We don't actually have any concept of a server name at the moment, and > it isn't very clear what introducing that concept would buy. > Please explain. cluster_name could be what's meant? Andres, thank you! It is database cluster name as you mentioned. - Andres
Thank you, Alvaro, Andres, Magnus, Peter, and Tom for your thoughts! I consolidated all the responses, and provided the high level overview of the Azure PostgreSQL database architecture, and provided additional details at the bottom of the message. > Tom Lane: We don't actually have any concept of a server name at the > moment, and it isn't very clear what introducing that concept would buy. > Please explain. > Tom Lane: I think for instance you could have one pgbouncer instance (or > whatever pooler) pointing >to several different servers. So the client > connects to the pooler and indicates which of the servers to >connect to. > Magnus: The normal one to use for pgbonucer today is, well, "database > name". You can then have >pgbouncer map different databases to different > backend servers. It's fairly common in my experience >to have things like > "dbname" and "dbname-ro" (for example) as different > database names with one mapping to the master and one mapping to a > load-balanced set of standbys, >and things like that. ISTM that using the > database name is a good choice for that. PgBouncer for example assumes that the database names are unique across the database clusters it is serving. Our front-end Gateways can serve tens of thousands of Postgres servers spanning multiple customers, and organizations, and enforcing the database names being unique is not possible for the users of the service. > Magnus: For the original idea in this thread, using something like > dbname@server seems a more logical choice than username@server. We considered this option but connecting to the database from the GUI tools is not very intuitive / possible. Also /c option now requires including full cluster_name every time user connect. > Tome Lane: I should think that in such cases, the end client is exactly > not what you want to be choosing which server it gets redirected to. > You'd be wanting to base that on policies defined at the pooler. There > are > already plenty of client-supplied attributes you could use as inputs for > such policies (user name and application name, for instance). Why do we > need to incur a protocol break to add another one? This is optional and is not a protocol break. This doesn’t make the cluster name field mandatory in the startup message. If the client specifies the extra parameter in the connection string to include the server name in the startup message then only it will be included otherwise it is not. In a proxy scenario, end clients startup message doesn’t need to include the server name in it, and for proxy it is optional to include this field while sending the startup message to the server. It is preferred to set the field for the Azure PostgreSQL service instead of appending the cluster name to the user name. Yes, there are other fields like application name, but it is not easy to use them from GUI tools like Pg Admin. Overloading a field is also not always intuitive to the users, and some of the applications potentially using them for different purposes. Default database name is the user name for some of the clients, and as we are overloading user name today and the startup message has user@clustername in it. This behavior causing logins to fail with invalid database name as the database doesn’t exist on the server. Using database name may not be ideal because GUI tools doesn’t assume database has server name in it. > Peter: I think this could be useful if it's something like what HTTP uses. The proposal is similar to http host header field in HTTP1.1. This allows the origin server or gateway to differentiate between internally-ambiguous URLs, such as the root "/" URL of a server for multiple host names on a single IP address. For reference, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23 Azure database for PostgreSQL follows the similar pattern where all the database cluster dns records points to our gateways nodes, and we would like to resolve them based on the server name field. High level overview of Azure Database For PostgreSQL service. Azure database for PostgreSQL is a cloud service that hosts several PostgreSQL database clusters a.k.a. servers with in a region, potentially tens of thousands of database clusters. We have several front-end proxies (called Gateways) deployed to proxy the customer connections to the appropriate database cluster. Each database cluster has a unique name, and a DNS record is provisioned with that name. For example, if pgserver is the name of the database cluster, then it has a dns record pgserver.postgres.database.azure.com associated with it. The DNS record of a customer database server will be pointing to the front-end Gateways, and the customer request reaches these Gateways. Gateway requires database cluster name to proxy the connection to the appropriate database cluster. In the absence of this it is impossible for us to proxy the request. Startup message containing the server name helps us route the requests to the right database clusters, without customers overloading the user name field in the startup message. For a friction free user experience, we are trying to make the changes in the client code, PostgreSQL code, and minimize the changes required in the application. Here is the logical flow of server creation, and the connection establishment: Database cluster provisioning workflow: 1. Customer requests Azure to create a cluster 2. Azure registers the request, and adds the necessary metadata in the central metadata store 3. Create a database cluster with the customer requested size, and with the provided user name and password combination 4. DNS record created for the server for the customer to connect to. This points to our front-end gateway nodes 5. Make proxies aware of the location of the database cluster created, and proxies caches this info, and refreshes if location changes Connectivity workflow: 1. Client uses the host name provided to connect to the database cluster provisioned in Azure, and today they include database cluster name in the user name (username@pgserver) 2. Request reaches the Gateway (proxy) 3. Gateway parses the packet, does the lookup, and creates a proxy connection to the right database cluster (in the lookup success path) 4. SSL handshake happens between the Postgres process, and the Gateway 5. Gateway sends the startup packet removing the hostname from the startup message 6. Postgres responds with the password authentication request message 7. Gateway sends the auth request back to the client 8. Client response with the password message 9. Gateway forwards the same to the database cluster 10. Database cluster authenticates the client and sends success / failure response to the client LibPQ Connection string format: host=localhost port=5432 dbname=mydb connect_timeout=10 include_cluster_name=true include_cluster_name is an optional parameter and setting this true includes cluster_name in the startup message, and will be ignored otherwise. Connection via psql: Today customers connect as follows: psql -h pgserver.postgres.database.azure.com -U pgadmin@pgserver -d Postgres After the proposed change the same command looks like: psql -h pgserver.postgres.database.azure.com -U pgadmin -d Postgres --includeClusterName or psql -h pgserver.postgres.database.azure.com -U pgadmin -d Postgres -N -N / -- includeClusterName is optional here. -- View this message in context: http://www.postgresql-archive.org/Making-server-name-part-of-the-startup-message-tp5966904p5967347.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
PgBouncer for example assumes that the database names are unique across the database clusters it is serving. Our front-end Gateways can serve tens of thousands of Postgres servers spanning multiple customers, and organizations, and enforcing the database names being unique is not possible for the users of the service.
> For the original idea in this thread, using something like dbname@server seems a more logical choice than username@server.
We considered this option but connecting to the database from the GUI tools is not very intuitive / possible. Also /c switch in Psql requires including full cluster_name every time user connect to a different database.
Thanks,
Satya
From: Magnus Hagander [mailto:magnus@hagander.net]
Sent: Thursday, June 15, 2017 9:24 AM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Alvaro Herrera <alvherre@2ndquadrant.com>; Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com>; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Making server name part of the startup message
On Thu, Jun 15, 2017 at 5:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> This makes no sense at all. The client is telling the server what the
>> server's name is?
> I think for instance you could have one pgbouncer instance (or whatever
> pooler) pointing to several different servers. So the client connects
> to the pooler and indicates which of the servers to connect to.
I should think that in such cases, the end client is exactly not what
you want to be choosing which server it gets redirected to. You'd
be wanting to base that on policies defined at the pooler. There are
already plenty of client-supplied attributes you could use as inputs
for such policies (user name and application name, for instance).
Why do we need to incur a protocol break to add another one?
The normal one to use for pgbonucer today is, well, "database name". You can then have pgbouncer map different databases to different backend servers. It's fairly common in my experience to have things like "dbname" and "dbname-ro" (for example) as different database names with one mapping to the master and one mapping to a load-balanced set of standbys, and things like that. ISTM that using the database name is a good choice for that.
For the original idea in this thread, using something like dbname@server seems a more logical choice than username@server.
TBH, so maybe I'm misunderstanding the original issue?
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
> I should think that in such cases, the end client is exactly not what you want to be choosing which server it gets redirectedto. You'd be wanting to base that on >policies defined at the pooler. There are already plenty of client-suppliedattributes you could use as inputs for such policies (user name and application name, for >instance). Pooler would be the end client for the Postgres database cluster, and connection string changes are required at the pooler.There is no change in the connection string format in such cases. >Why do we need to incur a protocol break to add another one? This is optional and is not a protocol break. This doesn't make the cluster name field mandatory in the startup message.If the client specifies the extra parameter in the connection string to include the server name in the startup message,then only it will be included otherwise it is not. In a proxy scenario, end client's startup message doesn't needto include the server name in it, and for proxy it is optional to include this field while sending the startup messageto the server. It is preferred to set the field for the Azure PostgreSQL service instead of appending the clustername to the user name. Proposed LibPQ connection string format would be: host=localhost port=5432 dbname=mydb connect_timeout=10 include_cluster_name=true include_cluster_name is an optional parameter and setting this true includes cluster_name in the startup message and willnot be included otherwise. Thanks, Satya -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Thursday, June 15, 2017 8:58 AM To: Alvaro Herrera <alvherre@2ndquadrant.com> Cc: Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com>; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Making server name part of the startup message Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> This makes no sense at all. The client is telling the server what >> the server's name is? > I think for instance you could have one pgbouncer instance (or > whatever > pooler) pointing to several different servers. So the client connects > to the pooler and indicates which of the servers to connect to. I should think that in such cases, the end client is exactly not what you want to be choosing which server it gets redirectedto. You'd be wanting to base that on policies defined at the pooler. There are already plenty of client-suppliedattributes you could use as inputs for such policies (user name and application name, for instance). Why do we need to incur a protocol break to add another one? regards, tom lane
Satyanarayana Narlapuram <Satyanarayana.Narlapuram@microsoft.com> writes: >> Why do we need to incur a protocol break to add another one? > This is optional and is not a protocol break. Yes, it is. We've been around on this sort of thing before and we understand the consequences. If the option is carried in the startup message, the client has to send it without knowing whether the server is of new enough version to accept it. If not, the server will reject the connection (with a scary looking message in its log) and the client then has to retry without the option. This is not distinguishable from what you have to do if you consider the startup message as belonging to a new protocol version 4 instead of 3. We have done this in the past, but it's painful, subject to bugs, and generally is a pretty high price to pay for a marginal feature. regards, tom lane