Thread: Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

From
Tatsuo Ishii
Date:
> The following bug has been logged on the website:
> 
> Bug reference:      13611
> Logged by:          Kondo Yuta
> Email address:      kondo@sraoss.co.jp
> PostgreSQL version: 9.4.4
> Operating system:   Windows 7
> Description:        
> 
> Hello,
> 
> According to the PostgreSQL documentation, listen_addresses = '0.0.0.0' and
> '::' are allowed.
> http://www.postgresql.org/docs/9.4/static/runtime-config-connection.html#GUC-LISTEN-ADDRESSES
> 
> But I found that "pg_ctl -w ..." times out in its connection test on Windows
> with listen_addresses = '0.0.0.0' or '::'.
> 
> I found the reason in src/bin/pg_ctl/pg_ctl.c
> [test_postmaster_connection(bool)].
> 
> When pg_ctl tries to connect to postmaster, it uses "0.0.0.0" as the
> target ip address. Unfortunately "0.0.0.0" is not a valid address on
> Windows and it fails. Shouldn't pg_ctl translate "0.0.0.0" to
> "127.0.0.1" in this case?

I think this is definitely a bug. I privately heard from the reporter
that if postmaster is started by not using pg_ctl, it happily starts
with listen_addresses = '0.0.0.0'. That means, postmaster itself
works as advertised, but pg_ctl does not.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Tatsuo Ishii <ishii@postgresql.org> writes:
>> When pg_ctl tries to connect to postmaster, it uses "0.0.0.0" as the
>> target ip address. Unfortunately "0.0.0.0" is not a valid address on
>> Windows and it fails. Shouldn't pg_ctl translate "0.0.0.0" to
>> "127.0.0.1" in this case?

> I think this is definitely a bug. I privately heard from the reporter
> that if postmaster is started by not using pg_ctl, it happily starts
> with listen_addresses = '0.0.0.0'. That means, postmaster itself
> works as advertised, but pg_ctl does not.

I looked at this before, and could not see anything in either the
postmaster or pg_ctl that would invent the address 0.0.0.0 out of
thin air.  I think this report most likely depends on some
misconfiguration of the OP's system.  I doubt it should be our business
to work around such misconfiguration.  In particular, magically
substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.
        regards, tom lane



> Tatsuo Ishii <ishii@postgresql.org> writes:
>>> When pg_ctl tries to connect to postmaster, it uses "0.0.0.0" as the
>>> target ip address. Unfortunately "0.0.0.0" is not a valid address on
>>> Windows and it fails. Shouldn't pg_ctl translate "0.0.0.0" to
>>> "127.0.0.1" in this case?
> 
>> I think this is definitely a bug. I privately heard from the reporter
>> that if postmaster is started by not using pg_ctl, it happily starts
>> with listen_addresses = '0.0.0.0'. That means, postmaster itself
>> works as advertised, but pg_ctl does not.
> 
> I looked at this before, and could not see anything in either the
> postmaster or pg_ctl that would invent the address 0.0.0.0 out of
> thin air.  I think this report most likely depends on some
> misconfiguration of the OP's system.  I doubt it should be our business
> to work around such misconfiguration.  In particular, magically
> substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.

Are you saying that listen_addresses = '0.0.0.0' should work on Windows?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



On Mon, Sep 14, 2015 at 12:36:14AM -0400, Tom Lane wrote:
> Tatsuo Ishii <ishii@postgresql.org> writes:
> >> When pg_ctl tries to connect to postmaster, it uses "0.0.0.0" as the
> >> target ip address. Unfortunately "0.0.0.0" is not a valid address on
> >> Windows and it fails. Shouldn't pg_ctl translate "0.0.0.0" to
> >> "127.0.0.1" in this case?
> 
> > I think this is definitely a bug. I privately heard from the reporter
> > that if postmaster is started by not using pg_ctl, it happily starts
> > with listen_addresses = '0.0.0.0'. That means, postmaster itself
> > works as advertised, but pg_ctl does not.
> 
> I looked at this before, and could not see anything in either the
> postmaster or pg_ctl that would invent the address 0.0.0.0 out of
> thin air.  I think this report most likely depends on some
> misconfiguration of the OP's system.  I doubt it should be our business
> to work around such misconfiguration.

Use of "0.0.0.0" or "::" as a socket destination address is not portable.  The
Windows connect() documentation says, "If the address member of the structure
specified by the name parameter is filled with zeros, connect will return the
error WSAEADDRNOTAVAIL."  OpenBSD 5.0 behaves the same way.  NetBSD 6.0 does
not accept ::, but it accepts 0.0.0.0.  (For this to affect pg_ctl on
non-Windows platforms, you would need to empty unix_socket_directories.)

> In particular, magically
> substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.

Binding a listening socket to "0.0.0.0" listens on every local IPv4 address,
and 127.0.0.1 is one of those addresses.  That's the principle.  It's
inelegant, but I expect it to work everywhere.

nm
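
Editor's sketch, not part of the thread: the principle above can be checked with plain POSIX sockets. The helper below is hypothetical (its name and structure are not taken from pg_ctl or libpq). It binds a listener to 0.0.0.0 on a kernel-chosen port and then connects to 127.0.0.1 on that port. It deliberately does not attempt connect() to 0.0.0.0, since that is the non-portable part under discussion.

```c
#define _POSIX_C_SOURCE 200112L

#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from pg_ctl): listen on INADDR_ANY ("0.0.0.0")
 * using an ephemeral port, then connect to 127.0.0.1 on that same port.
 * Returns 0 if the loopback connect succeeds, -1 on any failure.
 */
static int
wildcard_listener_reachable_via_loopback(void)
{
    struct sockaddr_in addr;
    socklen_t   len = sizeof(addr);
    int         lsock;
    int         csock;
    int         rc = -1;

    lsock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* the "0.0.0.0" wildcard */
    addr.sin_port = 0;                          /* let the kernel pick a port */

    if (bind(lsock, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
        listen(lsock, 1) == 0 &&
        getsockname(lsock, (struct sockaddr *) &addr, &len) == 0)
    {
        assert(addr.sin_family == AF_INET);

        csock = socket(AF_INET, SOCK_STREAM, 0);
        if (csock >= 0)
        {
            /* Keep the kernel-chosen port, but aim at loopback. */
            addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
            if (connect(csock, (struct sockaddr *) &addr, sizeof(addr)) == 0)
                rc = 0;
            close(csock);
        }
    }

    close(lsock);
    return rc;
}
```

On a host with a working IPv4 loopback interface, this should return 0. It shows only that a wildcard listener is reachable through 127.0.0.1, and says nothing about Windows socket behavior.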



On Thu, Oct 8, 2015 at 11:26 PM, Noah Misch <noah@leadboat.com> wrote:
>> In particular, magically
>> substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.
>
> Binding a listening socket to "0.0.0.0" listens on every local IPv4 address,
> and 127.0.0.1 is one of those addresses.  That's the principle.  It's
> inelegant, but I expect it to work everywhere.

But... what about the machine's other addresses?

If Windows doesn't treat 0.0.0.0 to mean listen on every interface,
that's a shame.  But making it only listen on 127.0.0.1 and not any of
the others does not seem better.  Then, instead of 0.0.0.0 failing on
Windows, it would instead work but with different behavior.  That
doesn't seem good either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Fri, Oct 09, 2015 at 03:14:26PM -0400, Robert Haas wrote:
> On Thu, Oct 8, 2015 at 11:26 PM, Noah Misch <noah@leadboat.com> wrote:
> >> In particular, magically
> >> substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.
> >
> > Binding a listening socket to "0.0.0.0" listens on every local IPv4 address,
> > and 127.0.0.1 is one of those addresses.  That's the principle.  It's
> > inelegant, but I expect it to work everywhere.
> 
> But... what about the machine's other addresses?
> 
> If Windows doesn't treat 0.0.0.0 to mean listen on every interface,
> that's a shame.  But making it only listen on 127.0.0.1 and not any of
> the others does not seem better.  Then, instead of 0.0.0.0 failing on
> Windows, it would instead work but with different behavior.  That
> doesn't seem good either.

The listening side is in good shape today.  This thread is about the address
that pg_ctl uses in PQping("host=...").  Listening on 0.0.0.0 is portable.
PQping("host='0.0.0.0'") relies on non-portable semantics in the underlying
connect() syscall.  PQping("host='127.0.0.1'") is a portable substitute.



On Fri, Oct 9, 2015 at 10:16 PM, Noah Misch <noah@leadboat.com> wrote:
> On Fri, Oct 09, 2015 at 03:14:26PM -0400, Robert Haas wrote:
>> On Thu, Oct 8, 2015 at 11:26 PM, Noah Misch <noah@leadboat.com> wrote:
>> >> In particular, magically
>> >> substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.
>> >
>> > Binding a listening socket to "0.0.0.0" listens on every local IPv4 address,
>> > and 127.0.0.1 is one of those addresses.  That's the principle.  It's
>> > inelegant, but I expect it to work everywhere.
>>
>> But... what about the machine's other addresses?
>>
>> If Windows doesn't treat 0.0.0.0 to mean listen on every interface,
>> that's a shame.  But making it only listen on 127.0.0.1 and not any of
>> the others does not seem better.  Then, instead of 0.0.0.0 failing on
>> Windows, it would instead work but with different behavior.  That
>> doesn't seem good either.
>
> The listening side is in good shape today.  This thread is about the address
> that pg_ctl uses in PQping("host=...").  Listening on 0.0.0.0 is portable.
> PQping("host='0.0.0.0'") relies on non-portable semantics in the underlying
> connect() syscall.  PQping("host='127.0.0.1'") is a portable substitute.

Ah.  So in this case 0.0.0.0 is interpreted to mean "any IP that's a
way to reach the local host", and using 127.0.0.1 makes sense because
we know that will always be one of them?  I could buy that line of
reasoning.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Mon, Oct 12, 2015 at 07:37:42PM -0400, Robert Haas wrote:
> On Fri, Oct 9, 2015 at 10:16 PM, Noah Misch <noah@leadboat.com> wrote:
> > On Fri, Oct 09, 2015 at 03:14:26PM -0400, Robert Haas wrote:
> >> On Thu, Oct 8, 2015 at 11:26 PM, Noah Misch <noah@leadboat.com> wrote:
> >> >> In particular, magically
> >> >> substituting 127.0.0.1 for 0.0.0.0 seems utterly without principle.
> >> >
> >> > Binding a listening socket to "0.0.0.0" listens on every local IPv4 address,
> >> > and 127.0.0.1 is one of those addresses.  That's the principle.  It's
> >> > inelegant, but I expect it to work everywhere.
> >>
> >> But... what about the machine's other addresses?
> >>
> >> If Windows doesn't treat 0.0.0.0 to mean listen on every interface,
> >> that's a shame.  But making it only listen on 127.0.0.1 and not any of
> >> the others does not seem better.  Then, instead of 0.0.0.0 failing on
> >> Windows, it would instead work but with different behavior.  That
> >> doesn't seem good either.
> >
> > The listening side is in good shape today.  This thread is about the address
> > that pg_ctl uses in PQping("host=...").  Listening on 0.0.0.0 is portable.
> > PQping("host='0.0.0.0'") relies on non-portable semantics in the underlying
> > connect() syscall.  PQping("host='127.0.0.1'") is a portable substitute.
> 
> Ah.  So in this case 0.0.0.0 is interpreted to mean "any IP that's a
> way to reach the local host", and using 127.0.0.1 makes sense because
> we know that will always be one of them?  I could buy that line of
> reasoning.

Exactly.



Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Oct 9, 2015 at 10:16 PM, Noah Misch <noah@leadboat.com> wrote:
>> The listening side is in good shape today.  This thread is about the address
>> that pg_ctl uses in PQping("host=...").  Listening on 0.0.0.0 is portable.
>> PQping("host='0.0.0.0'") relies on non-portable semantics in the underlying
>> connect() syscall.  PQping("host='127.0.0.1'") is a portable substitute.

> Ah.  So in this case 0.0.0.0 is interpreted to mean "any IP that's a
> way to reach the local host", and using 127.0.0.1 makes sense because
> we know that will always be one of them?  I could buy that line of
> reasoning.

I do *not* buy that we can safely replace "localhost" by "127.0.0.1".
Consider a system that's only set up with IPv6 local addressing.

AFAICS the complaint in this bug is about a system with broken DNS (ie,
unable to resolve "localhost" properly, which is something mandated by
relevant RFCs, I believe).  We should not break legally, if perhaps
strangely, configured systems in order to work around a misconfigured one.
Failure to resolve localhost to a working loopback address will break a
lot of services besides ours, so the OP is going to need to fix the
configuration problem eventually anyway.

BTW, it looks like pgstat.c will work in such a case, but it's only
accidentally so; as Noah notes, listening to 0.0.0.0 works, and then
it looks like we get the actual IP address for backends to connect to
from the listen socket itself.  That wasn't a situation that the code
was meant to handle, for sure, and I wouldn't want to bet that it would
work like that on every platform.

Now, if you're proposing trying "localhost" and falling back to 127.0.0.1
if it fails to resolve, that would be OK with me --- but it would be a
significant amount of additional complexity for what remains a case I do
not think we need to support.
        regards, tom lane



On Mon, Oct 12, 2015 at 08:07:37PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Fri, Oct 9, 2015 at 10:16 PM, Noah Misch <noah@leadboat.com> wrote:
> >> The listening side is in good shape today.  This thread is about the address
> >> that pg_ctl uses in PQping("host=...").  Listening on 0.0.0.0 is portable.
> >> PQping("host='0.0.0.0'") relies on non-portable semantics in the underlying
> >> connect() syscall.  PQping("host='127.0.0.1'") is a portable substitute.
> 
> > Ah.  So in this case 0.0.0.0 is interpreted to mean "any IP that's a
> > way to reach the local host", and using 127.0.0.1 makes sense because
> > we know that will always be one of them?  I could buy that line of
> > reasoning.
> 
> I do *not* buy that we can safely replace "localhost" by "127.0.0.1".

Nobody proposed that.  The word "localhost" did not appear in this thread.

> Consider a system that's only set up with IPv6 local addressing.
> 
> AFAICS the complaint in this bug is about a system with broken DNS (ie,
> unable to resolve "localhost" properly, which is something mandated by
> relevant RFCs, I believe).

The original post used only "0.0.0.0" and "::", not "localhost" or anything
else entailing name resolution.  As I wrote above, Kondo proposed for pg_ctl
to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
That's all.  pg_ctl would continue to use PQping("host='localhost'") where
it's doing so today.  A patch that changes the 0.0.0.0 case in this way should
also change PQping("host='::'") to PQping("host='::1'"); I suspect that was
implicit in the original proposal.

nm



> The original post used only "0.0.0.0" and "::", not "localhost" or anything
> else entailing name resolution.  As I wrote above, Kondo proposed for pg_ctl
> to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
> That's all.  pg_ctl would continue to use PQping("host='localhost'") where
> it's doing so today.  A patch that changes the 0.0.0.0 case in this way should
> also change PQping("host='::'") to PQping("host='::1'"); I suspect that was
> implicit in the original proposal.

Has anybody already written a patch in this direction, or is anybody willing
to do it? If not, I (or Kondo) would like to write the patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Tatsuo Ishii <ishii@postgresql.org> writes:
>> The original post used only "0.0.0.0" and "::", not "localhost" or anything
>> else entailing name resolution.  As I wrote above, Kondo proposed for pg_ctl
>> to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
>> That's all.  pg_ctl would continue to use PQping("host='localhost'") where
>> it's doing so today.

AFAICS, the only hard-wired hostname reference in pg_ctl is "localhost",
not "127.0.0.1" (much less "0.0.0.0").  So what you're proposing doesn't
seem to me to have anything to do with what's there.  I continue to think
that the OP's complaint is somehow founded on a bad address obtained by
looking up "localhost", because where else would it've come from?
        regards, tom lane



On 2015-10-22 16:15:10 -0700, Tom Lane wrote:
> AFAICS, the only hard-wired hostname reference in pg_ctl is "localhost",
> not "127.0.0.1" (much less "0.0.0.0").  So what you're proposing doesn't
> seem to me to have anything to do with what's there.  I continue to think
> that the OP's complaint is somehow founded on a bad address obtained by
> looking up "localhost", because where else would it've come from?

I've not read this thread, and this is just referencing the above:

Perhaps we should start to emit a notice at startup if localhost doesn't
resolve to either v4 or v6 definitions. The few environments where
that's indeed intentionally not the case, should be fine with such a
message at pgstat startup.

- Andres



Andres Freund <andres@anarazel.de> writes:
> Perhaps we should start to emit a notice at startup if localhost doesn't
> resolve to either v4 or v6 definitions. The few environments where
> that's indeed intentionally not the case, should be fine with such a
> message at pgstat startup.

I think there's already a complaint of that sort coming out of the
pgstat machinery, because it will fail to establish the statistics
reporting socket.
        regards, tom lane



On Thu, Oct 22, 2015 at 04:15:10PM -0700, Tom Lane wrote:
> Tatsuo Ishii <ishii@postgresql.org> writes:
> >> The original post used only "0.0.0.0" and "::", not "localhost" or anything
> >> else entailing name resolution.  As I wrote above, Kondo proposed for pg_ctl
> >> to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
> >> That's all.  pg_ctl would continue to use PQping("host='localhost'") where
> >> it's doing so today.
> > 
> > Has anybody already written a patch in this direction, or is anybody willing
> > to do it? If not, I (or Kondo) would like to write the patch.

I have not; please do.

> AFAICS, the only hard-wired hostname reference in pg_ctl is "localhost",
> not "127.0.0.1" (much less "0.0.0.0").  So what you're proposing doesn't
> seem to me to have anything to do with what's there.  I continue to think
> that the OP's complaint is somehow founded on a bad address obtained by
> looking up "localhost", because where else would it've come from?

pg_ctl reads the address from postmaster.pid, which in turn derives from
listen_addresses:

$ grep -E '(unix|listen)' postgresql.conf
listen_addresses = '0.0.0.0'
unix_socket_directories = ''
$ strace -e connect pg_ctl -D . -w start
--- SIGCHLD (Child exited) @ 0 (0) ---
waiting for server to start...connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(6432), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
403978 2015-10-23 00:45:06.677 GMT LOG:  redirecting log output to logging collector process
403978 2015-10-23 00:45:06.677 GMT HINT:  Future log output will appear in directory "..".
 done
server started
Process 403975 detached



Noah Misch <noah@leadboat.com> writes:
> On Thu, Oct 22, 2015 at 04:15:10PM -0700, Tom Lane wrote:
>> I continue to think
>> that the OP's complaint is somehow founded on a bad address obtained by
>> looking up "localhost", because where else would it've come from?

> pg_ctl reads the address from postmaster.pid, which in turn derives from
> listen_addresses:

> $ grep -E '(unix|listen)' postgresql.conf
> listen_addresses = '0.0.0.0'
> unix_socket_directories = ''

Hmm, now I see.  I was about to object that that's a pretty silly setting,
but I see that we actually document it as supported.

> $ strace -e connect pg_ctl -D . -w start
> --- SIGCHLD (Child exited) @ 0 (0) ---
> waiting for server to start...connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(3, {sa_family=AF_INET, sin_port=htons(6432), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
> 403978 2015-10-23 00:45:06.677 GMT LOG:  redirecting log output to logging collector process
> 403978 2015-10-23 00:45:06.677 GMT HINT:  Future log output will appear in directory "..".
>  done
> server started
> Process 403975 detached

... although this trace appears to show pg_ctl working just fine with this
setting, which kinda weakens your theory.  Still, it wouldn't be the first
thing we've seen fail on Windows but work elsewhere.

I'd be inclined to suggest fixing it like this:
                        /* If postmaster is listening on "*", use localhost */
-                       if (strcmp(host_str, "*") == 0)
+                       if (strcmp(host_str, "*") == 0 ||
+                           strcmp(host_str, "0.0.0.0") == 0 ||
+                           strcmp(host_str, "::") == 0)
                            strcpy(host_str, "localhost");

which covers the cases we document as supported.

A different line of thought would be to teach the postmaster to record
actually bound-to addresses in postmaster.pid, rather than regurgitating
the listen_addresses setting verbatim.  That would likely be a lot harder
(and less portable); though if we think there's anything besides pg_ctl
looking at this field, it might be worth trying.
        regards, tom lane



On Thu, Oct 22, 2015 at 07:59:27PM -0700, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > pg_ctl reads the address from postmaster.pid, which in turn derives from
> > listen_addresses:
> 
> > $ grep -E '(unix|listen)' postgresql.conf
> > listen_addresses = '0.0.0.0'
> > unix_socket_directories = ''
> 
> Hmm, now I see.  I was about to object that that's a pretty silly setting,
> but I see that we actually document it as supported.
> 
> > $ strace -e connect pg_ctl -D . -w start
> > --- SIGCHLD (Child exited) @ 0 (0) ---
> > waiting for server to start...connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> > connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> > connect(3, {sa_family=AF_INET, sin_port=htons(6432), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
> > 403978 2015-10-23 00:45:06.677 GMT LOG:  redirecting log output to logging collector process
> > 403978 2015-10-23 00:45:06.677 GMT HINT:  Future log output will appear in directory "..".
> >  done
> > server started
> > Process 403975 detached
> 
> ... although this trace appears to show pg_ctl working just fine with this
> setting, which kinda weakens your theory.  Still, it wouldn't be the first
> thing we've seen fail on Windows but work elsewhere.

As I stated upthread, PQping("host='0.0.0.0'") is _not portable_.  It works on
GNU/Linux, which I used for that demo.  It fails on OpenBSD and Windows.

> I'd be inclined to suggest fixing it like this:
> 
>                         /* If postmaster is listening on "*", use localhost */
> -                       if (strcmp(host_str, "*") == 0)
> +                       if (strcmp(host_str, "*") == 0 ||
> +                           strcmp(host_str, "0.0.0.0") == 0 ||
> +                           strcmp(host_str, "::") == 0)
>                             strcpy(host_str, "localhost");
> 
> which covers the cases we document as supported.

On RHEL 5 and some other "active adult" systems, "localhost" does not reach a
listen_addresses='::' server.  IPv6 is available, but "localhost" resolves to
127.0.0.1 only.

The latest systems resolve "localhost" to both 127.0.0.1 and ::1, in which
case PQping("host='localhost'") will attempt both addresses in an unspecified
order.  Given a postmaster with listen_addresses='0.0.0.0', contacting ::1
first will fail (fine) or reach a different postmaster (not fine).

Kondo's design is correct.

> A different line of thought would be to teach the postmaster to record
> actually bound-to addresses in postmaster.pid, rather than regurgitating
> the listen_addresses setting verbatim.  That would likely be a lot harder
> (and less portable); though if we think there's anything besides pg_ctl
> looking at this field, it might be worth trying.

Binding to INADDR_ANY is not equivalent to binding to some list of concrete
addresses.

nm
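
Editor's sketch, not part of the thread: to see which case a given machine falls into, one can ask the resolver what "localhost" maps to. The function name below is hypothetical. It uses only the standard getaddrinfo() interface and counts IPv4 and IPv6 results. It does not show which address libpq would try first, which the message above describes as unspecified.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: count how many IPv4 and IPv6 addresses the resolver
 * returns for "localhost".  Returns 0 on success, or the getaddrinfo()
 * error code on failure (both counts are then left at 0).
 */
static int
count_localhost_addrs(int *n_ipv4, int *n_ipv6)
{
    struct addrinfo hints;
    struct addrinfo *res;
    struct addrinfo *ai;
    int         rc;

    assert(n_ipv4 != NULL && n_ipv6 != NULL);
    *n_ipv4 = 0;
    *n_ipv6 = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo("localhost", NULL, &hints, &res);
    if (rc != 0)
        return rc;

    for (ai = res; ai != NULL; ai = ai->ai_next)
    {
        if (ai->ai_family == AF_INET)
            (*n_ipv4)++;
        else if (ai->ai_family == AF_INET6)
            (*n_ipv6)++;
    }

    freeaddrinfo(res);
    return 0;
}
```

A result with a nonzero IPv4 count and a zero IPv6 count corresponds to the RHEL 5 style setup described above. Nonzero counts for both correspond to the newer setups where the connection order matters.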



Re: Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

From
Peter Eisentraut
Date:
On 10/23/15 11:10 PM, Noah Misch wrote:
> On RHEL 5 and some other "active adult" systems, "localhost" does not reach a
> listen_addresses='::' server.  IPv6 is available, but "localhost" resolves to
> 127.0.0.1 only.
> 
> The latest systems resolve "localhost" to both 127.0.0.1 and ::1, in which
> case PQping("host='localhost'") will attempt both addresses in an unspecified
> order.  Given a postmaster with listen_addresses='0.0.0.0', contacting ::1
> first will fail (fine) or reach a different postmaster (not fine).

A design I have seen in some other systems is that you specify as a
configuration parameter an address by which you want to be contacted by
admin tools.  This might be overkill here since we only need to be
contacted by a local client and the discussion is hoping to handle those
cases, but if not it would be a more principled solution.




> As I stated upthread, PQping("host='0.0.0.0'") is _not portable_.  It works on
> GNU/Linux, which I used for that demo.  It fails on OpenBSD and Windows.
> 
>> I'd be inclined to suggest fixing it like this:
>> 
>>                         /* If postmaster is listening on "*", use localhost */
>> -                       if (strcmp(host_str, "*") == 0)
>> +                       if (strcmp(host_str, "*") == 0 ||
>> +                           strcmp(host_str, "0.0.0.0") == 0 ||
>> +                           strcmp(host_str, "::") == 0)
>>                             strcpy(host_str, "localhost");
>> 
>> which covers the cases we document as supported.
> 
> On RHEL 5 and some other "active adult" systems, "localhost" does not reach a
> listen_addresses='::' server.  IPv6 is available, but "localhost" resolves to
> 127.0.0.1 only.
> 
> The latest systems resolve "localhost" to both 127.0.0.1 and ::1, in which
> case PQping("host='localhost'") will attempt both addresses in an unspecified
> order.  Given a postmaster with listen_addresses='0.0.0.0', contacting ::1
> first will fail (fine) or reach a different postmaster (not fine).
> 
> Kondo's design is correct.

So a more proper fix would look like this?
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index dacdfef..23d5a3c 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -646,9 +646,11 @@ test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint)
                             return PQPING_NO_ATTEMPT;
                         }
 
-                        /* If postmaster is listening on "*", use localhost */
-                        if (strcmp(host_str, "*") == 0)
-                            strcpy(host_str, "localhost");
+                        /* If postmaster is listening on "*", "0.0.0.0" or "::", use 127.0.0.1 */
+                        if (strcmp(host_str, "*") == 0 ||
+                            strcmp(host_str, "0.0.0.0") == 0 ||
+                            strcmp(host_str, "::") == 0)
+                            strcpy(host_str, "127.0.0.1");
 
                         /*
                          * We need to set connect_timeout otherwise on Windows

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



On Mon, Oct 26, 2015 at 09:54:03AM +0900, Tatsuo Ishii wrote:
> > Kondo's design is correct.
> 
> So a more proper fix would look like this?

> +                        /* If postmaster is listening on "*", "0.0.0.0" or "::", use 127.0.0.1 */
> +                        if (strcmp(host_str, "*") == 0 ||
> +                            strcmp(host_str, "0.0.0.0") == 0 ||
> +                            strcmp(host_str, "::") == 0)
> +                            strcpy(host_str, "127.0.0.1");

No, PQping("host='127.0.0.1'") fails to reach a listen_addresses='::' server
on many systems.  Here's what I thought Kondo was proposing:

--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -649,5 +649,9 @@ test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint)
 
-                        /* If postmaster is listening on "*", use localhost */
+                        /* explanation here */
                         if (strcmp(host_str, "*") == 0)
                             strcpy(host_str, "localhost");
+                        else if (strcmp(host_str, "0.0.0.0") == 0)
+                            strcpy(host_str, "127.0.0.1");
+                        else if (strcmp(host_str, "::") == 0)
+                            strcpy(host_str, "::1");
 
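
Editor's sketch, not part of the thread: the same mapping as a standalone function, so it can be compiled and checked outside pg_ctl. The name probe_host_for is hypothetical. The real code rewrites host_str in place with strcpy(), whereas this version returns a pointer for simplicity.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical standalone version of the mapping in the diff above:
 * choose the address to probe, given the listen address that was read
 * from postmaster.pid.  Wildcards map to a loopback of the same family;
 * anything else is returned unchanged.
 */
static const char *
probe_host_for(const char *listen_addr)
{
    assert(listen_addr != NULL);

    if (strcmp(listen_addr, "*") == 0)
        return "localhost";     /* existing behavior, left as is */
    if (strcmp(listen_addr, "0.0.0.0") == 0)
        return "127.0.0.1";     /* IPv4 wildcard -> IPv4 loopback */
    if (strcmp(listen_addr, "::") == 0)
        return "::1";           /* IPv6 wildcard -> IPv6 loopback */

    return listen_addr;         /* concrete address or host name */
}
```

Keeping each wildcard within its own address family is the point of the correction above: a server listening on '::' may not be reachable through 127.0.0.1.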



Noah,

> No, PQping("host='127.0.0.1'") fails to reach a listen_addresses='::' server
> on many systems.  Here's what I thought Kondo was proposing:
> 
> --- a/src/bin/pg_ctl/pg_ctl.c
> +++ b/src/bin/pg_ctl/pg_ctl.c
> @@ -649,5 +649,9 @@ test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint)
>  
> -                        /* If postmaster is listening on "*", use localhost */
> +                        /* explanation here */
>                          if (strcmp(host_str, "*") == 0)
>                              strcpy(host_str, "localhost");
> +                        else if (strcmp(host_str, "0.0.0.0") == 0)
> +                            strcpy(host_str, "127.0.0.1");
> +                        else if (strcmp(host_str, "::") == 0)
> +                            strcpy(host_str, "::1");
>  

I see. Would you like to commit this?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



On Tue, Oct 27, 2015 at 05:31:25PM +0900, Tatsuo Ishii wrote:
> > No, PQping("host='127.0.0.1'") fails to reach a listen_addresses='::' server
> > on many systems.  Here's what I thought Kondo was proposing:
> > 
> > --- a/src/bin/pg_ctl/pg_ctl.c
> > +++ b/src/bin/pg_ctl/pg_ctl.c
> > @@ -649,5 +649,9 @@ test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint)
> >  
> > -                        /* If postmaster is listening on "*", use localhost */
> > +                        /* explanation here */
> >                          if (strcmp(host_str, "*") == 0)
> >                              strcpy(host_str, "localhost");
> > +                        else if (strcmp(host_str, "0.0.0.0") == 0)
> > +                            strcpy(host_str, "127.0.0.1");
> > +                        else if (strcmp(host_str, "::") == 0)
> > +                            strcpy(host_str, "::1");
> >  
> 
> I see. Would you like to commit this?

I am happy to finish it, but I am no less happy if you finish it.  Which do
you prefer?

Should the back-branch commits mirror the master branch?  A more-cautious
alternative would be to, in back branches, wrap the change in #ifdefs so it
takes effect only on Windows, OpenBSD and NetBSD.  It could break setups with
local firewall rules that block connections to "127.0.0.1" or "::1" without
blocking "0.0.0.0" or "::".  Such firewall rules sound outlandish enough that
I would be fairly comfortable not worrying about this and making the change
unconditional in all branches.  It's a judgment call, though.



> On Tue, Oct 27, 2015 at 05:31:25PM +0900, Tatsuo Ishii wrote:
>> > No, PQping("host='127.0.0.1'") fails to reach a listen_addresses='::' server
>> > on many systems.  Here's what I thought Kondo was proposing:
>> > 
>> > --- a/src/bin/pg_ctl/pg_ctl.c
>> > +++ b/src/bin/pg_ctl/pg_ctl.c
>> > @@ -649,5 +649,9 @@ test_postmaster_connection(pgpid_t pm_pid, bool do_checkpoint)
>> >  
>> > -                        /* If postmaster is listening on "*", use localhost */
>> > +                        /* explanation here */
>> >                          if (strcmp(host_str, "*") == 0)
>> >                              strcpy(host_str, "localhost");
>> > +                        else if (strcmp(host_str, "0.0.0.0") == 0)
>> > +                            strcpy(host_str, "127.0.0.1");
>> > +                        else if (strcmp(host_str, "::") == 0)
>> > +                            strcpy(host_str, "::1");
>> >  
>> 
>> I see. Would you like to commit this?
> 
> I am happy to finish it, but I am no less happy if you finish it.  Which do
> you prefer?

Please go ahead and commit.

> Should the back-branch commits mirror the master branch?  A more-cautious
> alternative would be to, in back branches, wrap the change in #ifdefs so it
> takes effect only on Windows, OpenBSD and NetBSD.  It could break setups with
> local firewall rules that block connections to "127.0.0.1" or "::1" without
> blocking "0.0.0.0" or "::".  Such firewall rules sound outlandish enough that
> I would be fairly comfortable not worrying about this and making the change
> unconditional in all branches.  It's a judgment call, though.

I think back patching with #ifdefs is better. On Windows etc. the case
has been broken anyway and the fix should only bring benefits to users.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



On Thu, Oct 29, 2015 at 05:39:36PM +0900, Tatsuo Ishii wrote:

> > I am happy to finish it, but I am no less happy if you finish it.  Which do
> > you prefer?
> 
> Please go ahead and commit.
> 
> > Should the back-branch commits mirror the master branch?  A more-cautious
> > alternative would be to, in back branches, wrap the change in #ifdefs so it
> > takes effect only on Windows, OpenBSD and NetBSD.  It could break setups with
> > local firewall rules that block connections to "127.0.0.1" or "::1" without
> > blocking "0.0.0.0" or "::".  Such firewall rules sound outlandish enough that
> > I would be fairly comfortable not worrying about this and making the change
> > unconditional in all branches.  It's a judgment call, though.
> 
> I think back patching with #ifdefs is better. On Windows etc. the case
> has been broken anyway and the fix should only bring benefits to users.

I committed it with #ifdef in 9.1-9.4 and without #ifdef in 9.5 and master.
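
Editor's sketch, not part of the thread: one way the back-branch variant could be guarded. The committed code is not shown in this thread, so everything here is an assumption: the macro name SUBSTITUTE_LOOPBACK and the function name are hypothetical, and the platform tests use the compilers' predefined macros _WIN32, __OpenBSD__ and __NetBSD__, which may differ from what the actual commit tests.

```c
#include <assert.h>
#include <string.h>

/* Assumed guard: substitute loopback only on the platforms named above. */
#if defined(_WIN32) || defined(__OpenBSD__) || defined(__NetBSD__)
#define SUBSTITUTE_LOOPBACK 1
#else
#define SUBSTITUTE_LOOPBACK 0
#endif

/*
 * Hypothetical back-branch variant: "*" always maps to localhost, while
 * the wildcard-to-loopback substitution is compiled in only where the
 * guard above is true.
 */
static const char *
back_branch_probe_host(const char *listen_addr)
{
    assert(listen_addr != NULL);

    if (strcmp(listen_addr, "*") == 0)
        return "localhost";
#if SUBSTITUTE_LOOPBACK
    if (strcmp(listen_addr, "0.0.0.0") == 0)
        return "127.0.0.1";
    if (strcmp(listen_addr, "::") == 0)
        return "::1";
#endif
    return listen_addr;
}
```

On platforms outside the guard, behavior stays as it was, which is the caution argued for above. In the unconditional variant the #if lines would simply be dropped.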