Discussion: Occupied port warning
During a recent training session I was reminded about a peculiar misbehavior that recent PostgreSQL releases exhibit when the TCP port they are trying to bind to is occupied:

LOG: could not bind IPv4 socket: Address already in use
HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
WARNING: could not create listen socket for "localhost"

The trainees found this behavior somewhat unuseful. Can someone remind me why this is not an error? Does any other server software behave this way?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut said:
> During a recent training session I was reminded about a peculiar
> misbehavior that recent PostgreSQL releases exhibit when the TCP port
> they are trying to bind to is occupied:
>
> LOG: could not bind IPv4 socket: Address already in use
> HINT: Is another postmaster already running on port 5432? If not, wait
> a few seconds and retry.
> WARNING: could not create listen socket for "localhost"
>
> The trainees found this behavior somewhat unuseful. Can someone remind
> me why this is not an error? Does any other server software behave
> this way?

IIRC, in previous versions any bind failure was fatal, but in 8.0 we decided to be slightly more forgiving and only bail out if we failed to bind at all.

cheers

andrew
Andrew Dunstan wrote:
> IIRC, in previous versions any bind failure was fatal, but in 8.0 we
> decided to be slightly more forgiving and only bail out if we failed
> to bind at all.

I realize that, but I would like to know where that bright idea came from in violation of all other principles of this and any other software. I recall that it had something to do with IPv6, but I'm not sure.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
At 2005-06-28 15:14:29 +0200, peter_e@gmx.net wrote:
>
> I recall that it had something to do with IPv6, but I'm not sure.

Under Linux, if you bind to AF_INET6/::0, a subsequent bind to AF_INET/0 will fail, but the IPv4 address is also bound by the first call, and the program will accept IPv4 connections anyway (BSD behaves differently). Maybe that had something to do with it?

I remember I had to add code to my program to allow that second bind to fail without complaint, and now my code also exits only if it can't bind anything at all.

(For what it's worth, I don't think this behaviour is such a big deal.)

-- ams
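The approach described in the message above (let an individual bind fail quietly, give up only when nothing could be bound) can be sketched roughly as follows. This is a minimal Python illustration, not the poster's program and not PostgreSQL's code; the function name `bind_any` and the `(family, host, port)` tuple format are assumptions made for the example.

```python
import socket


def bind_any(candidates):
    """Try to bind and listen on each (family, host, port) candidate.

    Returns (bound_sockets, failures), where failures is a list of
    (host, exception) pairs. Raises RuntimeError only when not a single
    candidate could be bound.
    """
    bound, failures = [], []
    for family, host, port in candidates:
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError as exc:
            # The address family may not be usable on this system.
            failures.append((host, exc))
            continue
        try:
            sock.bind((host, port))
            sock.listen(5)
        except OSError as exc:
            # For example "Address already in use": record it and move on.
            sock.close()
            failures.append((host, exc))
            continue
        bound.append(sock)
    if not bound:
        raise RuntimeError("could not bind any listen socket")
    return bound, failures
```

A caller would typically log each entry in `failures` as a warning and carry on with whatever ended up in `bound`.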
Peter Eisentraut wrote:
> Andrew Dunstan wrote:
> > IIRC, in previous versions any bind failure was fatal, but in 8.0 we
> > decided to be slightly more forgiving and only bail out if we failed
> > to bind at all.
>
> I realize that, but I would like to know where that bright idea came
> from in violation of all other principles of this and any other
> software. I recall that it had something to do with IPv6, but I'm not
> sure.

It came from the fertile brain of Tom Lane :-)

see http://archives.postgresql.org/pgsql-hackers/2004-03/msg00679.php

I think "violation of all other principles of this and any other software" is far too strong.

cheers

andrew
On Tue, Jun 28, 2005 at 03:14:29PM +0200, Peter Eisentraut wrote:
> Andrew Dunstan wrote:
> > IIRC, in previous versions any bind failure was fatal, but in 8.0 we
> > decided to be slightly more forgiving and only bail out if we failed
> > to bind at all.
>
> I realize that, but I would like to know where that bright idea came
> from in violation of all other principles of this and any other
> software. I recall that it had something to do with IPv6, but I'm not
> sure.

If the TCP socket is used we can still bind to the Unix-domain socket, no?

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Vivir y dejar de vivir son soluciones imaginarias. La existencia está en otra parte" (Andre Breton)
Alvaro Herrera wrote:
> If the TCP socket is used we can still bind to the Unix-domain
> socket, no?

If I configured a TCP/IP socket, what good does a Unix-domain socket do me?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Andrew Dunstan wrote:
> see http://archives.postgresql.org/pgsql-hackers/2004-03/msg00679.php

Well, with one release of field experience behind me I'd like to revisit this idea. Who would actually be hurt by generating an error here like it used to do?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes:
> During a recent training session I was reminded about a peculiar
> misbehavior that recent PostgreSQL releases exhibit when the TCP port
> they are trying to bind to is occupied:
>
> LOG: could not bind IPv4 socket: Address already in use
> HINT: Is another postmaster already running on port 5432? If not, wait
> a few seconds and retry.
> WARNING: could not create listen socket for "localhost"
>
> The trainees found this behavior somewhat unuseful.

What behavior are you proposing, exactly? I don't think it's practical to make the server error out if it can't bind to every socket it tries to bind to --- that will leave you dead in the water in an uncomfortably large number of scenarios. I think the cases that forced us to adopt this behavior originally were ones where userland thinks IPv6 is supported but the kernel does not. Thus, we can *not* treat the list returned by getaddrinfo as gospel.

It might be reasonable to treat some error conditions as fatal but not others. But you'd have to engage in pretty close analysis to make sure you weren't buying into any bad behaviors.

regards, tom lane
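The point about not treating the getaddrinfo list as gospel can be illustrated with a short sketch: every address the resolver returns is only a candidate, and socket creation or binding may still fail for it. The Python below is an illustration under assumptions, not a rendering of StreamServerPort; `open_listeners` and its return shape are hypothetical names chosen for the example.

```python
import socket


def open_listeners(host, port):
    """Create a listening socket for every usable getaddrinfo() result.

    host=None asks for the wildcard addresses (AI_PASSIVE).
    Returns (listeners, skipped); skipped holds (sockaddr, error) pairs
    for candidates the resolver offered but the system refused.
    """
    infos = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM, flags=socket.AI_PASSIVE
    )
    listeners, skipped = [], []
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            # The resolver library may offer a family the kernel lacks.
            skipped.append((sockaddr, exc))
            continue
        try:
            sock.bind(sockaddr)
            sock.listen(5)
        except OSError as exc:
            sock.close()
            skipped.append((sockaddr, exc))
            continue
        listeners.append(sock)
    return listeners, skipped
```

Whether an empty `listeners` list, or any particular entry in `skipped`, should stop the server is exactly the policy question under discussion; the sketch leaves that decision to the caller.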
I wrote:
> Andrew Dunstan wrote:
> > see
> > http://archives.postgresql.org/pgsql-hackers/2004-03/msg00679.php
>
> Well, with one release of field experience behind me I'd like to
> revisit this idea. Who would actually be hurt by generating an error
> here like it used to do?

It seems that the only concern was broken resolvers (namely, "localhost" not being resolvable). Then you can easily replace that with 127.0.0.1, or * if you like. That sounds like the place for an error message with a hint, not silent failure. Comments?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
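The workaround suggested above relies on the fact that a numeric address or the `*` wildcard needs no name lookup, so a broken resolver cannot affect it. A small sketch of that distinction, assuming listen_addresses is a comma-separated list; the helper names `split_listen_addresses` and `needs_resolver` are hypothetical and do not come from PostgreSQL:

```python
import ipaddress


def split_listen_addresses(value):
    """Split a comma-separated listen_addresses value into entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def needs_resolver(entry):
    """Return True if the entry is a host name that must be looked up."""
    entry = entry.strip()
    if entry == "*":
        return False  # wildcard: no lookup involved
    try:
        ipaddress.ip_address(entry)  # accepts IPv4 and IPv6 literals
    except ValueError:
        return True  # not a numeric address, so it is a name
    return False
```

A startup check could use this to attach a hint such as "specify the address numerically" only to entries for which `needs_resolver` is true.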
Tom Lane wrote:
> What behavior are you proposing, exactly?

The least thing it should do is error out if *no* TCP/IP port could be created while listen_addresses is set.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane wrote:
>> What behavior are you proposing, exactly?
>
> The least thing it should do is error out if *no* TCP/IP port could be
> created while listen_addresses is set.

That might be reasonable --- I think right now we only die if we couldn't create the Unix socket either.

regards, tom lane
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Tom Lane wrote:
> > > What behavior are you proposing, exactly?
> >
> > The least thing it should do is error out if *no* TCP/IP port could be
> > created while listen_addresses is set.
>
> That might be reasonable --- I think right now we only die if we
> couldn't create the Unix socket either.

correct (in the cases where we try to create it, e.g. Unix but not Windows).

cheers

andrew
I wrote:
> The least thing it should do is error out if *no* TCP/IP port could
> be created while listen_addresses is set.

It's doing that now, and that should guard against the most common problem, namely the port already being occupied (since all TCP/IP listen sockets use the same port).

Reading the comments in StreamServerPort, it seems the only problem we can't go fatal error everywhere is that on some systems the IPv4 and IPv6 sockets fight each other when bind() is called. For the other failure modes, it seems that no such precautions are necessary. In particular, I think we could error out in all of the following cases:

- Host or service name could not be resolved (just specify it numerically instead). This would help against mistyped host names and misconfigured name servers.
- MaxListen exceeded (don't configure so many sockets instead).
- socket() failed
- listen() failed

I think we could also error out if we cannot create at least one listen socket for each entry in listen_addresses (instead of at least one overall).

Comments on that?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
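The two startup rules contrasted in this message, at least one listen socket overall versus at least one per listen_addresses entry, differ only in how the per-entry results are aggregated. A minimal sketch of both, assuming the caller has already counted how many sockets were created for each entry; `check_listen_policy` and the error strings are illustrative, not PostgreSQL's:

```python
def check_listen_policy(results, per_entry=False):
    """Decide whether startup should fail.

    results maps each listen_addresses entry to the number of listen
    sockets successfully created for it. Returns a list of error
    strings; an empty list means startup may proceed.
    """
    if not results:
        # listen_addresses is empty: no TCP/IP socket was requested.
        return []
    if per_entry:
        # Stricter proposal: every configured entry must yield a socket.
        return [
            f'could not create any listen socket for "{name}"'
            for name, count in results.items()
            if count == 0
        ]
    # Limited check: fail only if no TCP/IP socket exists at all.
    if sum(results.values()) == 0:
        return ["could not create any TCP/IP sockets"]
    return []
```

With `{"localhost": 0, "192.0.2.1": 1}` the limited check passes while the per-entry check reports `localhost`, which is the difference being argued about.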
Peter Eisentraut <peter_e@gmx.net> writes:
> Reading the comments in StreamServerPort, it seems the only problem we
> can't go fatal error everywhere is that on some systems the IPv4 and
> IPv6 sockets fight each other when bind() is called. For the other
> failure modes, it seems that no such precautions are necessary. In
> particular, I think we could error out in all of the following cases:

I think you are putting *far* too much faith in the platforms that are out there. We fought enough kernel and libc bugs (or at least disagreements) while we were putting in IPv6 support to make me very wary of proposals to treat socket problems as fatal. I would much rather have the postmaster start and not connect to everything it originally tried to connect to than have it refuse to play ball until you get a new kernel version.

> - socket() failed

Definitely wrong, see archives. EAFNOSUPPORT for example is an entirely expected case.

> - listen() failed

Ditto, see archives.

> I think we could also error out if we cannot create at least one listen
> socket for each entry in listen_addresses (instead of at least one
> overall).

No; that will break cases that don't need to break. I was willing to hold still for the limited check you just applied, but I do not see that making it less error-tolerant than that is a good idea at all. It will just put obstacles in the path of newbies.

(In fact, I'm not even convinced that the limited check will survive beta. I think we'll be taking it out again, or at least reducing it to a WARNING, when the complaints start coming in. As of CVS tip, a default postmaster configuration will refuse to start if there is anything wrong with your "localhost" DNS setup, and we already learned that there are way too many machines where that is true.)

regards, tom lane
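To make the EAFNOSUPPORT remark concrete, here is one way a caller might separate an expected failure from one worth reporting. The split shown is only an assumption for illustration; the thread does not settle which errors, if any, should be fatal, and `classify_socket_error` is a hypothetical helper.

```python
import errno

# Errors that simply mean "this address family is not available here".
# The contents of this set are an assumption for the example.
EXPECTED = {errno.EAFNOSUPPORT}


def classify_socket_error(exc):
    """Return 'expected' or 'warn' for an OSError from socket setup.

    'expected' failures can be skipped quietly; anything else is
    reported as a warning. Nothing is classified as fatal here, because
    an error such as EADDRINUSE can also come from the IPv4/IPv6 bind
    conflict described earlier in the thread.
    """
    if exc.errno in EXPECTED:
        return "expected"
    return "warn"
```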
Tom Lane wrote:
> > I think we could also error out if we cannot create at least one
> > listen socket for each entry in listen_addresses (instead of at
> > least one overall).
>
> No; that will break cases that don't need to break.

Which cases would that be? If you specify a host name and it doesn't get used at all, what sense could that possibly make?

> I was willing to hold still for the limited check you just applied,
> but I do not see that making it less error-tolerant than that is a
> good idea at all. It will just put obstacles in the path of newbies.

Not ignoring errors is one of the staples of PostgreSQL. What you are proposing here sounds entirely like a MySQL design plan. Maybe that is newbie-friendly in your mind, but I really doubt that. I agree that we do not want to force people to change kernel or system libraries. But it is not acceptable to ignore misconfigurations where a simple change of a few configuration parameters would correct the situation, as in this case:

> (In fact, I'm not even convinced that the limited check will survive
> beta. I think we'll be taking it out again, or at least reducing it
> to a WARNING, when the complaints start coming in. As of CVS tip,
> a default postmaster configuration will refuse to start if there is
> anything wrong with your "localhost" DNS setup, and we already
> learned that there are way too many machines where that is true.)

Here, you simply change the configuration to use numeric IP addresses.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes:
> Not ignoring errors is one of the staples of PostgreSQL. What you are
> proposing here sounds entirely like a MySQL design plan. Maybe that is
> newbie-friendly in your mind, but I really doubt that. I agree that we
> do not want to force people to change kernel or system libraries. But
> it is not acceptable to ignore misconfigurations where a simple change
> of a few configuration parameters would correct the situation,

My fundamental objection here is that I think you will be making error cases out of situations where a kernel update is the only solution; in particular the ones stemming from kernel and libc not being on the same page about whether IPv6 is supported. We must likewise not assume that a would-be Postgres user is in a position to fix his DNS infrastructure.

Treating these problems as warnings instead of hard errors is hardly equivalent to risking data loss --- all it says is that you won't be able to connect from certain places until you fix it, which is certainly not worse than being unable to connect from anyplace because you cannot get the postmaster to start.

regards, tom lane