Обсуждение: libpq connection status and closed fd

Поиск
Список
Период
Сортировка

libpq connection status and closed fd

От
Daniele Varrazzo
Дата:
Hello,

a psycopg user is reporting [1] that the library is not marking the
connection as closed and/or bad after certain errors, such as a
connection timeout. He is emulating the error by closing the
connection fd (I don't know if the two conditions result in the same
effect, but I'll stick to this hypothesis for now).

[1] https://github.com/psycopg/psycopg2/issues/263

Testing with a short C program [2] it seems that indeed, after closing
the fd and causing an error in `PQconsumeInput`, the connection is
deemed in good state by `PQstatus`. A similar test gives the same
result after `PQexec` is attempted on a connection whose fd is closed
(tests performed with PostgreSQL 9.3.5).

[2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f

Is this intentional? Is there a better way to check for a broken connection?

If we mark the connection as closed every time `PQconsumeInput`
returns 0 (or `PQexec` returns null, which are the two code paths
affecting psycopg) would it be too aggressive and cause false
positives?

Thank you very much.

-- Daniele



Re: libpq connection status and closed fd

От
Dmitriy Igrishin
Дата:


2014-09-22 10:42 GMT+04:00 Daniele Varrazzo <daniele.varrazzo@gmail.com>:
Hello,

a psycopg user is reporting [1] that the library is not marking the
connection as closed and/or bad after certain errors, such as a
connection timeout. He is emulating the error by closing the
connection fd (I don't know if the two conditions result in the same
effect, but I'll stick to this hypothesis for now).

[1] https://github.com/psycopg/psycopg2/issues/263

Testing with a short C program [2] it seems that indeed, after closing
the fd and causing an error in `PQconsumeInput`, the connection is
deemed in good state by `PQstatus`. A similar test gives the same
result after `PQexec` is attempted on a connection whose fd is closed
(tests performed with PostgreSQL 9.3.5).

[2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
Why are you using close() instead of PQfinish()?


Is this intentional? Is there a better way to check for a broken connection?
BTW, PQsocket() returns -1 after PQfinish(). 

If we mark the connection as closed every time `PQconsumeInput`
returns 0 (or `PQexec` returns null, which are the two code paths
affecting psycopg) would it be too aggressive and cause false
positives?

Thank you very much.

-- Daniele


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



--
// Dmitriy.

Re: libpq connection status and closed fd

От
Daniele Varrazzo
Дата:
On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin <dmitigr@gmail.com> wrote:
>
> 2014-09-22 10:42 GMT+04:00 Daniele Varrazzo <daniele.varrazzo@gmail.com>:

>> [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
>
> Why are you using close() instead of PQfinish()?

Because I'm testing for an error, please read my message and the
original bug report.

-- Daniele



Re: libpq connection status and closed fd

От
Dmitriy Igrishin
Дата:


2014-09-22 11:35 GMT+04:00 Daniele Varrazzo <daniele.varrazzo@gmail.com>:
On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin <dmitigr@gmail.com> wrote:
>
> 2014-09-22 10:42 GMT+04:00 Daniele Varrazzo <daniele.varrazzo@gmail.com>:

>> [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
>
> Why are you using close() instead of PQfinish()?

Because I'm testing for an error, please read my message and the
original bug report.
I read it. You are testing for an error, but use libpq in a wrong way in
your test case. You must use PQfinish() to close the connection
and PQstatus() will works for you.


-- Daniele



--
// Dmitriy.

Re: libpq connection status and closed fd

От
Marko Tiikkaja
Дата:
On 9/22/14 9:45 AM, Dmitriy Igrishin wrote:
> 2014-09-22 11:35 GMT+04:00 Daniele Varrazzo <daniele.varrazzo@gmail.com>:
>
>> On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin <dmitigr@gmail.com>
>> wrote:
>>> Why are you using close() instead of PQfinish()?
>>
>> Because I'm testing for an error, please read my message and the
>> original bug report.
>>
> I read it. You are testing for an error, but use libpq in a wrong way in
> your test case. You must use PQfinish() to close the connection
> and PQstatus() will works for you.

Perhaps you should go back and re-read it then.  The point of the test 
case is not to test connection closure; it's to test behaviour in the 
presence of network errors.


.marko



Re: libpq connection status and closed fd

От
Dmitriy Igrishin
Дата:


2014-09-22 12:36 GMT+04:00 Marko Tiikkaja <marko@joh.to>:
On 9/22/14 9:45 AM, Dmitriy Igrishin wrote:
2014-09-22 11:35 GMT+04:00 Daniele Varrazzo <daniele.varrazzo@gmail.com>:

On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin <dmitigr@gmail.com>
wrote:
Why are you using close() instead of PQfinish()?

Because I'm testing for an error, please read my message and the
original bug report.

I read it. You are testing for an error, but use libpq in a wrong way in
your test case. You must use PQfinish() to close the connection
and PQstatus() will works for you.

Perhaps you should go back and re-read it then.  The point of the test case is not to test connection closure; it's to test behaviour in the presence of network errors.
And where the network error emulation in the test case? By closing fd?
I'm sorry if I don't understand something, but really, I don't see any problem
or incorrect behavior of *libpq*. It's behavior adequate to a test case.



.marko



--
// Dmitriy.

Re: libpq connection status and closed fd

От
Andres Freund
Дата:
On 2014-09-22 07:42:01 +0100, Daniele Varrazzo wrote:
> Hello,
> 
> a psycopg user is reporting [1] that the library is not marking the
> connection as closed and/or bad after certain errors, such as a
> connection timeout. He is emulating the error by closing the
> connection fd (I don't know if the two conditions result in the same
> effect, but I'll stick to this hypothesis for now).
> 
> [1] https://github.com/psycopg/psycopg2/issues/263
> 
> Testing with a short C program [2] it seems that indeed, after closing
> the fd and causing an error in `PQconsumeInput`, the connection is
> deemed in good state by `PQstatus`. A similar test gives the same
> result after `PQexec` is attempted on a connection whose fd is closed
> (tests performed with PostgreSQL 9.3.5).
> 
> [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
> 
> Is this intentional? Is there a better way to check for a broken connection?

Note that the libpq code treats connection resets differently from
other, arbitrary, errors:

int
pqReadData(PGconn *conn)
{
...nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,                      conn->inBufSize - conn->inEnd);if
(nread< 0){    if (SOCK_ERRNO == EINTR)        goto retry3;    /* Some systems return EAGAIN/EWOULDBLOCK for no data
*/
#ifdef EAGAIN    if (SOCK_ERRNO == EAGAIN)        return someread;
#endif
#if defined(EWOULDBLOCK) && (!defined(EAGAIN) || (EWOULDBLOCK != EAGAIN))    if (SOCK_ERRNO == EWOULDBLOCK)
returnsomeread;
 
#endif    /* We might get ECONNRESET here if using TCP and backend died */
#ifdef ECONNRESET    if (SOCK_ERRNO == ECONNRESET)        goto definitelyFailed;
#endif    /* pqsecure_read set the error message for us */    return -1;}


I.e. if the kernel returns that the connection has been closed it'll do
different stuff.

So I'm not convinced that the testcaseq is valid. This isn't the error
you'd get after a timeout or similar. We could add a special case path
for EBADFD, but why?

> If we mark the connection as closed every time `PQconsumeInput`
> returns 0 (or `PQexec` returns null, which are the two code paths
> affecting psycopg) would it be too aggressive and cause false
> positives?

Likely, yes.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: libpq connection status and closed fd

От
Marko Tiikkaja
Дата:
On 9/22/14 10:57 AM, Andres Freund wrote:
> On 2014-09-22 07:42:01 +0100, Daniele Varrazzo wrote:
>> Is this intentional? Is there a better way to check for a broken connection?
>
> Note that the libpq code treats connection resets differently from
> other, arbitrary, errors:
>
> I.e. if the kernel returns that the connection has been closed it'll do
> different stuff.
>
> So I'm not convinced that the testcaseq is valid. This isn't the error
> you'd get after a timeout or similar. We could add a special case path
> for EBADFD, but why?

Probably not for EBADFD, but how about e.g. ETIMEDOUT?  The SSL code 
path seems to be putting EPIPE into the same category as ECONNRESET, too.


.marko



Re: libpq connection status and closed fd

От
Tom Lane
Дата:
Daniele Varrazzo <daniele.varrazzo@gmail.com> writes:
> a psycopg user is reporting [1] that the library is not marking the
> connection as closed and/or bad after certain errors, such as a
> connection timeout. He is emulating the error by closing the
> connection fd

That seems like a completely illegitimate test procedure.  A network
failure does not result in automatic closure of the FD so there is
no reason for the libpq code to be expecting that to happen.  Instead,
it expects to see an appropriate error code next time it tries to
use the socket.

If you want to test for connection loss, consider doing a kill -9 on
the connected backend (best not to do that with a production server
of course).

A larger point is that your user may well have false expectations
about how quickly libpq will detect connection loss.  Generally
that won't happen until it next tries to transmit or receive some
data; so a simple PQstatus() call doesn't prove a lot about whether
the connection has been lost since the last query.  This is not a
bug either IMO.
        regards, tom lane