Re: Designing a better connection pool for psycopg3

From Magnus Hagander
Subject Re: Designing a better connection pool for psycopg3
Msg-id CABUevEy6iYxAgH6qFJN5OqiPFkv1O1JU90jbHna2xDXqFDLFHg@mail.gmail.com
In response to Re: Designing a better connection pool for psycopg3  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Responses Re: Designing a better connection pool for psycopg3  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Re: Designing a better connection pool for psycopg3  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
Re: Designing a better connection pool for psycopg3  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
List psycopg
On Mon, Jan 18, 2021 at 3:29 PM Daniele Varrazzo
<daniele.varrazzo@gmail.com> wrote:
>
> On Mon, 18 Jan 2021 at 15:05, Magnus Hagander <magnus@hagander.net> wrote:
> >
> > On Mon, Jan 18, 2021 at 2:50 PM Daniele Varrazzo
> > <daniele.varrazzo@gmail.com> wrote:
> > >
> > > On Mon, 18 Jan 2021 at 14:13, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> > >
> > > > I would strongly advise against making sys.exit() the default
> > > > for pool.terminate() unless I misunderstand something.
> > >
> > > How would you terminate the program if a maintenance thread, not the
> > > main one, thinks that the program is not in working state?
> >
> > Why would it be OK for a maintenance thread to terminate the program
> > at all? And certainly by default?
> >
> > Wouldn't the reasonable thing to do be to flag the pool as broken, and
> > then just stop trying. Then whenever the application makes an attempt
> > to use the pool, it can be thrown an exception saying that this
> > happened.
>
> I'm trying to imagine what happens in a case such as a network
> partition or reconfiguration, and the app server doesn't see the
> database anymore. This node is arguably broken.

Only if this is the *only* thing it does.

It might still be able to reach other services on other nodes. Other
databases. Heck, even other databases on the same node if it was a
config error.


> If the reconnection thread fails to obtain new connections, and the
> ones currently in the pool are discarded as detected broken,
> eventually the pool is depleted.
>
> The requesting threads (e.g. web requests handlers) would try to
> obtain a connection, time out after e.g. 30 seconds, and receive an
> error. The error would result in a 500 for the user, probably a sentry
> exception and log in a file, but the program would likely not die.

Yes, and thus it would continue to serve all requests that don't need
a database connection from this particular pool.


> The program could remain in this condition for an arbitrarily long time,
> until someone notices by looking at the logs and scratches their head
> to understand how to fix the problem.

The *program* can still decide to terminate: either by calling exit(1)
itself when the exception is thrown (per my suggestion above), or by
just not catching that exception at all.
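
A minimal sketch of that choice, assuming a pool that raises a dedicated
exception once it has been flagged as broken. PoolBroken, BrokenPool and
the exit_on_broken_pool switch are hypothetical names used only for
illustration; none of them is psycopg3 API.

```python
import sys


class PoolBroken(Exception):
    """Hypothetical error raised by a pool that has stopped reconnecting."""


class BrokenPool:
    """Hypothetical stand-in for a pool that was flagged as broken."""

    def getconn(self):
        raise PoolBroken("pool could not re-establish connections")


def handle_request(pool, needs_db):
    # Requests that do not need this pool are unaffected by its state.
    if not needs_db:
        return "ok (no database needed)"
    conn = pool.getconn()
    return f"ok ({conn})"


def main(pool, exit_on_broken_pool):
    try:
        return handle_request(pool, needs_db=True)
    except PoolBroken as exc:
        if exit_on_broken_pool:
            # The application, not the pool, decides to terminate.
            sys.exit(1)
        # Otherwise keep running and report a degraded state.
        return f"degraded: {exc}"
```

Here the decision to exit sits in application code, so a process that
also talks to other services can choose to keep running.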


> If the program dies, its manager would try to restart it, insisting
> until the configuration makes the database visible again. A service
> down in a crash loop is more visible than a service up and running but
> not serving.

But what about the other services that potentially ran in the same process?

It's an extremely simplistic view to think that a single web server
will only ever talk to a single database, *and* require this database
for all its operations. Yet that's what this default would do.

The entire point of a connection pool is persistent processes that do
more than one thing, after all. And there is (surprising maybe, but
there is) still an entire world out there that isn't just serving
individual web requests.


> Anyway I appreciate that the default of terminating a program is
> probably too aggressive. So I would remove the `terminate()` function
> and base implementation and leave a `connection_failed()` handler,
> with a default no-op implementation, which people preferring their
> program to terminate can subclass (with `sys.exit(1)` or whatever
> termination strategy they find useful).

That would definitely work. A plugin-point for this would be very useful.
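
As a rough sketch of such a plugin point, assuming the pool calls the
handler once its reconnection attempts are exhausted: HookedPool,
ExitingPool, reconnect() and max_attempts are hypothetical names, and
only connection_failed() comes from the proposal quoted above.

```python
import sys


class HookedPool:
    """Hypothetical pool skeleton; not the psycopg3 API."""

    def __init__(self, connect, max_attempts=3):
        self._connect = connect          # callable returning a connection
        self._max_attempts = max_attempts

    def connection_failed(self):
        """Hook called when reconnection gives up. Default: do nothing."""

    def reconnect(self):
        for _ in range(self._max_attempts):
            try:
                return self._connect()
            except OSError:
                continue                 # try again until attempts run out
        self.connection_failed()         # let the subclass decide what to do
        return None


class ExitingPool(HookedPool):
    """Variant for programs that prefer to die and be restarted."""

    def connection_failed(self):
        sys.exit(1)
```

Note that sys.exit() only raises SystemExit in the calling thread, so a
handler invoked from a non-main maintenance thread would end that thread
rather than the whole process; a real handler might need os._exit() or a
signal to the main thread instead.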

And if you want to avoid the "timeout based error" in the client,
adding a flag to the pool saying "this pool is currently unhealthy" in
this case would work, and then whenever someone tries to get a
connection out of this pool it can throw an exception immediately
instead of waiting for a timeout to happen.
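
A minimal sketch of that flag, assuming a queue-backed pool. FlaggedPool,
PoolUnhealthy, PoolTimeout and the mark_unhealthy()/mark_healthy()
methods are hypothetical names for illustration, not psycopg3 API.

```python
import queue
import threading


class PoolUnhealthy(Exception):
    """Hypothetical error: the pool has been flagged as unhealthy."""


class PoolTimeout(Exception):
    """Hypothetical error: no connection became free in time."""


class FlaggedPool:
    """Hypothetical pool skeleton; not the psycopg3 API."""

    def __init__(self):
        self._idle = queue.Queue()
        self._unhealthy = threading.Event()

    def mark_unhealthy(self):
        # Would be set by the reconnection logic when it gives up.
        self._unhealthy.set()

    def mark_healthy(self):
        # Would be cleared once a connection attempt succeeds again.
        self._unhealthy.clear()

    def putconn(self, conn):
        self._idle.put(conn)

    def getconn(self, timeout=30.0):
        # Fail immediately instead of making the caller wait for a timeout.
        if self._unhealthy.is_set():
            raise PoolUnhealthy("this pool is currently unhealthy")
        try:
            return self._idle.get(timeout=timeout)
        except queue.Empty:
            raise PoolTimeout(f"no connection available after {timeout} s") from None
```

With the flag set, a caller gets PoolUnhealthy right away and can answer
with an error without tying up a request thread for the full timeout.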

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/


