Обсуждение: pg_dump's aborted transactions

Поиск
Список
Период
Сортировка

pg_dump's aborted transactions

От
Stephen Frost
Дата:
All,
 We recently had a client complain that check_postgres' commitratio check would alert about relatively unused
databases. As it turns out, the reason for this is because they automate running pg_dump against their databases
(surelya good thing..), but pg_dump doesn't close out its transaction cleanly, leading to rolled back transactions.
 
 At first blush, at least, this strikes me as an oversight which we should probably fix and possibly backpatch.
 Thoughts?
     Thanks,
    Stephen

Re: pg_dump's aborted transactions

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> All,
>   We recently had a client complain that check_postgres' commitratio
>   check would alert about relatively unused databases.  As it turns
>   out, the reason for this is because they automate running pg_dump
>   against their databases (surely a good thing..), but pg_dump doesn't
>   close out its transaction cleanly, leading to rolled back
>   transactions.

>   At first blush, at least, this strikes me as an oversight which we
>   should probably fix and possibly backpatch.

No, somebody should fix check_postgres to count rollbacks as well as
commits as activity (as they obviously are).

This is not an oversight, it's 100% intentional.  The reason pg_dump
aborts rather than commits is to make entirely sure that it does not
commit any changes to the database.  I would be against removing that
safety feature, considering that pg_dump is typically run as superuser.
We have frequently worried about security exploits that involve hijacking
superuser activities, and this behavior provides at least a small
increment of safety against such holes.
        regards, tom lane



Re: pg_dump's aborted transactions

От
Guillaume Lelarge
Дата:
2015-02-04 6:37 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
Stephen Frost <sfrost@snowman.net> writes:
> All,
>   We recently had a client complain that check_postgres' commitratio
>   check would alert about relatively unused databases.  As it turns
>   out, the reason for this is because they automate running pg_dump
>   against their databases (surely a good thing..), but pg_dump doesn't
>   close out its transaction cleanly, leading to rolled back
>   transactions.

>   At first blush, at least, this strikes me as an oversight which we
>   should probably fix and possibly backpatch.

No, somebody should fix check_postgres to count rollbacks as well as
commits as activity (as they obviously are).


Well, actually, no. This is a commit ratio, not an activity counter, not even a transactions count.

The formula right now is:

round(100.*sd.xact_commit/(sd.xact_commit+sd.xact_rollback), 2)

which, AFAICT, is correct.

The fact that the OP uses it to know if there's activity on his databases can get him false positives if he has no actual activity, except for dumps.

I might be wrong, but there is nothing to fix on the check_postgres (at least, for this issue ;) ). The expectation of this user is to fix :)

This is not an oversight, it's 100% intentional.  The reason pg_dump
aborts rather than commits is to make entirely sure that it does not
commit any changes to the database.  I would be against removing that
safety feature, considering that pg_dump is typically run as superuser.
We have frequently worried about security exploits that involve hijacking
superuser activities, and this behavior provides at least a small
increment of safety against such holes.


+1


--

Re: pg_dump's aborted transactions

От
Stephen Frost
Дата:
* Guillaume Lelarge (guillaume@lelarge.info) wrote:
> 2015-02-04 6:37 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
> > Stephen Frost <sfrost@snowman.net> writes:
> > No, somebody should fix check_postgres to count rollbacks as well as
> > commits as activity (as they obviously are).
> >
> Well, actually, no. This is a commit ratio, not an activity counter, not
> even a transactions count.
>
> The formula right now is:
>
> round(100.*sd.xact_commit/(sd.xact_commit+sd.xact_rollback), 2)
>
> which, AFAICT, is correct.
>
> The fact that the OP uses it to know if there's activity on his databases
> can get him false positives if he has no actual activity, except for dumps.
>
> I might be wrong, but there is nothing to fix on the check_postgres (at
> least, for this issue ;) ). The expectation of this user is to fix :)

Apologies for the confusion- the client isn't using it to determine if
there's activity.  They're using it exactly as it's intended, as I
understand it- to check and see if the number of rollbacks is
signifigant compared to the number of commits.  The issue is that, with
databases that have little activity, you can end up with a commit ratio
of 50% or less.  Perhaps check_postgres could have an option to only
complain when some minimum number of transactions has been reached..

I've already told them that not all aborted transactions are necessairly
bad ones (I tend to create a lot when I'm using psql, in fact), and that
the commit ratio check is really intended for active databases as just a
heuristic to detect if things have suddenly changed for the worse.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> This is not an oversight, it's 100% intentional.  The reason pg_dump
> aborts rather than commits is to make entirely sure that it does not
> commit any changes to the database.  I would be against removing that
> safety feature, considering that pg_dump is typically run as superuser.
> We have frequently worried about security exploits that involve hijacking
> superuser activities, and this behavior provides at least a small
> increment of safety against such holes.

We already mark the transaction as READ ONLY in modern versions and so
I'm not sure that this really buys us all that much.  If someone's able
to get sufficient transaction control to get out from the read-only one
which is created, aborting at the end isn't going to help.
Thanks,
    Stephen

Re: pg_dump's aborted transactions

От
Robert Haas
Дата:
On Wed, Feb 4, 2015 at 8:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Apologies for the confusion- the client isn't using it to determine if
> there's activity.  They're using it exactly as it's intended, as I
> understand it- to check and see if the number of rollbacks is
> signifigant compared to the number of commits.  The issue is that, with
> databases that have little activity, you can end up with a commit ratio
> of 50% or less.  Perhaps check_postgres could have an option to only
> complain when some minimum number of transactions has been reached..

That sounds like a much better idea.  pg_dump isn't really doing
anything wrong here, so "fixing" it feels like the wrong solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_dump's aborted transactions

От
Stephen Frost
Дата:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Feb 4, 2015 at 8:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > Apologies for the confusion- the client isn't using it to determine if
> > there's activity.  They're using it exactly as it's intended, as I
> > understand it- to check and see if the number of rollbacks is
> > signifigant compared to the number of commits.  The issue is that, with
> > databases that have little activity, you can end up with a commit ratio
> > of 50% or less.  Perhaps check_postgres could have an option to only
> > complain when some minimum number of transactions has been reached..
>
> That sounds like a much better idea.  pg_dump isn't really doing
> anything wrong here, so "fixing" it feels like the wrong solution.

It occurs to me that 'commit ratio' might not be the check that
check_postgres users really want but is, instead, a proxy as we don't
actually provide a way to get at what they really want.

Perhaps we should consider adding a new stat- *errored* transactions.
This would be a count of transactions which rolled back due to an error
(and not an explicit "rollback;" or abort), or even have counts of both
errored and aborted transactions.

At least in my experience, rolled back transactions aren't all bad, but
transactions which were terminated due to an error case are things you
care about and want to look into.  I've used tail-n-mail with PG logs to
check for such errors, but it'd certainly be nice to have a Nagios check
which could report on them.
Thanks,
    Stephen