Обсуждение: vacuum vs open transactions

Поиск
Список
Период
Сортировка

vacuum vs open transactions

От
"Ed L."
Дата:
I'm looking at some 7.3.4 vacuum output, and at first glance it does not
appear that vacuum is reclaiming any dead tuple space if there is even a
single open transaction, even if the open transaction does not in any way
reference the table being vacuumed.  Is that correct?  Is the behavior
different in later versions?

Ed


Re: vacuum vs open transactions

От
Scott Marlowe
Дата:
On Wed, 2005-01-12 at 11:59, Ed L. wrote:
> I'm looking at some 7.3.4 vacuum output, and at first glance it does not
> appear that vacuum is reclaiming any dead tuple space if there is even a
> single open transaction, even if the open transaction does not in any way
> reference the table being vacuumed.  Is that correct?  Is the behavior
> different in later versions?

I believe the problem is occurring if the open transaction is older than
the tuples that could be vacuumed.  The MVCC system means that as long
as a transaction that started X hours ago is still open, the tuples that
have been freed since then can't vacuumed because they need to stay
visible for that transaction.

Re: vacuum vs open transactions

От
"Ed L."
Дата:
On Wednesday January 12 2005 11:10, Scott Marlowe wrote:
>
> I believe the problem is occurring if the open transaction is older than
> the tuples that could be vacuumed.  The MVCC system means that as long
> as a transaction that started X hours ago is still open, the tuples that
> have been freed since then can't vacuumed because they need to stay
> visible for that transaction.

Is it possible via SQL query to tell how long a transaction has been open?

Ed


Re: vacuum vs open transactions

От
Michael Fuhr
Дата:
On Wed, Jan 12, 2005 at 11:49:12AM -0700, Ed L. wrote:
>
> Is it possible via SQL query to tell how long a transaction has been open?

I'm not aware of a way to find out when a transaction started, but
if you have stats_command_string enabled then you can query
pg_stat_activity to see when a session's current query started.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: vacuum vs open transactions

От
Alvaro Herrera
Дата:
On Wed, Jan 12, 2005 at 11:30:50PM -0700, Michael Fuhr wrote:
> On Wed, Jan 12, 2005 at 11:49:12AM -0700, Ed L. wrote:
> >
> > Is it possible via SQL query to tell how long a transaction has been open?
>
> I'm not aware of a way to find out when a transaction started, but
> if you have stats_command_string enabled then you can query
> pg_stat_activity to see when a session's current query started.

now() returns the current transaction's start time.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"No necesitamos banderas
 No reconocemos fronteras"                  (Jorge González)

Re: vacuum vs open transactions

От
"Ed L."
Дата:
On Thursday January 13 2005 5:50, Alvaro Herrera wrote:
> On Wed, Jan 12, 2005 at 11:30:50PM -0700, Michael Fuhr wrote:
> > On Wed, Jan 12, 2005 at 11:49:12AM -0700, Ed L. wrote:
> > > Is it possible via SQL query to tell how long a transaction has been
> > > open?
> >
> > I'm not aware of a way to find out when a transaction started, but
> > if you have stats_command_string enabled then you can query
> > pg_stat_activity to see when a session's current query started.
>
> now() returns the current transaction's start time.

That would be perfect if I could see that for transactions other than my
own.

Ed


Re: vacuum vs open transactions

От
"Ed L."
Дата:
On Wednesday January 12 2005 11:30, Michael Fuhr wrote:
> On Wed, Jan 12, 2005 at 11:49:12AM -0700, Ed L. wrote:
> > Is it possible via SQL query to tell how long a transaction has been
> > open?
>
> I'm not aware of a way to find out when a transaction started, but
> if you have stats_command_string enabled then you can query
> pg_stat_activity to see when a session's current query started.

Yes, I see that in 7.4 (not in 7.3).  But my purpose would be to remotely
identify long-open transactions that are causing table bloat by making
vacuum fail to reclaim space, so it seems I need the transaction start
time, not query start time.  Most likely this situation occurs when 1)
someone starts a transaction in psql and then leaves it there, or 2) an
application opens a transaction prior to getting user input.

Ed


Re: vacuum vs open transactions

От
Alvaro Herrera
Дата:
On Thu, Jan 13, 2005 at 08:45:38AM -0700, Ed L. wrote:
> On Thursday January 13 2005 5:50, Alvaro Herrera wrote:
> > On Wed, Jan 12, 2005 at 11:30:50PM -0700, Michael Fuhr wrote:
> > > On Wed, Jan 12, 2005 at 11:49:12AM -0700, Ed L. wrote:
> > > > Is it possible via SQL query to tell how long a transaction has been
> > > > open?

> > now() returns the current transaction's start time.
>
> That would be perfect if I could see that for transactions other than my
> own.

No, there's no way to know that.  Unless, of course, you save it
yourself somewhere at transaction start (though because of isolation,
using a regular table would not help you any.  Perhaps some PL's shared
data can help you there.)

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Este mail se entrega garantizadamente 100% libre de sarcasmo.

Re: vacuum vs open transactions

От
"Ed L."
Дата:
On Thursday January 13 2005 10:09, Michael Fuhr wrote:
>
> For idle transactions pg_stat_activity shows "<IDLE> in transaction"
> and the query_start column shows when the transaction became idle
> (i.e., when the last statement completed).  So if long-lived idle
> transactions are the problem, then at least you can find out how
> long they've been idle.

That will help, thanks.  Unfortunately, that doesn't appear to work for many
of our pre-7.4 clusters, of which we have 60, but that will definitely help
as we migrate forward.

Ed


Re: vacuum vs open transactions

От
Michael Fuhr
Дата:
On Thu, Jan 13, 2005 at 08:44:56AM -0700, Ed L. wrote:
> On Wednesday January 12 2005 11:30, Michael Fuhr wrote:
> >
> > I'm not aware of a way to find out when a transaction started, but
> > if you have stats_command_string enabled then you can query
> > pg_stat_activity to see when a session's current query started.
>
> Yes, I see that in 7.4 (not in 7.3).  But my purpose would be to remotely
> identify long-open transactions that are causing table bloat by making
> vacuum fail to reclaim space, so it seems I need the transaction start
> time, not query start time.  Most likely this situation occurs when 1)
> someone starts a transaction in psql and then leaves it there, or 2) an
> application opens a transaction prior to getting user input.

For idle transactions pg_stat_activity shows "<IDLE> in transaction"
and the query_start column shows when the transaction became idle
(i.e., when the last statement completed).  So if long-lived idle
transactions are the problem, then at least you can find out how
long they've been idle.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: vacuum vs open transactions

От
Michael Fuhr
Дата:
On Thu, Jan 13, 2005 at 09:50:38AM -0300, Alvaro Herrera wrote:
> On Wed, Jan 12, 2005 at 11:30:50PM -0700, Michael Fuhr wrote:
> >
> > I'm not aware of a way to find out when a transaction started, but
> > if you have stats_command_string enabled then you can query
> > pg_stat_activity to see when a session's current query started.
>
> now() returns the current transaction's start time.

I meant when any transaction started, particularly a transaction
other than the current one.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: vacuum vs open transactions

От
Martijn van Oosterhout
Дата:
On Thu, Jan 13, 2005 at 08:44:56AM -0700, Ed L. wrote:
> Yes, I see that in 7.4 (not in 7.3).  But my purpose would be to remotely
> identify long-open transactions that are causing table bloat by making
> vacuum fail to reclaim space, so it seems I need the transaction start
> time, not query start time.  Most likely this situation occurs when 1)
> someone starts a transaction in psql and then leaves it there, or 2) an
> application opens a transaction prior to getting user input.

Wouldn't the transaction ID be more useful. An earlier transaction ID
obviously started earlier. So you should be able to identify the oldest
transaction. Would the transaction ID field in pg_locks do?

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

Re: vacuum vs open transactions

От
Michael Fuhr
Дата:
On Thu, Jan 13, 2005 at 12:04:28PM -0700, Ed L. wrote:

> It'd be nice if pg_stat_activity.transaction_start were added in the future
> for a 100% answer, but I'm not sure there's much interest in this apart
> from our needs.

I wouldn't expect that to be hard to add.  Consider submitting a
patch or proposing it to the developers.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: vacuum vs open transactions

От
Michael Fuhr
Дата:
On Thu, Jan 13, 2005 at 10:33:50AM -0700, Ed L. wrote:
> On Thursday January 13 2005 10:09, Michael Fuhr wrote:
> >
> > For idle transactions pg_stat_activity shows "<IDLE> in transaction"
> > and the query_start column shows when the transaction became idle
>
> That will help, thanks.  Unfortunately, that doesn't appear to work for many
> of our pre-7.4 clusters, of which we have 60, but that will definitely help
> as we migrate forward.

Ah yes, I see in the Relase Notes that the query start time was
added in 7.4.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: vacuum vs open transactions

От
"Ed L."
Дата:
On Thursday January 13 2005 11:37, Michael Fuhr wrote:
>
> That'll show which transaction is oldest but not how long it's been
> open or idle, i.e., whether it's "long-open" or not.  I assumed,
> perhaps incorrectly, that he was already looking at pg_locks and
> wanted to find out which of those transactions had been open for a
> long time.  Since pg_locks has a pid column, you can join (visually
> or via a join query) with pg_stat_activity's procpid column.

What I'm after is a simple way to automatically tell via cron query if there
is a transaction staying open long enough (probably an hour) to cause bloat
but that doesn't require any visual inspection.  While not fool-proof,
pg_stat_activity.query_start looks like a 90% answer.

It'd be nice if pg_stat_activity.transaction_start were added in the future
for a 100% answer, but I'm not sure there's much interest in this apart
from our needs.

Ed


Re: vacuum vs open transactions

От
Michael Fuhr
Дата:
On Thu, Jan 13, 2005 at 07:11:09PM +0100, Martijn van Oosterhout wrote:
>
> Wouldn't the transaction ID be more useful. An earlier transaction ID
> obviously started earlier. So you should be able to identify the oldest
> transaction. Would the transaction ID field in pg_locks do?

That'll show which transaction is oldest but not how long it's been
open or idle, i.e., whether it's "long-open" or not.  I assumed,
perhaps incorrectly, that he was already looking at pg_locks and
wanted to find out which of those transactions had been open for a
long time.  Since pg_locks has a pid column, you can join (visually
or via a join query) with pg_stat_activity's procpid column.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/