Обсуждение: Should we remove vacuum_defer_cleanup_age?

Поиск
Список
Период
Сортировка

Should we remove vacuum_defer_cleanup_age?

От
Andres Freund
Дата:
Hi,

As evidenced by the bug fixed in be504a3e974, vacuum_defer_cleanup_age is not
heavily used - the bug was trivial to hit as soon as vacuum_defer_cleanup_age
is set to a non-toy value. It complicates thinking about visibility horizons
substantially, as vacuum_defer_cleanup_age can make them go backward
substantially. Obviously it's also severely undertested.

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

vacuum_defer_cleanup_age was added as part of hot standby. Back then we did
not yet have hot_standby_feedback. Now that that exists,
vacuum_defer_cleanup_age doesn't seem like a good idea anymore. It's
pessimisistic, i.e. always retains rows, even if none of the standbys has an
old enough snapshot.

The only benefit of vacuum_defer_cleanup_age over hot_standby_feedback is that
it provides a limit of some sort. But transactionids aren't producing dead
rows in a uniform manner, so limiting via xid isn't particularly useful. And
even if there are use cases, it seems those would be served better by
introducing a cap on how much hot_standby_feedback can hold the horizon back.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

Greetings,

Andres Freund



Re: Should we remove vacuum_defer_cleanup_age?

От
Magnus Hagander
Дата:

On Sat, Mar 18, 2023 at 12:09 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

As evidenced by the bug fixed in be504a3e974, vacuum_defer_cleanup_age is not
heavily used - the bug was trivial to hit as soon as vacuum_defer_cleanup_age
is set to a non-toy value. It complicates thinking about visibility horizons
substantially, as vacuum_defer_cleanup_age can make them go backward
substantially. Obviously it's also severely undertested.

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

vacuum_defer_cleanup_age was added as part of hot standby. Back then we did
not yet have hot_standby_feedback. Now that that exists,
vacuum_defer_cleanup_age doesn't seem like a good idea anymore. It's
pessimisistic, i.e. always retains rows, even if none of the standbys has an
old enough snapshot.

The only benefit of vacuum_defer_cleanup_age over hot_standby_feedback is that
it provides a limit of some sort. But transactionids aren't producing dead
rows in a uniform manner, so limiting via xid isn't particularly useful. And
even if there are use cases, it seems those would be served better by
introducing a cap on how much hot_standby_feedback can hold the horizon back.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

+1. I haven't seen any (correct) use of this in many many years on my end at least.

And yes, having a cap on hot_standby_feedback would also be great. 

--

Re: Should we remove vacuum_defer_cleanup_age?

От
Alvaro Herrera
Дата:
On 2023-Mar-17, Andres Freund wrote:

> I started writing a test for vacuum_defer_cleanup_age while working on the fix
> referenced above, but now I am wondering if said energy would be better spent
> removing vacuum_defer_cleanup_age alltogether.

+1  I agree it's not useful anymore.

> I don't think I have the cycles to push this through in the next weeks, but if
> we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> good idea to mark it as deprecated in 16?

Hmm, for the time being, can we just "disable" it by disallowing to set
the GUC to a value different from 0?  Then we can remove the code later
in the cycle at leisure.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"



Re: Should we remove vacuum_defer_cleanup_age?

От
Justin Pryzby
Дата:
On Sat, Mar 18, 2023 at 10:33:57AM +0100, Alvaro Herrera wrote:
> On 2023-Mar-17, Andres Freund wrote:
> 
> > I started writing a test for vacuum_defer_cleanup_age while working on the fix
> > referenced above, but now I am wondering if said energy would be better spent
> > removing vacuum_defer_cleanup_age alltogether.
> 
> +1  I agree it's not useful anymore.
> 
> > I don't think I have the cycles to push this through in the next weeks, but if
> > we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> > good idea to mark it as deprecated in 16?
> 
> Hmm, for the time being, can we just "disable" it by disallowing to set
> the GUC to a value different from 0?  Then we can remove the code later
> in the cycle at leisure.

It can be useful to do a "rolling transition", and it's something I do
often.

But I can't see why that would be useful here?  It seems like something
that could be done after the feature freeze.  It's removing a feature,
not adding one.

-- 
Justin



Re: Should we remove vacuum_defer_cleanup_age?

От
Andres Freund
Дата:
Hi,

On 2023-03-22 11:44:20 -0500, Justin Pryzby wrote:
> On Sat, Mar 18, 2023 at 10:33:57AM +0100, Alvaro Herrera wrote:
> > On 2023-Mar-17, Andres Freund wrote:
> > 
> > > I started writing a test for vacuum_defer_cleanup_age while working on the fix
> > > referenced above, but now I am wondering if said energy would be better spent
> > > removing vacuum_defer_cleanup_age alltogether.
> > 
> > +1  I agree it's not useful anymore.
> > 
> > > I don't think I have the cycles to push this through in the next weeks, but if
> > > we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> > > good idea to mark it as deprecated in 16?
> > 
> > Hmm, for the time being, can we just "disable" it by disallowing to set
> > the GUC to a value different from 0?  Then we can remove the code later
> > in the cycle at leisure.
> 
> It can be useful to do a "rolling transition", and it's something I do
> often.
> 
> But I can't see why that would be useful here?  It seems like something
> that could be done after the feature freeze.  It's removing a feature,
> not adding one.

It wasn't actually that much work to write a patch to remove
vacuum_defer_cleanup_age, see the attached.

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

Greetings,

Andres Freund

Вложения

Re: Should we remove vacuum_defer_cleanup_age?

От
Daniel Gustafsson
Дата:
> On 22 Mar 2023, at 18:00, Andres Freund <andres@anarazel.de> wrote:

> It wasn't actually that much work to write a patch to remove
> vacuum_defer_cleanup_age, see the attached.

-    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
+    provides protection against
     relevant rows being removed by vacuum, but the former provides no
     protection during any time period when the standby is not connected,
     and the latter often needs to be set to a high value to provide adequate

Isn't "the latter" in the kept part of the sentence referring to the guc we're
removing here?

-     * It's possible that slots / vacuum_defer_cleanup_age backed up the
-     * horizons further than oldest_considered_running. Fix.
+     * It's possible that slots backed up the horizons further than
+     * oldest_considered_running. Fix.

While not the fault of this patch, can't we use the opportunity to expand
"Fix." to a short "Fix this by ..." sentence?  Or remove "Fix." perhaps, either
of those would improve the comment IMHO.

> I don't know whether others think we should apply it this release, given the
> "late submission", but I tend to think it's not worth caring the complication
> of vacuum_defer_cleanup_age forward.

It might be late in the cycle, but as it's not adding something that might
break but rather removing something that's known for being problematic (and not
useful) I think it's Ok.

--
Daniel Gustafsson




Re: Should we remove vacuum_defer_cleanup_age?

От
Andres Freund
Дата:
Hi,

On 2023-03-23 10:18:35 +0100, Daniel Gustafsson wrote:
> > On 22 Mar 2023, at 18:00, Andres Freund <andres@anarazel.de> wrote:
> 
> > It wasn't actually that much work to write a patch to remove
> > vacuum_defer_cleanup_age, see the attached.
> 
> -    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
> +    provides protection against
>      relevant rows being removed by vacuum, but the former provides no
>      protection during any time period when the standby is not connected,
>      and the latter often needs to be set to a high value to provide adequate
> 
> Isn't "the latter" in the kept part of the sentence referring to the guc we're
> removing here?

You're right. That paragraph generally seems a bit off. In HEAD:

   <para>
    In lieu of using replication slots, it is possible to prevent the removal
    of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
    storing the segments in an archive using
    <xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
    However, these methods often result in retaining more WAL segments than
    required, whereas replication slots retain only the number of segments
    known to be needed.  On the other hand, replication slots can retain so
    many WAL segments that they fill up the space allocated
    for <literal>pg_wal</literal>;
    <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
    retained by replication slots.
   </para>
   <para>
    Similarly, <xref linkend="guc-hot-standby-feedback"/>
    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
    relevant rows being removed by vacuum, but the former provides no
    protection during any time period when the standby is not connected,
    and the latter often needs to be set to a high value to provide adequate
    protection.  Replication slots overcome these disadvantages.
   </para>

Replication slots alone don't prevent row removal, that requires
hot_standby_feedback to be used as well.

A minimal rephrasing would be:
   <para>
    Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
    also using a replication slot, provides protection against relevant rows
    being removed by vacuum, but provides no protection during any time period
    when the standby is not connected.  Replication slots overcome these
    disadvantages.
   </para>



> -     * It's possible that slots / vacuum_defer_cleanup_age backed up the
> -     * horizons further than oldest_considered_running. Fix.
> +     * It's possible that slots backed up the horizons further than
> +     * oldest_considered_running. Fix.
> 
> While not the fault of this patch, can't we use the opportunity to expand
> "Fix." to a short "Fix this by ..." sentence?  Or remove "Fix." perhaps, either
> of those would improve the comment IMHO.

Perhaps unsurprisingly, given that I wrote that comment, I don't see a problem
with the "Fix."...

Greetings,

Andres Freund



Re: Should we remove vacuum_defer_cleanup_age?

От
Daniel Gustafsson
Дата:
> On 24 Mar 2023, at 21:27, Andres Freund <andres@anarazel.de> wrote:
> On 2023-03-23 10:18:35 +0100, Daniel Gustafsson wrote:
>>> On 22 Mar 2023, at 18:00, Andres Freund <andres@anarazel.de> wrote:
>>
>>> It wasn't actually that much work to write a patch to remove
>>> vacuum_defer_cleanup_age, see the attached.
>>
>> -    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
>> +    provides protection against
>>     relevant rows being removed by vacuum, but the former provides no
>>     protection during any time period when the standby is not connected,
>>     and the latter often needs to be set to a high value to provide adequate
>>
>> Isn't "the latter" in the kept part of the sentence referring to the guc we're
>> removing here?
>
> You're right. That paragraph generally seems a bit off. In HEAD:
>
> ...
>
> Replication slots alone don't prevent row removal, that requires
> hot_standby_feedback to be used as well.
>
> A minimal rephrasing would be:
>   <para>
>    Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
>    also using a replication slot, provides protection against relevant rows
>    being removed by vacuum, but provides no protection during any time period
>    when the standby is not connected.  Replication slots overcome these
>    disadvantages.
>   </para>

+1, that's definitely an improvement.

--
Daniel Gustafsson




Re: Should we remove vacuum_defer_cleanup_age?

От
Peter Geoghegan
Дата:
On Sat, Mar 18, 2023 at 2:34 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Mar-17, Andres Freund wrote:
>
> > I started writing a test for vacuum_defer_cleanup_age while working on the fix
> > referenced above, but now I am wondering if said energy would be better spent
> > removing vacuum_defer_cleanup_age alltogether.
>
> +1  I agree it's not useful anymore.

+1.

I am suspicious of most of the GUCs whose value is an XID age. It
strikes me as something that is convenient to the implementation, but
not to the user, since there are so many ways that XID age might be a
poor proxy for whatever it is that you really care about in each case.

A theoretical advantage of vacuum_defer_cleanup_age is that it allows
the user to control things in terms of the impact on the primary --
whereas hot_standby_feedback is a mechanism that controls things in
terms of the needs of the standby. In practice this is pretty useless,
but it seems like it might be possible to come up with some other new
mechanism that somehow does this in a way that's truly useful.
Something that allows the user to constrain how far we hold back
conflicts/vacuuming in terms of the *impact* on the primary.

It might be helpful to permit opportunistic cleanup by pruning and
index deletion at some point, but to throttle it when we know it would
violate some soft limit related to hot_standby_feedback. Maybe the
system could prevent the first few attempts at pruning when it
violates the soft limit, or make pruning prune somewhat less
aggressively where there is little advantage to it in terms of
space/tuples freed -- decide on what to do at the very last minute,
based on all available information at that late stage, with the full
context available. The system could be taught to be very patient at
first, when relatively few pruning operations have been attempted,
when the cost is basically still acceptable. But as more pruning
operations ran and clearly didn't free space that really should be
freed, we'd quickly lose patience.

The big idea here is to delay committing to any course of action for
as long as possible, so we wouldn't kill queries on standbys for very
little benefit on the primary, while at the same time avoiding ever
really failing to kill queries on standbys when the cost proved too
high on the primary. For this to have any chance of working it needs
to focus on the actual costs on the primary, and not some extremely
noisy proxy for that cost. The standby will have its query killed by
just one prune record affecting just one heap page, and delaying that
specific prune record is likely no big deal. It's preventing pruning
of tens of thousands of heap pages that we need to worry about.

--
Peter Geoghegan



Re: Should we remove vacuum_defer_cleanup_age?

От
Justin Pryzby
Дата:
On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
> I don't know whether others think we should apply it this release, given the
> "late submission", but I tend to think it's not worth caring the complication
> of vacuum_defer_cleanup_age forward.

I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16. 

-- 
Justin



Re: Should we remove vacuum_defer_cleanup_age?

От
Andres Freund
Дата:
Hi,

On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:
> On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
> > I don't know whether others think we should apply it this release, given the
> > "late submission", but I tend to think it's not worth caring the complication
> > of vacuum_defer_cleanup_age forward.
>
> I don't see any utility in waiting; it just makes the process of
> removing it take longer for no reason.
>
> As long as it's done before the betas, it seems completely reasonable to
> remove it for v16.

Added the RMT.

We really should have a rmt@pg.o alias...

Updated patch attached. I think we should either apply something like that
patch, or at least add a <warning/> to the docs.

Greetings,

Andres Freund

Вложения

Re: Should we remove vacuum_defer_cleanup_age?

От
Amit Kapila
Дата:
On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:
> > On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
> > > I don't know whether others think we should apply it this release, given the
> > > "late submission", but I tend to think it's not worth caring the complication
> > > of vacuum_defer_cleanup_age forward.
> >
> > I don't see any utility in waiting; it just makes the process of
> > removing it take longer for no reason.
> >
> > As long as it's done before the betas, it seems completely reasonable to
> > remove it for v16.
>
> Added the RMT.
>
> We really should have a rmt@pg.o alias...
>
> Updated patch attached. I think we should either apply something like that
> patch, or at least add a <warning/> to the docs.
>

+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?

--
With Regards,
Amit Kapila.



Re: Should we remove vacuum_defer_cleanup_age?

От
Alvaro Herrera
Дата:
On 2023-Apr-11, Andres Freund wrote:

> Updated patch attached. I think we should either apply something like that
> patch, or at least add a <warning/> to the docs.

I gave this patch a look.  The only code change is that
ComputeXidHorizons() and GetSnapshotData() no longer handle the case
where vacuum_defer_cleanup_age is different from zero.  It looks good.
The TransactionIdRetreatSafely() code being removed is pretty weird (I
spent a good dozen minutes writing a complaint that your rewrite was
faulty, but it turns out I had misunderstood the function), so I'm glad
it's being retired.


>     <para>
> -    Similarly, <xref linkend="guc-hot-standby-feedback"/>
> -    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
> -    relevant rows being removed by vacuum, but the former provides no
> -    protection during any time period when the standby is not connected,
> -    and the latter often needs to be set to a high value to provide adequate
> -    protection.  Replication slots overcome these disadvantages.
> +    Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
> +    also using a replication slot, provides protection against relevant rows
> +    being removed by vacuum, but provides no protection during any time period
> +    when the standby is not connected.  Replication slots overcome these
> +    disadvantages.

I think it made sense to have this paragraph be separate from the
previous one when it was talking about two separate variables, but now
that it's just one, it looks a bit isolated.  I would merge it with the
one above, which is talking about pretty much the same thing, and
reorder the whole thing approximately like this

   <para>
    In lieu of using replication slots, it is possible to prevent the removal
    of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
    storing the segments in an archive using
    <xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
    However, these methods often result in retaining more WAL segments than
    required.
    Similarly, <xref linkend="guc-hot-standby-feedback"/> without
    a replication slot provides protection against relevant rows
    being removed by vacuum, but provides no protection during any time period
    when the standby is not connected.
   </para>
   <para>
    Replication slots overcome these disadvantages by retaining only the number
    of segments known to be needed.
    On the other hand, replication slots can retain so
    many WAL segments that they fill up the space allocated
    for <literal>pg_wal</literal>;
    <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
    retained by replication slots.
   </para>

Though the "However," looks a poor fit; I would do this:

   <para>
    In lieu of using replication slots, it is possible to prevent the removal
    of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
    storing the segments in an archive using
    <xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
    A disadvantage of these methods is that they often result in retaining
    more WAL segments than required.
    Similarly, <xref linkend="guc-hot-standby-feedback"/> without
    a replication slot provides protection against relevant rows
    being removed by vacuum, but provides no protection during any time period
    when the standby is not connected.
   </para>
   <para>
    Replication slots overcome these disadvantages by retaining only the number
    of segments known to be needed.
    On the other hand, replication slots can retain so
    many WAL segments that they fill up the space allocated
    for <literal>pg_wal</literal>;
    <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
    retained by replication slots.
   </para>

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes...  Fixed.
                               http://postgr.es/m/482D1632.8010507@sigaev.ru



Re: Should we remove vacuum_defer_cleanup_age?

От
"Jonathan S. Katz"
Дата:
On 4/12/23 11:34 PM, Amit Kapila wrote:
> On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:
>>> On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
>>>> I don't know whether others think we should apply it this release, given the
>>>> "late submission", but I tend to think it's not worth caring the complication
>>>> of vacuum_defer_cleanup_age forward.
>>>
>>> I don't see any utility in waiting; it just makes the process of
>>> removing it take longer for no reason.
>>>
>>> As long as it's done before the betas, it seems completely reasonable to
>>> remove it for v16.
>>
>> Added the RMT.
>>
>> We really should have a rmt@pg.o alias...

(I had thought something as much -- will reach out to pginfra about options)

>> Updated patch attached. I think we should either apply something like that
>> patch, or at least add a <warning/> to the docs.
>>

> +1 to do one of the above. I think there is a good chance that
> somebody might be doing more harm by using it so removing this
> shouldn't be a problem. Personally, I have not heard of people using
> it but OTOH it is difficult to predict so giving some time is also not
> a bad idea.
> 
> Do others have any opinion/suggestion on this matter?

I need a bit more time to study this before formulating an opinion on 
whether we should remove it for v16. In any case, I'm not against 
documentation.

Jonathan

Вложения

Re: Should we remove vacuum_defer_cleanup_age?

От
"Jonathan S. Katz"
Дата:
On 4/13/23 11:32 AM, Jonathan S. Katz wrote:
> On 4/12/23 11:34 PM, Amit Kapila wrote:
>> On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de> 

>> +1 to do one of the above. I think there is a good chance that
>> somebody might be doing more harm by using it so removing this
>> shouldn't be a problem. Personally, I have not heard of people using
>> it but OTOH it is difficult to predict so giving some time is also not
>> a bad idea.
>>
>> Do others have any opinion/suggestion on this matter?
> 
> I need a bit more time to study this before formulating an opinion on 
> whether we should remove it for v16. In any case, I'm not against 
> documentation.

(didn't need too much more time).

[RMT hat]

+1 for removing.

I looked at some data and it doesn't seem like vacuum_defer_cleanup_age 
is used in any significant way, whereas hot_standby_feedback is much 
more widely used. Given this, and all the problems + arguments made in 
the thread, we should just get rid of it for v16.

There are cases where we should deprecate before removing, but I don't 
think this one based upon usage and having a better alternative.

Per [1] it does sound like we can make some improvements to 
hot_standby_feedback, but those can wait to v17.

We should probably set $DATE to finish this, too. I don't think it's a 
rush, but we should give enough time before Beta 1.

Jonathan

[1] 
https://www.postgresql.org/message-id/20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de

Вложения

Re: Should we remove vacuum_defer_cleanup_age?

От
Laurenz Albe
Дата:
On Thu, 2023-04-13 at 12:16 -0400, Jonathan S. Katz wrote:
> On 4/13/23 11:32 AM, Jonathan S. Katz wrote:
> > On 4/12/23 11:34 PM, Amit Kapila wrote:
> > > On Tue, Apr 11, 2023 at 11:50 PM Andres Freund <andres@anarazel.de>
>
> > > +1 to do one of the above. I think there is a good chance that
> > > somebody might be doing more harm by using it so removing this
> > > shouldn't be a problem. Personally, I have not heard of people using
> > > it but OTOH it is difficult to predict so giving some time is also not
> > > a bad idea.
> > >
> > > Do others have any opinion/suggestion on this matter?
> >
> > I need a bit more time to study this before formulating an opinion on
> > whether we should remove it for v16. In any case, I'm not against
> > documentation.
>
> [RMT hat]
>
> +1 for removing.

I am not against this in principle, but I know that there are people using
this parameter; see the discussion linked in

https://postgr.es/m/E1jkzxE-0006Dw-Dg@gemulon.postgresql.org

I can't say if they have a good use case for that parameter or not.

Yours,
Laurenz Albe



Re: Should we remove vacuum_defer_cleanup_age?

От
Robert Haas
Дата:
On Thu, Apr 13, 2023 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> I am not against this in principle, but I know that there are people using
> this parameter; see the discussion linked in
>
> https://postgr.es/m/E1jkzxE-0006Dw-Dg@gemulon.postgresql.org
>
> I can't say if they have a good use case for that parameter or not.

Yeah, I feel similarly. Actually, personally I have no evidence, not
even an anecdote, suggesting that this parameter is in use, but I'm a
bit skeptical of any consensus of the form "no one is using X,"
because there sure are a lot of people running PostgreSQL and they
sure do a lot of things. Some more justifiably than others, but often
people have surprisingly good excuses for doing stuff that sounds
bizarre when you first hear about it, and it doesn't seem totally
impossible that somebody could have found a way to get value out of
this.

However, I suspect that there aren't many such people, and I think the
setting is a kludge, so I support removing it. Maybe we'll find out
that we ought to add something else instead, like a limited delimited
in time rather than in XIDs. Or maybe the existing facilities are good
enough. But as Peter rightly says, XID age is likely a poor proxy for
whatever people really care about, so I don't think continuing to have
a setting that works like that is a good plan.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Should we remove vacuum_defer_cleanup_age?

От
Daniel Gustafsson
Дата:
> On 14 Apr 2023, at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

> ..as Peter rightly says, XID age is likely a poor proxy for
> whatever people really care about, so I don't think continuing to have
> a setting that works like that is a good plan.

Agreed, and removing it is likely to be a good vehicle for figuring out what we
should have instead (if anything).

--
Daniel Gustafsson




Re: Should we remove vacuum_defer_cleanup_age?

От
"Jonathan S. Katz"
Дата:
On 4/14/23 8:30 AM, Robert Haas wrote:
> On Thu, Apr 13, 2023 at 11:06 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>> I am not against this in principle, but I know that there are people using
>> this parameter; see the discussion linked in
>>
>> https://postgr.es/m/E1jkzxE-0006Dw-Dg@gemulon.postgresql.org
>>
>> I can't say if they have a good use case for that parameter or not.
> 
> Yeah, I feel similarly. Actually, personally I have no evidence, not
> even an anecdote, suggesting that this parameter is in use, but I'm a
> bit skeptical of any consensus of the form "no one is using X,"
> because there sure are a lot of people running PostgreSQL and they
> sure do a lot of things. Some more justifiably than others, but often
> people have surprisingly good excuses for doing stuff that sounds
> bizarre when you first hear about it, and it doesn't seem totally
> impossible that somebody could have found a way to get value out of
> this.

Let me restate [1] in a different way.

Using a large enough dataset, I did qualitatively look at overall usage 
of both "vacuum_defer_cleanup_age" and compared to 
"hot_standby_feedback", given you can use both to accomplish similar 
outcomes. The usage of "vacuum_defer_cleanup_age" was really minimal, in 
fact approaching "0", whereas "hot_standby_feedback" had significant 
adoption.

I'm not saying that "we should remove a setting just because it's not 
used" or that it may not have utility -- I'm saying that we can remove 
the setting given:

1. We know that using this setting incorrectly (which can be done fairly 
easily) can cause significant issues
2. There's another setting that can accomplish similar goals that's much 
safer
3. The setting itself is not widely used

It's the combination of all 3 that led to my conclusion. If it were just 
(1), I'd lean more strongly towards trying to fix it first.

> However, I suspect that there aren't many such people, and I think the
> setting is a kludge, so I support removing it. Maybe we'll find out
> that we ought to add something else instead, like a limited delimited
> in time rather than in XIDs. Or maybe the existing facilities are good
> enough. But as Peter rightly says, XID age is likely a poor proxy for
> whatever people really care about, so I don't think continuing to have
> a setting that works like that is a good plan.

That seems like a good eventual outcome.

Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/bf42784f-4d57-0a3d-1a06-ffac1a09318c%40postgresql.org


Вложения

Re: Should we remove vacuum_defer_cleanup_age?

От
Greg Stark
Дата:
On Fri, 14 Apr 2023 at 09:47, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>
> Let me restate [1] in a different way.
>
> Using a large enough dataset, I did qualitatively look at overall usage
> of both "vacuum_defer_cleanup_age" and compared to
> "hot_standby_feedback", given you can use both to accomplish similar
> outcomes.

I assume people would use hot_standby_feedback if they have streaming
replication. The main use cases for vacuum_defer_cleanup_age would be
if you're replaying WAL files. That may sound archaic but there are
plenty of circumstances where your standby may not have network access
to your primary at all or not want to be replaying continuously.

I wonder whether your dataset is self-selecting sites that have
streaming replication. That does seem like the more common usage
pattern.

Systems using wal files are more likely to be things like data
warehouses, offline analytics systems, etc. They may not even be well
known in the same organization that runs the online operations -- in
my experience they're often run by marketing or sales organizations or
in some cases infosec teams and consume data from lots of sources. The
main reason to use wal archive replay is often to provide the
isolation so that the operations team don't need to worry about the
impact on production and that makes it easy to forget these even
exist.

-- 
greg



Re: Should we remove vacuum_defer_cleanup_age?

От
Alvaro Herrera
Дата:
On 2023-Apr-14, Greg Stark wrote:

> On Fri, 14 Apr 2023 at 09:47, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> >
> > Let me restate [1] in a different way.
> >
> > Using a large enough dataset, I did qualitatively look at overall usage
> > of both "vacuum_defer_cleanup_age" and compared to
> > "hot_standby_feedback", given you can use both to accomplish similar
> > outcomes.
> 
> I assume people would use hot_standby_feedback if they have streaming
> replication. 

Yes, either that or a replication slot.

vacuum_defer_cleanup_age was added in commit efc16ea52067 (2009-12-19),
for Postgres 9.0.  hot_standby_feedback is a bit newer
(bca8b7f16a3e, 2011-02-16), and replication slots are newer still
(858ec11858a9, 2014-01-31).

> The main use cases for vacuum_defer_cleanup_age would be if you're
> replaying WAL files.  That may sound archaic but there are plenty of
> circumstances where your standby may not have network access to your
> primary at all or not want to be replaying continuously.

True, those cases exist.  However, it sounds to me like in those cases
vacuum_defer_cleanup_age doesn't really help you either; you'd just want
to pause WAL replay depending on your queries or whatever.  After all,
you'd have to feed the WAL files "manually" to replay, so you're in
control anyway without having to touch the primary.

I find it very hard to believe that people are doing stuff with
vacuum_defer_cleanup_age that cannot be done with either of the other
newer mechanisms, which have also seen much wider usage and so bugs
fixed, etc.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)



Re: Should we remove vacuum_defer_cleanup_age?

От
Laurenz Albe
Дата:
On Fri, 2023-04-14 at 18:43 +0200, Alvaro Herrera wrote:
> On 2023-Apr-14, Greg Stark wrote:
> > I assume people would use hot_standby_feedback if they have streaming
> > replication.
>
> Yes, either that or a replication slot.

A replication slot doesn't do anything against snapshot conflicts,
which is what we are discussing here.  Or are we not?

>
> I find it very hard to believe that people are doing stuff with
> vacuum_defer_cleanup_age that cannot be done with either of the other
> newer mechanisms, which have also seen much wider usage and so bugs
> fixed, etc.

vacuum_defer_cleanup_age offers a more fine-grained approach.
With hot_standby_feedback you can only say "don't ever remove any dead
tuples that the standby still needs".

But perhaps you'd prefer "don't remove dead tuples unless they are
quite old", so that you can get your shorter queries on the standby
to complete, without delaying replay and without the danger that a
long running query on the standby bloats your primary.

How about this:
Let's remove vacuum_defer_cleanup_age, and put a note in the release notes
that recommends using statement_timeout and hot_standby_feedback = on
on the standby instead.
That should have pretty much the same effect, and it is measured in
time and not in the number of transactions.

Yours,
Laurenz Albe



Re: Should we remove vacuum_defer_cleanup_age?

От
Greg Stark
Дата:
On Fri, 14 Apr 2023 at 13:15, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Fri, 2023-04-14 at 18:43 +0200, Alvaro Herrera wrote:
> > On 2023-Apr-14, Greg Stark wrote:
> > > I assume people would use hot_standby_feedback if they have streaming
> > > replication.
> >
> > Yes, either that or a replication slot.
>
> A replication slot doesn't do anything against snapshot conflicts,
> which is what we are discussing here.  Or are we not?

They're related -- the replication slot holds the feedback xmin so
that if your standby disconnects it can reconnect later and not have
lost data in the meantime. At least I think that's what I think it
does -- I don't know if I'm just assuming that, but xmin is indeed in
pg_replication_slots.

-- 
greg



Re: Should we remove vacuum_defer_cleanup_age?

От
"Jonathan S. Katz"
Дата:
On 4/14/23 1:15 PM, Laurenz Albe wrote:

> Let's remove vacuum_defer_cleanup_age, and put a note in the release notes
> that recommends using statement_timeout and hot_standby_feedback = on
> on the standby instead.
> That should have pretty much the same effect, and it is measured in
> time and not in the number of transactions.

+1.

Jonathan


Вложения

Re: Should we remove vacuum_defer_cleanup_age?

От
Nathan Bossart
Дата:
On Fri, Apr 14, 2023 at 03:07:37PM -0400, Jonathan S. Katz wrote:
> +1.

+1.  I agree with the upthread discussion and support removing
vacuum_defer_cleanup_age.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Should we remove vacuum_defer_cleanup_age?

От
Andres Freund
Дата:
Hi,

On 2023-04-13 13:18:38 +0200, Alvaro Herrera wrote:
> On 2023-Apr-11, Andres Freund wrote:
> 
> > Updated patch attached. I think we should either apply something like that
> > patch, or at least add a <warning/> to the docs.
> 
> I gave this patch a look.  The only code change is that
> ComputeXidHorizons() and GetSnapshotData() no longer handle the case
> where vacuum_defer_cleanup_age is different from zero.  It looks good.
> The TransactionIdRetreatSafely() code being removed is pretty weird (I
> spent a good dozen minutes writing a complaint that your rewrite was
> faulty, but it turns out I had misunderstood the function), so I'm glad
> it's being retired.

My rewrite of what? The creation of TransactionIdRetreatSafely() in
be504a3e974?

I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
things to 64bit xids (lest they end up with the same bug as fixed by
be504a3e974), so it's perhaps worth thinking about how to make it less
confusing.


> >     <para>
> > -    Similarly, <xref linkend="guc-hot-standby-feedback"/>
> > -    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
> > -    relevant rows being removed by vacuum, but the former provides no
> > -    protection during any time period when the standby is not connected,
> > -    and the latter often needs to be set to a high value to provide adequate
> > -    protection.  Replication slots overcome these disadvantages.
> > +    Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
> > +    also using a replication slot, provides protection against relevant rows
> > +    being removed by vacuum, but provides no protection during any time period
> > +    when the standby is not connected.  Replication slots overcome these
> > +    disadvantages.
> 
> I think it made sense to have this paragraph be separate from the
> previous one when it was talking about two separate variables, but now
> that it's just one, it looks a bit isolated.  I would merge it with the
> one above, which is talking about pretty much the same thing, and
> reorder the whole thing approximately like this
> 
>    <para>
>     In lieu of using replication slots, it is possible to prevent the removal
>     of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
>     storing the segments in an archive using
>     <xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
>     However, these methods often result in retaining more WAL segments than
>     required.
>     Similarly, <xref linkend="guc-hot-standby-feedback"/> without
>     a replication slot provides protection against relevant rows
>     being removed by vacuum, but provides no protection during any time period
>     when the standby is not connected.
>    </para>
>    <para>
>     Replication slots overcome these disadvantages by retaining only the number
>     of segments known to be needed.
>     On the other hand, replication slots can retain so
>     many WAL segments that they fill up the space allocated
>     for <literal>pg_wal</literal>;
>     <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
>     retained by replication slots.
>    </para>

It seems a bit confusing now, because "by retaining only the number of
segments ..." now also should cover hs_feedback (due to merging), but doesn't.


> Though the "However," looks a poor fit; I would do this:

I agree, I don't like the however.


I think I'll push the version I had. Then we can separately word-smith the
section? Unless somebody protests I'm gonna do that soon.

Greetings,

Andres Freund



Re: Should we remove vacuum_defer_cleanup_age?

От
Alvaro Herrera
Дата:
On 2023-Apr-22, Andres Freund wrote:

> On 2023-04-13 13:18:38 +0200, Alvaro Herrera wrote:
> > 
> > > Updated patch attached. I think we should either apply something like that
> > > patch, or at least add a <warning/> to the docs.
> > 
> > I gave this patch a look.  The only code change is that
> > ComputeXidHorizons() and GetSnapshotData() no longer handle the case
> > where vacuum_defer_cleanup_age is different from zero.  It looks good.
> > The TransactionIdRetreatSafely() code being removed is pretty weird (I
> > spent a good dozen minutes writing a complaint that your rewrite was
> > faulty, but it turns out I had misunderstood the function), so I'm glad
> > it's being retired.
> 
> My rewrite of what? The creation of TransactionIdRetreatSafely() in
> be504a3e974?

I meant the code that used to call TransactionIdRetreatSafely() and that
you're changing in the proposed patch.

> I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
> things to 64bit xids (lest they end up with the same bug as fixed by
> be504a3e974), so it's perhaps worth thinking about how to make it less
> confusing.

The one thing that IMO makes it less confusing is to have it return the
value rather than modifying it in place.

> >    <para>
> >     Replication slots overcome these disadvantages by retaining only the number
> >     of segments known to be needed.
> >     On the other hand, replication slots can retain so
> >     many WAL segments that they fill up the space allocated
> >     for <literal>pg_wal</literal>;
> >     <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
> >     retained by replication slots.
> >    </para>
> 
> It seems a bit confusing now, because "by retaining only the number of
> segments ..." now also should cover hs_feedback (due to merging), but doesn't.

Hah, ok.

> I think I'll push the version I had. Then we can separately word-smith the
> section? Unless somebody protests I'm gonna do that soon.

No objection.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: Should we remove vacuum_defer_cleanup_age?

От
Robert Haas
Дата:
On Mon, Apr 24, 2023 at 8:36 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> The one thing that IMO makes it less confusing is to have it return the
> value rather than modifying it in place.

Yeah, I don't understand why we have these functions that modify the
value in place. Those are probably convenient here and there, but
overall they seem to make things more confusing.

--
Robert Haas
EDB: http://www.enterprisedb.com



RE: Should we remove vacuum_defer_cleanup_age?

От
Phil Florent
Дата:


Hi,
Not very convenient but if autovacuum is enabled isn't vacuum_defer_cleanup_age the way to make extensions like pg_dirtyread more effective for temporal queries to quickly correct human DML mistakes without the need of a complete restore, even if no standby is in use ? vacuum_defer_cleanup_age+pg_dirtyread give PostgreSQL something like "flashback query" in Oracle.
Best regards,
Phil


De : Andres Freund <andres@anarazel.de>
Envoyé : dimanche 23 avril 2023 00:47
À : Alvaro Herrera <alvherre@alvh.no-ip.org>
Cc : Justin Pryzby <pryzby@telsasoft.com>; pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>; Amit Kapila <amit.kapila16@gmail.com>
Objet : Re: Should we remove vacuum_defer_cleanup_age?
 
Hi,

On 2023-04-13 13:18:38 +0200, Alvaro Herrera wrote:
> On 2023-Apr-11, Andres Freund wrote:
>
> > Updated patch attached. I think we should either apply something like that
> > patch, or at least add a <warning/> to the docs.
>
> I gave this patch a look.  The only code change is that
> ComputeXidHorizons() and GetSnapshotData() no longer handle the case
> where vacuum_defer_cleanup_age is different from zero.  It looks good.
> The TransactionIdRetreatSafely() code being removed is pretty weird (I
> spent a good dozen minutes writing a complaint that your rewrite was
> faulty, but it turns out I had misunderstood the function), so I'm glad
> it's being retired.

My rewrite of what? The creation of TransactionIdRetreatSafely() in
be504a3e974?

I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
things to 64bit xids (lest they end up with the same bug as fixed by
be504a3e974), so it's perhaps worth thinking about how to make it less
confusing.


> >     <para>
> > -    Similarly, <xref linkend="guc-hot-standby-feedback"/>
> > -    and <xref linkend="guc-vacuum-defer-cleanup-age"/> provide protection against
> > -    relevant rows being removed by vacuum, but the former provides no
> > -    protection during any time period when the standby is not connected,
> > -    and the latter often needs to be set to a high value to provide adequate
> > -    protection.  Replication slots overcome these disadvantages.
> > +    Similarly, <xref linkend="guc-hot-standby-feedback"/> on its own, without
> > +    also using a replication slot, provides protection against relevant rows
> > +    being removed by vacuum, but provides no protection during any time period
> > +    when the standby is not connected.  Replication slots overcome these
> > +    disadvantages.
>
> I think it made sense to have this paragraph be separate from the
> previous one when it was talking about two separate variables, but now
> that it's just one, it looks a bit isolated.  I would merge it with the
> one above, which is talking about pretty much the same thing, and
> reorder the whole thing approximately like this
>
>    <para>
>     In lieu of using replication slots, it is possible to prevent the removal
>     of old WAL segments using <xref linkend="guc-wal-keep-size"/>, or by
>     storing the segments in an archive using
>     <xref linkend="guc-archive-command"/> or <xref linkend="guc-archive-library"/>.
>     However, these methods often result in retaining more WAL segments than
>     required.
>     Similarly, <xref linkend="guc-hot-standby-feedback"/> without
>     a replication slot provides protection against relevant rows
>     being removed by vacuum, but provides no protection during any time period
>     when the standby is not connected.
>    </para>
>    <para>
>     Replication slots overcome these disadvantages by retaining only the number
>     of segments known to be needed.
>     On the other hand, replication slots can retain so
>     many WAL segments that they fill up the space allocated
>     for <literal>pg_wal</literal>;
>     <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
>     retained by replication slots.
>    </para>

It seems a bit confusing now, because "by retaining only the number of
segments ..." now also should cover hs_feedback (due to merging), but doesn't.


> Though the "However," looks a poor fit; I would do this:

I agree, I don't like the however.


I think I'll push the version I had. Then we can separately word-smith the
section? Unless somebody protests I'm gonna do that soon.

Greetings,

Andres Freund


Re: Should we remove vacuum_defer_cleanup_age?

От
Andres Freund
Дата:
Hi,

On 2023-04-24 14:36:36 +0200, Alvaro Herrera wrote:
> On 2023-Apr-22, Andres Freund wrote:
> > I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
> > things to 64bit xids (lest they end up with the same bug as fixed by
> > be504a3e974), so it's perhaps worth thinking about how to make it less
> > confusing.
> 
> The one thing that IMO makes it less confusing is to have it return the
> value rather than modifying it in place.

Partially I made it that way because you otherwise end up repeating long
variable names multiple times within one statement, yielding long repetitive
lines...  Not sure that's a good enough reason, but ...



> > >    <para>
> > >     Replication slots overcome these disadvantages by retaining only the number
> > >     of segments known to be needed.
> > >     On the other hand, replication slots can retain so
> > >     many WAL segments that they fill up the space allocated
> > >     for <literal>pg_wal</literal>;
> > >     <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
> > >     retained by replication slots.
> > >    </para>
> > 
> > It seems a bit confusing now, because "by retaining only the number of
> > segments ..." now also should cover hs_feedback (due to merging), but doesn't.
> 
> Hah, ok.
> 

> > I think I'll push the version I had. Then we can separately word-smith the
> > section? Unless somebody protests I'm gonna do that soon.
> 
> No objection.

Cool. Pushed now.