Обсуждение: Suggestion; "WITH VACUUM" option
Tom, Folks: Joe and I were discussing your recent discussion about the costs of VACUUM and tuple maintainence, and I had an interesting idea. How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE queries? This option would cause the regular vacuum activity -- purging the dead tuple and its index references -- to be done immediately, as part of the statement, instead of being deferred. I agree that we don't want immediate tuple maintainence, most of the time. However, a couple of my clients would really like to have a few queries do their vacuum immediately, and are currently getting around that by vacuuming every 5 mintutes. Easy? Hard? Insane? What do you think? -- -Josh BerkusAglio Database SolutionsSan Francisco
On Mon, 16 Dec 2002, Josh Berkus wrote: > Tom, Folks: > > Joe and I were discussing your recent discussion about the costs of VACUUM and > tuple maintainence, and I had an interesting idea. > > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE > queries? This option would cause the regular vacuum activity -- purging the > dead tuple and its index references -- to be done immediately, as part of the > statement, instead of being deferred. > > I agree that we don't want immediate tuple maintainence, most of the time. > However, a couple of my clients would really like to have a few queries do > their vacuum immediately, and are currently getting around that by vacuuming > every 5 mintutes. > > Easy? Hard? Insane? What do you think? Just curious, but wouldn't it be just as simple to issue a VACUUM call right after the UPDATE/DELETE?
Marc, > > Easy? Hard? Insane? What do you think? > > Just curious, but wouldn't it be just as simple to issue a VACUUM call > right after the UPDATE/DELETE? Well, you can't do that as part of a transaction or procedure, whereas .... Hmmm. Couldn't do "with vacuum" as part of a transaction, either. Funny, it sounded like a really good idea on the phone. In the words of Rosanne Rosanna Danna, "Neeeveeeeeer miiiiiiiiind." -- -Josh BerkusAglio Database SolutionsSan Francisco
Josh Berkus <josh@agliodbs.com> writes: > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE > queries? This option would cause the regular vacuum activity -- purging the > dead tuple and its index references -- to be done immediately, as part of the > statement, instead of being deferred. > Easy? Hard? Insane? What do you think? Impossible. You can't vacuum a tuple until the last open transaction that can see it is gone. It is therefore *impossible* for a transaction to vacuum away its own detritus; until the transaction commits, you can't even start to wonder whether other open transactions see it or not. Vacuuming has to be done later, and that being the case, I don't see any real advantage to altering the "background vacuum" design we have. regards, tom lane
On Mon, 16 Dec 2002, Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: > > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE > > queries? This option would cause the regular vacuum activity -- purging the > > dead tuple and its index references -- to be done immediately, as part of the > > statement, instead of being deferred. > > > Easy? Hard? Insane? What do you think? > > Impossible. You can't vacuum a tuple until the last open transaction > that can see it is gone. It is therefore *impossible* for a transaction > to vacuum away its own detritus; until the transaction commits, you > can't even start to wonder whether other open transactions see it or > not. > > Vacuuming has to be done later, and that being the case, I don't see any > real advantage to altering the "background vacuum" design we have. Then, would a "commit with vacuum" work? OR a "begin transaction with vacuum" Just tossing them out there...
On Mon, 16 Dec 2002, scott.marlowe wrote: > On Mon, 16 Dec 2002, Tom Lane wrote: > > > Josh Berkus <josh@agliodbs.com> writes: > > > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE > > > queries? This option would cause the regular vacuum activity -- purging the > > > dead tuple and its index references -- to be done immediately, as part of the > > > statement, instead of being deferred. > > > > > Easy? Hard? Insane? What do you think? > > > > Impossible. You can't vacuum a tuple until the last open transaction > > that can see it is gone. It is therefore *impossible* for a transaction > > to vacuum away its own detritus; until the transaction commits, you > > can't even start to wonder whether other open transactions see it or > > not. > > > > Vacuuming has to be done later, and that being the case, I don't see any > > real advantage to altering the "background vacuum" design we have. > > Then, would a "commit with vacuum" work? OR a "begin transaction with > vacuum" Just tossing them out there... Tom will correct me here, but I believe what he was trying to get across isn't that the 'current transaction' is the problem ... the problem is the other connections who have open transactions ... my simplistic understanding (and I hope it isn't too flawed) of MVCC is that as long as *one* transaction is outstanding on a tuple, that tuple can't be physically removed ... as far as any new transactions are concerned, it has disappeared ... so if I open a transaction, then you open one 'with vacuum', your 'with vacuum' will fail unless I happen to be out of my transaction before you ...
On Mon, 16 Dec 2002, Marc G. Fournier wrote: > On Mon, 16 Dec 2002, scott.marlowe wrote: > > > On Mon, 16 Dec 2002, Tom Lane wrote: > > > > > Josh Berkus <josh@agliodbs.com> writes: > > > > How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE > > > > queries? This option would cause the regular vacuum activity -- purging the > > > > dead tuple and its index references -- to be done immediately, as part of the > > > > statement, instead of being deferred. > > > > > > > Easy? Hard? Insane? What do you think? > > > > > > Impossible. You can't vacuum a tuple until the last open transaction > > > that can see it is gone. It is therefore *impossible* for a transaction > > > to vacuum away its own detritus; until the transaction commits, you > > > can't even start to wonder whether other open transactions see it or > > > not. > > > > > > Vacuuming has to be done later, and that being the case, I don't see any > > > real advantage to altering the "background vacuum" design we have. > > > > Then, would a "commit with vacuum" work? OR a "begin transaction with > > vacuum" Just tossing them out there... > > Tom will correct me here, but I believe what he was trying to get across > isn't that the 'current transaction' is the problem ... the problem is the > other connections who have open transactions ... my simplistic > understanding (and I hope it isn't too flawed) of MVCC is that as long as > *one* transaction is outstanding on a tuple, that tuple can't be > physically removed ... as far as any new transactions are concerned, it > has disappeared ... so if I open a transaction, then you open one 'with > vacuum', your 'with vacuum' will fail unless I happen to be out of my > transaction before you ... Oh, yeah I have no doubt of that. I was thinking more along the lines of when a transaction ends it throws a background "vacuum table1;vacuum table2;vacuum tablen" command into some kind of vacuuming hopper. I.e. it doesn't block waiting, it runs it as though it were run AFTER the transaction. If there are a few tuples from other transactions we can't reclaim, no big deal. The other option would be some kind of GUC that set a max number of rows deleted/updated in a table in a transaction that would trigger this kind of thing automagically. But I could see such a setting causing just as much harm (chaos theory anyone? :-) as good.
"scott.marlowe" <scott.marlowe@ihs.com> writes: > Oh, yeah I have no doubt of that. I was thinking more along the lines of > when a transaction ends it throws a background "vacuum table1;vacuum > table2;vacuum tablen" command into some kind of vacuuming hopper. Actually, the plans I liked best for driving auto-vacuum were essentially an indirect version of that: the FSM module would keep track of committed deletes + aborted inserts for each active table, and then the autovacuum scheduler could use that info to decide which tables are highest-priority to vacuum. (Or possibly the runtime stats module would be a better place to track it than FSM.) regards, tom lane
Tom Lane wrote: >Josh Berkus <josh@agliodbs.com> writes: > > >>How hard would it be to add a "WITH (VACUUM)" option to UPDATE and DELETE >>queries? This option would cause the regular vacuum activity -- purging the >>dead tuple and its index references -- to be done immediately, as part of the >>statement, instead of being deferred. >> >> > > > >>Easy? Hard? Insane? What do you think? >> >> > >Impossible. You can't vacuum a tuple until the last open transaction >that can see it is gone. It is therefore *impossible* for a transaction >to vacuum away its own detritus; until the transaction commits, you >can't even start to wonder whether other open transactions see it or >not. > >Vacuuming has to be done later, and that being the case, I don't see any >real advantage to altering the "background vacuum" design we have. > > This does raise an interresting question, and I understand that it is *impossible* to do with PostgreSQL as it currently exists, however, let me just toss this out there: Suppose you do this: update largetable set foo=bar; Lets also assume that "largetable" has tens of millions of rows. I have databases like this, and I sometimes do operations like this. I have found it more efficient to break up the update into a series of: update largetable set foo=bar where somefield < a; vacuum update largetable set foo=bar where somefield < b; vacuum update largetable set foo=bar where somefield < c; vacuum update largetable set foo=bar where not foo = bar; vacuum On some of my databases a statement which updates all the rows is unworkable in PostgreSQL, on Oracle, however, there is no poblem. For my use, it is a pain in the neck to deal with, but not unworkable. For some other users, it may be a bigger problem. > > > >
On Tue, 17 Dec 2002, mlw wrote: > update largetable set foo=bar; > > Lets also assume that "largetable" has tens of millions of rows. [..] > On some of my databases a statement which updates all the rows is > unworkable in PostgreSQL, on Oracle, however, there is no poblem. .. provided you have a lot of rollback space, which is essentially what the datafile growth here is providing. Matthew.