I am just asking this for my general understanding of PostgreSQL. I was
reading the 'Realtime VACUUM + performance on insert/update/delete' thread
and got stuck on something Scott Marlowe had mentioned:
> Also, things like:
>   update table set field1=field1+1
> are killers in an MVCC database as well.
If I understand this correctly, such an update effectively copies the entire
table, since MVCC writes a new version of every row rather than overwriting
rows in place. For big tables, that doesn't seem like something you'd want to
take lightly.
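Just as a rough illustration of the bloat I mean (using a throwaway table t
of my own invention; pg_relation_size is simply how I'd check the on-disk
footprint):

    create table t (field1 int);
    insert into t select generate_series(1, 100000);
    select pg_relation_size('t');      -- baseline on-disk size
    update t set field1 = field1 + 1;  -- every row gets a new version
    select pg_relation_size('t');      -- roughly double: old versions linger as dead tuples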
My question is: once the update commits, and every transaction that could
still see the old tuple versions has also completed, could those dead tuples
be automatically marked for reuse in the course of normal processing (without
adding significant overhead), rather than waiting for a vacuum?
I am sure the short answer is 'No'.
This probably comes down to how MVCC works: most likely it concerns itself
only with a tuple's visibility while processing transactions, not with
figuring out which versions are no longer relevant, which is vacuum's job.
I'm just trying to understand whether it would ever be possible or reasonable
for postgres to trigger vacuums on tables based on administrative thresholds:
say, once 50% of a table's tuples are dead, mark them for reuse.
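(Purely to sketch the sort of knob I'm imagining, in postgresql.conf terms;
these parameter names are hypothetical, not anything that actually exists:)

    # hypothetical settings, only to illustrate the idea
    auto_vacuum = on                 # let the server schedule its own vacuums
    auto_vacuum_dead_percent = 50    # reclaim once 50% of a table's tuples are dead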
Again, this is just out of curiosity, and it probably means I need to study
PostgreSQL's internals more than anything else. I was just wondering if
anyone might offer a quick and easy answer before I dive in.
Thanks.