Tomas Szepe reported here
http://archives.postgresql.org/pgsql-bugs/2008-02/msg00068.php
about a bug in VACUUM FULL, which turned out to be that the code
was expecting pages with no live tuples to always get added to
the fraged_pages list, and this was sometimes not happening.
On investigation the problem occurs because we changed vacuum.c's
PageGetFreeSpaceWithFillFactor() to use PageGetHeapFreeSpace()
instead of just computing pd_upper - pd_lower as it had done in
every previous release. This was *not* a good idea: VACUUM FULL
does its own accounting for line pointers and does not need "help".
Aside from exposing the aforementioned bug, this would result in
pages sometimes being passed over as not being useful insertion
targets, when in fact they were going to be completely empty after
the "reaping" step.
The attached proposed patch reverts that bad decision, and also adds an
extra condition to force pages into the fraged_pages list when notup is
true. Under normal circumstances this extra condition should be a
useless test, but I am worried that with extremely small fillfactor
settings it might still be possible to provoke the empty_end_pages
problem.
I am thinking that the extra notup condition should be back-patched,
at least as far back as we have fillfactor support, and maybe all
the way back on general principles.
Comments?
regards, tom lane
Index: vacuum.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.363
diff -c -r1.363 vacuum.c
*** vacuum.c 3 Jan 2008 21:23:15 -0000 1.363
--- vacuum.c 10 Feb 2008 21:07:25 -0000
***************
*** 1659,1670 ****
free_space += vacpage->free;
/*
! * Add the page to fraged_pages if it has a useful amount of free
! * space. "Useful" means enough for a minimal-sized tuple. But we
! * don't know that accurately near the start of the relation, so add
! * pages unconditionally if they have >= BLCKSZ/10 free space.
*/
! do_frag = (vacpage->free >= min_tlen || vacpage->free >= BLCKSZ / 10);
if (do_reap || do_frag)
{
--- 1659,1676 ----
free_space += vacpage->free;
/*
! * Add the page to vacuum_pages if it requires reaping, and add it to
! * fraged_pages if it has a useful amount of free space. "Useful"
! * means enough for a minimal-sized tuple. But we don't know that
! * accurately near the start of the relation, so add pages
! * unconditionally if they have >= BLCKSZ/10 free space. Also
! * forcibly add pages with no live tuples, to avoid confusing the
! * empty_end_pages logic. (In the presence of unreasonably small
! * fillfactor, it seems possible that such pages might not pass
! * the free-space test, but they had better be in the list anyway.)
*/
! do_frag = (vacpage->free >= min_tlen || vacpage->free >= BLCKSZ / 10 ||
! notup);
if (do_reap || do_frag)
{
***************
*** 1679,1684 ****
--- 1685,1691 ----
/*
* Include the page in empty_end_pages if it will be empty after
* vacuuming; this is to keep us from using it as a move destination.
+ * Note that such pages are guaranteed to be in fraged_pages.
*/
if (notup)
{
***************
*** 3725,3731 ****
static Size
PageGetFreeSpaceWithFillFactor(Relation relation, Page page)
{
! Size freespace = PageGetHeapFreeSpace(page);
Size targetfree;
targetfree = RelationGetTargetPageFreeSpace(relation,
--- 3732,3750 ----
static Size
PageGetFreeSpaceWithFillFactor(Relation relation, Page page)
{
! /*
! * It is correct to use PageGetExactFreeSpace() here, *not*
! * PageGetHeapFreeSpace(). This is because (a) we do our own, exact
! * accounting for whether line pointers must be added, and (b) we will
! * recycle any LP_DEAD line pointers before starting to add rows to a
! * page, but that may not have happened yet at the time this function is
! * applied to a page, which means PageGetHeapFreeSpace()'s protection
! * against too many line pointers on a page could fire incorrectly. We do
! * not need that protection here: since VACUUM FULL always recycles all
! * dead line pointers first, it'd be physically impossible to insert more
! * than MaxHeapTuplesPerPage tuples anyway.
! */
! Size freespace = PageGetExactFreeSpace(page);
Size targetfree;
targetfree = RelationGetTargetPageFreeSpace(relation,