Re: GiST VACUUM

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: GiST VACUUM
Дата
Msg-id cd18b07b-f74c-34b2-313d-64ea7dc45f77@iki.fi
обсуждение исходный текст
Ответ на Re: GiST VACUUM  (Andrey Borodin <x4mmm@yandex-team.ru>)
Ответы Re: GiST VACUUM  (Andrey Borodin <x4mmm@yandex-team.ru>)
Re: GiST VACUUM  (Andrey Borodin <x4mmm@yandex-team.ru>)
Список pgsql-hackers
I'm now looking at the first patch in this series, to allow completely 
empty GiST pages to be recycled. I've got some questions:

> --- a/src/backend/access/gist/gist.c
> +++ b/src/backend/access/gist/gist.c
> @@ -700,6 +700,13 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate)
>              GISTInsertStack *item;
>              OffsetNumber downlinkoffnum;
>  
> +            if(GistPageIsDeleted(stack->page))
> +            {
> +                UnlockReleaseBuffer(stack->buffer);
> +                xlocked = false;
> +                state.stack = stack = stack->parent;
> +                continue;
> +            }
>              downlinkoffnum = gistchoose(state.r, stack->page, itup, giststate);
>              iid = PageGetItemId(stack->page, downlinkoffnum);
>              idxtuple = (IndexTuple) PageGetItem(stack->page, iid);

This seems misplaced. This code deals with internal pages, and as far as 
I can see, this patch never marks internal pages as deleted, only leaf 
pages. However, we should have something like this in the leaf-page 
branch, to deal with the case that an insertion lands on a page that was 
concurrently deleted. Did you have any tests, where an insertion runs 
concurrently with vacuum, that would exercise this?

The code in gistbulkdelete() seems pretty expensive. In the first phase, 
it records the parent of every empty leaf page it encounters. In the 
second phase, it scans every leaf page of that parent, not only those 
leaves that were seen as empty.

I'm a bit wary of using pd_prune_xid for the checks to determine if a 
deleted page can be recycled yet. In heap pages, pd_prune_xid is just a 
hint, but here it's used for a critical check. This seems to be the same 
mechanism we use in B-trees, but in B-trees, we store the XID in 
BTPageOpaqueData.xact, not pd_prune_xid. Also, in B-trees, we use 
ReadNewTransactionId() to set it, not GetCurrentTransactionId(). See 
comments in _bt_unlink_halfdead_page() for explanation. This patch is 
missing any comments to explain how this works in GiST.

If you crash in the middle of gistbulkdelete(), after it has removed the 
downlink from the parent, but before it has marked the leaf page as 
deleted, the leaf page is "leaked". I think that's acceptable, but a 
comment at least would be good.

- Heikki


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: _isnan() on Windows
Следующее
От: Jerry Jelinek
Дата:
Сообщение: Re: patch to allow disable of WAL recycling