Re: Maintaining cluster order on insert

From Heikki Linnakangas
Subject Re: Maintaining cluster order on insert
Date
Msg-id 46761DB7.7070505@enterprisedb.com
In response to Re: Maintaining cluster order on insert  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Maintaining cluster order on insert  (Gregory Stark <stark@enterprisedb.com>)
Re: Maintaining cluster order on insert  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> The implementation has changed a bit since August. I thought I had
>> submitted an updated version in the winter but couldn't find it. Anyway,
>> I updated and dusted off the source tree, tidied up the comments a
>> little bit, and fixed some inconsistencies in pg_proc entries that made
>> opr_sanity to fail.
>
> I started looking at this patch.  My first reaction is that I liked last
> August's API (an independent "suggestblock" call) a lot better.  I think
> trying to hold an index page lock across the heap insert is an extremely
> bad idea; it will hurt concurrency and possibly cause deadlocks
> (especially for unique indexes).

The index page is not kept locked across the calls. Just pinned.

The reason for switching to the new API instead of the amsuggestblock
API is CPU overhead. It avoids constructing the IndexTuple twice and
descending the tree twice.
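
To make the overhead argument concrete, here is a toy model in C, not code from the patch. It assumes a sorted array stands in for the index, a binary search stands in for one tree descent, and every name (ToyIndex, toy_descend, and so on) is hypothetical. It only counts how often each API shape forms an index tuple and descends the tree; nothing is actually inserted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an index: sorted keys, each mapped to a heap block. */
typedef struct {
    const int *keys;      /* sorted index keys */
    const int *blocks;    /* blocks[i] is the heap block holding keys[i] */
    size_t nkeys;
    int descents;         /* number of simulated tree descents */
    int tuples_formed;    /* number of simulated index tuple constructions */
} ToyIndex;

/* One simulated descent: position of the first key >= the search key. */
static size_t toy_descend(ToyIndex *idx, int key)
{
    size_t lo = 0, hi = idx->nkeys;

    idx->descents++;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (idx->keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Heap block of the neighbouring key, clamped to the last entry. */
static int toy_block_near(const ToyIndex *idx, size_t pos)
{
    assert(idx->nkeys > 0);
    if (pos >= idx->nkeys)
        pos = idx->nkeys - 1;
    return idx->blocks[pos];
}

/* Separate "suggest block" call, followed later by a normal index insert. */
int toy_suggest_then_insert(ToyIndex *idx, int key)
{
    idx->tuples_formed++;                 /* tuple built for the suggest call */
    size_t pos = toy_descend(idx, key);   /* first descent */
    int block = toy_block_near(idx, pos);

    /* ... the heap insert would happen here ... */

    idx->tuples_formed++;                 /* tuple built again for the insert */
    (void) toy_descend(idx, key);         /* second descent */
    return block;
}

/* Combined call: the position found once is remembered across the heap insert. */
int toy_combined_insert(ToyIndex *idx, int key)
{
    idx->tuples_formed++;                 /* tuple built once */
    size_t pos = toy_descend(idx, key);   /* single descent */
    int block = toy_block_near(idx, pos);

    /* ... the heap insert would happen here; pos is still known ... */
    (void) pos;
    return block;
}
```

In this model the separate-call shape does two descents and builds two tuples per insert, while the combined shape does one of each. Remembering the position across the heap insert is what keeping the index page pinned buys in the real thing.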

Clustering is mainly useful when the table doesn't fit in cache, so one
could argue that if you care about clustering you're most likely I/O
bound and don't care about the CPU overhead that much. Nevertheless,
avoiding it seems like a good idea to me.

The amsuggestblock API is simpler, though. We might choose it on those
grounds alone.

> The other question is why is execMain involved in this?  That makes the
> design nonfunctional for tuples inserted in any other way than through
> the main executor --- COPY for instance.  Also, if this is successful,
> I could see using it on system catalogs eventually.  I'm inclined to
> think that the right design is for heap_insert to call amsuggestblock
> for itself, or maybe even push that down to RelationGetBufferForTuple.
> (Note: having heap_insert contain logic that duplicates
> RelationGetBufferForTuple's is another bit of bad design here, but
> that's at least correctable locally.)  Now the difficulty in that is
> that the heapam.c routines don't have hold of any data structure
> containing index knowledge ... but they do have hold of the Relation
> structure for the heap.  I suggest making RelationGetIndexList() cache
> the OID of the clustered index (if any) in relcache entries, and add
> RelationGetClusterIndex much like RelationGetOidIndex, and then
> heap_insert can use that.

Hmm. My first reaction to that is that having heap_insert reach into an
index is a modularity violation. It wouldn't be too hard to make changes
to COPY similar to the ones I made to the executor.
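
For reference, the relcache idea in the quoted text (compute the clustered index OID while building the index list, then hand it out through an accessor) could look roughly like the toy model below. This is only a sketch under stated assumptions: ToyRelation, ToyIndexInfo and the toy_* functions are hypothetical stand-ins, not the real Relation structure or the real RelationGetIndexList().

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int ToyOid;
#define TOY_INVALID_OID ((ToyOid) 0)

/* Hypothetical description of one index on the table. */
typedef struct {
    ToyOid oid;
    bool is_clustered;    /* true for the index the table is clustered on */
} ToyIndexInfo;

/* Hypothetical stand-in for a relcache entry of the heap. */
typedef struct {
    const ToyIndexInfo *indexes;   /* stands in for the catalog contents */
    size_t nindexes;
    bool index_list_valid;         /* has the cached information been built? */
    ToyOid cluster_index;          /* cached OID, TOY_INVALID_OID if none */
    int scans;                     /* how many times the "catalog" was scanned */
} ToyRelation;

/* Build the cached index information once, noting the clustered index. */
static void toy_get_index_list(ToyRelation *rel)
{
    if (rel->index_list_valid)
        return;                    /* cache hit: no scan needed */

    rel->scans++;
    rel->cluster_index = TOY_INVALID_OID;
    for (size_t i = 0; i < rel->nindexes; i++) {
        if (rel->indexes[i].is_clustered) {
            rel->cluster_index = rel->indexes[i].oid;
            break;
        }
    }
    rel->index_list_valid = true;
}

/* Accessor an insert path could call; returns TOY_INVALID_OID if unclustered. */
ToyOid toy_get_cluster_index(ToyRelation *rel)
{
    toy_get_index_list(rel);
    return rel->cluster_index;
}
```

Repeated calls to toy_get_cluster_index() scan the index descriptions only once, which is the point of caching the OID in the relation entry.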

I doubt it's very useful for system catalogs; they should never grow
very large, and should stay mostly cached anyway.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
