Re: Maintaining cluster order on insert
От | Heikki Linnakangas |
---|---|
Тема | Re: Maintaining cluster order on insert |
Дата | |
Msg-id | 46761DB7.7070505@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: Maintaining cluster order on insert (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Maintaining cluster order on insert
(Gregory Stark <stark@enterprisedb.com>)
Re: Maintaining cluster order on insert (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-patches |
Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: >> The implementation has changed a bit since August. I thought I had >> submitted an updated version in the winter but couldn't find it. Anyway, >> I updated and dusted off the source tree, tidied up the comments a >> little bit, and fixed some inconsistencies in pg_proc entries that made >> opr_sanity to fail. > > I started looking at this patch. My first reaction is that I liked last > August's API (an independent "suggestblock" call) a lot better. I think > trying to hold an index page lock across the heap insert is an extremely > bad idea; it will hurt concurrency and possibly cause deadlocks > (especially for unique indexes). The index page is not kept locked across the calls. Just pinned. The reason for switching to the new API instead of the amsuggestblock API is CPU overhead. It avoids constructing the IndexTuple twice and descending the tree twice. Clustering is mainly useful when the table doesn't fit in cache, so one could argue that if you care about clustering you're most likely I/O bound and don't care about the CPU overhead that much. Nevertheless, avoiding it seems like a good idea to me. The amsuggestblock API is simpler, though. We might choose it on those grounds alone. > The other question is why is execMain involved in this? That makes the > design nonfunctional for tuples inserted in any other way than through > the main executor --- COPY for instance. Also, if this is successful, > I could see using it on system catalogs eventually. I'm inclined to > think that the right design is for heap_insert to call amsuggestblock > for itself, or maybe even push that down to RelationGetBufferForTuple. > (Note: having heap_insert contain logic that duplicates > RelationGetBufferForTuple's is another bit of bad design here, but > that's at least correctable locally.) Now the difficulty in that is > that the heapam.c routines don't have hold of any data structure > containing index knowledge ... but they do have hold of the Relation > structure for the heap. I suggest making RelationGetIndexList() cache > the OID of the clustered index (if any) in relcache entries, and add > RelationGetClusterIndex much like RelationGetOidIndex, and then > heap_insert can use that. Hmm. My first reaction on that is that having heap_insert reach into an index is a modularity violation. It wouldn't be too hard to make similar changes to COPY that I did to the executor. I doubt it's very useful for system catalogs; they should never grow very large, and should stay mostly cached anyway. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-patches по дате отправления: