Обсуждение: preliminary: logical column order
This patch provides a preliminary implementation of the logical column ordering functionality recently proposed on -hackers. This patch has a number of known issues, so please don't apply it. Current functionality implemented: - Added attpos to the system catalogs & bumped the catalog version: initdb required - COPY TO/FROM obey logical column ordering - When expanding "SELECT * ...", we make sure to obey logical column order (see below, however) - If INSERT is given an empty column list, it will produce the necessary implicit column list in sorted by attpos - (unrelated) improve a bunch of comments in pg_class.h, remove some dead code from pg_attribute.h, refactor some code in access/common/printtup.c Remaining work to do: - Change the name of the pg_attribute attribute to something other than 'attpos', per the discussion on -hackers. I don't believe we've settled on the right name yet. - The code for actually sorting the columns in attpos-order is duplicated a few times -- this was just done for the sake of convenience, I'm going to clean this up and stick it in a single, shared location in the new patch. - The INSERT change causes the alter_table regression tests to fail (the regression.diff) is attached. This appears to be the logical column ordering interacting with inheritance, but I haven't yet taken an in-depth look at it. - When processing a "SELECT *", for example, the actual data columns are returned in the right order, but the RowDescription messages sent by libpq are not (i.e. they are sent in attnum-order, not attpos). In SendRowDescriptionMessage() and related printtup code, I took a look at trying to sort the attributes we're about to send RowDescription messages for by their 'attpos', but the TupleDesc that we have for the return type doesn't included a filled-in attpos. I spent a while trying to see how feasible it would be to fill in the result-type-TupleDesc with an attpos where available, but couldn't figure out an easy way to do this. Are there any suggestions on how this should be best done? - I haven't yet implemented ordering-by-attpos for the output columns of a join, or the alias->column matching in the FROM alias list (per Tom's email to -hackers). I saw a couple places where I might be able to do this, but I'm quite sure what the correct spot is. Tom, do you have any suggestions? Any comments would be gratefully appreciated. -Neil
Вложения
Neil Conway <neilc@samurai.com> writes: > - The code for actually sorting the columns in attpos-order is > duplicated a few times -- this was just done for the sake of > convenience, I'm going to clean this up and stick it in a > single, shared location in the new patch. Bruce and I were chatting about that on the phone today. I think it might be useful for TupleDescs to doubly index their contained attribute rows --- that is, keep the existing array-indexed-by-attnum, but add another pointer array indexed by attpos, containing only nondeleted columns. This would be easy to build, and it'd eliminate searching/sorting for places that had access to a TupleDesc. > - When processing a "SELECT *", for example, the actual data > columns are returned in the right order, but the > RowDescription messages sent by libpq are not (i.e. they are > sent in attnum-order, not attpos). Easy to fix given above proposal ... although actually I am not sure why this would occur. printtup and friends should always get a constructed TupDesc that has no notion of deleted or renumbered columns. This may be a symptom of a more fundamental error somewhere. regards, tom lane
On Fri, 21 Nov 2003 03:09:02 -0500, Neil Conway <neilc@samurai.com> wrote: >attachment; filename=weird_regression.diffs This was caused by a small oversight in ALTER TABLE ... ADD COLUMN: diff -ruN ../base/src/backend/commands/tablecmds.c src/backend/commands/tablecmds.c --- ../base/src/backend/commands/tablecmds.c 2003-10-14 00:47:15.000000000 +0200 +++ src/backend/commands/tablecmds.c 2003-11-23 16:51:37.000000000 +0100 @@ -1786,6 +1786,7 @@ attribute->attcacheoff = -1; attribute->atttypmod = colDef->typename->typmod; attribute->attnum = i; + attribute->attpos = i; attribute->attbyval = tform->typbyval; attribute->attndims = attndims; attribute->attisset = (bool) (tform->typtype == 'c'); Servus Manfred
Tom Lane <tgl@sss.pgh.pa.us> writes: > Bruce and I were chatting about that on the phone today. I think it > might be useful for TupleDescs to doubly index their contained > attribute rows Ok, I implemented this. I made it so that the properly sorted attribute array is constructed lazily -- my reasoning was that (a) relatively few locations in the code actually use the sorted attribute array, so there is no point allocating and initializing it every time a TupleDesc is used (b) the array could potentially be 1,500 pointers long -- not huge, but not tiny either, so it is worth a little effort to avoid unnecessarily alloc'ing and sorting it. Access to the sorted array is (only) done via a new function, GetAttrByLogicalPosition() > Easy to fix given above proposal ... although actually I am not sure > why this would occur. printtup and friends should always get a > constructed TupDesc that has no notion of deleted or renumbered > columns. It seems to happen because: - SendRowDescription() in printtup.c notes that "the TupleDesc has been manufactured by ExecTypeFromTL() or some similar function; it does not contain a full set of fields." - ExecTypeFromTL() allocates an empty TupleDesc, and then fills in its individual entries with data from a TargetList -- the Form_pg_attribute for the attributes involved is never consulted, so the default attlogpos inserted by TupleDescInitEntry() is used: attlogpos == attnum I'm unsure of the best way to fix this so that the TupleDesc that is handed to printtup & friends contains the information we require. Any suggestions? A new version of the patch is attached. Changes: - replace sorting code with a lazily-constructed sorted array of pointers to attribute data in TupleDesc - include Manfred's suggested fix for the alter table regression test - (unrelated) add a comment to rewrite/rewriteDefine.c noting the intent of the code and known problems - (unrelated) merge my other patch for refactoring CreateTupleDescCopy() into this patch: I needed to modify this function in this patch, so I didn't want to deal with patching complexities. -Neil
Вложения
I wonder if it wouldn't be easier to reorder the TupDesc->attrs[] array according to an attphysid when filling the TupDesc structure, right after a column was dropped/recreated (before any indexes/constraints are recreated), so attnum remains, while storage changes. Example: before: attnum attphysid attname attisdropped 1 1 foo f 2 2 bar f after drop/recreate col: 1 3 foo f 2 2 bar f 3 1 foo_del t resulting in an attrs array attrs[0] describing physical col 3 attrs[1] describing physical col 2 attrs[2] describing physical col 1 Regards, Andreas