Обсуждение: ALTER TABLE MODIFY COLUMN

Поиск
Список
Период
Сортировка

ALTER TABLE MODIFY COLUMN

От
Mark Butler
Дата:
I was looking at how hard it would be to support altering column types and it
seems to me that it would be trivial to support changing nullability,
increasing the maximum length of the VARCHAR data type and increasing the
precision or scale of the DECIMAL / NUMERIC data type. 

Oracle allows you to update a column to null and then modify its data type to
any other type.  This is easy because it stores all columns in a variable
length format and a null looks the same regardless of type.

I understand that with the current heap tuple format described in
backend/access/common.c that changing the type of any fixed length attribute
requires updating every row.  

Surely if we have an write exclusive table lock we can rewrite tuples in place
rather than creating new versions with its corresponding 2x space requirement.

We could presumably do the following to change a column data type:

Preconditions: 1. Type conversion is possible from old type to new type and either    a) old type is unconditionally
convertible(e.g. length/precision        increasing)    b) read locked scan of table reveals that all values are
convertible2. Exclusive write lock on table
 

if(new and old types are variable length and are binary compatible){  1. Change type in catalog  2. Done}
else{ 1. Visit all current tuples and rewrite in place, converting attribute value
to new     type, and shifting all other attributes and null bitmask appropriately 2. Change type in catalog 3. Done} 
Does this sound reasonable?  Also, is anyone working on ALTER TABLE DROP
COLUMN right now?

Speaking of which, couldn't we make it so that UPDATES and DELETES running
under an exclusive table lock do an inline vacuum?

- Mark Butler


Re: ALTER TABLE MODIFY COLUMN

От
Hiroshi Inoue
Дата:
Mark Butler wrote:
> 
> I was looking at how hard it would be to support altering column types and it
> seems to me that it would be trivial to support changing nullability,

Yes. The problem is how to formulate 'DROP CONSTRAINT' feature. 

> increasing the maximum length of the VARCHAR data type and increasing the
> precision or scale of the DECIMAL / NUMERIC data type.
> 

Yes. The problem is how PostgreSQL could recognize the fact.

[snip]

> I understand that with the current heap tuple format described in
> backend/access/common.c that changing the type of any fixed length attribute
> requires updating every row.
> 
> Surely if we have an write exclusive table lock we can rewrite tuples in place
> rather than creating new versions with its corresponding 2x space requirement.
> 

PostgreSQL has a no overwrite storage manager, so this
seems to have little advantage. We now have a mechanism
to replace existent relation files safely. We could 
avoid running VACUUM, multiple version of tuples in
a file etc ...

regards,
Hiroshi Inoue


Re: ALTER TABLE MODIFY COLUMN

От
Tom Lane
Дата:
Mark Butler <butlerm@middle.net> writes:
> Surely if we have an write exclusive table lock we can rewrite tuples
> in place rather than creating new versions with its corresponding 2x
> space requirement.

Nyet.  Consider transaction rollback.
        regards, tom lane


Re: ALTER TABLE MODIFY COLUMN

От
Mark Butler
Дата:
Tom Lane wrote:
> 
> Mark Butler <butlerm@middle.net> writes:
> > Surely if we have an write exclusive table lock we can rewrite tuples
> > in place rather than creating new versions with its corresponding 2x
> > space requirement.
> 
> Nyet.  Consider transaction rollback.

Well, the first thing to consider would be to make this type of DDL operation
un-abortable. If the database goes down while the table modification is in
progress, the recovery process could continue the operation to completion
before releasing the table for general access.

The problem with the standard alternatives is that they waste space and are
slow:

Alt 1. create new version of tuples in new format like DROP COLUMN proposal
Alt 2. rename column; add new column; copy column; drop (hide) old column 
Alt 3. rename indices; rename table; copy table; recreate indices; 

Now this probably only makes a difference in a data warehouse environment,
where the speed
of mass load / update operations is much more important than being able to
roll them back.

I suppose there are two really radical alternatives as well:

Radical Alt 1: Use format versioning to allow multiple row formats to
co-exist,              lazy update to latest format

Radical Alt 2: Delta compress different versions of the same row on the same
page

I can see that the conventional alternatives make sense for now, however.

- Mark Butler