Обсуждение: varchar() change
Let me go over the issues with the varchar() change. char() will continue to store full bytes, while varchar() function like text, but with a limited length. Now, pg_attribute.attlen is access everywhere, trying to find out how long the data field is. With text, the length is -1, but with varchar currently, it is the max length, and hence, it has to store all those bytes. Now, my idea is to add a new pg_attribute column called 'attmaxlen' which will hold the maximum length of the field. char() and varchar() will use this field, and the code will have be changed. Cases where attlen is referenced to determine data size will continue to use -1, but references to all functions that create a data entry will use the attmaxlen. I see 124 references to attlen in the code. Not too bad. Most are obvious. We had some of this work in the past, fixing places where the size was not properly passed into the table creation code, because varchar() and char() do not have lengths defined in pg_type like everyone else, but it is only in pg_attribute. This is a related change to allow data reference and tuple max length reference to be separate. I can see other new types using this field to. Come to think of it, I wonder if I could have the disk copy of pg_attribute use the pg_type length, and use the pg_attribute length only when creating/updating entries? I wonder if that is what it does already. Looks like that may be true. Comments? -- Bruce Momjian maillist@candle.pha.pa.us
Bruce Momjian wrote: > > Let me go over the issues with the varchar() change. > > char() will continue to store full bytes, while varchar() function like > text, but with a limited length. > > Now, pg_attribute.attlen is access everywhere, trying to find out how > long the data field is. With text, the length is -1, but with varchar > currently, it is the max length, and hence, it has to store all those > bytes. > > Now, my idea is to add a new pg_attribute column called 'attmaxlen' > which will hold the maximum length of the field. char() and varchar() > will use this field, and the code will have be changed. Cases where > attlen is referenced to determine data size will continue to use -1, but > references to all functions that create a data entry will use the > attmaxlen. I see 124 references to attlen in the code. Not too bad. > Most are obvious. Ok. I agreed that we have to add new column to pg_attribute, but I recommend 1. use some other name - not attmaxlen: this field could be used for NUMBER, etc and "maxlen" is not good name for storing precision, etc (atttspec ?) 2. use -2 for varchar: let's think about attlen -1 as about "un-limited" varlena, and about attlen -2 as about "limited" one, with maxlen specified in att???. I don't see problem with -2 - just new case of switch (attlen) - and this will allow leave text (-1) untouched (or you will have to store -1 in att??? for text to differentiate text from varchar)... Hmm, ... on the other hand, we could check atttype before switch(attlen) in heaptuple.c and other places - don't know what's better... Vadim
> Let me go over the issues with the varchar() change. > > char() will continue to store full bytes, while varchar() function like > text, but with a limited length. > > Now, pg_attribute.attlen is access everywhere, trying to find out how > long the data field is. With text, the length is -1, but with varchar > currently, it is the max length, and hence, it has to store all those > bytes. > > Now, my idea is to add a new pg_attribute column called 'attmaxlen' > which will hold the maximum length of the field. char() and varchar() > will use this field, and the code will have be changed. Cases where > attlen is referenced to determine data size will continue to use -1, but > references to all functions that create a data entry will use the > attmaxlen. I see 124 references to attlen in the code. Not too bad. > Most are obvious. > > We had some of this work in the past, fixing places where the size was > not properly passed into the table creation code, because varchar() and > char() do not have lengths defined in pg_type like everyone else, but it > is only in pg_attribute. > > This is a related change to allow data reference and tuple max length > reference to be separate. I can see other new types using this field > to. > > Come to think of it, I wonder if I could have the disk copy of > pg_attribute use the pg_type length, and use the pg_attribute length > only when creating/updating entries? I wonder if that is what it does > already. Looks like that may be true. > > Comments? Is what you are trying to do related to what could be used to implement other (SQL92) data types like numeric(precision,scale) where there are one or two additional parameters which are assigned when a column/class/type is defined and which must be available when working with column/class/type instances? We probably don't want to do anything about the latter for v6.3 (spread pretty thin with the work we've already picked up) but I'd like to do something for v6.4... Oh, while I'm thinking about it, this kind of thing is probably also necessary to get arrays working as expected (enforcing dimensions specified in the declaration). - Tom
> > Come to think of it, I wonder if I could have the disk copy of > > pg_attribute use the pg_type length, and use the pg_attribute length > > only when creating/updating entries? I wonder if that is what it does > > already. Looks like that may be true. > > > > Comments? > > Is what you are trying to do related to what could be used to implement > other (SQL92) data types like numeric(precision,scale) where there are one > or two additional parameters which are assigned when a column/class/type is > defined and which must be available when working with column/class/type > instances? We probably don't want to do anything about the latter for v6.3 > (spread pretty thin with the work we've already picked up) but I'd like to > do something for v6.4... > > Oh, while I'm thinking about it, this kind of thing is probably also > necessary to get arrays working as expected (enforcing dimensions specified > in the declaration). Yes, I had the numeric in mind. -- Bruce Momjian maillist@candle.pha.pa.us