Discussion: Short varlena headers and arrays
I had intended to make varlenas alignment 'c' and have the heaptuple.c force them to alignment 'i' if they required it. However I've noticed a problem that makes me think I should do this the other way around.

The problem is that other places in the codebase use the alignment. In particular arrays do. Also toasting.c expects to get a worst-case size from att_align rather than a best-case. Also there's indextuple.c but probably I should get to that in this round anyways.

So now I'm thinking it's best to leave them as alignment 'i' unless heaptuple.c thinks it can get away without aligning them. This means we don't have a convenient way for data types to opt out of this header compression. But the more I think about it the less convinced I am that we need that. The alignment inside the data type doesn't matter since you'll only be working with detoasted versions of them unless you specifically go out of your way to do otherwise.

Once this is done it may be worth having arrays convert to short varlenas as well. Arrays of short strings hurt pretty badly currently:

    postgres=# select pg_column_size(array['a','b','c','d']);
     pg_column_size
    ----------------
                 56
    (1 row)

The only problem with this is if it's more likely for someone to stuff things in an array and then read them back out without detoasting than it is for someone to stuff them in a tuple. Probably the risk is the same. There is some code that assumes it understands how arrays are laid out in execQual.c and varlena.c.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
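To make the 56-byte figure above concrete, here is a rough C sketch of the arithmetic. It is not code from the PostgreSQL tree: the helper names and the constants (a 16-byte fixed array header, 4 bytes each for one dimension and one lower bound, 8-byte maximum alignment, 4-byte 'i' alignment) are assumptions chosen for illustration. The 1-byte-header case is purely hypothetical, since the thread has not settled whether arrays could store elements that way.

```c
#include <assert.h>
#include <stddef.h>

/* All constants below are assumptions for illustration, not taken from the thread. */
#define ASSUMED_MAXALIGN     8   /* maximum alignment of the platform */
#define ASSUMED_ARRAY_FIXED 16   /* length word, ndim, dataoffset, element type */
#define ASSUMED_DIM_WORD     4   /* one int per dimension, one per lower bound */

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t
align_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/*
 * Estimate the size of a one-dimensional array without a null bitmap.
 * Every element has `payload` data bytes behind a `header`-byte length
 * header, and the running offset is padded to `elem_align` after each one.
 */
static size_t
estimate_array_size(size_t nelems, size_t payload,
                    size_t header, size_t elem_align)
{
    size_t total = align_up(ASSUMED_ARRAY_FIXED + 2 * ASSUMED_DIM_WORD,
                            ASSUMED_MAXALIGN);
    size_t i;

    for (i = 0; i < nelems; i++)
    {
        total += header + payload;            /* header plus data bytes */
        total = align_up(total, elem_align);  /* pad before the next element */
    }
    return total;
}
```

Under these assumptions, `estimate_array_size(4, 1, 4, 4)` gives 24 bytes of overhead plus four elements padded from 5 to 8 bytes each, which matches the 56 reported by `pg_column_size`. The same call with a 1-byte header and no padding, `estimate_array_size(4, 1, 1, 1)`, would come to 32.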
Gregory Stark <stark@enterprisedb.com> writes:
> Once this is done it may be worth having arrays convert to short varlenas as
> well.

Elements of arrays are not subject to being toasted by themselves, so
I don't think you can make that work. At least not without breaking
wide swaths of code that works fine today.

regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> Once this is done it may be worth having arrays convert to short varlenas as >> well. > > Elements of arrays are not subject to being toasted by themselves, so > I don't think you can make that work. At least not without breaking > wide swaths of code that works fine today. You think it's more likely there are places that build arrays and then read the items back without passing through detoast than there are places that build tuples and do so? Btw I ran into some problems with system tables. Since many of them are read using the GETSTRUCT method and in that method the first varlena field should be safely accessible, i would have to not skip the alignment for the first varlena field in system tables. Instead I just punt on all system tables. The only one that seems like it'll be loss on is pg_statistic and there the biggest problem is the space wasted inside the arrays, not before the varlena fields. Also, int2vector and oidvector don't expect to be toasted so I've skipped them as well. If we want to have an escape hatch they would have to be so marked. For now I just hard coded them. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> Elements of arrays are not subject to being toasted by themselves, so
>> I don't think you can make that work. At least not without breaking
>> wide swaths of code that works fine today.

> You think it's more likely there are places that build arrays and then read
> the items back without passing through detoast than there are places that
> build tuples and do so?

The former is valid per the coding rules, the latter is not, so...

regards, tom lane