Обсуждение: VARATT_EXTERNAL_GET_POINTER is not quite there yet
I got around to testing PG 8.3 on HPUX on Itanium (feel free to play along at www.testdrive.hp.com) ... and was dismayed to find that it doesn't work. If compiled with HP's C compiler, the regression tests dump core. Investigation shows that the problem occurs where tuptoaster.c tries to copy a misaligned toast pointer datum into a properly aligned local variable: it's using word-wide load and store instructions to do that copying, which of course does not work on any architecture that's picky about alignment. We'd seen this before and tried to fix it by introducing a pointer cast within VARATT_EXTERNAL_GET_POINTER(), but evidently that's not enough for some non-gcc compilers. After much experimentation I was able to get it to work by invoking memcpy through a function pointer, which seems to be sufficient to disable this particular compiler's built-in intelligence about memcpy. I can't say that I find this a nice clean solution; but does anyone have a better one? The full patch that I'm thinking of applying is *** src/backend/access/heap/tuptoaster.c.orig Tue Jan 1 14:53:12 2008 --- src/backend/access/heap/tuptoaster.c Wed Feb 20 20:28:13 2008 *************** *** 65,72 **** #define VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr) \ do { \ varattrib_1b_e *attre = (varattrib_1b_e*) (attr); \ ! Assert(VARSIZE_ANY_EXHDR(attre) == sizeof(toast_pointer)); \ ! memcpy(&(toast_pointer), VARDATA_EXTERNAL(attre), sizeof(toast_pointer)); \ } while (0) --- 65,74 ---- #define VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr) \ do { \ varattrib_1b_e *attre = (varattrib_1b_e*) (attr); \ ! void *(*mcopy) (void *dest, const void *src, size_t sz) = memcpy; \ ! Assert(VARATT_IS_EXTERNAL(attre)); \ ! Assert(VARSIZE_EXTERNAL(attre) == sizeof(toast_pointer)+VARHDRSZ_EXTERNAL); \ ! mcopy(&(toast_pointer), VARDATA_EXTERNAL(attre), sizeof(toast_pointer)); \ } while (0) (The Assert changes aren't necessary for this particular problem, but were added after realizing that the original Assert didn't adequately protect the subsequent use of VARDATA_EXTERNAL. That macro assumes that the datum has a 1-byte header. I had first thought that the cast to "varattrib_4b *" that occurs within one branch of VARSIZE_ANY_EXHDR might be giving the compiler license to think that the pointer is word-aligned. After further experimentation I don't think that HP's compiler thinks so; but some other compiler might, so it seems wise to get rid of that.) It's all pretty ugly, but I'm afraid that we're in for shenanigans like this as compilers get more aggressive about optimization. Has anyone got a better suggestion? regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > After much experimentation I was able to get it to work by invoking > memcpy through a function pointer, which seems to be sufficient to > disable this particular compiler's built-in intelligence about memcpy. > I can't say that I find this a nice clean solution; but does anyone have > a better one? I'm thinking instead of having struct varlena (which you're not allowed to safely use any members of anyways) we should just have a typedef to void*. That would make things like DATUM_GET_TEXT_PP slightly more sane as well. text* would just be a typedef to void* which could be passed to VARDATA_ANY and VARDATA_ANY_EXHDR but not manipulated directly. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> I can't say that I find this a nice clean solution; but does anyone have >> a better one? > I'm thinking instead of having struct varlena (which you're not allowed to > safely use any members of anyways) we should just have a typedef to void*. I don't think we could imagine eliminating the struct name, especially not as a back-patchable solution; there would be too many random breakages. It might work to change struct varlena's contents to something like char vl_len_[4]; /* Do not touch this field directly! */char vl_dat[1]; so that the compiler wouldn't see it as necessarily having more than 1-byte alignment. This would also not break any existing code that is following the rules (touching vl_dat has never been stated to be verboten). regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> "Tom Lane" <tgl@sss.pgh.pa.us> writes: >>> I can't say that I find this a nice clean solution; but does anyone have >>> a better one? > >> I'm thinking instead of having struct varlena (which you're not allowed to >> safely use any members of anyways) we should just have a typedef to void*. > > I don't think we could imagine eliminating the struct name, especially > not as a back-patchable solution; there would be too many random > breakages. Yeah, I wasn't thinking to backpatch that. > It might work to change struct varlena's contents to something like > > char vl_len_[4]; /* Do not touch this field directly! */ > char vl_dat[1]; > > so that the compiler wouldn't see it as necessarily having more than > 1-byte alignment. This would also not break any existing code that is > following the rules (touching vl_dat has never been stated to be > verboten). Oh, that would probably fix this problem pretty well. Touching vl_dat is only safe if you've either just allocated the object yourself or you've already detoasted it. In which case we could be using a struct pointer. But if you have a plain old varlena which might be toasted or the return value from GETARG_TEXT_PP then it doesn't make a lot of sense to have a struct pointer at all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> It might work to change struct varlena's contents to something like >> >> char vl_len_[4]; /* Do not touch this field directly! */ >> char vl_dat[1]; > Oh, that would probably fix this problem pretty well. > Touching vl_dat is only safe if you've either just allocated the object > yourself or you've already detoasted it. Sure, but that's been true ever since TOAST was first put in. The varvarlena patch didn't change that coding rule. I'll go try the char[4] hack and see if it works on the machines I have. regards, tom lane
Tom Lane wrote: > Gregory Stark <stark@enterprisedb.com> writes: > > "Tom Lane" <tgl@sss.pgh.pa.us> writes: > >> It might work to change struct varlena's contents to something like > >> > >> char vl_len_[4]; /* Do not touch this field directly! */ > >> char vl_dat[1]; > > > Oh, that would probably fix this problem pretty well. > > > Touching vl_dat is only safe if you've either just allocated the object > > yourself or you've already detoasted it. > > Sure, but that's been true ever since TOAST was first put in. The > varvarlena patch didn't change that coding rule. > > I'll go try the char[4] hack and see if it works on the machines I have. Tom has applied this change to CVS HEAD and 8.3.X. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +