Обсуждение: fast defaults in heap_getattr vs heap_deform_tuple
Hi, While working on the patch to slotify trigger.c I got somewhat confused by the need to expand tuples in trigger.c: static HeapTuple GetTupleForTrigger(EState *estate, EPQState *epqstate, ResultRelInfo *relinfo, ItemPointer tid, LockTupleMode lockmode, TupleTableSlot **newSlot) { ... if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts) result = heap_expand_tuple(&tuple, relation->rd_att); else result = heap_copytuple(&tuple); ReleaseBuffer(buffer); There's no explanation why GetTupleForTrigger() needs expanding tuples, but most other parts of postgres don't. Normally deforming ought to take care of expanding missing attrs. As far as I can tell, the reason that it's needed is that heap_gettar() wasn't updated along the rest of the functions. heap_deform_tuple(), heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't. That, to me, makes no sense. The reason that we see a difference in regression test output is that composite_to_json() uses heap_getattr(), if it used heap_deform_tuple (which'd be considerably faster), we'd get the default value. Am I missing something? Greetings, Andres Freund
Hi, On 2019-02-01 08:24:04 -0800, Andres Freund wrote: > While working on the patch to slotify trigger.c I got somewhat confused > by the need to expand tuples in trigger.c: > > static HeapTuple > GetTupleForTrigger(EState *estate, > EPQState *epqstate, > ResultRelInfo *relinfo, > ItemPointer tid, > LockTupleMode lockmode, > TupleTableSlot **newSlot) > { > ... > if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts) > result = heap_expand_tuple(&tuple, relation->rd_att); > else > result = heap_copytuple(&tuple); > ReleaseBuffer(buffer); > > There's no explanation why GetTupleForTrigger() needs expanding tuples, > but most other parts of postgres don't. Normally deforming ought to take > care of expanding missing attrs. > > As far as I can tell, the reason that it's needed is that heap_gettar() > wasn't updated along the rest of the functions. heap_deform_tuple(), > heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't. > > That, to me, makes no sense. The reason that we see a difference in > regression test output is that composite_to_json() uses heap_getattr(), > if it used heap_deform_tuple (which'd be considerably faster), we'd get > the default value. > > Am I missing something? Indeed, a hacky patch fixing heap_getattr makes the heap_expand_tuple() in trigger.c unnecessary. Attached. I think we need to do something along those lines. Andrew, I think it'd be good to do a ground up review of the fast defaults patch. There's been a fair number of issues in it, and this is a another pretty fundamental issue. Greetings, Andres Freund
Вложения
Him On 2019-02-01 14:49:05 -0800, Andres Freund wrote: > +#ifdef IAM_THE_WRONG_FIX > if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts) > result = heap_expand_tuple(&tuple, relation->rd_att); > else > result = heap_copytuple(&tuple); > +#else > + result = heap_copytuple(&tuple); > +#endif This was added in commit 7636e5c60fea83a9f3cd2ad278c0819b98941c74 Author: Andrew Dunstan <andrew@dunslane.net> Date: 2018-09-24 16:11:24 -0400 Fast default trigger and expand_tuple fixes Ensure that triggers get properly filled in tuples for the OLD value. Also fix the logic of detecting missing null values. The previous logic failed to detect a missing null column before the first missing column with a default. Fixing this has simplified the logic a bit. Regression tests are added to test changes. This should ensure better coverage of expand_tuple(). Original bug reports, and some code and test scripts from Tomas Vondra Backpatch to release 11. Unfortunately the commit doesn't reference the discussion. That appears to have been at https://postgr.es/m/224e4807-395d-9fc5-2934-d5f85130f1f0%402ndQuadrant.com Greetings, Andres Freund
Hi, On 2019-02-01 08:24:04 -0800, Andres Freund wrote: > While working on the patch to slotify trigger.c I got somewhat confused > by the need to expand tuples in trigger.c: > > static HeapTuple > GetTupleForTrigger(EState *estate, > EPQState *epqstate, > ResultRelInfo *relinfo, > ItemPointer tid, > LockTupleMode lockmode, > TupleTableSlot **newSlot) > { > ... > if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts) > result = heap_expand_tuple(&tuple, relation->rd_att); > else > result = heap_copytuple(&tuple); > ReleaseBuffer(buffer); > > There's no explanation why GetTupleForTrigger() needs expanding tuples, > but most other parts of postgres don't. Normally deforming ought to take > care of expanding missing attrs. > > As far as I can tell, the reason that it's needed is that heap_gettar() > wasn't updated along the rest of the functions. heap_deform_tuple(), > heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't. > > That, to me, makes no sense. The reason that we see a difference in > regression test output is that composite_to_json() uses heap_getattr(), > if it used heap_deform_tuple (which'd be considerably faster), we'd get > the default value. > > Am I missing something? This breaks HOT (and probably also foreign keys), when fast default columns are set to NULL, because HeapDetermineModifiedColumns() gets the values with heap_getattr(), which returns a spurious NULL for the old value (instead of the fast default value). That then would compare equal to the new column value set to NULL. Greetings, Andres Freund
Hi, On 2019-02-02 05:35:21 -0800, Andres Freund wrote: > This breaks HOT (and probably also foreign keys), when fast default > columns are set to NULL, because HeapDetermineModifiedColumns() gets the > values with heap_getattr(), which returns a spurious NULL for the old > value (instead of the fast default value). That then would compare equal > to the new column value set to NULL. Repro: BEGIN; CREATE TABLE t(); INSERT INTO t DEFAULT VALUES; ALTER TABLE t ADD COLUMN a int default 1; CREATE INDEX ON t(a); UPDATE t SET a = NULL; SET LOCAL enable_seqscan = true; SELECT * FROM t WHERE a IS NULL; SET LOCAL enable_seqscan = false; SELECT * FROM t WHERE a IS NULL; ROLLBACK; output: ... UPDATE 1 SET ┌────────┐ │ a │ ├────────┤ │ (null) │ └────────┘ (1 row) SET ┌───┐ │ a │ ├───┤ └───┘ (0 rows) Greetings, Andres Freund
On 2/1/19 5:49 PM, Andres Freund wrote: > Hi, > > On 2019-02-01 08:24:04 -0800, Andres Freund wrote: > > Andrew, I think it'd be good to do a ground up review of the fast > defaults patch. There's been a fair number of issues in it, and this is > a another pretty fundamental issue. > OK. Will try to organise it. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote: > > On 2/1/19 5:49 PM, Andres Freund wrote: > > Hi, > > > > On 2019-02-01 08:24:04 -0800, Andres Freund wrote: > > > > Andrew, I think it'd be good to do a ground up review of the fast > > defaults patch. There's been a fair number of issues in it, and this is > > a another pretty fundamental issue. > > > > > OK. Will try to organise it. Cool. Here's the patch that I, after some commit message polishing, plan to commit to 11 and master to fix the issue at hand. Greetings, Andres Freund
Вложения
On 2019-Feb-05, Andres Freund wrote: > @@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int attnum, bool *isnull); > /* > * Return the missing value of an attribute, or NULL if there isn't one. > */ > -static Datum > +Datum > getmissingattr(TupleDesc tupleDesc, > int attnum, bool *isnull) This is a terrible name for an exported function -- let's change it before it gets exported. Heck, even heap_getmissingattr() would be better. I notice that with this patch, heap_getattr() obtains a new Assert() that the attr being fetched is no further than tupledesc->natts. It previously just returned null for that case. Maybe we should change it so that it returns null if an attr beyond end-of-array is fetched? (I think in non-assert builds, it would dereference past the AttrMissing array.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-02-05 08:57:05 -0800, Andres Freund wrote: > Hi, > > On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote: > > > > On 2/1/19 5:49 PM, Andres Freund wrote: > > > Hi, > > > > > > On 2019-02-01 08:24:04 -0800, Andres Freund wrote: > > > > > > Andrew, I think it'd be good to do a ground up review of the fast > > > defaults patch. There's been a fair number of issues in it, and this is > > > a another pretty fundamental issue. > > > > > > > > > OK. Will try to organise it. > > Cool. Here's the patch that I, after some commit message polishing, plan > to commit to 11 and master to fix the issue at hand. And pushed. - Andres
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> Cool. Here's the patch that I, after some commit message polishing, >> plan to commit to 11 and master to fix the issue at hand. Andres> And pushed. Sorry I didn't spot this earlier, but... in the backpatch to v11, is the expansion of the trigger tuple really unnecessary now? Since heap_getattr is a macro, C-language triggers that aren't recompiled won't see the default? So perhaps the expansion should be left in? -- Andrew (irc:RhodiumToad)
Hi, On 2019-02-06 10:25:56 +0000, Andrew Gierth wrote: > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > >> Cool. Here's the patch that I, after some commit message polishing, > >> plan to commit to 11 and master to fix the issue at hand. > > Andres> And pushed. > > Sorry I didn't spot this earlier, but... in the backpatch to v11, is the > expansion of the trigger tuple really unnecessary now? Since > heap_getattr is a macro, C-language triggers that aren't recompiled > won't see the default? So perhaps the expansion should be left in? There was some IRC conversation about this: RhodiumToad | andres: ping? RhodiumToad | n/m, sent mail andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go | through GetTupleForTrigger(). So extensions need to be recompiled either way. andres | (I'll write that in a bit on list, in meeting now) RhodiumToad | right, but those other calls were already broken. RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical, | pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum RhodiumToad | it's not an encouraged interface I'm mildly disinclined to re-add the heap_expand_tuple, because it's not the only case, and the extensions named above don't seem like they'd specifically affected by the trigger path, but I'm not sure. Greetings, Andres Freund
> > Sorry I didn't spot this earlier, but... in the backpatch to v11, is the > > expansion of the trigger tuple really unnecessary now? Since > > heap_getattr is a macro, C-language triggers that aren't recompiled > > won't see the default? So perhaps the expansion should be left in? > > Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical, > | pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum > > I'm mildly disinclined to re-add the heap_expand_tuple, because it's not > the only case, and the extensions named above don't seem like they'd > specifically affected by the trigger path, but I'm not sure. I'm not following closely enough to say anything about which fix is the best, but if this isn't a "recompile the world" case, we can get the above list of packages rebuilt. It would be rather unpleasant to have to schedule that for 50 packages, though. Are rebuilt extension binaries compatible with older servers that do not have the fix yet? Christoph
Hi, On 2019-02-05 22:44:38 -0300, Alvaro Herrera wrote: > On 2019-Feb-05, Andres Freund wrote: > > > @@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int attnum, bool *isnull); > > /* > > * Return the missing value of an attribute, or NULL if there isn't one. > > */ > > -static Datum > > +Datum > > getmissingattr(TupleDesc tupleDesc, > > int attnum, bool *isnull) > > This is a terrible name for an exported function -- let's change it > before it gets exported. Heck, even heap_getmissingattr() would be > better. I don't really aggree. Note that the relevant datastructure is named AttrMissing, and that the function isn't relevant for heap tuple. So I don't really see what we'd otherwise name it? I think if we wanted to improve this we should start with AttrMissing and not this function. > I notice that with this patch, heap_getattr() obtains a new Assert() > that the attr being fetched is no further than tupledesc->natts. > It previously just returned null for that case. Maybe we should change > it so that it returns null if an attr beyond end-of-array is fetched? > (I think in non-assert builds, it would dereference past the AttrMissing > array.) Hm, it seems like there's plenty issues with accessing datums after the end of the desc before this change, as well as after this change. Note that slot_getattr() (in 11, it looks a bit different in master) already calls getmissingattr(). And that's much more commonly used. Therefore I'm disinclined to see this as a problem? Greetings, Andres Freund
Hi, On 2019-02-06 03:39:19 -0800, Andres Freund wrote: > On 2019-02-06 10:25:56 +0000, Andrew Gierth wrote: > > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > > > >> Cool. Here's the patch that I, after some commit message polishing, > > >> plan to commit to 11 and master to fix the issue at hand. > > > > Andres> And pushed. > > > > Sorry I didn't spot this earlier, but... in the backpatch to v11, is the > > expansion of the trigger tuple really unnecessary now? Since > > heap_getattr is a macro, C-language triggers that aren't recompiled > > won't see the default? So perhaps the expansion should be left in? > > There was some IRC conversation about this: > > RhodiumToad | andres: ping? > RhodiumToad | n/m, sent mail > andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go > | through GetTupleForTrigger(). So extensions need to be recompiled either way. > andres | (I'll write that in a bit on list, in meeting now) > RhodiumToad | right, but those other calls were already broken. > RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix > Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical, > | pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum > RhodiumToad | it's not an encouraged interface > > I'm mildly disinclined to re-add the heap_expand_tuple, because it's not > the only case, and the extensions named above don't seem like they'd > specifically affected by the trigger path, but I'm not sure. I put the expansion back for 11. Seems harmless enough. I'm planning to send a reply to the minor release info to -packagers informing people that extensions compiled after the minor release might not be compatible with older servers (because they depend on getmissingattr() which previously wasn't an exposed function) Greetings, Andres Freund