Обсуждение: fast defaults in heap_getattr vs heap_deform_tuple

Поиск
Список
Период
Сортировка

fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

While working on the patch to slotify trigger.c I got somewhat confused
by the need to expand tuples in trigger.c:

static HeapTuple
GetTupleForTrigger(EState *estate,
                   EPQState *epqstate,
                   ResultRelInfo *relinfo,
                   ItemPointer tid,
                   LockTupleMode lockmode,
                   TupleTableSlot **newSlot)
{
...
    if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
        result = heap_expand_tuple(&tuple, relation->rd_att);
    else
        result = heap_copytuple(&tuple);
    ReleaseBuffer(buffer);

There's no explanation why GetTupleForTrigger() needs expanding tuples,
but most other parts of postgres don't. Normally deforming ought to take
care of expanding missing attrs.

As far as I can tell, the reason that it's needed is that heap_gettar()
wasn't updated along the rest of the functions. heap_deform_tuple(),
heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't.

That, to me, makes no sense. The reason that we see a difference in
regression test output is that composite_to_json() uses heap_getattr(),
if it used heap_deform_tuple (which'd be considerably faster), we'd get
the default value.

Am I missing something?

Greetings,

Andres Freund


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:
> While working on the patch to slotify trigger.c I got somewhat confused
> by the need to expand tuples in trigger.c:
> 
> static HeapTuple
> GetTupleForTrigger(EState *estate,
>                    EPQState *epqstate,
>                    ResultRelInfo *relinfo,
>                    ItemPointer tid,
>                    LockTupleMode lockmode,
>                    TupleTableSlot **newSlot)
> {
> ...
>     if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
>         result = heap_expand_tuple(&tuple, relation->rd_att);
>     else
>         result = heap_copytuple(&tuple);
>     ReleaseBuffer(buffer);
> 
> There's no explanation why GetTupleForTrigger() needs expanding tuples,
> but most other parts of postgres don't. Normally deforming ought to take
> care of expanding missing attrs.
> 
> As far as I can tell, the reason that it's needed is that heap_gettar()
> wasn't updated along the rest of the functions. heap_deform_tuple(),
> heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't.
> 
> That, to me, makes no sense. The reason that we see a difference in
> regression test output is that composite_to_json() uses heap_getattr(),
> if it used heap_deform_tuple (which'd be considerably faster), we'd get
> the default value.
> 
> Am I missing something?

Indeed, a hacky patch fixing heap_getattr makes the heap_expand_tuple()
in trigger.c unnecessary. Attached. I think we need to do something
along those lines.

Andrew, I think it'd be good to do a ground up review of the fast
defaults patch. There's been a fair number of issues in it, and this is
a another pretty fundamental issue.

Greetings,

Andres Freund

Вложения

Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Him

On 2019-02-01 14:49:05 -0800, Andres Freund wrote:
> +#ifdef IAM_THE_WRONG_FIX
>      if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
>          result = heap_expand_tuple(&tuple, relation->rd_att);
>      else
>          result = heap_copytuple(&tuple);
> +#else
> +    result = heap_copytuple(&tuple);
> +#endif

This was added in

commit 7636e5c60fea83a9f3cd2ad278c0819b98941c74
Author: Andrew Dunstan <andrew@dunslane.net>
Date:   2018-09-24 16:11:24 -0400

    Fast default trigger and expand_tuple fixes
    
    Ensure that triggers get properly filled in tuples for the OLD value.
    Also fix the logic of detecting missing null values. The previous logic
    failed to detect a missing null column before the first missing column
    with a default. Fixing this has simplified the logic a bit.
    
    Regression tests are added to test changes. This should ensure better
    coverage of expand_tuple().
    
    Original bug reports, and some code and test scripts from Tomas Vondra
    
    Backpatch to release 11.

Unfortunately the commit doesn't reference the discussion. That appears
to have been at
https://postgr.es/m/224e4807-395d-9fc5-2934-d5f85130f1f0%402ndQuadrant.com

Greetings,

Andres Freund


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:
> While working on the patch to slotify trigger.c I got somewhat confused
> by the need to expand tuples in trigger.c:
> 
> static HeapTuple
> GetTupleForTrigger(EState *estate,
>                    EPQState *epqstate,
>                    ResultRelInfo *relinfo,
>                    ItemPointer tid,
>                    LockTupleMode lockmode,
>                    TupleTableSlot **newSlot)
> {
> ...
>     if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
>         result = heap_expand_tuple(&tuple, relation->rd_att);
>     else
>         result = heap_copytuple(&tuple);
>     ReleaseBuffer(buffer);
> 
> There's no explanation why GetTupleForTrigger() needs expanding tuples,
> but most other parts of postgres don't. Normally deforming ought to take
> care of expanding missing attrs.
> 
> As far as I can tell, the reason that it's needed is that heap_gettar()
> wasn't updated along the rest of the functions. heap_deform_tuple(),
> heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't.
> 
> That, to me, makes no sense. The reason that we see a difference in
> regression test output is that composite_to_json() uses heap_getattr(),
> if it used heap_deform_tuple (which'd be considerably faster), we'd get
> the default value.
> 
> Am I missing something?

This breaks HOT (and probably also foreign keys), when fast default
columns are set to NULL, because HeapDetermineModifiedColumns() gets the
values with heap_getattr(), which returns a spurious NULL for the old
value (instead of the fast default value). That then would compare equal
to the new column value set to NULL.

Greetings,

Andres Freund


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-02 05:35:21 -0800, Andres Freund wrote:
> This breaks HOT (and probably also foreign keys), when fast default
> columns are set to NULL, because HeapDetermineModifiedColumns() gets the
> values with heap_getattr(), which returns a spurious NULL for the old
> value (instead of the fast default value). That then would compare equal
> to the new column value set to NULL.

Repro:

BEGIN;
CREATE TABLE t();
INSERT INTO t DEFAULT VALUES;
ALTER TABLE t ADD COLUMN a int default 1;
CREATE INDEX ON t(a);
UPDATE t SET a = NULL;

SET LOCAL enable_seqscan = true;
SELECT * FROM t WHERE a IS NULL;
SET LOCAL enable_seqscan = false;
SELECT * FROM t WHERE a IS NULL;
ROLLBACK;

output:
...
UPDATE 1
SET
┌────────┐
│   a    │
├────────┤
│ (null) │
└────────┘
(1 row)

SET
┌───┐
│ a │
├───┤
└───┘
(0 rows)



Greetings,

Andres Freund


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andrew Dunstan
Дата:
On 2/1/19 5:49 PM, Andres Freund wrote:
> Hi,
>
> On 2019-02-01 08:24:04 -0800, Andres Freund wrote:
>
> Andrew, I think it'd be good to do a ground up review of the fast
> defaults patch. There's been a fair number of issues in it, and this is
> a another pretty fundamental issue.
>


OK. Will try to organise it.


cheers


andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote:
> 
> On 2/1/19 5:49 PM, Andres Freund wrote:
> > Hi,
> >
> > On 2019-02-01 08:24:04 -0800, Andres Freund wrote:
> >
> > Andrew, I think it'd be good to do a ground up review of the fast
> > defaults patch. There's been a fair number of issues in it, and this is
> > a another pretty fundamental issue.
> >
> 
> 
> OK. Will try to organise it.

Cool. Here's the patch that I, after some commit message polishing, plan
to commit to 11 and master to fix the issue at hand.

Greetings,

Andres Freund

Вложения

Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Alvaro Herrera
Дата:
On 2019-Feb-05, Andres Freund wrote:

> @@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int attnum, bool *isnull);
>  /*
>   * Return the missing value of an attribute, or NULL if there isn't one.
>   */
> -static Datum
> +Datum
>  getmissingattr(TupleDesc tupleDesc,
>                 int attnum, bool *isnull)

This is a terrible name for an exported function -- let's change it
before it gets exported.  Heck, even heap_getmissingattr() would be
better.

I notice that with this patch, heap_getattr() obtains a new Assert()
that the attr being fetched is no further than tupledesc->natts.
It previously just returned null for that case.  Maybe we should change
it so that it returns null if an attr beyond end-of-array is fetched?
(I think in non-assert builds, it would dereference past the AttrMissing
array.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
On 2019-02-05 08:57:05 -0800, Andres Freund wrote:
> Hi,
> 
> On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote:
> > 
> > On 2/1/19 5:49 PM, Andres Freund wrote:
> > > Hi,
> > >
> > > On 2019-02-01 08:24:04 -0800, Andres Freund wrote:
> > >
> > > Andrew, I think it'd be good to do a ground up review of the fast
> > > defaults patch. There's been a fair number of issues in it, and this is
> > > a another pretty fundamental issue.
> > >
> > 
> > 
> > OK. Will try to organise it.
> 
> Cool. Here's the patch that I, after some commit message polishing, plan
> to commit to 11 and master to fix the issue at hand.

And pushed.

- Andres


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andrew Gierth
Дата:
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:

 >> Cool. Here's the patch that I, after some commit message polishing,
 >> plan to commit to 11 and master to fix the issue at hand.

 Andres> And pushed.

Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
expansion of the trigger tuple really unnecessary now? Since
heap_getattr is a macro, C-language triggers that aren't recompiled
won't see the default? So perhaps the expansion should be left in?

-- 
Andrew (irc:RhodiumToad)


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-06 10:25:56 +0000, Andrew Gierth wrote:
> >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:
>
>  >> Cool. Here's the patch that I, after some commit message polishing,
>  >> plan to commit to 11 and master to fix the issue at hand.
>
>  Andres> And pushed.
>
> Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
> expansion of the trigger tuple really unnecessary now? Since
> heap_getattr is a macro, C-language triggers that aren't recompiled
> won't see the default? So perhaps the expansion should be left in?

There was some IRC conversation about this:

RhodiumToad | andres: ping?
RhodiumToad | n/m, sent mail
     andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go
            | through GetTupleForTrigger(). So extensions need to be recompiled either way.
     andres | (I'll write that in a bit on list, in meeting now)
RhodiumToad | right, but those other calls were already broken.
RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix
       Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical,
            | pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
RhodiumToad | it's not an encouraged interface

I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
the only case, and the extensions named above don't seem like they'd
specifically affected by the trigger path, but I'm not sure.

Greetings,

Andres Freund


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Christoph Berg
Дата:
> > Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
> > expansion of the trigger tuple really unnecessary now? Since
> > heap_getattr is a macro, C-language triggers that aren't recompiled
> > won't see the default? So perhaps the expansion should be left in?
> 
>        Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical,
>             | pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
> 
> I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
> the only case, and the extensions named above don't seem like they'd
> specifically affected by the trigger path, but I'm not sure.

I'm not following closely enough to say anything about which fix is
the best, but if this isn't a "recompile the world" case, we can get
the above list of packages rebuilt. It would be rather unpleasant to
have to schedule that for 50 packages, though.

Are rebuilt extension binaries compatible with older servers that do
not have the fix yet?

Christoph


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-05 22:44:38 -0300, Alvaro Herrera wrote:
> On 2019-Feb-05, Andres Freund wrote:
> 
> > @@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int attnum, bool *isnull);
> >  /*
> >   * Return the missing value of an attribute, or NULL if there isn't one.
> >   */
> > -static Datum
> > +Datum
> >  getmissingattr(TupleDesc tupleDesc,
> >                 int attnum, bool *isnull)
> 
> This is a terrible name for an exported function -- let's change it
> before it gets exported.  Heck, even heap_getmissingattr() would be
> better.

I don't really aggree. Note that the relevant datastructure is named
AttrMissing, and that the function isn't relevant for heap tuple. So I
don't really see what we'd otherwise name it?  I think if we wanted to
improve this we should start with AttrMissing and not this function.


> I notice that with this patch, heap_getattr() obtains a new Assert()
> that the attr being fetched is no further than tupledesc->natts.
> It previously just returned null for that case.  Maybe we should change
> it so that it returns null if an attr beyond end-of-array is fetched?
> (I think in non-assert builds, it would dereference past the AttrMissing
> array.)

Hm, it seems like there's plenty issues with accessing datums after the
end of the desc before this change, as well as after this change. Note
that slot_getattr() (in 11, it looks a bit different in master) already
calls getmissingattr(). And that's much more commonly used. Therefore
I'm disinclined to see this as a problem?

Greetings,

Andres Freund


Re: fast defaults in heap_getattr vs heap_deform_tuple

От
Andres Freund
Дата:
Hi,

On 2019-02-06 03:39:19 -0800, Andres Freund wrote:
> On 2019-02-06 10:25:56 +0000, Andrew Gierth wrote:
> > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:
> >
> >  >> Cool. Here's the patch that I, after some commit message polishing,
> >  >> plan to commit to 11 and master to fix the issue at hand.
> >
> >  Andres> And pushed.
> >
> > Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
> > expansion of the trigger tuple really unnecessary now? Since
> > heap_getattr is a macro, C-language triggers that aren't recompiled
> > won't see the default? So perhaps the expansion should be left in?
> 
> There was some IRC conversation about this:
> 
> RhodiumToad | andres: ping?
> RhodiumToad | n/m, sent mail
>      andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go
>             | through GetTupleForTrigger(). So extensions need to be recompiled either way.
>      andres | (I'll write that in a bit on list, in meeting now)
> RhodiumToad | right, but those other calls were already broken.
> RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix
>        Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical,
>             | pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
> RhodiumToad | it's not an encouraged interface
> 
> I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
> the only case, and the extensions named above don't seem like they'd
> specifically affected by the trigger path, but I'm not sure.

I put the expansion back for 11. Seems harmless enough.  I'm planning to
send a reply to the minor release info to -packagers informing people
that extensions compiled after the minor release might not be compatible
with older servers (because they depend on getmissingattr() which
previously wasn't an exposed function)

Greetings,

Andres Freund