Thread: BUG #18205: Performance regression with NOT NULL checks.
The following bug has been logged on the website:

Bug reference:      18205
Logged by:          Daniel Migowski
Email address:      dmigowski@ikoffice.de
PostgreSQL version: 15.5
Operating system:   Windows + Linux
Description:

I found out that selecting from a wide table with a single not-null WHERE
clause leads to a severe performance regression when upgrading from
PostgreSQL 9.5 to PostgreSQL 15. I am running the following query without
parallel query enabled, because in a real-world case we cannot go parallel
anyway:

    select count(id) FROM testtable t WHERE t.y IS NULL;

Here are the timings of the query (three runs):

    PG 9.5: 0,33s 0,33s 0,33s
    PG 15:  0,43s 0,44s 0,43s

Please create a test table this way:

    drop table if exists testtable cascade;
    SELECT generate_series::int4 as id,
           null::int4 a, null::int4 b, null::int4 c, null::int4 d,
           null::int4 e, null::int4 f, null::int4 g, null::int4 h,
           null::int4 i, null::int4 j, null::int4 k, null::int4 l,
           null::int4 m, null::int4 n, null::int4 o, null::int4 p,
           null::int4 q, null::int4 r, null::int4 s, null::int4 t,
           null::int4 u, null::int4 v, null::int4 w, null::int4 x,
           null::int4 y, null::int4 z
    into testtable
    FROM generate_series(1,6000000,1);

    -- On PG15:
    set max_parallel_workers = 0;
    set max_parallel_workers_per_gather = 0;

I already talked to depesz about this and he timed the query on different
PG versions, which seem to be the versions from PostgreSQL's Debian
repository:

- 9.5.25 : 590.958 ms
- 9.6.24 : 607.228 ms
- 10.23  : 820.779 ms
- 11.22  : 746.122 ms
- 12.17  : 829.786 ms
- 13.13  : 804.878 ms
- 14.10  : 772.415 ms
- 15.5   : 774.749 ms
- 16.1   : 746.802 ms

Starting with PostgreSQL 10 there is a severe degradation in performance.
I just cannot believe that later versions of PostgreSQL are so much slower
than the original versions.
PG Bug reporting form <noreply@postgresql.org> writes:
> I found out that selecting from a wide table with a single not-null where
> clause leads to severe performance regression when upgrading from PostgreSQL
> 9.5 to PostgreSQL 15.

I spent some time poking into this.  "git bisect" pins the blame on

    commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755
    Author: Andres Freund <andres@anarazel.de>
    Date:   Tue Mar 14 15:45:36 2017 -0700

        Faster expression evaluation and targetlist projection.

The time needed for the seqscan gets about 50% worse at that commit
(on my test machine, anyway):

    -> Seq Scan on testtable t (cost=0.00..51892.80 rows=9730 width=4)
       (actual time=0.010..204.937 rows=6000000 loops=1)
         Filter: (y IS NULL)

versus

    -> Seq Scan on testtable t (cost=0.00..51892.80 rows=9730 width=4)
       (actual time=0.013..317.069 rows=6000000 loops=1)
         Filter: (y IS NULL)

"perf" says that the extra time is mostly spent in slot_deform_tuple's
inner loop:

    for (; attnum < natts; attnum++)
    {
        Form_pg_attribute thisatt = att[attnum];

        if (hasnulls && att_isnull(attnum, bp))
        {
            values[attnum] = (Datum) 0;
            isnull[attnum] = true;
            slow = true;        /* can't use attcacheoff anymore */
            continue;
        }
        ...

which confused me, because that code doesn't look materially different
in v10 than 9.6.  I eventually realized that the reason is that we reach
slot_deform_tuple with natts = 26 in the new code, but in the old code
we do not, thanks to this short-circuit in slot_getattr:

    /*
     * check if target attribute is null: no point in groveling through tuple
     */
    if (HeapTupleHasNulls(tuple) && att_isnull(attnum - 1, tup->t_bits))
    {
        *isnull = true;
        return (Datum) 0;
    }

So that results in not having to deconstruct most of the tuple,
whereas in the new code we do have to, thanks to b8d7f053c's
decision to batch all the variable-value-extraction work.

This is a pretty narrow corner case: it would only matter if the
column you're testing for null-ness is far past any other column
the query needs to fetch.  So I'm not sure that it's worth doing
anything about.  You could imagine special processing for
NullTest-on-a-simple-Var: exclude the Var from the ones that we
extract normally and instead compile it into some sort of "is column k
NULL" test on the HeapTuple.  But that seems messy, and it could be a
significant pessimization for storage methods that aren't like heapam.

On the whole I'm inclined to say "sorry, that's the price of
progress".  But it is a bit sad that a patch intended to be a
performance win shows a loss this big on a (presumably) real-world
case.

			regards, tom lane
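[Editor's sketch] To make the contrast concrete, here is a minimal C
sketch of the two strategies being compared, assuming a NullTest on
attribute 26 of the test table. This is illustrative code only, not the
actual executor implementation; the two helper names are invented:

    #include "executor/tuptable.h"

    /*
     * Old-style evaluation: probe just the one attribute.  slot_getattr()
     * can answer from the null bitmap alone, without deforming the tuple.
     */
    static bool
    nulltest_old_style(TupleTableSlot *slot, int attnum)
    {
        bool        isnull;

        (void) slot_getattr(slot, attnum, &isnull);
        return isnull;
    }

    /*
     * New-style evaluation: the EEOP_SCAN_FETCHSOME step batches the
     * work, deforming attributes 1..attnum before the NullTest runs.
     */
    static bool
    nulltest_new_style(TupleTableSlot *slot, int attnum)
    {
        slot_getsomeattrs(slot, attnum);
        return slot->tts_isnull[attnum - 1];    /* 0-based array */
    }

With 25 all-null columns ahead of y, the old style never walks them,
while the new style pays for every one of them on every row.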
Hi,

On 2023-11-19 14:08:05 -0500, Tom Lane wrote:
> So that results in not having to deconstruct most of the tuple,
> whereas in the new code we do have to, thanks to b8d7f053c's
> decision to batch all the variable-value-extraction work.

Yea, I think we were aware at the time that this does have downsides -
it's just that the worst-case behaviour of *not* batching is much worse
than the worst-case downside of batching.

> This is a pretty narrow corner case: it would only matter if the
> column you're testing for null-ness is far past any other column
> the query needs to fetch.  So I'm not sure that it's worth doing
> anything about.  You could imagine special processing for
> NullTest-on-a-simple-Var: exclude the Var from the ones that we
> extract normally and instead compile it into some sort of "is column k
> NULL" test on the HeapTuple.

We actually did add fastpaths for a few similar cases: ExecJustInnerVar()
etc will just use slot_getattr(). These can be used when the result is
just a single variable. However, the goal there was more to avoid
"interpreter startup" overhead, rather than evaluation overhead.

Those fastpaths don't match in this case though, the generated "program"
is:

    EEOP_SCAN_FETCHSOME
    EEOP_SCAN_VAR
    EEOP_NULLTEST_ISNULL
    EEOP_QUAL
    EEOP_DONE

which matches none of the fastpaths we recognize. In fact, I don't think
any of the fastpaths currently can be applied to quals, which is somewhat
sad :(

Right now the "pattern matching" happens in ExecReadyInterpretedExpr().
For portions that makes sense (as they're interpretation specific), but
the transformation from EEOP_SCAN_FETCHSOME to just fetching a single
column imo should be moved into execExpr.c, by making expr_setup_walker()
a bit smarter. I think that might help in a fair number of cases - quals
on a single column aren't exactly rare.

In this specific case we also could elide the EEOP_QUAL, the NULLTEST
cannot return a NULL, so we don't need any qual-specific behaviour. For
this case alone it'd probably not be worth tracking whether something can
be NULL. But it could be worthwhile if we made it more generic, e.g.
eliding strictness checks before function calls could be a nice win.

> But that seems messy, and it could be a significant pessimization for
> storage methods that aren't like heapam.

My concern is more that they're pretty narrow - they apply only if a
single column is referenced. If we just need attributes 100 and 101, it's
still expensive to deform the leading columns.

To address that, I think we could benefit from a tuple deforming routine
that's more aware of which columns are needed. Even just omitting storing
values in tts_values/nulls that aren't needed yielded decent wins when I
experimented in the past.

The hard part of course is to figure out when to use a "selective"
deforming routine, which will have *worse* performance if most columns
are used, and when to just deform everything up to natts.

I think the current deforming logic is actually optimized towards heapam
to a problematic degree - deforming leading columns in a columnar
database hurts really badly.

> On the whole I'm inclined to say "sorry, that's the price of
> progress".

Agreed, we can't really go back to deforming individual columns more
widely - that can get *really* expensive. But I think we can do plenty to
improve the performance of this specific case, both by improving the
generated expression program and:

> But it is a bit sad that a patch intended to be a
> performance win shows a loss this big on a (presumably) real-world
> case.
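[Editor's sketch] A hypothetical detection of that program, written in
the style of the existing fastpath checks in ExecReadyInterpretedExpr().
The surrounding code and the proposed replacement are assumptions, not a
real patch:

    /*
     * Hypothetical fastpath detection for a single-column NULL-test
     * qual.  Sketched after the opcode comparisons the existing
     * fastpath checks perform; not actual PostgreSQL code.
     */
    if (state->steps_len == 5 &&
        state->steps[0].opcode == EEOP_SCAN_FETCHSOME &&
        state->steps[1].opcode == EEOP_SCAN_VAR &&
        state->steps[2].opcode == EEOP_NULLTEST_ISNULL &&
        state->steps[3].opcode == EEOP_QUAL)
    {
        /*
         * Could be evaluated as a single "is column k NULL" probe on
         * the scan tuple, skipping both the batched deforming and,
         * since a NullTest never returns NULL itself, the EEOP_QUAL
         * step.
         */
    }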
I think we might be able to micro-optimize that code a bit to reduce the
decrease in performance. Neither gcc nor clang is able to optimize the
null-check meaningfully, and there are a number of dependent
instructions:

    if (hasnulls && att_isnull(attnum, bp))
    {
        values[attnum] = (Datum) 0;
        isnull[attnum] = true;
        slow = true;        /* can't use attcacheoff anymore */
        continue;
    }

    static inline bool
    att_isnull(int ATT, const bits8 *BITS)
    {
        return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
    }

What if we instead load 8 bytes of the bitmap into a uint64 before
entering the loop, and shift an "index" mask into the bitmap by one each
iteration through the loop? Then we'd only need to access the bitmap
every 64 attributes.

I think we could also get rid of the hasnulls check in each iteration
that way, by just checking hasnulls whenever we load portions of the
bitmap into a local variable, and loading "no-nulls" into it when
!hasnulls.

Greetings,

Andres Freund
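[Editor's sketch] A minimal sketch of that idea, assuming a little-endian
machine and that the trailing word of the bitmap can safely be over-read
(a real patch would need a tail guard and an endianness-aware load; this
is untested illustration, not a proposed diff):

    uint64      nullword = 0;   /* cached 64 bits of the null bitmap */
    uint64      nullmask = 0;   /* == 0 forces a (re)load */

    for (; attnum < natts; attnum++)
    {
        if (nullmask == 0)
        {
            /* refill once per 64 attributes; "no nulls" if no bitmap */
            if (hasnulls)
                memcpy(&nullword, bp + (attnum / 64) * 8, sizeof(nullword));
            else
                nullword = ~(uint64) 0;
            nullmask = (uint64) 1 << (attnum % 64);
        }

        /* bit clear => NULL, matching att_isnull()'s convention */
        if ((nullword & nullmask) == 0)
        {
            values[attnum] = (Datum) 0;
            isnull[attnum] = true;
            slow = true;        /* can't use attcacheoff anymore */
            nullmask <<= 1;
            continue;
        }
        nullmask <<= 1;

        /* ... deform the non-null attribute as before ... */
    }

When nullmask overflows to zero after 64 shifts, the next iteration
reloads the following word, so the bitmap is touched once per 64
attributes rather than once per attribute, and the hasnulls branch moves
out of the per-attribute path.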
Thank you for the detailed explanation. It surely has a reason to remove
the previously used short-circuit in slot_getattr, at least in the
special case when we

And yes, in my real-world use case which I tried to understand, I have a
wide table with 81 columns, and I am using columns 1, 43, 18, 75 and
filter by attributes 82, 42, and 24.

Don't know how to handle this, maybe I should rework the layout of all my
tables and move the mostly-used columns to the beginning.
Andres Freund <andres@anarazel.de> writes:
> On 2023-11-19 14:08:05 -0500, Tom Lane wrote:
>> So that results in not having to deconstruct most of the tuple,
>> whereas in the new code we do have to, thanks to b8d7f053c's
>> decision to batch all the variable-value-extraction work.

> Yea, I think we were aware at the time that this does have downsides -
> it's just that the worst-case behaviour of *not* batching is much worse
> than the worst-case downside of batching.

Agreed.  Still ...

> We actually did add fastpaths for a few similar cases: ExecJustInnerVar()
> etc will just use slot_getattr(). These can be used when the result is
> just a single variable. However, the goal there was more to avoid
> "interpreter startup" overhead, rather than evaluation overhead.

Yeah.  Also, if I'm reading the example appropriately, Daniel's case
*does* involve fetching more than a single column --- but the other ones
are up near the start so we didn't use to have to deform very much of
the tuple.

> What if we instead load 8 bytes of the bitmap into a uint64 before
> entering the loop, and shift an "index" mask into the bitmap by one
> each iteration through the loop?

Meh.  Seems like a micro-optimization that does nothing for the big-O
problem.  One thing to think about is that I suspect "all the columns
are null" is just a simple test case and not very representative of
the real-world problem.  In the real case, probably quite a few of
the leading columns are non-null, which would make Daniel's issue
even worse because slot_deform_tuple would have to do significantly
more work that it didn't do before.  Shaving cycles off the null-column
fast path would be proportionally less useful too.

It might well be that what you suggest is worth doing just to cut
the cost of slot_deform_tuple across the board, but I don't think
it's an answer to this complaint specifically.

			regards, tom lane
Hi,

On 2023-11-19 21:15:37 +0000, Daniel Migowski wrote:
> Thank you for the detailed explanation. It surely has a reason to remove
> the previously used short-circuit in slot_getattr, at least in the
> special case when we

The short-circuit in slot_getattr() still exists, we just don't reach it
in this case.

> And yes, in my real-world use case which I tried to understand, I have a
> wide table with 81 columns, and I am using columns 1, 43, 18, 75 and
> filter by attributes 82, 42, and 24.

Are most of the columns NULL or not?

> Don't know how to handle this, maybe I should rework the layout of all my
> tables and move the mostly-used columns to the beginning.

It might also be worth splitting the table into multiple tables and
joining when necessary. Independent of this issue, relatively wide
tuples, where you don't use most of the columns, will cause unnecessary
IO.

Greetings,

Andres
Hi,

On 2023-11-19 14:17:44 -0800, Andres Freund wrote:
> On 2023-11-19 21:15:37 +0000, Daniel Migowski wrote:
> > And yes, in my real-world use case which I tried to understand, I have
> > a wide table with 81 columns, and I am using columns 1, 43, 18, 75 and
> > filter by attributes 82, 42, and 24.
>
> Are most of the columns NULL or not?

Another question: In the real query, how selective is the WHERE clause?
In your test query all rows are returned and you have no columns in the
select list, but it doesn't sound like that's quite your real workload...

Greetings,

Andres Freund
Hi,

On 2023-11-19 16:30:49 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > We actually did add fastpaths for a few similar cases: ExecJustInnerVar()
> > etc will just use slot_getattr(). These can be used when the result is
> > just a single variable. However, the goal there was more to avoid
> > "interpreter startup" overhead, rather than evaluation overhead.
>
> Yeah.  Also, if I'm reading the example appropriately, Daniel's case
> *does* involve fetching more than a single column --- but the other ones
> are up near the start so we didn't use to have to deform very much of
> the tuple.

Hm - I see. The initial report didn't make it sound like that, but the
follow-up clarified that aspect.

> > What if we instead load 8 bytes of the bitmap into a uint64 before
> > entering the loop, and shift an "index" mask into the bitmap by one
> > each iteration through the loop?
>
> Meh.  Seems like a micro-optimization that does nothing for the big-O
> problem.  One thing to think about is that I suspect "all the columns
> are null" is just a simple test case and not very representative of
> the real-world problem.  In the real case, probably quite a few of
> the leading columns are non-null, which would make Daniel's issue
> even worse because slot_deform_tuple would have to do significantly
> more work that it didn't do before.  Shaving cycles off the null-column
> fast path would be proportionally less useful too.

It doesn't make it algorithmically better, you're right - but I think
it's quite noticeable even in the case of the other columns having
values.

I changed the test to insert 0 into all columns other than y, and changed
the WHERE clause to IS NOT NULL, to avoid the overhead of the aggregation
path. Profile:

    - 91.27%  0.00%  postgres  postgres  [.] ExecProcNode (inlined)
         ExecProcNode (inlined)
       - ExecScan
          - 60.21% ExecQual (inlined)
             - ExecEvalExprSwitchContext (inlined)
                - 59.70% ExecInterpExpr
                   - 54.05% slot_getsomeattrs (inlined)
                      - 53.52% slot_getsomeattrs_int
                         - 52.93% tts_buffer_heap_getsomeattrs
                            - 52.88% slot_deform_heap_tuple (inlined)
                               + 12.39% fetch_att (inlined)
                               + 12.26% att_isnull (inlined)
                               + 0.14% asm_sysvec_apic_timer_interrupt
                   + 1.60% BoolGetDatum (inlined)
                   + 0.01% asm_sysvec_apic_timer_interrupt
                     0.35% MemoryContextSwitchTo (inlined)
          - 30.47% ExecScanFetch (inlined)
             + 29.87% SeqNext
             + 0.01% asm_sysvec_apic_timer_interrupt
            0.23% MemoryContextReset
       + 0.01% asm_sysvec_apic_timer_interrupt

So even here we spend a decent amount of the time in null bitmap
handling.

I also am not so sure you're right that accessing multiple columns makes
the slot_attisnull() path even more advantageous - it's considerably
slower when accessing multiple columns, so if you actually access more
columns, the benefit will shrink. Particularly if there are
variable-width columns that need to be skipped over and such.

> I have a wide table with 81 columns, and I am using columns 1, 43, 18, 75
> and filter by attributes 82, 42, and 24.

Projecting four columns and filtering by three gets a fair bit more
expensive if you actually deform them individually.

> It might well be that what you suggest is worth doing just to cut
> the cost of slot_deform_tuple across the board, but I don't think
> it's an answer to this complaint specifically.

I did suggest multiple things for a reason :). slot_deform_tuple() is
generally a pretty big bottleneck, so efficiency improvements would be
interesting...
And if we were to only deform the columns we actually need, handling the
null bitmap more efficiently becomes more important, so it could even
help cases like this.

Greetings,

Andres Freund
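[Editor's sketch] A rough sketch of what such selective deforming could
look like, assuming a caller-supplied set of needed attribute numbers.
This is illustration only; the real slot_deform_tuple() also maintains
attcacheoff and slow-path state, which this omits:

    /*
     * Hypothetical "selective" variant of slot_deform_tuple's inner
     * loop: every attribute must still be walked to find the next
     * offset, but values/isnull are only stored for attributes in
     * "needed_attrs".  Not actual PostgreSQL code.
     */
    for (; attnum < natts; attnum++)
    {
        Form_pg_attribute thisatt = TupleDescAttr(tupleDesc, attnum);
        bool        wanted = bms_is_member(attnum, needed_attrs);

        if (hasnulls && att_isnull(attnum, bp))
        {
            if (wanted)
            {
                values[attnum] = (Datum) 0;
                isnull[attnum] = true;
            }
            continue;           /* NULLs consume no storage */
        }

        /* align to this attribute's start */
        off = att_align_pointer(off, thisatt->attalign, thisatt->attlen,
                                tp + off);

        if (wanted)
        {
            values[attnum] = fetch_att(tp + off, thisatt->attbyval,
                                       thisatt->attlen);
            isnull[attnum] = false;
        }

        /* skip over the value either way */
        off = att_addlength_pointer(off, thisatt->attlen, tp + off);
    }

The saving here is the store into tts_values/tts_isnull (and the fetch)
for unwanted columns; the offset walk remains, which is why it can lose
when most columns are referenced.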
Hi,

On 2023-11-19 14:41:47 -0800, Andres Freund wrote:
> It doesn't make it algorithmically better, you're right - but I think
> it's quite noticeable even in the case of the other columns having
> values.
>
> I changed the test to insert 0 into all columns other than y, and
> changed the WHERE clause to IS NOT NULL, to avoid the overhead of the
> aggregation path. Profile:
>
> [...]
>
> So even here we spend a decent amount of the time in null bitmap
> handling.

If I prewarm the data into shared buffers and change the table so there
is a NULL in one of the leading columns, this changes to:

    - 94.15%  0.00%  postgres  postgres  [.] ExecProcNode (inlined)
         ExecProcNode (inlined)
       - ExecScan
          - 79.15% ExecQual (inlined)
             - ExecEvalExprSwitchContext (inlined)
                - 78.50% ExecInterpExpr
                   - 71.62% slot_getsomeattrs (inlined)
                      - 70.52% slot_getsomeattrs_int
                         - 69.79% tts_buffer_heap_getsomeattrs
                            - 69.56% slot_deform_heap_tuple (inlined)
                               + 20.25% att_isnull (inlined)
                               + 9.45% fetch_att (inlined)
                               + 0.39% asm_sysvec_apic_timer_interrupt
                   + 1.88% BoolGetDatum (inlined)
                   + 0.09% asm_sysvec_apic_timer_interrupt
                     0.51% MemoryContextSwitchTo (inlined)
                + 0.02% asm_sysvec_apic_timer_interrupt
          + 14.39% ExecScanFetch (inlined)
            0.14% MemoryContextReset
       + 0.03% asm_sysvec_apic_timer_interrupt

Reducing the time spent in att_isnull() wouldn't get us back to pre-v10
timings, but it'd certainly help to close the gap. Of course you can make
the difference more extreme by adding a lot more leading columns, but
still.

Greetings,

Andres Freund
Hello Andres,

Half of them are null on average, and 30% are even not null. The test
case I sent you was artificial, to show the problem, but it's the same
slower behaviour when I fill all the columns with values except the one
in the where clause.

Greetings,

Daniel
Hello Andres,

In this case the NOT NULL check matches all columns, but the other where
clauses only match 30%. The software is used by more than a hundred
customers and they all use different features, but my queries must of
course work in every case.

I do indeed have a lot of tables with 80 columns or so, and will now try
to restructure them so the important columns come sooner. But in many
queries we select maybe 5-20 fields and filter by an arbitrary field, so
sparse selection of attributes would be VERY beneficial for us in many of
the expensive queries. Queries returning many rows or joining and
aggregating data mostly return just part of the table, and that is where
the performance is needed for us.

Kind regards,
Daniel Migowski

PS: Btw, many queries that result in showing a list of stuff in the GUI,
like an invoice list with customers, show the name and number of the
customer from the customer table. We will now ensure that these "display
name" attributes are at the start of the table so they can be retrieved
efficiently. Another example of sparse column selection, but with the
right organisation we don't need to optimize here.