Обсуждение: Trying to reduce per tuple overhead (bitmap)

Поиск
Список
Период
Сортировка

Trying to reduce per tuple overhead (bitmap)

От
Manfred Koizar
Дата:
Tom Lane wrote in another tread:
> PS: I did like your point about BITMAPLEN; I think that might be
> a free savings.  I was waiting for you to bring it up on hackers
> before commenting though...
So here we go...

Hi,

in htup.h MinHeapTupleBitmapSize is defined to be 32,  i.e. the bitmap
uses at least so many bits, if the tuple has at least one null
attribute.  The bitmap starts at offset 31 in the tuple header.  The
macro BITMAPLEN calculates, for a given number of attributes NATTS,
the length of the bitmap in bytes.  BITMAPLEN is the smallest number n
divisible by 4, so that 8*n >= NATTS.

The size of the tuple header is rounded up to a multiple of 4 (on a
typical(?) architecture) by MAXALIGN(...).  So we get:

NATTS  BITMAPLEN  THSIZE 8        4        3616        4        3633        8        40

I don't quite understand the definition of BITMAPLEN:

#define BITMAPLEN(NATTS) \  ((((((int)(NATTS) - 1) >> 3) + 4 - (MinHeapTupleBitmapSize >> 3)) \   & ~03) +
(MinHeapTupleBitmapSize>> 3)) 

AFAICS only for MinHeapTupleBitmapSize == 32 we get a meaningful
result, namely a multiple of MinHeapTupleBitmapSize, converted from a
number of bits to a number of bytes.  If this is true, we dont't need
the "+ 4 - (MinHeapTupleBitmapSize >> 3)" and the definition could be
simplified to(((((int)(NATTS) - 1) >> 3) & ~03) + 4)

Some examples, writing MBMB for (MinHeapTupleBitmapSize >> 3):

MBMB = 4:
((((NATTS - 1) >> 3) + 4 - MBMB) & ~03) + MBMB   32     31     3    7     3      0       4   33     32     4    8     4
    4       8   64     63     7   11     7      4       8   65     64     8   12     8      8      12 

MBMB = 1:
((((NATTS - 1) >> 3) + 4 - MBMB) & ~03) + MBMB    8      7     0    4     3      0       1    9      8     1    5     4
    4       5   32     31     3    7     6      4       5   33     32     4    8     7      4       5   56     55     6
 10     9      8       9   64     63     7   11    10      8       9   65     64     8   12    11      8       9 

MBMB = 8:
((((NATTS - 1) >> 3) + 4 - MBMB) & ~03) + MBMB    8      7     0    4    -4     -4       4    9      8     1    5    -3
   -4       4   32     31     3    7    -1     -4       4   33     32     4    8     0      0       8   56     55     6
 10     2      0       8   64     63     7   11     3      0       8   65     64     8   12     4      4      12 

Proposal 1:
#define BitMapBytes 4    // or any other power of 2
#define MinHeapTupleBitmapSize (BitMapBytes * 8)
#define BITMAPLEN(NATTS) \ (((((int)(NATTS) - 1) >> 3) & ~(BitMapBytes - 1)) + BitMapBytes)

Proposal 2: Let BITMAPLEN calculate the minimum number of bytes
necessary to have one bit for every attribute.

#define BitMapBytes 1
         old       old      new      new
NATTS  BITMAPLEN  THSIZE  BITMAPLEN  THSIZE 8        4        36        1        3216        4        36        2
3633        8        40        5        36 

This looks so simple.  Is there something wrong with it?
Does it need further discussion or should I submit a patch?

ServusManfred


Re: Trying to reduce per tuple overhead (bitmap)

От
Tom Lane
Дата:
Manfred Koizar <mkoi-pg@aon.at> writes:
> Proposal 2: Let BITMAPLEN calculate the minimum number of bytes
> necessary to have one bit for every attribute.

> #define BitMapBytes 1

>           old       old      new      new
> NATTS  BITMAPLEN  THSIZE  BITMAPLEN  THSIZE
>   8        4        36        1        32
>  16        4        36        2        36
>  33        8        40        5        36

> This looks so simple.  Is there something wrong with it?

Offhand I cannot see a reason not to change this.  There's no reason for
BITMAPLEN() to be padding the bitmap length --- every caller does its
own MAXALIGN() of the total header length, which is what we actually
need.  I suspect that BITMAPLEN's behavior is leftover from a time when
those MAXALIGN's weren't there.  But (a) this would only be helpful if
the t_bits field started on a word boundary, which it doesn't and hasn't
for a very long time; and (b) padding to a multiple of 4 is wrong
anyway, since there are machines where MAXALIGN is 8.

Since the data offset is stored in t_hoff and not recalculated on the
fly, I think that fixing BITMAPLEN is completely free from a storage
compatibility point of view; you wouldn't even need initdb.  If some
tuples have the excess padding and some do not, everything will still
work.

In short, looks good to me.  Please try it.
        regards, tom lane


Re: Trying to reduce per tuple overhead (bitmap)

От
Bruce Momjian
Дата:
Manfred Koizar wrote:
> Tom Lane wrote in another tread:
> > PS: I did like your point about BITMAPLEN; I think that might be
> > a free savings.  I was waiting for you to bring it up on hackers
> > before commenting though...
> So here we go...
> 
> Hi,
> 
> in htup.h MinHeapTupleBitmapSize is defined to be 32,  i.e. the bitmap
> uses at least so many bits, if the tuple has at least one null
> attribute.  The bitmap starts at offset 31 in the tuple header.  The
> macro BITMAPLEN calculates, for a given number of attributes NATTS,
> the length of the bitmap in bytes.  BITMAPLEN is the smallest number n
> divisible by 4, so that 8*n >= NATTS.
> 
> The size of the tuple header is rounded up to a multiple of 4 (on a
> typical(?) architecture) by MAXALIGN(...).  So we get:
> 
> NATTS  BITMAPLEN  THSIZE
>   8        4        36
>  16        4        36
>  33        8        40
> 
> I don't quite understand the definition of BITMAPLEN:
> 
> #define BITMAPLEN(NATTS) \
>    ((((((int)(NATTS) - 1) >> 3) + 4 - (MinHeapTupleBitmapSize >> 3)) \
>     & ~03) + (MinHeapTupleBitmapSize >> 3))

Thanks for improving this.  I had to look at this macro recently and it
was quite confusing.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026