Thread: Make TOAST_TUPLES_PER_PAGE configurable per table.


Make TOAST_TUPLES_PER_PAGE configurable per table.

From: Jesper Krogh
Date:
Hi

This is my first attempt to hack PostgreSQL (and my first C, actually), so
bear with me on any obvious mistakes.

I've long wanted to be able to teach Postgres a bit more about how to
store its data on disk. Ours is a typical web-based system where more or
less all access can be divided into two categories:
"list views", which are overviews, counts, and aggregates over simple
values, at 50..200 rows per page, and
"detail views", which show more or less all the data from a single row,
combined with aggregates over relations and the like.

Based on this knowledge I know that a significant amount of data stored
"inline" in tuples is read off disk for listings where it is "never
needed". At the moment PostgreSQL tries to compress and get below
pagesize/4 ~ 2KB/tuple before it moves the large attributes out to TOAST.

Looking at the current implementation, it seems to "do the right thing",
since the "large, variable length" attributes are the "most likely" not
to be shown on listing pages anyway, but it is not aggressive enough (in
my view, for all common web workloads). So this patch makes
TOAST_TUPLES_PER_PAGE configurable per table (the desired tuple density
on the main storage).

This patch enables users to set TOAST_TUPLES_PER_PAGE with

ALTER TABLE <table> SET (tuples_per_page = X);  -- currently with 1 <= X <= 32

ftstest=# create table testtext8(id SERIAL,col text) with
(tuples_per_page=8);
NOTICE:  CREATE TABLE will create implicit sequence "testtext8_id_seq"
for serial column "testtext8.id"
CREATE TABLE
ftstest=# create table testtext2(id SERIAL,col text) with
(tuples_per_page=2);
NOTICE:  CREATE TABLE will create implicit sequence "testtext2_id_seq"
for serial column "testtext2.id"
CREATE TABLE
ftstest=# insert into testtext8(col) (select (select
array_to_string(array_agg(chr((random()*95+30)::integer)),'') from
generate_series(1,3000)) as testtext from generate_series(1,50000));
INSERT 0 50000
ftstest=# insert into testtext2(col) (select (select
array_to_string(array_agg(chr((random()*95+30)::integer)),'') from
generate_series(1,3000)) as testtext from generate_series(1,50000));
INSERT 0 50000
ftstest=# \timing
### Here I stop PG and echo 3 > /proc/sys/vm/drop_caches
ftstest=# select count(id) from testtext2;
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
ftstest=# select count(id) from testtext2;
 count
-------
 50000
(1 row)

Time: 4613.044 ms
ftstest=# select count(id) from testtext8;
 count
-------
 50000
(1 row)

Time: 318.743 ms

This obviously comes with a drawback when I actually need the data:

ftstest=# select max(length(col)) from testtext2;
 max
------
 3000
(1 row)

Time: 1445.016 ms
ftstest=# select max(length(col)) from testtext8;
 max
------
 3000
(1 row)

Time: 4184.994 ms

            relation             |    size
---------------------------------+------------
 pg_toast.pg_toast_1450869       | 195 MB
 public.testtext2                | 195 MB
 public.testtext8                | 2552 kB


No documentation in the patch yet; I'll write that a bit later.

Generally speaking, if you have some knowledge about the access patterns
of your data, this patch lets you teach PostgreSQL to take advantage of
it. In my situation I estimate that the production set would be able to
drop a couple of GB from main memory (leaving room for more index pages
and the like).


Thanks in advance.

--
Jesper Krogh
diff -rc ../postgresql-8.5alpha3.orig/src/backend/access/common/reloptions.c ./src/backend/access/common/reloptions.c
*** ../postgresql-8.5alpha3.orig/src/backend/access/common/reloptions.c    2009-08-27 19:18:44.000000000 +0200
--- ./src/backend/access/common/reloptions.c    2010-02-01 21:12:41.000000000 +0100
***************
*** 15,20 ****
--- 15,21 ----

  #include "postgres.h"

+ #include "access/tuptoaster.h"
  #include "access/gist_private.h"
  #include "access/hash.h"
  #include "access/nbtree.h"
***************
*** 157,162 ****
--- 158,170 ----
              RELOPT_KIND_HEAP | RELOPT_KIND_TOAST
          }, -1, 0, 2000000000
      },
+     {
+         {
+             "tuples_per_page",
+             "Desired number of tuples per page (worst-case)",
+             RELOPT_KIND_HEAP
+         },TOAST_TUPLES_PER_PAGE , 1,32
+     },
      /* list terminator */
      {{NULL}}
  };
***************
*** 1074,1079 ****
--- 1082,1088 ----
      int            numoptions;
      static const relopt_parse_elt tab[] = {
          {"fillfactor", RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor)},
+         {"tuples_per_page", RELOPT_TYPE_INT, offsetof(StdRdOptions, tuples_per_page)},
          {"autovacuum_enabled", RELOPT_TYPE_BOOL,
          offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, enabled)},
          {"autovacuum_vacuum_threshold", RELOPT_TYPE_INT,
diff -rc ../postgresql-8.5alpha3.orig/src/backend/access/heap/tuptoaster.c ./src/backend/access/heap/tuptoaster.c
*** ../postgresql-8.5alpha3.orig/src/backend/access/heap/tuptoaster.c    2009-07-29 22:56:18.000000000 +0200
--- ./src/backend/access/heap/tuptoaster.c    2010-02-01 21:56:29.000000000 +0100
***************
*** 594,600 ****
      hoff = MAXALIGN(hoff);
      Assert(hoff == newtup->t_data->t_hoff);
      /* now convert to a limit on the tuple data size */
!     maxDataLen = TOAST_TUPLE_TARGET - hoff;

      /*
       * Look for attributes with attstorage 'x' to compress.  Also find large
--- 594,602 ----
      hoff = MAXALIGN(hoff);
      Assert(hoff == newtup->t_data->t_hoff);
      /* now convert to a limit on the tuple data size */
!
!
!     maxDataLen = MaximumBytesPerTuple(RelationGetTuplesPerPage(rel,TOAST_TUPLES_PER_PAGE)) - hoff;

      /*
       * Look for attributes with attstorage 'x' to compress.  Also find large
diff -rc ../postgresql-8.5alpha3.orig/src/include/utils/rel.h ./src/include/utils/rel.h
*** ../postgresql-8.5alpha3.orig/src/include/utils/rel.h    2009-12-07 06:22:23.000000000 +0100
--- ./src/include/utils/rel.h    2010-02-01 21:33:55.000000000 +0100
***************
*** 239,250 ****
--- 239,258 ----
  {
      int32        vl_len_;        /* varlena header (do not touch directly!) */
      int            fillfactor;        /* page fill factor in percent (0..100) */
+     int            tuples_per_page;
      AutoVacOpts autovacuum;        /* autovacuum-related options */
  } StdRdOptions;

  #define HEAP_MIN_FILLFACTOR            10
  #define HEAP_DEFAULT_FILLFACTOR        100

+ /* RelationGetTuplesPerPage
+  *              Returns the desired number of tuples per page.
+  */
+ #define RelationGetTuplesPerPage(relation,defaulttpp) \
+          ((relation)->rd_options ? \
+           ((StdRdOptions *) (relation)->rd_options)->tuples_per_page : (defaulttpp))
+
  /*
   * RelationGetFillFactor
   *        Returns the relation's fillfactor.  Note multiple eval of argument!

Re: Make TOAST_TUPLES_PER_PAGE configurable per table.

From: Tom Lane
Date:
Jesper Krogh <jesper@krogh.cc> writes:
> This patch enables users to set TOAST_TUPLES_PER_PAGE with
> ALTER TABLE <table> SET (tuples_per_page = X); .. currently with 1 <= X
> <= 32;

It's not clear to me that fiddling with that is useful unless the toast
tuple size also changes; and unfortunately changing that is much harder,
because it's wired into the addressing of toast data.  See also these
notes:
 * XXX while these can be modified without initdb, some thought needs to be
 * given to needs_toast_table() in toasting.c before unleashing random
 * changes.  Also see LOBLKSIZE in large_object.h, which can *not* be
 * changed without initdb.
 
        regards, tom lane


Re: Make TOAST_TUPLES_PER_PAGE configurable per table.

From: Jesper Krogh
Date:
Tom Lane wrote:
> Jesper Krogh <jesper@krogh.cc> writes:
>> This patch enables users to set TOAST_TUPLES_PER_PAGE with
>> ALTER TABLE <table> SET (tuples_per_page = X); .. currently with 1 <= X
>> <= 32;
> 
> It's not clear to me that fiddling with that is useful unless the toast
> tuple size also changes; and unfortunately changing that is much harder,
> because it's wired into the addressing of toast data.  See also these
> notes:
> 
>  * XXX while these can be modified without initdb, some thought needs to be
>  * given to needs_toast_table() in toasting.c before unleashing random
>  * changes.  Also see LOBLKSIZE in large_object.h, which can *not* be
>  * changed without initdb.

I can see that needs_toast_table() might need some changes, since it also
uses TOAST_TUPLE_THRESHOLD and might benefit from knowing when a toast
table is triggered.

There might be more benefit in also changing the toast tuple size (I
don't have enough insight to tell), but even without it I can get a 10x
speedup on a "simple test" and permanently steer the cache toward "more
commonly used data" instead of these rarely used attributes.

Ultimately I would like an infinite amount of configurability, since I
have tables consisting only of simple values where 50% are really rarely
used and 50% are used very often. But just changing
TOAST_TUPLES_PER_PAGE as above can easily increase my "tuple density"
from 6/page to 40-60/page, which translates directly into:
* Less data to read when accessing the tuples.
* Less rarely-used data in the cache.

Whereas on the table with simple values I might at best be able to
double the tuple density.

But yes, it isn't a silver bullet: it requires knowledge of the access
patterns of the data.

What kind of arguments/tests/benchmarks are required to make the case
for the usefulness of "fiddling" with this parameter?

A real-world database in our environment has:
One table of 12M rows with an average text length of ~2KB; the directly
"toastable" set is 5GB, which is really rarely used, but the webapp does
random reads for the presence/counts of these rows.
Another table of ~700M rows at a size of 135GB, of which around 120GB is
of the "really rarely used" type (but it takes time to compute, so it
makes sense to "waste dead disk" on it).

So based on the benchmark provided in this email, I think it can
significantly change the ratio of cache hits/misses for the application
(which has 64GB of dedicated memory).

Jesper
-- 
Jesper


Re: Make TOAST_TUPLES_PER_PAGE configurable per table.

From: "Kevin Grittner"
Date:
Jesper Krogh <jesper@krogh.cc> wrote:

> Ultimately I would like an infinite amount of configurability

There was some discussion of this previously.  I was thinking of
doing something with it, but Laurent indicated off-list he was
working on it, so I left it to him.  Besides reading these threads,
you might want to see how far he got -- he might have some "work in
process" or an internal patch that would be a good starting point.
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00831.php
http://archives.postgresql.org/pgsql-hackers/2009-07/msg01065.php
-Kevin