Discussion: Make tuples_per_page configurable per table.
Hi. This is a follow-up and updated patch on several old discussions:

http://archives.postgresql.org/pgsql-hackers/2009-07/msg01065.php
http://archives.postgresql.org/pgsql-admin/2010-04/msg00164.php
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00831.php

First patch: http://archives.postgresql.org/pgsql-hackers/2010-02/msg00096.php

Currently the target for the number of tuples per page is 4, which means an individual tuple has to exceed roughly 2KB before the tuple toaster kicks in. This patch makes the target tunable on a per-table basis. The main reasoning is that people who have knowledge about the usage pattern of their database can get a huge benefit from tuning TOAST to be more or less aggressive. This is obviously true if:

* the dataset isn't entirely memory cached, and
* columns stored in main (including pure visibility checking) are accessed more frequently than data columns stored in TOAST.

But even when the dataset is entirely memory cached, this tunable can give the database widely different performance numbers than it has currently. This typically happens when only visibility checks are done (select count(*)) and when aggregating over columns stored in main.

I must admit that I have chosen a "poor" test dataset: based on the average tuple length, the "sweet spot" is just around the current default, whereas a dataset with an average tuple size below 2.5KB would absolutely benefit. I hope people can see the benefit anyway.

The dataset is 500,000 records in a table with:

  id serial,
  code text,   -- small text block
  entry text   -- larger text block

where length(code) < 10, and for entry:

          avg          | max  | min
 ----------------------+------+------
  3640.2042755914488171 | 8708 | 1468

The queries were run multiple times, and the numbers are based on runs where iowait was 0 while the query executed.
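The relationship between the tuples-per-page target and the toasting threshold described above can be sketched numerically. This is a simplified illustration, not the actual PostgreSQL macro: the real TOAST_TUPLE_THRESHOLD computation also subtracts page and tuple header overhead, which this model ignores.

```python
# Simplified sketch of how a tuples_per_page target maps to the tuple
# size above which the toaster kicks in. PostgreSQL's real computation
# subtracts page/tuple header overhead; this model ignores that.

BLCKSZ = 8192  # default PostgreSQL block size


def toast_threshold(tuples_per_page: int) -> int:
    """Approximate inline-tuple size limit for a given per-page target."""
    return BLCKSZ // tuples_per_page


# The default target of 4 tuples per page gives the ~2KB threshold
# mentioned above; raising the target makes toasting more aggressive.
print(toast_threshold(4))  # 2048
print(toast_threshold(8))  # 1024
```

This is why the benchmark below shows main_size shrinking and toast_size growing as tuples_per_page increases: a higher target pushes smaller tuples out to the TOAST table.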
So, entirely memory- and cpu-bound numbers:

testdb=# select * from data order by tuples_per_page;
 time_sum_length | time_count | tuples_per_page | main_size | toast_size
-----------------+------------+-----------------+-----------+------------
        5190.258 |     689.34 |               1 | 1981MB    | 0MB
        5478.519 |    660.841 |               2 | 1894MB    | 0MB
        9740.768 |    481.822 |               3 | 1287MB    | 4MB
       12875.479 |     73.895 |     (default) 4 | 79MB      | 1226MB
       13082.768 |     58.023 |               8 | 29MB      | 1276MB
(5 rows)

time_sum_length => select sum(length(entry)) from data;
time_count => select count(*) from data;

All timings are in ms.

The command to set "tuples_per_page" is:

ALTER TABLE <tablename> SET (tuples_per_page = X)

where 1 <= X <= 32.

The patch really needs some feedback. I've tried to address Tom Lane's earlier comment about fixing the place where it figures out whether it needs a toast table (and I actually tested that it works). While there surely is more that can be done to improve the flexibility in this area, I do think there is sufficient benefit already. This is my second shot at coding C, so please let me know if I have done anything wrong. Comments are all welcome.

Thanks.

-- Jesper
Attachments
On Wed, Sep 22, 2010 at 3:44 AM, Jesper Krogh <jesper@krogh.cc> wrote:
> Command to set "tuples_per_page" is:
> ALTER TABLE <tablename> set (tuples_per_page = X)
> where 1 <= X <= 32.

tuples_per_page means the *minimal* number of tuples in a page, right? A page may contain more tuples when the tuples are small enough. If so, I think the parameter name is confusing, because it sounds like it *restricts* the number of tuples per page, like fillfactor does. "min_tuples_per_page" might be a better name.

I didn't read the previous discussions, but did we have consensus that "number of tuples" is better than other units for the parameter? For example, "threshold bytes" or "percentage of a page" also seem reasonable to me.

-- Itagaki Takahiro
On 2010-09-22 04:33, Itagaki Takahiro wrote:
> On Wed, Sep 22, 2010 at 3:44 AM, Jesper Krogh <jesper@krogh.cc> wrote:
>> Command to set "tuples_per_page" is:
>> ALTER TABLE <tablename> set (tuples_per_page = X)
>> where 1 <= X <= 32.
>
> The tuples_per_page means *minimal* number of tuples in a page, right?
> A page may contain more tuples when the size of tuples are small enough.
> If so, I think the parameter name is confusable because it sounds
> *restrict* the number of tuples per page like fillfactor.
> "min_tuples_per_page" might be better for the name.

Yes, minimum. I'll change that. The name was picked because it is the name of the variable in the source code, but that's actually a really bad argument by itself. min_tuples_per_page is more correct.

> I didn't read previous discussions, but did we have consensus that "number
> of tuples" is better than other units for the parameter? For example,
> "threshold bytes" or "percentage in a page" also seems to be reasonable for me.

I thought about changing it to bytes, but my feeling was that it makes most sense with something that divides evenly into 8KB; otherwise people could end up with, say, a 5KB threshold and a table mostly filled with 3KB of nothing per page. So I tried to describe the "small math" in the documentation.

Thanks for your feedback.

-- Jesper
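The "small math" argument above can be made concrete with a quick sketch. This is a simplification that ignores page headers and alignment padding, but it shows why a byte threshold that does not divide evenly into the 8KB block size wastes space.

```python
# Why a byte-valued threshold that doesn't divide evenly into BLCKSZ
# wastes space: with a 5KB inline-tuple target, only one tuple fits
# per 8KB page, leaving ~3KB unused. (Simplified: ignores page headers
# and alignment padding.)

BLCKSZ = 8192


def wasted_per_page(tuple_size: int) -> int:
    """Bytes left unusable on a page when tuples are tuple_size bytes."""
    tuples_fit = BLCKSZ // tuple_size
    return BLCKSZ - tuples_fit * tuple_size


print(wasted_per_page(5 * 1024))  # 3072 -- a 5KB threshold wastes ~3KB
print(wasted_per_page(2048))      # 0    -- 2KB (the default) packs cleanly
```

This is the case tuples-per-page avoids by construction: any value of min_tuples_per_page yields a threshold of BLCKSZ divided by a whole number.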
On Tue, Sep 21, 2010 at 10:33 PM, Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote:
> I didn't read previous discussions, but did we have consensus that "number
> of tuples" is better than other units for the parameter? For example,
> "threshold bytes" or "percentage in a page" also seems to be reasonable for me.

I think either of those is more intuitive than tuples per page, especially the first. But I'm not sure that's really the knob we want either. I feel like what people are really trying to tune here is more like: if a single datum is larger than X bytes, push it out. But I might be all wet.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company