Discussion: caching indexes and pages?
Hi,

I am developing a database and have a couple of questions I haven't found an answer to yet.

1) How do I find the size of an index, i.e. its size on disk?

2) I have a query that is taking a long time to complete because the table is about 120GB large. It only returns 2000 rows, so in principle it should be fast. But because the data is spread across the table, I am assuming it needs to do a lot of disk access to fetch the appropriate pages. Since the amount of data is so large, I am also assuming that whenever I run a query, all memory caches have to be replaced to make room for the new pages. What I am wondering is: which config settings can I use to increase the amount of memory Postgres keeps to cache pages and indexes?

I tried work_mem and maintenance_work_mem, but they do not seem to make much difference yet. Admittedly I had set them to only 100M and 80M, and after reading a little more I have found that I could easily set them to several GBs. But I am not sure those are the correct config parameters to use for this, and I haven't found any others that seem relevant so far.

regards

thomas
In response to Thomas Finneid:

> Hi, I am developing a database and have a couple of questions I haven't
> found an answer to yet.
>
> 1) How do I find the size of an index, i.e. its size on disk?

http://www.postgresql.org/docs/8.3/interactive/functions-admin.html
http://andreas.scherbaum.la/blog/archives/282-table-size,-database-size.html

> 2) I have a query that is taking a long time to complete because the

How long is a long time? *g*

> table is about 120GB large. It only returns 2000 rows, so in
> principle it should be fast. But because the data is spread across the
> table, I am assuming it needs to do a lot of disk access to fetch the
> appropriate pages. Since the amount of data is so large I am also

Please also run your query with 'explain analyse select ...' and look at the output. Maybe there is no proper index for your select.

Regards, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net
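[Editor's note: the admin functions referenced above can be combined into a quick check. A minimal sketch; the object names `my_table` and `my_index`, and the WHERE clause, are placeholders to substitute with your own:]

```sql
-- Size on disk of a single index (name is a placeholder)
SELECT pg_size_pretty(pg_relation_size('my_index'));

-- Total size of a table including its indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('my_table'));

-- Inspect the actual plan and timings of the slow query
EXPLAIN ANALYSE SELECT * FROM my_table WHERE some_column = 42;
```

The EXPLAIN ANALYSE output shows whether the planner uses an index scan or falls back to a sequential scan over the whole 120GB table.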
Hi, Thomas Finneid schrieb: > > Hi I am developing a database and have a couple of questions I havent > found an answer to yet. > > 1) how do I find the size of an index, i.e. the size on disk? i use this query: select t.spcname as "tablespace" , pg_get_userbyid(c.relowner) as "owner" , n.nspname as "schema" , relname::text as "name" , pg_size_pretty(pg_total_relation_size(c.oid))::text as "total size" , case when c.relkind='i' then 'index' when c.relkind='t' then 'toast' when c.relkind='r' then 'table' when c.relkind='v' then 'view' when c.relkind='c' then 'composite type' when c.relkind='S' then 'sequence' else c.relkind::text end as "type" from pg_class c left join pg_namespace n on n.oid = c.relnamespace left join pg_tablespace t on t.oid = c.reltablespace where (pg_total_relation_size(c.oid)>>20)>0 and c.relkind!='t' order by c.relkind desc, pg_total_relation_size(c.oid) desc > > 2) I have a query that is taking a long time to complete because the > table is about 120GB large. Its only returning 2000 rows, so in > principle it should be fast. But because the data is spread across the > table, I am assuming it needs to do a lot of disk access to fetch the > appropriate pages. Since the amount of data is so large I am also > assuming that whenever I do a query all memory caches have to be > replaced to make room for the new pages. What I am wondering is which > config settings can I use to increase the amount of memory postgres > keeps to cache pages and indexes? try to reorganize your data with CLUSTER and create appropriate indixes (dont forget to check statistics). there are several threads about memory configuration. look for shared_buffers > > I tried work_mem and maintenance_work_mem but it does not seem to make > much difference yet. Admittedly I had set it to 100M and 80M, so after > reading a little bit more I have found that I could easily set it to > several GBs. But I am not sure those are the correct config parameters > to use for this. 
> I haven't found any others that seem relevant so far.
>
> regards
>
> thomas

regards
thomas
> I tried work_mem and maintenance_work_mem, but they do not seem to make
> much difference yet. Admittedly I had set them to only 100M and 80M, and
> after reading a little more I have found that I could easily set them to
> several GBs. But I am not sure those are the correct config parameters
> to use for this. I haven't found any others that seem relevant so far.

You probably want to increase shared_buffers by quite a large amount, and maybe make work_mem not quite so big. If you have 2GB of memory, you might set shared_buffers to 1GB, work_mem to 16MB, and maintenance_work_mem to 64MB?

...Robert
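[Editor's note: Robert's suggestion translates into a postgresql.conf fragment like the following. A sketch only, assuming a dedicated machine with ~2GB of RAM; effective_cache_size is an additional, commonly tuned planner hint not mentioned in the thread:]

```
# postgresql.conf -- example values for a dedicated ~2GB machine (adjust to your hardware)
shared_buffers = 1GB            # pages PostgreSQL itself keeps cached in shared memory
work_mem = 16MB                 # per-sort/per-hash memory, multiplied across concurrent backends
maintenance_work_mem = 64MB     # used by VACUUM, CREATE INDEX, CLUSTER
effective_cache_size = 1500MB   # planner's estimate of OS + PG cache available
```

Note that shared_buffers is the setting that actually controls the page cache the original question asks about; work_mem only affects per-operation sort and hash memory.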
Thomas Markus wrote:
> try to reorganize your data with CLUSTER and create appropriate indexes
> (don't forget to check statistics).

One question. Assume I have clustered, and new data has been added after that; according to the docs, that data is added "outside" of the clustered data. What happens when I run CLUSTER again? I would assume it is smart enough to cluster only the new data, i.e. adding it to the already created clusters as appropriate, so the execution time would be a lot lower, right? Or would it recluster everything from scratch again?

thomas
On Thu, Jan 22, 2009 at 10:58:25PM +0100, Thomas Finneid wrote:
> Thomas Markus wrote:
>
>> try to reorganize your data with CLUSTER and create appropriate indexes
>> (don't forget to check statistics).
>
> One question. Assume I have clustered, and new data has been added after
> that; according to the docs, that data is added "outside" of the clustered
> data. What happens when I run CLUSTER again? I would assume it is smart
> enough to cluster only the new data, i.e. adding it to the already created
> clusters as appropriate, so the execution time would be a lot lower,
> right? Or would it recluster everything from scratch again?
>
> thomas

It reclusters again from scratch. You do get better performance on the reads from the data that is already clustered, though.

Cheers,
Ken
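[Editor's note: as a concrete illustration of the behaviour Ken describes, with placeholder table and index names:]

```sql
-- Physically reorder the table by the given index (names are placeholders)
CLUSTER my_table USING my_table_ts_idx;

-- Refresh planner statistics after the rewrite
ANALYZE my_table;

-- Later invocations remember the clustering index, but still rewrite
-- the ENTIRE table in index order -- not just the newly added rows:
CLUSTER my_table;
```

On a 120GB table this full rewrite is expensive, which is why the FILLFACTOR suggestion below aims to keep the clustered layout intact for longer between runs.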
Thomas Finneid wrote:
> Thomas Markus wrote:
>
>> try to reorganize your data with CLUSTER and create appropriate
>> indexes (don't forget to check statistics).
>
> One question. Assume I have clustered, and new data has been added after
> that; according to the docs, that data is added "outside" of the
> clustered data.

Check into FILLFACTOR (on both tables and indexes).

-- 
Craig Ringer
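[Editor's note: a minimal sketch of Craig's FILLFACTOR suggestion, with the same placeholder names as above. Leaving free space in each page gives updates room to stay on the same page, which helps preserve the clustered layout longer between CLUSTER runs:]

```sql
-- Reserve ~30% free space in each heap page for future row versions
ALTER TABLE my_table SET (fillfactor = 70);

-- Indexes take a fillfactor too (the B-tree default is 90)
ALTER INDEX my_table_ts_idx SET (fillfactor = 70);

-- These settings apply only to newly written pages; rewrite the
-- existing ones by running CLUSTER on the table again.
```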