Discussion: Estimating hot data size
All,
I'm trying to estimate the size of my hot data set, and wanted to get some validation that I'm doing this correctly.

Basically, I'm using sum(heap_blks_read + idx_blks_read) from pg_statio_all_tables, and diffing the numbers over a period of time (1 hour at least). Is this a fair estimate? The reason for doing this is that we are looking at new server hardware, and I want to get enough RAM on the machine to keep the hot data in memory, plus room for growth.
Thanks,
Chris
Example:
Time                          | Total Blocks
------------------------------+----------------
2011-02-16 11:25:34.621874-05 | 123,260,464,427
2011-02-16 12:25:46.486719-05 | 123,325,880,943

To get the hot data for this hour (in KB), I'm taking:

(123,325,880,943 - 123,260,464,427) * 8 = 523,332,128 KB
Correct?
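
For reference, a minimal sketch of the sampling query behind those numbers, assuming the default 8KB block size; the coalesce() is an added guard, since idx_blks_read is NULL for tables with no indexes and would otherwise drop those rows from the sum:

   -- Run once per sample; diff total_blocks between two samples.
   SELECT now() AS sample_time,
          sum(heap_blks_read + coalesce(idx_blks_read, 0)) AS total_blocks
   FROM pg_statio_all_tables;
   -- (123,325,880,943 - 123,260,464,427) * 8KB = 523,332,128 KB, ~499GB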
On 16.2.2011 21:51, Chris Hoover wrote:
> Basically, I'm using sum(heap_blks_read + idx_blks_read) from
> pg_statio_all_tables, and diffing the numbers over a period of time (1
> hour at least). Is this a fair estimate?

I doubt that, although I'm not sure what exactly you mean by "hot data set". I guess it's the data set you're working with frequently, right?

The first gotcha is that heap_blks_read counts only blocks not found in shared buffers, so those ~500GB are actually the amount of data read from the disk (or the filesystem cache). That says nothing about how frequently the data are used.

The second gotcha is that the same block may be counted repeatedly, especially if it is not used frequently: it is counted for query A, then evicted from the cache (replaced by another block), then counted again for query B. So heap_blks_read does not mean that many distinct blocks were read from the disk.

What I'd recommend is measuring the cache hit ratio, i.e.

   heap_blks_hit / (heap_blks_read + heap_blks_hit)

which tells you how efficient the cache is. Increase shared_buffers until the ratio stops improving - that's your hot data set size.

regards
Tomas

PS: A block counted in heap_blks_read was not necessarily read from the disk - it may have been served from the filesystem cache (and there's no easy way to find that out, AFAIK).
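
A minimal sketch of that calculation, assuming the same pg_statio_all_tables view Chris is already sampling; the nullif() is an added guard against division by zero on an idle cluster:

   SELECT sum(heap_blks_hit)::numeric
          / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0)
            AS heap_hit_ratio,
          sum(idx_blks_hit)::numeric
          / nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0)
            AS idx_hit_ratio
   FROM pg_statio_all_tables;

Note these are lifetime counters; to get the ratio for a specific window, sample the raw counters twice and compute the ratio from the differences.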
Chris Hoover wrote:
> Basically, I'm using sum(heap_blks_read + idx_blks_read) from
> pg_statio_all_tables, and diffing the numbers over a period of time (1
> hour at least). Is this a fair estimate?

Those two counters measure reads requested from the operating system, which isn't really a good measure of the working data set. If you switch to the internal counters that measure what's already cached, that won't be quite right either: those repeatedly count the same block on the truly hot ones, which inflates how big you'll think the working set is relative to its true size.

If you visit http://projects.2ndquadrant.com/talks you'll find a talk called "Inside the PostgreSQL Buffer Cache" that goes over how the cache is actually managed within the database. There are also some sample queries to run after you install the pg_buffercache module into a database. Check out "Buffer contents summary, with percentages"; that's the only way to really measure what you're trying to see. I will sometimes set shared_buffers to a larger value than would normally be optimal for a while, just to get a better reading on what the hot data is.

If you also want an idea of what's in the operating system cache, the pgfincore module from http://pgfoundry.org/projects/pgfincore/ will allow that on a Linux system.

--
Greg Smith   2ndQuadrant US   greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
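
For reference, a sketch along the lines of the "Buffer contents summary, with percentages" query mentioned above, assuming the pg_buffercache module is installed and the default 8KB block size (the exact query shipped with the talk may differ):

   -- Top relations by share of shared_buffers, for the current database.
   -- shared_buffers is reported by pg_settings in units of 8KB blocks,
   -- so count(*) / setting gives the fraction of the cache used.
   SELECT c.relname,
          pg_size_pretty(count(*) * 8192) AS buffered,
          round(100.0 * count(*) /
                (SELECT setting FROM pg_settings
                 WHERE name = 'shared_buffers')::integer, 1)
            AS buffers_percent,
          round(100.0 * count(*) * 8192 /
                pg_relation_size(c.oid), 1)
            AS percent_of_relation
   FROM pg_class c
   INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
   INNER JOIN pg_database d
           ON b.reldatabase = d.oid
          AND d.datname = current_database()
   GROUP BY c.oid, c.relname
   ORDER BY 3 DESC
   LIMIT 10;

Relations that stay near the top of this list across repeated samples are the hot data set this thread is trying to size.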