Re: Bypassing shared_buffers

From Konstantin Knizhnik
Subject Re: Bypassing shared_buffers
Msg-id 1bb973d7-f047-032c-6375-970ca2cda7f7@garret.ru
In reply to Re: Bypassing shared_buffers  (Vladimir Churyukin <vladimir@churyukin.com>)
Responses Re: Bypassing shared_buffers  (Vladimir Churyukin <vladimir@churyukin.com>)
List pgsql-hackers

On 15.06.2023 4:37 AM, Vladimir Churyukin wrote:
> Ok, got it, thanks.
> Is there any alternative approach to measuring the performance as if 
> the cache was empty?
> The goal is basically to calculate the max possible I/O time for a 
> query, to get a range between min and max timing.
> It's ok if it's done during EXPLAIN ANALYZE call only, not for regular 
> executions.
> One thing I can think of is even if the data in storage might be 
> stale, issue read calls from it anyway, for measuring purposes.
> For EXPLAIN ANALYZE it should be fine as it doesn't return real data 
> anyway.
> Is it possible that some pages do not exist in storage at all? Is 
> there a different way to simulate something like that?
>

I do not completely understand what you want to measure: how fast the
cache can be prewarmed, or what the performance is when the working set
doesn't fit in memory?

Why doesn't setting `shared_buffers` to some very small value (e.g.
1MB) work?
As was already noted, there are two levels of caching: shared buffers
and the OS file cache.
By reducing the size of shared buffers you rely mostly on the OS file
cache.
And actually there is no big gap in performance here: on most workloads
I didn't see more than a 15% difference.

You can certainly flush the OS cache with
`echo 3 > /proc/sys/vm/drop_caches` and so simulate a cold start.
But the OS cache will be prewarmed quite fast (unlike shared buffers,
because of the strange Postgres ring-buffer strategies, which cause
eviction of pages from shared buffers even if there is a lot of free
space).
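
A minimal sketch of such a cold-start run, under stated assumptions: the
helper names `cold_cache_plan` and `run_plan` are hypothetical, `pg_ctl`
and `psql` are assumed to be on PATH and run by a user allowed to control
the server, and `sudo` is assumed to be available for the write to
`/proc/sys/vm/drop_caches`. Restarting the server empties shared buffers;
the `sync` before dropping caches writes back dirty pages so they can
actually be freed. It defaults to a dry run that only prints the commands.

```python
import subprocess


def cold_cache_plan(data_dir, dbname, query):
    """Return argv lists for one cold-cache timing run (hypothetical helper)."""
    explain = "EXPLAIN (ANALYZE, BUFFERS) " + query
    return [
        # Stopping the server discards the contents of shared buffers.
        ["pg_ctl", "stop", "-D", data_dir, "-m", "fast", "-w"],
        # Write back dirty pages, then drop the OS page cache (needs root).
        ["sudo", "sh", "-c", "sync; echo 3 > /proc/sys/vm/drop_caches"],
        # Start the server again with empty shared buffers.
        ["pg_ctl", "start", "-D", data_dir, "-w"],
        # Run the query once; note that EXPLAIN ANALYZE really executes it.
        ["psql", "-X", "-d", dbname, "-c", explain],
    ]


def run_plan(plan, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in plan:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder data directory, database, and query: adjust before use.
    run_plan(cold_cache_plan("/var/lib/pgsql/data", "postgres", "SELECT 1"))
```

Running the same query a second time without the restart gives the warm
number, so the two runs bracket the range, for a single client in
isolation only.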

So please specify the goal of your experiment more precisely.
"Max possible I/O time for a query" depends on so many factors...
Are you considering just one client working in isolation, or will there
be many concurrent queries and background tasks like autovacuum and the
checkpointer competing for resources?

My point is that if you need a deterministic result, you will have to
exclude a lot of different factors which may affect performance, and
then ... you are calculating the speed of a horse in a vacuum, which has
almost no relation to real performance.







