On Mon, Apr 14, 2014 at 7:45 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Apr 14, 2014 at 5:30 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> I am glad you are looking at this. You are right that it requires a
>> huge amount of testing, but clearly our code needs improvement in this
>> area.
>
> Thanks.
>
> Does anyone recall the original justification for the recommendation
> that shared_buffers never exceed 8GiB? I'd like to revisit the test
> case, if such a thing exists.
There are many reports of improvement from lowering shared_buffers.
The problem is that it tends to show up on complex production
workloads and that there is no clear evidence pointing to problems
with the clock sweep; it could be higher up in the partition locks or
something else entirely (like the O/S). pgbench is also not the
greatest tool for sniffing out these cases: its access pattern is too
random, and optimizing large databases is generally an exercise in
de-randomizing i/o patterns. We really, really need a broader testing
suite that covers more usage patterns.
I was suspicious for a while that spinlock contention inside the
clock sweep was causing stalls and posted a couple of different patches
to try to reduce the chance of that. I basically gave up when I
couldn't demonstrate that case in simulated testing.
I still think there is no good reason for the clock to pedantically
adjust the usage count on contended buffers...better to throw a single
test-and-test-and-set (TTAS) and bail to the next buffer if either 'T'
signals a lock.
merlin