Re: measuring lwlock-related latency spikes

From	Greg Stark
Subject	Re: measuring lwlock-related latency spikes
Date	April 1, 2012, 23:01:48
Msg-id	CAM-w4HO=CQ7Gh41Kapg6YA2fo2dS_bskYHRQH36a5iwaS0-rqA@mail.gmail.com
In reply to	Re: measuring lwlock-related latency spikes (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: measuring lwlock-related latency spikes (Simon Riggs <simon@2ndQuadrant.com>)
List	pgsql-hackers

On Sun, Apr 1, 2012 at 4:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> My guess based on previous testing is
> that what's happening here is (1) we examine a tuple on an old page
> and decide we must look up its XID, (2) the relevant CLOG page isn't
> in cache so we decide to read it, but (3) the page we decide to evict
> happens to be dirty, so we have to write it first.

Reading the code, one possibility is that in the time it takes us to
write the oldest slru page, another process has come along and
redirtied it. So we pick a new oldest slru page and write that. By the
time we've written it, another process could have redirtied it again.
On a loaded system where the writes are taking 100ms or more it's
conceivable -- barely -- that this could happen over and over again,
hundreds of times.
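
To make the scenario concrete, here is a toy model of the retry loop described above. It is only a sketch under stated assumptions, not PostgreSQL's actual SLRU code: the function name, the slot arrays, and the single `redirty_prob` parameter (the chance that another process redirties the page during one write) are all hypothetical.

```python
import random


def evict_with_retries(num_slots, redirty_prob, rng, max_attempts=1000):
    """Count the writes needed before the oldest slot is found clean.

    Toy model only (assumption): every slot starts dirty, and each write
    is independently "redirtied" by another process with redirty_prob.
    """
    dirty = [True] * num_slots
    last_used = list(range(num_slots))  # smaller value = older slot
    clock = num_slots
    writes = 0
    while writes < max_attempts:
        # Pick the least recently used slot as the eviction victim.
        victim = min(range(num_slots), key=lambda s: last_used[s])
        if not dirty[victim]:
            return writes  # clean victim: the slot can be reused now
        # Write the victim out; in a real system this is the slow I/O.
        dirty[victim] = False
        writes += 1
        # While the write was in flight, another process may have touched
        # the page again, making it dirty and recently used.
        if rng.random() < redirty_prob:
            dirty[victim] = True
            last_used[victim] = clock
            clock += 1
    return writes  # gave up after max_attempts writes


if __name__ == "__main__":
    rng = random.Random(2012)
    for p in (0.5, 0.9, 0.99):
        runs = [evict_with_retries(8, p, rng) for _ in range(2000)]
        print(p, sum(runs) / len(runs))
```

In this model the number of writes is geometrically distributed with mean 1/(1 - p), so hundreds of consecutive retries require a redirty probability very close to 1. That is one way to read "conceivable -- barely".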

In general, the locking and the reasoning about concurrent attempts to
read pages here make my head swim. It looks like even if there's a lot
of contention for the same page or the same slot, it shouldn't manifest
itself that way, but it seems like the kind of logic, with multiple
locks and retries, that is prone to priority-inversion-type problems. I
wonder if more detailed instrumentation, showing the sequence of
operations taken while holding a lock that somebody got stuck on, would
help.
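
As an illustration of the kind of instrumentation suggested above, the following sketch wraps a lock so that the holder records what it does, and the record is kept only if some waiter was stuck for longer than a threshold. Everything here is hypothetical (`TracedLock`, `note`, `slow_wait_s`, the operation labels); it uses Python threads rather than PostgreSQL lwlocks, and the attribution is best-effort when several waiters queue up.

```python
import threading


class TracedLock:
    """Lock whose holder logs its steps; kept only if a waiter got stuck."""

    def __init__(self, slow_wait_s):
        self._lock = threading.Lock()
        self._slow_wait_s = slow_wait_s
        self._ops = []                  # notes from the current holder
        self.stuck = threading.Event()  # set by a waiter that waited too long
        self.reports = []               # one list of ops per reported hold

    def acquire(self):
        if not self._lock.acquire(timeout=self._slow_wait_s):
            self.stuck.set()            # tell the holder somebody is stuck
            self._lock.acquire()        # then keep waiting as usual
            self.stuck.clear()          # drop the flag if the holder missed it
        self._ops = []                  # start a fresh trace for this hold

    def note(self, op):
        # Call only while holding the lock.
        self._ops.append(op)

    def release(self):
        if self.stuck.is_set():
            # Somebody waited too long on us: keep our sequence of steps.
            self.reports.append(list(self._ops))
            self.stuck.clear()
        self._lock.release()


def demo():
    lock = TracedLock(slow_wait_s=0.05)
    lock.acquire()
    lock.note("pick oldest page as victim")
    lock.note("victim is dirty: start write")

    def waiter():
        lock.acquire()
        lock.release()

    t = threading.Thread(target=waiter)
    t.start()
    lock.stuck.wait(timeout=5)  # keep holding until the waiter has timed out
    lock.note("write finished")
    lock.release()
    t.join()
    return lock.reports
```

Calling `demo()` returns a single report containing the three steps the first holder noted, since the waiter's own short hold blocks nobody and is not recorded.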

-- 
greg
