Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity
Дата
Msg-id CAEepm=2hrP8o=+vnDPsmh7qASdPabkbO7rGvfPxx+6LfBvST+w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity  (Andreas Seltenreich <seltenreich@gmx.de>)
Список pgsql-hackers
On Wed, Dec 28, 2016 at 10:40 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Dec 28, 2016 at 10:01 AM, Andreas Seltenreich
> <seltenreich@gmx.de> wrote:
>> testing master as of fe591f8bf6 produced a crash reading
>> pg_stat_activity (backtrace below).  Digging around with with gdb
>> revealed that pgstat_get_wait_event() returned an invalid pointer for a
>> classId PG_WAIT_LWLOCK.
>>
>> I think the culprit is dsa.c passing a pointer to memory that goes away
>> on dsa_free() as a name to LWLockRegisterTranche.
>
> I see, nice detective work!
>
> dsa.c's create_internal and attach_internal call LWLockRegisterTranche
> with a pointer to a name in a DSM segment (copied from an argument to
> dsa_create), which doesn't have the right extent in this rare race
> case: the DSM segment goes away when the last backend detaches, but
> after you've detached you might still see others waiting for a lock
> that references that tranche ID in pg_stat_activity.  This could be
> fixed by copying the tranche name to somewhere with backend lifetime
> extent in each backend, but then that would need cleanup.  Maybe we
> should replace it with another value when the DSA area is detached,
> using a constant string.  Something like
> LWLockRegisterTranche(tranche_id, "detached_dsa").  That would be a
> new use for the LWLockRegisterTranche function, for
> re-registering/renaming and existing tranche ID.  Thoughts?

Or we could introduce a new function
UnregisterLWLockTranche(tranche_id) that is equivalent to
RegisterLWLockTranche(tranche_id, NULL), and call that when detaching
(ie in the hook that runs before we detach the control segment that
holds the name).  It looks like NULL would result in
GetLWLockIdentifier returning "extension".  I guess before DSA came
along only extensions could create lock tranches not known to every
backend, but perhaps now it should return "unknown"?

It does seem a little odd that we are using a well known tranche ID
defined in lwlock.h (as opposed to an ID obtained at runtime by
LWLockNewTrancheId()) but the tranche name has to be registered at
runtime in each backend and is unknown to other backends.  That line
of thinking suggests another potential solution: go and register the
name in RegisterLWLockTranches, and stop registering it in dsa.c.  For
other potential uses of DSA including extensions that call
LWLockNewTrancheId() we'd have to decide whether to make the name an
optional argument, or require those users to register it themselves.

-- 
Thomas Munro
http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: [HACKERS] BUG: pg_stat_statements query normalization issueswith combined queries
Следующее
От: Andreas Seltenreich
Дата:
Сообщение: Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity