Re: ResourceOwner refactoring

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: ResourceOwner refactoring
Дата
Msg-id 7d9fa5cf-07e3-4691-b40a-802048a0b998@iki.fi
обсуждение исходный текст
Ответ на Re: ResourceOwner refactoring  (Alexander Lakhin <exclusion@gmail.com>)
Ответы Re: ResourceOwner refactoring
Список pgsql-hackers
Thanks for the testing!

On 01/06/2024 11:00, Alexander Lakhin wrote:
> Hello Heikki,
> 
> I've stumbled upon yet another anomaly introduced with b8bff07da.
> 
> Please try the following script:
> SELECT 'init' FROM pg_create_logical_replication_slot('slot',
>     'test_decoding');
> CREATE TABLE tbl(t text);
> BEGIN;
> TRUNCATE table tbl;
> 
> SELECT data FROM pg_logical_slot_get_changes('slot', NULL, NULL,
>    'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
> 
> On b8bff07da (and the current HEAD), it ends up with:
> ERROR:  ResourceOwnerEnlarge called after release started
> 
> But on the previous commit, I get no error:
>    data
> ------
> (0 rows)

This is another case of the issue I tried to fix in commit b8bff07da 
with this:

> --- b/src/backend/access/transam/xact.c
> +++ a/src/backend/access/transam/xact.c
> @@ -5279,7 +5279,20 @@
>  
>          AtEOSubXact_RelationCshould be ache(false, s->subTransactionId,
>                                    s->parent->subTransactionId);
> +
> +
> +        /*
> +         * AtEOSubXact_Inval sometimes needs to temporarily bump the refcount
> +         * on the relcache entries that it processes.  We cannot use the
> +         * subtransaction's resource owner anymore, because we've already
> +         * started releasing it.  But we can use the parent resource owner.
> +         */
> +        CurrentResourceOwner = s->parent->curTransactionOwner;
> +
>          AtEOSubXact_Inval(false);
> +
> +        CurrentResourceOwner = s->curTransactionOwner;
> +
>          ResourceOwnerRelease(s->curTransactionOwner,
>                               RESOURCE_RELEASE_LOCKS,
>                               false, false);

AtEOSubXact_Inval calls LocalExecuteInvalidationMessage(), which 
ultimately calls RelationFlushRelation(). Your test case reaches 
ReorderBufferExecuteInvalidations(), which also calls 
LocalExecuteInvalidationMessage(), in an already aborted subtransaction.

Looking closer at relcache.c, I think we need to make 
RelationFlushRelation() safe to call without a valid resource owner. 
There are already IsTransactionState() checks in 
RelationClearRelation(), and a comment noting that it does not touch 
CurrentResourceOwner. So we should make sure RelationFlushRelation() 
doesn't touch it either, and revert the above change to 
AbortTransaction(). I didn't feel very good about it in the first place, 
and now I see what the proper fix is.

A straightforward fix is to modify RelationFlushRelation() so that if 
!IsTransactionState(), it just marks the entry as invalid instead of 
calling RelationClearRelation(). That's what RelationClearRelation() 
would do anyway, if it didn't hit the assertion first.

RelationClearRelation() is complicated. Depending on the circumstances, 
it removes the relcache entry, rebuilds it, marks it as invalid, or 
performs a partial reload of a nailed relation. So before going for the 
straightforward fix, I'm going to see if I can refactor 
RelationClearRelation() to be less complicated.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: POC: GROUP BY optimization
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Test to dump and restore objects left behind by regression