On 2024-Apr-18, Tender Wang wrote:
> Now we switch to gdb2, breakpoint at RelationCacheInvalidateEntry(). We
> continue gdb2, and we will
> stop at RelationCacheInvalidateEntry(). And we will see that p relation
> cache item will be cleared.
> The backtrace will be attached at the end of this email.
> It happened in deconstruct_jointree() (called by query_planner()), where the planner had
> got the partition information (parts is 2). After RelationCacheInvalidateEntry(), we will re-get
> the partition information in the executor init phase (parts is 1).
Here is where I think the problem occurs -- I mean, surely
PlanCacheRelCallback marking the plan as ->is_valid=false should cause
the prepared query to be replanned, and at that point the replan would
see that the partition is no more. So by the time we get to this:
> Entering ExecInitAppend(), because part_prune_info is not null, so we will
> enter CreatePartitionPruneState().
> We enter find_inheritance_children_extended() again to get partdesc, but in
> gdb1 we have done DetachPartitionFinalize()
> and the detach has committed. So we only get one tuple and parts is 1.
we have made a new plan, one whose planner partition descriptor has only
one partition, so it'd match what ExecInitAppend sees.
Evidently there's something I'm missing in how this plancache
invalidation works.
Hmm, I don't think this issue is closely related to plancache invalidation.
The scenario I created in [1] has no cached plan, so we will create a new plan.
After session1 (doing the detach) updated pg_inherits and before its first xact committed,
session2 (doing the execute) got its snapshot.
Session1 called WaitForLockersMultiple() before session2 got the parent relation lock.
When session2 then gets the partition information in the planner phase,
the pg_inherits tuple that session1 updated is visible to session2;
find_inheritance_children_extended() has the code below:
------
xmin = HeapTupleHeaderGetXmin(inheritsTuple->t_data);
snap = GetActiveSnapshot();
if (!XidInMVCCSnapshot(xmin, snap))
------
So the planner will have 2 parts.
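
To make that check concrete, here is a toy model of it. This is not PostgreSQL code: ToyXid, ToySnapshot and both functions are made-up names, and xid wraparound is ignored. It assumes, as I read the source, that a detach-pending child is skipped only when the xid that updated its pg_inherits tuple is no longer in progress according to the active snapshot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId and SnapshotData. */
typedef uint32_t ToyXid;

typedef struct ToySnapshot
{
    ToyXid        xmin;  /* every xid below this had finished */
    ToyXid        xmax;  /* every xid at or above this had not finished */
    const ToyXid *xip;   /* xids in [xmin, xmax) that were still running */
    size_t        xcnt;  /* number of entries in xip */
} ToySnapshot;

/* Simplified analogue of XidInMVCCSnapshot(): true if xid was still
 * in progress when the snapshot was taken. Ignores wraparound. */
static bool
toy_xid_in_snapshot(ToyXid xid, const ToySnapshot *snap)
{
    if (xid < snap->xmin)
        return false;
    if (xid >= snap->xmax)
        return true;
    for (size_t i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;
    }
    return false;
}

/* A detach-pending child stays in the list only while the transaction
 * that marked it is still in progress for this snapshot. */
static bool
toy_keep_detach_pending_child(ToyXid tuple_xmin, const ToySnapshot *snap)
{
    return toy_xid_in_snapshot(tuple_xmin, snap);
}
```

In the scenario above, session2's snapshot was taken while session1's first xact was still running, so the detach-pending partition is kept and the planner counts 2 parts.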
Session1 continues with the detach work; because it has got past WaitForLockersMultiple(), it will
call RemoveInheritance() to delete the tuple and add an invalidation message for the parent relation.
Session2 will enter RelationCacheInvalidateEntry() in deconstruct_jointree(), so the partition information will
be cleared.
When ExecInitAppend() is called, we get the partition information again. Session1 has finished its work by then, so session2
will get 1 pg_inherits tuple.
Finally, we hit the assert.
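
The whole sequence can be sketched with a toy catalog. Again, this is only an illustration under my assumptions: ToyInheritsRow and toy_count_children() are made-up names, the snapshot check is reduced to one boolean per row, and a two-partition parent is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one pg_inherits row as a reader sees it. */
typedef struct ToyInheritsRow
{
    const char *child;               /* partition name, for illustration only */
    bool        detach_pending;      /* marked by the first detach transaction */
    bool        updater_in_snapshot; /* was that transaction still in progress
                                      * for the reader's snapshot? */
    bool        deleted;             /* row removed by the final detach step */
} ToyInheritsRow;

/* Count the children a reader would see: deleted rows are gone, and a
 * detach-pending row is skipped once its updater is no longer in progress. */
static int
toy_count_children(const ToyInheritsRow *rows, size_t nrows)
{
    int n = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].deleted)
            continue;
        if (rows[i].detach_pending && !rows[i].updater_in_snapshot)
            continue;
        n++;
    }
    return n;
}
```

Counting once while the second row is detach-pending with its updater still in progress gives 2 (the planner's view); setting deleted on that row and counting again gives 1 (the executor's view), which is the mismatch described above.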
1. The updated pg_inherits tuple should be visible to the select.