Thread: index access method documentation light on details on ii_AmCache

index access method documentation light on details on ii_AmCache

From
PG Doc comments form
Date:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/15/index-functions.html
Description:

So, if I cache something in ii_AmCache during a call to my aminsert
callback...

When, if ever, does it get freed?

Having looked at example code, I don't actually see anything doing this in
insert paths, so presumably there's some point at which this happens
automatically, possibly as part of the Memory Context thing, maybe related
to the ii_Context which seems to be getting used, but I can't find anything
anywhere documenting that. This may well be completely obvious, or intended
to be implied by "it can allocate space in indexInfo->ii_Context", but it's
not exceptionally obvious to me as a newcomer to the code. (By contrast, the
ambuild docs say to palloc a data structure, but don't mention a context for
it; no idea whether it should be in a particular context.)

Actually, in full generality, I have not been able to find a section of the
documentation which explains the memory-context stuff at all. I found a blog
post elsewhere suggesting that it's just "the memory context will be freed
and thus everything associated with it". This implies that there's no
straightforward way for an index to do end-of-insert maintenance after all
the inserts from a given query are complete, except to do it after every
tuple just in case it's the last tuple, I guess?

Re: index access method documentation light on details on ii_AmCache

From
"Euler Taveira"
Date:
On Tue, Sep 12, 2023, at 2:36 PM, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Description:
>
> So, if I cache something in ii_AmCache during a call to my aminsert
> callback...
>
> When, if ever, does it get freed?
>
> Having looked at example code, I don't actually see anything doing this in
> insert paths, so presumably there's some point at which this happens
> automatically, possibly as part of the Memory Context thing, maybe related
> to the ii_Context which seems to be getting used, but I can't find anything
> anywhere documenting that. This may well be completely obvious, or intended
> to be implied by "it can allocate space in indexInfo->ii_Context", but it's
> not exceptionally obvious to me as a newcomer to the code. (By contrast, the
> ambuild docs say to palloc a data structure, but don't mention a context for
> it; no idea whether it should be in a particular context.)


Maybe it is not clear from the comment in the IndexInfo struct, but ii_AmCache
is a pointer to the AM state information, which is stored in the ii_Context
memory context (see gistinsert, gininsert or brininsert to understand how
ii_Context and ii_AmCache are used). The page that you shared also has a
sentence with this information.

Unless you change it, ii_Context is CurrentMemoryContext (see makeIndexInfo).
Hence, your AM state information is freed when the current memory context is
freed. You can use gdb to figure out which memory context you are in.
Start a session, create a table and an index on it. Attach gdb to that
session and set a breakpoint in one of the AM insert functions that I
mentioned in the previous paragraph. Insert a row and continue:

(gdb) b initGinState
Breakpoint 1 at 0x55a47b179d80: file ginutil.c, line 98.
(gdb) c
Continuing.

Breakpoint 1, initGinState (state=state@entry=0x55a47d40cd58, index=index@entry=0x7f1970352d40)
    at ginutil.c:98
(gdb) bt
#0  initGinState (state=state@entry=0x55a47d40cd58, index=index@entry=0x7f1970352d40) at ginutil.c:98
#1  0x000055a47b178255 in gininsert (index=0x7f1970352d40, values=0x7fff31d04a80, isnull=0x7fff31d04a60, 
    ht_ctid=0x55a47d415ef8, heapRel=<optimized out>, checkUnique=<optimized out>, indexUnchanged=false, 
    indexInfo=0x55a47d4164f8) at gininsert.c:502
#2  0x000055a47b3100f6 in ExecInsertIndexTuples (resultRelInfo=resultRelInfo@entry=0x55a47d415328, 
    slot=slot@entry=0x55a47d415ec8, estate=estate@entry=0x55a47d414e98, update=update@entry=false, 
    noDupErr=noDupErr@entry=false, specConflict=specConflict@entry=0x0, arbiterIndexes=0x0, 
    onlySummarizing=false) at execIndexing.c:432
.
.
.
(gdb) f 1
#1  0x000055a47b178255 in gininsert (index=0x7f1970352d40, values=0x7fff31d04a80, isnull=0x7fff31d04a60,
    ht_ctid=0x55a47d415ef8, heapRel=<optimized out>, checkUnique=<optimized out>, indexUnchanged=false,
    indexInfo=0x55a47d4164f8) at gininsert.c:502
(gdb) p *indexInfo->ii_Context
$3 = {type = T_AllocSetContext, isReset = false, allowInCritSection = false, mem_allocated = 17912, 
  methods = 0x55a47b9302f0 <mcxt_methods+240>, parent = 0x55a47d3fdda0, firstchild = 0x55a47d408cd0, 
  prevchild = 0x0, nextchild = 0x0, name = 0x55a47b751813 "ExecutorState", ident = 0x0, reset_cbs = 0x0}

"ExecutorState" is a per-query memory context, so it is deallocated when the
query ends.
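
The lifetime described above can be illustrated with a toy model in standalone C. This is only a sketch: ToyContext, toy_alloc, toy_reset, ToyIndexInfo, ToyAmState, toy_insert and toy_run_statement are hypothetical names invented for illustration, not PostgreSQL APIs, and the real allocator is far more elaborate. It assumes, as stated above, that the AM cache lives in a per-query context that is released when the query ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a memory context: a linked list of owned allocations. */
typedef struct ToyChunk { struct ToyChunk *next; } ToyChunk;
typedef struct ToyContext { ToyChunk *chunks; size_t live; } ToyContext;

/* Allocate size bytes owned by ctx (loosely analogous to palloc in a context). */
void *toy_alloc(ToyContext *ctx, size_t size)
{
    ToyChunk *c = malloc(sizeof(ToyChunk) + size);
    assert(c != NULL);
    c->next = ctx->chunks;
    ctx->chunks = c;
    ctx->live++;
    return c + 1;               /* payload starts right after the header */
}

/* Release everything owned by ctx in one sweep. */
void toy_reset(ToyContext *ctx)
{
    while (ctx->chunks != NULL) {
        ToyChunk *next = ctx->chunks->next;
        free(ctx->chunks);
        ctx->chunks = next;
    }
    ctx->live = 0;
}

/* Hypothetical analogue of IndexInfo: a context plus an opaque AM cache. */
typedef struct ToyIndexInfo { ToyContext *context; void *am_cache; } ToyIndexInfo;

/* Hypothetical per-statement AM state. */
typedef struct ToyAmState { int inserts_seen; } ToyAmState;

/* Analogue of an aminsert callback: build the state once, then reuse it. */
int toy_insert(ToyIndexInfo *info)
{
    ToyAmState *state = info->am_cache;

    if (state == NULL) {
        /* First call in this statement: allocate in the long-lived context. */
        state = toy_alloc(info->context, sizeof(ToyAmState));
        state->inserts_seen = 0;
        info->am_cache = state;
    }
    return ++state->inserts_seen;
}

/* Simulate one statement inserting n rows; returns the number of
 * allocations that were live just before the end-of-query cleanup. */
size_t toy_run_statement(int n)
{
    ToyContext query_ctx = { NULL, 0 };
    ToyIndexInfo info = { &query_ctx, NULL };
    size_t live;

    for (int i = 0; i < n; i++)
        toy_insert(&info);
    live = query_ctx.live;
    toy_reset(&query_ctx);      /* the "query ends" step frees the cache */
    return live;
}
```

Calling toy_run_statement(1000) performs a thousand inserts but reports a single live allocation, and that allocation is released by the one toy_reset call. Nothing in the insert path frees the cache.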

> Actually, in full generality, I have not been able to find a section of the
> documentation which explains the memory-context stuff at all. I found a blog
> post elsewhere suggesting that it's just "the memory context will be freed
> and thus everything associated with it". This implies that there's no
> straightforward way for an index to do end-of-insert maintenance after all
> the inserts from a given query are complete, except to do it after every
> tuple just in case it's the last tuple, I guess?


Check src/backend/utils/mmgr/README.


--
Euler Taveira

Re: index access method documentation light on details on ii_AmCache

From
Seebs
Date:
On Wed, 13 Sep 2023 15:48:41 -0300
"Euler Taveira" <euler@eulerto.com> wrote:

> Unless you change it, ii_Context is CurrentMemoryContext (see
> makeIndexInfo). Hence, your AM state information is freed when the
> current memory context is freed.

A thing I am now wondering:

Is there anything in the postgresql documentation which explains this,
or says when a given memory context will be freed? Like, is the
ii_Context there for the lifetime of the index? For the lifetime of the
query? For the lifetime of a single insert operation?

I've found a reference to it in a third-party blog post, but I actually
can't find anything in the docs explaining memory contexts, and a bit
of experimenting makes me think they are doing things that aren't
obvious to me. For instance, I tried *not* deleting a memory context
when done with it, then looping doing a query which tried to create and
populate it, and... memory usage did not go up. Making me think that
maybe it gets reused? Or maybe it's implicitly deleted because it had
a parent context?

-s
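
On the parent-context guess above: the gdb output earlier in the thread shows that each context carries parent, firstchild and nextchild links, and, as far as the mmgr README describes it, deleting a context also deletes its children. A toy model of that cascade in standalone C might look like the following. ToyContext, toy_context_create, toy_context_delete, toy_live_contexts and toy_leaky_loop are hypothetical names for illustration only, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy context node; field names mirror those visible in the gdb dump. */
typedef struct ToyContext {
    struct ToyContext *parent;
    struct ToyContext *firstchild;
    struct ToyContext *nextchild;
} ToyContext;

/* Number of toy contexts currently alive (for observing leaks). */
int toy_live_contexts = 0;

/* Create a context and link it under parent (parent may be NULL). */
ToyContext *toy_context_create(ToyContext *parent)
{
    ToyContext *ctx = calloc(1, sizeof(ToyContext));
    assert(ctx != NULL);
    ctx->parent = parent;
    if (parent != NULL) {
        ctx->nextchild = parent->firstchild;
        parent->firstchild = ctx;
    }
    toy_live_contexts++;
    return ctx;
}

/* Delete a context together with all of its descendants. */
void toy_context_delete(ToyContext *ctx)
{
    /* Each recursive call unlinks the child, so firstchild advances. */
    while (ctx->firstchild != NULL)
        toy_context_delete(ctx->firstchild);

    /* Unlink this context from its parent's child list. */
    if (ctx->parent != NULL) {
        ToyContext **link = &ctx->parent->firstchild;
        while (*link != ctx)
            link = &(*link)->nextchild;
        *link = ctx->nextchild;
    }
    free(ctx);
    toy_live_contexts--;
}

/* Repeatedly create a child context and "forget" to delete it; the
 * enclosing per-query context is deleted at the end of each iteration.
 * Returns the number of contexts still alive afterwards. */
int toy_leaky_loop(int iterations)
{
    for (int i = 0; i < iterations; i++) {
        ToyContext *query = toy_context_create(NULL);
        toy_context_create(query);   /* never deleted explicitly */
        toy_context_delete(query);   /* takes the child with it */
    }
    return toy_live_contexts;
}
```

In this model the forgotten child never accumulates, because deleting the per-query parent removes it as well. That would be consistent with memory usage staying flat in the experiment described above, assuming the context there was created under a parent that is cleaned up per query.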



Re: index access method documentation light on details on ii_AmCache

From
"Euler Taveira"
Date:
On Wed, Sep 13, 2023, at 6:48 PM, Seebs wrote:
> On Wed, 13 Sep 2023 15:48:41 -0300
> "Euler Taveira" <euler@eulerto.com> wrote:
>
> > Unless you change it, ii_Context is CurrentMemoryContext (see
> > makeIndexInfo). Hence, your AM state information is freed when the
> > current memory context is freed.
>
> A thing I am now wondering:
>
> Is there anything in the postgresql documentation which explains this,
> or says when a given memory context will be freed? Like, is the
> ii_Context there for the lifetime of the index? For the lifetime of the
> query? For the lifetime of a single insert operation?

AFAICS there isn't a chapter dedicated to memory contexts in the documentation.
Did you check the README that I pointed out in the previous email? Most of the
developer information is available in README files in the source code. The
"Server Programming" and "Internals" parts of the documentation contain useful
information for Postgres hackers too.


--
Euler Taveira

Re: index access method documentation light on details on ii_AmCache

From
Seebs
Date:
On Thu, 14 Sep 2023 10:06:03 -0300
"Euler Taveira" <euler@eulerto.com> wrote:

> AFAICS there isn't a chapter dedicated to memory contexts in the
> documentation. Did you check the README that I pointed out in the
> previous email? Most of the developer information is available in
> README files in the source code. Server Programming and Internals
> contain useful information for Postgres hackers too.

... I did not check that out; I will go look at the READMEs. The
documentation had enough detail on things like "here's the list
of functions to implement" that I hadn't gone looking for other
documentation.

Thanks!

-s