Обсуждение: Add memory context type to pg_backend_memory_contexts view
In [1] Andres mentioned that there's no way to determine the memory context type in pg_backend_memory_contexts. This is a bit annoying as I'd like to add a test to exercise BumpStats(). Having the context type in the test's expected output helps ensure we are exercising BumpStats() and any future changes to the choice of context type in tuplesort.c gets flagged up by the test breaking. It's probably too late for PG17, but I'll leave this here for the July CF. David [1] https://www.postgresql.org/message-id/20240415225749.xg7uq2hwuq2qmwpg@awork3.anarazel.de
Вложения
On Tue, 16 Apr 2024 at 13:30, David Rowley <dgrowleyml@gmail.com> wrote: > In [1] Andres mentioned that there's no way to determine the memory > context type in pg_backend_memory_contexts. This is a bit annoying as > I'd like to add a test to exercise BumpStats(). > > Having the context type in the test's expected output helps ensure we > are exercising BumpStats() and any future changes to the choice of > context type in tuplesort.c gets flagged up by the test breaking. bea97cd02 added a new regression test in sysviews.sql to call pg_backend_memory_contexts to test the BumpStats() function. The attached updates the v1 patch to add the new type column to the new call to pg_backend_memory_contexts() to ensure the type = "Bump" No other changes. David
Вложения
Hi David, Giving this a once-through, this seems straightforward and useful. I have a slight preference for keeping "name" the first field in the view and moving "type" to the second, but that's minor. Just confirming that the allocator types are not extensible without a recompile, since it's using a specific node tag to switch on, so there are no concerns with not properly displaying the output of something else. The "????" text placeholder might be more appropriate as "<unknown>", or perhaps stronger, include a WARNING in the logs, since an unknown tag at this point would be an indication of some sort of memory corruption. Since there are only four possible values, I think there would be utility in including them in the docs for this field. I also think it would be useful to have some sort of comments at least in mmgr/README to indicate that if a new type of allocator is introduced that you will also need to add the node to the function for this type, since it's not an automatic conversion. (For that matter, instead of switching on node type and outputting a given string, is there a generic function that could just give us the string value for node type so we don't need to teach anything else about it anyway?) Thanks, David
On Fri, 31 May 2024 at 07:21, David Christensen <david+pg@pgguru.net> wrote: > Giving this a once-through, this seems straightforward and useful. I > have a slight preference for keeping "name" the first field in the > view and moving "type" to the second, but that's minor. Not having it first make sense, but I don't think putting it between name and ident is a good idea. I think name and ident belong next to each other. parent likely should come after those since that also relates to the name. How about putting it after "parent"? > Just confirming that the allocator types are not extensible without a > recompile, since it's using a specific node tag to switch on, so there > are no concerns with not properly displaying the output of something > else. They're not extensible. > The "????" text placeholder might be more appropriate as "<unknown>", > or perhaps stronger, include a WARNING in the logs, since an unknown > tag at this point would be an indication of some sort of memory > corruption. This follows what we do in other places. If you look at explain.c, you'll see lots of "???"s. I think if you're worried about corrupted memory, then garbled output in pg_get_backend_memory_contexts wouldn't be that high on the list of concerns. > Since there are only four possible values, I think there would be > utility in including them in the docs for this field. I'm not sure about this. We do try not to expose too much internal detail in the docs. I don't know all the reasons for that, but at least one reason is that it's easy for things to get outdated as code evolves. I'm also unsure how much value there is in listing 4 possible values unless we were to document the meaning of each of those values, and doing that puts us even further down the path of detailing Postgres internals in the documents. I don't think that's a maintenance burden that's often worth the trouble. > I also think it > would be useful to have some sort of comments at least in mmgr/README > to indicate that if a new type of allocator is introduced that you > will also need to add the node to the function for this type, since > it's not an automatic conversion. I don't think it's sustainable to do this. If we have to maintain documentation that lists everything you must do in order to add some new node types then I feel it's just going to get outdated as soon as someone adds something new that needs to be done. I'm only one developer, but for me, I'd not even bother looking there if I was planning to add a new memory context type. What I would be doing is searching the entire code base for where special handling is done for the existing types and ensuring I consider if I need to include a case for the new node type. In this case, I'd probably choose to search for "T_AllocSetContext", and I'd quickly land on PutMemoryContextsStatsTupleStore() and update it. This method doesn't get outdated, provided you do "git pull" occasionally. > (For that matter, instead of > switching on node type and outputting a given string, is there a > generic function that could just give us the string value for node > type so we don't need to teach anything else about it anyway?) There isn't. nodeToString() does take some node types as inputs and serialise those to something JSON-like, but that includes serialising each field of the node type too. The memory context types are not handled by those functions. I think it's fine to copy what's done in explain.c. "git grep \"???\" -- *.c | wc -l" gives me 31 occurrences, so I'm not doing anything new. I've attached an updated patch which changes the position of the new column in the view. Thank you for the review. David
Вложения
> On May 30, 2024, at 5:36 PM, David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 31 May 2024 at 07:21, David Christensen <david+pg@pgguru.net> wrote: >> Giving this a once-through, this seems straightforward and useful. I >> have a slight preference for keeping "name" the first field in the >> view and moving "type" to the second, but that's minor. > > Not having it first make sense, but I don't think putting it between > name and ident is a good idea. I think name and ident belong next to > each other. parent likely should come after those since that also > relates to the name. > > How about putting it after "parent"? That works for me. I skimmed the new patch and it seems fine but on my phone so didn’t do any build tests. >> Just confirming that the allocator types are not extensible without a >> recompile, since it's using a specific node tag to switch on, so there >> are no concerns with not properly displaying the output of something >> else. > > They're not extensible. Good to confirm. >> The "????" text placeholder might be more appropriate as "<unknown>", >> or perhaps stronger, include a WARNING in the logs, since an unknown >> tag at this point would be an indication of some sort of memory >> corruption. > > This follows what we do in other places. If you look at explain.c, > you'll see lots of "???"s. > > I think if you're worried about corrupted memory, then garbled output > in pg_get_backend_memory_contexts wouldn't be that high on the list of > concerns. Heh, indeed. +1 for precedent. >> Since there are only four possible values, I think there would be >> utility in including them in the docs for this field. > > I'm not sure about this. We do try not to expose too much internal > detail in the docs. I don't know all the reasons for that, but at > least one reason is that it's easy for things to get outdated as code > evolves. I'm also unsure how much value there is in listing 4 possible > values unless we were to document the meaning of each of those values, > and doing that puts us even further down the path of detailing > Postgres internals in the documents. I don't think that's a > maintenance burden that's often worth the trouble. I can see that and it’s consistent with what we do, just was thinking as a user that that may be useful, but if you’re usingthis view you likely already know what it means. >> I also think it >> would be useful to have some sort of comments at least in mmgr/README >> to indicate that if a new type of allocator is introduced that you >> will also need to add the node to the function for this type, since >> it's not an automatic conversion. > > I don't think it's sustainable to do this. If we have to maintain > documentation that lists everything you must do in order to add some > new node types then I feel it's just going to get outdated as soon as > someone adds something new that needs to be done. I'm only one > developer, but for me, I'd not even bother looking there if I was > planning to add a new memory context type. > > What I would be doing is searching the entire code base for where > special handling is done for the existing types and ensuring I > consider if I need to include a case for the new node type. In this > case, I'd probably choose to search for "T_AllocSetContext", and I'd > quickly land on PutMemoryContextsStatsTupleStore() and update it. This > method doesn't get outdated, provided you do "git pull" occasionally. Fair. >> (For that matter, instead of >> switching on node type and outputting a given string, is there a >> generic function that could just give us the string value for node >> type so we don't need to teach anything else about it anyway?) > > There isn't. nodeToString() does take some node types as inputs and > serialise those to something JSON-like, but that includes serialising > each field of the node type too. The memory context types are not > handled by those functions. I think it's fine to copy what's done in > explain.c. "git grep \"???\" -- *.c | wc -l" gives me 31 occurrences, > so I'm not doing anything new. > > I've attached an updated patch which changes the position of the new > column in the view. +1
On Fri, May 31, 2024 at 12:35:58PM +1200, David Rowley wrote: > This follows what we do in other places. If you look at explain.c, > you'll see lots of "???"s. > > I think if you're worried about corrupted memory, then garbled output > in pg_get_backend_memory_contexts wouldn't be that high on the list of > concerns. + const char *type; [...] + switch (context->type) + { + case T_AllocSetContext: + type = "AllocSet"; + break; + case T_GenerationContext: + type = "Generation"; + break; + case T_SlabContext: + type = "Slab"; + break; + case T_BumpContext: + type = "Bump"; + break; + default: + type = "???"; + break; + } Yeah, it's a common practice to use that as fallback. What you are doing is OK, and it is not possible to remove the default case as these are nodetags to generate warnings if a new value needs to be added. This patch looks like a good idea, so +1 from here. (PS: catversion bump). -- Michael
Вложения
On Sat, 1 Jun 2024 at 12:55, Michael Paquier <michael@paquier.xyz> wrote: > This patch looks like a good idea, so +1 from here. Thank you to both of you for reviewing this. I've now pushed the patch. David