Discussion: Recursive calls to functions that return sets
Imagine the following scenario:

Function 'A' returns SETOF 'x'. It will issue a query using SPI that calls function 'B'. This function returns SETOF 'y'. Each tuple of 'x' is formed from some data in 'y'. There will be millions of tuples, so building a set of 'y' in memory is not an option.

What would the recommended use of MemoryContexts in an SRF function be in order to make this work? The SPI_connect must be issued during the SRF_IS_FIRSTCALL() phase. The Portal that it creates must remain alive until it's time for the SRF_RETURN_DONE(). What would the recommended approach be to accomplish this efficiently (and without introducing a major memory leak)?

The problem I'm trying to solve is a generic one. It's very possible that the recursion is of arbitrary depth.

Regards,
Thomas Hallgren
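For reference, the value-per-call pattern under discussion uses the funcapi.h macros. A minimal sketch follows; `countdown` and its argument handling are placeholders, and a real function would build tuples rather than plain int4 datums:

```c
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(countdown);

Datum
countdown(PG_FUNCTION_ARGS)
{
	FuncCallContext *funcctx;

	if (SRF_IS_FIRSTCALL())
	{
		MemoryContext oldcxt;

		funcctx = SRF_FIRSTCALL_INIT();

		/*
		 * State that must survive across calls goes into
		 * multi_call_memory_ctx, which lives until SRF_RETURN_DONE.
		 */
		oldcxt = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
		funcctx->max_calls = PG_GETARG_INT32(0);
		MemoryContextSwitchTo(oldcxt);
	}

	funcctx = SRF_PERCALL_SETUP();

	if (funcctx->call_cntr < funcctx->max_calls)
		SRF_RETURN_NEXT(funcctx, Int32GetDatum((int32) funcctx->call_cntr));

	SRF_RETURN_DONE(funcctx);
}
```

The question in this thread is where an SPI connection and its Portal fit into that lifecycle.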
On Wed, Mar 22, 2006 at 03:31:59PM +0100, Thomas Hallgren wrote:
> Imagine the following scenario:
>
> Function 'A' returns SETOF 'x'. It will issue a query using SPI that
> calls function 'B'. This function returns SETOF 'y'.
> Each tuple of 'x' is formed from some data in 'y'.
> There will be millions of tuples so building a set of 'y' in memory is
> not an option.

I think you're running into a small limitation of set functions here. If you look at nodeFunctionScan.c, which handles this, you can see that the code is written in such a way as to collect all the tuples first before returning anything. Not sure why it does that; probably to handle mark/restore, though that isn't stated anywhere in the code.

> What would the recommended use of MemoryContexts in an SRF function be
> in order to make this work? The SPI_connect must be issued during the
> SRF_IS_FIRSTCALL() phase. The Portal that it creates must remain alive
> until it's time for the SRF_RETURN_DONE(). What would the recommended
> approach be to accomplish this efficiently (and without introducing a
> major memory leak)?

Well, I think this is done the normal way. The function returning values allocates them in its own context and does a RETURN NEXT. Once it has returned them it can free them, or reset the context if it prefers. The caller is always responsible for copying (since it isn't often needed).

Have you read the executor/README?

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
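Martijn's "allocate in your own context, then reset" advice can be sketched like this, assuming a hypothetical per-row context `rowcxt` created under multi_call_memory_ctx during the first call (the `ALLOCSET_DEFAULT_SIZES` and `heap_form_tuple` spellings follow current sources and may differ in older releases):

```c
/* First call: create a short-lived context owned by the SRF context. */
rowcxt = AllocSetContextCreate(funcctx->multi_call_memory_ctx,
                               "per-row workspace",
                               ALLOCSET_DEFAULT_SIZES);

/* Each subsequent call: throw away the previous row's garbage, then
 * build the next tuple in the short-lived context.  The datum only has
 * to survive until the caller copies it, i.e. until the next call. */
MemoryContextReset(rowcxt);
oldcxt = MemoryContextSwitchTo(rowcxt);
tuple = heap_form_tuple(tupdesc, values, nulls);
MemoryContextSwitchTo(oldcxt);
SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
```

This keeps per-row allocations bounded regardless of how many rows the function ultimately returns.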
Martijn van Oosterhout <kleptog@svana.org> writes:
> I think you're running into a small limitation of set functions here.
> If you look at nodeFunctionScan.c that handles this, you can see that
> the code is written in such a way as to collect all the tuples first
> before returning anything.

I don't think Thomas intended to go through nodeFunctionScan, so this needn't apply to him.

> Not sure why it does that,

plpgsql and similar languages will return a tuplestore anyway, so it has to handle that case, and it was convenient to make all the cases look alike for starters. Nobody's yet gone back to improve it for the case of languages that return a tuple per call.

			regards, tom lane
Tom Lane wrote:
> plpgsql and similar languages will return a tuplestore anyway, so it has
> to handle that case, and it was convenient to make all the cases look
> alike for starters. Nobody's yet gone back to improve it for the case
> of languages that return a tuple per call.

This would be hard to do in the plperl case, at least, and I would be surprised if it weren't in most others too. So what plperl does is to fetch the whole set on the first call and then fudge all the other calls to get the next element from the result set. We save out the intermediate tuple store on each call and restore it afterwards, so I think recursion shouldn't be a difficulty.

cheers

andrew
Recursive calls work in PL/Java. No problem there. But the larger the set, the more memory it consumes. Do I read your answers correctly if I conclude this is a known limitation when SPI is used? I.e. there's no way to stream one row at a time without ever building the full set?

Regards,
Thomas Hallgren

Andrew Dunstan wrote:
> Tom Lane wrote:
>> plpgsql and similar languages will return a tuplestore anyway, so it has
>> to handle that case, and it was convenient to make all the cases look
>> alike for starters. Nobody's yet gone back to improve it for the case
>> of languages that return a tuple per call.
>
> This would be hard to do in the plperl case, at least, and I would be
> surprised if it weren't in most others too. So what plperl does is to
> fetch the whole set on the first call and then fudge all the other
> calls to get the next element from the result set. We save out the
> intermediate tuple store on each call and restore it afterwards, so I
> think recursion shouldn't be a difficulty.
>
> cheers
>
> andrew
Thomas Hallgren wrote:
> Recursive calls work in PL/Java. No problem there. But the larger the
> set, the more memory it consumes. Do I read your answers correctly if I
> conclude this is a known limitation when SPI is used? I.e. there's no
> way to stream one row at a time without ever building the full set?

Hmm, are you using a tuplestore? The PL/php code for return_next looks like this:

    ZEND_FUNCTION(return_next)
    {
        /* ... some stuff ... */

        /* Use the per-query context so that the tuplestore survives */
        oldcxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);

        /* Form the tuple */
        tup = plphp_srf_htup_from_zval(param, current_attinmeta, current_memcxt);

        /* First call?  Create the tuplestore. */
        if (!current_tuplestore)
            current_tuplestore = tuplestore_begin_heap(true, false, work_mem);

        /* Save the tuple and clean up */
        tuplestore_puttuple(current_tuplestore, tup);
        heap_freetuple(tup);

        MemoryContextSwitchTo(oldcxt);
    }

-- 
Alvaro Herrera                     http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Thomas Hallgren <thomas@tada.se> writes:
> Recursive calls work in PL/Java. No problem there. But the larger the
> set, the more memory it consumes. Do I read your answers correctly if I
> conclude this is a known limitation when SPI is used? I.e. there's no
> way to stream one row at a time without ever building the full set?

By no means. The point is that there are some callers of SRFs that are going to materialize the result set, as well as some SRFs that are going to hand back a materialized result set anyway. The interface can handle a tuple-per-call but that's not the way everybody chooses to use it.

			regards, tom lane
Thomas Hallgren wrote:
> Recursive calls work in PL/Java. No problem there. But the larger the
> set, the more memory it consumes. Do I read your answers correctly if
> I conclude this is a known limitation when SPI is used? I.e. there's
> no way to stream one row at a time without ever building the full set?

plperl stashes the results in a tuplestore object, which spills to disk, so memory use is not unbounded. Before 8.1 we had no return_next and no intermediate tuplestore, so we had serious memory problems with returning large sets. As for SPI calls, we also had problems there, but now we provide a cursor interface that works much more nicely.

cheers

andrew
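The cursor interface Andrew mentions lets a caller pull rows from SPI in batches instead of materializing the whole result set at once. Roughly, with error checking omitted, the query text illustrative, and the batch size of 100 arbitrary:

```c
SPI_connect();

plan = SPI_prepare("SELECT ... FROM some_func()", 0, NULL);
portal = SPI_cursor_open(NULL, plan, NULL, NULL, true);

for (;;)
{
	SPI_cursor_fetch(portal, true, 100);	/* forward, 100 rows at a time */
	if (SPI_processed == 0)
		break;

	for (i = 0; i < SPI_processed; i++)
	{
		HeapTuple tup = SPI_tuptable->vals[i];
		/* ... process one row ... */
	}
	SPI_freetuptable(SPI_tuptable);
}

SPI_cursor_close(portal);
SPI_finish();
```

Only one batch of tuples is ever held in SPI_tuptable at a time, so memory use stays bounded by the batch size rather than the result size.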
Tom Lane wrote:
> Thomas Hallgren <thomas@tada.se> writes:
>> Recursive calls work in PL/Java. No problem there. But the larger the
>> set, the more memory it consumes. Do I read your answers correctly if I
>> conclude this is a known limitation when SPI is used? I.e. there's no
>> way to stream one row at a time without ever building the full set?
>
> By no means. The point is that there are some callers of SRFs that are
> going to materialize the result set, as well as some SRFs that are going
> to hand back a materialized result set anyway. The interface can handle
> a tuple-per-call but that's not the way everybody chooses to use it.

OK. I've managed to get rid of my last memory leak (I hope). I followed Martijn's suggestion to create the returned tuple in my own context. Now even the nastiest recursive chains using huge sets of data seem to behave OK :-)

There's one thing that's still a bit fuzzy to me. If I don't use SPI, the context that is current when my SRF function is called seems to be reset between each call. I can palloc stuff in it as much as I like. I can even create the tuple that I return using this context. No memory leak. But if I, during the SRF_IS_FIRSTCALL phase, do an SPI_connect (done when the 'multi_call_memory_ctx' is current), then the leak seems to occur immediately. Will that connect somehow alter the durability of the context that is current on each call to my SRF?

Regards,
Thomas Hallgren
On Wed, Mar 22, 2006 at 09:09:34PM +0100, Thomas Hallgren wrote:
> There's one thing that's still a bit fuzzy to me. If I don't use SPI,
> the context that is current when my SRF function is called seems to be
> reset between each call. I can palloc stuff in it as much as I like. I
> can even create the tuple that I return using this context. No memory
> leak. But if I, during the SRF_IS_FIRSTCALL phase, do an SPI_connect
> (done when the 'multi_call_memory_ctx' is current), then the leak seems
> to occur immediately. Will that connect somehow alter the durability of
> the context that is current on each call to my SRF?

Ok, I'm not sure if I understand the reasoning, but I think it's like this:

- When the results of an SRF are accumulated by ExecMakeTableFunctionResult,
  that function resets your context each time.
- When you call SPI_connect it creates a new context and switches to it.
  It switches back on SPI_finish.

SPI_finish switches to the context that was active at SPI_connect; maybe this is not what you expect? ExecMakeTableFunctionResult only resets the one context, the one provided when your function starts; anything created in other contexts is Somebody Else's Problem.

So the question is: which context are you allocating in?

Hope this clarifies it,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
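In other words, after SPI_connect the current context is SPI's own procedure context, so any allocation meant to outlive SPI_finish has to switch contexts explicitly. A sketch of the pattern; `MyState` and `state` are hypothetical names:

```c
SPI_connect();              /* now in SPI's procedure memory context */

/* Long-lived state must go elsewhere, e.g. the SRF's context: */
oldcxt = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
state = (MyState *) palloc(sizeof(MyState));	/* survives SPI_finish */
MemoryContextSwitchTo(oldcxt);

/* ... run queries; results in SPI_tuptable live in SPI's context ... */

SPI_finish();               /* frees SPI's context, restores the old one */
```

Anything left in SPI's context simply vanishes at SPI_finish; anything accidentally put in a longer-lived context without being freed looks like a leak.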
Thomas Hallgren <thomas@tada.se> writes:
> But if I, during the SRF_IS_FIRSTCALL phase, do an SPI_connect
> (done when the 'multi_call_memory_ctx' is current), then the leak seems
> to occur immediately. Will that connect somehow alter the durability of
> the context that is current on each call to my SRF?

Are you remembering to SPI_finish when you're done? Maybe what you are leaking is the SPI state information.

You could look at the MemoryContextStats printout for clues --- easiest way is to deliberately run the backend out of memory, and after the "out of memory" error occurs, look in the postmaster log.

			regards, tom lane