Обсуждение: using CustomScan to inject nodes into the plan
Hi there, I've been experimenting with using CustomScan to inject nodes into the plan - I'm experimenting a bit, and this seemed like a nice way to do that in an extension, outside the tree. Sadly set_rel_pathlist_hook is not flexible enough, because it only allows overriding paths for base relations, while I'd like to inject nodes above joins, for example, so instead of NesterLoop -> NestedLoop -> ... I could do so something like this NesterLoop -> (my node) -> NestedLoop -> ... Fair enough - CustomScan is only meant for base relations, so I get the hook only fires from set_base_rel_pathlists(). Luckily, there's also planner_hook() where I can inject the CustomScan node wherever I want, and just pass the executor calls to the child node in the usual way (ExecInitNode, ExecProcNode, ...). The one problem with this 'combined' solution however is that CustomScan requires scanrelid - a valid index into the range table. When injecting the node directly above a Scan node, that seems to work just fine (just use the same value), but on other places (e.g. above a join) that's not possible :-( I see three options: (1) creating a fake range table entry in the planner_hook (but for which relation to choose?) (2) reusing an existing range table entry (but which one?) (3) allowing CustomScan either without a valid index into a range table (or maybe referencing join relations and such) The first two options seem quite ugly to me - fragile and error prone. What about the third one? I realize this is probably a misuse of the CustomScan API, but it's the best way to inject external nodes into the plan that I'm aware of. BTW: I know there's a custom-join patch in the current commitfest, but I'd like to stress out I'm not trying to replace the join itself - I'd like to inject a node below / above it. regards -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Mar 11, 2015 at 3:48 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > The one problem with this 'combined' solution however is that CustomScan > requires scanrelid - a valid index into the range table. When injecting > the node directly above a Scan node, that seems to work just fine (just > use the same value), but on other places (e.g. above a join) that's not > possible :-( > > I see three options: > > (1) creating a fake range table entry in the planner_hook (but for > which relation to choose?) > > (2) reusing an existing range table entry (but which one?) > > (3) allowing CustomScan either without a valid index into a range > table (or maybe referencing join relations and such) > > The first two options seem quite ugly to me - fragile and error prone. > What about the third one? > > I realize this is probably a misuse of the CustomScan API, but it's the > best way to inject external nodes into the plan that I'm aware of. > > BTW: I know there's a custom-join patch in the current commitfest, but > I'd like to stress out I'm not trying to replace the join itself - I'd > like to inject a node below / above it. I think you might want to have a look at the custom-join patch, because I think it adopts approach #3 from the list above, and it would be good to know whether that approach works for you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > I've been experimenting with using CustomScan to inject nodes into the > plan - I'm experimenting a bit, and this seemed like a nice way to do > that in an extension, outside the tree. > Sadly set_rel_pathlist_hook is not flexible enough, because it only > allows overriding paths for base relations, while I'd like to inject > nodes above joins, for example, so instead of It's Custom*Scan*, not Custom-anything-you-want. It's only supposed to exist at the bottom level of the plan tree, and that's why it does not have fields that would allow using it somewhere else. Quite aside from the scanrelid thing, haven't you noticed that it lacks any link to child plan nodes? I'm not terribly on board with this whole concept of adding plan nodes from extensions; I do not believe that that approach will ever lead to anything except horrid kluges, bad planning decisions, and requests for ever-more-outre hooks to shift the kluginess somewhere else. But as Robert says, the custom join thing is more likely to be usable for this, and it would be interesting to hear just how bad the impedance mismatch is if you try that. regards, tom lane
On 11.3.2015 21:53, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> I've been experimenting with using CustomScan to inject nodes into the >> plan - I'm experimenting a bit, and this seemed like a nice way to do >> that in an extension, outside the tree. > >> Sadly set_rel_pathlist_hook is not flexible enough, because it only >> allows overriding paths for base relations, while I'd like to inject >> nodes above joins, for example, so instead of > > It's Custom*Scan*, not Custom-anything-you-want. It's only supposed > to exist at the bottom level of the plan tree, and that's why it > does not have fields that would allow using it somewhere else. Quite > aside from the scanrelid thing, haven't you noticed that it lacks any > link to child plan nodes? As I mentioned, I'm aware this is a misuse of a Scan. So yes, I'm well aware it's not Custom-anything. Regarding the child plan nodes, those are in the Plan. I got this whole thing working as an experiment, and yes - it seems to work, at least when the node is injected right above a seqscan for example. > I'm not terribly on board with this whole concept of adding plan > nodes from extensions; I do not believe that that approach will ever > lead to anything except horrid kluges, bad planning decisions, and > requests for ever-more-outre hooks to shift the kluginess somewhere > else. But as Robert says, the custom join thing is more likely to be > usable for this, and it would be interesting to hear just how bad the > impedance mismatch is if you try that. FWIW, I've been experimenting with some ideas, and simply injecting a node through CustomScan and planner_hook seemed like an easier way to see if it makes sense and develop a proper in-core plan node. And also looking at CustomScan, which I haven't really followed before. Aside from this different would that be from what CustomScan (and CustomJoin) do? Apparently, there are uses for that - e.g. say you know how to perform Scan and Join nodes on a specialized hardware, but moving the tuples back and forth is rather expensive. Why not to offload a larger part of the plan at once? I know people are working on such things (e.g. on GPUs). Clearly, CustomAnything would be a nightmare, no doubt about that. -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services