Обсуждение: Block nested loop join
<div dir="ltr">Hi all,<br /><br />I am new to postgresql. I am currently doing research to optimize the query performanceof RDBMS, specifically postgresql. Hence, I am currently reading out the code to understand the implementationof various query evaluation algorithm in postgresql.<br /><br />Currently, I am investigating the nested loopjoin algorithm in nodeNestloop.c. After reading the code, my understanding is that it performs simple nested loop join(not block nested loop join). Is this true? Does postgresql support block nested loop join? If it does, where is it?I might miss some buffering/caching mechanism.<br /><br />I appreciate any helps/advice.<br /><br />Regards,<br /><br/>Bramandia R.<br /><br /></div>
Bramandia Ramadhana wrote: > Currently, I am investigating the nested loop join algorithm in > nodeNestloop.c. After reading the code, my understanding is that it performs > simple nested loop join (not block nested loop join). Is this true? Yep. > Does postgresql support block nested loop join? Nope. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> Does postgresql support block nested loop join? > > Nope. We do support Hash Join though so I think the only difference is that we can't use the hash join for cartesian joins. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
<div dir="ltr">Thanks for the clarifications.<br /><br />Just for curiosity, is there any reason of not having block nested-loopjoin implementation? Is it rarely useful?<br /><br />As far as I am aware of, in the case of cross product oftwo tables, block nested-loop join is the most efficient algorithm.<br /><br />Regards,<br /><br />Bramandia R.<br /><br/><div class="gmail_quote">On Fri, Oct 10, 2008 at 3:30 PM, Gregory Stark <span dir="ltr"><<a href="mailto:stark@enterprisedb.com">stark@enterprisedb.com</a>></span>wrote:<br /><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d"><br />Heikki Linnakangas <<a href="mailto:heikki.linnakangas@enterprisedb.com">heikki.linnakangas@enterprisedb.com</a>>writes:<br /><br /> >>Does postgresql support block nested loop join?<br /> ><br /> > Nope.<br /><br /></div>We do support Hash Jointhough so I think the only difference is that we can't<br /> use the hash join for cartesian joins.<br /><br /> --<br/> Gregory Stark<br /><div class="Ih2E3d"> EnterpriseDB <a href="http://www.enterprisedb.com" target="_blank">http://www.enterprisedb.com</a><br/></div> Ask me about EnterpriseDB's PostGIS support!<br /></blockquote></div><br/></div>
"Bramandia Ramadhana" <bramandia@gmail.com> writes: > Thanks for the clarifications. > > Just for curiosity, is there any reason of not having block nested-loop join > implementation? Is it rarely useful? Oh, actually it occurs to me that we do implement something analogous to a degenerate block nested loop where we materialize one side of the nested loop. It looks "backward" since the materialized side is the "inner" side of the loop but it's basically equivalent to a block nested loop with the entire outer side in a single block. So the use case of a real block nested loop would be doing a cartesian join of two large tables where neither fits in RAM. That does seem like it might be kind of narrow given how large the output would be. But we won't know unless someone does the experiment. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Gregory Stark <stark@enterprisedb.com> writes: > So the use case of a real block nested loop would be doing a cartesian join of > two large tables where neither fits in RAM. That does seem like it might be > kind of narrow given how large the output would be. Yeah. If you have a hashable join condition then our existing batched hash join code should be roughly equivalent to this method. So the use case is joining a couple of large tables with an un-hashable, un-indexable join condition (or none at all, ie cross product) and that just isn't something we hear people wanting to do a lot. I can't really see why we'd bother maintaining extra code for block nested loop. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> So the use case of a real block nested loop would be doing a cartesian join of >> two large tables where neither fits in RAM. That does seem like it might be >> kind of narrow given how large the output would be. > > Yeah. If you have a hashable join condition then our existing batched > hash join code should be roughly equivalent to this method. So the use > case is joining a couple of large tables with an un-hashable, > un-indexable join condition (or none at all, ie cross product) and that > just isn't something we hear people wanting to do a lot. I can't really > see why we'd bother maintaining extra code for block nested loop. Hm, I hadn't thought of other non-hashable join conditions. I wonder how much code it would be though if we just hacked hash join to support returning the full cartesian product. Ie, instead of doing a hash lookup do a full scan of the hash and recheck the join condition if any for every combination. That seems like it would be a pretty small change to HashJoin and would effectively support precisely this join type. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
<div dir="ltr">Dear All,<br /><br /> I took a look at the source code for hash join this morning and I realized that theblock nested loop join is somewhat similar to that. <br /><br /> Thanks for the discussions. <br /><br /> Bramandia R.<br/><br /><div class="gmail_quote">On Fri, Oct 10, 2008 at 8:19 PM, Tom Lane <span dir="ltr"><<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span>wrote:<br /><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">GregoryStark <<a href="mailto:stark@enterprisedb.com">stark@enterprisedb.com</a>> writes:<br /> >So the use case of a real block nested loop would be doing a cartesian join of<br /> > two large tables where neitherfits in RAM. That does seem like it might be<br /> > kind of narrow given how large the output would be.<br /><br/></div>Yeah. If you have a hashable join condition then our existing batched<br /> hash join code should be roughlyequivalent to this method. So the use<br /> case is joining a couple of large tables with an un-hashable,<br /> un-indexablejoin condition (or none at all, ie cross product) and that<br /> just isn't something we hear people wantingto do a lot. I can't really<br /> see why we'd bother maintaining extra code for block nested loop.<br /><br /> regards, tom lane<br /></blockquote></div><br /></div>