Обсуждение: Merge Append Patch merged up to 85devel
Here's a copy of the merge-append patch that I sent months ago merged up to head. I haven't really added any additional functionality since then. Heikki suggested I separate the Append and MergeAppend nodes into two executor nodes. I had that half done in my tree but looking it over it leads to a lot of duplicated code and a strange effect that there's on Path node but two Executor nodes which seems strange. I'm not sure which way to go here but at least for now I'm leaving it this way since it's less code to write. If we want it the other way to commit then I'll do it. The other pending question is the same I had back when I originally submitted it. I don't really understand what's going on with eclasses and what invariants we're aiming to maintain with them. I don't see a problem tossing all the child relation attributes into the same eclass even though they're not strictly speaking "equivalent". No join above the append path is going to see the child attributes anyways. But that might be shortsighted as I'm not really sure what the consequences are and what other uses we have envisioned for eclasses in the future. -- Gregory Stark http://mit.edu/~gsstark/resume.pdf
Вложения
On Jul 5, 2009, at 10:02 AM, Gregory Stark <stark@mit.edu> wrote: > Here's a copy of the merge-append patch that I sent months ago > merged up to > head. I haven't really added any additional functionality since then. > > Heikki suggested I separate the Append and MergeAppend nodes into > two executor > nodes. I had that half done in my tree but looking it over it leads > to a lot > of duplicated code and a strange effect that there's on Path node > but two > Executor nodes which seems strange. I'm not sure which way to go > here but at > least for now I'm leaving it this way since it's less code to write. > If we > want it the other way to commit then I'll do it. > > The other pending question is the same I had back when I originally > submitted > it. I don't really understand what's going on with eclasses and what > invariants we're aiming to maintain with them. I don't see a problem > tossing > all the child relation attributes into the same eclass even though > they're not > strictly speaking "equivalent". No join above the append path is > going to see > the child attributes anyways. But that might be shortsighted as I'm > not really > sure what the consequences are and what other uses we have > envisioned for > eclasses in the future. Can you provide some more details about the objective of this patch? Or a link to previous discussion? Thanks, ...Robert
On Sun, Jul 5, 2009 at 10:34 PM, Robert Haas<robertmhaas@gmail.com> wrote: > On Jul 5, 2009, at 10:02 AM, Gregory Stark <stark@mit.edu> wrote: >> >> Here's a copy of the merge-append patch that I sent months ago merged up to >> head. I haven't really added any additional functionality since then. > > Can you provide some more details about the objective of this patch? Or a > link to previous discussion? It's one piece of the puzzle of how to deal with partitioned tables more completely. The basic problem is that currently the planner assumes all Append nodes produce unordered output. That means there's no way for the planner to use any indexes on partitions to produce a desired ordering. That means it's impossible to do a merge join or to satisfy an ORDER BY clause without introducing a sort even if you have matching indexes on every partition. Lots of common cases arise with partitioned tables where this kicks in. Many of them will be eliminated as our partitioning support gets more intelligent and is able to recognize how the partitioning key relates to the requested order or collapse singleton partition scans in cases where the parent table currently interferes. However there will always be cases where those mechanisms to simplify the plan sufficiently and we end up with an Append node of unordered partitions and we need this as a back-stop to avoid awful plans. The merge-append node works by teaching the Append node what sort order it's aiming to produce. The append node then keeps a heap of slots and returns the tuple from the top child plan replacing it in the heap with the next plan. This produces plans like this: QUERY PLAN ------------------------------------------------------------------------------------Result (cost=0.20..489.00 rows=9600width=4) -> Append (cost=0.20..489.00 rows=9600 width=4) -> Index Scan using p_pkey on p (cost=0.00..80.25rows=2400 width=4) -> Index Scan using p1_pkey on p1 p (cost=0.00..80.25 rows=2400 width=4) -> Index Scan using p2_pkey on p2 p (cost=0.00..80.25 rows=2400 width=4) -> Index Scan using p3_pkey on p3 p (cost=0.00..80.25 rows=2400 width=4) (6 rows) Instead of plans like this which is the best we can do today: QUERY PLAN --------------------------------------------------------------------------Sort (cost=770.98..794.98 rows=9600 width=4) Sort Key: public.p.i -> Result (cost=0.00..136.00 rows=9600 width=4) -> Append (cost=0.00..136.00 rows=9600 width=4) -> Seq Scan on p (cost=0.00..34.00 rows=2400 width=4) -> Seq Scan on p1 p (cost=0.00..34.00rows=2400 width=4) -> Seq Scan on p2 p (cost=0.00..34.00 rows=2400 width=4) -> Seq Scan on p3 p (cost=0.00..34.00 rows=2400 width=4) (8 rows) -- greg http://mit.edu/~gsstark/resume.pdf
> Can you provide some more details about the objective of this patch? Or > a link to previous discussion? Suppose, Greg's patch could be modified to support order OR index scans: SELECT ... WHERE (c > 10 AND c < 20) OR (c > 100 AND C < 110) ORDER BY c DESC with plan: Result -> Append -> Backward index scan (c > 100 AND C < 110) -> Backward index scan (c > 10 AND C < 20) I suggested a similar patch two years ago: http://archives.postgresql.org/message-id/45742C51.9020602@sigaev.ru (4-th point) and subsequent discussion http://archives.postgresql.org/pgsql-patches/2006-12/msg00029.php -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Sun, Jul 5, 2009 at 7:23 PM, Greg Stark<stark@mit.edu> wrote: >>> >>> Here's a copy of the merge-append patch that I sent months ago merged up to >>> head. I haven't really added any additional functionality since then. >> >> Can you provide some more details about the objective of this patch? Or a >> link to previous discussion? > i was trying to test this one but i can't find a query that produces a diferent plan than in 8.4.0, attached my current test just in case... what kind of query is this intended to help? something, maybe style dependant, that i don't like is the definition of LAPPEND_PATH_FLATTEN_APPENDPATHS macro at some point in the middle of the file i prefer they be defined at the top (just my preference)... or there is a reason for doing it there? another thing a don't like is those #ifdef FIXME surrounding already existing if, why are those? and if they need to be fixed why there isn't a comment explaining what the fix is or what it should behave? -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157
Вложения
On Sat, Jul 25, 2009 at 8:12 PM, Jaime Casanova<jcasanov@systemguards.com.ec> wrote: > i was trying to test this one but i can't find a query that produces a > diferent plan than in 8.4.0, attached my current test just in case... > what kind of query is this intended to help? You may have to disable enable_seqscan to get simple examples like this to work: select * from partitioned_table order by indexed_column; more complex examples would trigger it naturally such as: select * from partitioned_table where active order by indexed_column (with an index on indexed_column where active) or select * from partitioned_table where indexed_column between x and y order by indexed_column > something, maybe style dependant, that i don't like is the definition > of LAPPEND_PATH_FLATTEN_APPENDPATHS macro at some point in the middle > of the file i prefer they be defined at the top (just my > preference)... or there is a reason for doing it there? well it's only used in that one function, it's just some code which is repeated three times and would obscure what's going on if it were inlined. > another thing a don't like is those #ifdef FIXME surrounding already > existing if, why are those? and if they need to be fixed why there > isn't a comment explaining what the fix is or what it should behave? Yeah, if I knew how to fix them then this patch wouldn't be stuck waiting for feedback... :( -- greg http://mit.edu/~gsstark/resume.pdf
On Sat, Jul 25, 2009 at 2:26 PM, Greg Stark<stark@mit.edu> wrote: > > more complex examples would trigger it naturally such as: > > select * from partitioned_table where active order by indexed_column > (with an index on indexed_column where active) > > or > > select * from partitioned_table where indexed_column between x and y > order by indexed_column > look at the example, i had three OR'ed conditions on col1 wich is indexed (in the script those where commented but i tried it first) and i get the same plan than in 8.4 but you're right with "enable_seqscan to off" i get a better plan, attaching explain analyze in 8.4 and 8.5 (with seqscan to on and off) > >> another thing a don't like is those #ifdef FIXME surrounding already >> existing if, why are those? and if they need to be fixed why there >> isn't a comment explaining what the fix is or what it should behave? > > Yeah, if I knew how to fix them then this patch wouldn't be stuck > waiting for feedback... :( > and what's the problem with those if? as someone says before feel free to speak slowly and draw pictures ;) -- Atentamente, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. +59387171157